Memory-Centric Architectures: Exploiting Emerging Persistent Memory


Recent advances in the availability of low-latency non-volatile memory have begun to blur the distinction between memory and storage. NVRAM technologies such as Phase Change Memory (PCM), STT-RAM, and memristors combine low read latency, approaching that of DRAM, with the persistence of storage. Our research focuses on exploiting emerging persistent memory as a new capability, both to enable in-memory analysis of very large data sets and to extend the lifetime of compute processes in general so that they can stop and resume at will. The work is driven by application use cases such as analysis of massive scale-free graphs, identification of streamlines in scientific simulation result data sets, and classification of metagenomic data sets.

To access locally attached flash storage arrays as if they were in memory, we have developed a data-intensive memory-map runtime, DI-MMAP, that optimizes access to large external data sets mapped into an application’s address space. DI-MMAP is available as open source. We have also extended the jemalloc memory allocator to support a named persistent heap; the result is a C library, perm-je, which is available as open source. The perm-je library is used with the Livermore Metagenomics Analysis Toolkit (LMAT), which is also available as open source. Finally, we have developed HavoqGT, a C++ algorithm framework for traversing massive scale-free graphs stored in locally attached flash arrays.
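The idea of treating a file on storage as ordinary memory can be illustrated with the standard mmap facility. The following is a minimal sketch using Python's built-in mmap module, not DI-MMAP or perm-je themselves; the record layout, file name, and helper functions are assumptions made for illustration. It maps a file of fixed-width records, updates one record in place, and shows that the change is still there when the file is mapped again, which is the behavior a persistent heap relies on.

```python
import mmap
import os
import struct
import tempfile

# Assumed layout: one little-endian unsigned 64-bit integer per record.
RECORD = struct.Struct("<Q")


def write_records(path, values):
    """Create the backing file with one fixed-width record per value."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def read_record(mapped, index):
    """Read one record straight from the mapping by byte offset."""
    return RECORD.unpack_from(mapped, index * RECORD.size)[0]


def increment_record(path, index):
    """Map the file read-write, bump one record in place, flush, and unmap."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mapped:
            value = read_record(mapped, index) + 1
            RECORD.pack_into(mapped, index * RECORD.size, value)
            mapped.flush()  # push the dirty page back to the file
            return value


def demo():
    """Update the same record through two independent mappings."""
    path = os.path.join(tempfile.mkdtemp(), "records.bin")  # hypothetical file
    write_records(path, range(1000))
    first = increment_record(path, 10)
    second = increment_record(path, 10)  # sees the earlier update
    return first, second
```

Calling demo() maps the file twice; the second mapping observes the value written through the first, even though no state was kept in the process between the two calls. A runtime such as DI-MMAP addresses how pages of such a mapping are buffered and evicted under heavy concurrent access, which this sketch leaves entirely to the operating system.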

Our ongoing work quantitatively evaluates the potential benefits of active memory, which may become possible through 3D packaging of memory with logic, as in the Hybrid Memory Cube.

LMAT is an approach to metagenomic classification that supports large, complete reference databases. The database index is precomputed from the reference sequences; this offline step is a tradeoff that speeds up query performance at run time while maintaining a high level of accuracy. The indices are memory mapped directly into the address space of the application and, when stored on flash, enable highly concurrent lookups of the index keys (k-mers). We have designed a custom k-mer index that considerably reduces the memory footprint compared with conventional data structures and is suitable for highly random page retrieval from NVRAM using DI-MMAP.
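To make the precompute-then-map pattern concrete, here is a small sketch of a k-mer index that is built once, written to a file, and then queried through a read-only memory mapping. It is not LMAT's custom index: the sorted-array layout, the 2-bit base encoding, the integer taxonomy ids, and all function names are assumptions chosen for brevity, and every k-mer is assumed to have the same length k (at most 32).

```python
import mmap
import os
import struct
import tempfile

# Assumed entry layout: 2-bit-packed k-mer (uint64) plus a hypothetical taxonomy id (uint32).
ENTRY = struct.Struct("<QI")
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer):
    """Pack a DNA k-mer into an integer using 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_index(path, kmer_to_taxid):
    """Offline step: write entries sorted by encoded k-mer."""
    entries = sorted((encode_kmer(k), t) for k, t in kmer_to_taxid.items())
    with open(path, "wb") as f:
        for key, taxid in entries:
            f.write(ENTRY.pack(key, taxid))


def lookup(mapped, kmer):
    """Online step: binary-search the mapped file; return the id or None."""
    target = encode_kmer(kmer)
    lo, hi = 0, len(mapped) // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        key, taxid = ENTRY.unpack_from(mapped, mid * ENTRY.size)
        if key == target:
            return taxid
        if key < target:
            lo = mid + 1
        else:
            hi = mid
    return None


def demo():
    """Build a three-entry index, map it read-only, and run two queries."""
    path = os.path.join(tempfile.mkdtemp(), "kmers.idx")  # hypothetical file
    build_index(path, {"ACGT": 101, "TTGA": 202, "GGCC": 303})
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return lookup(mapped, "TTGA"), lookup(mapped, "AAAA")
```

Each probe of the binary search touches a different part of the file, so lookups translate into scattered page reads. That access pattern is cheap on flash compared with spinning disks, and it is the kind of random page retrieval the text describes.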

HavoqGT is a framework for expressing asynchronous vertex-centric graph algorithms. It provides a visitor interface in which actions are defined at the level of an individual vertex. All graph data is stored in memory-mapped files, using Boost.Interprocess and memory-mapped (mmap) I/O. Large graphs that do not fit in main memory can still be processed by using mmap to treat flash as external memory.
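The visitor style can be sketched with a breadth-first search over a graph stored in a memory-mapped file. This is an illustration only and does not reproduce HavoqGT's C++ API: the CSR file layout, the class names, and the pre_visit/visit method names are assumptions, and the visitor queue here is processed sequentially in a single process rather than asynchronously.

```python
import mmap
import os
import struct
import tempfile
from collections import deque

U64 = struct.Struct("<Q")  # every word in the file is an unsigned 64-bit integer


def write_csr(path, adjacency):
    """Store a graph as: vertex count, n+1 offsets, then all neighbor lists."""
    offsets, targets = [0], []
    for neighbors in adjacency:
        targets.extend(neighbors)
        offsets.append(len(targets))
    with open(path, "wb") as f:
        for value in [len(adjacency)] + offsets + targets:
            f.write(U64.pack(value))


class MappedGraph:
    """Read-only view over the CSR file; data is read from the mapping on demand."""

    def __init__(self, mapped):
        self.mapped = mapped
        self.num_vertices = self._word(0)

    def _word(self, i):
        return U64.unpack_from(self.mapped, i * U64.size)[0]

    def neighbors(self, v):
        begin, end = self._word(1 + v), self._word(2 + v)
        base = self.num_vertices + 2  # first word after the offsets array
        return [self._word(base + i) for i in range(begin, end)]


class BfsVisitor:
    """Hypothetical visitor carrying a target vertex and a proposed BFS level."""

    def __init__(self, vertex, level):
        self.vertex, self.level = vertex, level

    def pre_visit(self, levels):
        """Accept only if this visitor lowers the vertex's recorded level."""
        if self.level < levels[self.vertex]:
            levels[self.vertex] = self.level
            return True
        return False

    def visit(self, graph, queue):
        """Per-vertex action: send a new visitor to each neighbor."""
        for n in graph.neighbors(self.vertex):
            queue.append(BfsVisitor(n, self.level + 1))


def bfs_levels(graph, source):
    """Drain the visitor queue until no visitor is accepted any more."""
    levels = [float("inf")] * graph.num_vertices
    queue = deque([BfsVisitor(source, 0)])
    while queue:
        visitor = queue.popleft()
        if visitor.pre_visit(levels):
            visitor.visit(graph, queue)
    return levels


def demo():
    """Run BFS from vertex 0 on a five-vertex graph; vertex 4 is unreachable."""
    path = os.path.join(tempfile.mkdtemp(), "graph.csr")  # hypothetical file
    write_csr(path, [[1, 2], [3], [3], [], [0]])
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return bfs_levels(MappedGraph(mapped), 0)
```

Because the algorithm is written as small per-vertex actions that only enqueue further visitors, the traversal logic does not depend on where the graph lives. Only MappedGraph knows that the adjacency data is read from a mapped file instead of from memory.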