Unify: Distributed Burst Buffer File System
Hierarchical storage systems are the emerging direction for high performance computing (HPC) centers like LLNL’s Livermore Computing Complex. The Unify project aims to improve I/O performance by using distributed, node-local storage. Because each compute node contributes its own storage, bandwidth and capacity scale with the compute resources allocated to a given job. Unify also avoids the inter-job interference that arises with parallel file systems or shared burst buffers.
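The scaling argument can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only: the function names, the per-node bandwidth and capacity figures, and the even-split assumption for a shared resource are all hypothetical and are not taken from Unify or from any LLNL measurement.

```python
def node_local_totals(num_nodes, per_node_bw_gbs, per_node_capacity_gb):
    """Aggregate bandwidth and capacity when each node contributes local storage.

    Assumes ideal linear scaling, which real systems only approximate.
    """
    return num_nodes * per_node_bw_gbs, num_nodes * per_node_capacity_gb


def shared_share(total_bw_gbs, concurrent_jobs):
    """Bandwidth one job sees if a shared resource is split evenly among jobs.

    The even split is a simplifying assumption for illustration.
    """
    return total_bw_gbs / concurrent_jobs


if __name__ == "__main__":
    # Hypothetical figures chosen only for illustration; not measurements.
    for nodes in (16, 64, 256):
        bw, cap = node_local_totals(nodes, per_node_bw_gbs=1.0,
                                    per_node_capacity_gb=800)
        print(f"{nodes:4d} nodes: {bw:.0f} GB/s and {cap} GB of node-local storage")

    # A shared resource does not grow with the job, and other jobs dilute it.
    for jobs in (1, 4, 16):
        print(f"{jobs:2d} concurrent jobs: {shared_share(100.0, jobs):.2f} GB/s each")
```

Under these assumptions, a job's node-local totals grow with its own node count, while its share of a fixed shared resource depends on how many other jobs are active at the same time.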
Unify is a suite of specialized, flexible file systems that can be included in a user’s job allocation. The first is available on GitHub, with more on the way. A user can request which Unify file system(s) to load and specify their respective mount points. Tests on LLNL’s Catalyst cluster show a more than 2x improvement in write performance.
Figure: UnifyCR supports checkpoint/restart workloads. Like all current and future Unify file systems, UnifyCR is launched at the beginning of a batch job. Additional information about UnifyCR configuration can be found on Read the Docs.