Developing Software for an Exascale Ecosystem and Emerging Workloads: OpenZFS
As this three-part news series explains, LLNL is striving to create a computing ecosystem that operates at exascale speeds (more than 10^18 calculations per second) to carry out its national security and science missions an order of magnitude faster than today’s high performance computing (HPC) systems. The Livermore Computing (LC) Division is developing software, including Flux, OpenZFS, and SCR, to support these systems.
For more than a decade, LC has been helping to bring the revolutionary ZFS technology to the Linux community. ZFS is a combined file system and logical volume manager for Unix-like operating systems; it enables multiple hard drives to be grouped into a storage pool and data to be written redundantly across the pool to protect against loss. Originally released by Sun Microsystems in 2005 as an open source product for its Solaris operating system, ZFS offers users portability, enterprise-class scalability and data integrity, as well as management features like checksumming, compression, and snapshotting.
LC developers soon recognized ZFS’s transformative capabilities and began porting ZFS to Linux in 2007. Since then, LC has helped found OpenZFS, a project dedicated to coordinating the activities of open source ZFS developers from several organizations. Today the OpenZFS community maintains ZFS implementations on Linux, macOS, illumos, and FreeBSD.
ZFS provides users with access to large amounts of cost-effective storage and strong data integrity assurances, both of which are important to users running on Livermore’s HPC systems. “Computing has become a cornerstone of modern scientific research, and the application of simulation and data science techniques generates an ever-increasing demand for large amounts of data storage,” says Ned Bass, group leader for System Software Development in LC.
ZFS helps meet this demand by enabling the deployment of large-scale data storage systems on relatively inexpensive hardware. Bass stresses the importance to the scientific community that the data be correct. “Even one wrong byte in a data set could invalidate months’ worth of computing work or lead a simulation to produce wrong results,” Bass says, emphasizing that ZFS can reliably detect on-disk corruption and reconstruct the expected data using redundant storage techniques.
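The detect-and-reconstruct idea Bass describes can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not ZFS code: the "disks" are in-memory byte arrays, SHA-256 stands in for whatever checksum a pool is configured to use, and the helper names `write_block` and `read_block` are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest (a stand-in for the pool's checksum)."""
    return hashlib.sha256(data).hexdigest()


def write_block(mirrors: list, data: bytes) -> str:
    """Store one copy of the data on every mirror.

    The checksum is returned so the caller can keep it apart from the
    data itself; a corrupted copy then cannot vouch for itself.
    """
    for disk in mirrors:
        disk[:] = data
    return checksum(data)


def read_block(mirrors: list, expected: str) -> bytes:
    """Return a verified copy and repair any mirror that fails the check."""
    good = None
    for disk in mirrors:
        if checksum(bytes(disk)) == expected:
            good = bytes(disk)
            break
    if good is None:
        raise IOError("every copy failed checksum verification")

    # Overwrite damaged copies with the verified data.
    for disk in mirrors:
        if bytes(disk) != good:
            disk[:] = good
    return good


if __name__ == "__main__":
    disks = [bytearray(), bytearray()]      # hypothetical two-way mirror
    digest = write_block(disks, b"simulation output")
    disks[0][0] ^= 0xFF                     # flip bits in one copy
    print(read_block(disks, digest))        # verified data from the intact copy
    print(bytes(disks[0]) == bytes(disks[1]))  # damaged copy was repaired
```

In this sketch a single flipped byte on one copy is caught on read, the intact copy is returned, and the damaged copy is rewritten. If every copy is damaged, the read fails loudly rather than returning wrong data.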
“As we push toward exascale supercomputers to tackle grand challenges and the stockpile stewardship mission,” Bass says, “the demand for faster and larger storage systems will continue.” He cites colleague Brian Behlendorf’s work to support these next-generation systems, namely developing the UNMAP/TRIM feature in ZFS, which prevents performance degradation on solid-state disks, a type of storage device that will be prevalent in exascale-class storage systems. UNMAP/TRIM identifies disk sectors that are no longer allocated by ZFS and reports them to the underlying device, which can then manage itself more efficiently.
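To make the idea of identifying no-longer-allocated sectors concrete, here is a small Python sketch. It is only an illustration under assumptions: block numbers are plain integers, the function name `free_extents` is hypothetical, and the real ZFS implementation tracks free space very differently. The sketch merges freed block numbers into contiguous (start, length) ranges, the kind of summary a file system could hand to a device in a trim request.

```python
def free_extents(freed_blocks):
    """Coalesce freed block numbers into sorted (start, length) extents."""
    extents = []
    for block in sorted(set(freed_blocks)):
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the last extent, so extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Otherwise the block starts a new extent.
            extents.append((block, 1))
    return extents


if __name__ == "__main__":
    # Blocks freed in arbitrary order by hypothetical file deletions.
    print(free_extents([7, 3, 4, 5, 10, 11]))  # [(3, 3), (7, 1), (10, 2)]
```

Reporting a few large ranges instead of many single blocks keeps the number of requests small, which is one reason a design might batch freed space before telling the device about it.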
OpenZFS is supported by an open source community, with major contributors including LLNL, Whamcloud, Delphix, Joyent, and Datto. Its development at Livermore is funded by the Advanced Simulation and Computing (ASC) Program.
- In the image above, a large directory structure was copied 3,500 times in a pool composed of solid-state disks. The time elapsed per iteration is plotted. The UNMAP/TRIM feature in ZFS was not used for the first 2,000 iterations, and performance degraded to a steady state. The two dips in elapsed time correspond to manual invocations of the UNMAP/TRIM command, but performance quickly degraded again after each invocation. Finally, performance remained at near-original levels after the pool was configured to use UNMAP/TRIM automatically.
- ZFS on Linux on GitHub