Delivering an open-source Linux version of the popular ZFS software.

Livermore’s ZFS on Linux Port a Hit with IT Industry

Friday, July 17, 2015

As big data meets the cloud, the information technology industry needs a data storage system capable of handling increasingly large files. Livermore Computing (LC) has provided a solution—an open-source Linux version of the popular ZFS software. A growing number of companies that offer products for cloud computing and large-scale data storage are using Livermore’s ZFS on Linux in their products. LC originally adopted ZFS as a solution to its need for a file system capable of delivering data to the Sequoia supercomputer, and the generations of systems that will follow.

ZFS “fills a critical functionality gap in Linux,” says Brian Behlendorf, who led Livermore’s project to port ZFS to Linux. “While the Linux kernel provides many different file systems, the majority of them are designed and optimized to be used by desktops. At Livermore we need a file system that can scale up to manage petabytes [10^15 bytes] of storage.”

A file system for next-generation computing

LLNL’s Lustre file system did not scale up enough. The commonly used Linux ext4 file system, on which Lustre was originally built, imposes a limit of four billion files and a maximum file size of 16 terabytes (16 × 10^12 bytes). Lustre was designed to aggregate multiple file systems into a larger one, but it inherits Linux ext4’s limitations. ZFS, which Sun Microsystems created for its Solaris operating system, looked promising but did not run on Linux. Behlendorf’s group ported ZFS to Linux so that Lustre could run on top of it, effectively removing these limits. ZFS can handle up to 256 trillion (256 × 10^12) files with a maximum file size of 16 exabytes (16 × 10^18 bytes).
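The gap between the two sets of limits is easier to appreciate when computed. The short Python sketch below is only illustrative: the power-of-two exponents are assumptions chosen to match the rounded figures quoted above (2^32 files and 2^44 bytes for ext4, 2^48 files and 2^64 bytes for ZFS), and the helper names are hypothetical.

```python
# Assumed power-of-two limits that match the rounded figures in the article.
# These exponents are an illustration, not values taken from the source.
LIMITS = {
    "ext4": {"max_files": 2**32, "max_file_bytes": 2**44},
    "zfs": {"max_files": 2**48, "max_file_bytes": 2**64},
}


def growth_factor(field):
    """How many times larger the assumed ZFS limit is than the ext4 limit."""
    return LIMITS["zfs"][field] // LIMITS["ext4"][field]


def in_binary_units(n_bytes):
    """Express a byte count in the largest binary unit that divides it down."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB"]
    index = 0
    while n_bytes >= 1024 and index < len(units) - 1:
        n_bytes //= 1024
        index += 1
    return f"{n_bytes} {units[index]}"


if __name__ == "__main__":
    for name, limit in LIMITS.items():
        print(name, limit["max_files"], in_binary_units(limit["max_file_bytes"]))
    print("file-count factor:", growth_factor("max_files"))
    print("file-size factor:", growth_factor("max_file_bytes"))
```

Under these assumptions, the file-count ceiling grows by a factor of about 65 thousand and the per-file size ceiling by a factor of about one million.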

Livermore freely distributed ZFS on Linux to the user community. It grew into a high-profile, multi-platform file system capable of running on Linux, Illumos, FreeBSD, and OS X platforms. Behlendorf evangelized ZFS throughout the industry. “I gave quite a few talks at a variety of storage conferences and summits detailing the work we were doing at Livermore to bring ZFS to Linux,” he says. “Generally they were warmly received because people were already familiar with ZFS from other platforms.” In September 2013, LC became a founding member of OpenZFS, a cross-platform home for developers on all operating systems. Behlendorf is the official maintainer for OpenZFS on the Linux platform.

Excitement followed. ZFS has many advantages over existing file systems, thanks to its efficient data compression, support for high storage capacity, and focus on data integrity. For one, ZFS is highly resilient because it continuously detects and repairs data corrupted by faulty hardware or buggy software. Its copy-on-write design allows such features as constant-time snapshots of data and inexpensive file system clones. The scalability of ZFS data files, in principle up to one zettabyte, makes it attractive for managing data in the cloud and in network-attached storage (NAS) systems.
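Two of the ideas above, copy-on-write snapshots and checksum-based repair, can be illustrated with a toy model. The Python sketch below is not how ZFS is implemented: the class `ToyCowStore`, its in-memory dictionaries, the two-way mirroring, and the use of SHA-256 are all assumptions made for illustration. It only shows why a snapshot can be taken without copying data when blocks are never overwritten in place, and how a stored checksum lets a reader spot a bad copy and restore it from a good one.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Fingerprint used to detect silent corruption (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ToyCowStore:
    """Toy in-memory model of copy-on-write plus mirrored, checksummed blocks."""

    def __init__(self):
        self.blocks = {}     # checksum -> [copy_a, copy_b]; writes never overwrite a block
        self.live = {}       # file name -> checksum of its current block
        self.snapshots = {}  # snapshot label -> frozen name-to-checksum table

    def write(self, name, data):
        digest = checksum(data)
        self.blocks.setdefault(digest, [data, data])
        # Copy-on-write: build a new table rather than mutating the old one,
        # so any snapshot that references the old table is unaffected.
        # (Copying a dict is a simplification; a real design would share a tree.)
        self.live = {**self.live, name: digest}

    def snapshot(self, label):
        # Constant time: only a reference to the current table is stored.
        self.snapshots[label] = self.live

    def read(self, name, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        digest = table[name]
        for data in self.blocks[digest]:
            if checksum(data) == digest:
                # Self-heal: rewrite every copy from the verified one.
                self.blocks[digest] = [data, data]
                return data
        raise IOError(f"all copies of {name!r} failed checksum verification")


if __name__ == "__main__":
    store = ToyCowStore()
    store.write("report.txt", b"version 1")
    store.snapshot("monday")
    store.write("report.txt", b"version 2")
    print(store.read("report.txt"))
    print(store.read("report.txt", snapshot="monday"))
```

In this toy, overwriting `report.txt` after the snapshot leaves the old block in place, so the `monday` snapshot still returns the earlier contents, and corrupting one mirrored copy is repaired on the next read.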

Running with an opportunity

Industry responded to the opportunities ZFS presented. The OpenZFS companies page lists 25 companies that have developed products built on ZFS. These companies provide hardware and software solutions, as well as system integration, for cloud-based and NAS-based data management at the corporate enterprise level.

The Healthcare Enterprise Linux Operating System (HELiOS) incorporates ZFS and is the standard Linux operating system used in GE Healthcare’s products, which include medical imaging and information technologies, diagnostics, patient monitoring systems, and drug discovery and biopharmaceutical manufacturing technologies. “The reason to have HELiOS pick up ZFS was to be able to leverage all of its goodness in an OS sorely in need of such a file system,” says Chris Brown, compute systems architect at GE Healthcare. “Certain internal products are sensitive to data integrity or acquisition image loss. Enter ZFS, the king of data integrity, and old friend, but certainly a Linux file system for the future.”

As the development lead for OpenZFS on Linux, Behlendorf works with a user community that is always improving the code, submitting everything from bug fixes to major new features. “We’re lucky to have a very large and enthusiastic community of users and developers,” he says. “That’s one of our biggest strengths—it allows us to rapidly develop and rigorously test the software.” Behlendorf sees an expanding future for OpenZFS: “My expectation is that as more companies discover the Linux ZFS port and develop confidence in it, they’ll adopt it. It’s too compelling a technology to simply ignore.”