Large datasets and the growing diversity of data increasingly drive the need for more capable data-intensive computing platforms. At Livermore, this concern takes on additional significance since the Laboratory’s work uses big data to pursue a safer, more secure world for tomorrow. The Laboratory’s high performance computing (HPC) capabilities offer exceptional opportunities for data analysis and processing. However, the specialized nature of Livermore Computing’s (LC’s) hardware presents challenges to traditional approaches to data-intensive computing.
“LC’s HPC systems have been tailored to run scientific simulations very well. Unfortunately, this is not the optimal architecture for many data-intensive computing applications,” explains Robin Goldstone, a member of LC’s Advanced Technologies Office. Goldstone and her team have been exploring solutions that can bring LC’s expertise to bear on the Laboratory’s growing demand for big data computing platforms. “We recognized that we needed to take a look at Hadoop, a solution that has been requested by numerous customers. We set out to see how we could tweak our traditional HPC systems to meet the needs of these big data customers.”
The Hadoop ecosystem—which includes MapReduce, HBase, and newer frameworks such as Spark and Storm—has gained widespread adoption in part because of its relatively modest hardware requirements. Clusters of inexpensive commodity servers with local hard drives can run Hadoop effectively because the software is designed from the ground up to tolerate failure. HPC applications, in contrast, typically cannot tolerate failure, so HPC systems rely on more expensive hardware and complex recovery mechanisms to achieve resilience. For these reasons, HPC systems are often dismissed as “overkill” for frameworks like Hadoop.
However, since LC already has these HPC systems deployed, the question becomes whether they can run Hadoop efficiently in place of an entirely separate set of commodity-class resources. To answer this question, LC purchased a small, generic Hadoop cluster to gain experience deploying and managing such a system. This cluster, named Bigfoot, allowed Goldstone’s team to evaluate the operational impact of supporting the platform while also providing a testbed for head-to-head comparisons between commodity and HPC systems.
The HPC-centric approach involved the development of a software package, named Magpie, which allows Hadoop and similar data analytics frameworks to run on LC’s HPC systems. Magpie accomplishes this task by instantiating the framework within the context of a batch job—rather than on a persistent, dedicated cluster—and by reading from and writing to the Lustre parallel file system instead of local disk drives.
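As a rough sketch of this idea (not Magpie’s actual implementation), the example below configures a Hadoop job to use a directory on a shared Lustre mount as its default file system, via a file:// URI, rather than an HDFS service. The paths shown are hypothetical, and in practice Magpie performs this kind of setup automatically inside the batch allocation.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sketch: run Hadoop against a shared parallel file system instead
// of HDFS. Paths are hypothetical and site-specific; Magpie automates this
// kind of configuration within a batch job on LC's schedulers.
public class LustreBackedJobSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Point the default file system at a job-scoped directory on the
        // Lustre mount (a file:// URI) rather than at an HDFS NameNode.
        conf.set("fs.defaultFS", "file:///p/lustre1/jdoe/hadoop-job");

        FileSystem fs = FileSystem.get(conf);
        Path input = new Path("/p/lustre1/jdoe/hadoop-job/input");
        fs.mkdirs(input);

        System.out.println("Default FS: " + fs.getUri());
        System.out.println("Input dir exists: " + fs.exists(input));
    }
}
```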
With both Bigfoot and Magpie in hand, Goldstone’s team assessed whether Magpie could eliminate the need for dedicated Hadoop clusters at Livermore. Using the de facto Hadoop benchmark, TeraSort, the team ran the sort on the Bigfoot cluster and then on an equivalent number of nodes of one of LC’s HPC systems, varying several TeraSort configuration options and Magpie tunables along the way. Even with the best settings, the benchmark ran 50% slower on the HPC cluster than on Bigfoot at an equivalent node count.
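For reference, TeraSort is typically driven from the standard Hadoop examples. The minimal driver below generates a data set with TeraGen and then sorts it with TeraSort; the row count, paths, and class locations (which can vary across Hadoop versions) are illustrative assumptions, not the team’s actual benchmark settings.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.examples.terasort.TeraGen;
import org.apache.hadoop.examples.terasort.TeraSort;
import org.apache.hadoop.util.ToolRunner;

// Minimal TeraSort driver: generate synthetic rows with TeraGen, then sort
// them with TeraSort. Row count and paths are illustrative only.
public class TeraSortDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // TeraGen writes 100-byte rows; 10^10 rows is roughly 1 TB.
        String rows = "10000000000";
        String in   = "/benchmarks/terasort-input";
        String out  = "/benchmarks/terasort-output";

        int rc = ToolRunner.run(conf, new TeraGen(), new String[] {rows, in});
        if (rc == 0) {
            rc = ToolRunner.run(conf, new TeraSort(), new String[] {in, out});
        }
        System.exit(rc);
    }
}
```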
The team performed two additional experiments with more encouraging results. In the first test, the TeraSort benchmark ran on the HPC cluster using twice as many nodes as on Bigfoot. This time, the HPC cluster won the comparison, achieving a 33% reduction in runtime relative to Bigfoot. While this might not appear to be a fair test, it demonstrates the “surge” capability that LC offers users—with thousands of cluster nodes already deployed, LC can quickly accommodate a customer’s need to scale up an analysis. Doing the same on a dedicated Hadoop cluster would require months of lead time to purchase and deploy additional hardware.
In the second experiment, the team employed Catalyst, a new LC HPC system equipped with non-volatile random access memory, more commonly known as “flash storage.” Each Catalyst compute node contains 800 GB of high-performance, Peripheral Component Interconnect Express-attached flash storage, which Magpie can use in place of Lustre for storing Hadoop’s intermediate data files. It is this intermediate input/output (I/O) that puts the most strain on the Lustre file system, so the Magpie developer surmised that a modest amount of fast local storage could significantly improve I/O performance. The team’s testing validated this theory: TeraSort ran twice as fast across 295 Catalyst nodes when using the local flash storage.
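In stock Hadoop, the location of these intermediate (shuffle and spill) files is a configurable property, so the idea can be sketched as follows. The flash mount point is a hypothetical example, and this is not Magpie’s actual mechanism.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch: keep job input/output on the shared Lustre file system while
// directing intermediate (shuffle/spill) files to node-local flash.
// Mount points are hypothetical and site-specific.
public class FlashIntermediateConfig {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Job data lives on Lustre, as in the earlier sketch.
        conf.set("fs.defaultFS", "file:///p/lustre1/jdoe/hadoop-job");

        // Intermediate map output spills to a node-local flash device
        // (assumed mount point: /l/ssd) instead of hitting Lustre.
        conf.set("mapreduce.cluster.local.dir", "/l/ssd/jdoe/hadoop-local");

        Job job = Job.getInstance(conf, "terasort-with-flash-intermediates");
        System.out.println("Intermediate dir: "
                + job.getConfiguration().get("mapreduce.cluster.local.dir"));
    }
}
```

The design choice mirrors the article’s point: only the short-lived intermediate traffic moves to local flash, while the data sets themselves stay on the shared parallel file system.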
“Reducing data motion is not just a big data issue,” says Goldstone, elaborating on the value of these outcomes. “Our HPC simulation customers are also feeling the pain of moving data, and we see architectures like Catalyst and the future Sierra system as the path forward. The work we have done illustrates the synergy between big data and HPC, and puts LC in a leadership position to meet the needs of both camps going forward.”