This article is part of a series about Livermore Computing’s efforts to stand up the NNSA’s first exascale supercomputer. El Capitan will come online in 2024 with the processing power of more than 2 exaflops, or 2 quintillion (2 × 10^18) calculations per second. The system will be used for predictive modeling and simulation in support of the stockpile stewardship program.

Previous: Modern compilers | Next: Prepping for performance

The software required to run scientific codes on a supercomputer is, in a word, complicated. A massively parallel computing environment uses an operating system, compilers, schedulers, workflow managers, debuggers, and much more—all of which are customized for the hardware.

When a researcher wants to run their multiphysics code on a high performance computing (HPC) system, the build and installation process has to identify the latest software versions needed and resolve the inevitable compatibility issues. Doing this manually is tedious and error-prone, which is why automated package management exists.
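To make the version problem concrete, here is a minimal Python sketch of one small piece of it: choosing the newest available version of a library that satisfies every dependent's constraints. This is an illustration only, not how Spack or any other package manager actually resolves dependencies. The function names, the inclusive (minimum, maximum) constraint format, and the version numbers are all assumptions made for the example.

```python
def parse(version):
    """Turn a dotted version string such as '1.12.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def pick_version(available, constraints):
    """Return the newest available version that satisfies every constraint.

    `constraints` is a list of hypothetical (minimum, maximum) inclusive
    ranges, one per dependent package. Returns None when the ranges conflict
    and no available version satisfies all of them.
    """
    candidates = [
        v for v in available
        if all(parse(lo) <= parse(v) <= parse(hi) for lo, hi in constraints)
    ]
    return max(candidates, key=parse) if candidates else None


# Illustrative data only: three installed versions of one library,
# and two codes that each accept a different range of versions.
available = ["1.10.7", "1.12.2", "1.14.3"]
compatible = [("1.10.0", "1.14.3"), ("1.8.0", "1.12.9")]
conflicting = [("1.14.0", "1.14.9"), ("1.8.0", "1.12.9")]

chosen = pick_version(available, compatible)
unresolved = pick_version(available, conflicting)
```

Even this toy version shows why manual resolution scales poorly: each added package contributes another constraint, and a real system must track compilers, build options, and hardware targets alongside version numbers.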

Perhaps no one is more prepared to address package management on El Capitan than the Spack team. Spack is an open-source package manager created at LLNL over a decade ago and now used widely throughout the HPC community. It was the packaging solution for the Department of Energy Exascale Computing Project, and now it’s primed for the installation demands of exascale machines.

As popular as Spack is, however, not all scientific workloads use it. Computer scientist Greg Becker points out, “We need to be able to support all El Capitan users regardless of their packaging solution—even if they aren’t using a package manager at all and instead go the route of ‘I install it all myself.’”

This wider view of software installation, combined with lessons learned from previous supercomputer procurements, led to the formation of the Packaging Working Group with members from Lawrence Livermore, Argonne, and Oak Ridge national labs alongside El Capitan hardware vendors Hewlett Packard Enterprise (HPE) and Advanced Micro Devices Inc. (AMD). Becker continues, “The working group is the main conduit through which we talk to HPE and AMD about how our scientific software will interface with their system software. For example, we work closely with the HPE folks to make the custom Cray programming environment more usable and more like every other Linux system.”

One major challenge has been managing the interaction between both vendors’ system software. “Often packaging is simply a matter of system software working with user software,” Becker explains. “In this case, we have relationships between the AMD software and HPE software, plus the relationships between our software and both of those. Some packages care a lot about exactly how the three-way connection is done.”

According to Todd Gamblin, who created Spack and co-leads the working group, El Capitan’s packaging solutions depend on the open-source community. He states, “Vendors don’t want to reinvent software capabilities if they don’t have to. We’ve built something that not only our scientists are asking for, but many of the vendors’ other customers as well. We’ve even helped HPE increase their open-source contributions. Software projects such as Spack and Flux provide a way for the Lab to guide the future of HPC.”

The group has also improved compatibility and usability by upgrading the Cray programming environment with MPI compiler wrappers and Linux-like installation features, as well as fine-tuning continuous integration (CI) processes that combine Cray CI with cloud-based CI solutions. Furthermore, they have implemented Spack’s binary build caches, which speed up installations by using previous builds instead of building from source. Gamblin adds, “The working group has had some major victories. El Capitan is going to be an awesome machine. It’s all very exciting.”
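The build-cache idea mentioned above can be sketched in a few lines of Python: if an identical configuration has been built before, reuse the stored result rather than compiling again. This is a simplified illustration under stated assumptions, not Spack's implementation or API. The `BuildCache` class, the `spec_hash` helper, and the package and compiler strings are hypothetical names chosen for the example.

```python
import hashlib
import json


def spec_hash(name, version, compiler, deps):
    """Hash a build configuration so identical configurations share one key.

    Hypothetical helper: dependencies are sorted so their order does not
    change the resulting key.
    """
    payload = json.dumps(
        {"name": name, "version": version, "compiler": compiler,
         "deps": sorted(deps)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class BuildCache:
    """Toy in-memory cache mapping configuration hashes to build artifacts."""

    def __init__(self):
        self._store = {}
        self.source_builds = 0  # counts how often a build from source was needed

    def install(self, name, version, compiler, deps=()):
        key = spec_hash(name, version, compiler, deps)
        if key in self._store:
            # Cache hit: reuse the earlier build instead of compiling.
            return self._store[key], "cache"
        # Cache miss: stand-in for an expensive build from source.
        self.source_builds += 1
        artifact = f"{name}-{version}-{key[:8]}.tar.gz"
        self._store[key] = artifact
        return artifact, "source"


# Illustrative usage with made-up configuration strings.
cache = BuildCache()
first = cache.install("hdf5", "1.14.3", "gcc@12.2.0", ["zlib", "mpi"])
second = cache.install("hdf5", "1.14.3", "gcc@12.2.0", ["mpi", "zlib"])
third = cache.install("hdf5", "1.14.3", "clang@16.0.0", ["zlib", "mpi"])
```

In this sketch the second request is served from the cache because its configuration matches the first, while the third triggers a new build because the compiler differs. Keying on the full configuration rather than the package name alone is what keeps a cached binary from being reused where it would not be compatible.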


—Holly Auten & Meg Epperly