Providing shared computer science infrastructure for simulation codes
The new Computer Science Toolkit project within the Advanced Simulation and Computing (ASC) program at Lawrence Livermore National Laboratory (LLNL) focuses on developing common software infrastructure components to support multiphysics simulation applications targeted to run on next-generation high-performance computing (HPC) systems.
The Toolkit embodies a new code development approach in the Weapons and Complex Integration (WCI) Principal Directorate that emphasizes software sharing and reuse. It will provide flexible software components that computer simulation tools can share for data management, input/output (I/O), analysis, and visualization, as well as other production needs. In current WCI physics applications, infrastructure that supports such features tends to be unique to each code project and comprises 30 percent or more of its code base. Given this redundancy, there is a clear opportunity to translate well-defined requirements from current projects into robust, shared capabilities. Centralizing this infrastructure will simplify software support and help insulate simulation projects from key challenges associated with new, revolutionary hardware and software on next-generation HPC systems.
The focus on sharing and reuse in the Toolkit is rooted in a vision to foster a collaborative computer science ecosystem that will broaden the developer base beyond individual, independent WCI application projects. The Toolkit will be released in an unlimited access, open environment so that it can also be used by LLNL research codes and proxy applications. The wide release will enable LLNL programs to engage researchers, students, and vendors and establish a bridge to help deploy innovative research concepts into production applications.
While the primary customer of the Toolkit will be MARBL, a next-generation multiphysics code under development at LLNL, Toolkit components will be sufficiently general and flexible to be shared across a wide range of HPC applications. In addition to new applications, these components will be integrated into current LLNL production simulation codes as they continue to evolve, including codes for the ASC program’s Physics and Engineering Models (PEM) subprogram. The top figure shows the relationship between the ASC Program Toolkit and other software in a physics application. Toolkit capabilities will be shared by the various software components that make up an integrated application. As more of those components adopt the Toolkit, integrating new capabilities becomes simpler. The figure also lists several Toolkit components currently in development along with a brief description of the services they provide. The initial focus is on foundational components with limited scope, on which more complex functionality will eventually be built as the Toolkit ecosystem develops.
Over the past year, most Toolkit development effort has focused on the Simulation Data Repository (Sidre) component, which will provide centralized data management for simulation codes. Sidre requirements are based on functionality in existing LLNL ASC program codes that have been developed independently and refined over the past several decades, as well as needs associated with emerging advanced hardware architectures. Sidre will support data declaration, allocation, transformations, hierarchical data organization, and different “views” into shared data. It will also simplify interlanguage data consistency when C/C++ and Fortran software components are used together. Sidre will help application developers use complex memory hierarchies (for example, involving NVRAM and high-bandwidth memory) on advanced platforms by providing mechanisms to associate data with memory spaces and move data between them.
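The ideas of hierarchical organization and multiple “views” into shared data can be illustrated with a small toy model. The sketch below is not Sidre's interface: the class names `Buffer`, `View`, and `Group`, their methods, and the path syntax are all hypothetical, chosen only to show how two named views can describe different strided windows into one allocation without copying it.

```python
from array import array


class Buffer:
    """Owns one contiguous block of float64 values (hypothetical stand-in)."""

    def __init__(self, count):
        self.data = array("d", [0.0] * count)


class View:
    """Describes a strided window into a Buffer; no data is copied."""

    def __init__(self, buffer, offset=0, stride=1, count=None):
        if count is None:
            # Number of entries reachable from offset with the given stride.
            count = (len(buffer.data) - offset + stride - 1) // stride
        self.buffer, self.offset, self.stride, self.count = buffer, offset, stride, count

    def _index(self, i):
        if not 0 <= i < self.count:
            raise IndexError(i)
        return self.offset + i * self.stride

    def __getitem__(self, i):
        return self.buffer.data[self._index(i)]

    def __setitem__(self, i, value):
        self.buffer.data[self._index(i)] = value

    def __len__(self):
        return self.count


class Group:
    """A node in the hierarchy holding named child groups and views."""

    def __init__(self):
        self.groups = {}
        self.views = {}

    def create_group(self, path):
        node = self
        for name in path.split("/"):
            node = node.groups.setdefault(name, Group())
        return node

    def create_view(self, path, buffer, **layout):
        *parents, name = path.split("/")
        node = self.create_group("/".join(parents)) if parents else self
        node.views[name] = View(buffer, **layout)
        return node.views[name]

    def get_view(self, path):
        *parents, name = path.split("/")
        node = self
        for parent in parents:
            node = node.groups[parent]
        return node.views[name]


# One allocation holding interleaved coordinates: x0, y0, x1, y1, x2, y2.
root = Group()
coords = Buffer(6)
x = root.create_view("mesh/coords/x", coords, offset=0, stride=2)
y = root.create_view("mesh/coords/y", coords, offset=1, stride=2)

# Writes through either view land in the same shared buffer.
x[1] = 3.0
y[2] = 4.0
print(list(coords.data))
```

Because each view stores only a description (offset, stride, count) of data owned elsewhere, a package that wants separate `x` and `y` arrays and one that wants a single interleaved array can both work on the same memory. Associating data with particular memory spaces is outside the scope of this toy.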
Sidre enables adoption of a common set of conventions to describe meshes and field data in mesh-based physics codes. These conventions are named the Mesh Blueprint and are being codeveloped with the Conduit project in WCI, which provides complementary capabilities for in-core data description. Supporting common “mesh-aware” data conventions will allow physics packages, libraries, and tools developed independently to interoperate and share data easily by programming to a single interface. Data described by application components and managed by Sidre will have sufficient context that a general parallel I/O component, for example, will be able to write files in different data formats that can be understood by various externally developed postprocessing tools. The same idea applies for tools that perform in-memory operations, such as visualization and analysis, and mesh and geometry queries and manipulation.
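To make the benefit of shared “mesh-aware” conventions concrete, the following sketch describes a tiny mesh as a nested dictionary and then passes it to two independent writers. The key names (`coordsets`, `topologies`, `fields`, and so on) are illustrative assumptions loosely modeled on the kind of convention described above and may not match the actual Mesh Blueprint; the functions `counts`, `check_fields`, `to_json`, and `to_csv` are likewise hypothetical.

```python
import json

# Hypothetical mesh description: two quads sharing an edge, plus one field.
mesh = {
    "coordsets": {
        "coords": {
            "type": "explicit",
            "values": {
                "x": [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
                "y": [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
            },
        }
    },
    "topologies": {
        "mesh": {
            "type": "unstructured",
            "coordset": "coords",
            "elements": {
                "shape": "quad",
                "connectivity": [0, 1, 4, 3, 1, 2, 5, 4],
            },
        }
    },
    "fields": {
        "temperature": {
            "association": "vertex",
            "topology": "mesh",
            "values": [300.0, 310.0, 320.0, 305.0, 315.0, 325.0],
        }
    },
}

VERTICES_PER_SHAPE = {"tri": 3, "quad": 4}


def counts(mesh, topology_name):
    """Return (vertex count, element count) using only the shared key names."""
    topology = mesh["topologies"][topology_name]
    coordset = mesh["coordsets"][topology["coordset"]]
    num_vertices = len(coordset["values"]["x"])
    elements = topology["elements"]
    per_element = VERTICES_PER_SHAPE[elements["shape"]]
    return num_vertices, len(elements["connectivity"]) // per_element


def check_fields(mesh):
    """Raise ValueError if a field's length disagrees with its association."""
    for name, field in mesh["fields"].items():
        num_vertices, num_elements = counts(mesh, field["topology"])
        expected = num_vertices if field["association"] == "vertex" else num_elements
        if len(field["values"]) != expected:
            raise ValueError(f"field {name!r}: expected {expected} values")


def to_json(mesh):
    """One output format: the whole description as JSON text."""
    return json.dumps(mesh, indent=2)


def to_csv(mesh, field_name):
    """Another output format: one row per vertex for a vertex-centered field."""
    field = mesh["fields"][field_name]
    if field["association"] != "vertex":
        raise ValueError("to_csv handles vertex-centered fields only")
    topology = mesh["topologies"][field["topology"]]
    values = mesh["coordsets"][topology["coordset"]]["values"]
    axes = sorted(values)
    rows = [",".join(axes + [field_name])]
    for i, value in enumerate(field["values"]):
        rows.append(",".join(str(values[a][i]) for a in axes) + f",{value}")
    return "\n".join(rows)


check_fields(mesh)
print(counts(mesh, "mesh"))
print(to_csv(mesh, "temperature"))
```

Neither writer knows which physics package produced the data; each relies only on the agreed key names. That is the sense in which a general I/O or analysis component can serve many codes by programming to a single interface.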