Highlights include the HYPRE library, recent data science efforts, the IDEALS project, and the latest on the Exascale Computing Project.
Topic: HPC Systems and Software
Apollo, an auto-tuning extension of RAJA, improves performance portability in adaptive mesh refinement, multi-physics, and hydrodynamics codes via machine learning classifiers.
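Apollo's own interfaces are not shown here, but the general idea can be illustrated with a minimal, hedged sketch: a RAJA loop whose execution policy is chosen at run time by a stand-in classifier (here just a decision stump on loop length; Apollo would use a trained model and richer features). The sketch assumes a RAJA build with OpenMP enabled.

    // Minimal sketch (not Apollo's API): pick a RAJA execution policy at run
    // time using a trivial stand-in "classifier".
    #include "RAJA/RAJA.hpp"

    // Stand-in for a trained classifier: a decision stump on loop length.
    enum class Policy { Sequential, OpenMP };

    Policy classify(std::size_t n) {
      return (n < 4096) ? Policy::Sequential : Policy::OpenMP;
    }

    void daxpy(double a, const double* x, double* y, std::size_t n) {
      auto range = RAJA::RangeSegment(0, static_cast<RAJA::Index_type>(n));
      auto body  = [=](RAJA::Index_type i) { y[i] += a * x[i]; };
      if (classify(n) == Policy::Sequential) {
        RAJA::forall<RAJA::seq_exec>(range, body);              // small loops
      } else {
        RAJA::forall<RAJA::omp_parallel_for_exec>(range, body); // large loops
      }
    }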
Large Linux data centers require flexible system management. At Livermore Computing, we are committed to supporting our Linux ecosystem at the high end of commodity computing.
This project's techniques reduce bandwidth requirements for large unstructured data by applying data compression and optimizing data layout for better locality and cache reuse.
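A hedged illustration of the layout side of this idea (not the project's actual code): reorder unstructured point data along a Morton, or Z-order, key so that spatial neighbors become memory neighbors, which also tends to help compression because adjacent values are similar.

    // Illustrative sketch only: sort unstructured 2D point data by a Morton
    // (Z-order) key to improve locality for later sweeps over the data.
    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Interleave the bits of two 16-bit coordinates into a 32-bit Morton key.
    static std::uint32_t morton2d(std::uint16_t x, std::uint16_t y) {
      auto spread = [](std::uint32_t v) {
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
      };
      return (spread(y) << 1) | spread(x);
    }

    struct Point { std::uint16_t x, y; double value; };

    void reorder_for_locality(std::vector<Point>& pts) {
      std::sort(pts.begin(), pts.end(), [](const Point& a, const Point& b) {
        return morton2d(a.x, a.y) < morton2d(b.x, b.y);
      });
    }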
Researchers are developing a standardized and optimized operating system and software for deployment across Linux clusters to enable HPC at a reduced cost.
LLNL’s Stack Trace Analysis Tool helps users quickly identify errors in code running on today’s largest machines.
ROSE, an open-source project maintained by Livermore researchers, provides easy access to complex, automated compiler technology and assistance.
New platforms are improving big data computing on Livermore’s high performance computers.
LLNL researchers are finding that some factors are more important in determining HPC application performance than traditionally thought.
Livermore computer scientists have helped create a flexible framework that aids programmers in creating source code that can be used effectively on multiple hardware architectures.
LLNL computer scientists use machine learning to model and characterize the performance of adaptive applications and, ultimately, to accelerate their development.
Livermore Computing staff is enhancing the high-speed InfiniBand data network used in many of its high performance computing systems and file systems.
Caliper enables users to build customized performance measurement and analysis solutions by connecting independent context annotations, measurement services, and data processing services.
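A minimal annotation sketch using Caliper's C/C++ macro API marks a function and a nested region; measurement services attached at run time attribute costs to these annotations. With a recent Caliper build, a report can typically be requested at launch time, for example through the CALI_CONFIG environment variable, without recompiling.

    // Annotate a function and a loop region with Caliper.
    #include <caliper/cali.h>

    double sum(const double* x, int n) {
      CALI_CXX_MARK_FUNCTION;        // marks the whole function as a region
      double s = 0.0;
      CALI_MARK_BEGIN("sum_loop");   // marks a nested, named region
      for (int i = 0; i < n; ++i) s += x[i];
      CALI_MARK_END("sum_loop");
      return s;
    }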
Spindle improves the library-loading performance of dynamically linked HPC applications by plugging into the system’s dynamic linker and intercepting its file operations.
PnMPI is a thin, low-overhead wrapper library that is automatically generated from the mpi.h header file and that can be linked by default.
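PnMPI builds on the standard PMPI profiling interface: each tool module wraps MPI calls and forwards them on. A minimal hand-written wrapper in that style (not PnMPI's generated code) looks like this:

    // A PMPI-style wrapper of the kind PnMPI loads and stacks: intercept
    // MPI_Send, do the tool's work, then forward to the real implementation.
    #include <mpi.h>
    #include <cstdio>

    extern "C" int MPI_Send(const void* buf, int count, MPI_Datatype type,
                            int dest, int tag, MPI_Comm comm) {
      std::printf("sending %d element(s) to rank %d\n", count, dest);
      return PMPI_Send(buf, count, type, dest, tag, comm);  // forward the call
    }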
Performance analysis of parallel scientific codes is difficult. The HAC model allows direct comparison of performance data across domains using the data visualization and analysis tools available in those domains.
Fast Global File Status (FGFS) is an open-source package that provides scalable mechanisms and programming interfaces to retrieve global information about a file.
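FGFS has its own programming interfaces; as a rough, hedged sketch of the kind of question it answers, the fragment below checks whether a path resolves to the same (device, inode) pair on every MPI rank. Real global file status requires more care, since device numbers are not guaranteed to match across nodes even for a shared file system.

    // Rough sketch (not the FGFS API): do all ranks see the same file?
    #include <mpi.h>
    #include <sys/stat.h>

    bool globally_consistent(const char* path, MPI_Comm comm) {
      struct stat st {};
      long long key[2] = {-1, -1};
      if (stat(path, &st) == 0) {
        key[0] = static_cast<long long>(st.st_dev);
        key[1] = static_cast<long long>(st.st_ino);
      }
      long long lo[2], hi[2];
      MPI_Allreduce(key, lo, 2, MPI_LONG_LONG, MPI_MIN, comm);
      MPI_Allreduce(key, hi, 2, MPI_LONG_LONG, MPI_MAX, comm);
      return lo[0] == hi[0] && lo[1] == hi[1];  // identical on every rank
    }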
MPI_T is a tools interface introduced in version 3.0 of the MPI standard. It provides mechanisms for tools to access and set the performance and control variables exposed by an MPI implementation.
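A small example of the interface: initialize MPI_T and list the control variables the MPI library exposes (names and counts vary by implementation).

    // List the control variables (cvars) an MPI implementation exposes.
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
      int provided = 0, num_cvars = 0;
      MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
      MPI_T_cvar_get_num(&num_cvars);
      for (int i = 0; i < num_cvars; ++i) {
        char name[256], desc[256];
        int name_len = sizeof(name), desc_len = sizeof(desc);
        int verbosity, binding, scope;
        MPI_Datatype dtype;
        MPI_T_enum enumtype;
        MPI_T_cvar_get_info(i, name, &name_len, &verbosity, &dtype, &enumtype,
                            desc, &desc_len, &binding, &scope);
        std::printf("cvar %d: %s\n", i, name);
      }
      MPI_T_finalize();
      return 0;
    }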
Cram lets you run many small MPI jobs within a single large MPI job by splitting MPI_COMM_WORLD into many small communicators, with each job in the cram file running independently.
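Cram's packing workflow is not reproduced here, but the underlying communicator-splitting idea is easy to sketch: assign each rank a color, split MPI_COMM_WORLD, and let each small job use its sub-communicator wherever it would have used MPI_COMM_WORLD. The run_small_job call is a hypothetical placeholder.

    // Split MPI_COMM_WORLD into fixed-size sub-communicators, one per "job".
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      int world_rank, ranks_per_job = 4;
      MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

      MPI_Comm job_comm;
      int color = world_rank / ranks_per_job;   // which small job this rank joins
      MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &job_comm);

      int job_rank, job_size;
      MPI_Comm_rank(job_comm, &job_rank);
      MPI_Comm_size(job_comm, &job_size);
      std::printf("job %d: rank %d of %d\n", color, job_rank, job_size);
      // run_small_job(job_comm, color);  // hypothetical: the job uses job_comm
      //                                  // in place of MPI_COMM_WORLD

      MPI_Comm_free(&job_comm);
      MPI_Finalize();
      return 0;
    }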
libMSR provides a convenient interface for accessing model-specific registers (MSRs), allowing tools to utilize their full functionality.
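For context, the sketch below reads an MSR through the Linux msr driver, the low-level mechanism a library like libMSR wraps; it is not libMSR's own interface, and it requires the msr kernel module plus sufficient privileges.

    // Read one MSR on one CPU via /dev/cpu/<N>/msr.
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>

    bool read_msr(int cpu, std::uint32_t reg, std::uint64_t* value) {
      char path[64];
      std::snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
      int fd = open(path, O_RDONLY);
      if (fd < 0) return false;
      // MSRs are 64 bits wide; the register number is used as the file offset.
      bool ok = pread(fd, value, sizeof(*value), reg) ==
                static_cast<ssize_t>(sizeof(*value));
      close(fd);
      return ok;
    }

    int main() {
      std::uint64_t tsc = 0;
      if (read_msr(0, 0x10, &tsc))   // 0x10 = IA32_TIME_STAMP_COUNTER
        std::printf("TSC on cpu 0: %llu\n", (unsigned long long)tsc);
      return 0;
    }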
Veritas provides a method for validating proxy applications to ensure that they capture the intended characteristics of their parents.
Application-level resilience is emerging as an alternative to traditional fault-tolerance approaches because it can provide the same protection at lower cost.
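One common application-level pattern, shown here as a generic, hedged sketch rather than any particular LLNL code: keep an in-memory copy of solver state, check each iteration's result, and roll back and retry locally instead of restarting the whole job. The do_iteration kernel is a stand-in.

    // Local rollback-and-retry as an application-level resilience pattern.
    #include <cmath>
    #include <vector>

    // Hypothetical solver step: stands in for the application's real kernel.
    void do_iteration(std::vector<double>& x) {
      for (double& v : x) v = 0.5 * (v + 1.0);
    }

    // Detect obviously corrupted state, e.g. NaNs or infinities.
    bool state_is_sane(const std::vector<double>& x) {
      for (double v : x)
        if (!std::isfinite(v)) return false;
      return true;
    }

    void solve(std::vector<double>& x, int iterations) {
      std::vector<double> snapshot = x;   // cheap in-memory checkpoint
      for (int it = 0; it < iterations; ++it) {
        do_iteration(x);
        if (!state_is_sane(x)) {
          x = snapshot;                   // roll back to the last good state
          do_iteration(x);                // retry once
        }
        snapshot = x;                     // commit the (re)computed state
      }
    }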
This tool automatically diagnoses performance and correctness faults in MPI applications, identifying abnormal MPI tasks and code regions and finding the least-progressed task.
Working on world-class supercomputers at a U.S. national laboratory was not what Edgar Leon, a native of Mexico, envisioned when he began preparing for university.
These techniques emulate the behavior of anticipated future architectures on current machines to improve performance modeling and evaluation.