Topic: HPC Systems and Software

The sheer size of data poses significant problems in all stages of the visualization pipeline, from offline pre-processing of simulation data, to interactive queries, to real-time rendering. Moreover, visualization data is often unstructured in nature, which further complicates its management and representation. The goal of this project is to develop techniques for reducing bandwidth requirements for large unstructured data, both explicitly, by making use of data compression, and implicitly, by optimizing the layout of the data for better locality and cache reuse.

Project
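
As a rough illustration of the implicit approach (reordering data for better locality), the sketch below sorts 2D points along a Morton (Z-order) curve so that points close in space tend to sit close in memory. This is a minimal sketch under assumptions: the blurb above does not say which layout the project uses, Morton ordering is just one common choice, and the names `part1by1`, `morton2d`, and `reorder_by_locality` are hypothetical.

```python
def part1by1(n):
    """Spread the low 16 bits of n so a zero bit separates each original bit."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x, y):
    """Interleave the bits of x and y (x in even positions, y in odd positions)."""
    return part1by1(x) | (part1by1(y) << 1)


def reorder_by_locality(points):
    """Return integer (x, y) points sorted along the Z-order curve.

    Illustrative only: a real unstructured mesh would first quantize
    vertex coordinates to integers before computing the keys.
    """
    return sorted(points, key=lambda p: morton2d(p[0], p[1]))
```

For example, `reorder_by_locality([(1, 1), (0, 0), (0, 1), (1, 0)])` visits the four cells of a 2x2 block in Z order before moving on to any neighboring block.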

Livermore builds an open-source community around its award-winning HPC package manager.

Project

Researchers have been developing a standardized and optimized operating system and software for deployment across a series of Linux clusters to enable high performance computing at a reduced cost.

Project

LLNL’s Stack Trace Analysis Tool helps users quickly identify errors in code running on today’s largest machines.

Project

ROSE, an open-source project maintained by Livermore researchers, provides easy access to complex, automated compiler technology and assistance.

Project

New platforms are improving big data computing on Livermore’s high performance computers.

Project

LLNL researchers are finding that some factors are more important in determining HPC application performance than traditionally thought.

Project

Livermore computer scientists have helped create a flexible framework that aids programmers in creating source code that can be used effectively on multiple hardware architectures.

Project

LLNL computer scientists use machine learning to model and characterize the performance of adaptive applications and, ultimately, to accelerate their development.

Project

Livermore Computing staff is enhancing the high-speed InfiniBand data network used in many of its high performance computing and file systems.

Project

Computer scientists are incorporating ZFS into their high-performance parallel file systems for better performance and scalability.

Project

Performance analysis of parallel scientific codes is becoming increasingly difficult, and existing tools fall short in revealing the root causes of performance problems. We have developed the HAC model, which allows us to directly compare the data across domains and use data visualization and analysis tools available in other domains.

Project

Fast Global File Status (FGFS) is an open-source package that provides scalable mechanisms and programming interfaces for retrieving global information about a file.

Project

MPI_T is an interface for tools introduced in the 3.0 version of MPI. The interface provides mechanisms for tools to access and set performance and control variables that are exposed by an MPI implementation.

Project

Cram lets you easily run many small MPI jobs within a single, large MPI job. It splits MPI_COMM_WORLD into many small communicators so that each job in the cram file runs independently.

Project
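
The splitting idea can be sketched without MPI by computing, for each rank in the large job, which small job it belongs to and its rank inside that job. In an MPI program these two values would correspond to the `color` and `key` arguments of `MPI_Comm_split`. This is a minimal sketch under assumptions, not Cram's actual implementation: the function name `assign_ranks` and the contiguous block assignment are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def assign_ranks(job_sizes):
    """Map each world rank to (job index, rank within that job).

    job_sizes[i] is the number of processes requested by job i.
    Ranks are handed out in contiguous blocks (an assumption made
    for this illustration).
    """
    ends = list(accumulate(job_sizes))  # exclusive end rank of each job
    total = ends[-1] if ends else 0
    mapping = []
    for world_rank in range(total):
        job = bisect_right(ends, world_rank)  # first job whose end exceeds the rank
        start = ends[job] - job_sizes[job]    # first world rank of that job
        mapping.append((job, world_rank - start))
    return mapping
```

With `job_sizes = [2, 3, 1]`, six world ranks are divided into three groups, and each group sees its own ranks numbered from zero, much as each small job would see its own communicator.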

libMSR provides a convenient interface to access Model Specific Registers and to allow tools to utilize their full functionality.

Project

A comprehensive understanding of the performance behavior of large-scale simulations requires the ability to compile, analyze, and compare measurements and contexts from many independent sources. Caliper, a general-purpose application introspection system, makes that task easier by connecting various independent context annotations, measurement services, and data processing services.

Project

Spindle improves the library-loading performance of dynamically linked HPC applications. It plugs into the system’s dynamic linker and intercepts its file operations so that only one process (or a small number of processes) performs the necessary file operations and shares the results with the other processes in the job.

Project

PnMPI is a thin, low-overhead wrapper library that is automatically generated from the mpi.h file and that can be linked by default.

Project

Veritas provides a method for validating proxy applications to ensure that they capture the intended characteristics of their parents.

Project

AutomaDeD is a tool that automatically diagnoses performance and correctness faults in MPI applications. It has two major functionalities: identifying abnormal MPI tasks and code regions and finding the least-progressed task. The tool produces a ranking of MPI processes by their abnormality degree and specifies the regions of code where faults are first manifested.

Project

Application-level resilience is emerging as an alternative to traditional fault tolerance approaches because it provides fault tolerance at a lower cost.

Project

Working on world-class supercomputers at a U.S. national laboratory was not what Edgar Leon, a native of Mexico, envisioned when he began preparing for university.

People Highlight

To overcome the shortcomings of the analytical and architectural approaches to performance modeling and evaluation, we are developing techniques that emulate the behavior of anticipated future architectures on current machines.

Project

With SCR, jobs run more efficiently, recover more work upon failure, and reduce load on critical shared resources.

Project