AutomaDeD
This tool automatically diagnoses performance and correctness faults in MPI applications. It identifies abnormal MPI tasks and code regions and finds the least-progressed task.
Application-Level Resilience
Application-level resilience is emerging as an alternative to traditional fault tolerance approaches because it provides fault tolerance at a lower cost.
GREMLINs
These techniques emulate the behavior of anticipated future architectures on current machines to improve performance modeling and evaluation.
High Performance Storage System: Taking the long view
A multidecade, multi-laboratory collaboration evolves scalable long-term data storage and retrieval solutions to survive the march of time.
Supercomputing’s critical role in the fusion ignition breakthrough
High performance computing was key to the December 5, 2022, breakthrough at the National Ignition Facility.
Due credit: Sierra, Jade and HPC’s role in Livermore’s fusion ignition breakthrough
Two supercomputers powered the research of hundreds of scientists at Livermore’s NNSA National Ignition Facility, which recently achieved ignition.