Livermore Computing Training Announcement

Title: Tool Training Workshop: Updates and User Training for the MPI tools Vampir and MUST
Dates, Times, and Locations:
Jul 9, 2014
10:00am-noon: B453 R1012 (Black Diamond Room) - Presentations and demos
1:30pm-5:00pm: T1889 Classroom1 - Hands-on training session
Jul 10, 2014
9:00am-noon: B453 R1016 - Individual sessions scheduled through Martin Schulz (schulzm@llnl.gov)
1:30pm-5:00pm: T1889 Classroom2 - Hands-on training session
Presenters: Matthias Mueller, RWTH Aachen
Tobias Hilbrich, TU-Dresden
Joachim Protze, RWTH Aachen
Description: High performance computing system architectures challenge application developers with heterogeneity and increasing system scale. Tools help application developers and system support personnel tune applications for these systems and avoid correctness errors. In this workshop, we present the Vampir tool suite, which provides deep and detailed insights into application performance for various architectures and programming paradigms, and the MUST tool, which provides runtime error detection for MPI applications. After a short introduction to the use cases that these tools target, we will present workflows and advanced features for tool usage at scale, as well as for multi-paradigm applications (e.g., OpenMP-MPI). We will conclude our presentations with a summary of novel features from the last two years and of ongoing development. Most importantly, this includes the monitoring component Score-P that Vampir uses. This component unifies instrumentation and performance measurement for a wide range of tools. Finally, we invite interested application developers and system support groups to discuss the best strategies for using our tools for their specific use cases, and to give us feedback on useful or missing functionality.

Additionally, the presenters will be available in the afternoons of Jul 9 and 10 for hands-on training sessions, and the morning of Jul 10 for individual meetings with application developers and development teams. Please contact Martin Schulz (schulzm@llnl.gov) if you are interested in scheduling such a session.

Fee: None
Registration: No registration is required. Seating is on a first-come, first-served basis.
Questions? Please contact Martin Schulz (schulzm@llnl.gov)
Additional Details: The MUST tool targets the detection of MPI usage errors. Its primary purpose is to help remove errors that manifest as hangs, crashes, or wrong results. The tool also helps in cases where it is unclear whether a defect lies in the application or in a system library, e.g., the MPI library. MUST provides a wide range of correctness analyses that include simple MPI resource usage issues, datatype mismatches, collective consistency analysis, and deadlock detection. A further analysis checks whether communication buffers overlap, that is, whether multiple MPI operations send from or receive into the same memory regions. Such errors can be hard to track down and reproduce in practice. All of MUST's checks are designed for scalability and showed low overhead (usually below 100% increased application runtime) at up to 16,384 processes on a BG/Q system. Ongoing development of MUST considers PGAS-like languages, debugger integration, OpenMP checks (especially for the new OpenMP 4.0 target constructs), and hybrid MPI-OpenMP checks.
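
To make the buffer-overlap check more concrete, the following is a minimal sketch of the underlying idea only, not MUST's implementation. It assumes each pending operation uses one contiguous buffer described by a start address and a byte length; the Buffer type, the labels, and the addresses are hypothetical. The sketch ignores derived datatypes and the distinction between send (read) and receive (write) buffers, both of which a real tool has to handle.

```python
from itertools import combinations
from typing import NamedTuple


class Buffer(NamedTuple):
    label: str   # hypothetical name of the pending MPI operation
    start: int   # start address of the buffer in bytes
    length: int  # size of the buffer in bytes


def overlaps(a: Buffer, b: Buffer) -> bool:
    """Return True if the two byte ranges share at least one byte."""
    if a.length <= 0 or b.length <= 0:
        return False  # an empty buffer cannot overlap anything
    # Half-open ranges [start, start + length) intersect iff each one
    # starts before the other one ends.
    return a.start < b.start + b.length and b.start < a.start + a.length


def find_overlaps(buffers):
    """Return the label pairs of all buffers that overlap pairwise."""
    return [(a.label, b.label)
            for a, b in combinations(buffers, 2)
            if overlaps(a, b)]


# Hypothetical set of buffers used by operations that are still in flight.
pending = [
    Buffer("irecv#1", 0, 64),    # bytes 0..63
    Buffer("isend#2", 32, 64),   # bytes 32..95, overlaps irecv#1
    Buffer("irecv#3", 96, 32),   # bytes 96..127, touches isend#2 but is disjoint
]

conflicts = find_overlaps(pending)
print(conflicts)
```

The pairwise comparison is quadratic in the number of pending operations; it is chosen here for readability, not for the scale discussed above.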

The Vampir tool suite serves for performance analysis. The tool visualizes the behavior of massively parallel applications to highlight bottlenecks and inefficiencies. Basic profiling information guides the tool user towards interesting spots, and detailed timeline views then provide an understanding of why the application exhibits an inefficiency. This information provides application developers and maintainers with input for performance optimization. Repeated runs with optimized codes then highlight the effects and efficacy of the individual optimizations. Vampir uses a post-mortem approach: the monitoring component Score-P captures application behavior at runtime, and the visualization component Vampir/VampirServer visualizes this data after the application run. The long, community-driven development of the monitor Score-P yields a wide range of features that include different instrumentation types, hardware performance counter support, energy counter support, and support for multiple programming paradigms (MPI, OpenMP, CUDA, ...). Scalability features of both Score-P and Vampir enable performance analysis with tens of thousands of processes; frontier use cases have applied Vampir with 200,000 MPI processes. Ongoing development of Vampir considers further paradigms such as OpenSHMEM/GASPI, support for OpenMP 4.0 constructs, automatic critical path analysis/visualization, sampling-based tracing, and trace comparison.

The measurement component Score-P and MUST are open source projects, and licenses for all features of Vampir are available. Installations of all tools are available on Tri-Lab clusters and on virtual machines for demonstration purposes.