Livermore Computing Workshop Announcement

Workshop Title: Parallel Performance Evaluation Using TAU (EC3528)
Date/Time: Sep 12-13, 2012, 9:00am - 5:00pm
Instructor: Sameer Shende, University of Oregon
Description: To help computational scientists evaluate the performance of their parallel scientific applications, we present five parallel performance evaluation tools: TAU, PAPI, Scalasca, OTF, and Vampir/VNG. This two-day workshop covers performance evaluation of applications on Tri-lab OCF platforms, with a focus on performance data collection, analysis, and optimization. After describing and demonstrating how performance data (both profile and trace data) can be collected in a straightforward manner using the automated instrumentation in TAU (Tuning and Analysis Utilities), the bulk of the workshop will cover how to analyze the collected data and drill down to find performance bottlenecks and determine their causes. The workshop will include sample codes that illustrate the different instrumentation and measurement choices available to users. Topics include generating performance profiles and traces with memory utilization and headroom, I/O, and hardware performance counter data using PAPI. Hardware counter data can show not only which routines take the most time but also why: for example, because of cache misses, TLB misses, excess address arithmetic, or poor branch prediction. Automated analysis of trace data using the Scalasca tool, in conjunction with TAU, can help find and determine the causes of communication inefficiencies such as excessive communication blocking. We will demonstrate scalable tracing using OTF and visualization using the Vampir and VNG trace visualizers. Performance data analysis using ParaProf and PerfExplorer will be demonstrated using the performance data management framework (PerfDMF), which includes TAU's performance database. The workshop will also feature cross-experiment analysis, including comparing the effects of multi-core architectures on code performance.
We will attempt to collect and analyze performance data for additional user codes during the hands-on portion of the workshop. Users and developers are welcome to contact the instructor ahead of time to begin collecting data so as to have it on hand for the workshop. The workshop will include demonstrations on the IBM BlueGene/Q platform.
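
As a rough illustration of how hardware counter data can point to why a routine is slow, the sketch below derives an L1 data cache miss rate and instructions per cycle from per-routine counter totals. The counter names are PAPI preset event names; the routine names, the counter values, and the derived_metrics function are hypothetical, made up for illustration, and are not part of TAU or PAPI.

```python
def derived_metrics(counters):
    """Turn raw counter totals for one routine into derived metrics."""
    return {
        # Fraction of L1 data cache accesses that missed.
        "l1_miss_rate": counters["PAPI_L1_DCM"] / counters["PAPI_L1_DCA"],
        # Instructions completed per cycle; low values suggest stalls.
        "ipc": counters["PAPI_TOT_INS"] / counters["PAPI_TOT_CYC"],
    }


# Hypothetical per-routine counter totals (illustrative values only).
profile = {
    "compute_flux": {
        "PAPI_L1_DCM": 250_000,
        "PAPI_L1_DCA": 1_000_000,
        "PAPI_TOT_INS": 2_000_000,
        "PAPI_TOT_CYC": 4_000_000,
    },
    "update_halo": {
        "PAPI_L1_DCM": 10_000,
        "PAPI_L1_DCA": 1_000_000,
        "PAPI_TOT_INS": 3_000_000,
        "PAPI_TOT_CYC": 2_000_000,
    },
}

for name, counters in profile.items():
    m = derived_metrics(counters)
    print(f"{name}: L1 miss rate {m['l1_miss_rate']:.1%}, IPC {m['ipc']:.2f}")
```

In this made-up data, a routine with a high miss rate and a low IPC would be a candidate for cache-related tuning, whereas time alone would only show that it is expensive.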

Hands-on sessions: Attendees may use their own Livermore Computing (LC) accounts on clusters such as hera, cab, sierra, udawn, rzmerl, and rzuseq. If you do not have an account on an LC cluster, you can use a temporary workshop account provided during the workshop, or you can request an account through the LC Hotline. If you are interested in using these tools on a BG/Q machine, you will need an account on rzuseq, which can likewise be requested through the LC Hotline.

Agenda: Day 1
  • Introduction to TAU
  • Instrumentation: PDT, MPI, OpenMP, DyninstAPI
  • I/O and memory evaluation
  • Hands-on
  • PAPI
  • Hands-on using loop-level instrumentation and PAPI
Day 2
  • Introduction to analysis tools: ParaProf, PerfDMF, and PerfExplorer
  • Hands-on
  • Vampir and VNG
  • Scalasca
  • Hands-on
  • Applying performance evaluation tools to user codes
Location: Laboratory Training Center 2, Trailer 1889 (near the West Gate Badge Office). Directions and contact information are available HERE.
Fee: No cost
Level/Prerequisites: Introductory level. A basic understanding of parallel programming with C or Fortran is essential.
Registration: See the "Registration" section below.
Hardcopy: Hardcopy notes will NOT be provided.


Registration

Registration is limited to LLNL employees, students, and collaborators. You must register in advance. Enrollment is limited to 20 attendees because of the number of available workstations.

If you are an LLNL employee:

If you are not an LLNL employee:

Questions? Please call or email Blaise Barney (925-422-2578 /