A team of LLNL computer scientists and a collaborator from Argonne National Laboratory (ANL) won the Best Paper Award at the International Workshop on OpenMP (IWOMP) 2020 in September. Giorgis Georgakoudis, Ignacio Laguna, Tom Scogland (LLNL), and Johannes Doerfert (ANL) accepted the award for their paper, “FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis.”

The paper showcases FAROS, a new Livermore-developed framework that pinpoints compiler optimizations missed because of OpenMP compilation and measures the resulting impact on performance. FAROS grew out of a previously funded Laboratory Directed Research and Development (LDRD) Feasibility Study led by Georgakoudis, who says, “We wanted to understand how parallelism affects compiler optimization, what the impact on performance is, and whether it is feasible to improve it.”

Figure: The workflow of FAROS. The user provides a configuration file for the application under test, and FAROS generates execution time and compiler optimization reports.

FAROS is expected to help improve the performance of LLNL mission-critical scientific applications that use parallel programming models, such as OpenMP and RAJA, to run in parallel on today’s HPC systems. Compilers are critical to this process: their job is to translate application code into a machine-executable program. Georgakoudis explains, “Programming models like OpenMP and RAJA can sometimes make it hard for the compiler to optimize the generated program, which decreases performance. FAROS pinpoints in the source code of the program which optimizations are missing and why.”
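To make the idea of pinpointing missing optimizations concrete, here is a minimal sketch in Python. It is not FAROS's implementation; it assumes that each build (sequential and OpenMP) yields a list of optimization remarks, and it uses a hypothetical, simplified `Remark` record and made-up sample data. It reports the source locations where an optimization pass succeeded in the sequential build but was missed in the OpenMP build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Remark:
    """Hypothetical, simplified optimization remark (not a real tool's format)."""
    kind: str       # "Passed" or "Missed"
    pass_name: str  # name of the optimization pass, e.g. "loop-vectorize"
    file: str       # source file the remark refers to
    line: int       # source line the remark refers to


def missed_only_with_openmp(seq_remarks, omp_remarks):
    """Return (pass, file, line) tuples that passed sequentially but were missed with OpenMP."""
    passed_seq = {(r.pass_name, r.file, r.line)
                  for r in seq_remarks if r.kind == "Passed"}
    missed_omp = {(r.pass_name, r.file, r.line)
                  for r in omp_remarks if r.kind == "Missed"}
    # Only locations present in both sets count as optimizations lost to OpenMP.
    return sorted(passed_seq & missed_omp)


# Made-up remarks for illustration only.
seq = [
    Remark("Passed", "loop-vectorize", "kernel.c", 42),
    Remark("Missed", "inline", "kernel.c", 10),
]
omp = [
    Remark("Missed", "loop-vectorize", "kernel.c", 42),
    Remark("Missed", "inline", "kernel.c", 10),
]

regressions = missed_only_with_openmp(seq, omp)
print(regressions)
```

The sketch matches remarks by source file and line rather than by function name, on the assumption that the compiler may move the body of a parallel region into a separate generated function, which would make function names differ between the two builds.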

Using FAROS on a collection of OpenMP proxy applications and kernels, the team discovered that, in most cases, compiling with OpenMP hinders compiler optimizations, making the single-threaded execution of OpenMP programs up to 2.3 times slower than their sequential equivalents compiled without OpenMP. Guided by FAROS’s analysis, the team overcame the resulting performance issues by manually refactoring the applications to help the compiler optimize parallel code. Georgakoudis adds, “In few but interesting cases, we found that OpenMP compilation enables more compiler optimizations, when parallelism semantics provide more information to the compiler that helps optimization.”
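The slowdown figure quoted above is a ratio of execution times: the single-threaded OpenMP run divided by the sequential run. The following Python sketch shows that arithmetic under stated assumptions. The timings are invented for illustration, and taking the median of repeated runs is a choice made for this example, not a description of how FAROS aggregates its measurements.

```python
from statistics import median


def slowdown(seq_times, omp_single_thread_times):
    """Ratio of median single-threaded OpenMP time to median sequential time.

    A value above 1.0 means the OpenMP build is slower on one thread;
    a value below 1.0 means it is faster.
    """
    return median(omp_single_thread_times) / median(seq_times)


# Invented wall-clock times in seconds, three repetitions per build.
seq_times = [4.0, 4.2, 4.1]    # compiled without OpenMP
omp_times = [6.0, 6.3, 6.15]   # compiled with OpenMP, run on one thread

ratio = slowdown(seq_times, omp_times)
print(f"single-threaded OpenMP slowdown: {ratio:.2f}x")
```

With standard OpenMP runtimes, one way to obtain the single-threaded measurement is to run the OpenMP build with the `OMP_NUM_THREADS` environment variable set to 1.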

As follow-on work, Georgakoudis will lead a recently awarded LDRD Exploratory Research project, titled “Achieving Peak Performance of HPC Applications by Optimizing Parallelism,” which is just getting off the ground. Working with Laguna, Scogland, Chunhua Liao, Markus Schordan, and David Beckingsale, Georgakoudis will investigate novel methods to represent parallelism in compilation to automatically apply compiler optimizations that improve the performance of parallel execution.

IWOMP is an annual workshop dedicated to the promotion and advancement of all aspects of parallel programming with OpenMP. Due to the COVID-19 pandemic, the 2020 workshop was held virtually.

The team's slide deck is available online, and FAROS is available as an open-source repository on GitHub.