MPI_T: Tools for MPI 3.0

Gyan

“Gyan” is a Sanskrit word whose closest English equivalent is “knowledge”. Our tool Gyan offers tool writers just that: insight into an application’s MPI performance through the performance variables internal to an MPI implementation. The set of available performance variables depends on the MPI implementation and may differ between versions of the same MPI library. Gyan currently provides two ways to choose which performance variables to monitor. The user can select a specific performance variable with the environment variable MPIT_VAR_TO_TRACE, or let the tool monitor all performance variables exposed by the MPI implementation in use. The latter avoids the risk of requesting a performance variable that the MPI implementation does not expose. The sections below describe how to build and use Gyan.
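The two selection modes can be pictured with a small C sketch. This is not taken from Gyan's source: the helper name gyan_should_trace is hypothetical, and it assumes that MPIT_VAR_TO_TRACE holds the exact name of a single performance variable and that an unset or empty value means "monitor everything".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Gyan's actual code): decide whether the
 * performance variable called pvar_name should be monitored. */
int gyan_should_trace(const char *pvar_name)
{
    assert(pvar_name != NULL);

    const char *selected = getenv("MPIT_VAR_TO_TRACE");

    /* Unset or empty: monitor every variable the MPI library exposes. */
    if (selected == NULL || selected[0] == '\0')
        return 1;

    /* Assumption: the environment variable names exactly one variable,
     * so only an exact match is monitored. */
    return strcmp(selected, pvar_name) == 0;
}
```

A tool built this way would call the helper once for each variable name reported by the MPI library. A name that the library never reports simply matches nothing, which is why monitoring all variables is the safer default.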

Software

MPI tools on GitHub 

To build the library

cd gyan
make

To start profiling an application, you can do one of two things:

Option 1: Preload the library at run time. Assuming the command to run your application is “srun -n 128 ./application”, run the following instead:

LD_PRELOAD=$GYAN_INSTALL_PATH/libgyan.so srun -n 128 ./application

Option 2: Link the library into your application. To do this, add the following to the link command in your Makefile:

mpicc <list_of_flags> <list_of_object_files> <list_of_other_library_paths> \
    -L$GYAN_INSTALL_PATH -o application -lgyan <list_of_other_libraries>

Output format

If you did not specify a variable to monitor, Gyan will print out a summary of all performance variables exposed by the MPI implementation. The following is a sample output from Gyan:

Currently, the tool starts monitoring the variables in MPI_Init and reads them in MPI_Finalize. Because each variable is read only once, the minimum, maximum, and average reported for each variable in the output above are all the same. In the future, we plan to let users monitor specific MPI collectives and to generate statistics for the performance variables. We welcome your feedback.
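The effect of a single read on the reported statistics can be seen in a short C sketch. It is illustrative only: the type pvar_summary and the function summarize_samples are hypothetical names, and the sketch assumes the minimum, maximum, and average are computed over the values read for one variable during a run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary record; the field names are illustrative only. */
typedef struct {
    double min;
    double max;
    double avg;
} pvar_summary;

/* Summarize n >= 1 samples of one performance variable. */
pvar_summary summarize_samples(const double *samples, size_t n)
{
    assert(samples != NULL && n > 0);

    pvar_summary s = { samples[0], samples[0], 0.0 };
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (samples[i] < s.min) s.min = samples[i];
        if (samples[i] > s.max) s.max = samples[i];
        sum += samples[i];
    }
    s.avg = sum / (double)n;
    return s;
}
```

With one sample, as when a variable is read only in MPI_Finalize, min, max, and avg all equal that sample. The three values can only diverge once a variable is read more than once, for example around individual collectives.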

LLNL-WEB-646166