eGprof: Extending Gprof for Comparative Analysis



Goals

While Gprof is a powerful and widely available performance analysis tool, it lacks some functionality that users often request. In this project we extend Gprof to include some of these features. In particular, we focus on the ability to:

Why Gprof?

New Functionality

The following new features have been integrated into Gprof:

All this new functionality is introduced into Gprof without changing or reducing its original functionality. eGprof can continue to be used as before, and the new features can be activated on demand.

Running Example

Program A

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double ret = 1000.0;

void foo(int it) {
  int i;

  for (i=0; i<it; i++)
    ret+=sqrt(ret)+0.3;
}

int main(int argc, char **argv) {
  int i;
  int it1=atoi(argv[1]);
  int it2=atoi(argv[2]);

  foo(it1);

  for (i=0; i<it2; i++)
    ret+=sqrt(ret)+0.1;

  return (int) ret;
}

Program B

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double ret = 1000.0;

void bar(int it) {
  int i;

  for (i=0; i<it; i++)
    ret+=sqrt(ret)+0.3;
}

int main(int argc, char **argv) {
  int i;
  int it1=atoi(argv[1]);
  int it2=atoi(argv[2]);

  bar(it1);

  for (i=0; i<it2; i++)
    ret+=sqrt(ret)+0.1;

  return (int) ret;
}

Setup

Compile both programs using -pg

Run experiments
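
The two steps above might look as follows. This is a sketch only: the source file names progA.c and progB.c, the use of gcc, and the iteration count N are assumptions, while the binary and profile names follow the listings in this document. The run names are read as ratios of the two arguments (for example, A-1-2 passes N and 2N), which agrees with the timings reported in the profiles. A program built with -pg writes its profile to gmon.out in the current directory when it exits, so each file is renamed before the next run overwrites it.

```shell
# Assumed source file names; -lm links the math library needed for sqrt().
gcc -pg -o progA.out progA.c -lm
gcc -pg -o progB.out progB.c -lm

# Hypothetical base iteration count; pick a value that runs long enough
# to collect a useful number of 0.01 s samples on your machine.
N=100000000

# The programs return (int) ret, so a non-zero exit status is expected.
./progA.out "$N" "$((2 * N))"
mv gmon.out gmon-A-1-2.out

./progA.out "$((2 * N))" "$N"
mv gmon.out gmon-A-2-1.out

./progB.out "$((2 * N))" "$((2 * N))"
mv gmon.out gmon-B-2-2.out
```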

Standard Gprof Output

Program A/Run A-1-2

Options:


$> gprof -b -p progA.out gmon-A-1-2.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 67.47      0.56     0.56                             main
 32.53      0.83     0.27        1   270.00   270.00  foo
$>

 

Program A/Run A-2-1

Options:


$> gprof -b -p progA.out gmon-A-2-1.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 67.61      0.48     0.48        1   480.00   480.00  foo
 32.39      0.71     0.23                             main
$>

 

Program B/Run B-2-2

Options:


$> gprof -b -p progB.out gmon-B-2-2.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 50.00      0.50     0.50        1   500.00   500.00  bar
 50.00      1.00     0.50                             main
$>

 

Comparative Performance Profiles

Comparing Two Runs of the Same Binary

Options:


$> gprof -b -p -u gmon-A-2-1.out progA.out gmon-A-1-2.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 61.11      0.33     0.33                             main
-38.89      0.54    -0.21        0   210.00   210.00  foo
$>

 

Differential Output Format

Options:


$> gprof -b -p -Y -u gmon-A-2-1.out progA.out gmon-A-1-2.out
Flat profile (differences):

Each sample counts as 0.01 seconds.

   %       +self    -self     diff     +self    -self     diff  sym
impact   seconds  seconds  seconds     calls    calls    calls  +-  name

 61.11      0.56     0.23     0.33         -        -        0  XX  main
-38.89      0.27     0.48    -0.21         1        1        0  XX  foo
$>
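
The columns of this format can be reproduced by hand from the two standard flat profiles shown earlier. The sketch below is not part of eGprof. It assumes, based only on the numbers in the listing above, that diff is +self minus -self and that % impact is each diff divided by the sum of the absolute diffs. The function name impact is hypothetical.

```shell
# impact: reads lines of "name +self -self" and prints "%impact diff name".
# Assumption (inferred from the listing, not from the eGprof sources):
#   diff   = +self - -self
#   impact = 100 * diff / sum(|diff|)
impact() {
  awk '
    { name[NR] = $1
      diff[NR] = $2 - $3
      total += (diff[NR] < 0 ? -diff[NR] : diff[NR]) }
    END {
      for (i = 1; i <= NR; i++)
        printf "%6.2f %8.2f  %s\n", (total ? 100 * diff[i] / total : 0), diff[i], name[i]
    }'
}

# Self seconds from run A-1-2 (+self) and run A-2-1 (-self), as listed above.
impact <<'EOF'
main 0.56 0.23
foo  0.27 0.48
EOF
```

For these inputs the sketch prints 61.11 and 0.33 for main, and -38.89 and -0.21 for foo, matching the % impact and diff seconds columns of the listing.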

 

Comparing Two Runs of Different Binaries

Options:


$> gprof -b -p -Y -u gmon-A-2-1.out -U progB.out progA.out gmon-A-1-2.out
Flat profile (differences):

Each sample counts as 0.01 seconds.

   %       +self    -self     diff     +self    -self     diff  sym
impact   seconds  seconds  seconds     calls    calls    calls  +-  name

-44.44       ---     0.48    -0.48         -        1       -1  -X  bar
 30.56      0.56     0.23     0.33         -        -        0  XX  main
 25.00      0.27      ---     0.27         1        -        1  X-  foo
$>
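
In the listing above, the sym +- column reads XX for a symbol found on both sides, X- for a symbol found only on the + side, and -X for a symbol found only on the - side. As an illustration only, and not as a description of how eGprof is implemented, the same classification can be sketched with awk by treating a missing symbol as contributing 0 seconds. The function name sym_diff and the two-column input files are hypothetical.

```shell
# sym_diff PLUS MINUS: each input file holds "name self-seconds" per line.
# Prints "flags diff name", sorted by symbol name.
sym_diff() {
  awk '
    NR == FNR { plus[$1] = $2; seen[$1]; next }   # first file: the + profile
              { minus[$1] = $2; seen[$1] }        # second file: the - profile
    END {
      for (s in seen) {
        # X = symbol present, - = symbol absent, in +/- order
        flag = ((s in plus) ? "X" : "-") ((s in minus) ? "X" : "-")
        printf "%s %5.2f  %s\n", flag, plus[s] - minus[s], s
      }
    }' "$1" "$2" | sort -k3
}

# Self seconds copied from the +self and -self columns of the listing above.
plus=$(mktemp); minus=$(mktemp)
printf 'main 0.56\nfoo 0.27\n' > "$plus"
printf 'bar 0.48\nmain 0.23\n' > "$minus"
sym_diff "$plus" "$minus"
rm -f "$plus" "$minus"
```

With these inputs bar is flagged -X with a diff of -0.48, foo is flagged X- with 0.27, and main is flagged XX with 0.33, as in the listing.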

 

Callgraph Output

Generate a GML File Using the Callgraph Information

Options:


$> gprof -X graphA-1-2 progA.out gmon-A-1-2.out
$> gprof -X graphA-2-1 progA.out gmon-A-2-1.out
$> java -jar yed.jar graphA-1-2.gml
$>

graphA-1-2.gml graphA-2-1.gml

Can Be Combined with Differential Output

Options:


$> gprof -X graphAdiff -u gmon-A-2-1.out progA.out gmon-A-1-2.out
$> gprof -X graphBdiff -u gmon-B-2-2.out -U progB.out progA.out gmon-A-1-2.out
$> java -jar yed.jar graphAdiff.gml
$>

graphAdiff.gml graphBdiff.gml

Exporting Gprof Data to Other Tools

Generic Tabular Output

Options:


$> gprof -p -M 10 progA.out gmon-A-1-2.out
   0.56            67.47            0.83          100.00         main
   0.27            32.53            0.27           32.53         foo
$>
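
Because this output carries no header or explanatory text, it is easy to post-process. The sketch below converts it to CSV with awk. The column meanings (exclusive seconds, exclusive percent, inclusive seconds, inclusive percent, function name) are an assumption inferred from the numbers in this document, and the header names and the function name to_csv are made up for illustration. The input is pasted from the listing above rather than piped from gprof.

```shell
# to_csv: turn whitespace-separated tabular rows into CSV with a header line.
# The column names are assumptions, not taken from eGprof documentation.
to_csv() {
  awk 'BEGIN   { OFS = ","; print "excl_s,excl_pct,incl_s,incl_pct,function" }
       NF == 5 { print $1, $2, $3, $4, $5 }'
}

to_csv <<'EOF'
   0.56            67.47            0.83          100.00         main
   0.27            32.53            0.27           32.53         foo
EOF
```

In practice the function would sit at the end of a pipeline, for example after the gprof -p -M 10 command shown above.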

 

Export to the PerfTrack Performance Database

Options:


$> gprof -p -M 10 -W progA,comp:gcc progA.out gmon-A-1-2.out
Application     Tool    comp    execution   Rank ()   Exclusive time (s)   Inclusive time (s)    Function
progA.out       gprof   gcc     progA       1         0.56                 0.83                  main
progA.out       gprof   gcc     progA       2         0.27                 0.27                  foo
$>

 

More Information

Martin Schulz and Bronis R. de Supinski, Practical Differential Profiling, Proceedings of Euro-Par 2007.

Contact

Martin Schulz, schulzm@llnl.gov

High Performance Computing at LLNL    Lawrence Livermore National Laboratory