TotalView V3.8.0-LLNL

Release Features

October 22, 1998


This document describes the release of TotalView V3.8.0-LLNL, installed on blue/sky and compass/forest on October 14, 1998.

§ Features of Etnus' TotalView release 3.8.0


Below are the features added by LLNL to create TotalView V3.8.0-LLNL.

§ Displaying an Array with a Vector Subscript
§ Displaying Memory Utilization
§ Format
§ Find
§ Performance Monitoring (IBM SP2 only)
§ Trace Points
§ Wait State Graph (modification to release 1)
§ Clear/Eval Commands for Expression and Eval Windows
§ Previous LLNL enhancements to TotalView in release V3X.7.9-LLNL-1
§ A Way to Visualize Data with MeshTV
§ TotalView V3.8.1-1-LLNL Release Features


Documentation: TotalView User's Guide (Acrobat), TotalView User's Guide (PostScript)

Tutorials: Introduction to TotalView and Debugging Parallel Codes.

For more information, contact Bor Chan, Karen Warren, or Rich Zwakenberg.















Displaying an Array with a Vector Subscript

Users can now display those values of an array that are specified by a subscript which itself is a 1-dimensional integer array indexing the array elements.

For example, if the vector k = (/3, 7, 11/), then typing the expression a(k) will display a(3), a(7), and a(11).

This feature is accessible two ways:












Displaying Memory Utilization

During runtime, it is useful to know the status of memory utilization, i.e., how much total memory is being used and how much is available. This information is now available in a process-by-process (MPI-based) view. It is implemented in TotalView using the ps command and thus reports memory utilization in terms of virtual data size, resident real memory size, resident text size, and percentage of real memory being used. TotalView has added the following commands to the Process State Info command:












Format

A Format menu command has been added to the variable window to allow displaying the data in a different format. The data is displayed in the same window in the selected format.


This release looks at only the following types of variables:

The available display formats are:












Find

A Find menu command has been added to the variable window to allow the user to search the displayed data values based on some function.


This release looks at the following types of variables:

The Find functions we have initially implemented are:

Our implementation is as follows. With a variable window displayed, the user selects a Find function from the menu command in the variable window to apply to the displayed values. TotalView searches the entire displayed data array, taking into account the starting index, ending index, and slice.

TotalView then performs what looks like a dive operation in the variable window: the window changes to display the values and indices that satisfy the function, with a caret indicating that a dive has been performed. This window, now called the found window, has a changed title bar describing the function used to create it. The found window's menu does not allow the Find, Format, and visualization commands, and its value fields are not editable.

The found window may contain more than one value and is therefore scrollable. We treat every Find function as multiple valued. For example, max has multiple values because the window displays all indices, with their values, that meet the max criterion.

After displaying a variable window, the following scenarios are how one might use the Find command.

Found windows are not updated when the program reaches an action point. To refresh a found window, select undive in the window and reexecute the Find command.













Performance Monitoring (IBM SP2, AIX 4.3)

Setting up your program to be monitored

Monitoring your whole program

At the beginning of your C program, call the library function:
 
	set_msr(0x4);

Monitoring only parts of a program

To monitor only parts of a C program, place
  
	set_msr(0x4);
  
immediately before the part you want to monitor, and
   
	set_msr(0);

immediately after.

Compile change:

When you compile, you need to put the following at the end of your compile line: 

     /usr/local/lib/PMshrsub.a 

For Fortran programs, you need to make the following start and stop monitoring calls:

   call set_msr(%val(4))

   call set_msr(%val(0))


How to run the monitors in TotalView

Start up TotalView on your program. If it is a parallel job, make sure you are beyond the execution of poe and ready to begin debugging your program. Before you start your program, select the command Open Performance Monitor Display, which will open the Performance Monitor Display Window.

Indicate what kind of events to count

The menu command in the Performance Monitor Display Window allows events to be assigned to counters: Set Counter 1, Set Counter 2, etc. Only one event can be counted on a counter at a time. For each counter, the user selects from that counter's subcommand one of the events allowed on it. In this way, the user can select events for as many counters as allowed without selecting more than one event on a counter or selecting an event not available on that counter. You need not place an event on each counter. Some events take more than one counter, so TotalView will not allow you to use the second counter when such an event is chosen.

The events to be counted will be the same on all processes. We are not allowing different events to be counted in different processes.

A full list of the events can be found in /usr/local/include/PM604e.h

Control display of the performance monitoring results

Use the Sampling Interval command to choose when the information is updated. If you choose an interval, such as 3 seconds, the display is updated automatically at that interval. If you would like it to update only when you request it (by typing u), or to give results only when the program exits, choose the command Sampling Interval -> Only on Update.

Choose ranges or entire program

If you want to monitor the entire program, select the Monitor the entire program box. If you would like to monitor ranges, select the Monitor just defined ranges box and set count/nocnt action points in your program.

Setting up monitoring ranges

We have added a new kind of action point, Start Performance Monitor Counting, displayed as count, and Stop Performance Monitor Counting, displayed as nocnt. Press the right mouse button on the line number where you want the performance monitoring count/nocnt action point. An action point window will pop up. If you would like monitoring to start at this point, select the count action point. If you would like monitoring to stop at this point, select the nocnt action point. Then press OK or hit return.

The count/nocnt action points have the same options as the other action points (breakpoint, barrierpoint, eval point): they can be enabled or disabled, shared or not shared with other processes, and they are saved when action points are saved.

The count/nocnt actions are defined by the dynamic scope of the process. Once a count action is encountered, subsequent count actions are redundant. The first nocnt action encountered stops the monitoring.

Make sure you check "Monitor just defined ranges" in the Performance Monitoring Display Window.

Commit choices - start counters

After you have selected events, select the command Counter Control->Commit in the Performance Monitor Display Window (or type c) to commit those choices and start the counter(s).

Start the program

Type G in the main source window, and watch the counters go. If you chose to update only on request, either wait until the processes exit to get results or type u whenever you want to see the values of the counters.



Implementation is via IBM's Performance Monitor Application Programming Interface (PM API) routines.













Trace Points

A new kind of action point, the trace point, has been defined. The idea is to place a trace point at the beginning of a function. When your program reaches a trace point, TotalView records the process number, the function name, and the function argument names and their values. Execution does not stop; it pauses just long enough to record the values in a trace file and then continues. All trace information is placed into one file; you cannot specify different files for different trace points.

To set a trace point:

After execution, you can inspect the file.
















Wait State Graph (modification to release 1)

The Wait State Graph has been modified to improve readability.













Clear/Eval Commands for Expression and Eval Windows

In the Expression Window, we have made it possible for a user to evaluate an expression completely from the keyboard, never having to use the mouse.

The normal way to evaluate an expression is:

We have added two new commands to the expression window:

Thus to evaluate an expression now, the user types:

These two commands have also been added to the Eval pane of the action point window. When one uses the right mouse button to set a breakpoint, the action point window pops up. When the Eval button is selected, the user can use the ctrl-r command to clear the window and the ctrl-g command in place of selecting the OK button.










A Way to Visualize Data with MeshTV

Users can view data with MeshTV via the expression capability in TotalView. MeshTV is an interactive program for visualizing and analyzing scientific data. Mesh and variable data are written to a file in the SILO format to be visualized by MeshTV.

Users build callable plotting functions into their code to generate the SILO file. When the code reaches a breakpoint (or any stop point), the user can call one of these plotting functions from the expression window to create a SILO file. The user can then invoke MeshTV from another X window to display the plot.

SILO/MeshTV plotting example

Below is the source for a program with a plotting function, mesh_builder. The program calls mesh_builder to generate the SILO file, but the function can also be called directly from within TotalView at any time to generate a SILO file. To do so, choose the command Open Expression Window and, in the expression window, type a call to mesh_builder with the arguments you want plotted. Then, from another xterm window, run MeshTV on the created SILO file.

/********************** C source *************************/
#include <stdio.h>
#include <string.h>
#include <silo.h>

DBfile  *file = NULL;           /* The SILO file pointer */
char    *coordnames[2];         /* Names of the coordinates */
float   nodex[8];               /* The coordinate arrays */
float   nodey[4];
float   *coordinates[2];        /* The array of coordinate arrays */
int     dimensions[2];          /* The number of nodes in each dimension */
int     ndims;                  /* Number of dimensions */
 
void mesh_builder (a,b,dim)
float *a;
float *b;
int *dim;
{
 printf ("In mesh_builder\n");

 /* Create the SILO file */
 file = DBCreate("sample.silo", DB_CLOBBER, DB_LOCAL, NULL, DB_PDB);

 /* Write out the mesh to the file */
 DBPutQuadmesh(file, "test_mesh", coordnames, coordinates, dimensions,
	       ndims, DB_FLOAT, DB_COLLINEAR, NULL);

 /* Plot a variable */
 DBPutQuadvar1(file, "var1", "test_mesh", a, dim, ndims,
               NULL, 0, DB_FLOAT, DB_ZONECENT, NULL);
 DBPutQuadvar1(file, "var2", "test_mesh", b, dim, ndims,
	       NULL, 0, DB_FLOAT, DB_ZONECENT, NULL);
				
 /* Close the SILO file */
 DBClose(file);
}


int main()
{
 float  var1[5][6];             /* Variable to be plotted on mesh */
 float  var2[5][6];             /* Variable to be plotted on mesh */
 int    dims[2];                /* Dimension of var Variable */
 int    i,j,k;

 ndims = 2;

 /* Name the coordinate axes 'X' and 'Y' */
 coordnames[0] = strdup("X");
 coordnames[1] = strdup("Y");

 /* Give the x coordinates of the mesh */
 nodex[0] = -1.1;
 nodex[1] = -0.1;
 nodex[2] =  1.3;
 nodex[3] =  1.7;
 nodex[4] =  1.9;
 nodex[5] =  2.1;
 nodex[6] =  2.3;
 nodex[7] =  2.7;

 /* Give the y coordinates of the mesh */
 nodey[0] = -2.3;
 nodey[1] = -1.2;
 nodey[2] =  0.4;
 nodey[3] =  0.8;

 /* How many nodes in each directions? */
 dimensions[0] = 8;
 dimensions[1] = 4;

 /* Assign coordinates to coordinates array */
 coordinates[0] = nodex;
 coordinates[1] = nodey;

 for (k=0;k<4;k++)
  {
   printf ("Iteration %d\n",k);

   /* Set values into Variable */
   dims[0] = 5;
   dims[1] = 6;

   for (i=0;i<5;i++)
    for (j=0;j<6;j++)
     var1[i][j] = (k+j)%2;

   for (i=0;i<5;i++)
    for (j=0;j<6;j++)
     var2[i][j] = (k-j)%2;
  }

 printf ("Calling mesh_builder\n");

 /* Write out the mesh and variables to the file */
 mesh_builder(&var1[0][0], &var2[0][0], dims);

 return(0);
}
/******************* end of C source **********************/

MeshTV run

Below is a window dump of a MeshTV run from the above mesh_builder function.
















LLNL Disclaimers

Last revised November 30, 1998