TotalView™
Multiprocess Debugger

Version 4.0.0 January 31, 2000

These Release Notes for TotalView 4.0.0 contain important information about requirements that affect your TotalView 4.0.0 software and license. The platforms described in these notes are listed in Chapter 2: TotalView 4.0 Platforms and System Requirements.

This document describes changes made since the release of TotalView 3.9.0.

The manuals for this release are TotalView User's Guide, Version 4.0.0, January 2000, TotalView CLI Guide, January 2000, and the TotalView Installation Guide, Version 4.0.0, January 2000.


Contents

Chapter 1: New Features
Chapter 2: TotalView 4.0 Platforms and System Requirements

Alpha Compaq Tru64 UNIX

SGI IRIX 6.x MIPS

Alpha Linux RedHat

Intel x86 Linux RedHat

RS/6000 Power AIX

Sparc SunOS 5 (Solaris 2.x)

Portland Group HPF 2.4 Supported Configurations

Myrinet Support

Chapter 3: TotalView News
Chapter 4: Special IBM Considerations

Pthread Considerations

AIX Patch Considerations

IBM PE Message Queue Display

Forcing 1:1 Thread Scheduling Mode on RS6000 Systems

Chapter 5: License Management
Chapter 6: Problems and Reports

Problems Fixed

Problems on All Platforms

Problems on Compaq Tru64 UNIX (Alpha) platforms

Problems on IRIX6-MIPS Platforms

Problems on RS/6000 Platforms

Problems on SPARC SunOS 5 platforms

Problems in the Portland Group HPF 2.4 Compiler

Problems in Linux

Problems in the CLI

Reporting Problems

How to Contact Us

Chapter 7: Patching Compilers and Operating Systems

Apogee 4.0 Compiler Patch Procedures

Portland Group HPF 2.4 Compiler Patch Procedures

Sun WorkShop 5.0 Compiler Patch Procedures

RS6000 System Patch Procedures


Chapter 1: New Features

TotalView 4.0 has the following new features:


Command Line Interface (CLI) and Scripting Language

TotalView now complements its GUI with a command line interface and a scripting language. You can debug programs entirely from the command line, or in a mode that combines the TotalView GUI with an xterm window in which you enter CLI commands. The scripting language is a Tcl 8.0 interpreter embedded in the CLI, which lets you write your own debugger scripts.


Linux (Intel X86 and Alpha)

TotalView now runs on versions of Linux running on the Intel X86 and Alpha processors. TotalView runs under RedHat 5.2 and RedHat 6.0 Linux. All standard capabilities of TotalView are available on Linux. The following shows the available compilers:

Platform                  Compiler
Intel x86 Linux RedHat    GCC EGCS C, C++, and F77
                          KAI Guide OpenMP C/C++
Alpha Linux RedHat        GCC EGCS C, C++, and F77
                          Compaq Fortran and C (Planned, pending Release from Compaq)


Support for 64-bit Applications on the IBM RS6000

TotalView now allows you to debug 64-bit applications on the RS6000 Power3 platform. A single TotalView image can debug both 32-bit and 64-bit applications in the same debugging session.


KAI Guide Fortran, C, and C++ OpenMP Compilers

TotalView allows you to debug Fortran, C, and C++ OpenMP programs compiled with Kuck and Associates, Inc. (KAI) Guide 3.8 compilers on the RS6000, IRIX6-MIPS, ALPHA, and LINUX-X86 platforms.


SunPro 5.0 Compilers on SUN5

TotalView lets you debug C, C++, and Fortran programs compiled with the SunPro 5.0 compilers.


Fast TotalView Debugger Server Launch

TotalView has increased its efficiency when launching large-scale applications, making the launching of the TotalView Debugger Server faster on all platforms.


Mutex, Condition Variable, R/W Lock, and pthread Key Display on RS6000

TotalView can now display information for mutexes, condition variables, R/W locks, and pthread keys in a separate window, giving you much greater visibility into the state of your threaded programs.


M:N User-Mode Schedule Threads on RS6000

TotalView now supports debugging pthread programs that use M:N user-mode scheduling on the RS6000 platform.


Fast Data Watchpoints on IRIX6-MIPS, LINUX-X86, and SUN5

On the IRIX6-MIPS, LINUX-X86, and SUN5 platforms, TotalView lets you create fast data watchpoints. A data watchpoint triggers a debugging event when a memory location is modified. TotalView also lets you associate an expression with a fast data watchpoint so that the expression is evaluated when the watchpoint triggers.


Advanced Array Data Features

TotalView supports three new powerful features that allow you to easily analyze your array data:


OpenMP THREADPRIVATE Common Block Support on IRIX6-MIPS

On the IRIX6-MIPS platform, TotalView allows you to debug SGI MIPS OpenMP or compiler parallel Fortran programs. TotalView provides access to TASKCOMMON variables and to uplevel variables, handles the #line directives needed to debug at the original source level, and correctly handles parallel regions and do-serial constructs.


License Management Enhancements

TotalView extends licensing to include a single processor.


FLEXlm 6.1

TotalView now uses FLEXlm version 6.1.


Chapter 2: TotalView 4.0 Platforms and System Requirements

To run TotalView on your system, you must have the correct hardware configuration and the correct software installed.

Software requirements are:

Hardware requirements are:

The following table shows the supported platforms and the TotalView version supporting each platform.

Platform Name                  TotalView Version
Alpha Compaq Tru64 UNIX        4.0.0
SGI IRIX 6.x MIPS              4.0.0
Alpha Linux RedHat             4.0.0
Intel x86 Linux RedHat         4.0.0
RS/6000 Power AIX              4.0.0
Sparc SunOS 5 (Solaris 2.x)    4.0.0


Alpha Compaq Tru64 UNIX

The software and hardware requirements for running TotalView 4.0 on Compaq Tru64 UNIX are as follows:

SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
ADDITIONAL REQUIREMENTS:

Specific TotalView 4.0 features have the following additional requirements:

Compiler or Environment       Product
C compiler                    C compilers provided with Compaq Tru64 UNIX V4.0B-F, V5.0
                              GCC EGCS 2.95.2
C++ compiler                  Compaq Tru64 UNIX C++ V6.1, V6.2
                              KAI 3.4
                              GCC EGCS 2.95.2
FORTRAN 77 compiler           Compaq Tru64 UNIX V5.1, V5.2
Fortran 90 compiler           Compaq Tru64 UNIX V5.1, V5.2
OpenMP Fortran compiler       Compaq Tru64 UNIX V5.1, 5.2
                              KAI Guide 3.8
OpenMP C/C++ compiler         KAI Guide 3.8
MPICH version 1.1.1, 1.1.2    MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                              MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
Compaq MPI (DMPI)             Versions 1.8 and 1.9
QSW RMS2                      Running on AlphaServer SC systems
                              Needs specific version; not yet tested by Etnus.
ORNL PVM version 3.4.1        See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM.
Compaq PVM (DPVM)             Versions 1.8 and 1.9

RESTRICTIONS:

None.

For additional information, see Problems on Compaq Tru64 UNIX (Alpha) platforms.


SGI IRIX 6.x MIPS

The software and hardware requirements for running TotalView 4.0 on SGI IRIX 6.x MIPS systems are as follows:

SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
ADDITIONAL REQUIREMENTS:

Specific TotalView 4.0 features have the following additional requirements:

Compiler or Environment       Product
C compiler                    Silicon Graphics MIPSpro 7.2.1 or 7.3
                              GCC EGCS 2.95.2
C++ compiler                  Silicon Graphics MIPSpro 7.2.1 or 7.3
                              KAI 3.4
                              GCC EGCS 2.95.2
FORTRAN 77 compiler           Silicon Graphics MIPSpro 7.2.1 or 7.3 (see RESTRICTIONS below)
Fortran 90 compiler           Silicon Graphics MIPSpro 7.2.1 or 7.3 (see RESTRICTIONS below)
OpenMP Fortran                Silicon Graphics MIPSpro 7.3 (see RESTRICTIONS below)
                              KAI Guide 3.8
OpenMP C/C++ compiler         KAI Guide 3.8
Portland Group HPF 2.4        (see RESTRICTIONS below)
SGI MPI 3.1 or 3.2, which is part of the Message Passing Toolkit (MPT) Release 1.2 or 1.3
                              MPI 3.1 does not support message queue display, but MPI 3.2 does; see RESTRICTIONS below.
                              MPT 1.3 is available from: http://www.sgi.com/Products/Evaluation
                              MPT 1.3 general information is available from: http://www.sgi.com/software/mpt/
                              SGI MPI requires Array Services to be installed and properly configured. Array Services is also available from: http://www.sgi.com/Products/Evaluation
                              See the MPT documentation for the required version of Array Services.
MPICH version 1.1.1, 1.1.2    MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                              MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1        See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM.

RESTRICTIONS:


Alpha Linux RedHat

The software and hardware requirements for running TotalView 4.0 on Linux Alpha systems are as follows:

SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
ADDITIONAL REQUIREMENTS:

Specific TotalView 4.0 features have the following additional requirements:

Compiler or Environment       Product
C compiler                    ccc-6.2.0-8 Compaq C T6.2-235 or later
                              GCC gcc EGCS 2.90.29 (bundled with RedHat 5.2)
                              GCC gcc EGCS 2.91.66 (bundled with RedHat 6.0)
                              GCC gcc EGCS 2.95.2
C++ compiler                  GCC g++ EGCS 2.90.29 (RedHat 5.2)
                              GCC g++ EGCS 2.91.66 (bundled with RedHat 6.0)
                              GCC g++ EGCS 2.95.2
FORTRAN 77 compiler           cfal-1.0.6 Compaq Fortran T1.0-916 or later
                              GCC g77 EGCS 2.90.29 (bundled with RedHat 5.2) (see RESTRICTIONS below)
                              GCC g77 EGCS 2.91.66 (bundled with RedHat 6.0) (see RESTRICTIONS below)
                              GCC g77 EGCS 2.95.2 (see RESTRICTIONS below)
Fortran 90 compiler           cfal-1.0.6 Compaq Fortran T1.0-916 or later
MPICH version 1.1.1, 1.1.2    MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                              MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1        See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM.

RESTRICTIONS:


Intel x86 Linux RedHat

The software and hardware requirements for running TotalView 4.0 on Linux Intel-X86 systems are as follows:

SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
ADDITIONAL REQUIREMENTS:

Specific TotalView 4.0 features have the following additional requirements:

Compiler or Environment       Product
C compiler                    GCC gcc 2.7.2.3 (bundled with RedHat 5.2)
                              GCC gcc EGCS 2.91.66 (bundled with RedHat 6.0)
                              GCC gcc EGCS 2.95.2
C++ compiler                  KAI 3.4
                              GCC g++ EGCS 2.90.29 (bundled with RedHat 5.2)
                              GCC g++ EGCS 2.91.66 (bundled with RedHat 6.0)
                              GCC g++ EGCS 2.95.2
FORTRAN 77 compiler           GCC g77 EGCS 2.90.29 (bundled with RedHat 5.2) (see RESTRICTIONS below)
                              GCC g77 EGCS 2.91.66 (bundled with RedHat 6.0) (see RESTRICTIONS below)
                              GCC g77 EGCS 2.95.2 (see RESTRICTIONS below)
OpenMP C/C++ compiler         KAI Guide 3.8
MPICH version 1.1.1, 1.1.2    MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                              MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1        (See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM.)

Note: Appendix A of the TotalView User's Guide states that PGI compilers are supported in TotalView 4.0. This is incorrect. Due to instability in these compilers, they are not yet supported.

RESTRICTIONS:


Other Linux x86 Platforms

While TotalView has only been tested on the RedHat platform, we know of no reasons why TotalView should fail on other Linux/x86 platforms.

The TotalView executable image is built on RedHat 5.2, and uses the following dynamic libraries:

The only library likely to cause a problem is libbfd. We believe that a more recent version of libbfd also works; to use one, create a symbolic link that makes your installed libbfd available under the name libbfd-2.9.1.0.15.so.0.

We would be interested to hear about your experiences in using TotalView on other Linux/x86 platforms.

OTHER LINUX HINTS

If the source code for the Linux run-time libraries is available on your system, TotalView should be able to display it, provided that it resides in the directory where the debug information says it was compiled. On Red Hat systems, this is /usr/src/bs/BUILD; other systems may vary. Because the Red Hat source RPMs install sources under /usr/src/redhat/BUILD, all that is required is a symbolic link that makes /usr/src/redhat also appear as /usr/src/bs.
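
A minimal sketch of creating that link, assuming a POSIX shell and root access. The function name link_build_tree is hypothetical and not part of TotalView; the two paths in the usage comment are the ones named above. The function refuses to touch an alias that already exists.

```shell
#!/bin/sh
# Hypothetical helper: make the directory $1 also reachable under the name $2.
link_build_tree() {
    real_dir=$1
    alias_path=$2
    if [ ! -d "$real_dir" ]; then
        echo "link_build_tree: $real_dir is not a directory" >&2
        return 1
    fi
    # Never overwrite an existing file, directory, or link.
    if [ -e "$alias_path" ] || [ -L "$alias_path" ]; then
        echo "link_build_tree: $alias_path already exists; leaving it alone" >&2
        return 1
    fi
    ln -s "$real_dir" "$alias_path"
}

# On a Red Hat system, run as root:
#   link_build_tree /usr/src/redhat /usr/src/bs
```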

To find out where your library sources claim to have been compiled, run the following command:

% objdump --stabs library_of_interest | grep SO | head -5 

This produces output similar to the following:

% objdump --stabs /lib/libc.so.6 | grep SO | head -5 
0    SO    0     0     0000000000017a10 9     /usr/src/bs/BUILD/glibc/elf/ 
1    SO    0     0     0000000000017a10 0     soinit.c 
96   SO    0     0     0000000000017a58 954 
97   SO    0     0     0000000000017a60 2340  /usr/src/bs/BUILD/glibc/csu/ 
98   SO    0     0     0000000000017a60 2369  ../sysdeps/unix/sysv/linux/init-first.c

Here you can see that the library was compiled from /usr/src/bs.
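
If the output is long, a small filter can pick out the directories for you. The following is a sketch only, assuming a POSIX shell with awk and the column layout shown above (stab type in field 2, string in field 7); the function name stabs_source_dirs is hypothetical.

```shell
#!/bin/sh
# Read `objdump --stabs` output on stdin and print each distinct absolute
# directory recorded in an SO stab. Lines whose string is a relative file
# name, or that have no string at all, are skipped.
stabs_source_dirs() {
    awk '$2 == "SO" && $7 ~ /^\// { if (!seen[$7]++) print $7 }'
}

# Typical use:
#   objdump --stabs /lib/libc.so.6 | stabs_source_dirs
```

For the sample output above, this prints the two directories under /usr/src/bs/BUILD/glibc.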


RS/6000 Power AIX

The software and hardware requirements for running TotalView 4.0 on RS/6000 Power AIX systems are as follows:

SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
ADDITIONAL REQUIREMENTS:

Specific TotalView 4.0 features have the following additional requirements:

Compiler or Environment       Product
C compiler                    IBM xlc 3.1.3.3, 3.6.0.0
                              GCC EGCS 2.95.2
C++ compiler                  IBM xlC 3.1.3.2, 3.6.0.0
                              KAI 3.4
                              GCC EGCS 2.95.2
FORTRAN 77 compiler           IBM xlf 5.1.0.0, 6.1.0.0
Fortran 90 compiler           IBM xlf90 5.1.0.0, 6.1.0.0
OpenMP Fortran compiler       KAI Guide 3.8
OpenMP C/C++ compiler         KAI Guide 3.8
Portland Group HPF 2.4        See RESTRICTIONS below
Parallel Environment for AIX version 2.2, 2.3, and 2.4
                              See RESTRICTIONS below
MPICH version 1.1.1, 1.1.2    MPICH is available from http://www.mcs.anl.gov/mpi/mpich
                              MPICH patches are available from http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1        See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM

RESTRICTIONS:


Sparc SunOS 5 (Solaris 2.x)

The software and hardware requirements for running TotalView 4.0 on Sparc SunOS 5 (Solaris 2.X) systems are as follows:

SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
ADDITIONAL REQUIREMENTS:

Specific TotalView 4.0 features have the following additional requirements:

Compiler or Environment       Product
C compiler                    WorkShop compiler 4.2, 5.0
                              Apogee 3.1, 4.010
                              GCC EGCS 2.95.2
C++ compiler                  WorkShop compiler 4.2, 5.0 (see RESTRICTIONS below)
                              KAI 3.4
                              Apogee 3.1, 4.010
                              GCC EGCS 2.95.2
FORTRAN 77 compiler           WorkShop compiler 4.2, 5.0 (see RESTRICTIONS below)
Fortran 90 compiler           WorkShop compiler 4.2, 5.0 (see RESTRICTIONS below)
Portland Group HPF 2.4        See RESTRICTIONS below
MPICH version 1.1.1, 1.1.2    MPICH is available from http://www.mcs.anl.gov/mpi/mpich
                              MPICH patches are available from http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1        See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM

RESTRICTIONS:


Portland Group HPF 2.4 Supported Configurations

The following table lists the supported Portland Group HPF 2.4 compiler runtime configurations by platform.

Platform Name                  MPICH   IBM PE   RPM                   SMP
SGI IRIX 6.x MIPS              OK      N/A      OK (May need patch)   OK (May need patch)
RS/6000 Power AIX              N/T     OK       OK (May need patch)   N/S
Sparc SunOS 5 (Solaris 2.x)    OK      N/A      OK (May need patch)   OK (May need patch)

Key:

Myrinet Support

Version 1.1.3 of the Myrinet GM software supports TotalView. (GM is a message-passing system for Myrinet networks. The GM system includes a driver, Myrinet-interface control program, a network mapping program, and the GM API, library, and header files.) You can obtain this software from http://www.myrinet.com/scs/index.html.


Chapter 3: TotalView News

Default TotalView Server Launch String Has Changed

In TotalView 3.9, the default server launch string was:

rsh %R -n "cd %D && tvdsvr -callback %L -set_pw %P -verbosity %V"

In TotalView 4.0, the default server launch string is:

%C %R -n "cd %D && tvdsvr -callback %L -set_pw %P -verbosity %V"

If you have set the default server launch string to something different, you may also want to update your launch string.

The TotalView 4.0 default server launch string uses the new feature %C. A %C in the server launch string expands to the name of the server launch command being used. On most platforms, this is rsh.
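
As a rough illustration of the substitution only (this is not how TotalView implements it), the following POSIX shell sketch replaces %C in a launch string with a command name and leaves the other % fields untouched. The function name expand_launch_command is hypothetical.

```shell
#!/bin/sh
# Illustration only: replace every %C in template $1 with command name $2.
# The other fields (%R, %D, %L, %P, %V) are left for TotalView to expand.
expand_launch_command() {
    printf '%s\n' "$1" | sed "s|%C|$2|g"
}

expand_launch_command '%C %R -n "cd %D && tvdsvr -callback %L -set_pw %P -verbosity %V"' rsh
```

With rsh as the command name, the result is the same string that TotalView 3.9 used as its default.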

Here is how %C is expanded:


Chapter 4: Special IBM Considerations

This chapter discusses the following:


Pthread Considerations

On AIX 4.3.1 and on unpatched AIX 4.3.2 and 4.3.3 systems, TotalView supports debugging pthread programs running in pthread-compatibility mode or pthreads running in system contention scope; that is, each pthread is bound to a kernel thread (1:1 thread scheduling). See Forcing 1:1 Thread Scheduling Mode on RS6000 Systems for information on how to force 1:1 thread scheduling.

On AIX 4.3.2 and 4.3.3 you can apply a patch that will allow you to display mutexes, condition variables, reader-writer locks, and pthread keys in your program. This patch is available for AIX 4.3.2 as:

     "APAR IY02391 -- BACKPORT PTHREAD DEBUG LIBRARY"

However, due to limitations in the pthread and pthread debug libraries, you cannot reliably debug pthread programs in process contention scope (M:N thread scheduling). IBM is working to correct this problem, but meanwhile you should force the pthread debug library to run in system contention scope (1:1 thread scheduling). See Forcing 1:1 Thread Scheduling Mode on RS6000 Systems for information on how to force 1:1 thread scheduling.

If you apply this patch, you may introduce a kernel bug on your system where the ptrace(PT_REATT) system call fails with EPERM when debugging IBM parallel environment (PE) programs. If you are debugging PE programs, we do not recommend applying APAR IY02391.

On AIX 4.3.3, the system already contains the pthread and pthread debug libraries that will allow you to display mutexes, condition variables, reader-writer locks, and pthread keys in your program. However, as mentioned above, you cannot reliably debug pthread programs in process contention scope (M:N thread scheduling), and you should force the pthread debug library to run in system contention scope (1:1 thread scheduling). See Forcing 1:1 Thread Scheduling Mode on RS6000 Systems for information on how to force 1:1 thread scheduling.

For more information, please read "Problems on RS/6000 Platforms".


AIX Patch Considerations

Some patch levels of AIX 4.3.2 contain a kernel bug where the ptrace(PT_REATT) system call fails with EPERM when debugging IBM parallel environment (PE) programs. A patch for this problem is available for AIX 4.3.2 as:

     APAR IY02037 -- HOT: ptrace(PT_REATT...) returns -1 and sets er

Note that this patch is very large: 262 filesets occupying 420 MB. If you apply it, you will effectively upgrade your system to AIX 4.3.3, and you should also apply the following patch:

     APAR IY03550 -- DBX CANNOT SET BREAK POINTS IN DATA SECTION IN

Consequently, we recommend that you obtain the AIX 4.3.3 distribution CD from IBM.

On 64-bit Power3-based RS6000 systems, or any system that has split instruction and data caches, ptrace() fails to copy back the data cache for breakpoints planted in the target program's data space. The TotalView compiled expression evaluator and interpreted expression function call features plant breakpoints in the target program's data space, making these features unusable on 64-bit Power3-based systems. To fix this problem, you should apply:

     APAR IY03550 -- DBX CANNOT SET BREAK POINTS IN DATA SECTION IN

However, Etnus has not yet tested this APAR.

For more information, please read "Problems on RS/6000 Platforms".


IBM PE Message Queue Display

TotalView supports the Message Queue Display (MQD) feature when used with the threaded version of the IBM MPI libraries that are part of the IBM Parallel Environment (PE). PE version 2.2 and the non-threaded PE 2.3 and 2.4 libraries cannot provide TotalView with the information that the MQD feature needs. Automatic Process Acquisition (APA), however, is supported in PE 2.2, 2.3, and 2.4. The following table summarizes TotalView's IBM PE support:

IBM PE version   APA Support?   MQD Support
2.2              Yes            No
2.3              Yes            Threaded MPI only
2.4              Yes            Threaded MPI only


Message Queue Display Debugging Dynamic Library

Each version of the IBM PE library requires a different MQD debugging dynamic library for use with TotalView. This section explains how TotalView chooses the correct MQD debugging dynamic library.

When TotalView recognizes that it is dealing with an IBM MPI code, it searches for an MQD dynamic library that can handle the appropriate IBM MPI implementation. The IBM MPI library declares its debugging compatibility version in an integer global variable named "mpi_debug_version". The following table shows the current set of values:

"mpi_debug_version" value   PE version number
0                           2.3
1                           2.4

When looking for the MQD debugging dynamic library, TotalView looks for a library named libtvibmmpi.so; if it cannot be found, TotalView looks for a library named libtvibmmpi-<n>.so (where <n> is the integer value of the mpi_debug_version variable).
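
The lookup order can be sketched in a few lines of POSIX shell. This is an illustration under stated assumptions, not TotalView's implementation: it checks a single directory that you pass in, and the function name pick_mqd_library is hypothetical.

```shell
#!/bin/sh
# Print the MQD library that the lookup order described above would select
# in directory $1 for mpi_debug_version $2; return 1 if neither name exists.
pick_mqd_library() {
    libdir=$1
    debug_version=$2
    # The unversioned name is tried first, then the versioned one.
    for name in libtvibmmpi.so "libtvibmmpi-$debug_version.so"; do
        if [ -e "$libdir/$name" ]; then
            echo "$libdir/$name"
            return 0
        fi
    done
    return 1
}

# Example (the directory is a placeholder for your shlib directory):
#   pick_mqd_library <installdir>/totalview.<version>/<platform>/shlib 1
```

Running it against your installation's shlib directory is a quick way to see which file a given mpi_debug_version value resolves to.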

By default, the MQD debugging dynamic libraries provided with your TotalView distribution are named:

The TotalView distribution has the appropriate symbolic links:

This ensures that the correct MQD debugging dynamic library is loaded for both POE 2.3 and POE 2.4.

PE 2.4 MQD Configuration

Unfortunately, an error in the IBM PE 2.4.0.0 libraries causes the mpi_debug_version variable to have the value 0 (instead of the correct value 1). You may correct this problem in one of two ways:

But first, verify that you need the fix.

Verifying You Need a Fix

If you have POE 2.4.0.0, you may need to apply an IBM-provided PTF or alter your TotalView installation explicitly so that the MQD debugging dynamic library symbolic links point to the correct version of the library.

Here is how to check if you need to modify your TotalView installation:

  1. Verify the version of PE in use on all your nodes.

    Use the "lslpp -l ppe.poe" command to inspect the version of PE installed on all of your nodes. For example:

       lslpp -l ppe.poe
      Fileset                      Level  State      Description 
      ----------------------------------------------------------------------
    Path: /usr/lib/objrepos
      ppe.poe                    2.4.0.0  COMMITTED  poe Parallel Operating
                                                     Environment
     
    Path: /etc/objrepos
      ppe.poe                    2.4.0.0  COMMITTED  poe Parallel Operating
                                                     Environment
    
    

    You may also use the poe command to check multiple nodes on your SP2, for example:

    $ poe lslpp -l ppe.poe -procs 10 -rmpool 0
  2. Inspect the value of mpi_debug_version on all your nodes. Begin by compiling and running the following program:
     /*
     Test program to check mpi_debug_version.
      
     Compile: mpcc_r -g -o mqdvers mqdvers.c
      
     Run: mqdvers -procs <n> -rmpool <p>
          mqdvers -procs <n> -hfile <hostfile>
     */
      
     #include <stdio.h>
     #include <mpi.h>
      
     extern int mpi_debug_version;
      
     int main(int argc, char **argv)
     {
          MPI_Init (&argc, &argv);
          printf ("mpi_debug_version == %d\n",
               mpi_debug_version);
          MPI_Finalize();
          return 0;
     } 
    
    

    It produces output similar to:

    $ mqdvers -procs 2 -rmpool 0
    0:mpi_debug_version == 0
    1:mpi_debug_version == 0
  3. If your version of PE is "ppe.poe 2.4.0.0" and the value of the variable is mpi_debug_version == 0, then you must apply PTF U462081 or modify your TotalView installation if you want to use the TotalView MQD feature. We strongly recommend that you apply the PTF.

    Patches for POE 2.4 are available through the normal AIX FixDist Web site as follows:

    1. Point your Web browser to http://service.software.ibm.com/support/rs6000/.
    2. Click on the "Downloads" link.
    3. Click on the "General Software Fixes" link.
    4. Click on the "AIX Fix Distribution Service" link.
    5. Click on the "Search by: PTF Number" radio button.
    6. Enter the PTF number "U462081" in the box and click on the "Find Fix" button.
    7. Select the PTF "U462081 - ppe.poe.2.4.0.1" item from the list.
    8. Select your version of AIX.
    9. Click on the "Get Fix Package" button. The following list of fix packages appears:

      Filesets needed for selected item   Information file   Byte size
      ppe.poe.2.4.0.2                     README             4539392

    10. Using your browser, download the file and put it into a directory, for example, "/tmp/xlfpatches". The file is named ppe.poe.2.4.0.2.bff.
    11. Use the AIX "smit" tool to install the patch from the directory.

      If you apply this PTF, you do not need to patch your TotalView installation manually.
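
The decision in step 3 can be written as a small check. This is a sketch in POSIX shell, assuming you have already collected the ppe.poe level and the value printed by the test program; the function name needs_mqd_fix is hypothetical.

```shell
#!/bin/sh
# Return 0 (true) when the combination from step 3 is present:
# ppe.poe level 2.4.0.0 together with mpi_debug_version == 0.
needs_mqd_fix() {
    poe_level=$1
    debug_version=$2
    [ "$poe_level" = "2.4.0.0" ] && [ "$debug_version" -eq 0 ]
}

# Example: read the level from lslpp output laid out as shown in step 1,
# and pass the value that mqdvers printed (0 here).
#   level=$(lslpp -l ppe.poe | awk '$1 == "ppe.poe" { print $2; exit }')
#   if needs_mqd_fix "$level" 0; then echo "apply PTF U462081"; fi
```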

Patching Your TotalView Installation Manually

We strongly recommend that you apply the PTF (above) instead of patching your installation. If you patch your TotalView installation for use with an incorrect PE 2.4.0.0, that installation will no longer support MQD when you use POE 2.3.

If you cannot apply the PTF and you are using only PE 2.4, you or your system administrator must do the following:

$ su
Password: 
# cd <installdir>/totalview.<version>/<platform>/shlib
# rm libtvibmmpi-0.so
# ln -s libtvibmmpi-poe-2.4.so libtvibmmpi-0.so

Consult the TotalView Installation Guide for more information on your TotalView installation.


Forcing 1:1 Thread Scheduling Mode on RS6000 Systems

Due to limitations in the pthread and pthread debug libraries, you cannot reliably debug pthread programs in process contention scope (M:N thread scheduling). IBM is working to correct this problem, but meanwhile you should force the pthread debug library to run in system contention scope (1:1 thread scheduling).

To debug an AIX pthreads program successfully, you must turn off the pthreads scheduler. To do this, you should do all of the following:

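          pthread_t tid;
          pthread_attr_t attr;
          pthread_attr_init (&attr);
          /* Request system contention scope (1:1 thread scheduling). */
          pthread_attr_setscope (&attr, PTHREAD_SCOPE_SYSTEM);
          pthread_create (&tid, &attr, worker_function, 0);
          pthread_attr_destroy (&attr);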

This forces the scheduler to keep exactly as many kernel threads as user threads, and eliminates the problems we have seen with the library.

For more detailed information, see your AIX 4.3 documentation or use the web links to the IBM AIX 4.3 documentation site listed below:


Chapter 5: License Management

The Etnus License Management scheme for TotalView 3.7/3.8/3.9/4.0 and TimeScan 3.0 is new and requires some planning if you want to mix:

Mixing Etnus License Management With Other Software Managed by FLEXlm

We recommend that initially you do not combine Etnus licenses with those of other third party software managed by FLEXlm. At first, it is easiest to keep separate license manager daemons for Etnus licenses and licenses of other third party software managed by FLEXlm. After you know that your Etnus license works, see the latest FLEXlm documentation for guidance in running a single FLEXlm license manager daemon.

Use the procedures described in the TotalView Installation Guide to install the Etnus FLEXlm license management software. Etnus licenses must be served by the Etnus license manager daemon or the latest FLEXlm license manager daemon.

The TCP/IP port number used for the Etnus license manager daemon must be unique and not in use elsewhere. Find port numbers used by other FLEXlm license managers in their license.dat files.

Mixing Etnus License Management With Older Versions of Etnus products

TotalView 3.7/3.8/3.9/4.0 and TimeScan 3.0 licenses cannot be combined with those of older versions of Etnus products. The Etnus licenses must be served by separate Etnus license manager daemons for best results.

Use the procedures described in the TotalView Installation Guide to install the FLEXlm license management software. TotalView 3.7/3.8/3.9/4.0 and TimeScan 3.0 licenses must be served with the license manager provided in their distributions. Older versions of Etnus products must be served by the license manager provided in their distributions.

To run TotalView 3.7/3.8/3.9/4.0 or TimeScan 3.0 with older versions of Etnus products served by the same license manager server machine, you must:

The old and the new FLEXlms install in different directories, so one does not overwrite the other.

In addition, if you want to use both new and old versions of Etnus products, you must include the full pathnames of both license.dat files in your LM_LICENSE_FILE environment variable. For example:

setenv LM_LICENSE_FILE \
     /usr/totalview/flexlm/license.dat:/usr/toolworks/flexlm-6.1/license.dat
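
Before starting the daemons, it can help to confirm that every file named in the variable exists. The following POSIX shell sketch does that; it assumes every entry in the list is a file path, and the function name check_license_files is hypothetical.

```shell
#!/bin/sh
# Print each entry of the colon-separated list $1 that is not an existing
# regular file; return 1 if any entry is missing.
check_license_files() {
    missing=0
    old_ifs=$IFS
    IFS=:
    for license_path in $1; do
        if [ ! -f "$license_path" ]; then
            echo "missing: $license_path"
            missing=1
        fi
    done
    IFS=$old_ifs
    return "$missing"
}

# Bourne-shell equivalent of the setenv command above, followed by the check:
#   LM_LICENSE_FILE=/usr/totalview/flexlm/license.dat:/usr/toolworks/flexlm-6.1/license.dat
#   export LM_LICENSE_FILE
#   check_license_files "$LM_LICENSE_FILE"
```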

To verify your FLEXlm installations:

  1. Start both old and new FLEXlm license manager daemons (lmgrd).
  2. Set your LM_LICENSE_FILE environment variable appropriately, as above.
  3. Run the lmstat command in the new product's FLEXlm directory:

    <installdir>/flexlm-6.1/<platform>/bin/lmstat

    In the FLEXlm status output, look for both license servers UP and both new (toolworks) and old (bbnst) Etnus vendor daemons UP.

  4. Once the license managers are both running, you can run TimeScan and TotalView to verify their installations:

    timescan
    totalview

If you encounter installation problems, please review your procedure and also refer to the Troubleshooting section of the appropriate Etnus User's Guide.


Chapter 6: Problems and Reports

Problems Fixed

Fixed in 4X.0.0-5
435 Shape of assumed-shape dummy argument wrong
1060 SUN5: Symbol table reader IErr. Type tag '!' (0x21) not understood on string ''.
1061 AIX: TV on -bmaxdata SA_FULLDUMP core file won't display stack backtraces for threads
1062 SUN5: Symbol table reader IErr.
1063 Tru64 UNIX: Bpt in large-shared-memory target leads to target SEGV.
1158 IRIX, f90 7.3: FErrSU: Can't find abbrev table entry for debug entry.
1292 IErr: Bitwidth with enum nxm_wake_vals : 3
1393 Totalview IBM SP launch problems
1397 ALPHA: fork/execvp job gets FErr: Forbidden Transition
1398 This MPICH executable crashes totalview on our system.
1402 `totalview date` gets "aix_lookup_symbol_in_load_module: Failed to read the string storageLoading dynamic symbols for date"
1421 Cannot find the dynamic library 'libxmpi.so'
1424 FErr: ::~patchlib_entry_t: destructor called on entry not marked deleted
1429 's' steps over a large block of non-parallelized code w/i a single subr, but a bpt in the block works
1443 TV crashes when finding 'main'
1448 TV 3.9 gets IErr with long directory search paths
1457 SUN5: Can't read type info for "...": Bad virtual table offset. Expected an integer, but found ""
1460 visualizer on alpha dies when processing NaNs, Infinity or Denormalized values
1477 TV doesn't handle realtime signals
1482 RMS 2.x: Unaligned access <tvdmain>; Segmentation fault (core dumped)
1492 SEGV in 3.9.0-1 attempting to save assembler display to file
1532 Problems with pardo001.f Guide Fortran code
1535 File statics given wrong files on AIX when #line used
1584 Internal code generator error: cannot load address of a register
1606 DUNIX: IErr from template problem
1632 AIX: "totalview poe -a a.out -resd no -hostfile hostlist" from login node gets "... poe can't find executable".
1651 xlf90: Character ptr to array of 3 80-char strings displays as a 3-char string.
 
Fixed in 4X.0.0-6
1709 setguid bits set on directories in dist. tarfiles
1710 get rid of carriage returns in Release Notes (and other .txt files?)
 
Fixed in 4X.0.0-7
1052 AIX: Repetitious message: "Function declarations nested too deeply"
1059 AIX: File static variables can't be found.
1450 ALPHA: Can't debug core files, Internal error, and won't display the source for a template function
1496 SUN5, egcs: DBX class tag ':' not understood.
1649 Incorrect value of external variable reported in 64-bit mode
1659 ``FATAL ERROR STARTING UP: ::insert_nlist_symbol reenetered'' debugging KAI KPTS guidec++ OpenMP code on Solaris
1705 TotalView crashes during executable reload
1729 wrong setting of FLEXLM variable in totalview startup script
1774 Irix: Fatal Error from interpreted eval point
 
Fixed in 4.0 Release
266 Online help needs proofreading.
727 F77 w/cmplr directives, MP_SETNUMTHREADS>1, 'u' => FErr:...TTRID
1148 Dive on Duid 1 in Self-Debugging window results in IErr.
1266 Having trouble making PVM work with TotalView 3.9.0
1702 Generated PDF files missing TOC bookmarks and links
1720 Under Linux but NOT Tru64 or Solaris When debugging a module that is in a different directory than its source, the source code is not shown in TotalView until the "Set Search Directory (d)' menu command is used to set the directory to the source
1721 Error in table 2 installation guide
1725 Incomplete installation information in installation guide
1788 ALPHA: 'f' into .so gets IErr on long symbols in STL's std::map
1792 check_event_and_registers: Operation Failed.
1794 flexlm-6.1/alpha/bin/toolworks has Error: Unresolved symbols
1796 SGI, C++: dive on <opaque>* and ... append [10] to type leads to Couldn't find a base type. "OK" to Closest match then leads to Couldn't find a type name.
1802 large executable hangs totalview. "ERROR: Symbol table reading error: no current function" occurs repeatedly.
1803 SUN5: vismain gets "Error: Can't open display"
1807 CDWP expression gets "Error: Identifier i not defined"
1819 AIX: TV hangs at start of target execution
1836 Solaris TV should give user control of signal 36
1840 Please add click-able cross references to TV .pdf docs.
1845 CLI: the "dlist" cmd causes a crash if it is the first cmd given
1847 CLI: SUN5: The 'stty' command does not seem to work
 

Known Problems

The following sections list the problems that have been found.


Problems on All Platforms

All platforms have the following problems:

C++ Exceptions

TotalView does not have full support for C++ exceptions. Single-stepping over code that will throw an exception is problematic and often results in the process running away. To help with this situation, TotalView has been modified to detect when an exception throw is going to occur while single-stepping.

By default, TotalView brings up a dialog box to ask if you wish to stop the process. Answering "No" continues the process. Be aware that if you are stepping within the "try" block, your process may run away. Answering "Yes" stops the process upon entry into the system runtime routine that issues the throw. This is a temporary solution and full C++ exception handling may be provided in a future TotalView release.

This mechanism is available for all supported C++ compilers on the SGI IRIX 6.x, Power AIX, Alpha Compaq Tru64 UNIX, and Sparc SunOS 5 (Solaris 2.x) platforms. (See TotalView 4.0 Platforms and System Requirements.)

If this option is turned off, TotalView does not catch C++ exception throws during single-step operations. As a result, the single-step operation may lose control of the process, which then runs away.

Fortran Arrays Whose Size Changes

The following problem applies only to Fortran arrays whose size changes, and from which you are displaying only a single element, either because you have dived, or because you have used the Variable (v) command from the Function/File/Variable menu with an array index.

When a data pane displays a single element of a Fortran array that has runtime bounds (that is, assumed shape, assumed size, allocatable, or a pointer), and the actual bounds change, the value displayed in the data pane applies to the wrong element in the reshaped array.

To overcome this problem, display the whole array, then dive to the element that you want to see. Alternately, if you select the specific element of interest by setting the slice expression rather than by diving, the correct element always displays, even if the array changes shape.

Evaluation point with a goto and a step

If an evaluation point executes a "goto" statement or an assembly language transfer of control instruction, and you use the step or next command at the line where the evaluation point is enabled, TotalView continues the program and the step or next command does not complete. To regain control, type ^C into the program window.

FLEXlm Hunting For Multiprocessor Features

When FLEXlm reads your license.dat file, it hunts for multiprocessor feature lines when you start a debugging session with more than two processors. The messages:

(toolworks) UNSUPPORTED: "TV/<hdwr>-<OS>/MP/<n>"

which may appear in your license.log, may be safely ignored.
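
If these messages clutter your license.log, you can filter them out when reviewing the file. The following is a minimal sketch, not part of TotalView or FLEXlm. The function name filter_mp_unsupported is hypothetical, and the pattern assumes that each message contains the text shown above, with the feature name in double quotes and a numeric processor count after "/MP/".

```sh
# Hypothetical helper: print a FLEXlm log without the harmless
# multiprocessor-feature messages described above.
# Assumption: each such message contains
#   (toolworks) UNSUPPORTED: "TV/<hdwr>-<OS>/MP/<n>"
# Reads the named files, or standard input if no file is given.
filter_mp_unsupported() {
    grep -v '(toolworks) UNSUPPORTED: "TV/[^"]*/MP/[0-9][0-9]*"' "$@"
}
```

For example, "filter_mp_unsupported license.log" prints every line of the log except the messages that match this pattern.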

fvwm Version 1 Problems

There are problems with the fvwm version 1 window manager. Some users have reported that TotalView triggers bugs in version 1.22d of the fvwm window manager (and presumably in earlier versions, too). However, the last release of fvwm version 1 (release 1.24r) is believed to work correctly with TotalView. If you are using the fvwm version 1 window manager, we therefore recommend that you use release 1.24r. We have not tested any later versions. You can find full details on fvwm at http://fvwm.math.uh.edu/.

Flush Pending Evaluation Command Can Corrupt Target Process

The use of the "Flush Pending Evaluation" command of the expression window may corrupt the target process. When the following three conditions hold:

TotalView shows the thread in an inconsistent state: the target threads are still at the breakpoint inside the function, but the stack backtrace shows it where the expression was invoked. As a result, TotalView may: (a) correctly show the source line where the process really is (from whatever line you invoked the expression); or (b) it may mistakenly show the line of the breakpoint in the function.

Further, if you try to continue the target process, one of the following will happen:

To avoid a crash or a hang, toggle the breakpoint that TotalView is reporting as current (disable it, then reenable it) before continuing the process. However, on SUN5 (and on all other platforms after you have toggled the breakpoint appropriately), if the process was sitting at a breakpoint when you called the function from the expression window, TotalView immediately hits that breakpoint again.

Attaching to Portland Group HPF Jobs Is Not Supported

TotalView does not support attaching to Portland Group HPF jobs. If you attempt to attach to Portland Group HPF jobs, you may not see all of the processes that the job is composed of, and you may not be able to display distributed variables.

MPICH 1.2.0 Cannot Locate libtvmpich.so Library

If you are running MPICH 1.2.0, TotalView cannot find the libtvmpich.so library. Installing patch 4959 (downloadable at http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html) fixes this problem.

Static Functions May Be Invisible When Using KCC

The KCC compiler removes a static variable from the function in which it is declared and places the declaration at file or global scope. It also mangles the name to show that the variable ought to be at function scope. Unfortunately, TotalView does not understand this mangling.


Problems on Compaq Tru64 UNIX (Alpha) platforms

The following are known problems with this platform:

For additional information, see Alpha Compaq Tru64 UNIX.

Thread Debugging Problems On All Versions of Compaq Tru64 UNIX

Because of a bug in the ALPHA thread debugging support on Compaq Tru64 UNIX, the low-level thread hold operation can allow a held thread to run. TotalView uses the low-level thread hold operation to prevent a thread from running when single-stepping another thread.

For example, assume your program has two threads, thread A and thread B. Assume that thread A is stopped at a breakpoint, and thread B is stopped elsewhere but not at a breakpoint. To continue the process (that is, both threads), TotalView must step thread A off the breakpoint. To do this, TotalView holds thread B. Then it unplants the breakpoint where thread A is stopped, sets a temporary breakpoint at the next instruction, and continues the process. Because of the hold thread bug, both thread A and thread B may run even though thread B is held. This means that thread B may miss the real breakpoint and hit the temporary breakpoint instead.

The following behaviors can indicate the presence of this bug:

Using include and #include

If you compile Fortran 90 files with include or #include statements on the Compaq Alpha platform using the Compaq Fortran V5.0 compiler (or earlier), TotalView may show line numbers following the include statement at incorrect lines. This problem is fixed by the Fortran V5.1 compiler.

Anonymous unions Using GNU

The GNU compiler does not output debugging information for members of anonymous unions that are enclosed in other aggregates when using the ECOFF format on the Compaq Alpha. As a result, if you are debugging in such an environment, you will not see such members if you use TotalView to look at a data structure that contains them. Furthermore, the debugging information for the offsets of aggregate members that follow the anonymous union is output incorrectly, so these members will be displayed with incorrect values.

Planting Too Many Action Points Causes Problems

On a V4.0 or later Compaq UNIX system, using one or more TotalView commands that plant a lot of breakpoints results in an error message being displayed when you run, continue, step, or otherwise cause your program to continue or start execution.

Compaq is aware of the problem, but a fix is not yet available.

You can temporarily work around this problem by using dbx to increase the vt_mapentries variable to something like 20,000. For example:

dbx -k /vmunix
assign vm_tune.vt_mapentries=20000
quit

You can also alter vt_mapentries using the sysconfigdb program. Consult the man page for more information.

Setting a Breakpoint In a Large Shared-Memory Target Causes a SEGV

If setting a breakpoint causes the operating system to allocate shared page tables, reading information from these pages can lead to the program getting a SEGV and TotalView exiting with a "resources lost" message. You can avoid this problem by setting the value of ssm-threshold to 0. For example:

#sysconfig -r ipc ssm-threshold=0
ssm-threshold: reconfigured
#sysconfig -q ipc ssm-threshold
ipc:
ssm-threshold = 0

Setting this value to 0 can degrade performance.

This problem has been reported, but a fix is not yet available.


Problems on IRIX6-MIPS Platforms

The following are known problems with this platform:

For additional information, see SGI IRIX 6.x MIPS.

Using #include and -cpp Together in Fortran 90

If source files contain #include statements and are compiled with the -cpp switch on a Fortran 90 program using the MIPSpro compilers, TotalView generates incorrect line numbers. To avoid this problem, use the standard Fortran include statement (without the -cpp switch).

Fortran Arrays With Runtime Bounds Display Problem

Some Fortran arrays with runtime bounds are displayed improperly. Because of a limitation in the debug output produced by the SGI Fortran 90 compilers, this happens for arrays which are the targets of pointers embedded in a user-defined type which has itself been arrayed. Consider the following code:

type array_ptr
     real, dimension (:), pointer :: ap
end type array_ptr
 
type (array_ptr), allocatable, dimension (:) :: arrays
 
allocate (arrays(20))
do i = 1,20
     allocate (arrays(i)%ap(i))
end do

TotalView reports the bounds of the elements arrays%ap incorrectly. Unfortunately, there is nothing we can do in TotalView to overcome the fact that the compiler has generated invalid debug information for the runtime bounds for these elements.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 2 and later for TotalView 3.9 and later.

Inadvertent Single-stepping into System Routines

The single-step commands sometimes step into system routines.

Cannot Find Source Code For System Routines Complaint

TotalView occasionally complains about not being able to find the source code for system routines (such as printf()).

Evaluation System Cannot Access Fortran 90 Up-level Variables

Because of bugs in the SGI F90 7.2.1 and earlier compilers, the evaluation system cannot access F90 up-level variables from EVAL expressions. Those variables are correctly located and displayed in data panes, however.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 1 and later for TotalView 3.9 and later.

Fortran 90 Pointer Variables Not Correctly Identified

F90 pointer variables are not correctly identified as pointers because of incomplete debugging information generated by the compilers. TotalView displays the target data correctly, however.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 2 and later for TotalView 3.9 and later.

Shows Multiple Instances of Virtual Base Classes

Because of SGI 7.1 C++ compiler bugs, when that compiler is generating debugging information, TotalView shows multiple instances of virtual base classes. Normally only one instance is correct, which is the one that is of type pointer to the base class.

Incorrect Lower Bound for Allocatable Arrays

Because of SGI 7.2 F90 compiler bugs, when that compiler is generating debugging information, TotalView (and other debuggers) do not show the correct lower bound and element count for allocatable arrays in modules and pointers in common blocks. This bug has been fixed in the SGI 7.2.1 F90 compiler.

Shows Pointers With Unlimited Bounds With Bound of 1

Because of SGI 7.2.1 F90 compiler bugs, when that compiler is generating debugging information, TotalView shows the target of Cray pointers with unlimited bounds as having an upper bound of 1. Consider the following code:

subroutine test (ixx, n)
     common /sf/ iptr
     pointer (iptr, ita(*))
     ... etc ...
end

In this example, the compiler generates debug information for "ita" that indicates it has an upper bound of 1. This is incorrect because it has an unlimited upper bound.

Cannot Show Target of a Formal Parameter

Because of SGI 7.2.1 F90 compiler bugs, when that compiler is generating debugging information, TotalView cannot show the target of a formal parameter Cray pointer. Consider the following code:

subroutine rex (rp)
     pointer (rp, p(8))
     p(2) = 6.
     P(5) = 3.
     write (*,*) "Should be 6,3 - ",p(2), p(5)
     return
end

In this example, the compiler generates debugging information for "p" without any addressing information.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 1 and later for TotalView 3.9 and later.

Bad Template Names May Be Generated

Because of a compiler bug in the SGI 7.2 and 7.2.1 compilers, bad template type names may be generated for certain template instantiations. This problem is fixed by "Patch 3492: MIPSpro 7.2.1 C++ front-end rollup #4," which is available at the SGI Support Web Site.

KCC Does Not Put Original File Name Into Symbol Table

IRIX KCC code: TotalView fails to put the original file name (before preprocessing) into the symbol table. This prevents you from asking for the file by name until TotalView has processed all the symbols for that file.

If you use the --keep_gen_c option to the KCC compiler, you can use the TotalView command "f" xxx.int.c (where your original source file was xxx.C) to force full symbol processing of that file, after which you'll be able to do "f" xxx.C.

Attaching To SHMEM Jobs Is Not Supported

TotalView does not support attaching to SHMEM jobs. If you attempt to attach to SHMEM jobs, you may not see all of the processes that the job is composed of, and the process leader may not be properly identified, which may cause hangs.

Cray Pointers in Common Blocks Broken

The debugging information generated by SGI 7.3 Fortran compiler for the targets of Cray pointers contained within common blocks contains the wrong address. Here is an example:

common a1(1000)
common /ptrs/ jj,iparray,kk
pointer (iparray,array)
iparray = loc(a1)
end

"array" is a real variable that is the target of the Cray pointer "iparray". Because the address is wrong, TotalView cannot show you the correct values for the "iparray" variable. This bug has been reported to SGI. (The SGI 7.2.1 and earlier versions of the compiler do not have this bug. )

Arrays in "main" Are Not Found Unless Declared in Common

If an array is declared in "main", the SGI MIPSpro 7.3.3 compiler does not create debugging information for the variable. Consequently, TotalView does not know that the array exists. You can work around this problem by placing the array in a common block.


Problems on RS/6000 Platforms

The RS/6000 platform has the following known problems:

For additional information, see RS/6000 Power AIX.

Multithreaded Problems

You may experience some problems when debugging multithreaded programs, because of limitations in the ptrace() operating system call.

The following problems can show up while you are debugging multithreaded applications:

  1. When a thread stops (e.g., hits a breakpoint) all the other threads stop. If any of the other threads stops while in a system call (e.g., read, sleep, select, etc.), however, ptrace() does not allow the debugger to read the thread's registers. As a result, TotalView:
    • cannot display the registers, including the program counter (it does display the stack pointer)
    • cannot show you which system call is being executed
    • cannot single-step using the step or step-over command (return out of function and run to selection do work)
    • cannot display the top stack frame

      If you have a multithreaded application which makes a lot of system calls, this might mean that most of your threads are not fully debuggable whenever one of them stops.

      TotalView shows you which threads are stuck in the kernel by displaying their state as In Kernel (K).

  2. When a thread is created or destroyed, the system does not notify the debugger of this event. As a result, the list of threads displayed by TotalView may be stale when the program is running.

      If the process stops for any reason, TotalView automatically updates the thread list. You can also select Current/Update/Relatives -> Update Process Info (u) to force the thread list to update.

Calling Dynamic Objects From Expression Window

If a routine in a dynamic object is called from the expression window, and if the target routine is never called from the main program, then TotalView refuses to call the routine.

XL Fortran Problems Generating Incorrect Section Numbers

Code compiled with the XL Fortran for AIX compiler Versions 4.1 and 5.1 may contain incorrect section numbers for bstat (.bs) and estat (.es) symbols. TotalView detects any incorrect section numbers and generates a warning in a dialog box for the first such problem only. TotalView notes any additional incorrect section numbers on its message output only. Symptom: common blocks have invalid addresses.

Patches for the 5.1 compilers are available through the normal AIX FixDist Web site as follows:

  1. Point your Web browser to http://service.software.ibm.com/support/rs6000/.
  2. Click on the "Downloads" link
  3. Click on the "Software Fixes" link
  4. Click on the "AIX Fix Distribution Service" link
  5. Click on the "Search by: PTF Number" radio button
  6. Enter the PTF number "U457231" in the box and click on the "Find Fix" button
  7. Select the PTF "U457231 - xlfcmp.5.1.0.2" item from the list
  8. Select your version of AIX
  9. Click on the "Get Fix Package" button. The following list of fix packages appears. You must download all of them:

    Filesets needed for selected item   Information file   Byte size
    xlfcmp.5.1.0.2                      README             18824192
    xlfrte.5.1.0.2                      README             24093696
    xlsmp.rte.1.0.0.1                   README             63488

  10. Using your browser, download each of the files and put them into a common directory, for example, "/tmp/xlfpatches". The three files are named xlfcmp.5.1.0.2.bff, xlfrte.5.1.0.2.bff, and xlsmp.rte.1.1.0.1.bff.
  11. Use the AIX "smit" tool to install all three patches from the directory.

GNU Demangling Problem

Some small C++ programs compiled with the GNU compiler on AIX may not be recognized by TotalView as having been compiled with the GNU compiler. In these cases, TotalView will not demangle various program names. To make TotalView demangle the names in these programs properly, specify -demangler=gnu on the command line.

TotalView Can Hang in Parallel Session Running in the Background

On AIX systems, TotalView can hang if you have a TotalView parallel debug session running in the background in an xterm window, and you type anything in the underlying xterm window while the poe process is stopped. Type the fg command in the TotalView xterm window to clear up this condition.

AIX May Only Create a Partial Core File

Recent versions of AIX (4.1 or later) dump a partial core file by default. In general, a partial core dump contains only enough information to give a stack backtrace for the faulting thread. User data sections as well as some other potentially useful information are only available in a full core dump.

To force a full core dump on AIX, you must set a signal flag with sigaction for the signal that caused the core dump. For example:

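#include <signal.h>

/* bigcore and smallcore are flags defined by your own program. */
struct sigaction act;
act.sa_handler = SIG_DFL;       /* keep the default action (dump core) */
sigemptyset (&act.sa_mask);     /* block no additional signals */
act.sa_flags = 0;
if (bigcore)
     act.sa_flags = SA_FULLDUMP;     /* AIX: request a full core dump */
else if (smallcore)
     act.sa_flags = SA_PARTDUMP;     /* AIX: request a partial core dump */
sigaction (SIGSEGV, &act, 0);

Process Contention Scope Not Supported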

On AIX 4.3.1, 4.3.2, and 4.3.3 systems, TotalView supports debugging pthread programs running in pthread-compatibility mode or pthreads scheduled in "system contention scope," that is, each pthread is bound to a kernel thread (the 1:1 thread scheduling model). TotalView does not support "process contention scope," that is, multiple pthreads scheduled in user mode (M:N thread scheduling model).

On AIX 4.3.1, 4.3.2, and 4.3.3, when using TotalView to debug a program built with libpthreads.a, you must force the 1:1 model using the procedure outlined in Forcing 1:1 Thread Scheduling Mode on RS6000 Systems.

GPFS File System Not Supported

TotalView 4.0 will not work on the current release of the GPFS file system due to limitations in this file system. No file that must be read by TotalView should be stored on a GPFS system until these limitations are corrected.

pthdb_pthread() Returns an Empty pthread List

Sometimes when a process is stopped and the pthdb_pthread() function is used to obtain a list of pthreads, the returned list is empty even when there are pthreads. (TotalView displays a console message saying that there are no more threads.) You can fix this problem by applying the APAR IY06378 patch to your system. The procedure for obtaining and applying patches is described in RS6000 System Patch Procedures.


Problems on SPARC SunOS 5 platforms

The SPARC SunOS 5 and QSW CS-2 platforms have the following known problems:

For additional information, see Sparc SunOS 5 (Solaris 2.x).

Apogee 4.0 compilers must be patched

The Apogee 4.0 compilers on SUN4 and SUN5 require a patch to bring them up to revision level 4.010. Follow the Apogee 4.0 Compiler Patch Procedures.

Breakpoints in thunks may cause crash

Using breakpoints in thunks may lead to unexpected results, including having the target program crash unexpectedly. A thunk is a small linkage routine that connects a subroutine call to the actual subroutines in a dynamic library. The SPARC SunOS 5 dynamic loader modifies the code in the thunks during program execution, which conflicts with TotalView's planting and unplanting of breakpoints. The first time through a thunk, the thunk branches to the dynamic loader, and the dynamic loader modifies the thunk to branch directly to the corresponding dynamic library routine. Subsequent trips through the thunk branch directly to the dynamic library routine.


Problems in the Portland Group HPF 2.4 Compiler

The Portland Group HPF has the following known problems:

The Portland Group HPF compiler generates bad debugging information for TotalView in cases where the compiler needs to generate static initialization subprograms. The symptom of this problem is that some line numbers in the HPF source window are not associated with actual Fortran source, and TotalView either disallows setting breakpoints at some lines or sets the breakpoint in the wrong place. This bug occurs quite often with 'WHERE' constructs.


Problems in Linux

The following problem occurs with Linux X86:

The following problems occur with Linux Alpha:

For additional information, see Alpha Linux RedHat and Intel x86 Linux RedHat.


Problems in the CLI

The following problems exist with the CLI:


Reporting Problems

If you experience any problems with TotalView, or if you have questions or suggestions, please contact us:

Etnus Inc.
111 Speen Street
Framingham, MA 01701-2090
Internet email: support@etnus.com
1-800-856-3766 in the United States
(+1) 508-875-3030 worldwide


TotalView Problem Report Form and Instructions

Fill out and email the form below. Document just one problem on a form. Remove or replace all data fields (<...>) with a selection or with contents. Do not remove '>' characters at the beginnings of lines.

All data requested below is helpful to us, though not all is necessary to solve each problem. Supply as much detail as you can.

If your problem involves TotalView execution, attach or FTP a reproducible example.

How to prepare and send your example

Create a directory named "repro" and place your problem files in it.

Add the following files to repro:

  1. index.txt: (List and describe the files in repro.)
    1. Include the target executable (a test program or code fragment is preferable to a large production code).
    2. Build the executable statically and with "-v", if possible.
    3. Show compiler version used.
    4. Show the compile/build session (stdin/stdout/stderr).
    5. Include sources (not always necessary, but usually helpful).
    6. Include any special libraries or input files required.
  2. README.TXT: (Describe the problem; please be very specific.)
  3. repro.txt: (Tell us how to reproduce the problem.)
    1. Show `uname -a`.
    2. Show your TotalView session (stdin/stdout/stderr).
    3. Describe in exact detail your interaction with TotalView. For example, you may edit your TotalView commands into the session output using angle brackets; that is:

            <where I did it: what I did next>

       For example:

            <Process 0: Hello_World (Exited or Never Created): 'g'>
    4. Use "Save Window to File..." to capture TotalView windows that document the problem.

Package your "repro" directory as follows:

cd ../repro/../
tar cvf - repro | compress -c > repro.tar.Z

If repro.tar.Z is larger than about 1MB, use FTP to send it to:

ftp://ftp.etnus.com/incoming/repro.tar.Z-<your email address>

You should also cut and paste `ls -l repro.tar.Z` to the end of your form.

If repro.tar.Z is less than about 1MB, use uuencode to package the information:

uuencode repro.tar.Z repro.tar.Z > repro.tar.Z.uu

You should also cut and paste `cat repro.tar.Z.uu` to the end of your form.
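
The choice between FTP and uuencode described above can be sketched as a small shell function. This is only an illustration. The function name repro_delivery_method is hypothetical, and the sketch assumes that "about 1MB" means 1048576 bytes; the cutoff in the text is approximate.

```sh
# Hypothetical helper: report which delivery method the text above suggests
# for a packaged example.
# Assumption: "about 1MB" is taken to be 1048576 bytes.
repro_delivery_method() {
    [ -f "$1" ] || return 1
    # wc -c may pad its output with spaces; arithmetic expansion absorbs that.
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -gt 1048576 ]; then
        echo ftp
    else
        echo uuencode
    fi
}
```

For example, "repro_delivery_method repro.tar.Z" prints either ftp or uuencode.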

Email the form as shown in the header in the next section.


Problem Report Email Header


To: support@etnus.com, bug-toolworks@etnus.com
Subject: <copy value of Synopsis field, below  .  .  .  .  .  .  .  .  >


Problem Report Email Body

Here is the body of the Problem Report:

>Submitter-Id:  <primary contact's *simplest* E-mail address (one line)>
>Originator:    <originator's name .  .  .  .  .  .  .  .  . (one line)>
>Organization:
 <originator's E-mail signature block .  .  .  .  .  . (multiple lines)>
>Confidential:  <[ no | yes ]   .  .  .  .  .  .  .  .  .  . (one line)>
>Synopsis:      <synopsis of the problem .  .  .  .  .  .  . (one line)>
>Severity:      <[ non-critical | serious | critical ]  .  . (one line)>
>Priority:      <[ low | medium | high ] .  .  .  .  .  .  . (one line)>
>Category:      totalview
>Class:         <[ sw-bug | doc-bug | change-request | support ] (1 ln)>
>Release:       4.0.<?>-<?>
>Environment:
 
 System:     <`uname -a` output .  .  .  .  .  .  .  .  .  .  .  .  .  >
 Platform:   <machine make&model, processor, etc. .  .  .  .  .  .  .  >
 OS:         <OS version and patch level .  .  .  .  .  .  .  .  .  .  >
 ToolChain:  <compiler version, linker version, etc. .  .  .  .  .  .  >
 Libraries:  <versions of parallel runtimes or standard libraries   .  >
 
 <other environmental factors you think are relevant . (multiple lines)>
 See also index.txt .
 
>Description:
 
 <precise description of the problem  .  .  .  .  .  . (multiple lines)>
 See README.TXT .
 
>How-To-Repeat:
 
 <step-by-step: how to reproduce the problem   .  .  . (multiple lines)>
 See repro.txt .
 
>Fix:
 
 <how to correct or work around the problem, if known  (multiple lines)>
 
>Unformatted:
<misc. comments, forwarded E-mail, encoded repro  .  . (multiple lines)>
 
`ls -l repro.tar.Z`  or  `cat repro.tar.Z.uu`
 
<End of problem report E-mail body.>


How to Contact Us

Etnus LLC.
111 Speen Street
Framingham, MA 01701-2090
Internet Email: info@etnus.com
1-800-856-3766 in the United States
(+1) 508-875-3030 worldwide

Visit our web site at http://www.etnus.com/.


Chapter 7: Patching Compilers and Operating Systems

EXTREMELY IMPORTANT

Compaq Tru64 UNIX Patch Procedures

NOTICE TO ALL Compaq Tru64 UNIX TOTALVIEW CUSTOMERS

All versions of TotalView 3.4, 3.7, 3.8, 3.9, and 4.0 for Compaq Tru64 UNIX versions V4.0B, V4.0C, V4.0D, and V4.0E require that you patch the Compaq Tru64 UNIX operating system before running TotalView. You must apply the entire patch kit; partial patch kits are not supported.

Do not run TotalView without first patching your Compaq Tru64 UNIX V4.0-based system! Failure to patch your Compaq Tru64 UNIX operating system before running TotalView will cause system crashes, system hangs, hung and unkillable processes, and TotalView malfunctions.

Note: The patch procedure requires that you have root user privileges on your systems.

Patch files include an operating system version, a patch number, and a patch date. For example, the patch file named duv40bas00007-19980514.tar is a patch file for Compaq Tru64 UNIX 4.0B, with a patch number of "00007" representing the patch level, and a patch date of "19980514".
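
The naming convention above can be illustrated with a short shell function that splits a patch file name into its parts. This is a sketch only. The function name parse_patch_kit is hypothetical, and the layout it expects (a version prefix, the letters "as", a patch number, a hyphen, and an eight-digit date) is inferred solely from the names shown in these notes.

```sh
# Hypothetical helper: split a patch kit name such as
# duv40bas00007-19980514.tar into "version number date".
# Assumption: names follow <version>as<number>-<yyyymmdd>[.tar], as inferred
# from the examples in these notes. Prints nothing if the name does not match.
parse_patch_kit() {
    name=$(basename "$1" .tar | tr '[:upper:]' '[:lower:]')
    echo "$name" |
        sed -n 's/^\(duv[0-9][0-9]*[a-z]\)as\([0-9][0-9]*\)-\([0-9]\{8\}\)$/\1 \2 \3/p'
}
```

For the file named above, "parse_patch_kit duv40bas00007-19980514.tar" prints "duv40b 00007 19980514".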

No matter what patch level you actually install, the patch contains all the prior patches up to and including the current patch. Install the latest patch to get the operating system and runtime library versions required to run TotalView. Follow the step-by-step directions below to download the software and prepare to install the patch kit.

Retrieving and Applying the Compaq Tru64 UNIX V4.0x Aggregate ECO

Step 1: Retrieve the DUNIX V4.0 Aggregate ECO files and save them to your system using the following procedure:

Step 2: Print the patch procedure documentation contained in the ".README" and ".pdf" (or ".ps") files, and the "PatchInstallGuide.htm" or "PatchInstallGuide.pdf" file.

Step 3: Follow the directions contained in the patch procedure documentation and install this patch on your Compaq Tru64 UNIX V4.0x systems. Perform this procedure in single-user mode, rebuild your kernel, and reboot your system.

Minimum Patch Level for 4.0D Systems

If you are patching a Compaq Tru64 UNIX 4.0D system, use patch kit "DUV40DAS0005-19991007" or later. Earlier patch kits do not contain all of the required Compaq Tru64 UNIX 4.0D patches.

Minimum Patch Level for 4.0E Systems

If you are patching a Compaq Tru64 UNIX 4.0E system, use patch kit "DUV40EAS0002-19990617" or later.
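
One way to check a patch kit against these minimum levels is to compare the dates at the end of the kit names. The following sketch is not an official procedure. The function name kit_is_recent_enough is hypothetical, and it assumes that "or later" can be judged by the trailing yyyymmdd date and that the text before "as" identifies the operating system version.

```sh
# Hypothetical check: is patch kit $1 at least as recent as minimum kit $2?
# Assumptions: "or later" is judged by the trailing yyyymmdd date, and the
# text before "as" identifies the operating system version.
# Returns 0 if recent enough, 1 if older, 2 if the kits are for different
# operating system versions.
kit_is_recent_enough() {
    have=$(basename "$1" .tar | tr '[:upper:]' '[:lower:]')
    need=$(basename "$2" .tar | tr '[:upper:]' '[:lower:]')
    [ "${have%%as*}" = "${need%%as*}" ] || return 2
    [ "${have##*-}" -ge "${need##*-}" ]
}
```

For example, "kit_is_recent_enough mykit.tar DUV40DAS0005-19991007" succeeds only if mykit.tar (a placeholder for your own kit name) is a 4.0D kit dated 19991007 or later.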

Getting Help With the Patch Procedures

If you need help following these procedures or have any questions, follow the directions for "Reporting Problems".


Apogee 4.0 Compiler Patch Procedures

The Apogee 4.0 compilers require a patch that brings the compiler version up to 4.010 (or later). You must obtain this patch directly from Apogee Software Inc. To get the Apogee 4.0 compiler patches for the SPARC, visit the Apogee FTP site at ftp://ftp.apogee.com/pub/users/apogee/patches/SPARC/ and read the README file.


Portland Group HPF 2.4 Compiler Patch Procedures

Some of the Portland Group HPF 2.4 distribution libraries for the RS/6000 Power AIX, Sparc SunOS 5 (Solaris 2.x), and SGI IRIX 6.x MIPS platforms may require a patch or a new installation to enable debugging HPF programs using TotalView.

TotalView depends on the tvdebug.o library module, which must contain the symbol information that TotalView uses. PGI must therefore build the tvdebug.o module with debugging information enabled.

If TotalView issues the following message when debugging your Portland Group HPF application:

MPICH library contains no type definition for struct MPIR_PROCDESC.

then it is likely that the Portland Group HPF library you are linking with has a copy of the tvdebug.o module that was not compiled with debugging information enabled. To check for this situation, extract a copy of the tvdebug.o module from the libraries in your Portland Group HPF installation and check for the missing symbol.

In the directions below, $INSTALLDIR must be set to the directory where your Portland Group HPF compiler was installed. $LIBRARY is either libpghpf_rpm.a or libpghpf_smp.a, depending on whether you are using RPM or SMP run-time support.

To check for the MPIR_PROCDESC symbol in your libraries, follow the platform-dependent directions below.

For the SGI IRIX 6.x MIPS platform:
  1. Enter the following commands:

    ar -xf $INSTALLDIR/sgi/lib-64/$LIBRARY tvdebug.o

    dwarfdump -a tvdebug.o | grep MPIR_PROCDESC

  2. If the MPIR_PROCDESC symbol is not found, download a new copy of the Portland Group HPF 2.4 compiler and reinstall it. The latest 2.4 release appears to correct this problem, so we have not provided a patch for the SGI IRIX 6.x MIPS platform.

For RS/6000 Power AIX or Sparc SunOS 5 (Solaris 2.x) platforms:
  1. Depending on your platform, enter the following commands:

    RS/6000 Power AIX:

         ar -xf $INSTALLDIR/rs6000/lib/$LIBRARY tvdebug.o

         dump -vtd tvdebug.o | grep MPIR_PROCDESC

    Sparc SunOS 5 (Solaris 2.x):

         ar -xf $INSTALLDIR/solaris/lib/$LIBRARY tvdebug.o

         dump -vsn .stabstr tvdebug.o | grep MPIR_PROCDESC

  2. If the MPIR_PROCDESC symbol is not found, you will need to patch your libraries. You need this patch only if you are using the HPF RPM or SMP run-time support. The four libraries that require patching are:
     
         libpghpf_rpm.a
         libpghpf_rpm_p.a
         libpghpf_smp.a
         libpghpf_smp_p.a

     
  3. All libraries require a new version of the module tvdebug.o. You can download and install fixed versions of these modules from our support site. Download the following files and save them to your system:

         ftp://ftp.etnus.com/support/toolworks/pgi/PGI_HPF_2.4_patch.tar

         ftp://ftp.etnus.com/support/toolworks/pgi/PGI_HPF_2.4_patch.README

  4. Follow the directions contained in the PGI_HPF_2.4_patch.README file to install the tvdebug.o module in these libraries.
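
The platform-dependent check commands above can be collected in one place. The following is a minimal sketch: symbol_check_commands is a hypothetical helper that only prints the commands from these notes rather than running them, and the platform names are the values that "uname -s" is assumed to report (IRIX or IRIX64, AIX, SunOS).

```shell
# Sketch: print the symbol-check commands from these notes for one platform.
# symbol_check_commands is a hypothetical helper. Assumption: the argument
# is the output of "uname -s" (IRIX or IRIX64, AIX, or SunOS).
symbol_check_commands() {   # usage: symbol_check_commands <platform>
    case $1 in
        IRIX*) libdir='sgi/lib-64';  dumper='dwarfdump -a' ;;
        AIX)   libdir='rs6000/lib';  dumper='dump -vtd' ;;
        SunOS) libdir='solaris/lib'; dumper='dump -vsn .stabstr' ;;
        *)     echo "unsupported platform: $1" >&2; return 1 ;;
    esac
    # $INSTALLDIR and $LIBRARY are printed literally; set them before
    # running the printed commands.
    echo "ar -xf \$INSTALLDIR/$libdir/\$LIBRARY tvdebug.o"
    echo "$dumper tvdebug.o | grep MPIR_PROCDESC"
}

symbol_check_commands AIX
# prints:
#   ar -xf $INSTALLDIR/rs6000/lib/$LIBRARY tvdebug.o
#   dump -vtd tvdebug.o | grep MPIR_PROCDESC
```

If the printed grep command produces no output when you run it, the MPIR_PROCDESC symbol is missing and the reinstall or patch steps above apply.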


Sun WorkShop 5.0 Compiler Patch Procedures

Due to bugs in the initial release of the Sun WorkShop 5.0 FORTRAN 77 and Fortran 90 compilers, you must apply several Sun-provided patches to your compiler.

Visit the following URL:

http://access1.sun.com/workshop/current-patches.htm

Then download the following patches:

Patch Number   Synopsis
107356         Fortran 90 2.0: Patch for Fortran 90 (f90) 2.0 compiler
107357         Compiler Common 5.0: Patch C 5.0, C++ 5.0, F77 5.0, F90 2.0
107377         Fortran 90 2.0: Patch for 64-bit Fortran 90 (f90) 2.0 compiler
107989         Fortran Common 5.0: Patch F77 5.0, F90 2.0


RS6000 System Patch Procedures

Patches for AIX are available through the normal AIX FixDist Web site as follows:

  1. Point your Web browser to http://service.software.ibm.com/support/rs6000/.
  2. Click on the "Downloads" link.
  3. Click on the "General Software Fixes" link.
  4. Click on the "AIX Fix Distribution Service" link.
  5. Click on the "Search by: APAR Number" radio button.
  6. Enter the APAR number in the box and click on the "Find Fix" button.
  7. Select the APAR item from the list.
  8. Select your version of AIX.
  9. Click on the "Get Fix Package" button.
  10. Using your browser, download the filesets and put them into a directory, for example, "/tmp/apar123".
  11. Use the AIX "smit" tool to install the patch from the directory.


Notices

Copyright (c)1999-2000 by Etnus LLC. All rights reserved

Copyright (c)1999 by Etnus Inc. All rights reserved

Copyright (c)1996-1998 by Dolphin Interconnect Solutions, Inc.

Copyright (c) 1993-1996 by BBN Systems and Technologies, a division of BBN Corporation.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without the prior written permission of Etnus LLC. (Etnus).

Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.

Etnus has prepared this document for the exclusive use of its customers, personnel, and licensees. The information in this document is subject to change without notice, and should not be construed as a commitment by Etnus. Etnus assumes no responsibility for any errors that appear in this document.

TotalView, TimeScan, and Gist are trademarks of Etnus LLC.

All other brand names are the trademarks of their respective holders.


Etnus LLC
http://www.etnus.com
Voice: (508) 875-3030
Fax: (508) 875-1517
support@etnus.com
info@etnus.com