TotalView®
Multiprocess Debugger

Version 4.1.0-4 January 31, 2001

These Release Notes for TotalView 4.1.0-4 contain important information that affects your software and license for the platforms described in Section 2: TotalView 4.1 Platforms and System Requirements.

This document also describes changes made since the release of TotalView 3.9 and 4.0 as well as bug fixes.

The manuals for this release are TotalView User's Guide, TotalView CLI Guide, and TotalView Installation Guide. Their version number is 4.1 and they are dated June 2000.


Contents

Section 1 New Features
Section 2 TotalView 4.1 Platforms and System Requirements
Compaq Alpha Tru64 UNIX
Compaq Alpha Linux Red Hat
HP HP-UX
Intel x86 Linux Red Hat
RS/6000 Power AIX
SGI IRIX 6.x MIPS
SPARC SunOS 5 (Solaris 2.x)
Myrinet Support
Portland Group HPF 2.4 Supported Configurations
Section 3 TotalView News
Section 4 Special IBM Considerations
pthread Considerations
AIX Patch Considerations
IBM PE Message Queue Display
Forcing 1:1 Thread Scheduling Mode on RS/6000 Systems
Section 5 Special Linux Considerations
Section 6 License Management
Section 7 Problems and Reports
Problems Fixed
Documentation Problems
Problems on All Platforms
Problems on Compaq Alpha Tru64 UNIX Platforms
Problems on HP HP-UX Platforms
Problems on SGI IRIX Platforms
Problems on RS/6000 Platforms
Problems on SPARC SunOS 5 Platforms
Problems in the Portland Group HPF 2.4 Compiler
Problems in Linux
Problems in the CLI
Reporting Problems
How to Contact Us
Section 8 Patching Compilers and Operating Systems
Compaq Tru64 UNIX Patch Procedures
Apogee 4.0 Compiler Patch Procedures
Portland Group HPF 2.4 Compiler Patch Procedures
Sun WorkShop 5.0 Compiler Patch Procedures
RS/6000 System Patch Procedures


Section 1: New Features

TotalView 4.0 has the following new features:


HP-UX Now Supported

TotalView now runs on HP-UX. Before running TotalView on this machine, you should read the following sections of this document:

For information that is unique to using HP-UX, you should read the following sections in the TotalView User's Guide:


Bulk Server Launch

TotalView can now simultaneously launch the TotalView Debugger Server (tvdsvr) on multiple remote systems. See Chapter 4 of the TotalView User's Guide for more information.


POE 3.x 32-bit Application Support

TotalView now allows you to debug 32-bit POE 3.x applications.


Remote Communication Interface Selection

You can now set the interface name that the server uses when it makes a callback. For example, on an IBM SP2 machine, the following resource setting sets the callback to use the hardware switch:

     totalview*useInterface:css0


Command Line Interface (CLI) and Scripting Language

TotalView now complements its powerful GUI with a command line interface and scripting language. You will now be able to debug programs in a command line environment or in a mode that combines the TotalView GUI with an xterm window in which you can enter CLI commands. The scripting language is a Tcl 8.0 interpreter that is embedded in the CLI, allowing you to create powerful debugger scripts.

At Release 4.1, the ddetach command was added. In addition, the dattach and dload commands now support group manipulation for automatic process acquisition. Additional variables are controlled by the dset command. Information on these changes can be found in Chapter 4 of the CLI Guide.


Linux (Intel X86 and Alpha)

TotalView now runs on versions of Linux running on the Intel X86 and Alpha processors. TotalView runs under Red Hat 5.2, 6.0, 6.1, 6.2, and 7.0 Linux. All standard capabilities of TotalView are available on Linux. Here are the available compilers:

Platform                    Compiler
Intel x86 Linux Red Hat     GCC EGCS C, C++, and F77
                            KAI Guide OpenMP C++
Compaq Alpha Linux Red Hat  GCC EGCS C, C++, and F77
                            Compaq Fortran and C (planned, pending release from Compaq)


Support for 64-bit Applications on the IBM RS/6000

TotalView now allows you to debug 64-bit applications on the RS/6000 Power3 platform. A single TotalView image can debug both 32-bit and 64-bit applications in the same debugging session.


KAI Guide Fortran and C++ OpenMP Compilers

TotalView allows you to debug Fortran and C++ OpenMP programs compiled with Kuck and Associates, Inc. (KAI) Guide 3.8 compilers on the Alpha, HP-UX, IRIX 6-MIPS, RS/6000, Linux-X86, and Sun 5 platforms.


SunPro 5.0 Compilers on Solaris 5

TotalView allows a user to debug SunPro 5.0 C, C++, and Fortran programs.


Fast TotalView Debugger Server Launch

TotalView has increased its efficiency when launching large-scale applications, making the launching of the TotalView Debugger Server (tvdsvr) faster on all platforms.


Mutex, Condition Variable, R/W Lock, and pthread Key Display on RS/6000

TotalView can now display information for mutexes, condition variables, R/W locks, and pthread keys in a separate window, giving you much greater visibility into the state of your threaded programs.


M:N User-Mode Schedule Threads on RS/6000

TotalView now supports debugging pthread programs that use M:N user-mode scheduling on the RS/6000 platform.


Fast Data Watchpoints

On all natively supported architectures except for Linux on Alpha and HP, TotalView lets you create fast data watchpoints. A data watchpoint triggers a debugging event when a memory location is modified. TotalView also lets you associate an expression with a fast data watchpoint so that the expression is evaluated when the watchpoint triggers.

Watchpoint support is available for AIX 4.3.3.0-2 (4.3R), which is distributed as APAR IY06844. Unfortunately, a bug exists within this version and you must apply patch APAR IY07644 to it before you can use watchpoints. For information on applying both of these patches, see RS/6000 System Patch Procedures.

Watchpoints are discussed in Chapter 8 of the TotalView User's Guide.


Advanced Array Data Features

TotalView supports three new powerful features that allow you to easily analyze your array data:


OpenMP THREADPRIVATE Common Block Support on IRIX 6-MIPS

On the IRIX 6-MIPS platform, TotalView allows you to debug SGI MIPS OpenMP or compiler parallel Fortran programs. TotalView provides access to TASKCOMMON variables, access to uplevel variables, proper handling of the #line directives needed to debug at the original source level, and proper handling of parallel regions and do-serial constructs.


License Management Enhancements

TotalView extends licensing to include a single processor.


FLEXlm 6.1

TotalView now uses FLEXlm version 6.1.



Section 2: TotalView 4.1 Platforms and System Requirements

To run TotalView on your system, you must have the correct hardware configuration and the correct software installed.

The following table shows the supported platforms and the TotalView version supporting each platform.

Platform Name                 TotalView Version
Compaq Alpha Tru64 UNIX       4.1.0-4
Compaq Alpha Linux Red Hat    4.1.0-4
HP HP-UX                      4.1.0-4
Intel x86 Linux Red Hat       4.1.0-4
RS/6000 Power AIX             4.1.0-4
SGI IRIX 6.x MIPS             4.1.0-4
SPARC SunOS 5 (Solaris 2.x)   4.1.0-4



Compaq Alpha Tru64 UNIX

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          C compilers provided with Compaq Tru64 UNIX V4.0B through F, V5.0, V5.0A, and V5.1
                                    GCC EGCS 2.95.2
C++ compiler                        Compaq Tru64 UNIX C++ V6.1, V6.2
                                    KAI 3.4
                                    GCC EGCS 2.95.2
FORTRAN 77 compiler                 Compaq Tru64 UNIX V5.1, V5.2
Fortran 90 compiler                 Compaq Tru64 UNIX V5.1, V5.2
OpenMP Fortran compiler             Compaq Tru64 UNIX V5.1, V5.2
                                    KAI Guide 3.8
OpenMP C++ compiler                 KAI Guide 3.8
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
Compaq MPI (DMPI)                   Versions 1.8 and 1.9
QSW RMS2                            Running on AlphaServer SC systems
                                    Needs specific RMS2 version; not yet tested by Etnus
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake
Compaq PVM (DPVM)                   Versions 1.8 and 1.9

Restrictions

For additional information, see Problems on Compaq Alpha Tru64 UNIX Platforms.



HP HP-UX

The software and hardware requirements for running TotalView 4.1 on HP HP-UX systems are as follows:

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          HP ANSI C compiler, version A.11.00.15 and A.11.01.00
C++ compiler                        HP C++ compiler, version A.03.13
                                    KAI 3.4
Fortran 90 compiler                 HP Fortran 90, version 2.2
OpenMP Fortran                      KAI Guide 3.8
OpenMP C++ compiler                 KAI Guide 3.8
HP MPI                              HP MPI, version 1.6
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake



SGI IRIX 6.x MIPS

The software and hardware requirements for running TotalView 4.1 on SGI IRIX 6.x MIPS systems are as follows:

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          Silicon Graphics MIPSpro 7.2.1 or 7.3
                                    GCC EGCS 2.95.2
C++ compiler                        Silicon Graphics MIPSpro 7.2.1 or 7.3
                                    KAI 3.4
                                    GCC EGCS 2.95.2
FORTRAN 77 compiler                 Silicon Graphics MIPSpro 7.2.1 or 7.3 (see Restrictions below)
Fortran 90 compiler                 Silicon Graphics MIPSpro 7.2.1 or 7.3 (see Restrictions below)
OpenMP Fortran                      Silicon Graphics MIPSpro 7.3 (see Restrictions below)
                                    KAI Guide 3.8
OpenMP C++ compiler                 KAI Guide 3.8
Portland Group HPF 2.4              See Restrictions below
SGI MPI 3.1 or 3.2                  These releases are part of the Message Passing Toolkit (MPT) Release 1.2 or 1.3
                                    MPI 3.1 does not support message queue display, but MPI 3.2 does; see Restrictions below
                                    MPT 1.3 is available from: http://www.sgi.com/Products/Evaluation
                                    MPT 1.3 general information is available from: http://www.sgi.com/software/mpt/
                                    SGI MPI requires Array Services to be installed and properly configured. Array Services is also available from: http://www.sgi.com/Products/Evaluation
                                    See the MPT documentation for the required version of Array Services
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake

Restrictions



Compaq Alpha Linux Red Hat

The software and hardware requirements for running TotalView 4.1 on Linux Alpha systems are as follows:

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          ccc-6.2.0-8 Compaq C T6.2-235 or later
                                    GCC gcc EGCS 2.90.29 (bundled with Red Hat 5.2)
                                    GCC gcc EGCS 2.91.66 (bundled with Red Hat 6.0, 6.1, 6.2)
                                    GCC gcc EGCS 2.95.2
                                    GCC gcc 2.96 (bundled with Red Hat 7.0)
C++ compiler                        GCC g++ EGCS 2.90.29 (Red Hat 5.2)
                                    GCC g++ EGCS 2.91.66 (bundled with Red Hat 6.0, 6.1, 6.2)
                                    GCC g++ EGCS 2.95.2
                                    GCC g++ 2.96 (bundled with Red Hat 7.0)
FORTRAN 77 compiler                 cfal-1.0.6 Compaq Fortran T1.0-916 or later
                                    GCC g77 EGCS 2.90.29 (bundled with Red Hat 5.2)
                                    GCC g77 EGCS 2.91.66 (bundled with Red Hat 6.0, 6.1, 6.2)
                                    GCC g77 EGCS 2.95.2
                                    See Restrictions below for information on using these GNU compilers
                                    GCC g77 2.96 (bundled with Red Hat 7.0)
Fortran 90 compiler                 cfal-1.0.6 Compaq Fortran T1.0-916 or later
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake

Restrictions



Intel x86 Linux Red Hat

The software and hardware requirements for running TotalView 4.1 on Linux Intel-X86 systems are as follows:

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          GCC gcc 2.7.2.3 (bundled with Red Hat 5.2)
                                    GCC gcc EGCS 2.91.66 (bundled with Red Hat 6.0, 6.1, 6.2)
                                    GCC gcc EGCS 2.95.2
                                    GCC gcc 2.96 (bundled with Red Hat 7.0)
C++ compiler                        KAI 3.4
                                    GCC g++ EGCS 2.90.29 (bundled with Red Hat 5.2)
                                    GCC g++ EGCS 2.91.66 (bundled with Red Hat 6.0, 6.1, 6.2)
                                    GCC g++ EGCS 2.95.2
                                    GCC g++ 2.96 (bundled with Red Hat 7.0)
FORTRAN 77 compiler                 GCC g77 EGCS 2.90.29 (bundled with Red Hat 5.2) (see RESTRICTIONS below)
                                    GCC g77 EGCS 2.91.66 (bundled with Red Hat 6.0, 6.1, 6.2) (see RESTRICTIONS below)
                                    GCC g77 EGCS 2.95.2 (bundled with Red Hat 6.0, 6.1, 6.2) (see RESTRICTIONS below)
                                    GCC g77 2.96 (bundled with Red Hat 7.0)
                                    See Restrictions for information on using these GNU compilers
OpenMP C++ compiler                 KAI Guide 3.8
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from: http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from: http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake

Restrictions



Other Linux x86 Platforms

While TotalView has only been tested on the Red Hat platform, we know of no reasons why TotalView should fail on other Linux/x86 platforms.

The TotalView executable image is built on Red Hat 5.2, and uses the following dynamic libraries:

The only library that is likely to cause a problem is libbfd. We believe that using a more modern version of libbfd is not a problem. To do so, create a symbolic link that makes the newer library available under the name libbfd-2.9.1.0.15.so.0.
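As a minimal sketch of the symbolic-link approach, the following shell fragment links a newer libbfd under the name TotalView expects. The newer library's file name and both directories are hypothetical and will differ on your system; the demonstration runs in a scratch directory so that it does not touch real system libraries.

```shell
# link_libbfd NEWER_LIB TARGET_DIR
# Make NEWER_LIB (a more recent libbfd; its path is an assumption) visible
# in TARGET_DIR under the name that TotalView was linked against.
link_libbfd() {
    ln -s "$1" "$2/libbfd-2.9.1.0.15.so.0"
}

# Demonstration in a scratch directory; the file name below is hypothetical.
demo=$(mktemp -d)
touch "$demo/libbfd-newer.so"
link_libbfd "$demo/libbfd-newer.so" "$demo"
ls -l "$demo/libbfd-2.9.1.0.15.so.0"
```

On a real system, the target directory would need to be one that the dynamic loader searches.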

We would be interested to hear about your experiences in using TotalView on other Linux/x86 platforms.

Other Linux Hints

If you have source code for Linux run-time libraries available on your system, TotalView should be able to display this code provided that it appears in the directory from which its debug information claims that it was compiled. On Red Hat systems, this is /usr/src/bs/BUILD; other systems may vary. Since the source RPMs on Red Hat install sources under /usr/src/redhat/BUILD, a simple symbolic link so that /usr/src/redhat also appears as /usr/src/bs is all that is required.

To work out where your library sources claim to have been compiled you should do the following:

     % objdump --stabs library_of_interest | grep SO | head -5

Here's an example.

     % objdump --stabs /lib/libc.so.6 | grep SO | head -5 
     0    SO    0     0     0000000000017a10 9     /usr/src/bs/BUILD/glibc/   elf/ 
     1    SO    0     0     0000000000017a10 0     soinit.c 
     96   SO    0     0     0000000000017a58 954 
     97   SO    0     0     0000000000017a60 2340  /usr/src/bs/BUILD/glibc/csu/ 
     98   SO    0     0     0000000000017a60 2369  ../sysdeps/unix/sysv/linux/init-first.c

Here you can see that the library was compiled from /usr/src/bs.
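To pull only the compilation directories out of a long stabs listing, a small filter such as the following may help. It is a sketch that assumes the seven-column layout shown in the example (type in column 2, string in column 7); the function name is ours and is not part of TotalView or binutils.

```shell
# stabs_source_dirs: read `objdump --stabs` output on standard input and
# print each distinct absolute directory recorded in an SO (source) stab.
# Column 2 is the stab type and column 7 is the string, as in the example.
stabs_source_dirs() {
    awk '$2 == "SO" && $7 ~ /^\// { print $7 }' | sort -u
}

# Sample input copied from two lines of the example output.
stabs_source_dirs <<'EOF'
97   SO    0     0     0000000000017a60 2340  /usr/src/bs/BUILD/glibc/csu/
98   SO    0     0     0000000000017a60 2369  ../sysdeps/unix/sysv/linux/init-first.c
EOF
```

With a real library, you would pipe the objdump output into the filter, for example: objdump --stabs /lib/libc.so.6 | stabs_source_dirs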



RS/6000 Power AIX

The software and hardware requirements for running TotalView 4.1 on RS/6000 Power AIX systems are as follows:

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          IBM xlc 3.1.3.3, 3.6.0.0
                                    GCC EGCS 2.95.2
C++ compiler                        IBM xlC 3.1.3.2, 3.6.0.0
                                    KAI 3.4
                                    GCC EGCS 2.95.2
FORTRAN 77 compiler                 IBM xlf 5.1.0.0, 6.1.0.0
Fortran 90 compiler                 IBM xlf90 5.1.0.0, 6.1.0.0
OpenMP Fortran compiler             KAI Guide 3.8
OpenMP C++ compiler                 KAI Guide 3.8
Portland Group HPF 2.4              See Restrictions below
Parallel Environment for AIX        See Restrictions below
version 2.2, 2.3, and 2.4
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake

Restrictions



SPARC SunOS 5 (Solaris 2.x)

The software and hardware requirements for running TotalView 4.1 on SPARC SunOS 5 (Solaris 2.X) systems are as follows:

Software Requirements
Hardware Requirements
Additional Requirements

Specific TotalView 4.1 features have the following additional requirements:

Compiler or Environment             Product
C compiler                          WorkShop compiler 4.2, 5.0
                                    Apogee 3.1, 4.010
                                    GCC EGCS 2.95.2
C++ compiler                        WorkShop compiler 4.2, 5.0 (see Restrictions below)
                                    KAI 3.4
                                    Apogee 3.1, 4.010
                                    GCC EGCS 2.95.2
FORTRAN 77 compiler                 WorkShop compiler 4.2, 5.0 (see Restrictions below)
Fortran 90 compiler                 WorkShop compiler 4.2, 5.0 (see Restrictions below)
OpenMP Fortran compiler             KAI Guide 3.8
OpenMP C++ compiler                 KAI Guide 3.8
Portland Group HPF 2.4              See Restrictions below
MPICH version 1.1.1, 1.1.2, 1.2.0   MPICH is available from http://www.mcs.anl.gov/mpi/mpich
                                    MPICH patches are available from http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html
ORNL PVM version 3.4.1              See the PVM home page at http://www.epm.ornl.gov/pvm/pvm_home.html for more information on PVM--the Users Guide description that says "release 3.3.4 or later" is a mistake

Restrictions



Portland Group HPF 2.4 Supported Configurations

The following table lists the supported Portland Group HPF 2.4 compiler runtime configurations by platform.

Platform Name                 MPICH       IBM PE  RPM                  SMP
SGI IRIX 6.x MIPS             OK          N/A     OK (may need patch)  OK (may need patch)
RS/6000 Power AIX             Not tested  OK      OK (may need patch)  Not supported by Portland Group
SPARC SunOS 5 (Solaris 2.x)   OK          N/A     OK (may need patch)  OK (may need patch)

Key:


Myrinet Support

Version 1.1.3 of the Myrinet GM software supports TotalView. (GM is a message-passing system for Myrinet networks. The GM system includes a driver, Myrinet-interface control program, a network mapping program, and the GM API, library, and header files.) You can obtain this software from http://www.myrinet.com/scs/index.html.



Section 3: TotalView News

Default TotalView Server Launch String Has Changed

In TotalView 3.9, the default server launch string was:

     rsh %R -n "cd %D && tvdsvr -callback %L -set_pw %P -verbosity %V"

In TotalView 4.0, the default server launch string is:

     %C %R -n "cd %D && tvdsvr -callback %L -set_pw %P -verbosity %V"

In TotalView 4.1, the default server launch string is:

     %C %R -n "tvdsvr -working_directory %D -callback %L
               -set_pw %P -verbosity %V"

If you have set the default server launch string to something different, you may also want to update your launch string.

Beginning with TotalView 4.0, the default server launch string uses the new %C feature. A %C in the server launch string expands to the name of the server launch command being used. On most platforms, this is rsh.
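To see what a launch string looks like once its escapes are filled in, the following sketch substitutes sample values into the TotalView 4.1 default. It only illustrates the textual substitution and is not how TotalView itself performs it. The meanings assumed for %R (remote host), %D (directory), %L (callback address), %P (password), and %V (verbosity) are inferred from the option names in the launch string, and every sample value is made up.

```shell
# expand_launch TEMPLATE COMMAND HOST DIR CALLBACK PASSWORD VERBOSITY
# Replace %C, %R, %D, %L, %P, and %V in TEMPLATE with the given values.
# The values are assumed to contain no "|" or "&" characters, because
# both are special in the sed expressions used here.
expand_launch() {
    printf '%s\n' "$1" | sed \
        -e "s|%C|$2|g" -e "s|%R|$3|g" -e "s|%D|$4|g" \
        -e "s|%L|$5|g" -e "s|%P|$6|g" -e "s|%V|$7|g"
}

# TotalView 4.1 default launch string, written on one line.
template='%C %R -n "tvdsvr -working_directory %D -callback %L -set_pw %P -verbosity %V"'

# Hypothetical values: rsh as the launch command, node1 as the remote host.
expand_launch "$template" rsh node1 /home/user/run host0:4142 PASSWORD info
```

Comparing the expanded command with your own customized launch string is one way to check whether it still matches the current default.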

Here is how %C is expanded:


dwhere -a option within CLI

The -a option to the CLI's dwhere command replaces the -all option (which was never implemented).



Section 4: Special IBM Considerations

This section discusses the following:



pthread Considerations

On AIX 4.3.1 and on unpatched AIX 4.3.2 and 4.3.3 systems, TotalView supports debugging pthread programs running in pthread-compatibility mode or pthreads running in system contention scope; that is, each pthread is bound to a kernel thread (1:1 thread scheduling). See Forcing 1:1 Thread Scheduling Mode on RS/6000 Systems for information on how to force 1:1 thread scheduling.

On AIX 4.3.2 and 4.3.3 you can apply a patch that will allow you to display mutexes, condition variables, reader-writer locks, and pthread keys in your program. This patch is available for AIX 4.3.2 as:

      APAR IY02391 -- BACKPORT PTHREAD DEBUG LIBRARY

Due to limitations in the pthread and pthread debug libraries, you cannot reliably debug pthread programs in process contention scope (M:N thread scheduling). IBM is working to correct this problem, but meanwhile you should force the pthread debug library to run in system contention scope (1:1 thread scheduling). See Forcing 1:1 Thread Scheduling Mode on RS/6000 Systems for information on how to force 1:1 thread scheduling.

If you apply this patch, you may introduce a kernel bug on your system where the ptrace(PT_REATT) system call fails with EPERM when debugging IBM parallel environment (PE) programs. If you are debugging PE programs, we do not recommend applying APAR IY02391.

On AIX 4.3.3, the system already contains the pthread and pthread debug libraries that will allow you to display mutexes, condition variables, reader-writer locks, and pthread keys in your program. However, as mentioned above, you cannot reliably debug pthread programs in process contention scope (M:N thread scheduling), and you should force the pthread debug library to run in system contention scope (1:1 thread scheduling). See Forcing 1:1 Thread Scheduling Mode on RS/6000 Systems for information on how to force 1:1 thread scheduling.

For more information, please read "Problems on RS/6000 Platforms".



AIX Patch Considerations

Some patch levels of AIX 4.3.2 contain a kernel bug where the ptrace(PT_REATT) system call fails with EPERM when debugging IBM parallel environment (PE) programs. A patch for this problem is available for AIX 4.3.2 as:

     APAR IY02037 -- HOT: ptrace(PT_REATT...) returns -1 and sets er

However, this patch is very large: 262 filesets occupying 420 MB. If you apply this patch, you will effectively upgrade your system to AIX 4.3.3, and you should also apply the following patch:

     APAR IY03550 -- DBX CANNOT SET BREAK POINTS IN DATA SECTION IN

Consequently, we recommend that you obtain the AIX 4.3.3 distribution CD from IBM.

On 64-bit Power3-based RS/6000 systems, or any system that has split instruction and data caches, ptrace() fails to copy back the data cache for breakpoints planted in the target program's data space. The TotalView compiled expression evaluator and interpreted expression function call features plant breakpoints in the target program's data space, making these features unusable on 64-bit Power3-based systems. The symptom is that TotalView sometimes hangs when creating a process at the start of execution. To fix this problem, you should apply:

     APAR IY03550 -- DBX CANNOT SET BREAK POINTS IN DATA SECTION IN

For more information, please read "Problems on RS/6000 Platforms".



IBM PE Message Queue Display

TotalView supports the Message Queue Display (MQD) feature when used with the threaded version of the IBM MPI libraries that are part of the IBM Parallel Environment (PE). PE version 2.2 and the non-threaded PE 2.3 and later libraries cannot provide TotalView with the necessary information for the MQD feature. However, Automatic Process Acquisition (APA) is supported in PE 2.2 and later. The following table summarizes TotalView's IBM PE support:

IBM PE version  APA Support?  MQD Support?
2.2             Yes           No
2.3             Yes           Threaded MPI only
2.4             Yes           Threaded MPI only
3.1             Yes           Threaded MPI only
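For use in site scripts, the support table above can be encoded as a small lookup. This is only a sketch: the function name is ours, and the "unknown" result for versions not listed in the table is an assumption rather than a statement about TotalView.

```shell
# mqd_support PE_VERSION: print the MQD support level from the table above.
mqd_support() {
    case "$1" in
        2.2)          echo "no" ;;
        2.3|2.4|3.1)  echo "threaded MPI only" ;;
        *)            echo "unknown" ;;   # not listed in the table
    esac
}

mqd_support 2.4
```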



Message Queue Display Debugging Dynamic Library

Each version of the IBM PE library requires a different MQD debugging dynamic library for use with TotalView. This section explains how TotalView chooses the correct MQD debugging dynamic library.

When TotalView recognizes that it is dealing with an IBM MPI code, it searches for an MQD dynamic library capable of handling the appropriate IBM MPI implementation. The IBM MPI library declares its debugging compatibility version in an integer global variable named mpi_debug_version. The following table shows the current set of values:

mpi_debug_version value   PE version number
0                         2.3
1                         2.4
2                         3.1

When looking for the MQD debugging dynamic library, TotalView looks for a library named libtvibmmpi.so; if it cannot be found, TotalView looks for a library named libtvibmmpi32-<n>.so (where <n> is the integer value of the mpi_debug_version variable) for a 32-bit MPI process or libtvibmmpi64-<n>.so for a 64-bit process.
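The lookup order just described can be sketched in shell as follows. The sketch mirrors only the naming rule; searching a single directory passed as an argument is a simplifying assumption, the function names are ours, and the demonstration uses an empty placeholder file rather than a real library.

```shell
# mqd_library_name BITS VERSION: versioned library name, where BITS is 32 or
# 64 and VERSION is the value of mpi_debug_version.
mqd_library_name() {
    printf 'libtvibmmpi%s-%s.so\n' "$1" "$2"
}

# find_mqd_library DIR BITS VERSION: prefer DIR/libtvibmmpi.so, then fall back
# to the versioned name. Searching a single DIR is a simplifying assumption.
find_mqd_library() {
    if [ -e "$1/libtvibmmpi.so" ]; then
        printf '%s\n' "$1/libtvibmmpi.so"
    elif [ -e "$1/$(mqd_library_name "$2" "$3")" ]; then
        printf '%s\n' "$1/$(mqd_library_name "$2" "$3")"
    else
        return 1
    fi
}

# Demonstration with an empty placeholder file in a scratch directory.
shlib=$(mktemp -d)
touch "$shlib/libtvibmmpi32-1.so"
find_mqd_library "$shlib" 32 1
```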

By default, the MQD debugging dynamic libraries provided with your TotalView distribution are named:

The TotalView distribution has the appropriate symbolic links:

This ensures that the correct MQD debugging dynamic library is loaded for POE 2.3, POE 2.4, and either 32-bit or 64-bit codes using POE 3.1.

PE 2.4 MQD Configuration

Unfortunately, an error in the IBM PE 2.4.0.0 libraries causes the mpi_debug_version variable to have the value 0 (instead of the correct value 1). You may correct this problem by:

But first, verify that you need the fix.

Verifying You Need a Fix

If you have POE 2.4.0.0, you may need to apply an IBM-provided PTF or alter your TotalView installation explicitly so that the MQD debugging dynamic library symbolic links point to the correct version of the library.

Here is how to check if you need to modify your TotalView installation:

  1. Verify the version of PE in use on all your nodes. You can use the following command to inspect the version of PE installed on all of your nodes.

         lslpp -l ppe.poe

    For example:

    You may also use the poe command to check multiple nodes on your SP2, for example:

         $ poe lslpp -l ppe.poe -procs 10 -rmpool 0

  2. Inspect the value of mpi_debug_version on all your nodes. Begin by compiling and running the following program:

         /*
         Test program to check mpi_debug_version.

         Compile: mpcc_r -g -o mqdvers mqdvers.c

         Run: mqdvers -procs <n> -rmpool <p>
              mqdvers -procs <n> -hfile <hostfile>
         */

         #include <stdio.h>
         #include <mpi.h>

         /* Set by the IBM MPI library to its debugging compatibility version. */
         extern int mpi_debug_version;

         int main(int argc, char **argv)
         {
              MPI_Init(&argc, &argv);
              printf("mpi_debug_version == %d\n",
                   mpi_debug_version);
              MPI_Finalize();
              return 0;
         }

    It produces output similar to:

         $ mqdvers -procs 2 -rmpool 0
         0:mpi_debug_version == 0
         1:mpi_debug_version == 0

  3. If your version of PE is ppe.poe 2.4.0.0 and the value of the variable is mpi_debug_version == 0, then you must apply PTF U462081 or modify your TotalView installation if you want to use the TotalView MQD feature. We strongly recommend that you apply this PTF.
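The decision in step 3 can be sketched as a small shell check. It assumes the mqdvers output format shown in step 2 and that you supply the ppe.poe level yourself after reading the lslpp output; both function names are ours.

```shell
# reported_versions: read mqdvers output ("<task>:mpi_debug_version == <n>")
# on standard input and print each distinct value of <n>.
reported_versions() {
    sed -n 's/^[0-9]*:mpi_debug_version == \([0-9]*\)$/\1/p' | sort -u
}

# needs_mqd_fix POE_LEVEL DEBUG_VERSION: succeed only for the faulty
# combination described in step 3 (ppe.poe 2.4.0.0 reporting 0).
needs_mqd_fix() {
    [ "$1" = "2.4.0.0" ] && [ "$2" = "0" ]
}

# Sample mqdvers output from step 2, combined with a PE level of 2.4.0.0.
value=$(printf '0:mpi_debug_version == 0\n1:mpi_debug_version == 0\n' | reported_versions)
if needs_mqd_fix "2.4.0.0" "$value"; then
    echo "apply PTF U462081"
fi
```

If the tasks report more than one distinct value, reported_versions prints several lines and the check should be repeated for each node.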

Here is the procedure for acquiring the patches for POE 2.4. These patches are available through the normal AIX FixDist Web site as follows:

  1. Point your Web browser to http://service.software.ibm.com/support/rs6000/.
  2. Click on the Downloads link.
  3. Click on the General Software Fixes link.
  4. Click on the AIX Fix Distribution Service link.
  5. Click on the Search by: PTF Number radio button.
  6. Enter the PTF number U462081 in the box and click on the Find Fix button.
  7. Select the PTF U462081 - ppe.poe.2.4.0.1 item from the list.
  8. Select your version of AIX.
  9. Click on the Get Fix Package button. The following list of fix packages appears:

    Filesets needed for selected item   Information file   Byte size
    ppe.poe.2.4.0.2                     README             4539392

  10. Using your browser, download the file and put it into a directory, for example, /tmp/xlfpatches. The file is named ppe.poe.2.4.0.2.bff.
  11. Use the AIX smit tool to install the patch from the directory.

    If you apply this PTF, you do not need to patch your TotalView installation manually.

Patching Your TotalView Installation Manually

We strongly recommend that you apply the PTF (above) instead of patching your installation. If you patch your TotalView installation for use with an incorrect PE 2.4.0.0, that installation will no longer support MQD when you use POE 2.3.

If you cannot apply the PTF and you are using only PE 2.4, you or your system administrator must do the following:

     $ su
     Password: 
     # cd <installdir>/totalview.<version>/<platform>/shlib
     # rm libtvibmmpi32-0.so
     # ln -s libtvibmmpi-poe-2.4.so libtvibmmpi32-0.so

NOTE     Debugging 64-bit MPI applications has not yet been tested.



Forcing 1:1 Thread Scheduling Mode on RS/6000 Systems

Due to limitations in the pthread and pthread debug libraries, you cannot reliably debug pthread programs in process contention scope (M:N thread scheduling). IBM is working to correct this problem, but meanwhile you should force the pthread debug library to run in system contention scope (1:1 thread scheduling).

To successfully debug an AIX pthreads program it is necessary to turn off the pthreads scheduler. To do this, you should do all of the following:

For more detailed information, see your AIX 4.3 documentation or use the web links to the IBM AIX 4.3 documentation site listed below:


Section 5: Special Linux Considerations

Since Linux is an open source operating system, you can install the sources for the C runtime library. (Use the Red Hat package manager--rpm--to install them.) This lets you debug crashes within C library functions.

On Red Hat, the default compilation path used to build the C library appears to be /usr/src/bs/BUILD/... . The default location for installing sources is /usr/src/redhat/BUILD. If you want TotalView to locate these sources, you need to set up a symbolic link. For example:

     cd /usr/src
     mkdir bs
     chmod 0755 bs
     cd bs
     ln -s /usr/src/redhat/BUILD .

You can determine which version of glibc you are using by telling rpm to display a list of installed packages, and then use grep on the information it displays. For example:

     % rpm -q -a | grep glibc
     glibc-debug-2.0.7-29
     glibc-devel-2.0.7-29
     glibc-profile-2.0.8-29
     glibc-2.0.7-29

Because glibc-2.0.7-29 is being used, the source rpm will be glibc-2.0.7-29.src.rpm. The commands that you would now use to install this package are:

     rpm -iv glibc-2.0.7-29.src.rpm
     cd /usr/src/redhat/SPECS
     rpm -bp glibc-2.0.7-29.spec

These commands install the sources for glibc and apply patches packaged with it.
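Deriving the source RPM name from the installed package list can be sketched as below. The sketch assumes, as the text above does, that the source RPM is named after the base glibc package with .src.rpm appended; the function name is ours, and the sample list stands in for real rpm output.

```shell
# source_rpm_name: read a package list (as printed by `rpm -q -a`) on standard
# input and print the source RPM name for the base glibc package.
source_rpm_name() {
    grep -E '^glibc-[0-9]' | head -n 1 | sed 's/$/.src.rpm/'
}

# Sample package list standing in for real rpm output.
source_rpm_name <<'EOF'
glibc-debug-2.0.7-29
glibc-devel-2.0.7-29
glibc-2.0.7-29
EOF
```

On a real system, you would run: rpm -q -a | source_rpm_name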



Section 6: License Management

Beta test note: This section does not apply to beta test. See the beta license installation instructions in your beta program FTP instructions email.

The Etnus License Management scheme for TotalView 3.7/3.8/3.9/4.0/4.1 and TimeScan 3.0 is new and requires some planning if you want to mix:

Mixing Etnus License Management With Other Software Managed by FLEXlm

We recommend that initially you do not combine Etnus licenses with those of other third party software managed by FLEXlm. At first, it is easiest to keep separate license manager daemons for Etnus licenses and licenses of other third party software managed by FLEXlm. After you know that your Etnus license works, see the latest FLEXlm documentation for guidance in running a single FLEXlm license manager daemon.

Use the procedures described in the TotalView Installation Guide to install the Etnus FLEXlm license management software. Etnus licenses must be served by the Etnus license manager daemon or the latest FLEXlm license manager daemon.

The TCP/IP port number used for the Etnus license manager daemon must be unique and not in use elsewhere. Find port numbers used by other FLEXlm license managers in their license.dat files.
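One way to list the ports already claimed by other license managers is to read the SERVER lines of their license.dat files. The sketch below assumes the usual FLEXlm layout, in which the port is the fourth field of a SERVER line; the function name is ours, and the host name, host ID, port, and daemon path in the sample are made up.

```shell
# license_ports [FILE...]: print the TCP port from each SERVER line.
# Assumes the usual FLEXlm layout "SERVER <host> <hostid> <port>".
# With no FILE arguments, the license text is read from standard input.
license_ports() {
    awk '$1 == "SERVER" && NF >= 4 { print $4 }' "$@"
}

# Sample data; host name, host ID, port, and daemon path are made up.
license_ports <<'EOF'
SERVER lichost 0123abcd 7127
DAEMON toolworks /usr/totalview/flexlm/toolworks
EOF
```

Passing several license.dat paths as arguments prints one port per SERVER line, which makes a clash with the port chosen for the Etnus daemon easy to spot.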

Mixing Etnus License Management With Older Versions of Etnus products

TotalView 3.7/3.8/3.9/4.0/4.1 and TimeScan 3.0 licenses cannot be combined with those of older versions of Etnus products. The Etnus licenses must be served by separate Etnus license manager daemons for best results.

Use the procedures described in the TotalView Installation Guide to install the FLEXlm license management software. TotalView 3.7/3.8/3.9/4.0/4.1 and TimeScan 3.0 licenses must be served with the license manager provided in their distributions. Older versions of Etnus products must be served by the license manager provided in their distributions.

To run TotalView 3.7/3.8/3.9/4.0/4.1 or TimeScan 3.0 with older versions of Etnus products served by the same license manager server machine, you must:

The old and the new FLEXlms install in different directories, so one does not overwrite the other.

In addition, if you want to use both new and old versions of Etnus products, you must include the full pathnames of both license.dat files in your LM_LICENSE_FILE environment variable. For example:

     setenv LM_LICENSE_FILE \
          /usr/totalview/flexlm/license.dat:/usr/toolworks/flexlm-6.1/license.dat
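
The example above uses C shell syntax. For Bourne-style shells, an equivalent setting might look like the following sketch. The function name join_license_files is hypothetical; the two paths are the ones shown above and should be replaced with your own.

```shell
#!/bin/sh
# Hypothetical helper: join the given license file paths with colons,
# the separator used in LM_LICENSE_FILE. The body runs in a subshell so
# that the IFS change does not leak into the calling shell.
join_license_files() (
    IFS=:
    printf '%s\n' "$*"
)

# Bourne-shell equivalent of the setenv example (paths from the text):
LM_LICENSE_FILE=$(join_license_files \
    /usr/totalview/flexlm/license.dat \
    /usr/toolworks/flexlm-6.1/license.dat)
export LM_LICENSE_FILE
```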

To verify your FLEXlm installations:

  1. Start both old and new FLEXlm license manager daemons (lmgrd).
  2. Set your LM_LICENSE_FILE environment variable appropriately, as above.
  3. Run the lmstat command in the new product's FLEXlm directory:

         <installdir>/flexlm-6.1/<platform>/bin/lmstat

    In the FLEXlm status output, look for both license servers UP and both new (toolworks) and old (bbnst) Etnus vendor daemons UP.

  4. Once the license managers are both running, you can run TimeScan and TotalView to verify their installations:

         timescan
         totalview
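
The vendor daemon check in step 3 can be automated. The following is a minimal sketch: the function name both_daemons_up is hypothetical, and it assumes that lmstat reports each vendor daemon on a line of the form "<daemon>: UP v<version>". It does not check the license server lines, so you should still look at those yourself.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the lmstat output on standard input
# reports both the new (toolworks) and old (bbnst) vendor daemons as UP.
# Assumes lines of the form "<daemon>: UP v<version>".
both_daemons_up() {
    awk '
        $1 == "toolworks:" && $2 == "UP" { new_up = 1 }
        $1 == "bbnst:"     && $2 == "UP" { old_up = 1 }
        END { exit !(new_up && old_up) }
    '
}

# Example (angle-bracket names are the placeholders used in step 3):
#   <installdir>/flexlm-6.1/<platform>/bin/lmstat | both_daemons_up \
#       && echo "both vendor daemons are UP"
```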

If you encounter installation problems, please review your procedure and also refer to the Troubleshooting section of the appropriate Etnus User's Guide.


Section 7: Problems and Reports

Problems Fixed

 
Fixed in 4.1.0-4
2616 CLI: dwhat on a common block variable described the common block and not the member.
2589 C++ source files with .c++ extensions were treated as C source files.
2576 SUNPro5 demangler incorrectly overrode a -demangler= option entered on the command line.
2574 Linux x86: Processing bad debugging information from the PGI f90 compiler would cause a fatal error.
2565 C++ base to derived type cast in the data pane was sometimes not noticed.
2534 CLI: dassign to an array was erroneously allowed and corrupted the array.
2514 Tru64: TotalView complained about non-MPI jobs run under dmpirun.
2492 CLI: dwhat on an OpenMP (OMP) up-scope variable caused a fatal error.
2337 Linux x86: TotalView did not properly display the execution stack on programs halted within a signal handler compiled with libc 2.1.2.
1161 IRIX: Breakpoints planted in the delay slot of branch instructions did not allow execution to continue.

 
Fixed in 4.1.0-3
2427 AIX and IRIX: ifstream datatypes could cause TotalView to incorrectly display data in other variables.
2446 g++: The way in which C++ exceptions were handled did not work with g++.
2471 HP: TotalView crashed when a program was linked with pxdb processing turned off. We now warn the user.
2472 Red Hat 7.0: The libbfd.so file provided by binutils-2.10.0.18 is incompatible with TotalView. (We now bundle our own version.)
2473 AIX: You could not view thread private variables in programs compiled using the OpenMP compiler's -qsmp=noopt option.
2477 AIX: TotalView could not do stack backtraces in programs compiled using the OpenMP compiler's -qsmp=noopt option.
2478 AIX: The stack backlink token linking a worker thread to its parent was omitted in programs compiled using the OpenMP compiler's -qsmp=noopt option.
2481 HP MPI: TotalView failed with an "address_space_t::write_data_block" error when used on 64 bit programs.
2508 Linux x86, CLI: Single stepping code with the CLI could cause an internal error.
2515 Linux x86: Some variables could be marked [Stale] if a signal handler stack frame was present on the stack.

 
Fixed in 4.1.0-2
2122 HP: Fortran 90 compiler drops symbols needed for MPI debugging. While this is a compiler problem, a workaround was added to TotalView.
2172 Linux x86: Stepping was misleading when programs were compiled with the -kPIC option. In addition, the cursor could be placed about 10 lines above the function being entered.
2193 CLI: Several storage leaks have been fixed.
2204 CLI: The drun command no longer ignores I/O redirection syntax for sending data to an output file (> outfile) or reading data from an input file (< infile).
2272 PGI: TotalView skips common blocks after seeing an empty common block.
2275 If localhost is misconfigured, TotalView now continues with a warning message. If you have this problem, please send email to support@etnus.com.
2277 When executing a bulk server launch command, TotalView could crash with the following error message:

Fatal error: Attempt to free an invalid block

2279 gcc: For bool variables, TotalView ignored debugging information that specified the byte size of the Boolean. As a result, TotalView displayed the contents of bool variables incorrectly.
2293 Linux: Previously, TotalView did not use the address stored in the core file to locate dynamic library text.

When the Linux Kernel is configured for 2 GB memory, the memory map for loaded processes changes. This means that dynamic libraries are loaded at a different address. If a process running on a kernel compiled for 2 GB is dumped and the core dump is debugged on a machine running on a 1 GB kernel, TotalView cannot debug code from dynamic link libraries.

2299 Sun 5: Previously, the FLEXlm license manager daemon (lmgrd) would not run correctly on a system configured for more than 1024 file descriptors.
2370 Linux gcc: The gcc compiler treats long double types as 12 bytes to ensure that data is aligned correctly. However, only 10 of the 12 bytes are used. Previously, TotalView complained whenever it tried to display long double register variables compiled by gcc. While TotalView now "understands" how the compiler allocates this space, this can cause problems if a user casts a floating register to <string>.
2393 TotalView now displays missing signal numbers within the dialog box displayed by the Set Continuation Signal command.
2403 Previously, functions and variables in Fortran 90 modules were case sensitive within TotalView. They should not have been.
2404 Linux: When debugging a DLL built with gcc 2.95.2 and DWARF2, TotalView displayed the following error message:

Skimming ... bytes of DWARF `.debug_info' symbols ...
TotalView: FATAL ERROR STARTING UP: Can't find abbrev table entry ...

2407 Compaq Alpha: Debugging a C language program using the following options caused a TotalView stack overflow, segmentation fault, or hang:

-g3 -non_shared -om.

2414 Linux/Alpha: prolog_size is set wrong. TotalView now correctly interprets this information as instructions instead of bytes.
2421 AIX: An error occurred if you used the g command when TotalView did not load a target and you used TotalView's -nodynamic option.
2443 MPI: TotalView did not focus onto the main function at the end of an MPICH-style parallel job startup because it did not recognize that the processes had not yet entered main.
2446 gcc: The way in which TotalView handled C++ exceptions made by a throw statement did not work with the gcc compiler.
2464 SGI: TotalView incorrectly forwarded siginfo structures to signal handlers, causing packages like ObjectStore to malfunction.

 
Fixed in 4.1.0-1
2169 Very large assembler source pane listings could crash TotalView.
2180 Compaq Alpha: Invoking the Fortran Modules Window command could crash TotalView.
2181 AIX: TotalView crashes when planting remote watchpoints with the following error message:

Fatal error: Subclass responsibility: tracenode_watch

2208 SGI: Fortran 90 code passing a pointer to an array of derived types to a subroutine could cause TotalView to crash with fatal error:

Fatal error: ::remove_indirection on non-indirect mode reference_param_var; ldl -64; indirect; ldac 0

2246 AIX: When attempting to step-into, TotalView would instead step-over calls to functions through function pointers compiled with the -q64 option for 64 bit addresses.
2247 AIX: When attempting to step-into, TotalView would instead step-over calls to source level functions in shared libraries compiled for 32 or 64 bit addresses.
2248 SUN KCC: TotalView incorrectly loads SPRO5-compiler function prototypes, causing single stepping problems.
2265 SGI MPI: If the SGI MPI job you are attaching to has only one rank process, TotalView does not ask you if you want to attach to the rank process.
 
Fixed in 4.1.0-0
1990 Linux-x86: TV loses related process when breakpoint is hit
2041 Crash when rerunning program
2043 Linux: TV crashes if attaching to threaded code with missing threads
2050 Losing contact with program when threads exit
2062 SEGV when hitting breakpoint after program has been changed
2084 CLI: display of Fortran character array element cause FATAL_ERROR
2086 TV should handle SIGTRAPs generated by target compiled -C
2102 SGI/DWARF: still enter some locals without addresses
2108 egcs 2.91.66 with -ggdb3 (DWARF 2) crashes TV
2109 C++: TV should have the capability to "stop at exception throw site"
2111 Sun: Pth 1.4a1 crashes TotalView under Solaris!
2113 CLI: quit does not leave the "attached" processes running
2116 HP: floating point status register: can't edit it, can't dive on it, its type is a real in expressions, dwhat of it crashes TV
2123 HP: FErr: Can't cast to a hpux_core_traceable_info
2126 CLI: At least on SGI, the dbreak command is pretty broken
2136 Alpha/UNIX: Child fork won't reveal mutex info
2137 IRIX: exec tracing SEGVs
2148 Linux x86: TV core dumps on null hostname from gethostbyname
 
Fixed in 4X.1.0-4
1763 'Unexpected messages' always displays 'none' even when I call MPI_Send without a matching MPI_Recv.
1819 AIX: TV hangs at start of target execution
1893 AIX: Eval of symbol in dlopen'ed image fails with: Error: Identifier f not defined
1895 Eval of ``.x'' fails with: Internal error in TotalView.
1908 Alpha/UNIX: patchcode_t is regarded as an opaque type in TV
1914 Internal error from printing dlopen function call in CLI
1915 AIX: 32-bit Address check FAILED from dprint of dlopen call
1923 AIX: Step into main fails to show first line; prints an error
1932 Font problems with visualizer and CDE
1943 HP-UX: TV doesn't support function calls in eval expressions.
1947 HP-UX: disambiguate menu contains unmatched symbols?
1980 HP-UX: TV shows blank process windows for new processes prior to halting the process.
1981 HP-UX: HP mangler not working
1988 AIX: Exec of 64-bit application shows .ustart, not main
1994 Fatal Error: Mismatched end tags in block
2003 LINUX-X86: TV gets IE on RH6.0 with native gcc compiled program.
2002 HP-UX: for 64-bit programs TV gets 2 of many libc symbols.
2004 Warning: Cannot convert string "-dt-interface system-medium-r-normal-xl ...*" to type FontSet
2006 HP: Dynamic loaded shared libs have no symbols debugging core files
2010 Diving on thread laminated variable pops new data window.
2011 Nested edit not being set correctly
2015 Linux/x86: TV crashes on pth demo code
2018 HP: Install's DIST1.tar.Z untar failed for non-root user
2025 On Linux, may see new threads without pthread TCR signal; TV takes a fatal error as a result
2026 On Linux, the user thread list may occasionally have a 0 tid on it -- crashes TV
2027 Linux x86: During thread creation, if target hits a breakpoint, may get timeout & balk
2029 Linux-x86: 's', 'v' Pixel, 'w' gets IErr
2031 Group instruction single-step crashes target (and, sometimes, TotalView)
2032 SPRO5: TotalView does not parse SPRO C++ 5.0 pointer to class member stabs
2036 Linux thread manager not identified
2037 eval of function name after program start yields ``Error: internal error: bad local type in generate_data_section''
2038 Alpha/UNIX: Eval of a function name when interpreting yields ``Fatal error: eval_data_value_t::insert_data: Length mismatch: len=4, new_len=8''
2044 AIX: PC arrow may point to wrong instruction in Interleave Display Mode (only)
2046 SEGV when trying to step out from dynamic linker
2047 Linux/x86 TV crashes diving into C++ "function" in PLT
2049 Linux/Alpha tv crashes on CompaQ F90 generated code
2052 HP: dive into subr in same file, get asm, not src!
2055 SUN5: Internal error debugging ACE test code compiled with SPRO 5 compiler
2058 HP: Describe correct MPI job start-up under TotalView
2059 IRIX: C code compiled -64 -g -mp doesn't display source
2060 Visualizer dies on Alpha
2061 HP: totalview dumping core on HP MPI job startup
2068 HP: Can TV debug binaries that reside across a hard, interruptible NFS mount?
2070 HP: "s" doesn't always step into when it should
2071 Please improve processing speed for Save Window to File and Show Array Statistics commands.
2072 SunPro C++ 5.0: IErr while Reading symbols from object file
2073 TV displays incorrect rank for array passed to subroutine in module
2074 CLI: assumed size (or shape) arrays mis-handled
2080 Linux-x86: dwhat shows padding instead of pointer in C structure
2089 Alpha: interpreted eval gets int32_to_ptr wrong
2090 Alpha/Tru64 Invalid question from dive to function
2091 TotalView prun... skips launch completely
2095 HP: core file mismatch problem
 
Fixed in 4X.1.0-1
1449 SUN5: Internal errors and BOGUS Tag messages
1687 Can't cast a field to array of bit sized enums in data pane
1717 TV gets fatal error with Guidec OpenMP program on Red Hat 6.0
1727 SUN5: IErr on fork/exec target
1739 How can we use TV with our proprietary SP2 job manager which runs in its own user account and uses PVM to spawn dynamically linked processes in job sets on various SP2 processors on behalf of multiple users who operate a GUI from their own user accounts which provides access to their job sets through the job manager?
1783 IRIX-MIPS: gcc: program dies in evaluation pt. with seg fault after trying to print a structure field
1814 CLI should have a ddetach command.
1849 CLI: add a doc section to describe what the various prompts mean
1852 F90 module variables not accepted in 'v'
1874 ALL: comma separated X resource totalview*searchPath values not properly split up
1878 AIX: POE: missing libtvibmmpi-2.so; can't see the message queues
1886 IRIX64: Evaluating ``pointer || int'' while debugging 32-bit app yields: Fatal error: eval_data_value_t::retrieve_data: Length mismatch: buf_len=4, extern_len=8
1887 CLI: trouble printing a global array element
1892 VIS: We don't distribute and point to an up-to-date XKeysymDB
1902 SUN5: While examining core file stack trace, popup window says: ``Error: Symbol table parser, reading symbol "!__1nPXEEvtDispatcher_": no ppp-code found on SunPro class''
1904 CLI: Please print Fortran arrays locations in Fortran style.
1906 CLI: Fortran indexes are printed in reverse order
1911 linux/x86: PGI F90 module problem
1918 FErr: stab_typetable_entry_t::scan_type_info:is_xlf90_ptr escaped
1925 Unchecking 'Share Action Point in All Related Processes' in UDWP causes fatal error
1926 Stack Trace is truncated for f90 core file
1927 Values of variables in modules are wrong
1933 With IRIX TV 4.0 (regression from 4X.0.0-7) when attaching to a running process, and running to a breakpoint within the code, the code stops at the breakpoint but the source code displayed is incorrect, always displaying the main routine
1935 TV 4.0 manual states THREADPRIVATE on SGI IRIX is not supported, which is wrong
1945 Single-thread run operations may balk and hang until ^C
1949 Step single thread => other threads may run
1957 Expression system can't do && or || with a char or short as the second operand; worse, sometimes the C++ type "bool" is really a char.
1959 Slow step into routine when -DEBUG:trap_uninitialized=ON used
1964 check_event_and_registers: Operation Failed. on AlphaServer SC
1965 Linux-x86: Step into mostly steps over when calling a .so function from another .so, and when it works the focus is about 10 lines above the function being entered
1967 DW_CFA_remember_state error message (Linux/Alpha, Compaq F90 (?))
 
Fixed in 4.0.0-1
1903 TotalView no longer terminates with "Fatal error: Ran off end of chunk in extract_data2" code compiled by PGI compilers.
1911 TotalView no longer crashes due to PGI F90 module related problems. However, the PGI F90 compilers do not currently emit debug information for module variables. Consequently, all modules appear empty. (This is PGI's bug number TPR2132).
1917 TotalView can now debug executables containing code from both PGI and GNU compilers.
1920 TotalView correctly handles DWARF2 frame information that previously led to incorrect or truncated backtraces on Linux Alpha.
1848, 1877 The CLI dlist command no longer has a problem when listing a source location for a thread that also had an execution point. (It previously displayed the "Can't find the source for thread P.T, at address 0xXXXXXXXXX" error message.)
1820 Selecting ambiguous source line numbers now allows you to select from all functions where you could toggle the breakpoint. Formerly, selecting a line number in the source pane might only toggle a breakpoint in that function.
1650 TotalView now displays "extern C" function choices in the overload dialog box on AIX.
1901 TotalView no longer terminates with an internal error on certain codes compiled with the SUNPro 5.0 C++ compiler.
 
Fixed in 4.0 Release
266 On-line help needs proofreading.
727 F77 w/cmplr directives, MP_SETNUMTHREADS>1, 'u' => FErr:...TTRID
1148 Dive on Duid 1 in Self-Debugging window results in IErr.
1266 Having trouble making PVM work with TotalView 3.9.0
1702 Generated PDF files missing TOC bookmarks and links
1720 Under Linux, but NOT Tru64 or Solaris: when debugging a module that is in a different directory than its source, the source code is not shown in TotalView until the "Set Search Directory (d)" menu command is used to set the directory to the source
1721 Error in table 2 installation guide
1725 Incomplete installation information in installation guide
1788 Alpha: 'f' into .so gets IErr on long symbols in STL's std::map
1792 check_event_and_registers: Operation Failed.
1794 flexlm-6.1/alpha/bin/toolworks has Error: Unresolved symbols
1796 SGI, C++: dive on <opaque>* and ... append [10] to type leads to Couldn't find a base type. "OK" to Closest match then leads to Couldn't find a type name.
1802 large executable hangs totalview. "ERROR: Symbol table reading error: no current function" occurs repeatedly.
1803 SUN5: vismain gets "Error: Can't open display"
1807 CDWP expression gets "Error: Identifier i not defined"
1819 AIX: TV hangs at start of target execution
1836 Solaris TV should give user control of signal 36
1840 Please add click-able cross references to TV pdf documents.
1845 CLI: the "dlist" command causes a crash if it is the first command given
1847 CLI: SUN5: The 'stty' command does not seem to work
 
Fixed in 4X.0.0-7
1052 AIX: Repetitious message: "Function declarations nested too deeply"
1059 AIX: File static variables can't be found.
1450 Alpha: Can't debug core files, Internal error, and won't display the source for a template function
1496 SUN5, egcs: DBX class tag ':' not understood.
1649 Incorrect value of external variable reported in 64-bit mode
1659 ``FATAL ERROR STARTING UP: ::insert_nlist_symbol reentered'' debugging KAI KPTS guidec++ OpenMP code on Solaris
1705 TotalView crashes during executable reload
1729 wrong setting of FLEXLM variable in totalview startup script
1774 Irix: Fatal Error from interpreted eval point
 
Fixed in 4X.0.0-6
1709 setguid bits set on directories in dist. tarfiles
1710 get rid of carriage returns in Release Notes (and other text files?)
 
Fixed in 4X.0.0-5
435 Shape of assumed-shape dummy argument wrong
1060 SUN5: Symbol table reader IErr. Type tag '!' (0x21) not understood on string ''.
1061 AIX: TV on -bmaxdata SA_FULLDUMP core file won't display stack backtraces for threads
1062 SUN5: Symbol table reader IErr.
1063 Tru64 UNIX: Bpt in large-shared-memory target leads to target SEGV.
1158 IRIX, f90 7.3: FErrSU: Can't find abbrev table entry for debug entry.
1292 IErr: Bitwidth with enum nxm_wake_vals : 3
1393 Totalview IBM SP launch problems
1397 Alpha: fork/execvp job gets FErr: Forbidden Transition
1398 This MPICH executable crashes totalview on our system.
1402 `totalview date` gets "aix_lookup_symbol_in_load_module: Failed to read the string storageLoading dynamic symbols for date"
1421 Cannot find the dynamic library 'libxmpi.so'
1424 FErr: ::~patchlib_entry_t: destructor called on entry not marked deleted
1429 's' steps over a large block of non-parallelized code w/i a single subr, but a bpt in the block works
1443 TV crashes when finding 'main'
1448 TV 3.9 gets IErr with long directory search paths
1457 SUN5: Can't read type info for "...": Bad virtual table offset. Expected an integer, but found ""
1460 visualizer on alpha dies when processing NaNs, Infinity or Denormalized values
1477 TV doesn't handle realtime signals
1482 RMS 2.x: Unaligned access <tvdmain>; Segmentation fault (core dumped)
1492 SEGV in 3.9.0-1 attempting to save assembler display to file
1532 Problems with pardo001.f Guide Fortran code
1535 File statics given wrong files on AIX when #line used
1584 Internal code generator error: cannot load address of a register
1606 DUNIX: IErr from template problem
1632 AIX: "totalview poe -a a.out -resd no -hostfile hostlist" from login node gets "... poe can't find executable".
1651 xlf90: Character ptr to array of 3 80-char strings displays as a 3-char string.
 

Known Problems

The following sections list the problems that have been found.

You may find your problem (and its solution) documented on our website's FAQ, which is located at http://www.etnus.com/support/FAQs/index.html.

Documentation Problems

Here are corrections for the TotalView User's Guide:

KAI Guide Fortran Compiler Option is Listed Incorrectly

The documentation lists the KAI Guide Fortran compiler option incorrectly. It should be:

     -WG,-cmpo=i

Platforms where Smart Single Stepping Occurs

The documentation (page 131) states that smart single stepping occurs "on all platforms except Compaq Alpha Linux." This is incorrect. Smart single stepping occurs on the following platforms: IBM RS/6000, Compaq Alpha, Linux Alpha, SGI, and HP-UX. (pr2201)

Using LD_BIND_NOW

If you are executing a dynamically linked program, calls from the executable into a shared library are made using the Procedure Linkage Table (PLT). Each function in the dynamic library that is called by the main program has an entry in this table. Normally, the dynamic linker fills the PLT entries with code that calls the dynamic linker. This means that the first time that user code calls a function in the dynamic library, the dynamic linker is actually called. The linker then modifies the entry so that the next time this function is called, the dynamic linker is not involved.

This is not the behavior you want or expect when debugging a program because TotalView will either:

And, because the entry is altered, everything appears to work fine the next time you step into this function.

On most operating systems (except HP), you can correct this problem by setting the LD_BIND_NOW environment variable. For example:

     setenv LD_BIND_NOW 1

This tells the dynamic linker that it should alter the PLT when the program starts executing rather than doing it when the program calls the function.
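
The setenv example above uses C shell syntax and changes the environment of the whole shell session. If you use a Bourne-style shell, or want the setting to apply to a single command only, a small wrapper is one option. The following is a minimal sketch; the function name with_bind_now is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with LD_BIND_NOW=1 in its environment
# only, leaving the calling shell's environment unchanged.
with_bind_now() {
    env LD_BIND_NOW=1 "$@"
}

# Example: start TotalView on a program with immediate binding in effect.
#   with_bind_now totalview a.out
```

To set the variable for the whole session instead, the Bourne-shell form is LD_BIND_NOW=1; export LD_BIND_NOW.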

HP-UX does not have this (or an equivalent) variable. On HP systems, you can avoid this problem by linking the executable being debugged with the -B immediate option or by invoking chatr with the -B immediate option. (See the chatr documentation for complete information on how this command is used.)

You will also have to enter pxdb -s on. For more information, see Appendix B of the TotalView User's Guide.


Problems on All Platforms

All platforms have the following problems:

Xoftware and Motif Problems

If you set Xoftware version 8 to emulate setting Motif properties, modal dialog boxes for TimeScan and TotalView can become system modal; that is, they prevent input to any other window until the dialog box is dismissed. If some other problem occurs at this time, you will need to reboot your NT workstation.

You can avoid this problem by selecting the Windows Option tab from within the Options->Configuration dialog box and setting Motif Properties to off.

C++ Exceptions

TotalView does not have full support for C++ exceptions. Single-stepping over code that will throw an exception is problematic and often results in the process running away. To help with this situation, TotalView will detect when an exception throw is going to occur while single-stepping.

By default, TotalView brings up a dialog box to ask if you wish to stop the process. Answering No continues the process. Be aware that if you are stepping within the "try" block, your process may run away. Answering Yes stops the process upon entry into the system runtime routine that issues the throw. This is a temporary solution and full C++ exception handling may be provided in a future TotalView release.

This mechanism is available for all supported C++ compilers on the SGI IRIX 6.x, Power AIX, Alpha Compaq Tru64 UNIX, and SPARC SunOS 5 (Solaris 2.x) platforms. (See TotalView 4.1 Platforms and System Requirements.)

The following user interface controls are available for turning this dialog box on and off:

If this option is turned off, TotalView does not catch C++ exception throws during single-step operations. This may cause the single-step operation to lose control on the process and cause it to run away.

EGCS Problems

The abbreviation table created by EGCS 2.91.66 is incorrect, and TotalView may print an error message as a result. This abbreviation table problem was fixed in gcc 2.95.2.

You have two alternatives:

Evaluation point with a goto and a step

If an evaluation point executes a goto statement or an assembly language transfer of control instruction, and you use the step or next command at the line where the evaluation point is enabled, TotalView continues the program and the step or next command does not complete. To regain control, type ^C into the program window.

Fortran Arrays Whose Size Changes

The following problem applies only to Fortran arrays whose size changes and from which you are displaying only a single element, either because you have dived or because you have used the Variable (v) command from the Function/File/Variable menu with an array index.

When a data pane displays a single element of a Fortran array that has runtime bounds (that is, assumed shape, assumed size, allocatable, or a pointer), and the actual bounds change, the value displayed in the data pane applies to the wrong element in the reshaped array.

To overcome this problem, display the whole array, then dive to the element that you want to see. Alternately, if you select the specific element of interest by setting the slice expression rather than by diving, the correct element always displays, even if the array changes shape.

FLEXlm Hunting For Multiprocessor Features

When FLEXlm reads your license.dat file, it hunts for multiprocessor feature lines when you start a debugging session with more than two processors. The messages:

     (toolworks) UNSUPPORTED: "TV/<hdwr>-<OS>/MP/<n>"

which may appear in your license.log, may be safely ignored.
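
If these messages make license.log hard to read, you can filter them out when viewing the file. The following is a minimal sketch: the function name hide_mp_unsupported is hypothetical, and the pattern assumes that each message appears on one line in the form shown above.

```shell
#!/bin/sh
# Hypothetical helper: copy license.log lines from standard input to standard
# output, dropping the harmless multiprocessor-feature UNSUPPORTED messages.
hide_mp_unsupported() {
    # grep -v exits with status 1 when every line is filtered out;
    # "|| true" keeps that case from looking like a failure.
    grep -v 'UNSUPPORTED: "TV/.*/MP/' || true
}

# Example:
#   hide_mp_unsupported < license.log
```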

fvwm Version 1 Problems

There are problems with the fvwm version 1 window manager. Some users have reported that TotalView triggers bugs in version 1.22d of the fvwm window manager (and presumably earlier versions, too). However, the last release of fvwm version 1 (release 1.24r) is believed to work correctly with TotalView. Therefore, if you are using the fvwm version 1 window manager, we recommend that you use version 1.24r. We have not tested any later versions. You can find full details on fvwm at http://fvwm.math.uh.edu/.

Flush Pending Evaluation Command Can Corrupt Target Process

The use of the Flush Pending Evaluation command of the expression window may corrupt the target process. When the following three conditions hold:

TotalView shows the thread in an inconsistent state: the target threads are still at the breakpoint inside the function, but the stack backtrace shows it where the expression was invoked. As a result, TotalView may: (a) correctly show the source line where the process really is (from whatever line you invoked the expression); or (b) it may mistakenly show the line of the breakpoint in the function.

Further, if you try to continue the target process, one of the following will happen:

To avoid a crash or a hang, toggle the breakpoint that TotalView is reporting as current (disable and then reenable it) before continuing the process. However, on SUN5 (and on all other platforms after you've toggled the breakpoint appropriately), if the process was sitting at a breakpoint when you called the function from the expression window, TotalView immediately hits that breakpoint again.

Attaching to Portland Group HPF Jobs Is Not Supported

TotalView does not support attaching to Portland Group HPF jobs. If you attempt to attach to Portland Group HPF jobs, you may not see all of the processes that the job is composed of, and you may not be able to display distributed variables.

MPICH 1.2.0 Cannot Locate libtvmpich.so Library

If you are running MPICH 1.2.0, TotalView cannot find the libtvmpich.so library. Installing patch 4959 (downloadable at http://www.mcs.anl.gov/mpi/mpich/buglist-tbl.html) fixes this problem.

Function Static Variables May Be Invisible When Using KCC

The KCC compiler moves a static variable from the function in which it is declared and places the declaration at file or global scope. It also mangles the name to show that the variable ought to be at function scope. Unfortunately, TotalView does not understand this mangling.

Visualizer Can Have Font Problems under CDE

In some cases, you will receive an error message when starting the Visualizer that complains about fonts. For example:

     Warning: Cannot convert string "-dt-interface system-medium-r-normal
                    -xl*-*-*-*-*-*-*-*-*" to type FontSet
     Warning: Unable to load any usable fontset
     Warning: 
          Name: FONTLIST_DEFAULT_TAG_STRING
          Conversion failed.  Cannot load font.

You can fix this problem in several ways:


Problems on Compaq Alpha Tru64 UNIX Platforms

The following are known problems with this platform:

For additional information, see Compaq Alpha Tru64 UNIX.

TotalView Segmentation Fault at Start-up on a Node Within a Cluster After Executing for Several Days

TotalView gets a segmentation fault at start-up on a node within a cluster after executing for several days. This problem occurs over time on new nodes until no node is capable of running TotalView.

Compaq has determined that this is an NFS problem. You can fix this problem by applying the patch found at the following location:

     http://ftp1.support.compaq.com/public/osf/v5.0a/

(pr 2153)

Thread Debugging Problems On All Versions of Compaq Tru64 UNIX

Because of a bug in the Alpha thread debugging support on Compaq Tru64 UNIX, the low-level thread hold operation can allow a held thread to run. TotalView uses the low-level thread hold operation to prevent a thread from running when single-stepping another thread.

For example, assume your program has two threads, thread A and thread B. Assume that thread A is stopped at a breakpoint, and thread B is stopped elsewhere but not at a breakpoint. To continue the process (that is, both threads), TotalView must step thread A off the breakpoint. To do this, TotalView holds thread B. Then it unplants the breakpoint where thread A is stopped, sets a temporary breakpoint at the next instruction, and continues the process. Because of the hold thread bug, both thread A and thread B may run even though thread B is held. This means that thread B may miss the real breakpoint and hit the temporary breakpoint instead.

The following behaviors can indicate the presence of this bug:

Using include and #include

If you compile Fortran 90 files with include or #include statements on the Compaq Alpha platform using the Compaq Fortran V5.0 compiler (or earlier), TotalView may show line numbers following the include statement at incorrect lines. This problem is fixed by the Fortran V5.1 compiler.

Anonymous unions Using GNU

The GNU compiler does not output debugging information for members of anonymous unions that are enclosed in other aggregates when using the ECOFF format on the Compaq Alpha. As a result, if you are debugging in such an environment, you will not see such members if you use TotalView to look at a data structure that contains them. Furthermore, the debugging information for the offsets of aggregate members that follow the anonymous union is output incorrectly, so these members will be displayed with incorrect values.

Planting Too Many Action Points Causes Problems

On a Compaq UNIX V4.0 or later system, TotalView commands that plant a large number of breakpoints cause an error message to be displayed when you run, continue, step, or otherwise start or resume execution of your program.

Compaq is aware of the problem, but a fix is not yet available.

You can temporarily work around this problem by using dbx to increase the vt_mapentries variable to something like 20,000. For example:

     dbx -k /vmunix
     assign vm_tune.vt_mapentries=20000
     quit

You can also alter vt_mapentries using the sysconfigdb program. Consult the man page for more information.

Setting a Breakpoint In a Large Shared-Memory Target Causes a SEGV

If setting a breakpoint causes the operating system to allocate shared page tables, reading information from these pages can lead to the program getting a SEGV and TotalView exiting with a resources lost message. You can avoid this problem by setting the value of ssm-threshold to 0. For example:

     # sysconfig -r ipc ssm-threshold=0
     ssm-threshold: reconfigured
     # sysconfig -q ipc ssm-threshold
     ipc:
     ssm-threshold = 0

Setting this value to 0 can degrade performance.

This problem has been reported, but a fix is not yet available.

Problems on HP HP-UX Platforms

The following are known problems with this platform:

Backtrace Problems While Stopped in Some Stubs in 32-Bit Applications

Some versions of the ld linker generate incomplete unwind information for relocation and export stubs. If TotalView is stopped in one of these stubs, the backtrace display can be affected. If you are using HP-UX 11.0, you can partially solve this problem by installing the PHSS_19866 patch, which is available from HP. The patch does not, however, repair executables or shared libraries that were supplied with the system and its compilers and were already linked with the faulty linker.

Debugging shared libraries

The dynamic library loader on HP-UX loads shared libraries into shared memory. Writing breakpoints into code sections loaded in shared memory can cause programs that are not under TotalView's control to fail when they execute an unexpected breakpoint.

If you need to single-step or set breakpoints in shared libraries, you must set your application to load those libraries in private memory. This is done using HP's pxdb command.

     pxdb -s on  appname  (load shared libraries into private memory)
     pxdb -s off appname  (load shared libraries into shared memory)

For 64-bit platforms, use pxdb64 instead of pxdb. At the present time, the version of pxdb64.exe supplied with HP's compilers does not work correctly. You must install patch PHSS_20122. You can download it from:

     ftp://us-ffs.external.hp.com/hp-ux_patches/s700_800/11.X/PHSS_20122

Fortran90 Compiler Problems

The HP Fortran90 compiler does not generate correct debug information for file names or line numbers when the C preprocessor is used. This is the default mode for files with the suffix .F. If your code does not need the C preprocessor, use the suffix .f90 or the command line option +cpp=no. If your code needs the C preprocessor, compile with the +cpp_keep option. This will save the intermediate file, which is normally deleted, for use by TotalView.

NFS filesystems Must Be Hard-Mounted

The debugger interface provided by HP-UX does not work when the executable being debugged is on a soft-mounted NFS filesystem. The filesystem that holds the executable must be hard-mounted.

Problem with Cray Pointers

The HP compiler does not emit a symbol for the "pointee" information for a Cray pointer. For example:

     pointer (iptr, ixx())
     iptr = malloc(100) 

Because no information exists for symbol ixx, you are not able to look at it. You can, however, modify the pointer's type and then look at its contents. For example, you could change the type of iptr from integer to a pointer such as <real*4>*, dive through it, then add bounds to the type.

Single-Stepping

TotalView does not allow you to step into functions that have not been bound by the run-time linker. If you wish to step into such a function, set a breakpoint at that function and run to the breakpoint.

You can avoid this problem by linking the executable being debugged with the -B immediate option or by calling chatr with the -B immediate option.


Problems on SGI IRIX Platforms

The following are known problems with this platform:

For additional information, see SGI IRIX 6.x MIPS.

Expression System Forces Real Function Result into a long Temporary

When a program is compiled with the Fortran 90 compiler, the TotalView expression evaluation system erroneously converts real function results. The SGI Fortran 90 compiler fails to emit the return type of the function, so TotalView assumes that the return type of the function is a long. When assigned to a real variable, the return result of the function is erroneously converted from a long to a real, when in fact the function had already returned a real. Here is an example:

     real function x_to_the_y_power(x, y) 

TotalView expression:

     real result
     result = x_to_the_y_power(2.0, 4.0)

This problem, which does not occur with the Fortran 77 compiler, has been reported to SGI. (pr 2296)

TotalView will not find main

TotalView cannot find the main program of a Fortran 90 program that does not contain a PROGRAM statement, and it does not display any source code for such a program. You can correct this problem by adding a PROGRAM statement to your main program. (pr 2099)

Using #include and -cpp Together in Fortran 90

If source files contain #include statements and are compiled with the -cpp switch on a Fortran 90 program using the MIPSpro compilers, TotalView generates incorrect line numbers. To avoid this problem, use the standard Fortran include statement (without the -cpp switch).

Fortran Arrays With Runtime Bounds Display Problem

Some Fortran arrays with runtime bounds are displayed improperly. Because of a limitation in the debug output produced by the SGI Fortran 90 compilers, this happens for arrays which are the targets of pointers embedded in a user-defined type which has itself been arrayed. Consider the following code:

     type array_ptr
          real, dimension (:), pointer :: ap
     end type array_ptr
      
     type (array_ptr), allocatable, dimension (:) :: arrays
 
     allocate (arrays(20))
     do i = 1,20
          allocate (arrays(i)%ap(i))
     end do

TotalView reports the bounds of the elements arrays%ap incorrectly. Unfortunately, there is nothing we can do in TotalView to overcome the fact that the compiler has generated invalid debug information for the runtime bounds for these elements.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 2 and later for TotalView 3.9 and later.

Inadvertent Single-stepping into System Routines

The single-step commands sometimes step into system routines.

Cannot Find Source Code For System Routines Complaint

TotalView occasionally complains about not being able to find the source code for system routines (such as printf()).

Evaluation System Cannot Access Fortran 90 Up-level Variables

Because of bugs in the SGI F90 7.2.1 and earlier compilers, access to F90 up-level variables does not work from EVAL expressions in the evaluation system. Those variables are correctly located and displayed in data panes, however.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 1 and later for TotalView 3.9 and later.

Fortran 90 Pointer Variables Not Correctly Identified

F90 pointer variables are not correctly identified as pointers because of incomplete debugging information generated by the compilers. TotalView displays the target data correctly, however.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 2 and later for TotalView 3.9 and later.

Shows Multiple Instances of Virtual Base Classes

Because of SGI 7.1 C++ compiler bugs, when that compiler is generating debugging information, TotalView shows multiple instances of virtual base classes. Normally only one instance is correct, which is the one that is of type pointer to the base class.

Incorrect Lower Bound for Allocatable Arrays

Because of SGI 7.2 F90 compiler bugs, when that compiler is generating debugging information, TotalView (and other debuggers) do not show the correct lower bound and element count for allocatable arrays in modules and pointers in common blocks. This bug has been fixed in the SGI 7.2.1 F90 compiler.

Shows Pointers With Unlimited Bounds With Bound of 1

Because of SGI 7.2.1 F90 compiler bugs, when that compiler is generating debugging information, TotalView shows the target of Cray pointers with unlimited bounds as having an upper bound of 1. Consider the following code:

     subroutine test (ixx, n)
          common /sf/ iptr
          pointer (iptr, ita(*))
          ... etc ...
     end

In this example, the compiler generates debug information for ita that indicates it has an upper bound of 1. This is incorrect because it has an unlimited upper bound.

Cannot Show Target of a Formal Parameter

Because of SGI 7.2.1 F90 compiler bugs, when that compiler is generating debugging information, TotalView cannot show the target of a formal parameter Cray pointer. Consider the following code:

     subroutine rex (rp)
          pointer (rp, p(8))
          p(2) = 6.
          p(5) = 3.
          write (*,*) "Should be 6,3 - ",p(2), p(5)
          return
     end

In this example, the compiler generates debugging information for p without any addressing information.

This problem should be fixed in the MIPSpro F90 compiler version 7.3 Beta 1 and later for TotalView 3.9 and later.

Bad Template Names May Be Generated

Because of a compiler bug in the SGI 7.2 and 7.2.1 compilers, bad template type names may be generated for certain template instantiations. This problem is fixed by Patch 3492: MIPSpro 7.2.1 C++ front-end rollup #4, which is available at the SGI Support Web Site.

KCC Does Not Put Original File Name Into Symbol Table

For code compiled with KCC on IRIX, the original file name (before preprocessing) is not put into the symbol table. This prevents you from asking for the file by name until TotalView processes all the file's symbols.

If you use the --keep_gen_c option to the KCC compiler, you can use the following TotalView command: f xxx.int.c (where your original source file was xxx.C) to force full symbol processing of that file, after which you'll be able to do f xxx.C.

Attaching To SHMEM Jobs Is Not Supported

TotalView does not support attaching to SHMEM jobs. If you attempt this, you may not see all of the job's processes and the process leader may not be properly identified; this could hang your job.

Cray Pointers in Common Blocks Broken

The debugging information generated by the SGI 7.3 Fortran compiler for the targets of Cray pointers contained within common blocks contains the wrong address. Here is an example:

     common a1(1000)
     common /ptrs/ jj,iparray,kk
     pointer (iparray,array)
     iparray = loc(a1)
     end

array is a real variable that is the target of the Cray pointer iparray. Because the address is wrong, TotalView cannot show you the correct values for the iparray variable. This bug has been reported to SGI. (The SGI 7.2.1 and earlier versions of the compiler do not have this bug.)

Arrays in "main" Are Not Found Unless Declared in Common

If an array is declared in main, the SGI MIPSpro 7.3.3 compiler does not create debugging information for the variable. Consequently, TotalView does not know that the array exists. You can work around this problem by placing the array in a common block.


Problems on RS/6000 Platforms

The RS/6000 platform has the following known problems:

For additional information, see RS/6000 Power AIX.

Cannot Start TotalView on MPI Tasks After Installing PTF Set 4 on AIX 4.3.3

After installing PTF Set 4 on some versions of AIX 4.3.3, TotalView begins executing and then hangs after you answer the "Do you wish to stop the parallel tasks?" question. If you have this problem, please send email to support@etnus.com.

ptrace Attaching Fails

Versions of the AIX kernel after AIX 4.3.3.1 contain a bug that causes a ptrace attach to fail for some programs. In particular, attaching to a Parallel Environment program may fail. You can solve this problem by installing IY10784, whose description is:

     IY10784: ATTACH FAILS TO THE CHILD PROCESS OF A ROOT PROCESS

For information on installing this APAR, see RS/6000 System Patch Procedures.

Multithreaded Problems

You may experience some problems when debugging multithreaded programs, because of limitations in the ptrace() operating system call.

The following problems can show up while you are debugging multithreaded applications:

  1. When a thread stops (e.g., hits a breakpoint) all the other threads stop. If any of the other threads stops while in a system call (e.g., read, sleep, select, etc.), however, ptrace() does not allow the debugger to read the thread's registers. As a result, TotalView:
    • Cannot display the registers, including the program counter; but does display the stack pointer
    • Cannot show you which system call is being executed
    • Cannot single-step using the step or step-over command, but return out of function and run to selection work
    • Cannot display the top stack frame

      If you have a multithreaded application which makes a lot of system calls, this might mean that most of your threads are not fully debuggable whenever one of them stops.

      TotalView shows you which threads are stuck in the kernel by displaying their state as In Kernel (K).

  2. When a thread is created or destroyed, the system does not notify the debugger of this event. As a result, the list of threads displayed by TotalView may be stale when the program is running.

      If the process stops for any reason, TotalView automatically updates the thread list. You may also type Current/Update/Relatives -> Update Process Info (u) to force the thread list to update.

Calling Dynamic Objects From Expression Window

If a routine in a dynamic object is called from the expression window, and if the target routine is never called from the main program, then TotalView refuses to call the routine.

XL Fortran Problems Generating Incorrect Section Numbers

Code compiled with the XL Fortran for AIX compiler Versions 4.1 and 5.1 may contain incorrect section numbers for bstat (.bs) and estat (.es) symbols. TotalView detects any incorrect section numbers and generates a warning in a dialog box for the first such problem only. TotalView notes any additional incorrect section numbers on its message output only. Symptom: common blocks have invalid addresses.

Patches for the 5.1 compilers are available through the normal AIX FixDist Web site as follows:

  1. Point your Web browser to http://service.software.ibm.com/support/rs6000/.
  2. Click on the Downloads link
  3. Click on the Software Fixes link
  4. Click on the AIX Fix Distribution Service link
  5. Click on the Search by: PTF Number radio button
  6. Enter the PTF number U457231 in the box and click on the Find Fix button
  7. Select the PTF U457231 - xlfcmp.5.1.0.2 item from the list
  8. Select your version of AIX
  9. Click on the Get Fix Package button. The following list of fix packages appears. You must download all of them:

    Filesets needed for selected item   Information file   Byte size
    xlfcmp.5.1.0.2                      README             18824192
    xlfrte.5.1.0.2                      README             24093696
    xlsmp.rte.1.0.0.1                   README             63488

  10. Using your browser, download each of the files and put them into a common directory, for example, /tmp/xlfpatches. The three files are named xlfcmp.5.1.0.2.bff, xlfrte.5.1.0.2.bff, and xlsmp.rte.1.1.0.1.bff.
  11. Use the AIX smit tool to install all three patches from the directory.

GNU Demangling Problem

Some small C++ programs compiled with the GNU compiler on AIX may not be recognized by TotalView as having been compiled with the GNU compiler. In these cases, TotalView will not demangle various program names. To make TotalView demangle the names in these programs properly, specify -demangler=gnu on the command line.

poe Interferes With CLI's use of stdin

Because poe tries to manage stdin on behalf of its target processes, the CLI cannot read from stdin. If your target processes do not use stdin, using the -stdinmode none option to the poe command allows the CLI to use stdin. Unfortunately, this option is incompatible with the poe command's -cmdfile option.

If your processes do use stdin, your only recourse is to redirect stdin from within the CLI. For example:

     drun < in.txt

(pr 2078, 2422)

TotalView Can Hang in Parallel Session Running in the Background

On AIX systems, TotalView can hang if you have a TotalView parallel debug session running in the background in an xterm window, and you type anything in the underlying xterm window while the poe process is stopped. Type the fg command in the TotalView xterm window to clear up this condition.

AIX May Only Create a Partial Core File

Recent versions of AIX (4.1 or later) dump a partial core file by default. In general, a partial core dump contains only enough information to give a stack backtrace for the faulting thread. User data sections as well as some other potentially useful information are only available in a full core dump.

To force a full core dump on AIX, you must set a signal flag with sigaction for the signal that caused the core dump. For example:

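     #include <signal.h>

     struct sigaction act;
     act.sa_handler = SIG_DFL;          /* keep the default action */
     sigemptyset (&act.sa_mask);        /* block no additional signals */
     act.sa_flags = 0;                  /* defined value if neither case applies */
     if (bigcore)
          act.sa_flags = SA_FULLDUMP;   /* request a full core dump */
     else if (smallcore)
          act.sa_flags = SA_PARTDUMP;   /* request a partial core dump */
     sigaction (SIGSEGV, &act, 0);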

Process Contention Scope Not Supported

On AIX 4.3.1, 4.3.2, and 4.3.3 systems, TotalView supports debugging pthread programs running in pthread-compatibility mode or pthreads scheduled in system contention scope, that is, each pthread is bound to a kernel thread (the 1:1 thread scheduling model). TotalView does not support process contention scope, that is, multiple pthreads scheduled in user mode (M:N thread scheduling model).

On AIX 4.3.1, 4.3.2, and 4.3.3, when using TotalView to debug a program built with libpthreads.a, you must force the 1:1 model using the procedure outlined in Forcing 1:1 Thread Scheduling Mode on RS/6000 Systems.

GPFS File System Not Supported

TotalView 4.1 will not work on the current release of the GPFS file system due to limitations in this file system. No file that must be read by TotalView should be stored on a GPFS system until these limitations are corrected.

If you attempt to debug a program that has executable components (the image or dynamic libraries) on GPFS, you will see error messages like this from TotalView:

     aix_lookup_symbol_in_load_module: Failed to map module; errno = 109

pthdb_pthread() Returns an Empty pthread List

Sometimes when a process is stopped and the pthdb_pthread() function is used to obtain a list of pthreads, the returned list is empty even when there are pthreads. (TotalView displays a console message saying that there are no more threads.) You can fix this problem by applying the APAR IY06378 patch to your system. The procedure for obtaining and applying patches is described in RS/6000 System Patch Procedures.


Problems on SPARC SunOS 5 Platforms

The SPARC SunOS 5 and QSW CS-2 platforms have the following known problems:

Apogee 4.0 compilers must be patched

The Apogee 4.0 compilers on SUN4 and SUN5 require a patch to bring them up to revision level 4.010. Follow the Apogee 4.0 Compiler Patch Procedures.

Breakpoints in thunks may cause crash

Using breakpoints in thunks may lead to unexpected results, including having the target program crash unexpectedly. A thunk is a small linkage routine that connects a subroutine call to the actual subroutines in a dynamic library. The SPARC SunOS 5 dynamic loader modifies the code in the thunks during program execution, which conflicts with TotalView's planting and unplanting of breakpoints. The first time through a thunk, the thunk branches to the dynamic loader, and the dynamic loader modifies the thunk to branch directly to the corresponding dynamic library routine. Subsequent trips through the thunk branch directly to the dynamic library routine.


Problems in the Portland Group HPF 2.4 Compiler

The Portland Group HPF has the following problem:

The Portland Group HPF compiler generates bad debugging information for TotalView in cases where the compiler needs to generate static initialization subprograms. The symptom of this problem is that some line numbers in the HPF source window are not associated with actual Fortran source, and TotalView either disallows setting breakpoints at some lines or sets the breakpoint in the wrong place. This bug occurs quite often with WHERE constructs.


Problems in Linux

The following problems exist on all Linux Platforms:

For platform-specific information, see:

TotalView Versions Before 4.1.0-2 Can Break When Upgrading Compiler/Linux Version.

The first thing to do is check your version of libbfd. TotalView works with libbfd-2.9.5.0.22. You can correct this problem by installing the library that TotalView expects to find in one of the following ways:

A precompiled Linux i386 binary can be obtained from the Etnus web page. This binary is straight out of the binutils-2.9.5.0.22-6.i386.rpm from Red Hat.

libbfd is a part of GNU binutils. See http://sourceware.cygnus.com/binutils and http://www.gnu.org/ for sources and information on the GNU General Public License under which binutils is distributed.

Problems Using LIBDBFORK

The Linux implementation of the ptrace() debug function is flawed. This implementation reassigns the parent process of the process being debugged to TotalView. This means that an error will occur when the process's real parent attempts to wait for it, because the parent will not find the child (the child now belongs to TotalView).

As this is a Linux kernel problem, TotalView cannot be patched to solve it. (The same problem will occur with any debugger on Linux.) (pr 2100)

KDE Redisplay Problem

Some users have reported that TotalView menus appear without a surrounding box when using KDE. The only workaround for this problem is to use a different window manager.

Thread Debugging and errno

When using pthreads on Linux, the errno variable is actually a C macro defined as follows in bits/errno.h:

     #define errno (*__errno_location ())

This definition allows each thread to have its own errno value. Unfortunately, the program does not contain information that allows TotalView to find this thread-specific errno value, and a single global errno variable still exists.

The result is that displaying errno in any thread other than the initial one in a process is likely to be very misleading, since you will see the global errno variable, rather than the per-thread value accessed by your code through the macro above.

exec() Not Yet Supported

Debugging threaded programs (pthreads) that call exec() is not yet supported.

GCC g77 Problem with Common Blocks

The GCC g77 compilers do not output debugging information for common blocks. Consequently, TotalView cannot show the values of variables in common blocks.

Problems occur with Linux X86

     --- traps.c.orig        Thu Dec  9 21:39:40 1999
     +++ traps.c     Thu Dec  9 21:49:13 1999
     @@ -354,10 +354,11 @@
             unsigned int condition;
             struct task_struct *tsk = current;
      
     +       __asm__ __volatile__("movl %%db6,%0" : "=r" (condition));
     +       tsk->tss.debugreg[6] = condition;
     +
             if (regs->eflags & VM_MASK)
                     goto debug_vm86;
     -
     -       __asm__ __volatile__("movl %%db6,%0" : "=r" (condition));
     
             /* Mask out spurious TF errors due to lazy TF clearing */
             if (condition & DR_STEP) {

Problems occur with Linux Alpha

For additional information, see Compaq Alpha Linux Red Hat and Intel x86 Linux Red Hat.


Problems in the CLI

In some cases, the CLI does not let you specify a variable's scope. For example, #foo#bar#x should identify a variable x within executable bar contained within the executable named foo. You may get errors indicating that the variable is not found.


Reporting Problems

If you experience any problems with TotalView, or if you have questions or suggestions, please contact us:

     Etnus Inc.
     111 Speen Street
     Framingham, MA 01701-2090
     Internet email: support@etnus.com
     1-800-856-3766 in the United States
     (+1) 508-875-3030 worldwide


TotalView Problem Report Form and Instructions

Copy and paste the form listed below into your email editor. Then, complete and return the form.

Document just one problem on a form. Remove or replace all data fields (<...>) with a selection or with contents. Do not remove '>' characters at the beginnings of lines.

All data requested below is helpful to us, though not all is necessary to solve each problem. Supply as much detail as you can.

If your problem involves TotalView execution, attach or FTP a reproducible example.

How to Prepare and Send Your Example

Create a directory named repro and place your problem files in it.

Add the following files to repro:

Package your repro directory as follows:

     cd ../repro/../
     tar cvf - repro | compress -c > repro.tar.Z

If repro.tar.Z is larger than about 1 MB, use FTP to send it to:

     ftp://ftp.etnus.com/incoming/repro.tar.Z-<your email address>

You should also cut and paste `ls -l repro.tar.Z` to the end of your form.

If repro.tar.Z is less than about 1 MB, use uuencode to package your information:

     uuencode repro.tar.Z > repro.tar.Z.uu < repro.tar.Z

You should also cut and paste `cat repro.tar.Z.uu` to the end of your form.

Email the form as shown in the header in the next section.


Problem Report Email Header


To: support@etnus.com, bug-toolworks@etnus.com
Subject: <copy value of Synopsis field, below  .  .  .  .  .  .  .  .  > 


Problem Report Email Body

Here is the Problem Report form:

>Submitter-Id:  <primary contact's *simplest* E-mail address (one line)>
>Originator:    <originator's name .  .  .  .  .  .  .  .  . (one line)>
>Organization:
 <originator's E-mail signature block .  .  .  .  .  . (multiple lines)>
>Confidential:  <[ no | yes ]   .  .  .  .  .  .  .  .  .  . (one line)>
>Synopsis:      <synopsis of the problem .  .  .  .  .  .  . (one line)>
>Severity:      <[ non-critical | serious | critical ]  .  . (one line)>
>Priority:      <[ low | medium | high ] .  .  .  .  .  .  . (one line)>
>Category:      totalview
>Class:         <[ sw-bug | doc-bug | change-request | support ] (1 ln)>
>Release:       4.1.0-4
>Environment:
 
 System:     <`uname -a` output .  .  .  .  .  .  .  .  .  .  .  .  .  >
 Platform:   <machine make&model, processor, etc. .  .  .  .  .  .  .  >
 OS:         <OS version and patch level .  .  .  .  .  .  .  .  .  .  >
 ToolChain:  <compiler version, linker version, etc. .  .  .  .  .  .  >
 Libraries:  <versions of parallel runtimes or standard libraries   .  >
 
 <other environmental factors you think are relevant . (multiple lines)>
 See also index.txt.
 
>Description:
 
 <precise description of the problem  .  .  .  .  .  . (multiple lines)>
 See README.TXT.
 
>How-To-Repeat:
 
 <step-by-step: how to reproduce the problem   .  .  . (multiple lines)>
 See repro.txt.
 
>Fix:
 
 <how to correct or work around the problem, if known  (multiple lines)>
 
>Unformatted:
<misc. comments, forwarded E-mail, encoded repro  .  . (multiple lines)>
 
`ls -l repro.tar.Z`  or  `cat repro.tar.Z.uu`
 
<End of problem report E-mail body.>


How to Contact Us

Etnus LLC.
111 Speen Street
Framingham, MA 01701-2090
Internet Email: info@etnus.com
1-800-856-3766 in the United States
(+1) 508-875-3030 worldwide

Visit our web site at http://www.etnus.com/.


Section 8: Patching Compilers and Operating Systems

Compaq Tru64 UNIX Patch Procedures

NOTICE TO ALL Compaq Tru64 UNIX TOTALVIEW CUSTOMERS

All versions of TotalView 3.4, 3.7, 3.8, 3.9, 4.0, and 4.1 for Compaq Tru64 UNIX versions V4.0B, V4.0C, V4.0D, V4.0E, and V5.0A require that you patch the Compaq Tru64 UNIX operating system before running TotalView. You must apply the entire patch kit; partial patch kits are not supported.

Do not run TotalView without first patching your Compaq Tru64 UNIX V4.0-based system! Failure to patch your Compaq Tru64 UNIX operating system before running TotalView will cause system crashes, system hangs, hung and unkillable processes, and TotalView malfunctions.

NOTE:     The patch procedure requires that you have root user privileges on your systems.

Patch files include an operating system version, a patch number, and a patch date. For example, the patch file named duv40bas00007-19980514.tar is a patch file for Compaq Tru64 UNIX 4.0B, with a patch number of 00007 representing the patch level, and a patch date of 19980514.

No matter what patch level you actually install, the patch contains all the prior patches up to and including the current patch. Install the latest patch to get the operating system and runtime library versions required to run TotalView. Follow the step-by-step directions below to download the software and prepare to install the patch kit.

Retrieving and Applying the Compaq Tru64 UNIX V4.0x Aggregate ECO

Step 1: Retrieve the DUNIX V4.0 Aggregate ECO files and save them to your system using the following procedure:

Step 2: Print the patch procedure documentation contained in the .README and .pdf (or .ps) files, and the PatchInstallGuide.htm or PatchInstallGuide.pdf file.

Step 3: Follow the directions contained in the patch procedure documentation and install this patch on your Compaq Tru64 UNIX V4.0x systems. Perform this procedure in single-user mode, rebuild your kernel, and reboot your system.

Minimum Patch Level for 4.0D Systems

If you are patching a Compaq Tru64 UNIX 4.0D system, use patch kit DUV40DAS0005-19991007 or later. Earlier patch kits do not contain all of the required Compaq Tru64 UNIX 4.0D patches.

Minimum Patch Level for 4.0E Systems

If you are patching a Compaq Tru64 UNIX 4.0E system, use patch kit DUV40EAS0002-19990617 or later.

Minimum Patch Level for 5.0A Systems

If you are patching a Compaq Tru64 UNIX 5.0A system, use patch kit t64v50aas0002-20001004 or later.

Getting Help With the Patch Procedures

If you need help following these procedures or have any questions, follow the directions for "Reporting Problems".


Apogee 4.0 Compiler Patch Procedures

The Apogee 4.0 compilers require a patch that brings the compiler version up to 4.010 (or later). You must obtain this patch directly from Apogee Software Inc. To get the Apogee 4.0 compiler patches for the SPARC, visit the Apogee ftp site at ftp://ftp.apogee.com/pub/users/apogee/patches/SPARC/ and read the README file.


Portland Group HPF 2.4 Compiler Patch Procedures

Some of the Portland Group HPF 2.4 distribution libraries for the RS/6000 Power AIX, SPARC SunOS 5 (Solaris 2.x), and SGI IRIX 6.x MIPS platforms may require a patch or a new installation to enable debugging HPF programs using TotalView.

TotalView depends on the tvdebug.o library module, which must contain the symbol information that TotalView uses. PGI must therefore build the tvdebug.o module with debugging information enabled.

If TotalView issues the following message when debugging your Portland Group HPF application:

     MPICH library contains no type definition for struct MPIR_PROCDESC.

then it is likely that the Portland Group HPF library you are linking with contains a copy of the tvdebug.o module that was not compiled with debugging information enabled. To check for this, extract a copy of the tvdebug.o module from the libraries in your Portland Group HPF installation and look for the missing symbol.

In the directions below, $INSTALLDIR must be set to the directory where your Portland Group HPF compiler was installed. $LIBRARY is either libpghpf_rpm.a or libpghpf_smp.a, depending on whether you are using RPM or SMP run-time support.

To check for the MPIR_PROCDESC symbol in your libraries, follow the platform-dependent directions below.

For the SGI IRIX 6.x MIPS platform:
  1. Enter the following commands:

    ar -xf $INSTALLDIR/sgi/lib-64/$LIBRARY tvdebug.o

    dwarfdump -a tvdebug.o | grep MPIR_PROCDESC

  2. If the MPIR_PROCDESC symbol is not found, download a new copy of the Portland Group HPF 2.4 compiler and reinstall it. The latest 2.4 release appears to correct this problem, so we have not provided a patch for the SGI IRIX 6.x MIPS platform.

For RS/6000 Power AIX or SPARC SunOS 5 (Solaris 2.x) platforms:
  1. Depending on your platform, enter the following commands:

    RS/6000 Power AIX:

         ar -xf $INSTALLDIR/rs6000/lib/$LIBRARY tvdebug.o

         dump -vtd tvdebug.o | grep MPIR_PROCDESC

    SPARC SunOS 5 (Solaris 2.x):

         ar -xf $INSTALLDIR/solaris/lib/$LIBRARY tvdebug.o

         dump -vsn .stabstr tvdebug.o | grep MPIR_PROCDESC

  2. If the MPIR_PROCDESC symbol is not found, you will need to patch your libraries. You need this patch only if you are using the HPF RPM or SMP run-time support. The four libraries that require patching are:
     
         libpghpf_rpm.a
         libpghpf_rpm_p.a
         libpghpf_smp.a
         libpghpf_smp_p.a
     
  3. All libraries require a new version of the module tvdebug.o. You can download and install fixed versions of these modules from our support site. Download the following files and save them to your system:

         ftp://ftp.etnus.com/support/toolworks/pgi/PGI_HPF_2.4_patch.tar

         ftp://ftp.etnus.com/support/toolworks/pgi/PGI_HPF_2.4_patch.README

  4. Follow the directions contained in the PGI_HPF_2.4_patch.README file to install the tvdebug.o module in these libraries.
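
The symbol checks in the platform-specific directions above all end by searching the dump output for MPIR_PROCDESC. If you run the check often, you could wrap that last step in a small filter such as the one below. The function name report_procdesc and its messages are hypothetical; the sketch only reads whatever your platform's dump command (dwarfdump or dump, as shown above) writes to standard output.

```shell
# Hypothetical filter: read symbol dump output on standard input and report
# whether the MPIR_PROCDESC symbol appears in it.
report_procdesc() {
    if grep MPIR_PROCDESC >/dev/null; then
        echo "MPIR_PROCDESC found: tvdebug.o has debugging information"
    else
        echo "MPIR_PROCDESC missing: patch or reinstall the HPF libraries"
        return 1
    fi
}

# Example use on RS/6000 Power AIX, after extracting tvdebug.o as shown above:
#   dump -vtd tvdebug.o | report_procdesc
```

The filter returns a nonzero status when the symbol is missing, so it can also be used in a script that decides whether to download the patch.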


Sun WorkShop 5.0 Compiler Patch Procedures

Due to bugs in the initial release of the Sun WorkShop 5.0 FORTRAN 77 and Fortran 90 compilers, you must apply several Sun-provided patches to your compiler.

Visit the following URL:

     http://access1.sun.com/workshop/current-patches.html

Then download the following patches:

     Patch Number   Synopsis
     107356         Fortran 90 2.0: Patch for Fortran 90 (f90) 2.0 compiler
     107357         Compiler Common 5.0: Patch C 5.0, C++ 5.0, F77 5.0, F90 2.0
     107377         Fortran 90 2.0: Patch for 64-bit Fortran 90 (f90) 2.0 compiler
     107989         Fortran Common 5.0: Patch F77 5.0, F90 2.0
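
To see which of these patches still need to be applied, you could filter a list of installed patches through a function like the one below. This is a sketch under stated assumptions: the function name missing_workshop_patches is hypothetical, and it assumes the installed-patch list contains lines of the form "Patch: <number>-<revision>", which is the layout we expect from the Solaris showrev -p command. Verify the output format on your own system before relying on it.

```shell
# Hypothetical filter: read an installed-patch list on standard input and print
# the number of each required Sun WorkShop 5.0 patch that does not appear in it.
# Assumes each installed patch is listed as "Patch: <number>-<revision> ...".
missing_workshop_patches() {
    installed=$(cat)
    for id in 107356 107357 107377 107989; do
        printf '%s\n' "$installed" | grep "Patch: $id-" >/dev/null || echo "$id"
    done
}

# Example use on a Solaris system:
#   showrev -p | missing_workshop_patches
```

If the function prints nothing, every patch in the table was found in the list.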


RS/6000 System Patch Procedures

Patches for AIX are available through the normal AIX FixDist Web site as follows:

  1. Point your Web browser to http://service.software.ibm.com/support/rs6000/.
  2. Click on the Downloads link.
  3. Click on the General Software Fixes link.
  4. Click on the AIX Fix Distribution Service link.
  5. Click on the Search by: APAR Number radio button.
  6. Enter the APAR number in the box and click on the Find Fix button.
  7. Select the APAR item from the list.
  8. Select your version of AIX.
  9. Click on the Get Fix Package button.
  10. Using your Web browser, download the filesets and put them into a directory; for example, /tmp/apar123.
  11. Use the AIX smit tool to install the patch from the directory.


Notices

Copyright (c)1999-2000 by Etnus LLC. All rights reserved

Copyright (c)1999 by Etnus Inc. All rights reserved

Copyright (c)1996-1998 by Dolphin Interconnect Solutions, Inc.

Copyright (c) 1993-1996 by BBN Systems and Technologies, a division of BBN Corporation.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without the prior written permission of Etnus LLC. (Etnus).

Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.

Etnus has prepared this document for the exclusive use of its customers, personnel, and licensees. The information in this document is subject to change without notice, and should not be construed as a commitment by Etnus. Etnus assumes no responsibility for any errors that appear in this document.

TotalView, TimeScan, and Gist are trademarks of Etnus LLC.

All other brand names are the trademarks of their respective holders.


Etnus LLC
http://www.etnus.com
Voice: (508) 875-3030
Fax: (508) 875-1517
support@etnus.com
info@etnus.com