Libraries/Building Executables


Library Paths, Implementations, and Machine Availability

Implementation         Platforms/Machines    Paths
IBM MPI (US or IP)     IBM SPs               /usr/lpp/ppe.poe/include
MPL MPICH (US or IP)   IBM SPs               Default version: /usr/local/mpi
                                             Oldest version:  /usr/local/old_mpi
                                               (man pages:    /usr/local/old_mpi/man)
                                             Newest version:  /usr/local/new_mpi

Notes about Paths

Many of the paths are symbolic links. The actual paths sometimes change for minor bug fixes and other maintenance. For MPICH and IBM's MPI, there are scripts that link in the necessary libraries and include directories. When accessing the library files and include directories explicitly, care must be taken that the -I and -L paths are consistent with the MPI used. Users are encouraged to use the standard MPICH or IBM scripts as they will automatically provide all required macro definitions, environment settings, include and library paths, and platform-specific libraries.

Other Notes

  • MPICH vs. vendor MPIs

    The vendor MPI implementations and Argonne's MPICH are different. Most vendor libraries yield faster communication; therefore, the vendor libraries are generally recommended over MPICH. However, maintaining compilability with MPICH can often simplify debugging. Because the include files (in particular, mpi.h) may be different, a complete recompile, as well as reload, may be necessary when switching between a vendor library and MPICH. For example, the Compaq and Quadrics MPIs are MPICH-based and appear to be compatible with only a reload, provided no MPI-2 features are being used; your mileage may vary, of course. Care should be taken when building applications using MPI with libraries that also use MPI to guarantee that consistent MPI implementations are used by both.

  • Man Pages

    The man pages for the MPI-related commands and scripts described in the following sections are available either in the default search paths or by using:

    man -Mmpi_man_path mpi_command

    where mpi_man_path is the man path for the MPI as indicated in the table above, and mpi_command is the command (e.g., mpcc_r or mpirun).

    For example, if you want to see the man page for mpif90 on an Intel Linux cluster, you would use:

    man -M/usr/lib/mpi/man mpif90

  • C++

    C++ support for most of the MPIs is limited to C++ compatibility mode, in which C++ codes may invoke the MPI C routines. The MPI-2 standard defines C++ bindings and classes, and the class definitions developed at Notre Dame are slowly being incorporated into MPICH and the vendor MPIs.

    The MPICH configurations on most of our systems are still limited to C++ compatibility mode. The MPI-2 C++ interfaces are included with MPICH releases starting with version 1.2.4; on our systems, the MPICH 1.2.4 C++ interfaces are currently available only on the IBM SPs. The MPI-2 C++ interfaces are also available on the IBM SPs with AIX 5.1 and PSSP 3.2 and above.

  • MPI-IO

    MPI-IO is available in all supported versions of MPICH and in IBM MPI. The IBM MPI-IO is a vendor implementation. All other MPI-IO support is the ROMIO implementation, packaged as part of MPICH.

  • One-Sided Communication

    The MPI-2 one-sided communication APIs are supported in IBM MPI, but they are not yet supported in MPICH.



Using IBM MPI

Two variations of the IBM MPI library are available: a threaded library and a signal library. (Note: The signal MPI library does not work on Power4 or Power5 systems.) The threaded library processes MPI calls in a separate, kernel-bound thread, while the signal library uses interrupts to ensure progress of MPI calls. The threaded library is thread-safe and is the default library used, whether or not the thread-safe compiler scripts (e.g., mpcc vs. mpcc_r) are used.

Note that the signal library yields slightly faster communication, but the compiled code is not thread-safe. Performance of the threaded library is comparable to that of the signal library if all MPI calls are made from a single user thread and the environment variable MP_SINGLE_THREAD is set to yes.
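
For example, in csh (the syntax used in the examples throughout this document), that hint can be set before running:

setenv MP_SINGLE_THREAD yes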

In order to link with the signal library, you cannot use the thread-safe compiler scripts (e.g., mpcc_r) and you must set the environment variable LLNL_COMPILE_SINGLE_THREADED=TRUE.

Both libraries support two communication methods: User Space (US) and Internet Protocol (IP). US is an IBM OS-bypass mechanism that provides user processes with fast access to the communication hardware. Latencies with IP are about a factor of five higher than with US. Current configurations limit the number of US processes per node to the number of CPUs per node.

The IBM MPI libraries will use shared memory for on-node communication and the network interface for all off-node communication. The shared memory communication is enabled by setting the environment variable MP_SHARED_MEMORY=yes, which is the default setting in our current system configuration. Using the shared memory configuration provides slightly faster on-node communication at the cost of higher CPU overhead. Its impact on performance depends on the MPI calls that are used; generally, it will improve performance for blocking MPI calls, while codes that use nonblocking MPI calls can see performance degradation. Shared memory communication can be disabled by setting MP_SHARED_MEMORY=no.
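
For example, to turn off shared memory communication for a code dominated by nonblocking MPI calls (csh syntax):

setenv MP_SHARED_MEMORY no
codex args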

How Do I Compile and Load with IBM's Threaded MPI Library?

The threaded MPI library can be used with C, C++, Fortran77, Fortran90, or Fortran95 codes.

Recall that in the LLNL default configuration, all the mp* compilation scripts are mapped to _r versions (e.g., mpcc and mpcc_r are equivalent).

C Example

mpcc_r -g code.c -o codex

C++ Example

mpCC_r -g code.C -o codex

Fortran77 Example

mpxlf_r -g code.f -o codex

Fortran90 Example

mpxlf90_r -g code.F -o codex

Fortran95 Example

mpxlf95_r -g code.f -o codex

How Do I Compile and Load with IBM's Signal MPI Library?

The signal MPI library can be used with C, C++, Fortran77, Fortran90, or Fortran95 codes. Note: The signal MPI library does not work on Power4 or Power5 systems.

The following examples show that, for both the compilation and load steps, the nonthreaded compile scripts must be used and the environment variable LLNL_COMPILE_SINGLE_THREADED must be set to TRUE. In this case, the mp* scripts are not equivalent to the _r versions.
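
For example, in csh, set the variable before invoking any of the compile commands below:

setenv LLNL_COMPILE_SINGLE_THREADED TRUE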

C Example

mpcc -g code.c -o codex

C++ Example

mpCC -g code.C -o codex

Fortran77 Example

mpxlf -g code.f -o codex

Fortran90 Example

mpxlf90 -g code.F -o codex

Fortran95 Example

mpxlf95 -g code.f -o codex

How Do I Run with IBM's MPI?

The resulting executable may be run with poe, using environment variables or command-line arguments to set job parameters.

There are many environment variables that affect the performance tuning of the IBM MPI. The default user environment sets MP_EUILIB=us, MP_SHARED_MEMORY=yes, and a few other environment variables most often needed for MPI or other parallel programs. You may find that additional settings, or overrides of the defaults, are necessary for optimal performance of some codes.
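
For example, one way to override the defaults is to run over IP without shared memory, which can be useful when isolating communication problems (csh syntax; this particular combination is only an illustration):

setenv MP_EUILIB ip
setenv MP_SHARED_MEMORY no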

Execution Line

To execute your code with n nodes and p processes in the indicated pool with US communications, use the environment settings as shown in the following example:

setenv MP_NODES n
setenv MP_PROCS p
setenv MP_RMPOOL pool
codex args

Alternatively, using poe command-line arguments:

poe codex args -nodes n -procs p -rmpool pool

Batch Script

# Sample LCRM script submitted with psub
#PSUB -ln n
#PSUB -g p

cd /myhome/mydirectory

poe ./codex args

How Do I Debug with IBM's MPI Using TotalView?

totalview poe -a codex args -nodes n -procs p -rmpool pool

Note: TotalView needs to know that the job is actually a poe job, so

totalview codex

does not work.



Using MPICH

We support three versions of MPICH: a default version, a latest installed version, and an oldest supported version. All versions may be used with C, C++, Fortran77, or Fortran90 through the MPICH compilation and run scripts, which are accessed through symbolic links in /usr/local/bin, as described in the examples below. All versions are installed under /usr/local because it is assumed that users will commonly have /usr/local/bin in their PATH environment variable.

The most stable recent version of MPICH is the default. The default version of MPICH is installed as /usr/local/mpi and is accessed through links in /usr/local/bin to the standard MPICH compilation and run scripts. The compilation scripts are mpicc, mpiCC, mpif77, and mpif90, and the run script is mpirun, which is used to execute programs created using the compilation scripts.

The latest version of MPICH is installed as /usr/local/new_mpi, and there are links in /usr/local/bin to its corresponding scripts. The compilation scripts are new_mpicc, new_mpiCC, new_mpif77, and new_mpif90, and new_mpirun is used to execute programs built with those scripts. When a new MPICH release becomes available, the previous latest release will become the default, and the new latest release will be installed as new_mpi.

The oldest version of MPICH is installed as /usr/local/old_mpi, and there are links in /usr/local/bin to its corresponding scripts. The compilation scripts are old_mpicc, old_mpiCC, old_mpif77, and old_mpif90, and old_mpirun is used to execute programs built with those scripts. When a new version of MPICH becomes the default version, the previous default becomes old_mpi.

Users should be aware that the installed versions of MPICH can vary across platforms or machines. In general, we try to keep the versions consistent, but there can be a lag in migrating a new version to all systems because of programmatic requests. Other than very short lags to update links across the full set of machines, these versions will be consistent across machines of the same platform, and, usually, across platforms.

On the Intel Linux Cluster we only support a default version, so the remarks here about old_ and new_ versions do not apply there.

We install the best MPICH abstract device interface (ADI) available for each platform. On the IBM SPs, this is the MPL device, which is able to interface with poe and to make use of the SP switch in both US and IP modes.

Note: No thread-safe version of MPICH is available.

The MPICH Scripts

As stated above, /usr/local/bin contains soft links to the MPICH scripts for all the currently supported versions. The standard MPICH script names are linked to the default MPICH path. For example, /usr/local/bin/mpicc is a link to /usr/local/mpi/bin/mpicc. Other MPICH script names in /usr/local/bin are links to the additional MPICH versions that are supported, using the prefixes old_ and new_, so that these names are derived from the standard MPICH script names.

We use symbolic links so that different names can distinguish the different versions installed, because all MPICH versions provide the same script names, relative to their installation paths. For example, new_mpicc is a link to /usr/local/new_mpi/bin/mpicc, while mpicc is a link to /usr/local/mpi/bin/mpicc. Please note that the scripts for each version do differ, and cannot be used interchangeably; e.g., you cannot use mpirun to execute a program built with the old_mpicc script.

We have made site-specific modifications to the MPICH scripts in some cases. On the IBM SPs, the compilation scripts will automatically set the environment variable LLNL_COMPILE_SINGLE_THREADED=TRUE to prevent unintentional mixing of IBM's threaded MPL library with MPICH definitions.

The MPICH compilation scripts add configuration-specific macro definitions and automatically set the appropriate include directories and link in the appropriate libraries. Users are discouraged from accessing the MPI libraries and include files explicitly; they are subject to change with new versions of MPICH, and path names and MPI support libraries needed vary by platform. If explicit paths and libraries are required, consult the information in /usr/local/docs/MPI_Use_Summary on the platform you are using for more details on the paths and libraries needed.

Each MPICH compilation script is configured to use a specific C, C++, or Fortran compiler, typically the native compiler on the given platform. MPICH allows the user to change the compiler and linker/loader used by these scripts by defining appropriate environment variables, as described below. Note that you generally use the same command for both the compiler and the linker/loader, which requires setting a pair of MPICH environment variables (e.g., MPICH_CC=gcc and MPICH_CLINKER=gcc).

MPICH_CC          alternate C compiler
MPICH_CLINKER     alternate C loader
MPICH_CCC         alternate C++ compiler
MPICH_CCLINKER    alternate C++ loader
MPICH_F77         alternate Fortran77 compiler
MPICH_F77LINKER   alternate Fortran77 loader
MPICH_F90         alternate Fortran90 compiler
MPICH_F90LINKER   alternate Fortran90 loader
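
For example, to compile and link with gcc instead of the configured native C compiler (a minimal csh sketch using the variables listed above):

setenv MPICH_CC gcc
setenv MPICH_CLINKER gcc
mpicc -g code.c -o codex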

To determine what other definitions or paths are provided by the version of MPICH you are using, use the -compile_info or -link_info option with any of the MPICH compilation scripts, such as mpicc, to see the options those scripts pass to the underlying compiler. This can help you supply the MPICH-required options in your own compile and link commands, if you are not using the MPICH scripts, to guarantee compatibility with MPICH.
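
For example, to display the compile-time and link-time options used by the default mpicc:

mpicc -compile_info
mpicc -link_info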

Executables built with the oldest or newest versions of MPICH should be run using the corresponding old_mpirun or new_mpirun, respectively, as there could be subtle differences in the runtime environments created by each.


Using InfiniBand MPI on Linux

Mellanox/OSU MVAPICH MPI is based on MPICH 1.2.7. Currently, there is no support for MPI one-sided communications.

MPI compiler wrapper scripts are available in /usr/local/bin/, which is in the default $PATH. These scripts mimic the familiar MPICH scripts: they automatically add the appropriate MPI include paths, link in the necessary MPI libraries, and pass the remaining switches to the underlying compiler.

Type [scriptname] -help for a list of command-line options. Scripts available are:

Script Name   Underlying Compiler
mpicc         gcc (typically)
mpiCC         g++ (typically)
mpif77        f77 (typically)
mpif90        f77 (typically)
mpigcc        gcc
mpig++        g++
mpig77        g77
mpiicc        icc
mpiicpc       icpc
mpiifort      ifort
mpipgcc       pgcc
mpipgCC       pgCC
mpipgf77      pgf77
mpipgf90      pgf90
mpipathcc     pathcc
mpipathCC     pathCC
mpipathf90    pathf90
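
For example, to list the command-line options accepted by the default C wrapper mentioned above:

mpicc -help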

Note: See the Environment Variables page for environment variable settings used with MPI runs.



Using MPL MPICH on the IBM SPs

On IBM SP platforms, MPICH uses IBM's proprietary Message Passing Library (MPL), which supports both US and IP communication (see the IBM MPI section). Because no thread-safe version of IBM's MPL exists, MPICH cannot use the _r compilers. MPICH users must therefore set LLNL_COMPILE_SINGLE_THREADED=TRUE on the IBM machines. Failure to set this environment variable can result in missing externals at load time or in inappropriate mixing of MPICH and IBM's MPI definitions, which can generate illegal/bad communicator errors at run time. The MPICH scripts set this environment variable automatically, but it must be set explicitly if you are not using the scripts.

Note that the MPL MPICH mpirun provides SMP support. By default it will use n = ceiling(p/4) nodes, placing up to 4 tasks on each node, where p is the number of processes requested. The -nodes n option overrides this behavior, where n is the desired number of nodes. MPL MPICH will distribute the p tasks evenly across the n nodes, or complain if it cannot distribute them evenly. Note that mpirun also understands several IBM environment variables, such as MP_NODES and MP_TASKS_PER_NODE, to determine the number of nodes to use, but these must be consistent with the -np option on the mpirun command. If no -np option is specified, the default number of processes is 1.
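
For example, under the defaults described above, a request for 8 tasks is placed as 4 tasks on each of ceiling(8/4) = 2 nodes:

mpirun -np 8 codex args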

How Do I Compile and Load with MPL MPICH?

The following examples demonstrate how to use the MPICH scripts on the IBM SPs. Most of the examples use the default script names, but the oldest or newest versions of MPICH supported are also available with the old_ or new_-prefixed names as indicated.

C Examples

mpicc -g code.c -o codex

Alternatively, to use the oldest or newest versions of MPICH:

old_mpicc -g code.c -o codex
new_mpicc -g code.c -o codex

C++ Examples

mpiCC -g code.C -o codex

Alternatively, to use KAI as the C++ compiler on the IBM machines:

setenv MPICH_CCC mpKCC
mpiCC -g code.C -o codex

Fortran77 Example

mpif77 -g code.f -o codex

Fortran90 Example

mpif90 -g code.F -o codex

How Do I Run with MPL MPICH?

The resulting executable is run with mpirun, old_mpirun, or new_mpirun, as appropriate. Although MPICH-compiled executables can generally be run as serial jobs on most platforms, it is strongly recommended that MPL MPICH jobs be run only through the mpirun scripts.

Execution Line

To run with p processes or tasks on n nodes:

mpirun -nodes n -np p codex args

or, setting the number of nodes through the environment:

setenv MP_NODES n
mpirun -np p codex args

Alternatively, use:

mpirun -np p codex args

to run with the default of up to 4 tasks per node.

Batch Script

# Sample LCRM script submitted with psub
#PSUB -ln n
#PSUB -g p

cd /myhome/mydirectory

mpirun -np p ./codex args

How Do I Debug MPL MPICH Using TotalView?

Using the -tv option to mpirun will start your executable under TotalView.

mpirun -tv -np p codex args


Using Open MPI

You need $OMPI/bin in your $PATH and $OMPI/lib in your $LD_LIBRARY_PATH, where $OMPI refers to the Open MPI installation directory. The ompi_info command lists various configuration settings, the compilers used, the available components, and more.
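
A minimal csh sketch, assuming OMPI has already been set to the Open MPI installation root and that LD_LIBRARY_PATH is already defined:

setenv PATH ${OMPI}/bin:${PATH}
setenv LD_LIBRARY_PATH ${OMPI}/lib:${LD_LIBRARY_PATH}
ompi_info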

Compiler wrappers are provided. Users should never need to explicitly link against any OMPI libraries.

C Example

mpicc code.c -o code

C++ Example

{mpiCC, mpic++, mpicxx} code.C -o code

Fortran 77 Example

mpif77 code.f -o code

Fortran 90 Example

mpif90 code.F -o code

By default, Open MPI will use the fastest networks available. On a Peloton system with InfiniBand, for example, shared memory will be used for communication between processes on the same node, and InfiniBand will be used for communication across nodes. A single-node MPI job will use shared memory by default. A multinode MPI job across nodes without InfiniBand will use TCP for communication. Networks can be explicitly selected via an MCA parameter, which will be discussed below.

Running Open MPI under SLURM

Within a batch script or interactive allocation, Open MPI automatically detects how many nodes are available and how many cores each node has. For example, in a two-node allocation on Atlas (8 cores per node):

$ mpirun ./hello
atlas34 is rank 0 of 16
atlas34 is rank 1 of 16
atlas34 is rank 2 of 16
atlas34 is rank 3 of 16
atlas34 is rank 7 of 16
atlas34 is rank 4 of 16
atlas34 is rank 5 of 16
atlas34 is rank 6 of 16
atlas35 is rank 8 of 16
atlas35 is rank 9 of 16
atlas35 is rank 10 of 16
atlas35 is rank 11 of 16
atlas35 is rank 12 of 16
atlas35 is rank 13 of 16
atlas35 is rank 14 of 16
atlas35 is rank 15 of 16

Note that adjacent ranks are grouped onto the same node. The number of processes may be explicitly specified with the -np parameter. Also, ranks may be assigned in a round-robin fashion across available nodes using the -bynode parameter:

$ mpirun -np 4 -bynode ./hello
atlas34 is rank 0 of 4
atlas35 is rank 1 of 4
atlas34 is rank 2 of 4
atlas35 is rank 3 of 4

Open MPI supports runtime configuration via MCA parameters. MCA parameters may be specified on the command line, in the shell environment, and/or in per-user and per-installation configuration files. More information can be found in the OMPI FAQ.
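
For example, in addition to the -mca command-line form shown below, a parameter such as btl can be set through the environment or a per-user configuration file (standard Open MPI mechanisms; the value shown is only an illustration):

setenv OMPI_MCA_btl "tcp,self"
mkdir -p ~/.openmpi
echo "btl = tcp,self" >> ~/.openmpi/mca-params.conf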

As mentioned earlier, one way in which MCA parameters are useful is to select which network interconnects are used for communication. By default, the fastest network available is used; however, manually selecting a network (TCP, for example) may be useful for debugging purposes.

Use shared memory and InfiniBand

mpirun -mca btl openib,sm,self ./hello

Use only InfiniBand, no shared memory (even within one node)

mpirun -mca btl openib,self ./hello

Use only TCP

mpirun -mca btl tcp,self ./hello

Several parameters are useful for running large-scale jobs. These include:

oob_tcp_listen_mode listen_thread

Instructs mpirun to spawn a separate thread for accepting management connections from spawned processes. This slightly improves startup times.

btl_openib_ib_timeout    20

Increases the InfiniBand transmit timeout, which significantly reduces the occurrence of code 12 errors from the InfiniBand network.

mpi_preconnect_all    1

Establishes TCP management connections and MPI-level communication connections between all MPI processes during initialization. Generally not needed, though it may help with some applications that communicate between every pair of processes in the MPI job.

An example mpirun command line:

mpirun -mca oob_tcp_listen_mode listen_thread \
     -mca btl_openib_ib_timeout 20 \
     ./hello

Some useful debugging parameters include:

mpi_show_handle_leaks (default 0, enable by setting to 1)

Whether MPI_FINALIZE should show all MPI handles that were not freed.

mpi_show_mpi_alloc_mem_leaks (default 0, enable by setting to N)

MPI_FINALIZE will show up to N instances of memory allocated by MPI_ALLOC_MEM that was not freed by MPI_FREE_MEM.

mpi_no_free_handles (default 0, enable by setting to 1)

Enable to prevent OMPI from actually freeing MPI objects when their handles are freed.
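
For example, to have MPI_FINALIZE report any unfreed MPI handles for the hello program used above:

mpirun -mca mpi_show_handle_leaks 1 ./hello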

