BG/Q 5D Torus Point-to-Point Message Protocols

In addition to the discussion below, see Chapter 6 of the Blue Gene/Q Application Development Redbook.

The 'immediate' protocol cutoff determines when PAMI_Send_immediate() is used for the transport. PAMI_Send_immediate() can send at most 128 bytes, and the MPI msginfo metadata takes 16 of those bytes, so the 'immediate' protocol is used for MPI data lengths of 0-112 bytes. You can use the 'PAMID_SHORT' environment variable to reduce or eliminate the use of PAMI_Send_immediate().

The 'short' protocol is implemented as part of the 'eager' protocol: the 'eager' protocol uses a single packet when all information, both MPI msginfo and MPI data, fits into one packet. The 'eager' protocol uses PAMI_Send() for the transport. The BG/Q torus packet size is 512 bytes and the MPI msginfo metadata is 16 bytes, so this single-packet send is used for MPI data lengths of 0-496 bytes.
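The two size limits above follow from the same subtraction: the transport limit minus the 16 bytes of MPI msginfo metadata. The sketch below only restates that arithmetic; the constant and function names are made up for illustration and are not PAMI or MPICH identifiers.

```python
# Sizes taken from the text above. The names are illustrative only.
IMMEDIATE_MAX_BYTES = 128  # largest PAMI_Send_immediate() transfer
PACKET_BYTES = 512         # BG/Q torus packet size
MSGINFO_BYTES = 16         # MPI msginfo metadata sent along with the data


def max_mpi_data(transport_limit, metadata=MSGINFO_BYTES):
    """Largest MPI data length that still fits next to the metadata."""
    return transport_limit - metadata


immediate_max = max_mpi_data(IMMEDIATE_MAX_BYTES)  # upper end of the 'immediate' range
short_max = max_mpi_data(PACKET_BYTES)             # upper end of the single-packet range
print(immediate_max, short_max)
```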

When the MPI msginfo and MPI data do NOT fit into a single packet, the 'eager' protocol uses multiple packets: a single envelope packet followed by one or more data packets. The 'eager' protocol uses PAMI_Send() for the transport. There is no upper bound on the message size that multi-packet eager can handle, although in practice eager should be limited in order to avoid excessive memory consumption on the destination when receiving unexpected messages.

The 'rendezvous' protocol cutoff determines when to switch from a single PAMI_Send(), a.k.a. the 'eager' protocol, to a compound MPICH protocol. In the compound protocol the source uses PAMI_Send_immediate() to send the MPI msginfo metadata to the destination, the destination issues a PAMI_Rget() to retrieve the data from the source, and the destination then completes the transfer with an ack to the source using PAMI_Send_immediate(). The 'PAMID_RZV' environment variable specifies this cutoff. The default cutoff is 2049 bytes, which means that eager is used for MPI data lengths of 0-2048 bytes. A 2048 byte MPI_Send results in 5 packets sent to the destination: a single eager envelope packet followed by 4 "full" eager data packets.
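The packet count in the 2048 byte example can be reproduced with a small calculation. This is a sketch under one assumption inferred from that example: in the multi-packet case each data packet carries up to 512 bytes of MPI data, and the envelope packet is counted separately. The function name is hypothetical.

```python
PACKET_BYTES = 512   # BG/Q torus packet size (from the text)
MSGINFO_BYTES = 16   # MPI msginfo metadata (from the text)


def eager_packet_count(data_len):
    """Estimate the packets sent for an eager message of data_len bytes.

    Assumption: a multi-packet eager send is one envelope packet plus
    ceil(data_len / 512) data packets, matching the 2048 byte example.
    """
    single_packet_max = PACKET_BYTES - MSGINFO_BYTES  # 496 bytes
    if data_len <= single_packet_max:
        return 1  # metadata and data share one packet
    data_packets = (data_len + PACKET_BYTES - 1) // PACKET_BYTES  # integer ceiling
    return 1 + data_packets


print(eager_packet_count(2048))
```

The estimate covers only eager traffic; messages at or above the rendezvous cutoff are transferred with PAMI_Rget() and are not modeled here.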

With these environment variables you can completely disable immediate sends, eager (including 'short') sends, and rendezvous sends. You can disable only the non-short eager sends by setting PAMID_RZV=497, but there is currently no way to disable only the short eager sends.
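To see why PAMID_RZV=497 removes only the multi-packet eager sends, the protocol choice can be modeled as a function of the MPI data length. This is a simplified model, not the MPICH source: the function and parameter names are invented, and the order in which the checks are applied when the cutoffs overlap is an assumption.

```python
def select_protocol(data_len, rzv_cutoff=2049, immediate_max=112,
                    single_packet_max=496):
    """Return the protocol this model predicts for an MPI data length.

    rzv_cutoff plays the role of PAMID_RZV (default 2049 per the text).
    The precedence of the checks is an assumption and only matters when
    the cutoffs overlap.
    """
    if data_len <= immediate_max:
        return "immediate"    # PAMI_Send_immediate()
    if data_len >= rzv_cutoff:
        return "rendezvous"   # msginfo, then PAMI_Rget(), then ack
    if data_len <= single_packet_max:
        return "short"        # single-packet eager via PAMI_Send()
    return "eager"            # multi-packet eager via PAMI_Send()


# Default cutoff: multi-packet eager covers 497-2048 bytes.
print(select_protocol(1024))
# Cutoff of 497: the same message now goes through rendezvous.
print(select_protocol(1024, rzv_cutoff=497))
```

With a cutoff of 497, every length that would have needed more than one eager packet falls into the rendezvous range, while lengths of 113-496 bytes still take the single-packet path.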