Parallelism, Scalability, and Timing

Parallelism and Scalability Expectations

These benchmarks have been run on a wide variety of platforms: IBM SPs with four-way, eight-way and sixteen-way SMP nodes; clusters based on Compaq Alpha four-way and eight-way SMP nodes; a cluster based on SGI Origin 2000 256-way nodes; and several Sun SPARC-based systems with SMP nodes. They have also been used with MPICH-g for Grid-based computing tests.

The largest runs have used over 1500 MPI tasks. The current test harness supports runs of this size, although some of the collective tests do not scale well. Choose tests carefully when running MPI tests at large scale; even then, MPI implementations that scale poorly will be difficult to test at large scales.

The Pthreads tests are generally designed to require no more than three active threads at a time; additional tests that use more concurrent threads may be implemented in the future. All of the OpenMP tests scale up to at least 16 threads; their scaling should depend solely on the scaling of the implementation being tested.

Timing Issues

The code uses MPI_Wtime to measure wall clock time throughout its tests. Future versions may include the option to use the Unix gettimeofday call so that a non-MPI version can be built. The time to run the code depends on the tests selected, the implementations being tested and the scale of the system being tested. The Pthreads tests can typically all be run in a few minutes; the OpenMP and MPI tests can take much longer.
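
The timer choice described above could be isolated behind one small function. The following is only a minimal sketch of that idea, not the benchmark's actual code: the function name wall_time and the USE_MPI compile-time macro are hypothetical names chosen for illustration. When USE_MPI is defined the sketch calls MPI_Wtime; otherwise it falls back to gettimeofday, so it builds without an MPI library.

```c
/* Sketch only: wall_time() and USE_MPI are hypothetical names,
 * not identifiers taken from the benchmark source. */
#include <stddef.h>

#ifdef USE_MPI
#include <mpi.h>        /* MPI_Wtime */
#else
#include <sys/time.h>   /* gettimeofday, struct timeval */
#endif

/* Return wall clock time in seconds, measured from an arbitrary
 * fixed point in the past. Only differences between two calls
 * are meaningful. */
double wall_time(void)
{
#ifdef USE_MPI
    /* Valid only between MPI_Init and MPI_Finalize. */
    return MPI_Wtime();
#else
    struct timeval tv;
    gettimeofday(&tv, NULL);
    /* Combine whole seconds and microseconds into one double. */
    return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
#endif
}
```

A test would record wall_time() before and after the timed region and report the difference. Note that gettimeofday follows the system clock, so an adjustment to that clock during a run can distort a measured interval.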