High Performance Computing: Scalable I/O Project

HPSS Test Definition

To accurately reflect the HPSS capabilities delivered to ASCI users today, the HPSS systems used in the demonstrations were in full production and were serving production users concurrently with these Milepost tests. The ASCI Snow and Frost platforms were used to demonstrate basic functionality, and the White platform was used to demonstrate this functionality at scale and under load.

A total of five demonstrations were developed. In each demonstration, a series of like-sized files was transferred from the application platform to an HPSS production archive using the PFTP interface. The tests varied in the following ways:

  • File size: Files were either large (2GB) or small (20MB).
  • Concurrency: From two to six files were transferred at a time (concurrently).
  • Platform: The ASCI application platforms used were Snow, Frost and White.
  • Target HPSS system: Depending on the application platform, production SCF (White) or OCF (Frost and Snow) HPSS systems were used concurrently with regular production user loads.
  • Duration: The number of files to be transferred was calculated to yield a test duration of two hours (a sizing sketch follows this list). A simple miscalculation doubled the length of one test to four hours, but we let the test run anyway to demonstrate extended stability.
  • GPFS load: One test was intentionally run concurrently with an introduced heavy platform GPFS load.
  • Networks: The number and type of network connections to HPSS varied depending on the application platform used. The Snow demonstration used a single 100Mb Ethernet connection. The Frost demonstration used two Jumbo Gigabit Ethernet links. The White demonstrations used eight Jumbo Gigabit Ethernet links.
  • Application Nodes: The number of "login nodes" used to run PFTP varied by platform, from one node on Snow and Frost to two on White.
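
The file-count sizing can be expressed as a short calculation. The sketch below is a minimal illustration, assuming the count was chosen from an expected aggregate transfer rate; the 60 MB/sec figure is a hypothetical example, not a number from this report.

    def files_for_duration(target_hours, expected_mb_per_s, file_size_mb):
        """Number of like-sized files needed to fill the target test window."""
        return round(target_hours * 3600 * expected_mb_per_s / file_size_mb)

    # An assumed expected rate of ~60 MB/sec with 2 GB files gives a count
    # in the neighborhood of the 216 files used in Demo 2.
    print(files_for_duration(2, 60, 2048))   # -> 211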

The following table details the demonstrations:

Demo  File Size  Application Host  Concurrent Transfers  GPFS Load  Duration
1     2 GB       Snow              2                     No         4 hrs
2     2 GB       Frost             3                     No         2 hrs
3     2 GB       White             6                     No         2 hrs
4     20 MB      White             6                     No         2 hrs
5     2 GB       White             6                     Yes        2 hrs

HPSS Milepost Results

All runs were made with the enhanced parallel FTP (PFTP) client in parallel mode. The default pwidth was 4 and the default pblocksize was 1,048,576 bytes (1 MB). All files resided in GPFS on the corresponding systems.
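
As a rough sketch of how such transfers could be driven, the fragment below launches several concurrent PFTP sessions, each setting the stripe width and block size before storing its share of files. The binary name (pftp), host, file paths, and the omission of authentication are assumptions made for illustration; setpwidth, setpblocksize and pput are commonly documented HPSS PFTP commands, but exact usage is site-specific.

    import subprocess

    HPSS_HOST = "hpss.example.gov"    # hypothetical archive host
    PWIDTH = 4                        # parallel stripe width used in all runs
    PBLOCKSIZE = 1048576              # parallel block size in bytes (1 MB)

    def start_session(files, remote_dir):
        """Launch one PFTP session that stores the given files into HPSS."""
        script = [f"setpwidth {PWIDTH}", f"setpblocksize {PBLOCKSIZE}"]
        script += [f"pput {f} {remote_dir}/{f.split('/')[-1]}" for f in files]
        script.append("quit")
        proc = subprocess.Popen(["pftp", HPSS_HOST],
                                stdin=subprocess.PIPE, text=True)
        proc.stdin.write("\n".join(script) + "\n")
        proc.stdin.close()
        return proc

    # Example: three concurrent sessions, each storing its own slice of
    # files, roughly as in Demo 2 (file paths are placeholders).
    all_files = [f"/gpfs/milepost/file{i:04d}" for i in range(216)]
    sessions = [start_session(all_files[s::3], "/archive/milepost") for s in range(3)]
    for proc in sessions:
        proc.wait()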

Run 1: Large Files

Machines: Snow to OCF HPSS Storage
Network: 100 megabit ethernet
Date/time: November 23, 12:06:38 pm to 3:59:33 pm
Duration: 3 hours 52 minutes 55 seconds
Number of files: 80
File size: 2 gigabytes
Login nodes: 1
Concurrent sessions: 2
Total data moved: 0.16 terabytes
Average throughput: 11.5 megabytes per second
Comment: Limited by network bandwidth at 92%
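
The 92% figure is consistent with simple arithmetic, assuming it refers to utilization of the raw 100 Mb link rate:

    link_mbit = 100
    link_mbyte = link_mbit / 8        # ~12.5 MB/sec raw link capacity
    measured = 11.5                   # MB/sec, Run 1 average throughput
    print(f"link utilization: {measured / link_mbyte:.0%}")   # -> 92%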

Run 2: Large Files

Machines: Frost to OCF HPSS Storage
Network: 1 gigabit jumbo frame ethernet, 2 links/layers
Date/time: November 26, 1:04:29 pm to 2:57:50 pm
Duration: 1 hour 53 minutes 21 seconds
Number of files: 216
File size: 2 gigabytes
Login nodes: 1
Concurrent sessions: 3
Total data moved: 0.432 terabytes
Average throughput: 63.5 megabytes per second
Comment: Limited by OCF HPSS Storage disk bandwidth

Run 3: Large Files

Machines: White to SCF HPSS Storage
Network: 1 gigabit jumbo frame ethernet, 4 links/layers
Date/time: December 4, 8:56:12 am to 10:42:41 am
Duration: 1 hour 46 minutes 29 seconds
Number of files: 540
File size: 2 gigabytes
Login nodes: 2
Concurrent sessions: 6
Total data moved: 1.08 terabytes
Average throughput: 169.0 megabytes per second
Comment: Projected to a full 2 hours, 1.2 terabytes would have been moved
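
The projection is straightforward to reproduce (assumed arithmetic, extending the measured rate over a full two-hour window):

    rate = 169.0                      # MB/sec, Run 3 average throughput
    window = 2 * 3600                 # two hours, in seconds
    projected_tb = rate * window / 1_000_000    # decimal terabytes
    print(f"projected volume: {projected_tb:.2f} TB")   # -> ~1.22 TB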

Run 4: Small Files

Machines: White to SCF HPSS Storage
Network: 1 gigabit jumbo frame ethernet, 4 links/layers
Date/time: December 8, 12:00:39 pm to 1:58:56 pm
Duration: 1 hour 58 minutes 17 seconds
Number of files: 13,947
File size: 20 megabytes
Login nodes: 2
Concurrent sessions: 6
Total data moved: 0.279 terabytes
Average throughput: 39.3 megabytes per second
Comment: Less than 1/4 of the Run 3 throughput because of the small file size
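
A rough, assumed estimate of the fixed per-file cost implied by these numbers (not a figure from the report) compares the wall time spent per file against the time 20 MB would take at Run 3's per-session streaming rate:

    files, sessions = 13_947, 6
    wall = 1 * 3600 + 58 * 60 + 17            # Run 4 duration in seconds
    per_file = wall / (files / sessions)      # ~3.05 s of wall time per file
    data_time = 20 / (169.0 / sessions)       # ~0.71 s at Run 3's per-session rate
    print(f"implied per-file overhead: {per_file - data_time:.1f} s")   # ~2.3 s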

Run 5: Large Files Under Load

Machines: White to SCF HPSS Storage
Network: 1 gigabit jumbo frame ethernet, 4 links/layers
Date/time: December 8, 2:00:46 pm to 3:59:36 pm
Duration: 1 hour 58 minutes 50 seconds
Number of files: 498
File size: 2 gigabytes
Login nodes: 2
Concurrent sessions: 6
Total data moved: 0.996 terabytes
Average throughput: 140 megabytes per second
Comment: An intentionally introduced heavy GPFS load yielded a 17% drop in performance compared to Run 3

HPSS Milepost Results vs Delivered Performance

The results of the HPSS Milepost runs closely mirror performance and stability delivered to ASCI platform users on a daily basis. Following are comparisons of four Milepost runs to typical HPSS production:

Comparison 1: Milepost Run 4 vs White small file load.
On October 19, 2000, a single user transferred 1,600 small files from White to SCF storage, mimicking Milepost Run 4. The user sustained an average of 41 MB/sec over the transfers, exceeding the 39 MB/sec measured in Run 4.

Comparison 2: Milepost Run 3 vs White large file performance measurements.
In mid-December 2000, a series of performance tests between White and the production SCF HPSS system was run. The graph below shows the results of these runs. At a 2 GB file size and six concurrent sessions, the production performance test achieved 222 MB/sec, exceeding the 169 MB/sec achieved in Milepost Run 3, which used the same file size and number of sessions.

Graph: Data (Write) Transfer Rates

Comparison 3: Milepost Run 2 vs Frost user offload.
During the last two weeks of January 2001, users moved 8.8 terabytes of data into the OCF HPSS system. One LANL user on Frost transferred 1.3 terabytes in 91 files (file sizes ranged from 1.9 GB to 35.7 GB) to HPSS in 7 hours and 12 minutes, at a sustained rate of 50.5 MB/sec. While this is slightly less than the Run 2 performance (64 MB/sec), it was concurrent with an extremely heavy offload of Frost data and used the NFT user interface rather than PFTP.

Comparison 4: Milepost Run 5 vs White large file offload.
During the first two days of February 2001, a number of very large files were stored from White to the SCF HPSS system. While the numbers presented below reflect single-file performance rather than aggregate throughput, they show the per-file transfer rates provided to users storing large files from White.

User     File Size (gigabytes)     Per-File Transfer Rate (MB/sec)
User1    239.7                     67.4
User2    221.0                     74.5
User3    21.5                      83.7
User4    8.8                       84.2

We believe the above comparisons show that the HPSS Milepost runs are representative of the service that HPSS provides to ASCI platform users.

HPSS Milepost/PSE Involvement

The High Performance Storage System (HPSS) is a large collaborative software development project, begun in 1993 as a Cooperative Research and Development Agreement (CRADA) between government and industry. The HPSS collaboration is based on the premise that no single organization has the experience and resources to meet all the extreme challenges represented by the growing storage system I/O, capacity and functionality imbalances present in high-performance computing environments such as ASCI.

For the past five years, the ASCI PSE Archival Storage Project has been the primary funding agent for HPSS and has guided the requirements, design, development and Tri-Lab deployment of HPSS. Tri-Lab PSE developers lead the design and development of the most critical HPSS software components, and their efforts ensure that ASCI user priorities are accurately represented in new HPSS releases. Testbed and production HPSS systems are deployed and supported in open and secure computing environments at all three Tri-Lab sites.

While the PSE provides major support, over 20 organizations have contributed to the success of HPSS, and IBM markets the HPSS system. This large, successful collaboration has garnered an R&D 100 Award, ISO 9001 certification and SEI Capability Maturity Model (CMM) Level 3 status. It is, however, the ASCI program's requirements for file, capacity and performance scalability, as well as security, that continue to drive the HPSS Project. As ASCI code development teams and their applications continue to raise the bar for HPSS, PSE developers continue to respond to these challenges.

Benefit to the ASCI Program

The most obvious and direct benefits of PSE's involvement in, and funding of, the HPSS Project are made evident by the PSE Milepost runs. These runs demonstrated HPSS providing ASCI codes and code developers with a scalable, stable, high-performance archive in which the results, images and work products generated by ASCI multi-TeraOp machines could be stored. This archive not only provides permanent and safe storage of the fruits of the ASCI computational investment, but also frees up platform disk resources, allowing codes and users to continue generating stockpile simulations while moving data to more cost-effective storage media.

As mentioned above, the ASCI Program's extreme storage requirements demand the performance, security and scalability provided by HPSS. The value of the ASCI program's ability to directly and immediately influence HPSS requirements, based on unique ASCI user and security needs, cannot be overstated. At the same time, the ASCI Program directly benefits by leveraging the substantial investment and expertise that IBM, Sun, StorageTek and other HPSS development partners bring to the HPSS product and its functionality. This collaboration and its product directly serve ASCI users around the clock, 365 days a year, at all three ASCI laboratories.

PSE Milepost: HPSS & Disk I/O