Open Computing Facility—OCF

Ansel | Aztec | Cab | Catalyst | Herd | OSLIC | RZCereal | RZHasGPU | RZMerl
RZSLIC | RZuSeq | RZZeus | Sierra | Surface | Syrah | Vulcan | Visualization Servers

Ansel

Ansel is an M&IC capacity* resource for small to moderate parallel jobs. It is reserved for special projects in the S&T directorates.

Ansel
Nodes
   Login nodes (ansel[0,6]): 2
   Batch nodes: 296
   Debug nodes: 16
   Total nodes: 324
CPUs (Intel Xeon EP X5660)
   Cores per node: 12
   Total cores: 3,888
CPU speed (GHz): 2.8
Theoretical system peak performance (TFLOP/s): 43.5
Memory
   Memory per node (GB): 24
   Total memory (TB): 7.8
Peak CPU memory bandwidth (GB/s): 32
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capacity computing is accomplished through the use of smaller and less expensive high-performance systems to run parallel problems with more modest computational requirements.
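
For reference, the quoted peak can be reproduced from the table itself: 324 nodes × 12 cores × 2.8 GHz × 4 double-precision FLOPs per cycle (assuming the usual 4 FLOPs/cycle for a Westmere-class Xeon core) ≈ 43.5 TFLOP/s.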

Aztec

Aztec is an M&IC resource intended for serial and on-node parallel work only; the system has no high-speed interconnect.

Aztec
Nodes
   Login nodes (aztec[1-4]): 4
   Batch nodes: 87
   Debug nodes: 3
   Total nodes: 96
CPUs (Intel Xeon EP X5660)
   Cores per node: 12
   Total cores: 1,152
CPU speed (GHz): 2.8
Theoretical system peak performance (TFLOP/s): 12.9
Memory
   Memory per node (GB): 48
   Total memory (TB): 4.6
Peak CPU memory bandwidth (GB/s): 32
Operating system: TOSS
High-speed interconnect: none
Compilers:
Parallel job type: multiple jobs per node
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
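
Because there is no interconnect, work on Aztec should stay within a single node. As a hedged sketch (the application names are placeholders, and scheduler defaults may differ on this system), a serial run and a 12-way threaded run might look like:

   srun -n 1 ./serial_app
   OMP_NUM_THREADS=12 srun -n 1 -c 12 ./threaded_app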

Cab

Cab is a large capacity* resource shared by M&IC and ASC for small to moderate parallel jobs.

Cab
Nodes
   Login nodes (cab[667-670,687-690]): 8
   Batch nodes: 1,200
   Debug nodes: 32
   Total nodes: 1,296
CPUs (Intel Xeon E5-2670)
   Cores per node: 16
   Total cores: 20,736
CPU speed (GHz): 2.6
Theoretical system peak performance (TFLOP/s): 431.3
Memory
   Memory per node (GB): 32
   Total memory (TB): 41.5
Peak CPU memory bandwidth (GB/s): 51.2
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capacity computing is accomplished through the use of smaller and less expensive high performance systems to run parallel problems with more modest computational requirements.
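
A minimal Slurm-style batch script for a multi-node run might look like the sketch below; the partition name, bank options, and application are placeholders, and the exact submission mechanics (sbatch versus a site wrapper) should be confirmed in the linked documentation.

   #!/bin/bash
   #SBATCH -N 4              # four 16-core nodes
   #SBATCH -t 00:30:00       # 30-minute time limit
   #SBATCH -p pbatch         # batch partition name assumed
   srun -N 4 -n 64 ./parallel_app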

Catalyst

Catalyst is an unclassified capacity* resource for data-intensive computing that is shared by the ASC and M&IC programs.

Catalyst **
Nodes
   Login nodes (catalyst[159,160]): 2
   Batch nodes: 304
   Total nodes: 324
CPUs (Intel Xeon E5-2695 v2)
   Cores per node: 48 (login); 24 (compute)
   Total cores: 7,776
CPU speed (GHz): 2.4
Theoretical system peak performance (TFLOP/s): 149.3
Memory
   Memory per node (GB): 128
   Total memory (TB): 41.5
Local NVRAM storage, mounted on each node as /l/ssd (GB): 800
Peak CPU memory bandwidth (GB/s): 99.4
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capacity computing is accomplished through the use of smaller and less expensive high performance systems to run parallel problems with more modest computational requirements.
** Limited access
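
The node-local NVRAM is what distinguishes Catalyst for data-intensive work: each node sees it as /l/ssd, but the contents are visible only on that node and only for the duration of the job. A hedged sketch of staging data through it (paths and the application are placeholders; substitute your own lscratch directory for $MY_LSCRATCH):

   cp $MY_LSCRATCH/input.dat /l/ssd/            # stage input onto the local SSD of the allocated node
   srun -N 1 -n 24 ./analysis_app /l/ssd/input.dat
   cp /l/ssd/results.out $MY_LSCRATCH/          # copy results back before the allocation ends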

Herd

Herd is a small M&IC resource for jobs that require large memory on a compute node.

Herd *
Nodes
   Login node (herd[2]): 1
   Batch nodes: 5
   Debug nodes: 1
   Total nodes: 9
CPUs
   herd[2] (AMD Opteron 8356): 16 cores/node
   herd[4-7] (AMD Opteron 6128): 32 cores/node
   herd[8-9] (Intel EX E7-4850): 40 cores/node
   Total cores: 256
CPU speed (GHz)
   herd[2-9]: 2.0
Theoretical system peak performance (TFLOP/s): 1.6
Memory
   Memory per node (GB)
      herd[2]: 32
      herd[4-7]: 512
      herd[8-9]: 1,024
   Total memory (TB): 4.0
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (Mellanox)
Compilers:
Parallel job type: on-node only
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Limited access
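
Because Herd's nodes differ widely in memory (32 GB on herd2 up to roughly 1 TB on herd8-9), large-memory jobs are usually steered to specific nodes. A hedged sketch using standard Slurm options (the application is a placeholder, and memory-based scheduling may or may not be enabled on this system):

   srun -N 1 -n 1 -c 40 -w herd9 ./bigmem_app    # ask for the large-memory node herd9 explicitly
   srun -N 1 --mem=500000 ./bigmem_app           # or request ~500 GB and let the scheduler choose a node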

OSLIC (Storage Lustre Interface Cluster)

OSLIC is a resource reserved for moving files between LC file systems and HPSS archival storage.

OSLIC
Nodes: 10
Cores per node: 4
Total cores: 40
CPU speed (GHz): 2.4
Network bandwidth per node: four 1-gigabit Ethernet
Memory per node (GB): 48
Operating system: TOSS
Password authentication: OTP
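
Transfers to and from HPSS are normally done with the HPSS client tools. A hedged sketch, assuming hsi and htar are installed on the OSLIC nodes (archive and directory names are placeholders):

   htar -cvf project.tar ./project_dir     # bundle a directory into a single archive in HPSS
   hsi ls -l project.tar                   # confirm the archive landed in your HPSS home directory
   htar -xvf project.tar                   # retrieve it later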

RZCereal

RZCereal is an M&IC capacity* resource that is best used for small serial jobs.

RZCereal
Nodes
   Login nodes (rzcereal[1,2]): 2
   Batch nodes: 16
   Debug nodes (rzcereal[4]): 1
   Total nodes: 21
CPUs (Intel Xeon E5530)
   Cores per node: 8
   Total cores: 168
CPU speed (GHz): 2.4
Theoretical system peak performance (TFLOP/s): 1.6
Memory
   Memory per node (GB): 24
   Total memory (GB): 504
Peak CPU memory bandwidth (GB/s): 25.6
Operating system: TOSS
High-speed interconnect: none
Compilers:
Parallel job type: multiple jobs per node
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capacity computing is accomplished through the use of smaller and less expensive high performance systems to run parallel problems with more modest computational requirements.
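
For large numbers of independent serial tasks, a job array keeps the "multiple jobs per node" usage model manageable. A hedged sketch, assuming Slurm job-array support is enabled (the application and input naming are placeholders):

   #!/bin/bash
   #SBATCH -n 1                # one core per array task
   #SBATCH --array=1-100       # 100 independent serial tasks
   srun -n 1 ./serial_app input.$SLURM_ARRAY_TASK_ID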

RZHasGPU

RZHasGPU is a small, unclassified resource for visualization and data analysis work.

RZHasGPU *
Nodes
   Login nodes (rzhasgpu[18]): 1
   Batch nodes: 4
   Debug nodes: 2
   Total nodes: 20
CPUs (Intel Xeon E5-2667 v3)
   Cores per node: 16
   Total cores: 320 (physical); 640 (with hyperthreading)
CPU speed (GHz): 3.2
GPUs (NVIDIA Tesla K80)
   GPUs per compute node: 2
   Total GPUs: 32
   GPU peak performance (TFLOP/s, double precision): 1.87
   GPU global memory (GB): 24
Theoretical system peak performance (TFLOP/s): 8.2 (CPUs); 59.8 (GPUs)
Memory
   Memory per compute node (GB): 128
   Total memory, compute nodes (TB): 2.6
Peak CPU memory bandwidth (GB/s): 68
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP, Kerberos
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   GPU Technology at LC
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Limited access.
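
GPUs are normally requested through the scheduler. A hedged sketch, assuming Slurm generic resources (GRES) are configured for the K80 boards (the application is a placeholder):

   srun -N 1 --gres=gpu:2 nvidia-smi        # confirm both GPUs on the allocated node are visible
   srun -N 1 -n 2 --gres=gpu:2 ./gpu_app    # one task per GPU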

RZMerl

RZMerl is a small capacity* resource shared by M&IC and ASC for small to moderate parallel jobs.

RZMerl
Nodes
   Login nodes (rzmerl[156]): 1
   Batch nodes: 138
   Debug nodes: 16
   Total nodes: 162
CPUs (Intel Xeon E5-2670)
   Cores per node: 16
   Total cores: 2,592
CPU speed (GHz): 2.6
Theoretical system peak performance (TFLOP/s): 53.9
Memory
   Memory per node (GB): 32
   Total memory (TB): 5.2
Peak CPU memory bandwidth (GB/s): 51.2
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capacity computing is accomplished through the use of smaller and less expensive high performance systems to run parallel problems with more modest computational requirements.
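
The debug nodes are meant for short interactive test runs. A hedged sketch, assuming the usual LC debug partition name (the application is a placeholder):

   srun -p pdebug -N 2 -n 32 -t 10 ./parallel_app    # 2 nodes x 16 cores, 10-minute limit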

RZSLIC (Storage Lustre Interface Cluster)

RZSLIC is a resource reserved for moving files between RZ LC file systems and HPSS archival storage.

RZSLIC
Nodes: 3
Cores per node: 8
Total cores: 24
CPUs: Intel Xeon E5330
CPU speed (GHz): 2.4
Network bandwidth per node: four 1-gigabit Ethernet
Memory per node (GB): 24
Operating system: TOSS
Password authentication: OTP

RZuSeq

RZuSeq is the RZ development system for Sequoia.

RZuSeq *
Nodes
   Login nodes: 10
   Debug nodes: 512
   Total nodes: 522
CPUs (IBM Power7; PPC A2)
   Cores per node: 48 (login; Power7); 16 (compute; PPC A2)
   Total cores: 8,192
CPU speed (GHz): 3.7 (login); 1.6 (compute)
Theoretical system peak performance (TFLOP/s): 100
Memory
   Memory per node (GB): 64 (login); 16 (compute)
   Total compute memory (TB): 8.2
Peak CPU memory bandwidth (GB/s): 42.6
Operating system
   Login nodes: Red Hat Enterprise Linux
   Compute nodes: Compute Node Kernel
High-speed interconnect: BlueGene torus
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Using the Sequoia BG/Q System
   /usr/local/docs/rzuseq.basics
   /usr/local/docs/lustre.basics
* Limited access

RZZeus

RZZeus is an M&IC capacity* resource that is best used for small to moderately large parallel jobs.

RZZeus
Nodes
   Login nodes (rzzeus[286,287]): 2
   Batch nodes: 239
   Debug nodes: 16
   Total nodes: 267
CPUs (Intel Xeon E5530)
   Cores per node: 8
   Total cores: 2,144
CPU speed (GHz): 2.4
Theoretical system peak performance (TFLOP/s): 20.6
Memory
   Memory per node (GB): 24
   Total memory (TB): 6.4
Peak CPU memory bandwidth (GB/s): 25.6
Operating system: TOSS
High-speed interconnect: InfiniBand DDR (Mellanox)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capacity computing is accomplished through the use of smaller and less expensive high performance systems to run parallel problems with more modest computational requirements.

Sierra

Sierra is a large M&IC capability* resource for moderate to large parallel jobs.

Sierra
Nodes
   Login nodes (sierra[0,6,324,330,648,654,972,978,1296,1302,1620,1626]): 12
   Batch nodes: 1,856
   Debug nodes: 16
   Total nodes: 1,944
CPUs (Intel Xeon EP X5660)
   Cores per node: 12
   Total cores: 23,328
CPU speed (GHz): 2.8
Theoretical system peak performance (TFLOP/s): 261.3
Memory
   Memory per node (GB): 24
   Total memory (TB): 46.7
Peak CPU memory bandwidth (GB/s): 32
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Capability computing refers to the use of the most powerful supercomputers to solve the largest and most demanding problems with the intent to minimize time to solution. A capability computer is dedicated to running one problem, or at most a few problems, at a time.

Surface

Surface is an unclassified resource for visualization and data analysis work.

Surface *
Nodes
   Login nodes (surface[86]): 1
   Batch nodes: 158
   Total nodes: 162
CPUs (Intel Xeon E5-2670)
   Cores per node: 16
   Total cores: 2,592 (physical); 5,184 (with hyperthreading)
CPU speed (GHz): 2.6
GPUs (NVIDIA Tesla K40m)
   GPUs per compute node: 2
   Total GPUs: 316
   GPU peak performance (TFLOP/s, double precision): 1.43
   GPU global memory (GB): 12
Theoretical system peak performance (TFLOP/s): 53.9 (CPUs); 451.9 (GPUs)
Memory
   Memory per node (GB): 256
   Total memory (TB): 41.5
Peak CPU memory bandwidth (GB/s): 51.2
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP, Kerberos
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   GPU Technology at LC
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics
* Limited access.
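
The quoted peaks follow from the per-device figures in the table: 316 K40m GPUs × 1.43 TFLOP/s ≈ 451.9 TFLOP/s, and 162 nodes × 16 cores × 2.6 GHz × 8 double-precision FLOPs per cycle (assuming the usual 8 FLOPs/cycle for a Sandy Bridge core with AVX) ≈ 53.9 TFLOP/s for the CPUs.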

Syrah

Syrah is a medium-sized unclassified resource, tuned for capacity* jobs in support of ASC and HPCIC projects.

Syrah **
Nodes
   Login nodes (syrah[144,256]): 2
   Batch nodes: 308
   Debug nodes: 8
   Total nodes: 324
CPUs (Intel Xeon E5-2670)
   Cores per node: 16
   Total cores: 5,056
CPU speed (GHz): 2.6
Theoretical system peak performance (TFLOP/s): 107.8
Memory
   Memory per node (GB): 64
   Total memory (TB): 20.224
Peak node memory bandwidth (GB/s): 102.4
Peak CPU memory bandwidth (GB/s): 6.4
Operating system: TOSS
High-speed interconnect: InfiniBand QDR (QLogic)
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP, Kerberos
Documentation:
   Introduction to LC Resources
   Linux Clusters Overview
   /usr/local/docs/linux.basics
   /usr/local/docs/lustre.basics

*Capacity computing is accomplished through the use of smaller and less expensive high performance systems to run parallel problems with more modest computational requirements.

**Limited access.
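
The node memory bandwidth figure is consistent with four DDR3-1600 channels per socket (assuming the standard E5-2670 memory configuration): 2 sockets × 4 channels × 12.8 GB/s ≈ 102.4 GB/s per node, which works out to about 6.4 GB/s per core on a fully subscribed 16-core node.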

Vulcan

Vulcan is designed to accommodate both capacity and capability jobs. It is co-owned by M&IC and ASC and will also be used for HPCIC projects.

Vulcan
Nodes
   Login nodes: 2
   Compute nodes: 24,576
CPUs (IBM Power7; PPC A2)
   Cores per node: 48 (login; Power7); 16 (compute; PPC A2)
   Total compute cores: 393,216
CPU speed (GHz): 3.7 (login); 1.6 (compute)
Theoretical system peak performance (TFLOP/s): 5,033
Memory
   Memory per node (GB): 64 (login); 16 (compute)
   Total compute memory (TB): 393
Peak CPU memory bandwidth (GB/s): 42.6
Operating system
   Login nodes: Red Hat Enterprise Linux
   Compute nodes: Compute Node Kernel
High-speed interconnect: BlueGene torus
Compilers:
Parallel job type: multiple nodes per job
Job limits:
Run command: srun
Recommended location for scratch file space: /p/lscratch{...}
Password authentication: OTP
Documentation:
   Sequoia User Information [authentication requires login with LC user name and OTP; select BG/Q from the Global Spaces menu]
   Introduction to LC Resources
   Using the Sequoia BG/Q System
   /usr/local/docs/rzuseq.basics
   /usr/local/docs/lustre.basics
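
The quoted peak follows from the standard BG/Q node figure: 16 cores × 1.6 GHz × 8 double-precision FLOPs per cycle (4-wide QPX fused multiply-add) = 204.8 GFLOP/s per node, and 24,576 nodes × 204.8 GFLOP/s ≈ 5,033 TFLOP/s.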



Secure Computing Facility (SCF)