CHAOS: Linux from Livermore
Parallel Resource Manager (SLURM)
A cluster resource manager (such as LoadLeveler on LC's IBM ASC machines or the Resource Management System (RMS) from Quadrics) serves a threefold primary purpose:
At LC, an adequate cluster resource manager needs to meet two general requirements:
Any LC resource manager must also meet two additional, locally important, requirements:
Finally, to fit well into the emerging CHAOS environment, a resource manager at LC should ideally have these two very beneficial extra properties as well:
No commercial (or existing open source) resource manager meets all nine of these needs. Since 2001, therefore, Livermore Computing, in collaboration with Linux NetworX and Brigham Young University, has developed and refined the "Simple Linux Utility for Resource Management" (SLURM). The summary of its requirements above gives a good profile of SLURM's role and design strategy, but it says little about how SLURM actually works.
This diagram shows SLURM's architecture (from the system point of view):
    SRUN ----|         -------------         |---- SCONTROL
    SCANCEL -|---------| SLURMCTLD |---------|
    SQUEUE --|         -------------
    SINFO ---|               |
                 ---------------------------
                 |           |             |
              SLURMD      SLURMD        SLURMD
                   (...compute nodes...)

At the center is SLURM's centralized work manager, or control daemon (SLURMCTLD), with a duplicate backup for reliability (not shown). Along the bottom are the SLURMD daemons residing on every compute node, each of which runs jobs locally as a remote shell. (On BlueGene/L, compute nodes can execute only a single process, so the SLURMD daemon runs instead on one of the BlueGene/L "front end nodes," but it fills the same role.) User tools (left side) allocate resources and start jobs (SRUN) on SLURM-managed nodes, terminate them (SCANCEL), report job status (SQUEUE), and separately report current node and partition status (SINFO). The administrative tool SCONTROL (right side) monitors and modifies configurations and job states. These SLURM parts were tested on an LC Linux system during 2002, then deployed for public use with the release of CHAOS 1.2 across all LC Linux clusters (that had a suitable switch) in the fall of 2003.
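The division of labor among these tools can be illustrated with a brief sketch. The commands below use standard SLURM syntax, but the job ID (1234) and node name (tux001) are hypothetical placeholders, and SCONTROL's "show" subcommands are typically most useful to administrators:

```shell
# User side: terminate a running or pending job by its SLURM job ID
scancel 1234

# Administrative side: display the full SLURM record for one job,
# and the configuration and state of one compute node
scontrol show job 1234
scontrol show node tux001
```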
From the user point of view, SRUN is the central SLURM tool. SRUN offers over 65 command-line options that you can combine to provide:
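A typical invocation combines just a few of those options. In this sketch, the partition name "pbatch" is an assumed example; -N and -n set the node and task counts, -t sets a time limit in minutes, and -l labels each output line with the rank of the task that produced it:

```shell
# Run 16 tasks of "hostname" across 4 nodes of the (hypothetical)
# pbatch partition, with a 30-minute time limit and labeled output
srun -N 4 -n 16 -p pbatch -t 30 -l hostname
```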
On CHAOS machines, jobs submitted to SLURM using SRUN (either as a stand-alone utility or executed within an LCRM script) can be monitored for progress and resource use with the SQUEUE reporting tool. SQUEUE thus fills the role for CHAOS and SLURM that SPJSTAT fills for AIX and LoadLeveler on IBM machines. And like SPJSTAT, SQUEUE reports jobs by means of their SLURM-assigned "local" job ID rather than their LCRM JID (even if they have one). SQUEUE also lets users request customized job-status reports, in which they can specify both the job features reported (from a list of 24) and the order in which reported jobs are sorted.
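For example, a customized SQUEUE report might select and order fields with the -o format option and sort the rows with -S. The user name "jdoe" here is a placeholder; the format codes shown (%i job ID, %u user, %j job name, %T state, %M elapsed time) are standard SQUEUE field specifiers:

```shell
# Report jdoe's jobs -- ID, user, name, state, elapsed time --
# sorted by elapsed time in descending order
squeue -u jdoe -o "%.8i %.10u %.12j %.8T %.10M" -S "-M"
```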
Likewise, on CHAOS machines, compute resources managed by SLURM can be monitored for features or availability with the SINFO reporting tool. SINFO thus fills the role for CHAOS that LLSTATUS fills for AIX on IBM machines. Like LLSTATUS, by default SINFO reports broadly on all node partitions, but you can focus on specific nodes or node sets if you wish. And like SQUEUE, SINFO offers customization options to change not only the node properties reported but also the order or format of columns shown in SINFO output.
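A comparable customized SINFO report might look like the following sketch, which uses SINFO's standard format codes (%N node list, %P partition, %t state, %c CPUs per node, %m memory per node in megabytes):

```shell
# One line per node set: node list, partition, state, CPU count, memory
sinfo -N -o "%.15N %.10P %.8t %.5c %.8m"
```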
On BlueGene/L only, an additional SLURM tool called SMAP shows the topological distribution of jobs among nodes (because job geometry is important on that machine's unusual architecture).
More details on SLURM, including how its subsystems interact with each other, how users interact with SLURM, the many specialized job-control features offered by the SRUN tool, and the customization possibilities for SQUEUE, SINFO, and SMAP output, appear in the SLURM Reference Manual. In 2006, LC began replacing LoadLeveler with SLURM for resource management even on its AIX machines. For an AIX/CHAOS and LoadLeveler/SLURM cross-comparison matrix, see the "SLURM and Operating Systems" section of the SLURM Reference Manual.