
UCRL-WEB-201386

SLURM Reference Manual


SRUN Roles and Modes

SRUN executes tasks ("jobs") in parallel across multiple compute nodes on machines where SLURM manages the resources. SRUN options let you both:

  • Specify the parallel environment for your job(s), such as the number of nodes used, node partition, distribution of processes among nodes, and total time, and also
  • Control the behavior of your parallel job as it runs, such as by redirecting or labeling its output, sending it signals, or specifying its reporting verbosity.

Because it performs several different roles, SRUN can be used in five distinct ways or "modes":

  • SIMPLE.
    The simplest way to use SRUN is to distribute execution of a serial program (such as a UNIX utility) across a specified number or range of compute nodes. For example,
    srun -N 8 cp ~/data1 /var/tmp/data1
    copies (CP) file data1 from your common home directory into local disk space on each of eight compute nodes. This is very like running simple programs in parallel under AIX by using IBM's POE command (except that SRUN lets you set relevant environment variables on its own execute line, unlike POE). In simple mode, SRUN submits your job to the local SLURM job controller, initiates all processes on the specified nodes, and, if necessary, blocks until the needed resources are free to run the job. Many control options can change the details of this general pattern.
  • BATCH (WITHOUT LCRM).
    SRUN can also directly submit complex scripts to the Trivial Batch System (TBS) job queue(s) managed by SLURM, for later execution when the needed resources become available and no higher-priority jobs are pending. For example,
    srun -N 16 -b myscript.sh
    uses SRUN's -b option to place myscript.sh into the TBS queue to run later on 16 nodes. Such scripts normally contain either MPI programs or simple invocations of SRUN itself, as shown above (a sample script appears after this list). SRUN's -b option thus supports basic, local batch service even on machines where LC's metabatch system LCRM has not yet been installed (see below). On BlueGene/L only, scripts must invoke MPIRUN instead of simple SRUN to start tasks.
  • ALLOCATE.
    To combine the job complexity of scripts with the immediacy of interactive execution, you can use SRUN's "allocate" mode. For example,
    srun -A -N 4 myscript.sh
    uses SRUN's (uppercase) -A option to allocate the specified resources (here, four nodes), spawn a subshell with access to those resources, and then run multiple jobs using simple SRUN commands within the specified script (here, myscript.sh), which the subshell immediately starts to execute (see the sample script after this list). This is very like allocating resources by setting AIX environment variables at the beginning of a script and then using them for scripted tasks. No job queues are involved.
  • ATTACH.
    You can monitor or intervene in an already running SRUN job, either batch (started with -b) or interactive ("allocated," started with -A), by executing SRUN again and "attaching" (-a, lowercase) to that job. For example,
    srun -a 6543 -j
    forwards the standard output and error messages from the running job with SLURM ID 6543 to the attaching SRUN to reveal the job's current status, and (with -j, lowercase) also "joins" the job so that you can send it signals as if this SRUN had initiated the job. Omit -j for read-only attachments. Because you are attaching to a running job whose resources have already been allocated, SRUN's resource-allocation options (such as -N) are incompatible with -a.
  • BATCH (WITH LCRM).
    On machines where LC's metabatch job-control and accounting system LCRM/DPCS is installed, you can submit (with the LCRM utility PSUB) a script containing simple SRUN commands, which LCRM executes later, after applying the usual fair-share scheduling process to your job and its competitors. Here LCRM takes the place of SRUN's -b option for indirect, across-machine job-queue management (a sketch of such a PSUB script also appears after this list).
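
Here is a sample script of the kind used with both the -b (batch) and -A (allocate) modes above. It is a minimal sketch only: the application name ./a.out and the output file results.out are assumed, and the SRUN commands inside run on whatever nodes SLURM has already given to the script.

     #!/bin/sh
     # myscript.sh -- runs under resources that SLURM has already allocated.
     # ./a.out and results.out below are illustrative names only.
     srun /bin/hostname              # print the node name for each task in the allocation
     srun ./a.out > results.out      # launch the application across those nodes

You could place this script in the TBS queue with "srun -N 16 -b myscript.sh" or run it at once with "srun -A -N 4 myscript.sh," exactly as in the examples above.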
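
For the LCRM case, the same kind of script is instead handed to PSUB. The sketch below is only an illustration: the embedded #PSUB directives shown (node count and time limit) are assumptions typical of LCRM scripts, so consult the LCRM documentation for the exact options your site expects; ./a.out is again a placeholder name.

     #!/bin/sh
     #PSUB -ln 16          # assumed directive: request 16 nodes
     #PSUB -tM 30m         # assumed directive: 30-minute time limit
     # Simple SRUN commands below run on the nodes that LCRM eventually assigns.
     srun ./a.out

You would then submit the script to LCRM with

     psub myscript.sh

rather than with SRUN's -b option.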

SRUN SIGNAL HANDLING.
Signals sent to SRUN are automatically forwarded to the tasks that SRUN controls, with a few special cases. SRUN handles CTRL-C (keyboard interrupt) differently depending on how many it receives within one second:

     CTRL-Cs within one second
     -------------------------
     First    reports the state of all tasks
              associated with SRUN.
     Second   sends SIGINT signal to all
              associated SRUN tasks.
     Third    terminates the job at once,
              without waiting for remote tasks
              to exit.
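
The same forwarding applies to signals delivered to the SRUN process from another shell or script, not just those typed at the terminal. A minimal sketch, assuming a hypothetical application ./a.out:

     srun -n 16 ./a.out &      # start a 16-task job; SRUN itself runs in the background
     kill -USR1 $!             # SIGUSR1 sent to that SRUN is forwarded to all 16 tasks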

MPI SUPPORT.
On computer clusters with a Quadrics interconnect among the nodes (such as Lilac on SCF, or Thunder and ALC on OCF), SRUN directly supports the Quadrics version of MPI without modification. Applications built using the Quadrics MPI library communicate over the Quadrics interconnect without any special SRUN options.

You may also use MPICH on any computer where it is available. MPIRUN will, however, need information on its command line identifying the resources to use, namely

     -np SLURM_NPROCS -machinefile filename


where SLURM_NPROCS is the environment variable that contains the number of processes to use (SRUN's -n value) and filename lists the names of the nodes on which to execute (the captured output from /bin/hostname run across those nodes with simple SRUN, as shown below). Sometimes the MPICH vendor configures these options automatically. See also SRUN's --mpi "working features" option.
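
For example, from within a SLURM allocation you could build the machine file with simple SRUN and then start an MPICH program with MPIRUN. This is only a sketch; the file name nodes.txt and the application ./a.out are assumed:

     # Inside an allocation obtained with, e.g., srun -A -n 16 ...
     srun /bin/hostname > nodes.txt                            # capture one node name per task
     mpirun -np $SLURM_NPROCS -machinefile nodes.txt ./a.out   # run the MPICH application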


