
UCRL-WEB-201386

SLURM Reference Manual


SLURMD

The SLURMD daemon runs on every compute node of every cluster that SLURM manages, and it performs the lowest-level work of resource management. Like SLURMCTLD (above), SLURMD is multi-threaded for efficiency; unlike SLURMCTLD, it runs with root privilege so that it can initiate jobs on behalf of other users.

SLURMD carries out five key tasks and has five corresponding subsystems:

Machine Status
responds to SLURMCTLD requests for machine state information and sends asynchronous reports of state changes to help with queue control.
Job Status
responds to SLURMCTLD requests for job state information and sends asynchronous reports of state changes to help with queue control.
Remote Execution
starts, monitors, and cleans up after a set of processes (usually the tasks of one parallel job), as decided by SLURMCTLD (or by direct user intervention). Doing so often requires changing process limits, environment variables, the working directory, and the user ID before the processes run.
Stream Copy Service
handles all STDERR, STDIN, and STDOUT for remote tasks. This may involve redirection, and it always involves locally buffering job output to avoid blocking local tasks.
Job Control
propagates signals and job-termination requests to any SLURM-managed processes (often interacting with the Remote Execution subsystem).
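
The Remote Execution, Stream Copy Service, and Job Control duties described above can be illustrated with a minimal Python sketch. This is not SLURMD's implementation or any SLURM API: the function names (launch_task, signal_task, reap_task) and the TASK_ID variable are hypothetical, chosen only for illustration. The sketch assumes a POSIX system and leaves out user-id and process-limit changes, which would require root privilege.

```python
import os
import signal
import subprocess
import sys
import tempfile


def launch_task(argv, env_overrides, workdir):
    """Start one task with its own environment and working directory.

    Hypothetical helper: stands in for the kind of setup a node daemon
    performs before running a user's process.
    """
    env = dict(os.environ, **env_overrides)
    return subprocess.Popen(
        argv,
        env=env,
        cwd=workdir,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,   # parent buffers the task's output locally
        stderr=subprocess.PIPE,
        start_new_session=True,   # task leads its own process group
    )


def signal_task(proc, signum):
    """Forward a signal to the task's whole process group."""
    os.killpg(proc.pid, signum)


def reap_task(proc):
    """Collect buffered output and the exit status, then clean up."""
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    task = launch_task(
        [sys.executable, "-c",
         "import os; print(os.environ['TASK_ID'], os.getcwd())"],
        {"TASK_ID": "0"},
        workdir,
    )
    code, out, err = reap_task(task)
    print(code, out.decode().strip())
```

Placing each task in its own process group is one common way to make sure a forwarded signal also reaches any children the task has spawned. A real daemon would additionally drain the output pipes continuously so that a task producing a lot of output never blocks.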


