Limits on job size and run duration are imposed on interactive and batch jobs. See OCF Machine-specific News and Information for interactive and batch job limits for OCF (CZ and RZ) machines. These limits are viewable per machine by invoking "news job.lim.<machine>".
Interactive-use machines include the nodes that are configured to accept user logins and the nodes that are configured into an interactive use pool, typically called pdebug. Login nodes may be used to edit files, compile/link codes, interact with the batch system (submit, query, etc.), run single-node applications, and launch parallel jobs that should run in the pdebug pool. In most cases, users do not pick a login node; it is automatically assigned through an aliasing mechanism that distributes users across all available login nodes.
The pdebug pool (when present) is configured for quick prototyping and debugging of user codes, not production runs. The number, identity, and usage policies of pdebug nodes usually differ between systems and can (and do) change.
You can query the system and follow your job status using the following commands. See the man page for each command for details.
|Interactive Job Commands|
|ps||All||Show current status of processes on node.|
|top||All||Display and update information about the top CPU processes on current node.|
|ju||TOSS||Report node availability and usage.|
|mjstat||All||List attributes of jobs under control of Moab.|
|sinfo||All||Display partition and node information for a system running SLURM.|
|squeue||All||Display information of jobs located in the scheduling queue.|
The Moab Workload Manager is the batch scheduling system at LC. Moab jobs can be run without terminal access (default), with terminal access via run/proxy, or using the Moab mxterm utility on LC systems. (The run and proxy utilities connect to the standard input, standard output, and standard error channels of jobs running in batch or elsewhere. The job must be started with run; proxy can then be used from an interactive session to attach to that job and handle its messages. See the man pages for run and proxy for more information.)
User documentation for Moab is available from the Computing Resource Management information page. The Moab Quick Start Guide explains the basic features of Moab. The Moab Tutorial focuses on Moab usage within LC's HPC environment.
The following list includes some of the commonly used Moab commands. Most commands have a man page (many commands have a number of options); the command usage is also described in detail in the information sources mentioned in the preceding paragraphs.
|Commonly Used Moab Commands|
|msub||Submit a job control script to batch system. (See Note)|
|mjstat||Display statistics on running jobs.|
|checkjob||Query job status.|
|showq||Display the job queue.|
|mjobctl -h user||Place a queued job on hold.|
|mjobctl -u||Release a previously held job.|
|canceljob||Remove a queued or running job.|
|mjobctl -m||Change a queued job.|
|mshare||Display Moab account share allocations, usage statistics, and priorities. Lists valid accounts for current machine.|
|mdiag -u||Display Moab account access permissions.|
|mdiag -f||Display Moab account quota statistics.|
|Display your default Moab account.|
|Note: The msub command has a number of options that can be used in either your job script or from the command line. See the MSUB Options section below for details.|
The Simple Linux Utility for Resource Management (SLURM) is a customized replacement for RMS or NQS in allocating compute resources (mostly nodes) to queued jobs on machines running the CHAOS operating system. The five SLURM user commands for querying and controlling jobs managed by SLURM are listed below.
|srun||Submit jobs to run under SLURM management.|
|squeue||Display the queue of running and waiting jobs (or "job steps"), including the JobId (used for scancel), and the nodes assigned to each running job.|
|sinfo||Display a summary of available partition and node (not job) information (such as partition names, nodes/partition, and CPUs/node).|
|scancel||Cancel a running or waiting job, or send a specified signal to all processes on all nodes associated with a job (only job owners or their administrators can cancel their jobs).|
|scontrol||Manage available nodes, e.g., by "draining" jobs from a node or partition to prepare it for servicing. (Privileged users only.)|
The msub command submits your job script to Moab. Upon successful submission, Moab returns the job's ID and spools the job for execution. For example:
% msub myjobscript
% msub -q pdebug myjobscript
The msub command has a number of options that can be used either in your job script or on the command line. Some of the more common and useful options are shown below; see the Computing Resource Management information page for additional documentation.
|#MSUB -a||-a||Declares the time after which the job is eligible for execution.
Syntax: (brackets delimit optional items with the default being current date/time):
|#MSUB -A account||-A account||Defines the account associated with the job.|
|#MSUB -d path||-d path||Specifies the directory in which the job should begin executing.|
|#MSUB -e filename||-e filename||Defines the file name to be used for stderr.|
|#MSUB -h||-h||Put a user hold on the job at submission time.|
|#MSUB -j oe||-j oe||Combine stdout and stderr into the same output file. This is the default. If you want to give the combined stdout/stderr file a specific name, include the -o path flag also.|
|#MSUB -l string||-l string||Defines the resources that are required by the job. See the discussion below for this important flag.|
|#MSUB -m option(s)||-m option(s)||Defines the set of conditions (a=abort,b=begin,e=end) when the server will send a mail message about the job to the user.|
|#MSUB -N name||-N name||Gives a user specified name to the job. Note that job names do not appear in all Moab job info displays, and do not determine how your job's stdout/stderr files are named.|
|#MSUB -o filename||-o filename||Defines the file name to be used for stdout.|
|#MSUB -p priority||-p priority||Assigns a user priority value to a job. For more information, see the Setting User Job Priority section of the Moab tutorial.|
|#MSUB -q queue or #MSUB -q queue@host||-q queue||Run the job in the specified queue (pdebug, pbatch, etc.). A host may also be specified if it is not the local host.|
|#MSUB -r y||-r y||Automatically rerun the job if there is a system failure. The default behavior at LC is to NOT automatically rerun a job in such cases.|
|#MSUB -S path||-S path||Specifies the shell which interprets the job script. The default is your login shell.|
|#MSUB -v list||-v list||Specifically adds a list (comma separated) of environment variables that are exported to the job.|
|#MSUB -V||-V||Declares that all environment variables in the msub environment are exported to the batch job.|
|#MSUB -W||-W||This option has been deprecated and should be ignored.|
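As a rough illustration of how several of the options above fit together, the following sketch writes a small job script and checks its shell syntax. The job name, account (myacct), node count, time limit, output file, and application (./my_app) are all hypothetical placeholders, not LC defaults; the -l resource strings are described in the MSUB -l Options table. Because stdout and stderr are combined by default, the -o line names the combined output file.

```shell
# Write a minimal job script; every value below is a placeholder.
cat > myjobscript <<'EOF'
#!/bin/sh
#MSUB -N example_job
#MSUB -A myacct
#MSUB -q pbatch
#MSUB -l nodes=4
#MSUB -l walltime=1:00:00
#MSUB -o example_job.out
#MSUB -m e

# Launch the (hypothetical) application on the allocated nodes.
srun -N 4 ./my_app
EOF

# Parse the script without running it, to catch shell syntax errors.
sh -n myjobscript

# Submit it (on an LC machine):
#   msub myjobscript
```

The same options could instead be given on the msub command line, without the #MSUB prefix.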
|MSUB -l Options|
|-l depend=jobid||Dependency upon completion of another job. jobid is the Moab jobid for the job that must complete first. For more information, see the Setting Up Dependent Jobs section of the Moab tutorial.|
|Requirement for a specific node feature. Use the mdiag -t command to see what features are available on a node.|
-l gres=filesystem, filesystem
|Job requires the specified parallel Lustre file system(s). Valid labels are:
where ... is a lowercase letter (OCF systems) or a number (SCF systems).
Not available on AIX (Purple) systems. The purpose of this option is to prevent jobs from being scheduled if the specified file system is unavailable. The default is to require all mounted lscratch file systems. The ignore descriptor can be used for jobs that don't require a parallel file system, enabling them to be scheduled even if there are parallel file system problems. More information is available in File System-Aware Scheduling with Moab.
|-l nodes=256||Number of nodes|
|Try to keep job running if a node fails
Requeue the job automatically if it fails
|Run job on a specific cluster
Run a job on either cluster
Run a job on either cluster
Run a job on any cluster
|-l qos=standby||Quality of service (standby, expedite)|
|Signaling specifies the pre-termination signal to be sent to a job at the desired time before expiration of the job's wall clock limit. Default time is 60 seconds.|
|-l ttc=8||Stands for "total task count." Used to request multiple cores on an Aztec, Inca, or RZcereal node.|
|Wall-clock time. Default units are seconds. HH:MM:SS format is also accepted.|
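Because the wall-clock limit can be given either in seconds or as HH:MM:SS, it can help to convert one form to the other when comparing limits. A small helper sketch follows; hms_to_seconds is a hypothetical name, not an LC or Moab utility, and it assumes the full three-field HH:MM:SS form.

```shell
# Convert an HH:MM:SS wall-clock string to seconds.
# awk reads each field as a decimal number, so "08" means 8.
hms_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

hms_to_seconds 2:00:00    # prints 7200
```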
If more than one resource needs to be specified, the best thing to do is to use a separate #MSUB -l line for each resource. For example:
#MSUB -l nodes=64
#MSUB -l qos=standby
#MSUB -l walltime=2:00:00
#MSUB -l partition=atlas
Alternatively, you can include all resources on a single #MSUB -l line, separated by commas and with NO white space. For example:
#MSUB -l nodes=64,qos=standby,walltime=2:00:00,partition=cab
WARNING: Avoid using white space with -l option specifications. A white space after any comma will cause the rest of the line to be ignored without an error message. On the other hand, a white space on either side of the equal sign will elicit a job rejection with an error message.
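Since white space after a comma is silently accepted and the remaining resources are dropped, a quick check before submission can catch the mistake. The sketch below is one possible guard; check_resource_list is a hypothetical helper, not part of Moab.

```shell
# Return 0 if the -l resource list contains no white space, 1 otherwise.
check_resource_list() {
    case $1 in
        *[[:space:]]*)
            echo "white space found in resource list: '$1'" >&2
            return 1
            ;;
    esac
    return 0
}

# A well-formed list passes the check.
check_resource_list "nodes=64,qos=standby,walltime=2:00:00" && echo "ok"
```

A list such as "nodes=64, qos=standby" would be rejected by this check, whereas msub itself would accept it and ignore everything after the space.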
Currently, Moab does not support a way to pass arguments to your job.
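One possible workaround is to hand settings to the job through environment variables, using the -v option described above. This sketch assumes that -v accepts a plain variable name taken from the submitting environment; INPUT_DECK, default.in, and run42.in are hypothetical names.

```shell
# In the job script: read the setting from the environment, with a fallback.
input_deck() {
    printf '%s\n' "${INPUT_DECK:-default.in}"
}
echo "Using input deck: $(input_deck)"

# At submission time (on an LC machine), export the variable and name it with -v:
#   export INPUT_DECK=run42.in
#   msub -v INPUT_DECK myjobscript
```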
Your path should be imported by Moab. However, if you seem to be having a problem finding executables within your job script, try using the -V (uppercase V) msub option.
LC manages and tracks your use of computer resources when your job is scheduled by Moab. Moab accounts are established for every project and assigned a target share of the machine. Every job a user submits must specify a Moab account or a default will be assigned. The computing resource usage (typically processor minutes) will be charged to the associated account. The fair-share scheduling scheme is used for batch jobs on all LC production hosts. See Understanding Moab Job Priorities for more information.
Visualization resources include classified and unclassified "big data" visualization servers, assessment theaters, video production, graphics software, graphics consulting, and graphics applications development.
|Surface||Intel Xeon E5-2670||162/2,592||256 GB||316 (2 GPUs/node) Tesla K40m||FDR InfiniBand|
|Max||Intel Xeon E5-2670||324/5,184||256 GB||40 (2 GPUs/node) Kepler K20X with 6 GB RAM||QDR InfiniBand|
Accounts for the visualization servers may be requested through IdM.
|Bldg. 111||Classified||pw111 (Gremlin)||3x2 tiled PowerWall|
|Bldg. 451, Room 1025||Unclassified||pw451 (RZThriller)||3x2 tiled PowerWall|
|Bldg. 453, Room 1000||Unclassified||pw453 (RZThriller)||4x2 tiled PowerWall|
|Bldg. 453, Room 1205||Classified||pw453c (Gremlin)||3x2 tiled PowerWall|
Users of visualization clusters can run jobs that require X11/OpenGL by starting them under the control of xinit. For example, the following SLURM command invokes xhost on four nodes by starting an X11 server on each node, running xhost with $DISPLAY set to the local X11 server, and then shutting down the X11 server when xhost has finished:
srun -N 4 /usr/bin/X11/xinit /usr/bin/X11/xhost -- /usr/X11R6/bin/X
In normal usage, the xhost command would be replaced with whatever application the user wanted to run.
The most important goal of performance tuning and optimization is to reduce a program's wall-clock execution time. Reducing resource usage in other areas, such as memory or disk requirements, may also be a tuning goal. Performance analysis tools are essential to optimizing an application's performance. The Supported Software and Computing Tools listing identifies which tools are best for each type of tuning/optimization and categorizes them with a brief description of each software tool's purpose.
Shells are used as UNIX command interpreters. Each shell has its own language that can be used interactively (at the UNIX prompt) or from within a script, which is a file containing shell commands.
Users have an initial login shell that is determined by their entry in the file /etc/passwd. The entry for your default shell in the /etc/passwd file can be changed by the system administrator or by the LC Hotline staff.
For more information, see the list of most commonly used shells. On LC machines, the list of path names of valid shells can be found in the /etc/shells text file. There is also a "cheat sheet" for useful shell commands and shortcuts [PDF].
Commands are typically run at the shell level, but they can also be put in a file known as a shell script and then executed as you would a command or program. In addition to the traditional shell scripts, scripts can be written in scripting languages such as Perl, Python, Java, Tcl/Tk, PHP, and awk.
For all LC environments, a configuration utility called Dotkit provides a convenient, uniform way to select among multiple versions of software installed on LC systems. It has a simple user interface that is standard across UNIX shells. See the detailed Dotkit Web pages for more information about how Dotkit works, its use, Dotkit packages, and commands, functions, and special variables.
The High Performance Storage System (HPSS) is an archival storage system available on both OCF and SCF. Users are strongly urged to store vital files in archival storage because online files can be lost during a machine crash, not all directories are backed up, and files on some machines are purged. If you have an account in either the open (OCF) or closed (SCF) environment, you also have a storage account. To connect to storage, type ftp storage.llnl.gov (OCF or SCF).
EZSTORAGE (basic file storage guide) explains how to transfer files between machines where you work (mostly LC production machines) and HPSS. Especially see Storage Summarized, which briefly summarizes the chief storage-system constraints, tells how to perform the most important file-storage tasks at LC, and compares the FTP, NFT, and HSI commands used for common tasks. An additional dual-copy class of service (COS) for large mission-critical files is available so that a file can be copied to two different tapes. The HPSS User Guide provides detailed information for transferring files to or from LLNL's installation of HPSS.
The Backup Policy Summary summarizes the backup status for each major file system on the LC machines. Files are also backed up in a special hidden directory called .snapshot that resides in your home directory. (There are also .snapshot online backup directories for several other NFS-provided file systems, including /usr/gapps, /usr/local, and user group-owned file systems.) If you accidentally delete a file, you may be able to retrieve an earlier copy from the corresponding .snapshot directory.
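A minimal sketch of looking for a lost file, assuming each entry under .snapshot is a directory holding an earlier copy of the file system. The names of those entries vary and are not assumed here; find_in_snapshots and the example paths are hypothetical.

```shell
# List every snapshot copy of a file, given the .snapshot directory and the
# file's path relative to the directory that is being snapshotted.
find_in_snapshots() {
    snapshot_root=$1
    relpath=$2
    for snap in "$snapshot_root"/*/; do
        if [ -e "$snap$relpath" ]; then
            printf '%s\n' "$snap$relpath"
        fi
    done
}

# Example (on an LC machine): list copies of ~/project/input.dat, then copy one back.
#   find_in_snapshots "$HOME/.snapshot" project/input.dat
#   cp -p <one of the listed paths> "$HOME/project/input.dat"
```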
Purge policies are subject to change and, when revised, are announced in MOTD, news postings, and status e-mails. Once files are purged, there is no possibility of recovering them. See the File Purge Policy for the Lustre and temporary file systems.