Table 6-1: SLURM Commands (cont.)

Command     Function

sinfo       Reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options. sinfo displays a summary of available partition and node (not job) information (such as partition names, nodes/partition, and CPUs/node).

scontrol    Is an administrative tool used to view or modify the SLURM state. Typically, users do not need to access this command. Therefore, the scontrol command can only be executed as user root. Refer to the HP XC System Software Administration Guide for information about using this command.

The --help command option also provides a brief summary of SLURM options. Note that command options are case sensitive; for example, the srun options -n and -N have different meanings.
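As a minimal sketch of using the help option, the function below prints the first few lines of the help summary for each command it is given. The name show_help_summary and the five-line cutoff are illustrative choices for this sketch, not part of SLURM; the only assumption about SLURM is that each command accepts --help.

```shell
# show_help_summary is a hypothetical helper, not a SLURM command.
# For each command name given, print the start of its --help text,
# or a notice when the command is not installed on this system.
show_help_summary() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "== $cmd =="
            "$cmd" --help 2>&1 | head -n 5
        else
            echo "$cmd: not found in PATH"
        fi
    done
}

show_help_summary sinfo srun
```

On a system where the SLURM commands are not in your PATH, the function reports that instead of failing.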

6.3 Accessing the SLURM Manpages

You can also view online descriptions of these commands by accessing the SLURM manpages. Manpages are provided for all SLURM commands and API functions. If the SLURM manpage directory is not already in your MANPATH environment variable, add it and export the variable as follows:

$ MANPATH=$MANPATH:/opt/hptc/man
$ export MANPATH

You can now access the SLURM manpages with the standard man command. For example:

$ man srun
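If the MANPATH assignment is placed in a login script, it may run more than once and append the same directory repeatedly. The following is a minimal sketch of a guarded version that appends the directory only when it is not already listed. The helper name add_manpath is hypothetical; the directory is the one given above.

```shell
# Directory named in the text above.
SLURM_MAN_DIR=/opt/hptc/man

# add_manpath is a hypothetical helper name used only for this sketch.
# It appends a directory to MANPATH unless it is already listed there.
add_manpath() {
    case ":${MANPATH:-}:" in
        *":$1:"*) ;;                        # already listed: leave MANPATH alone
        *) MANPATH="${MANPATH:-}:$1" ;;     # same append form as the manual command
    esac
    export MANPATH
}

add_manpath "$SLURM_MAN_DIR"
```

Calling add_manpath a second time with the same directory leaves MANPATH unchanged.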

6.4 Launching Jobs with the srun Command

The srun command submits jobs to run under SLURM management. Jobs can be submitted to run in parallel on multiple compute nodes. srun is used to submit a job for execution, allocate resources, attach to an existing allocation, or initiate job steps. srun can perform the following:

Submit a batch job and then terminate

Submit an interactive job and then persist to shepherd the job as it runs

Allocate resources to a shell and then spawn that shell for use in running subordinate jobs

Jobs can be submitted for immediate execution or for later execution (batch). srun has a wide variety of options to specify resource requirements, including minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (such as a minimum amount of memory or disk space, or certain required features). Besides securing a resource allocation, srun is used to initiate job steps. These job steps can execute sequentially or in parallel on independent or shared nodes within the job’s node allocation.
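The resource options described above might be combined as in the following sketch. It uses the srun options -N (node count), -n (task count), -w (nodes to include), -x (nodes to exclude), and -l (label output). The node names n1 and n2 are assumptions for illustration, and show_or_run is a hypothetical helper that only prints each command line unless RUN=1 is set and srun is installed. Check the srun manpage on your system before relying on any option.

```shell
# show_or_run is a hypothetical helper for this sketch, not part of SLURM.
# It prints the command line, and runs it only when RUN=1 and srun exists.
show_or_run() {
    printf '%s\n' "\$ $*"
    if [ "${RUN:-0}" = "1" ] && command -v srun >/dev/null 2>&1; then
        "$@"
    fi
}

show_or_run srun -N 2 -n 4 -l hostname     # 4 tasks on at least 2 nodes
show_or_run srun -N 1-2 -n 2 -l hostname   # between 1 and 2 nodes
show_or_run srun -n 2 -w n1 -l hostname    # allocation must include node n1 (assumed name)
show_or_run srun -n 2 -x n2 -l hostname    # allocation must not use node n2 (assumed name)
```

With RUN unset, the sketch is a dry run: it lists the command lines without submitting anything.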

Example 6-1: Simple Launch of a Serial Program

$ srun -n2 -l hostname

0:n1

1:n1
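In Example 6-1, -n2 requests two tasks and -l prefixes each output line with the number of the task that produced it, so both tasks ran on node n1. As a minimal sketch, labeled output in this format can be tallied per node. The helper name tasks_per_node is hypothetical, and the sample input is copied from the example so that the sketch runs without SLURM.

```shell
# tasks_per_node is a hypothetical helper: it reads "task:host" lines on
# standard input and prints "host count" pairs, sorted by host name.
tasks_per_node() {
    awk -F: '{ gsub(/^[ \t]+/, "", $2); count[$2]++ }
             END { for (h in count) print h, count[h] }' | sort
}

# Output copied from Example 6-1. On a live system, pipe the output of
# "srun -n2 -l hostname" into the function instead.
tasks_per_node <<'EOF'
0:n1
1:n1
EOF
```

For the sample input, the function prints n1 2, meaning two tasks ran on node n1.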

6.4.1 The srun Roles and Modes

The srun command can perform many roles in launching and managing your job, and it provides several distinct usage modes to accommodate those roles.
