6.4.8 srun Environment Variables

Many srun options have corresponding environment variables. Each such variable holds the default value for its job feature, if a default exists. An srun option, when used on the command line, always overrides (resets) the corresponding environment variable.
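The precedence rule can be modeled in a few lines of shell. This is only an illustrative sketch: the function name resolve_setting and its two arguments are hypothetical, and it is not how srun itself is implemented.

```shell
# Hypothetical model of the precedence rule; this is not srun's code.
# $1 = value given with the command-line option (empty if not used)
# $2 = value of the corresponding environment variable (may be empty)
resolve_setting() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"    # an invoked option always wins
    else
        printf '%s\n' "$2"    # otherwise the environment variable applies
    fi
}
```

Called as resolve_setting 8 4, the sketch prints the option value; called with an empty first argument, it falls back to the environment value.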

In addition, srun sets the following environment variables for each executing task on the remote compute nodes:

SLURM_JOBID

Specifies the job ID of the executing job.

SLURM_NODEID

Specifies the relative node ID of the current node.

SLURM_NODELIST

Specifies the list of nodes on which the job is actually running.

SLURM_NPROCS

Specifies the total number of processes in the job.

SLURM_PROCID

Specifies the MPI rank (or relative process ID) for the current process.
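As a rough illustration of how a task might read the variables listed above, the following sketch prints one summary line per task. The function name describe_task and the "unset" fallback are assumptions made for this example, so that the script also runs outside of srun.

```shell
#!/bin/sh
# Hypothetical helper: summarize the current task using the variables
# that srun sets. Prints "unset" for any variable that is missing,
# for example when the script is run outside of srun.
describe_task() {
    printf 'job=%s node=%s rank=%s of %s nodes=%s\n' \
        "${SLURM_JOBID:-unset}" \
        "${SLURM_NODEID:-unset}" \
        "${SLURM_PROCID:-unset}" \
        "${SLURM_NPROCS:-unset}" \
        "${SLURM_NODELIST:-unset}"
}

describe_task
```

If this were saved as an executable script and launched through srun, each task would be expected to print its own rank and node ID.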

Other environment variables important for srun-managed jobs include:

MAX_TASKS_PER_NODE

Provides an upper bound on the number of tasks that srun assigns to each job node, even if you allow more than one process per CPU by invoking the srun -O option.

SLURM_NNODES

Is the actual number of nodes assigned to run your job (which may exceed the number of nodes that you explicitly requested with the srun -N option).
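Taken together, these two variables imply a ceiling on the total number of tasks in a job: the node count multiplied by the per-node bound. The sketch below computes that product. It assumes both variables are visible in the environment and hold integers; the function name max_total_tasks is hypothetical.

```shell
# Hypothetical helper: upper bound on the task count for this job,
# computed as (nodes assigned) x (maximum tasks per node).
# Assumes SLURM_NNODES and MAX_TASKS_PER_NODE are both set to integers.
max_total_tasks() {
    if [ -z "${SLURM_NNODES:-}" ] || [ -z "${MAX_TASKS_PER_NODE:-}" ]; then
        echo "SLURM_NNODES or MAX_TASKS_PER_NODE is not set" >&2
        return 1
    fi
    echo $(( ${SLURM_NNODES} * ${MAX_TASKS_PER_NODE} ))
}
```

The function returns a nonzero status instead of guessing when either variable is missing.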

6.4.9 Using srun with HP-MPI

The srun command can be used as an option in an HP-MPI launch command. Refer to Section 8.3.3 for information about using srun with HP-MPI.

6.4.10 Using srun with LSF

The srun command can be used in an LSF launch command. Refer to Chapter 7 for information about using srun with LSF.

6.5 Monitoring Jobs with the squeue Command

The squeue command displays the queue of running and waiting jobs (or "job steps"), including the JobID used for scancel and the nodes assigned to each running job. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.

Example 6-2 reports on job 12345 and job 12346:

Example 6-2: Displaying Queued Jobs by Their JobIDs

$ squeue --jobs 12345,12346
JOBID PARTITION NAME USER ST TIME_USED NODES NODELIST
12345 debug     job1 jody R  0:21      4     n[9-12]
12346 debug     job2 jody PD 0:00      8
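Output in this layout is easy to post-process in a script. The following sketch extracts the state (ST) of one job from squeue-style text on standard input. The function name job_state is hypothetical, and the code assumes the column order shown in Example 6-2, with JOBID first and ST fifth.

```shell
# Hypothetical helper: read squeue-style output on standard input and
# print the ST (state) column for the job ID given as $1.
# Assumes the column layout of Example 6-2 (JOBID first, ST fifth).
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}
```

A possible use is squeue --jobs 12345,12346 | job_state 12346, which for the output in Example 6-2 would report that job as pending (PD).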

 

 

 

 

 

 

 
