Example 9-2 Displaying Queued Jobs by Their JobIDs

$ squeue --jobs 12345,12346
JOBID  PARTITION  NAME  USER  ST  TIME_USED  NODES  NODELIST(REASON)
12345  debug      job1  jody  R   0:21       4      n[9-12]
12346  debug      job2  jody  PD  0:00       8

 

The squeue command can filter the jobs it reports by state. The possible states are pending, running, completing, completed, failed, timeout, and node_fail. Example 9-3 uses the squeue command to report on failed jobs.

Example 9-3 Reporting on Failed Jobs in the Queue

$ squeue --state=FAILED
JOBID  PARTITION  NAME      USER  ST  TIME  NODES  NODELIST(REASON)
59     amt1       hostname  root  F   0:00  0
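
When many jobs are queued, a per-state tally can be easier to read than the full listing. The following is a minimal sketch, not part of SLURM: the function name count_job_states is hypothetical, and it assumes the default column layout shown in the examples above, where the state code (R, PD, F, and so on) is the fifth whitespace-separated field and job names contain no spaces. The sample input is copied from Example 9-2 so that the sketch runs without a cluster.

```shell
# Summarize squeue-style output (read from standard input) as "STATE COUNT".
# Assumption: the state code is the fifth whitespace-separated field,
# as in the default layout shown in Examples 9-2 and 9-3.
count_job_states() {
    awk 'NR > 1 && NF >= 5 { count[$5]++ }            # skip the header line
         END { for (s in count) print s, count[s] }' | sort
}

# Sample input copied from Example 9-2.
count_job_states <<'EOF'
JOBID PARTITION NAME USER ST TIME_USED NODES NODELIST(REASON)
12345 debug job1 jody R 0:21 4 n[9-12]
12346 debug job2 jody PD 0:00 8
EOF
```

On a live system, the same function could read from the command directly, for example by piping the output of squeue into count_job_states.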

 

9.5 Terminating Jobs with the scancel Command

The scancel command cancels a pending or running job or job step. It can also be used to send a specified signal to all processes on all nodes associated with a job. Only job owners or administrators can cancel jobs.

Example 9-4 terminates job #415 and all its job steps.

Example 9-4 Terminating a Job by Its JobID

$ scancel 415

Example 9-5 cancels all pending jobs.

Example 9-5 Cancelling All Pending Jobs

$ scancel --state=PENDING

Example 9-6 sends the TERM signal to terminate job steps 421.2 and 421.3.

Example 9-6 Sending a Signal to a Job

$ scancel --signal=TERM 421.2 421.3
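
A process can catch or ignore the TERM signal, so a common pattern is to send TERM first and follow it with KILL after a grace period. The sketch below illustrates that pattern under stated assumptions: the function name terminate_steps and the variables SCANCEL and GRACE are hypothetical names introduced here, not SLURM settings, and the only scancel option used is the --signal option shown in Example 9-6. Setting SCANCEL=echo prints the commands instead of running them.

```shell
# Hypothetical helper: ask job steps to exit with TERM, wait, then send KILL.
# SCANCEL and GRACE are names assumed for this sketch only.
SCANCEL=${SCANCEL:-scancel}   # set SCANCEL=echo for a dry run
GRACE=${GRACE:-30}            # seconds to wait between the two signals

terminate_steps() {
    if [ "$#" -eq 0 ]; then
        echo "usage: terminate_steps JOBID[.STEPID]..." >&2
        return 2
    fi
    "$SCANCEL" --signal=TERM "$@"   # polite request, as in Example 9-6
    sleep "$GRACE"                  # give the processes time to clean up
    "$SCANCEL" --signal=KILL "$@"   # cannot be caught or ignored
}
```

A dry run such as SCANCEL=echo GRACE=0 terminate_steps 421.2 421.3 shows the two commands that would be issued. If a job step has already exited by the time KILL is sent, scancel may report an error for that step, which this sketch does not handle.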

9.6 Getting System Information with the sinfo Command

The sinfo command reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options. Its output summarizes partition and node information (not job information), such as partition names, the number of nodes in each partition, and the number of cores on each node.

Example 9-7 Using the sinfo Command (No Options)

$ sinfo
PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf        up     infinite   3      down*  n[0,5,8]
lsf        up     infinite   14     idle   n[1-4,6-7,9-16]

The node STATE codes in these examples may be followed by an asterisk character (*); this indicates that the reported node is not responding. See the sinfo(1) manpage for a complete listing and description of STATE codes.
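
The asterisk convention makes it straightforward to pick out nodes that are not responding. The following is a minimal sketch rather than a SLURM feature: the function name nonresponding_nodes is hypothetical, and it assumes the default sinfo columns shown in Example 9-7, with STATE as the fifth field and NODELIST as the sixth. The sample input is copied from Example 9-7 so that the sketch runs without a cluster.

```shell
# Print partition, state, and node list for entries whose STATE ends in "*".
# Assumption: default sinfo layout, STATE in field 5 and NODELIST in field 6.
nonresponding_nodes() {
    awk 'NR > 1 && $5 ~ /\*$/ { print $1, $5, $6 }'   # NR > 1 skips the header
}

# Sample input copied from Example 9-7.
nonresponding_nodes <<'EOF'
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf up infinite 3 down* n[0,5,8]
lsf up infinite 14 idle n[1-4,6-7,9-16]
EOF
```

With the sample input, only the down* entry for n[0,5,8] is printed. On a live system, the output of sinfo could be piped into the function instead.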
