Getting Information About Jobs

There are several ways to get information about a specific job after it has been submitted to LSF-HPC. This section briefly describes some of the commands available under LSF-HPC for gathering information about a job. It is not intended as complete coverage of this topic; it is intended only to introduce the commonly used commands and to describe any differences in the way these commands operate in the HP XC environment. Refer to the LSF manpages for full information about the commands described in this section.

The following LSF commands are described in this section:

bjobs "Examining the Status of a Job"

bhist "Viewing the Historical Information for a Job"

Getting Job Allocation Information

Before a job runs, LSF-HPC allocates SLURM compute nodes based on the job's resource requirements. After LSF-HPC allocates nodes for a job, it attaches allocation information to the job. You can view job allocation information with the bjobs -l and bhist -l commands. Refer to the LSF manpages for details about using these commands.
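For example, the following commands (using a hypothetical job ID of 24) display the allocation information attached to a job:

$ bjobs -l 24
$ bhist -l 24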

A job allocation information string looks like the following:

slurm_id=slurm_jobid;ncpus=slurm_nprocs;slurm_alloc=node_list

This allocation string has the following values:

slurm_id

The value of the SLURM_JOBID environment variable. This is the SLURM allocation ID, which associates the LSF-HPC job with the SLURM allocated resources.

ncpus

The value of the SLURM_NPROCS environment variable. This is the actual number of allocated cores. Under node-level allocation scheduling, this number can be larger than the number of cores the job requested.

slurm_alloc

The allocated node list (comma separated).
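For example, given the field descriptions above, the allocation string for a job that was allocated 8 cores on four nodes might look like the following (the values are illustrative and match the running-job example later in this section):

slurm_id=22;ncpus=8;slurm_alloc=n[5-8]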

When LSF-HPC starts a job, it sets the SLURM_JOBID and SLURM_NPROCS environment variables in the job's environment.
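As a minimal sketch (assuming a simple script submitted with the LSF bsub command), a job script could read these variables at run time:

#!/bin/sh
# Illustrative job script fragment: LSF-HPC sets SLURM_JOBID and
# SLURM_NPROCS in the job's environment when the job starts.
echo "SLURM allocation ID:       $SLURM_JOBID"
echo "Number of allocated cores: $SLURM_NPROCS"

A hypothetical submission of this script (the script name is illustrative) might look like the following:

$ bsub -n 4 ./myscript.sh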

Job Allocation Information for a Running Job

The following is an example of using the bjobs -l command to obtain job allocation information about a running job:

$ bjobs -l 24

Job <24>, User <lsfadmin>, Project <default>, Status <RUN>, Queue <normal>, Interactive pseudo-terminal shell mode, Extsched <SLURM[nodes=4]>, Command </bin/bash>

date and time stamp: Submitted from host <n2>, CWD <$HOME>, 4 Processors Requested, Requested Resources <type=any>;

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];

SCHEDULING PARAMETERS:

           r15s   r1m   r15m   ut    pg    io    ls    it    tmp   swp   mem
loadSched    -     -     -      -     -     -     -     -     -     -     -
loadStop     -     -     -      -     -     -     -     -     -     -     -

EXTERNAL MESSAGES:

MSG_ID  FROM       POST_TIME              MESSAGE           ATTACHMENT
0       -          -                      -                 -
1       lsfadmin   date and time stamp    SLURM[nodes=4]    N

In particular, note the node and job allocation information provided in the above output:

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];
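Because the slurm_id value is the SLURM allocation ID, you can also cross-check the allocation on the SLURM side. For example, assuming the standard SLURM utilities are available on the system, the following command reports the SLURM job that corresponds to slurm_id=22:

$ squeue -j 22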
