Getting Information About Jobs

There are several ways to get information about a specific job after it has been submitted to LSF-HPC. This section briefly describes some of the commands available under LSF-HPC for gathering information about a job. It is not intended as complete coverage of this topic; it is intended only to introduce the commonly used commands and to describe any differences in the way these commands operate in the HP XC environment. Refer to the LSF manpages for full information about the commands described in this section.

The following LSF commands are described in this section:

bjobs "Examining the Status of a Job"

bhist "Viewing the Historical Information for a Job"

Getting Job Allocation Information

Before a job runs, LSF-HPC allocates SLURM compute nodes based on the job's resource requirements. After LSF-HPC allocates nodes for a job, it attaches allocation information to the job. You can view job allocation information with the bjobs -l and bhist -l commands. Refer to the LSF manpages for details about using these commands.
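For example, the following commands (using a hypothetical job ID of 24) display the allocation information attached to a job:

$ bjobs -l 24
$ bhist -l 24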

A job allocation information string looks like the following:

slurm_id=slurm_jobid;ncpus=slurm_nprocs;slurm_alloc=node_list

This allocation string has the following values:

slurm_id

The value of the SLURM_JOBID environment variable. This is the SLURM allocation ID, which associates the LSF-HPC job with the SLURM allocated resources.

ncpus

The value of the SLURM_NPROCS environment variable. This is the actual number of allocated cores. Under node-level allocation scheduling, this number can be larger than the number of cores the job requested.

slurm_alloc

The allocated node list (comma separated).
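For example, given the field descriptions above, the allocation string for a job that was allocated 8 cores on four nodes might look like the following (the values are illustrative and match the running-job example later in this section):

slurm_id=22;ncpus=8;slurm_alloc=n[5-8]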

When LSF-HPC starts a job, it sets the SLURM_JOBID and SLURM_NPROCS environment variables in the job's environment.
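As a minimal sketch (assuming a simple script submitted with the LSF bsub command), a job script could read these variables at run time:

#!/bin/sh
# Illustrative job script fragment: LSF-HPC sets SLURM_JOBID and
# SLURM_NPROCS in the job's environment when the job starts.
echo "SLURM allocation ID:       $SLURM_JOBID"
echo "Number of allocated cores: $SLURM_NPROCS"

A hypothetical submission of this script (the script name is illustrative) might look like the following:

$ bsub -n 4 ./myscript.sh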

Job Allocation Information for a Running Job

The following is an example of using the bjobs -l command to obtain job allocation information about a running job:

$ bjobs -l 24

Job <24>, User <lsfadmin>, Project <default>, Status <RUN>, Queue <normal>, Interactive pseudo-terminal shell mode, Extsched <SLURM[nodes=4]>, Command </bin/bash>

date and time stamp: Submitted from host <n2>, CWD <$HOME>, 4 Processors Requested, Requested Resources <type=any>;

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];

SCHEDULING PARAMETERS:

           r15s   r1m   r15m   ut    pg    io    ls    it    tmp   swp   mem
loadSched    -     -     -      -     -     -     -     -     -     -     -
loadStop     -     -     -      -     -     -     -     -     -     -     -

EXTERNAL MESSAGES:

MSG_ID  FROM       POST_TIME              MESSAGE           ATTACHMENT
0       -          -                      -                 -
1       lsfadmin   date and time stamp    SLURM[nodes=4]    N

In particular, note the node and job allocation information provided in the above output:

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];
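Because the slurm_id value is the SLURM allocation ID, you can also cross-check the allocation on the SLURM side. For example, assuming the standard SLURM utilities are available on the system, the following command reports the SLURM job that corresponds to slurm_id=22:

$ squeue -j 22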
