Use the bjobs command to monitor job status in LSF-HPC.
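For example, the following bjobs command reports the status of your unfinished jobs. The job ID, user name, and times shown here are illustrative:

$ bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
53      user1   RUN   normal     n16         lsfhost.loc ./myscript Mar 15 10:12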

Use the bqueues command to list the configured job queues in LSF-HPC.
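For example, the following bqueues command lists the queues and their current job counts. Queue names and limits vary by site; this output is illustrative:

$ bqueues
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
normal           30  Open:Active       -    -    -    -     1     0     1     0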

How LSF-HPC and SLURM Launch and Manage a Job

This section describes what happens in the HP XC system when a job is submitted to LSF-HPC. Figure 9-1 illustrates this process; the numbered steps in the text correspond to the numbers in the illustration.

Consider the HP XC system configuration shown in Figure 9-1, in which lsfhost.localdomain is the virtual IP name assigned to the LSF execution host, node n16 is the login node, and nodes n[1-10] are compute nodes in the lsf partition. Each node contains two cores, providing 20 cores for use by LSF-HPC jobs.
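You can confirm the compute node configuration from the login node with the SLURM sinfo command. The output shown is illustrative of this example configuration:

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
lsf          up   infinite     10   idle n[1-10]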

Figure 9-1 How LSF-HPC and SLURM Launch and Manage a Job

[Figure 9-1 depicts the job flow: the user submits the job with bsub from login node n16; on the LSF execution host lsfhost.localdomain, the job_starter.sh script runs srun -n1 ./myscript with SLURM_JOBID=53 and SLURM_NPROCS=4; myscript executes on compute node n1, where its srun hostname and mpirun -srun ./hellompi commands launch tasks on compute nodes n1 through n4.]

1. A user logs in to login node n16.

2. The user executes the following LSF bsub command on login node n16:

$ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
