4. LSF-HPC prepares the user environment for the job on the LSF-HPC execution host node and dispatches the job with the job_starter.sh script. This environment includes the standard LSF environment variables plus two SLURM-specific environment variables: SLURM_JOBID and SLURM_NPROCS.

SLURM_JOBID is the SLURM job ID of the job. Note that this is not the same as the LSF job ID.

SLURM_NPROCS is the number of processors allocated.

These environment variables are intended for use by the user's job, whether explicitly (user scripts may read these variables as needed) or implicitly (any srun command in the user's job uses them to determine its resource allocation).

In this example, SLURM_JOBID is 53 and SLURM_NPROCS is 4.
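As a minimal sketch, a job script can read these two variables directly. The fallback values below (53 and 4) are the example values from this section and apply only when the script is run outside an LSF-HPC job; on a real system, LSF-HPC sets both variables at dispatch time.

```shell
#!/bin/sh
# Sketch: reading the SLURM variables that LSF-HPC sets in the job environment.
# The defaults (53 and 4) are the example values from the text, used only
# when the script runs outside an actual LSF-HPC allocation.
SLURM_JOBID="${SLURM_JOBID:-53}"
SLURM_NPROCS="${SLURM_NPROCS:-4}"
echo "Running SLURM job $SLURM_JOBID on $SLURM_NPROCS processors"
```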

5. The user job myscript begins execution on compute node n1.

The first line in myscript is the hostname command. It executes locally and returns the name of the node, n1.

6. The second line in myscript is the srun hostname command. The srun command inherits SLURM_JOBID and SLURM_NPROCS from the environment and executes the hostname command on each compute node in the allocation.

7. The output of the hostname tasks (n1, n2, n3, and n4) is aggregated back to the srun launch command (shown as dashed lines in Figure 7-1) and is ultimately returned to the srun command in the job starter script, where it is collected by LSF-HPC.

The last line in myscript is the mpirun -srun ./hellompi command. The srun command invoked by mpirun inherits the SLURM_JOBID and SLURM_NPROCS environment variables and executes hellompi on each compute node in the allocation.

The command executes on the allocated compute nodes n1, n2, n3, and n4. The output of the hellompi tasks is aggregated back to the srun launch command, where it is collected by LSF-HPC.
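Putting the steps above together, the myscript job script described in this walkthrough can be reconstructed as the following sketch (the script body is inferred from the text; hellompi is the example MPI program from this section). It requires an HP XC system with SLURM and LSF-HPC, so it is illustrative rather than runnable elsewhere.

```shell
#!/bin/sh
# Sketch of myscript, reconstructed from the walkthrough above.
hostname                 # step 5: runs locally on the first node, prints n1
srun hostname            # step 6: runs hostname on every node in the allocation
mpirun -srun ./hellompi  # step 7: launches hellompi across the allocation
```

Note that neither srun invocation specifies a job ID or processor count; both are taken implicitly from SLURM_JOBID and SLURM_NPROCS in the environment.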

When the job finishes, LSF-HPC cancels the SLURM allocation, which frees the compute nodes for use by another job.

7.1.5 Differences Between LSF on HP XC and Standard LSF

LSF for the HP XC environment supports all the standard features and functions that standard LSF supports, except for those items described in this section, in Section 7.1.6, and in the HP XC release notes for LSF.

The external scheduler option for HP XC provides additional capabilities at the job level and queue level by allowing the inclusion of several SLURM options in the LSF command line.
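As a hedged sketch of submitting through this option (the exact option string and node names here are illustrative assumptions, not taken from this section):

```shell
# Hypothetical submission using the HP XC external scheduler option:
# request 4 processors and pass SLURM-specific constraints through to SLURM.
bsub -n 4 -ext "SLURM[nodes=4]" -o myjob.out ./myscript
```

Consult the bsub reference for your LSF-HPC release for the supported SLURM option keywords.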

LSF does not collect the maxswap, ndisks, r15s, r1m, r15m, ut, pg, io, tmp, swp, and mem load indices from each application node. The lshosts and lsload commands display "-" for all of these items.

LSF-enforced job-level run-time limits are not supported.

LSF cannot report any job accounting information other than run time (wall clock time) and the total number of CPUs.

LSF does not support parallel or SLURM-based interactive jobs in PTY mode (bsub -Is and bsub -Ip).

LSF does not support user-account mapping and system-account mapping.

