The LSF environment is set up automatically for the user on login; LSF commands and their manpages are readily accessible:
•The bhosts command is useful for viewing LSF batch host information.
•The lshosts command provides static resource information.
•The lsload command provides dynamic resource information.
•The bsub command is used to submit jobs to LSF.
•The bjobs command provides information on batch jobs.
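As a quick orientation, a session might combine these commands as follows. The output is omitted here because it depends on your cluster configuration and LSF version; the job script name myjob.sh and its options are illustrative:

```shell
# View batch host status, static resources, and dynamic load
$ bhosts
$ lshosts
$ lsload

# Submit a job to LSF and check on its progress
# (the -n 4 core request and script name are examples only)
$ bsub -n 4 -o myjob.out ./myjob.sh
$ bjobs
```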
10.2 Overview of LSF Integrated with SLURM
LSF was integrated with SLURM for the HP XC system to merge the scalable and efficient resource management of SLURM with the extensive scheduling capabilities of LSF. In this integration:
•SLURM manages the compute resources.
•LSF performs the job management.
SLURM extends the parallel capabilities of LSF with its own fast parallel launcher, the srun command.
Managing the compute resources of the HP XC system with SLURM means that the LSF daemons run only on one HP XC node and can present the HP XC system as a single LSF host. As a result:
•All the nodes are configured as LSF Client Hosts; every node is able to access LSF. You can submit jobs from any node in the HP XC system.
•The lshosts and bhosts commands only list one host that represents all the resources of the HP XC system.
LSF integrated with SLURM obtains resource information about the HP XC system. This information is consolidated, and key values such as the total number of cores and the maximum memory available across all nodes become the characteristics of the single HP XC “LSF Execution Host”. Additional resource information from SLURM, such as node features, is also made available.
Integrating LSF with SLURM on HP XC systems provides you with a parallel launch command to distribute and manage parallel tasks efficiently. The SLURM srun command offers considerable flexibility for expressing resource requirements across an HP XC system; for example, you can:
•Request contiguous nodes
•Execute only one task per node
•Request nodes with specific features
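For illustration, these requests correspond to standard srun options; the node counts and the feature name bigmem below are made up:

```shell
# Request 4 contiguous nodes
$ srun --nodes=4 --contiguous hostname

# Run exactly one task per node across 4 nodes
$ srun --nodes=4 --ntasks-per-node=1 hostname

# Request nodes that advertise a specific feature
# (the feature name "bigmem" is hypothetical)
$ srun --constraint=bigmem hostname
```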
This flexibility is preserved in LSF through the external SLURM scheduler; this is discussed in more detail in a later section.
A SLURM partition named lsf is used to manage LSF jobs. Thus:
•You can view information about this partition with the sinfo command.
•The total number of cores listed by the lshosts and bhosts commands for that host should be equal to the total number of cores assigned to the SLURM lsf partition.
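As a cross-check, you can compare the SLURM view of the lsf partition with the LSF view of the single execution host; the core count reported by each should agree (output varies by system):

```shell
# Show the SLURM partition that backs LSF
$ sinfo -p lsf

# The core total reported by these commands should equal
# the total cores assigned to the lsf partition
$ bhosts
$ lshosts
```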
When a job is submitted and the resources are available, LSF creates a SLURM allocation of nodes for the job and sets the following environment variable:
SLURM_JOBID This environment variable is created so that subsequent srun commands make use of the SLURM allocation created by LSF for the job. This variable can be used by a job script to query information about the SLURM allocation, as shown here:
$ squeue --jobs $SLURM_JOBID
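For example, a job script submitted with bsub can launch its parallel tasks with srun; because LSF sets SLURM_JOBID, the srun commands run inside the allocation that LSF already created. The script name, resource request, and output file below are illustrative:

```shell
#!/bin/sh
# myjob.sh -- illustrative LSF job script for an HP XC system

# Query the SLURM allocation that LSF created for this job
squeue --jobs $SLURM_JOBID

# Launch tasks inside that allocation; srun reuses SLURM_JOBID
srun hostname
```

Submit it with, for example: $ bsub -n 8 -o myjob.out ./myjob.sh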
86 Using LSF