10 Using LSF

The Load Sharing Facility (LSF) from Platform Computing is a batch system resource manager used on the HP XC system.

On an HP XC system, a job is submitted to LSF, which places the job in a queue and allows it to run when the necessary resources become available. In addition to launching jobs, LSF provides extensive job management and information capabilities. LSF schedules, launches, controls, and tracks jobs that are submitted to it according to the policies established by the HP XC site administrator.
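The submit, queue, and run cycle described above can be sketched with a few standard LSF commands. This is a minimal illustration, not a site-specific procedure: the job (`hostname`) and the output file name are arbitrary placeholders, and the sketch assumes the LSF commands are on your PATH. The `lsf_status` variable is a hypothetical name used only in this sketch.

```shell
#!/bin/sh
# Sketch only: submit a trivial job, then ask LSF about it.
# Assumption: the LSF commands (bsub, bjobs, bqueues) are on PATH.
# The job "hostname" and the file name job_%J.out are placeholders.
if command -v bsub >/dev/null 2>&1; then
    bsub -o job_%J.out hostname   # queue the job; %J expands to the LSF job ID
    bjobs                         # list your unfinished jobs and their states
    bqueues                       # show the queues configured by the administrator
    lsf_status="submitted"
else
    echo "LSF commands not found on this host"
    lsf_status="unavailable"
fi
```

The job remains pending until the resources it needs become available. Which queue it lands in, and when it runs, depends on the policies your site administrator has configured.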

Two types of LSF are available for installation on the HP XC:

LSF

This product is the popular batch system produced by Platform Computing that has become an industry standard. For full information about LSF, see the standard LSF documentation set, which is described in the “Related Software Products and Additional Publications” section of this manual. LSF manpages are also available online on the HP XC system.

LSF integrated with SLURM (LSF)

This product is the LSF product from Platform Computing that has been integrated with SLURM to take advantage of SLURM's scalable, efficient resource management and parallel job support.

The intended primary use of the HP XC system determined which of these LSF products was installed.

This chapter introduces you to LSF integrated with SLURM in the HP XC environment. It provides an overview of how LSF integrated with SLURM works, and discusses some of its features and how it differs from LSF on an HP XC system.

This chapter also contains an important discussion of how LSF and SLURM work together to provide the HP XC job management environment. A description of SLURM is provided in Chapter 9 (page 81).

This chapter addresses the following topics:

“Information for LSF” (page 85)
“Overview of LSF Integrated with SLURM” (page 86)
“Differences Between LSF and LSF Integrated with SLURM” (page 88)
“Job Terminology” (page 89)
“Using LSF Integrated with SLURM in the HP XC Environment” (page 91)
“Submitting Jobs” (page 91)
“How LSF and SLURM Launch and Manage a Job” (page 92)
“Determining the LSF Execution Host” (page 94)
“Determining Available System Resources” (page 94)
“Getting Information About Jobs” (page 96)
“Translating SLURM and LSF JOBIDs” (page 100)
“Working Interactively Within an Allocation” (page 101)
“LSF Equivalents of SLURM srun Options” (page 103)

10.1 Information for LSF

Information about LSF is provided in the LSF documentation. This documentation is on the HP XC installation disk, and the LSF manpages are available online on the HP XC system.

LSF is installed and configured on all nodes of the HP XC system by default. Nodes without the compute role are closed with '0' job slots available for use.
