allocates the appropriate whole node for exclusive use by the serial job in the same manner as it does for parallel jobs, hence the name “pseudo-parallel”.

Parallel job: A job that requests more than one slot, regardless of any other constraints. Parallel jobs are allocated up to the maximum number of nodes specified by the following specifications:

SLURM[nodes=min-max] (if specified)

SLURM[nodelist=node_list] (if specified)

bsub -n

Parallel jobs and serial jobs cannot run on the same node.

Small job: A parallel job that can potentially fit into a single node and does not explicitly request more than one node (through a SLURM[nodes] or SLURM[nodelist] specification). LSF tries to allocate a single node for a small job.
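
The parallel job and small job definitions above can be sketched in shell. This is a minimal illustration under stated assumptions, not LSF's actual allocation logic. The helper names build_bsub and is_small_job are hypothetical. A job is assumed to "fit into a single node" when its slot count does not exceed the node's core count. The -ext option is assumed to be the way a SLURM[...] specification is passed to the LSF-SLURM external scheduler. build_bsub only prints a command line; it submits nothing.

```shell
# Print (without submitting) a bsub command line for a parallel job.
# build_bsub is a hypothetical helper, not part of LSF.
#   $1 = number of slots for bsub -n
#   $2 = node range for SLURM[nodes=min-max], or "" for none
#   remaining arguments = the job command
build_bsub() {
    slots=$1
    nodes=$2
    shift 2
    if [ -n "$nodes" ]; then
        printf 'bsub -n %s -ext "SLURM[nodes=%s]" %s\n' "$slots" "$nodes" "$*"
    else
        printf 'bsub -n %s %s\n' "$slots" "$*"
    fi
}

# Succeed if a request matches the small job definition.
# is_small_job is a hypothetical helper; the fit test is an assumption.
#   $1 = requested slots
#   $2 = cores per node (assumed known to the caller)
#   $3 = explicitly requested node count, 0 or omitted if none
is_small_job() {
    slots=$1
    cores_per_node=$2
    requested_nodes=${3:-0}
    [ "$slots" -gt 1 ] &&                  # parallel: more than one slot
    [ "$slots" -le "$cores_per_node" ] &&  # could fit on a single node
    [ "$requested_nodes" -le 1 ]           # no explicit multi-node request
}
```

For example, build_bsub 8 2-4 srun hostname prints a command that requests 8 slots on 2 to 4 nodes, and is_small_job 4 8 succeeds because 4 slots fit on an 8-core node with no explicit node request.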

10.5 Using LSF Integrated with SLURM in the HP XC Environment

This section provides additional information about using LSF in the HP XC environment.

10.5.1 Useful Commands

The following commands are useful with LSF integrated with SLURM:

Use the bjobs -l and bhist -l commands to see the components of the actual SLURM allocation command.

Use the bkill command to kill jobs.

Use the bjobs command to monitor job status in LSF integrated with SLURM.

Use the bqueues command to list the configured job queues in LSF integrated with SLURM.
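
As a small sketch of how the first of these commands might be combined, the helper below runs bjobs -l and bhist -l for one job. The function name show_allocation is hypothetical, and the job ID is whatever LSF reported at submission time. The helper assumes the LSF commands are on the PATH and reports an error otherwise.

```shell
# Show the long-format job and history records for one LSF job.
# show_allocation is a hypothetical helper, not part of LSF.
#   $1 = LSF job ID
show_allocation() {
    jobid=$1
    if ! command -v bjobs >/dev/null 2>&1 ||
       ! command -v bhist >/dev/null 2>&1; then
        echo "LSF commands not found in PATH" >&2
        return 1
    fi
    bjobs -l "$jobid"   # current job record
    bhist -l "$jobid"   # job history record
}
```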

10.5.2 Job Startup and Job Control

When LSF starts a SLURM job, it sets SLURM_JOBID to associate the job with the SLURM allocation. While a job is running, all operating-system-enforced resource limits that LSF supports are in effect, including the core limit, CPU time limit, data limit, file size limit, memory limit, and stack limit. If the user kills a job, LSF propagates signals to the entire job, including the job file running on the local node and all tasks running on remote nodes.
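
Because SLURM_JOBID is set in the job's environment, a job script can read it to find out which SLURM allocation it belongs to. The following is a minimal sketch; the function name report_allocation and the message wording are illustrative assumptions.

```shell
# Report the SLURM allocation associated with the current job, if any.
# report_allocation is a hypothetical helper for use inside a job script.
report_allocation() {
    if [ -n "${SLURM_JOBID:-}" ]; then
        echo "Running in SLURM allocation $SLURM_JOBID"
    else
        echo "SLURM_JOBID is not set"
    fi
}
```

Calling report_allocation early in a job script records the allocation ID in the job output, which can help when matching LSF job records to SLURM records.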

10.5.3 Preemption

LSF uses the SLURM "node share" feature to facilitate preemption. When a low-priority job is preempted, its processes are suspended on the allocated nodes, and LSF places the high-priority job on the same node. After the high-priority job completes, LSF resumes the suspended low-priority jobs.

10.6 Submitting Jobs

The bsub command submits jobs to LSF; it is used to request a set of resources on which to launch a job. This section focuses on enhancements to this command from the LSF integration with SLURM on the HP XC system; this section does not discuss standard bsub functionality or flexibility. See the Platform LSF documentation and the bsub(1) manpage for more information on this important command. The topic of submitting jobs with the LSF-SLURM External Scheduler is explored in detail in “Submitting a Parallel Job Using the SLURM External Scheduler”.

The HP XC system has several features that make it optimal for running parallel applications, particularly (but not exclusively) MPI applications. You can use the bsub command's -n to
