The srun command, used by the mpirun command to launch the MPI tasks in parallel, determines the number of tasks to launch from the SLURM_NPROCS environment variable that was set by LSF-HPC. Recall that the value of this environment variable is equal to the number provided by the -n option of the bsub command.

Consider an HP XC system configuration in which lsfhost.localdomain is the LSF execution host and nodes n[1-10] are compute nodes in the lsf partition. Each node contains 2 processors, providing 20 processors for use by LSF jobs.

Example 7-6 runs a hello_world MPI program on four processors.

Example 7-6: Submitting an HP-MPI Job

$ bsub -n4 -I mpirun -srun ./hello_world

Job <75> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on lsfhost.localdomain>>
Hello world! I’m 0 of 4 on n2
Hello world! I’m 1 of 4 on n2
Hello world! I’m 2 of 4 on n4
Hello world! I’m 3 of 4 on n4

Example 7-7 runs the same hello_world MPI program on four processors, but uses the external SLURM scheduler to request one task per node.

Example 7-7: Submitting an HP-MPI Job with a Specific Topology Request

$ bsub -n4 -ext "SLURM[nodes=4]" -I mpirun -srun ./hello_world

Job <77> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on lsfhost.localdomain>>
Hello world! I’m 0 of 4 on n1
Hello world! I’m 1 of 4 on n2
Hello world! I’m 2 of 4 on n3
Hello world! I’m 3 of 4 on n4

If the MPI job requires an appfile, or cannot use the srun command as the task launcher for some other reason, some preprocessing is needed to determine the hostnames of the nodes on which mpirun’s standard task launcher should launch the tasks. In such scenarios, you need to write a batch script. Several methods are available for determining the nodes in an allocation. One is to use the SLURM_JOBID environment variable with the squeue command to query the nodes. Another is to use LSF environment variables such as LSB_HOSTS and LSB_MCPU_HOSTS, which are prepared by the HP XC job starter script.
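The preprocessing step described above can be sketched as follows. This is a minimal illustration, not a script supplied with HP XC: the helper name mcpu_to_appfile is hypothetical, the sketch assumes that LSB_MCPU_HOSTS holds alternating hostname and processor-count values (for example "n2 2 n4 2"), and the appfile line format shown (-h host -np count program) is an assumption that should be checked against the HP-MPI documentation.

```shell
#!/bin/sh
# Sketch only: build appfile lines from LSB_MCPU_HOSTS.
# mcpu_to_appfile is a hypothetical helper, not part of LSF or HP-MPI.
# Assumed appfile line format: -h <host> -np <count> <program>

mcpu_to_appfile() {
    program=$1
    shift
    # The remaining arguments are hostname/count pairs.
    while [ "$#" -ge 2 ]; do
        echo "-h $1 -np $2 $program"
        shift 2
    done
}

# Sample value so that the sketch also runs outside an LSF job.
: "${LSB_MCPU_HOSTS:=n2 2 n4 2}"

# Left unquoted on purpose so that the value splits into separate words.
mcpu_to_appfile ./hello_world $LSB_MCPU_HOSTS
```

With the sample value, the sketch prints one line for n2 and one for n4, each requesting 2 processes. In a batch script, this output would typically be redirected to a file and that file passed to mpirun as its appfile.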

7.4.6 Submitting a Batch Job or Job Script

The bsub command format to submit a batch job or job script is:

bsub -n num-procs [bsub-options] script-name

The -n num-procs parameter specifies the number of processors the job requests. -n num-procs is required for parallel jobs. script-name is the name of the batch job or script. Any bsub options can be included. The script can contain one or more srun or mpirun commands and options.

The script will be executed once on the first allocated node, and any srun or mpirun commands within the script can use some or all of the allocated compute nodes.
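A job script of this kind might look like the following sketch. The file name myjob.sh and the hello_world program are placeholders, and the guards on SLURM_JOBID are an addition for illustration only, so that the script exits cleanly when run outside an allocation.

```shell
#!/bin/sh
# myjob.sh: hypothetical job script, shown as a sketch only.

# The script itself runs once, on the first allocated node.
echo "Script running on $(hostname)"

# SLURM_NPROCS is set by LSF-HPC from the bsub -n value.
# Default to 1 so that the sketch also runs outside a job.
NPROCS=${SLURM_NPROCS:-1}
echo "Processors requested: $NPROCS"

# Launch the parallel steps only inside an allocation.
if [ -n "${SLURM_JOBID:-}" ]; then
    # Run hostname once per allocated processor.
    srun hostname
    # Run the MPI program if it is present in the current directory.
    if [ -x ./hello_world ]; then
        mpirun -srun ./hello_world
    fi
fi
```

Following the command format above, such a script could be submitted with, for example, bsub -n4 -I ./myjob.sh.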
