The srun command, used by the mpirun command to launch the MPI tasks in parallel, determines the number of tasks to launch from the SLURM_NPROCS environment variable, which is set when the job's SLURM allocation is created to match the number of processors requested with the bsub -n option.
Consider an HP XC system configuration in which lsfhost.localdomain is the LSF execution host and nodes n1 through n4 are compute nodes available to LSF; each node has two processors. The following examples illustrate job submission on this configuration.
Example 7-6: Submitting an HP-MPI Job
$ bsub -I -n4 mpirun -srun ./hello_world
Job <75> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on lsfhost.localdomain>>
Hello world! I’m 0 of 4 on n2
Hello world! I’m 1 of 4 on n2
Hello world! I’m 2 of 4 on n4
Hello world! I’m 3 of 4 on n4
Example 7-7 runs the same hello_world MPI program on four processors, but uses the SLURM[nodes=4] external scheduler option to request one task per node.
Example 7-7: Submitting an HP-MPI Job with a Specific Topology Request
$ bsub -I -n4 -ext "SLURM[nodes=4]" mpirun -srun ./hello_world
<<Waiting for dispatch ...>>
<<Starting on lsfhost.localdomain>>
Hello world! I’m 0 of 4 on n1
Hello world! I’m 1 of 4 on n2
Hello world! I’m 2 of 4 on n3
Hello world! I’m 3 of 4 on n4
If the MPI job requires an appfile, or some other reason prevents the use of the srun command as the task launcher, you must write a batch script that first determines the hostnames of the allocated nodes so that mpirun's standard task launcher can launch tasks on them. Several methods are available for determining the nodes in an allocation: one is to use the SLURM_JOBID environment variable with the squeue command to query the nodes; another is to use LSF environment variables such as LSB_HOSTS and LSB_MCPU_HOSTS, which are prepared by the HP XC job starter script.
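As a sketch of the second method, the following batch script fragment converts LSB_MCPU_HOSTS (which holds "host ncpus" pairs, for example "n2 2 n4 2") into an HP-MPI appfile. The program name ./hello_world and the appfile name are assumptions for illustration:

```shell
#!/bin/sh
# Sketch: build an HP-MPI appfile from LSB_MCPU_HOSTS inside an LSF
# batch script. LSB_MCPU_HOSTS contains "host1 ncpus1 host2 ncpus2 ..."
# pairs. The program name (./hello_world) is an assumption.

# Emit one appfile line per allocated host: -h <host> -np <ncpus> <program>
make_appfile() {
    # $1 = contents of LSB_MCPU_HOSTS
    echo "$1" | awk '{
        for (i = 1; i < NF; i += 2)
            printf "-h %s -np %s ./hello_world\n", $i, $(i+1)
    }'
}

make_appfile "$LSB_MCPU_HOSTS" > appfile
# mpirun can then launch with its standard task launcher:
#   mpirun -f appfile
```

With LSB_MCPU_HOSTS set to "n2 2 n4 2", the generated appfile would direct mpirun to start two tasks on n2 and two on n4, matching the allocation.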
7.4.6 Submitting a Batch Job or Job Script
The bsub command format to submit a batch job or job script is:

bsub -n num-procs [bsub-options] script-name

The -n num-procs parameter specifies the number of processors the job requests, and script-name is the name of the batch script to run.
The script will be executed once on the first allocated node, and any srun or mpirun commands within the script can use some or all of the allocated compute nodes.
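For illustration, a minimal job script might look like the following (the script name, program name, and the bsub invocation in the comment are assumptions):

```shell
#!/bin/sh
# myscript.sh -- hypothetical job script, submitted with: bsub -n4 ./myscript.sh
hostname                      # runs once, on the first allocated node only
srun hostname                 # runs on every node in the allocation
mpirun -srun ./hello_world    # launches MPI tasks across the allocation
```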