Q-Logic IB6054601-00 D manual Batch Queuing Script, Allocating Resources



Appendix B

Integration with a Batch Queuing System

Most cluster systems use some kind of batch queuing system as an orderly way to provide users with access to the resources they need to meet their job’s performance requirements. One of the tasks of the cluster administrator is to provide means for users to submit MPI jobs through such batch queuing systems. This can take the form of a script, which your users can invoke much as they would invoke mpirun to submit their MPI jobs. A sample script is presented in this section.

B.1 A Batch Queuing Script

We give an example of some of the functions that such a script might perform, in the context of the Simple Linux Utility Resource Manager (SLURM) developed at Lawrence Livermore National Laboratory. These functions assume the use of the bash shell. We will call this script batch_mpirun. It is provided here:

#! /bin/sh
# Very simple example batch script for InfiniPath MPI, using slurm
# (http://www.llnl.gov/linux/slurm/)
# Invoked as:
# batch_mpirun #cpus mpi_program_name mpi_program_args ...
#
np=$1 mpi_prog="$2" # assume arguments to script are correct
shift 2 # program args are now $@
eval `srun --allocate --ntasks=$np --no-shell`
mpihosts_file=`mktemp -p /tmp mpihosts_file.XXXXXX`
srun --jobid=${SLURM_JOBID} hostname -s | sort | uniq -c \
| awk '{printf "%s:%s\n", $2, $1}' > $mpihosts_file
mpirun -np $np -m $mpihosts_file "$mpi_prog" $@
exit_code=$?
scancel ${SLURM_JOBID}
rm -f $mpihosts_file
exit $exit_code
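The step that builds the mpihosts file may be easier to follow in isolation. The sketch below reproduces only the sort, uniq -c, and awk stages of the script. The function name hosts_to_mpihosts and the node names are hypothetical, and printf stands in for the output of srun ... hostname -s so that the example runs without a SLURM allocation.

```shell
#!/bin/sh
# Hypothetical helper (not part of the sample script): read one short
# hostname per line on stdin and print "host:count" lines, where count
# is the number of times that host appeared.
hosts_to_mpihosts() {
    sort | uniq -c | awk '{printf "%s:%s\n", $2, $1}'
}

# Simulated "srun hostname -s" output: four tasks on two made-up nodes.
printf '%s\n' node02 node01 node02 node01 | hosts_to_mpihosts
```

With this input the helper prints node01:2 followed by node02:2, that is, one line per node with the number of tasks placed on that node.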

In the following sections, setup and the various functions of the script are discussed in further detail.
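The sample script assumes that its arguments are correct. A site that wants a basic check before any resources are allocated could add something like the following sketch. The function name check_args, the messages, and the return code are illustrative assumptions and do not come from the sample script.

```shell
#!/bin/sh
# Hypothetical argument check for a batch_mpirun-style wrapper.
# Expects: #cpus mpi_program_name [mpi_program_args ...]
check_args() {
    if [ "$#" -lt 2 ]; then
        echo "usage: batch_mpirun #cpus mpi_program_name [mpi_program_args ...]" >&2
        return 2
    fi
    # Reject an empty value or anything containing a non-digit.
    case "$1" in
        ''|*[!0-9]*)
            echo "error: #cpus must be a positive integer, got '$1'" >&2
            return 2
            ;;
    esac
    # Reject zero.
    if [ "$1" -le 0 ]; then
        echo "error: #cpus must be greater than zero" >&2
        return 2
    fi
    return 0
}
```

In a wrapper, a line such as check_args "$@" || exit 2 placed before the assignment of np would stop the script before srun is invoked.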

B.1.1 Allocating Resources

When the mpirun command starts, it requires specification of the number of node programs it must spawn (via the -np option) and specification of an mpihosts file listing the nodes on which the node programs may be run. (See section 3.5.8 for more information.) Normally, since performance is usually important, a user might
