Q-Logic IB6054601-00 D manual Simple Process Management, Clean Termination of MPI Processes

Models: IB6054601-00 D



B – Integration with a Batch Queuing System
A Batch Queuing Script

by mpirun. Each line consists of a node name, a colon, and the number of processes to start on that node.

NOTE: This is one of two formats that the file may use. See section 3.5.6 for more information.
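As an illustration of the node:count format described above, the following sketch converts a list of host names (one per line, with a name repeated once for each process to start there) into lines of that form. The function name hosts_to_mpihosts and the input convention are assumptions made for this example; they are not taken from the QLogic script.

```shell
#!/bin/sh
# Hypothetical helper: read host names on stdin, one per line, and
# print one "node:count" line per distinct host name.
hosts_to_mpihosts() {
    # sort groups identical names, uniq -c prefixes each with its count,
    # and awk reorders the two fields into the node:count layout.
    sort | uniq -c | awk '{ printf "%s:%s\n", $2, $1 }'
}

# Example: three process slots spread over two (made-up) node names.
printf 'node01\nnode02\nnode01\n' | hosts_to_mpihosts
```

The output of such a function could be redirected into the file named by $mpihosts_file.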


Simple Process Management

At this point, your script has enough information to run an MPI program. All that remains is to start the program when the batch system allows it, and to notify the batch system when the job completes. This is done in the final part of batch_mpirun:

mpirun -np $np -m $mpihosts_file "$mpi_prog" $@
exit_code=$?

scancel ${SLURM_JOBID}
rm -f $mpihosts_file
exit $exit_code
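The same run-then-clean-up pattern can be wrapped in a function so that the exit code of the MPI program is preserved while the hosts file is always removed. This is a minimal sketch and not part of batch_mpirun: the name run_with_cleanup is hypothetical, and the scancel step is omitted so that the function can be tried outside SLURM.

```shell
#!/bin/sh
# Hypothetical helper: run a command, remove the hosts file afterwards,
# and return the command's exit code to the caller.
# Usage: run_with_cleanup HOSTS_FILE COMMAND [ARGS...]
run_with_cleanup() {
    hosts_file=$1
    shift
    exit_code=0
    # "$@" keeps each argument intact, including arguments with spaces.
    "$@" || exit_code=$?
    rm -f "$hosts_file"
    return "$exit_code"
}

# Intended use inside a batch script (not executed here):
#   run_with_cleanup "$mpihosts_file" \
#       mpirun -np "$np" -m "$mpihosts_file" "$mpi_prog" "$@"
#   exit $?
```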


Clean Termination of MPI Processes

The InfiniPath software will normally ensure clean termination of all MPI programs when a job ends, but in some rare circumstances an MPI process will remain alive and potentially interfere with future MPI jobs. To avoid this problem, the usual solution is to run a script before and after each batch job that kills all unwanted processes. QLogic does not provide such a script, but it is useful to know how to find out which processes on a node are using the InfiniPath interconnect. The easiest way to do this is with the fuser command, which is normally installed in /sbin. Run as root:

# /sbin/fuser -v /dev/ipath

/dev/ipath: 22648m 22651m

In this example, processes 22648 and 22651 are using the InfiniPath interconnect. It is also possible to use this command (as root):

# lsof /dev/ipath

This gets a list of processes using InfiniPath. Additionally, to get all processes, including stats programs, ipath_sma, diags, and others, run fuser in this way:

# /sbin/fuser -v /dev/ipath*

lsof can also take the same form:

# lsof /dev/ipath*
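
A cleanup script of the kind described above needs the process IDs in a form it can act on. The following sketch, offered only as a starting point, extracts the numeric IDs from a line in the format of the fuser example (/dev/ipath: 22648m 22651m). The function name pids_from_fuser_line is hypothetical, and the sketch assumes that each ID is followed only by access-type letters such as m.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric process IDs found in one line
# of fuser-style output, one ID per line.
# Usage: pids_from_fuser_line '/dev/ipath: 22648m 22651m'
pids_from_fuser_line() {
    printf '%s\n' "$1" |
        sed -e 's/^[^:]*://' |      # drop the file name and the colon
        tr -s ' ' '\n' |            # put each remaining field on its own line
        sed -n 's/^\([0-9][0-9]*\)[a-zA-Z]*$/\1/p'   # keep digits, drop letters
}

# Example using the line shown in the text.
pids_from_fuser_line '/dev/ipath: 22648m 22651m'
```

The resulting list should be reviewed before any process is signalled, since which processes count as unwanted is a site-specific decision.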
