In the Makefile named mymake, the ping_pong_ring application is run twice, first as target run1 and then as target run2:

$ cat mymake
PPR_ARGS=10000
NODES=2
TASKS=4
all: run1 run2
run1:
	mpirun -srun -N ${NODES} -n ${TASKS} ./ping_pong_ring ${PPR_ARGS}
run2:
	mpirun -srun -N ${NODES} -n ${TASKS} ./ping_pong_ring ${PPR_ARGS}

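The following command line submits make as an LSF job, which runs both targets. The PPR_ARGS=1000000 argument on the make command line overrides the value set in mymake: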

$ bsub -o %J.out -n2 -ext "SLURM[nodes=2]" make -j2 -f ./mymake PPR_ARGS=1000000

Job <113> is submitted to default queue <normal>.
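The job ID reported by bsub is what LSF substitutes for %J in the -o option, so the output of this job lands in 113.out. The following is a minimal sketch of pulling the job ID out of that message in a script. It works on a copy of the sample line above rather than on a live bsub call, and the variable names (bsub_output, jobid, outfile) are illustrative only.

```shell
# Sample line copied from the bsub output above; in a real script this
# text would be captured from the bsub command itself.
bsub_output='Job <113> is submitted to default queue <normal>.'

# Extract the digits between the first pair of angle brackets.
jobid=$(printf '%s\n' "$bsub_output" | sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p')

# With -o %J.out, the job ID replaces %J in the output file name.
outfile="${jobid}.out"
echo "$outfile"
```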

Use the squeue command to acquire information on the jobs:

$ squeue -s
STEPID   NAME             PARTITION  USER      TIME  NODELIST
15.0     hptclsf@113      lsf        lsfadmin  0:04  n14
15.1     ping_pong_ring   lsf        lsfadmin  0:04  n[14-15]
15.2     ping_pong_ring   lsf        lsfadmin  0:04  n[14-15]
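In this listing, both ping_pong_ring runs appear as separate steps (15.1 and 15.2) of the same SLURM allocation, on the same nodes. As a rough sketch, a script could confirm this by filtering the step listing on the NAME column. The example below parses a copy of the sample output rather than calling squeue, and the variable names are illustrative.

```shell
# Sample text copied from the squeue -s listing above.
squeue_output='STEPID NAME PARTITION USER TIME NODELIST
15.0 hptclsf@113 lsf lsfadmin 0:04 n14
15.1 ping_pong_ring lsf lsfadmin 0:04 n[14-15]
15.2 ping_pong_ring lsf lsfadmin 0:04 n[14-15]'

# Count the steps whose NAME column (field 2) is ping_pong_ring.
step_count=$(printf '%s\n' "$squeue_output" |
    awk '$2 == "ping_pong_ring" { n++ } END { print n + 0 }')

# Collect their step IDs (field 1) into one space-separated string.
step_ids=$(printf '%s\n' "$squeue_output" |
    awk '$2 == "ping_pong_ring" { ids = ids sep $1; sep = " " } END { print ids }')

echo "$step_count steps: $step_ids"
```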

The following command displays the final ten lines of the output file generated by the execution of the application made from mymake:

$ tail 113.out
1000000 bytes: 937.33 MB/sec
[2:n15] ping-pong 1000000 bytes ...
1000000 bytes: 1048.41 usec/msg
1000000 bytes: 953.82 MB/sec
[3:n15] ping-pong 1000000 bytes ...
1000000 bytes: 15308.02 usec/msg
1000000 bytes: 65.33 MB/sec
[3:n15] ping-pong 1000000 bytes ...
1000000 bytes: 15343.11 usec/msg
1000000 bytes: 65.18 MB/sec
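The paired usec/msg and MB/sec figures in this output appear consistent with a simple relation: message size in bytes divided by time per message in microseconds gives bandwidth in MB/sec. The sketch below checks this for one pair of values from the listing. It is an illustration of the arithmetic only, not a statement of how ping_pong_ring computes its figures, and the variable names are illustrative.

```shell
# Values taken from one pair of lines in the output above.
bytes=1000000
usec=15308.02

# Bytes per microsecond equals millions of bytes per second (MB/sec).
mbps=$(awk -v b="$bytes" -v t="$usec" 'BEGIN { printf "%.2f", b / t }')

echo "$mbps MB/sec"
```

The result matches the 65.33 MB/sec line that follows the 15308.02 usec/msg line.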

The following illustrates how an error in the Makefile is reported. This Makefile specifies a nonexistent program:

$ cat mymake
PPR_ARGS=10000
NODES=2
TASKS=4
all: run1 run2 run3
run1:
	mpirun -srun -N ${NODES} -n ${TASKS} ./ping_pong_ring ${PPR_ARGS}
run2:
	mpirun -srun -N ${NODES} -n ${TASKS} ./ping_pong_ring ${PPR_ARGS}
run3:
	mpirun -srun -N ${NODES} -n ${TASKS} ./ping_bogus ${PPR_ARGS} 1

5.5 Submitting Multiple MPI Jobs Across the Same Set of Nodes 59
