Loading and Running Programs
processes, removing any core files if requested (see Page 5-11) and then deallocating the
CPUs.
The application processes are run from the user’s current working directory with the
current limits and group rights. The data and stack size limits may be reduced if RMS
has applied a memory limit to the program.
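A process can check which data and stack limits it actually inherited. The following is a minimal sketch in Python, assuming a Unix system where the standard `resource` module is available. It is not part of RMS, and the helper name `describe_limit` is hypothetical.

```python
import resource

def describe_limit(name, which):
    """Return a one-line description of the soft and hard values of a limit."""
    soft, hard = resource.getrlimit(which)

    def fmt(value):
        # RLIM_INFINITY means that no limit is applied.
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"

if __name__ == "__main__":
    # Report the two limits that the text says may be reduced.
    print(describe_limit("data", resource.RLIMIT_DATA))
    print(describe_limit("stack", resource.RLIMIT_STACK))
```

Comparing the values reported in an interactive shell with those reported by the same script started as an application process would show whether a limit has been reduced.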
During execution, the processes may be suspended at any time by the scheduler to allow
a program with higher priority to run. All of the processes in a parallel program are
suspended together under the gang-scheduling policy used by RMS for parallel programs
(see Chapter 7 (RMS Scheduling) for details). They are restarted together when the
higher priority program has completed.
A parallel program exits when all of its processes have exited. When this happens, the
rmsloader processes reduce the exit status back to the controlling process by
performing a global OR of the exit status of each of the processes. If prun is run with
verbose reporting enabled, a non-zero exit status is accompanied by a message, as shown
in the following example:
$ prun -v myprog
...
myprog: process 0 exited with status 1
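The reduction of exit statuses can be illustrated with a short sketch. It assumes that the global OR is a bitwise OR over the integer exit statuses. The function name `combined_exit_status` and the sample values are hypothetical and are not taken from RMS.

```python
from functools import reduce
from operator import or_

def combined_exit_status(statuses):
    """Bitwise-OR the exit statuses of all processes into one value."""
    return reduce(or_, statuses, 0)

# Hypothetical statuses for a four-process program.
print(combined_exit_status([1, 0, 0, 0]))  # 1: one process failed
print(combined_exit_status([0, 0, 0, 0]))  # 0: every process succeeded
print(combined_exit_status([1, 2, 0, 0]))  # 3: two different failures merged
```

Under this assumption the combined status is zero only if every process exited with zero, and distinct non-zero statuses can merge into a value that no single process returned.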
If the level of reporting is increased with the -vv option, prun provides a commentary
on the resource request. With the -vvv option, rmsloader also outputs information
identifying the activity on each node running the program, as shown in the following
example.
$ prun -vvv myprog
prun: running /home/duncan/myprog
prun: requesting 2 CPUs
prun: starting 2 processes on 2 cpus default memlimit no timelimit
prun: stdio server running
prun: loader 1 starting on atlas1 (10.128.0.7)
prun: loader 0 starting on atlas0 (10.128.0.8)
loader[atlas1]: program description complete
loader[atlas1]: nodes 2 contexts 1 capability type 0xffff8002 entries 2
loader[atlas1]: run process 1 node=5 cntx=244
prun: process 1 is pid 1265674 on atlas1
loader[atlas0]: program description complete
loader[atlas0]: nodes 2 contexts 1 capability type 0xffff8002 entries 2
loader[atlas0]: run process 0 node=4 cntx=244
prun: process 0 is pid 525636 on atlas0
...
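Output of this kind can be post-processed to find where each process ran. The sketch below assumes that the "process N is pid P on NODE" lines always have the form shown in the example above; the pattern and the function name `parse_process_map` are hypothetical and are not part of RMS.

```python
import re

# Pattern for lines such as "prun: process 1 is pid 1265674 on atlas1".
PROCESS_LINE = re.compile(r"^prun: process (\d+) is pid (\d+) on (\S+)$")

def parse_process_map(output):
    """Map each process number to the (pid, node) pair found in the output."""
    placements = {}
    for line in output.splitlines():
        match = PROCESS_LINE.match(line.strip())
        if match:
            number, pid, node = match.groups()
            placements[int(number)] = (int(pid), node)
    return placements

# Lines copied from the example output; other lines are ignored.
sample = """\
prun: process 1 is pid 1265674 on atlas1
loader[atlas0]: program description complete
prun: process 0 is pid 525636 on atlas0
"""
print(parse_process_map(sample))
# {1: (1265674, 'atlas1'), 0: (525636, 'atlas0')}
```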
When the program has exited, the CPUs are deallocated and the scheduler is called to
service the queue of waiting jobs.
3-4 Parallel Programs Under RMS