Time Limit
Jobs normally run to completion or until they are preempted by a higher-priority
request. Each partition may have an associated time limit, which restricts the
amount of time the Partition Manager allows a parallel job to run. When this time
limit expires, the job is sent a SIGXCPU signal. A grace period follows the signal
so that the job can clean up and exit. After this period, the job is killed and the
resource is deallocated. The duration of the grace period is specified in the
attributes table (see Section 10.2.3) and can be set using rcontrol.
Memory Size
The Partition Manager can enforce memory limits that restrict the size of a job. The
default memory limits are designed to prevent memory starvation (a node having free
CPUs but no memory) and to control whether parallel jobs page or not.
7.4 What Happens When a Request is Received
A user’s request for resources, made through the RMS commands prun or allocate,
specifies the following parameters:
cpus The total number of CPUs to be allocated.
nodes The number of nodes across which the CPUs are to be allocated. This
parameter is optional.
base node The identifier of the first node to be allocated. This parameter is
optional.
hwbcast A contiguous range of nodes. This parameter is optional. When a
contiguous range of nodes is allocated to a job, messages can be
broadcast in hardware. This offers advantages of speed over a
software implementation if the job makes use of broadcast operations.
memory The amount of memory required per CPU. This parameter is optional
and can be set through the environment variable RMS_MEMLIMIT.
Jobs with low memory requirements may be scheduled sooner if
they make these requirements explicit.
time limit The length of time for which the CPUs are required. This parameter is
optional and can be set through the environment variable
RMS_TIMELIMIT.
samecpus The same set of CPUs on each node. This parameter is optional.
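The parameter list above can be modelled as a small data structure, which makes it explicit that only cpus is required and that the memory and time limits travel through the environment. This is an illustrative sketch only: ResourceRequest and its methods are hypothetical and not part of RMS. The values for RMS_MEMLIMIT and RMS_TIMELIMIT are passed through unchanged because their units are not stated in this section. The check that nodes does not exceed cpus is an assumption that every allocated node contributes at least one CPU.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ResourceRequest:
    """Hypothetical model of a resource request (not an RMS API)."""
    cpus: int                          # total CPUs; the only required field
    nodes: Optional[int] = None        # number of nodes to spread across
    base_node: Optional[int] = None    # identifier of the first node
    hwbcast: bool = False              # request a contiguous range of nodes
    memory: Optional[str] = None       # per-CPU memory, passed through as-is
    time_limit: Optional[str] = None   # time required, passed through as-is
    samecpus: bool = False             # same set of CPUs on each node

    def validate(self) -> None:
        """Reject requests that cannot be satisfied under the stated assumptions."""
        if self.cpus < 1:
            raise ValueError("cpus must be at least 1")
        # Assumption: each allocated node contributes at least one CPU.
        if self.nodes is not None and not 1 <= self.nodes <= self.cpus:
            raise ValueError("nodes must be between 1 and cpus")

    def environment(self, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
        """Return a copy of `base` with the optional RMS limits added."""
        env = dict(base or {})
        if self.memory is not None:
            env["RMS_MEMLIMIT"] = self.memory
        if self.time_limit is not None:
            env["RMS_TIMELIMIT"] = self.time_limit
        return env
```

The resulting dictionary could be handed to a process launcher as the environment for prun or allocate. The command-line options for the remaining parameters are not covered in this section, so they are left out of the sketch.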