If you specify a script at the end of the srun command line (not as an argument to -A), the spawned shell executes that script using the allocated resources (interactively, without a queue). See the -b option for script requirements.

If you specify no script, you can then execute other instances of srun interactively, within the spawned subshell, to run multiple parallel jobs on the resources that you allocated to the subshell. Resources (such as nodes) will only be freed for other jobs when you terminate the subshell.
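As a rough illustration of this workflow, the sketch below prints (rather than runs) the commands of a typical interactive session. The commands are only printed because srun -A spawns a subshell in which the remaining lines would be typed by hand. The node and process counts and the program names ./sim_a and ./sim_b are hypothetical placeholders, and show_session is a made-up helper, not part of SLURM.

```shell
#!/bin/sh
# Dry-run sketch: print the commands of an interactive allocation session.
# Counts and program names are made-up placeholders, not from the manual.
show_session() {
    cat <<'EOF'
srun -A -N4
srun -n 4 ./sim_a
srun -n 4 ./sim_b
exit
EOF
}

show_session
```

In such a session, the first line would allocate the nodes and spawn the subshell, the two middle lines would run separate parallel jobs on that allocation, and exit would end the subshell and free the nodes for other jobs.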

-a=jobid (--attach=jobid)

The -a=jobid option attaches (or reattaches) your current srun session to the already running job whose SLURM ID is jobid. The job to which you attach must have its resources managed by SLURM, but it can be either interactive ("allocated," started with -A) or batch (started with -b). This option allows you to monitor or intervene in previously started srun jobs. You cannot use -a with -b or -A. Because the running job to which you attach already has its resources specified, you cannot use -a with -n, -N, or -c. You can only attach to jobs for which you are the authorized owner.

By default, -a attaches to the designated job read-only. stdout and stderr are copied to the attaching srun, just as if the current srun session had started the job. However, signals are not forwarded to the remote processes (and a single Ctrl/C will detach the read-only srun from the job).

If you use -j (--join) or -s (--steal) along with -a, your srun session joins the running job and can forward signals to it as well as receive stdout and stderr from it. If you join a SLURM batch (-b) job, you can send signals to its batch script. Join (-j) does not forward stdin, but steal (-s, which closes other open sessions with the job) forwards stdin as well as signals.
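The combination rules for -a can be summarized as a small pre-flight check. This is a minimal sketch, assuming only the restrictions stated in this section; check_attach_opts is a hypothetical helper written for illustration and is not part of SLURM, and job ID 6543 is invented.

```shell
#!/bin/sh
# Sketch: check an srun option list against the attach rules in this section.
# check_attach_opts is a hypothetical helper, not a SLURM command.
# Returns 0 if the combination is allowed, 1 otherwise.
check_attach_opts() {
    has_attach=0
    has_conflict=0
    has_join_steal=0
    for opt in "$@"; do
        case "$opt" in
            -a=*|--attach=*)                        has_attach=1 ;;
            -j|--join|-s|--steal)                   has_join_steal=1 ;;
            -b|-A|-n*|-N*|-c*|--nprocs=*|--nodes=*) has_conflict=1 ;;
        esac
    done
    if [ "$has_attach" -eq 1 ] && [ "$has_conflict" -eq 1 ]; then
        echo "error: -a cannot be combined with -b, -A, -n, -N, or -c" >&2
        return 1
    fi
    if [ "$has_join_steal" -eq 1 ] && [ "$has_attach" -eq 0 ]; then
        echo "error: -j and -s are used only together with -a" >&2
        return 1
    fi
    return 0
}

# Example (6543 is a made-up job ID):
check_attach_opts -a=6543 -j && echo "ok to attach and join"
```

The check only inspects option names; it does not verify that the job exists or that you own it, which srun itself would have to decide.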

-j (--join)

The -j option joins a running SLURM job (it is always used together with the -a option, which specifies the jobid). It not only duplicates stdout and stderr to the attaching srun session but also forwards signals to the job’s script or processes.

-s (--steal)

The -s option steals all connections to a running SLURM job (it is always used together with the -a option, which specifies the jobid). --steal closes any open sessions with the specified job, then copies stdout and stderr to the attaching srun session, and also forwards both signals and stdin to the job’s script or processes.
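To make the three attach modes concrete, the following sketch prints the command line for each one, using the -a=jobid syntax shown in this section. attach_cmd is a hypothetical helper and 6543 is an invented job ID; nothing is executed.

```shell
#!/bin/sh
# Sketch: print the srun command line for each attach mode.
# attach_cmd is a hypothetical helper; it does not run srun.
attach_cmd() {
    jobid=$1
    mode=${2:-readonly}
    case "$mode" in
        readonly) echo "srun -a=$jobid" ;;      # receives stdout/stderr only
        join)     echo "srun -a=$jobid -j" ;;   # also forwards signals
        steal)    echo "srun -a=$jobid -s" ;;   # closes other sessions; forwards signals and stdin
        *)        echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

attach_cmd 6543          # prints: srun -a=6543
attach_cmd 6543 join     # prints: srun -a=6543 -j
attach_cmd 6543 steal    # prints: srun -a=6543 -s
```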

6.4.4 srun Resource-Allocation Options

The srun options assign compute resources to your parallel SLURM-managed job. These options can be used alone or in combination. Also, refer to the other srun options that can affect node management for your job, especially the control options and constraint options.

-n procs (--nprocs=procs)

The -n procs option requests that srun execute procs processes. To control how these processes are distributed among nodes and CPUs, combine -n with -c or -N as explained below (default is one process per node).

-N n (--nodes=n)

The -N n option allocates at least n nodes to this job, where n may be one of the following:

a specific node count (such as -N16)

a node count range (such as -N14-18)
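The two forms listed so far can be told apart mechanically. The sketch below is a hypothetical helper (parse_nodes is not part of SLURM) that accepts only a plain count or a low-high range and prints the lower and upper bound as written in the specification; it makes no claim about any other forms the option may accept.

```shell
#!/bin/sh
# Sketch: split a node-count spec (e.g. 16 or 14-18) into "LOW HIGH".
# parse_nodes is a hypothetical helper; it handles only the two forms above.
parse_nodes() {
    case "$1" in
        ''|*[!0-9-]*) return 1 ;;               # empty, or contains other characters
        *-*-*|-*|*-)  return 1 ;;               # malformed range
        *-*)          min=${1%-*}; max=${1#*-} ;;
        *)            min=$1; max=$1 ;;
    esac
    [ "$min" -le "$max" ] || return 1           # reject reversed ranges
    echo "$min $max"
}

parse_nodes 16       # prints: 16 16
parse_nodes 14-18    # prints: 14 18
```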

HP XC System 2.x Software manual Srun Resource-Allocation Options