4.3 Examining Nodes and Partitions Before Running Jobs

Before launching an application, you can check the availability and status of the system's nodes and partitions. This information lets you request resources for the job that match what is actually available on the system.

When invoked with no options, the SLURM sinfo command returns a summary of each partition: its availability, time limit, node count, node state, and node list:

$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      4 down* n[12-15]
slurm*    up    infinite      2 idle  n[10-11]

The previous sinfo output shows that there are two partitions on the system:

One for LSF jobs

One for SLURM jobs

The asterisk in the PARTITION column indicates the default partition. An asterisk in the STATE column indicates nodes that are currently not responding.

See Chapter 9 “Using SLURM” for information about using the sinfo command. The SLURM sinfo manpage also provides detailed information about the sinfo command.
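The check described in this section can also be scripted. The following is a minimal sketch, assuming the six-column default layout shown in the sample output above; the function name count_idle_nodes is hypothetical and is not part of SLURM. It reads sinfo output on standard input, strips the default-partition asterisk from the partition name, and totals the NODES column for rows whose STATE is exactly idle. The sample output is fed in from a here-document so that the sketch can be tried on a machine without SLURM.

```shell
# Sum the NODES column for rows in the given partition whose STATE is "idle".
# Reads sinfo's default six-column output on standard input:
#   PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
count_idle_nodes() {
    awk -v part="$1" '
        NR == 1 { next }                 # skip the header row
        {
            name = $1
            sub(/\*$/, "", name)         # drop the default-partition marker
            if (name == part && $5 == "idle") total += $4
        }
        END { print total + 0 }          # print 0 when nothing matched
    '
}

# Demonstration using the sample output from this section.
count_idle_nodes slurm <<'EOF'
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      4 down* n[12-15]
slurm*    up    infinite      2 idle  n[10-11]
EOF
```

On a live system the equivalent call would be sinfo | count_idle_nodes slurm. Because the comparison is against the exact string idle, nodes reported as down* (not responding) are never counted.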

4.4 Interrupting a Job

You can interrupt a job launched by the srun command by entering one or more Ctrl/C key sequences, which send a signal to the command. Signals sent to the srun command are automatically forwarded to the tasks that it is controlling.

A single Ctrl/C key sequence reports the state of all tasks associated with the srun command. If the Ctrl/C key sequence is entered twice within one second, the associated SIGINT signal is sent to all tasks. If a third Ctrl/C key sequence is entered, the job is terminated without waiting for remote tasks to exit.

The Ctrl/Z key sequence is ignored.
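Because SIGINT is forwarded to the tasks, a task can install its own handler to clean up before it exits. The following is a minimal sketch of such a task written as a shell script; the scratch file name, the interrupted flag, and the on_interrupt function are all assumptions made for illustration and do not come from the HP XC or SLURM software.

```shell
# Hypothetical scratch file; the name is an assumption for illustration.
scratch="${TMPDIR:-/tmp}/task_scratch.$$"

# Handler: remove the scratch file and record that an interrupt arrived.
interrupted=0
on_interrupt() {
    rm -f "$scratch"
    interrupted=1
}

# Install the handler for SIGINT, the signal that is forwarded to the
# tasks after two Ctrl/C key sequences within one second.
trap on_interrupt INT

# Create the scratch file that the handler is responsible for removing.
: > "$scratch"

# The task's real work would follow here, checking $interrupted
# periodically and exiting once it is set to 1.
```

A handler like this gets a chance to run only if the job is not terminated first: a third Ctrl/C key sequence ends the job without waiting for remote tasks to exit.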

4.5 Setting Debugging Options

In general, the debugging information that most debuggers need can be produced by supplying the -g switch to the compiler. For more specific information about debugging options, see the documentation and manpages associated with your compiler.
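As an illustration, the following sketch writes a trivial serial C program to a temporary directory and compiles it with the -g option. The compiler command is an assumption: the sketch uses the value of the CC variable if it is set and falls back to cc otherwise, so substitute the compiler installed on your system. The file names hello.c and hello are likewise hypothetical.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# A trivial serial program to compile.
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello\n");
    return 0;
}
EOF

# CC is an assumption: substitute the compiler installed on your system.
CC=${CC:-cc}

# -g asks the compiler to include the symbol information debuggers need.
# The compile step is skipped if the chosen compiler is not on the PATH.
if command -v "$CC" >/dev/null 2>&1; then
    "$CC" -g -o "$workdir/hello" "$workdir/hello.c"
fi
```

Other debugging-related switches vary from compiler to compiler, so check the compiler's manpage before relying on anything beyond -g.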

4.6 Developing Serial Applications

This section describes how to build and run serial applications in the HP XC environment. The following topics are covered:

“Serial Application Build Environment” (page 42) describes the serial application programming model.

“Building Serial Applications” (page 42) discusses how to build serial applications.

For further information about developing serial applications, see the following sections:

“Debugging Serial Applications” (page 63) describes how to debug serial applications.

“Launching Jobs with the srun Command” (page 81) describes how to launch applications with the srun command.

“Building and Running a Serial Application” (page 115) provides examples of serial applications.
