Documentation CD contains XC LSF manuals from Platform Computing. LSF manpages are available on the HP XC system.

SLURM commands

HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling. Standard SLURM commands are available through the command line. SLURM functionality is described in Chapter 8, "Using SLURM". Descriptions of SLURM commands are available in the SLURM manpages. Invoke the man command with the SLURM command name to access them.
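
As a minimal sketch of looking up SLURM manpages, the following helper reports which commands have a manpage installed. The function name check_manpages is hypothetical, and the example assumes that the local man command supports the -w option (print the manpage location), which may not hold on every system.

```shell
# Hypothetical helper: report whether a manpage exists for each command name.
# Assumes "man -w NAME" succeeds only when a manpage for NAME is installed.
check_manpages() {
    for cmd in "$@"; do
        if man -w "$cmd" >/dev/null 2>&1; then
            echo "$cmd: manpage available (view with: man $cmd)"
        else
            echo "$cmd: no manpage found on this system"
        fi
    done
}

# Example SLURM command names; each can then be read with "man NAME".
check_manpages srun squeue sinfo scancel
```

On a system where the SLURM manpages are installed, entering man srun then displays the description of the srun command.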

 

HP-MPI commands

You can run standard HP-MPI commands from the command line. Descriptions of HP-MPI commands are available in the HP-MPI documentation, which is supplied with the HP XC system software.

Modules commands

The HP XC system uses standard Modules commands to load and unload modulefiles, which are used to configure and modify the user environment. Modules commands are described in "Overview of Modules".
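
The following sketch illustrates a typical sequence of standard Modules commands. The wrapper function try_modules and the modulefile name mpi are hypothetical and are used only for illustration; the modulefiles actually available depend on how the system is configured.

```shell
# Hypothetical wrapper: walk through the basic Modules commands for one
# modulefile. Does nothing if the module command is not set up in this shell.
try_modules() {
    name="$1"
    if command -v module >/dev/null 2>&1; then
        module avail             # list the modulefiles that can be loaded
        module load "$name"      # apply the modulefile to the environment
        module list              # show the modulefiles currently loaded
        module unload "$name"    # undo the changes made by the modulefile
    else
        echo "module command not available in this shell"
    fi
}

# "mpi" is a placeholder modulefile name, not one the system is known to provide.
try_modules mpi
```

Loading a modulefile typically adjusts variables such as PATH and MANPATH, and unloading it reverses those changes.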

Application Development Environment

The HP XC system provides an environment that enables developing, building, and running applications using multiple nodes with multiple cores. These applications can range from parallel applications using many cores to serial applications using a single core.

Parallel Applications

The HP XC parallel application development environment allows parallel application processes to be started and stopped together on a large number of application processors, and provides the I/O and process control structures needed to manage these kinds of applications.

Full details and examples of how to build, run, debug, and troubleshoot parallel applications are provided in "Developing Parallel Applications".

Serial Applications

You can build and run serial applications under the HP XC development environment. A serial application is a command or application that does not use any form of parallelism.

Full details and examples of how to build, run, debug, and troubleshoot serial applications are provided in "Building Serial Applications".

Run-Time Environment

This section describes LSF-HPC, SLURM, and HP-MPI, and how these components work together to provide the HP XC run-time environment. LSF-HPC focuses on scheduling (and managing the workload) and SLURM provides efficient and scalable resource management of the compute nodes.

An alternative HP XC environment features standard LSF, without interaction with the SLURM resource manager.

SLURM

Simple Linux Utility for Resource Management (SLURM) is a resource management system that is integrated into the HP XC system. SLURM is suitable for use on large and small Linux clusters. It was developed by Lawrence Livermore National Laboratory and Linux NetworX. As a resource manager, SLURM allocates exclusive or nonexclusive access to resources (application and compute nodes) for users to perform work, and provides a framework to start, execute, and monitor work (normally a parallel job) on the set of allocated nodes.

A SLURM system consists of two daemons, one configuration file, and a set of commands and APIs. The central controller daemon, slurmctld, maintains the global state and directs operations. A slurmd daemon is deployed to each computing node and responds to job-related requests, such as launching, signaling, and terminating jobs. End users and system software (such as LSF-HPC) communicate with SLURM by means of commands or APIs, for example, to allocate resources, launch parallel jobs on allocated resources, and terminate running jobs.
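
The operations listed above can be sketched with standard SLURM commands. This is an illustration under stated assumptions, not a recommended procedure: the helper run_if_present is hypothetical, the task count of 4 is arbitrary, and whether users may invoke SLURM commands directly (rather than through LSF-HPC) depends on the site configuration.

```shell
# Hypothetical helper: run a command only if it is installed, so that the
# sketch can be tried safely on a system without SLURM.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped (not installed): $*"
    fi
}

run_if_present sinfo                # query the controller for node and partition state
run_if_present srun -n 4 hostname   # allocate resources and launch 4 tasks running hostname
run_if_present squeue               # list pending and running jobs
# To terminate a running job, pass its job ID to scancel, for example:
#   scancel JOBID                   # JOBID is a placeholder for an ID shown by squeue
```

Each of these commands contacts the slurmctld daemon, which in turn directs the slurmd daemons on the allocated nodes.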

SLURM groups compute nodes (the nodes where jobs are run) together into “partitions”. The HP XC system can have one or several partitions. When HP XC is installed, a single partition of compute nodes is created
