Table of Contents
1 Overview of the User Environment
2 Using the System
3 Configuring Your Environment with Modulefiles
4 Developing Applications
5 Submitting Jobs
6 Debugging Applications
7 Tuning Applications
8 Using SLURM
9 Using LSF
10 Advanced Topics
List of Figures
List of Tables
List of Examples
About This Document
Intended Audience
Document Organization
HP XC Information
For More Information
Supplementary Information
man pages
Related Information
Related Linux Web Sites
• http://www.redhat.com
• http://www.linux.org/docs/index.html
• http://www.linuxheadquarters.com
• http://linuxvirtualserver.org
Typographic Conventions
HP Encourages Your Comments
1 Overview of the User Environment
System Architecture
Table 1-1 Determining the Node Platform
processor
vendor_id
Node Specialization
head node of the HP XC system
client nodes, for tasks such as logging into the system or running jobs
File System
implemented as ioctls
File System Layout
• HP XC-specific software is located in /opt/hptc
Determining System Configuration Information
User Environment
Application Development Environment
Run-Time Environment
Load Sharing Facility (LSF-HPC)
Standard LSF
How LSF-HPC and SLURM Interact
-srun
JOB_STARTER
Components, Tools, Compilers, Libraries, and Debuggers
2 Using the System
Logging In to the System
Overview of Launching and Managing Jobs
Introduction
Environment (page 24)
JobID
How LSF-HPC and SLURM Launch and Manage a Job (page 73)
Performing Other Common User Tasks
Getting System Help and Information
3 Configuring Your Environment with Modulefiles
Overview of Modules
Supplied Modulefiles
Modulefiles Automatically Loaded on the System
Viewing Available Modulefiles
Viewing Loaded Modulefiles
Loading a Modulefile
Unloading a Modulefile
Modulefile Conflicts
Creating a Modulefile
Viewing Modulefile-Specific Help
4 Developing Applications
Application Development Environment Overview
Compilers
Examining Nodes and Partitions Before Running Jobs
Interrupting a Job
Setting Debugging Options
Developing Serial Applications
Developing Parallel Applications
Pthreads
$ mpicc object1.o ... -pthread -o myapp.exe
Quadrics SHMEM
$ gcc -o shping shping.c -lshmem -lelan
MPI Library
Building Parallel Applications
Compiling and Linking Non-MPI Applications
Compiling and Linking HP-MPI Applications
Developing Libraries
<link command> <32-bit> -L/opt/mypackage/lib/i686 -lmystuff
<link command> <64-bit> -L/opt/mypackage/lib/x86_64 -lmystuff
5 Submitting Jobs
Overview of Job Submission
Submitting a Serial Job Using Standard LSF
Submitting a Serial Job Using LSF-HPC
$ bsub -I hostname
Submitting a Serial Job Through SLURM only
$ cc hw_hostname.c -o hw_hostname
$ ./hw_hostname
$ srun ./hw_hostname
$ srun -n4 ./hw_hostname
Submitting a Non-MPI Parallel Job
Submitting a Parallel Job That Uses the HP-MPI Message Passing Interface
Submitting a Batch Job or Job Script
$ cat myscript.sh
#!/bin/sh
$ bsub -I -n4 myscript.sh
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
Running Preexecution Programs
/opt/hptc/bin/srun -N1 my_pre_exec
6 Debugging Applications
Debugging Serial Applications
Debugging Parallel Applications
SSH and TotalView
Setting Up TotalView
module load mpi
module load totalview
Using TotalView with SLURM
$ srun -Nx -A
Using TotalView with LSF-HPC
$ bsub -nx -ext "SLURM[nodes=x]" -Is /usr/bin/xterm
Setting TotalView Preferences
$ totalview
%C %R -n "%B/tvdsvr -working_directory %D -callback %L -set_pw %P -verbosity %V %F"
Debugging Running Applications
$ mpirun -srun -n2 Psimple
Exiting TotalView
$ squeue
$ scancel --user username
7 Tuning Applications
Using the Intel Trace Collector and Intel Trace Analyzer
Running a Program – Intel Trace Collector and HP-MPI
Visualizing Data – Intel Trace Analyzer and HP-MPI
8 Using SLURM
Introduction to SLURM
SLURM Utilities
Launching Jobs with the srun Command
Monitoring Jobs with the squeue Command
Terminating Jobs with the scancel Command
Getting System Information with the sinfo Command
Job Accounting
Fault Tolerance
Security
9 Using LSF
Using Standard LSF on an HP XC System
Using LSF-HPC
• Introduction to LSF-HPC in the HP XC Environment (page 68)
• Determining the LSF Execution Host (page 75)
Introduction to LSF-HPC in the HP XC Environment
Overview of LSF-HPC
• Request contiguous nodes
• Execute only one task per node
• Request nodes with specific features
Differences Between LSF-HPC and Standard LSF
"Notes About Using LSF-HPC in the HP XC Environment"
By LSF standards, the HP XC system is a single host. Therefore, all LSF
$ ssh n15 lshosts
[lshosts output for compute node n15: resource LSF-SLURM, type UNKNOWN, server yes, maxmem 2007M]
HP XC Compute Node Resource Support
$ bsub -n10 -I srun hostname
$ bsub -n10 -ext "SLURM[nodes=10]" -I srun hostname
$ bsub -n10 -ext "SLURM[nodes=10;exclude=n16]" -I srun hostname
$ bsub -n10 -ext "SLURM[constraint=dualcore]" -I srun hostname
$ bsub -n10 -ext "SLURM[nodelist=n[1-10]]" srun hostname
How LSF-HPC and SLURM Launch and Manage a Job
User
$ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
Job Startup and Job Control
Preemption
The following example shows the output from the lshosts command:
[lshosts output: HOST_NAME lsfhost.loc, type SLINUX6, model Itanium2, maxmem 3456M]
Submitting Jobs
Chapter 5. Submitting Jobs
The basic synopsis of the bsub command is:
bsub [bsub-options] jobname [job-options]
jobname
about running jobs. Refer to "Submitting a Batch Job or Job Script" for information about running scripts.
bsub -n num-procs [bsub-options] srun [srun-options] jobname [job-arguments]
"Submitting a Non-MPI Parallel Job"
Job <70> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on lsfhost.localdomain>>
n6
Example 9-2 shows one way to submit a parallel job to run one task per node
Job <71> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
Job <72> is submitted to default queue <normal>.
Job Allocation Information for a Running Job
Job Allocation Information for a Finished Job
Example 9-5 Using the bjobs Command (Long Output)
Job <24>, User <msmith>, Project <default>, Status <RUN>
date and time stamp: Submitted from host <n16>, CWD <$HOME>
Viewing the Historical Information for a Job
Example 9-6 Using the bhist Command (Short Output)
Translating SLURM and LSF-HPC JOBIDs
$ bsub -o %J.out -n8 sleep
$ bjobs -l 99 | grep slurm
$ bhist -l 99 | grep slurm
$ scontrol show job | grep Name
Working Interactively Within an LSF-HPC Allocation
$ bsub -I -n4 -ext "SLURM[nodes=4]" /bin/bash
<<Waiting for dispatch ...>>
<<Starting on lsfhost.localdomain>>
$ bjobs -l 124 | grep slurm
$ unset SLURM_JOBID
Section (page 48)
Example 9-8Launching an Interactive MPI Job
Example 9-9Launching an Interactive MPI Job on All Cores in the Allocation
This example assumes 2 cores per node
LSF-HPC Equivalents of SLURM srun Options
Table 9-2 LSF-HPC Equivalents of SLURM srun Options
10 Advanced Topics
Enabling Remote Execution with OpenSSH
Running an X Terminal Session from a Remote Node
$ hostname
$ host mymachine
Step 2. Logging In to the HP XC System
Step 3. Running an X Terminal Session Using SLURM
$ srun -N1 xterm -display 14.26.206.134:0.0
Step 4. Running an X Terminal Session Using LSF-HPC
Using the GNU Parallel Make Capability
$ cd subdir; srun -n1 -N1 $(MAKE) -j4
$ make PREFIX='srun -n1 -N1' MAKE_J='-j4'
Local Disks on Compute Nodes
I/O Performance Considerations
Communication Between Nodes
Appendix A Examples
Building and Running a Serial Application
Launching a Serial Interactive Shell Through LSF-HPC
Running LSF-HPC Jobs with a SLURM Allocation Request
Launching a Parallel Interactive Shell Through LSF-HPC
Note the output from the bjobs command:
Examine the running job's information:
$ bhist -l 124
Submitting a Simple Job Script with LSF-HPC
Submitting an Interactive Job with LSF-HPC
Submitting an HP-MPI Job with LSF-HPC
View the running job:
$ bjobs -l 1009
View the finished job:
$ bhist -l 1009
Using a Resource Requirements String in an LSF-HPC Command
Glossary
FCFS
First-come, first-served; jobs are dispatched in the order they were submitted to the queue.
first-come, first-served
See FCFS.
Linux Virtual Server
See LVS.
load file
single command
Network Information Services
See NIS.
NIS
SMP
power) available per unit of space
ssh
network
standard LSF
Index