June Product Version
Hewlett-Packard Company Palo Alto, California
 Contents
 Developing Applications
 Using Slurm
Tuning Applications
Debugging Applications
 Using LSF
Using HP-MPI
 Using HP Mlib
 Examples
Advanced Topics
Glossary
Index
Examples
 Figures
Tables
 Intended Audience
About This Document
Document Organization
 Linux Administration Handbook
HP XC Information
QuickSpecs for HP XC System Software
HP XC Program Development Environment
 For More Information
Supplementary Information
HP Message Passing Interface
HP Mathematical Library
 Manpages
http://supermon.sourceforge.net
http://systemimager.org
http://sourceforge.net/projects/modules
 http://linuxvirtualserver.org
Related Information
http://www-unix.mcs.anl.gov/mpi
 Typographical Conventions
# cd /opt/hptc/config/sbin
Bold text
 HP Encourages Your Comments
Ctrl/x
 Overview of the User Environment
System Architecture
Operating System
Node Specialization
 Storage and I/O
File System
SAN Storage
Local Storage
 System Interconnect Network
File System Layout
 Commands
User Environment
Network Address Translation (NAT)
LVS
 Parallel Applications
Application Development Environment
Serial Applications
 Run-Time Environment
Slurm
Load Sharing Facility (LSF-HPC)
How LSF-HPC and Slurm Interact
 Components, Tools, Compilers, Libraries, and Debuggers
HP-MPI
 Using the System
Configuring Your Environment with Modulefiles
LVS Login Routing
Using ssh to Log In
 Supplied Modulefiles
Modulefile Sets the HP XC User Environment
 Modulefiles Automatically Loaded on the System
Viewing Available Modulefiles
Viewing Loaded Modulefiles
Loading a Modulefile
 Automatically Loading a Modulefile at Login
Unloading a Modulefile
Modulefile Conflicts
Loading a Modulefile for the Current Session
 Creating a Modulefile
Viewing Modulefile-Specific Help
$ module load modules
$ man modulefile
$ module help totalview
 Launching and Managing Jobs Quick Start
Introduction
Getting Information About Queues
Getting Information About Resources
 Getting Information About the System’s Partitions
Launching Jobs
Submitting a Serial Job
Example 2-1 Submitting a Serial Job
 Using Slurm Options with the LSF External Scheduler
Submitting a Non-MPI Parallel Job
Example 2-2 Submitting a Non-MPI Parallel Job
$ bsub -n4 -I srun hostname
 Example 2-4 Running an MPI Job with LSF
Submitting an MPI Job
$ bsub -n4 mpirun -srun ./helloworld
 Example 2-6 Submitting a Job Script
Submitting a Batch Job or Job Script
$ bsub -I -n4 myjobscript.sh
 Performing Other Common User Tasks
Getting System Help and Information
 $ man sinfo
$ man -k keyword
 Developing Applications
Overview
 Using Compilers
Standard Linux Compilers
Intel Compilers
PGI Compilers
 Setting Debugging Options
Checking Nodes and Partitions Before Running Jobs
Interrupting a Job
Developing Serial Applications
 Using Mlib in Serial Applications
Developing Parallel Applications
Serial Application Build Environment
Building Serial Applications
 Parallel Application Build Environment
Modulefiles
HP-MPI
OpenMP
 Quadrics Shmem
Mlib Math Library
MPI Library
$ mpicc object1.o ... -pthread -o myapp.exe
 Intel Fortran and C/C++ Compilers
PGI Fortran and C/C++ Compilers
GNU C and C++ Compilers
GNU Parallel Make
 Reserved Symbols and Names
Building Parallel Applications
Compiling and Linking Non-MPI Applications
Compiling and Linking HP-MPI Applications
 Designing Libraries for XC4000
Developing Libraries
$ mpicc -c -g foo.c
 Using the GNU Parallel Make Capability
Advanced Topics
Example 3-1 Directory Structure
Example 3-2 Recommended Directory Structure
 $ cd subdir; srun $MAKE
$ cd subdir; srun -n1 -N1 $MAKE -j4
 Example Procedure
 $ make PREFIX='srun -n1 -N1' MAKEJ='-j4'
 Local Disks on Compute Nodes
I/O Performance Considerations
Shared File View
Private File View
 Communication Between Nodes
 Debugging Applications
Debugging Serial Applications
Debugging Parallel Applications
TotalView
 Debugging with TotalView
Setting Up TotalView
SSH and TotalView
 Using TotalView with Slurm
Using TotalView with LSF-HPC
$ srun -Nx -A
$ mpirun -tv -srun application
$ bsub -nx -ext "SLURM[nodes=x]" \
    -Is /usr/bin/xterm
 Starting TotalView for the First Time
$ totalview
 TotalView Preferences Window
 Preferences window, click on the Launch Strings tab
 Debugging Applications
 $ mpicc -g -o Psimple simple.c -lm
Debugging an Application
$ mpirun -tv -srun -n2 ./Psimple
 TotalView Process Window Example
 Debugging Running Applications
$ mpirun -srun -n2 Psimple
 Exiting TotalView
$ scancel --user username
$ squeue
 Tuning Applications
Using the Intel Trace Collector/Analyzer
Building a Program: Intel Trace Collector and HP-MPI
Example
 Visualizing Data: Intel Trace Analyzer and HP-MPI
Running a Program: Intel Trace Collector and HP-MPI
Example: Running the vtjacobic Example Program
 Using Slurm
Slurm Commands
Command Function
Introduction
 Accessing the Slurm Manpages
Launching Jobs with the srun Command
srun Roles and Modes
Example 6-1 Simple Launch of a Serial Program
 srun Roles
srun Modes
 srun Signal Handling
srun Run-Mode Options
Batch
Allocate
 srun Resource-Allocation Options
 -c, --cpus-per-task=cpt
-p, --partition=part
-t, --time=minutes
-T, --threads=nthreads
 srun Control Options
 srun I/O Options
 -o, --output=mode
-i, --input=mode
-e, --error=mode
-l, --label
 srun Constraint Options
-C, --constraint=clist
 --contiguous=yes|no
--mem=size
--mincpus=n
--vmem=size
 Monitoring Jobs with the squeue Command
Using srun with HP-MPI
Using srun with LSF
srun Environment Variables
 Killing Jobs with the scancel Command
Getting System Information with the sinfo Command
 Fault Tolerance
Job Accounting
Security
$ sinfo -R
 Introduction to LSF in the HP XC Environment
Using LSF
Overview of LSF
 Topology Support
nodelist=list-of-nodes
exclude=list-of-nodes
 $ bsub -n 10 -ext "SLURM[nodes=10]" srun myapp
$ bsub -n 10 -ext "SLURM[nodes=10;exclude=n16]" srun myapp
 $ bqueues -l normal | grep JOB_STARTER
How LSF and Slurm Launch and Manage a Job
$ bsub -Is hostname
 How LSF-HPC and Slurm Launch and Manage a Job
$ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
 Differences Between LSF on HP XC and Standard LSF
 Determining Execution Host
Determining Available System Resources
Getting Status of LSF
Job Startup and Job Control
 Getting Information About LSF-HPC Execution Host Node
Getting Host Load Information
 Submitting Jobs
Checking LSF System Queues
Getting Information About the lsf Partition
$ sinfo -p lsf
 Summary of the LSF bsub Command Format
bsub bsuboptions jobname joboptions
 LSF-SLURM External Scheduler
Slurm Arguments Function
 Starting on lsfhost.localdomain n6
 Submitting a Serial Job
Submitting a Job in Parallel
Submitting an HP-MPI Job
Example 7-5 Submitting an Interactive Serial Job
 Example 7-6 Submitting an HP-MPI Job
Submitting a Batch Job or Job Script
$ bsub -n4 -I mpirun -srun ./helloworld
 Examples
Example 7-8 Submitting a Batch Job Script
$ bsub -n4 -I ./myscript.sh
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
 $ bsub -n4 -I ./myscript.sh -n8 -O
Submitting a Job from a Non-HP XC Host
type=SLINUX64
 Getting Information About Jobs
Getting Job Allocation Information
Job Allocation Information for a Running Job
$ bsub -R type=SLINUX64 -n4 -I srun hostname
 Checking Status of a Job
Example 7-13 Using the bjobs Command Short Output
Job Allocation Information for a Finished Job
 Example 7-14 Using the bjobs Command Long Output
Example 7-15 Using the bhist Command Short Output
Output Provided by the bhist Command
Viewing a Job’s Historical Information
 Working Interactively Within an LSF-HPC Allocation
Example 7-16 Using the bhist Command Long Output
Submitting an Interactive Job to Launch the xterm Program
 Example 7-17 View Your Environment
Example 7-18 View Your Allocation in Slurm
Example 7-19 View Your Running Job in LSF
Example 7-20 View Job Details in LSF
 Submitting an Interactive Job to Launch a Shell
Example 7-21 Running Jobs from an xterm Window
Example 7-22 Submitting an Interactive Shell Program
$ hostname
$ srun hostname
$ srun -n2 hostname
 LSF Equivalents of Slurm srun Options
$ srun hostname
n1
$ exit
Srun Option Description LSF Equivalent
 Control the parallel job
 Using HP-MPI
 Setting Environment Variables
HP-MPI Directory Structure
Compiling and Running Applications
Building and Running an Example Application
 Example Application helloworld
Building and Running helloworld
$ mpicc -o helloworld $MPI_ROOT/help/helloworld.c
$ $MPI_ROOT/bin/mpirun -srun -n4 helloworld
 Launching MPI Jobs
HP-MPI options allowed with -srun
 Creating Subshells and Launching Jobsteps
System Interconnect Selection
$ mpirun -srun -n6 -O -N2 -m cyclic ./a.out host1 rank1
$ mpirun -srun -n4 -N2 -O -m cyclic ./a.out host1 rank1
 Using LSF and HP-MPI
$ mpirun -subnet 192.168.1.1 -prot -srun -n4 ./a.out
 $ /usr/sbin/ifconfig -a
System Interconnect Support
MPI Versioning
Example 8-5 Allocating 12 Processors on 6 Nodes
 Truncated Messages
32-Bit Builds on XC4000
Allowing Windows to Use Exclusive Locks
 mpirun Command Options
$ mpirun -TCP -srun -N8 ./a.out
 Environment Variables
$MPI_ROOT/bin/mpirun -v -prot -np 2 /path/to/program.x
 export MPI_PHYSICAL_MEMORY=1048576
export MPI_PIN_PERCENTAGE=30
export MPI_PAGE_ALIGN_MEM=1
export MPI_MAX_WINDOW=10
 MPICH Object Compatibility
export MPI_USE_LIBELAN=0
export MPIUSELIBELANUSE=5
$MPI_ROOT/bin/mpirun.mpich -np 2 ./prog.x
 HP-MPI Documentation and Manpages
At http://docs.hp.com
HP-MPI Manpage Categories
Category Manpages Description
 Additional Information, Known Problems, and Work-arounds
 Using HP Mlib
Intel Compiler Notes
Version 8 Fortran Compiler
Version 7 Fortran Compiler
 HP Mlib for the HP XC6000 Platform
Mlib and Module Files
Platform Support
Library Support
 Modulefiles and Mlib
Using Intel Compilers with HP Mlib
Compiling and Linking
 HP Mlib for the HP XC4000 Platform
Licensing
Mlib Manpages
Linking SuperLUDIST
 $ pgcc options file
 Advanced Topics
Enabling Remote Execution with OpenSSH
Running an X Terminal Session from a Remote Node
Determining IP Address of Your Local Machine
 Running an X Terminal Session Using Slurm
Logging in to HP XC System
$ hostname
$ host mymachine
 $ bsub -n4 -Ip srun -n1 xterm -display
Running an X Terminal Session Using LSF
$ bjobs
 Building and Running a Serial Application
Examples
Launching a Serial Interactive Shell Through LSF
 Running LSF Jobs with a Slurm Allocation Request
Example 1. Two Processors on Any Two Nodes
 Example 2. Four Processors on Two Specific Nodes
Launching a Parallel Interactive Shell Through LSF
View the job
 Check the running job’s information
$ bsub -Is -n4 -ext "SLURM[nodes=4]" /bin/bash
 Submitting a Simple Job Script with LSF
Check the finished job’s information
Show the environment
Display the script
 Submitting an Interactive Job with LSF
Show the job allocation
Show the Slurm job ID
$ bsub -n8 -Ip /bin/sh
 Submitting an HP-MPI Job with LSF
 View the finished job
View the running job
$ bjobs -l
 Using a Resource Requirements String in an LSF Command
$ bsub -n 8 -R "ALPHA5 || SLINUX64" \
    -ext "SLURM[nodes=4-4]" myjob
 Glossary
 Extensible firmware interface
External network node
Fairshare
First come first served
 Image server
Integrated Lights Out
Interconnect
Internet address
 Network Information Services
LSF master host
Management Processor
Master host
 Root Administration Switch
Parallel application
Resource manager role
Role
 Symmetric multiprocessing
 Index
 gdb
GNU
 Modulefile
 Resource manager
Role
Serial applications