Hewlett-Packard Company Palo Alto, California
June Product Version
 Contents
Developing Applications
 Using Slurm
Tuning Applications
Debugging Applications
 Using HP-MPI
Using LSF
Using HP Mlib
 Examples
Advanced Topics
Glossary
Index
Examples
 Tables
Figures
 Intended Audience
About This Document
Document Organization
 HP XC Program Development Environment
Linux Administration Handbook
HP XC Information
QuickSpecs for HP XC System Software
 HP Mathematical Library
For More Information
Supplementary Information
HP Message Passing Interface
 http://sourceforge.net/projects/modules
Manpages
http://supermon.sourceforge.net
http://systemimager.org
 http://linuxvirtualserver.org
Related Information
http://www-unix.mcs.anl.gov/mpi
 Typographical Conventions
# cd /opt/hptc/config/sbin
Bold text
 Ctrl/x
HP Encourages Your Comments
 Node Specialization
Overview of the User Environment
System Architecture
Operating System
 Local Storage
Storage and I/O
File System
SAN Storage
 File System Layout
System Interconnect Network
 1 LVS
Commands
User Environment
Network Address Translation (NAT)
 Parallel Applications
Application Development Environment
Serial Applications
 How LSF-HPC and Slurm Interact
Run-Time Environment
Slurm
Load Sharing Facility (LSF-HPC)
 HP-MPI
Components, Tools, Compilers, Libraries, and Debuggers
 Using ssh to Log
Using the System
Configuring Your Environment with Modulefiles
LVS Login Routing
 Supplied Modulefiles
Modulefile Sets the HP XC User Environment
 Loading a Modulefile
Modulefiles Automatically Loaded on the System
Viewing Available Modulefiles
Viewing Loaded Modulefiles
 Loading a Modulefile for the Current Session
Automatically Loading a Modulefile at Login
Unloading a Modulefile
Modulefile Conflicts
 $ module help totalview
Creating a Modulefile
Viewing Modulefile-Specific Help
$ module load modules
$ man modulefile
 Getting Information About Resources
Launching and Managing Jobs Quick Start
Introduction
Getting Information About Queues
 Example 2-1 Submitting a Serial Job
Getting Information About the System’s Partitions
Launching Jobs
Submitting a Serial Job
 $ bsub -n4 -I srun hostname
Using Slurm Options with the LSF External Scheduler
Submitting a Non-MPI Parallel Job
Example 2-2 Submitting a Non-MPI Parallel Job
 Example 2-4 Running an MPI Job with LSF
Submitting an MPI Job
$ bsub -n4 mpirun -srun ./helloworld
 Example 2-6 Submitting a Job Script
Submitting a Batch Job or Job Script
$ bsub -I -n4 myjobscript.sh
 Getting System Help and Information
Performing Other Common User Tasks
 $ man -k keyword
$ man sinfo
 Overview
Developing Applications
 PGI Compilers
Using Compilers
Standard Linux Compilers
Intel Compilers
 Developing Serial Applications
Setting Debugging Options
Checking Nodes and Partitions Before Running Jobs
Interrupting a Job
 Building Serial Applications
Using Mlib in Serial Applications
Developing Parallel Applications
Serial Application Build Environment
 OpenMP
Parallel Application Build Environment
Modulefiles
HP-MPI
 $ mpicc object1.o ... -pthread -o myapp.exe
Quadrics Shmem
Mlib Math Library
MPI Library
 GNU Parallel Make
Intel Fortran and C/C++ Compilers
PGI Fortran and C/C++ Compilers
GNU C and C++ Compilers
 Compiling and Linking HP-MPI Applications
Reserved Symbols and Names
Building Parallel Applications
Compiling and Linking Non-MPI Applications
 Designing Libraries for XC4000
Developing Libraries
$ mpicc -c -g foo.c
 Example 3-2 Recommended Directory Structure
Using the GNU Parallel Make Capability
Advanced Topics
Example 3-1 Directory Structure
 $ cd subdir; srun -n1 -N1 $MAKE -j4
$ cd subdir; srun $MAKE
 Example Procedure
 $ make PREFIX='srun -n1 -N1' MAKEJ='-j4'
 Private File View
Local Disks on Compute Nodes
I/O Performance Considerations
Shared File View
 Communication Between Nodes
 TotalView
Debugging Applications
Debugging Serial Applications
Debugging Parallel Applications
 Debugging with TotalView
Setting Up TotalView
SSH and TotalView
 $ bsub -nx -ext "SLURM[nodes=x]" -Is /usr/bin/xterm
Using TotalView with Slurm
Using TotalView with LSF-HPC
$ srun -Nx -A
$ mpirun -tv -srun application
 $ totalview
Starting TotalView for the First Time
 TotalView Preferences Window
 Preferences window, click on the Launch Strings tab
 Debugging Applications
 $ mpicc -g -o Psimple simple.c -lm
Debugging an Application
$ mpirun -tv -srun -n2 ./Psimple
 TotalView Process Window Example
 $ mpirun -srun -n2 Psimple
Debugging Running Applications
 Exiting TotalView
$ scancel --user username
$ squeue
 Example
Tuning Applications
Using the Intel Trace Collector/Analyzer
Building a Program: Intel Trace Collector and HP-MPI
 Visualizing Data: Intel Trace Analyzer and HP-MPI
Running a Program: Intel Trace Collector and HP-MPI
Example Running the vtjacobic Example Program
 Introduction
Using Slurm
Slurm Commands
Command Function
 Example 6-1 Simple Launch of a Serial Program
Accessing the Slurm Manpages
Launching Jobs with the srun Command
srun Roles and Modes
 srun Modes
srun Roles
 Allocate
srun Signal Handling
srun Run-Mode Options
Batch
 srun Resource-Allocation Options
 -T nthreads, --threads=nthreads
-c cpt, --cpus-per-task=cpt
-p part, --partition=part
-t minutes, --time=minutes
 srun Control Options
 srun I/O Options
 -l, --label
-o mode, --output=mode
-i mode, --input=mode
-e mode, --error=mode
 -C clist, --constraint=clist
srun Constraint Options
 --vmem=size
--contiguous=yes|no
--mem=size
--mincpus=n
 srun Environment Variables
Monitoring Jobs with the squeue Command
Using srun with HP-MPI
Using srun with LSF
 Getting System Information with the sinfo Command
Killing Jobs with the scancel Command
 $ sinfo -R
Fault Tolerance
Job Accounting
Security
 Introduction to LSF in the HP XC Environment
Using LSF
Overview of LSF
 nodelist=list-of-nodes
exclude=list-of-nodes
Topology Support
 $ bsub -n 10 -ext "SLURM[nodes=10;exclude=n16]" srun myapp
$ bsub -n 10 -ext "SLURM[nodes=10]" srun myapp
 $ bqueues -l normal | grep JOB_STARTER
How LSF and Slurm Launch and Manage a Job
$ bsub -Is hostname
 $ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
How LSF-HPC and Slurm Launch and Manage a Job
 Differences Between LSF on HP XC and Standard LSF
 Job Startup and Job Control
Determining Execution Host
Determining Available System Resources
Getting Status of LSF
 Getting Host Load Information
Getting Information About LSF-HPC Execution Host Node
 $ sinfo -p lsf
Submitting Jobs
Checking LSF System Queues
Getting Information About the lsf Partition
 bsub [bsub-options] jobname [job-options]
Summary of the LSF bsub Command Format
 Slurm Arguments Function
LSF-SLURM External Scheduler
 <<Starting on lsfhost.localdomain>>
n6
 Example 7-5 Submitting an Interactive Serial Job
Submitting a Serial Job
Submitting a Job in Parallel
Submitting an HP-MPI Job
 Example 7-6 Submitting an HP-MPI Job
Submitting a Batch Job or Job Script
$ bsub -n4 -I mpirun -srun ./helloworld
 $ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
Examples
Example 7-8 Submitting a Batch Job Script
$ bsub -n4 -I ./myscript.sh
 $ bsub -n4 -I ./myscript.sh -n8 -O
Submitting a Job from a Non-HP XC Host
type=SLINUX64
 $ bsub -R type=SLINUX64 -n4 -I srun hostname
Getting Information About Jobs
Getting Job Allocation Information
Job Allocation Information for a Running Job
 Checking Status of a Job
Example 7-13 Using the bjobs Command Short Output
Job Allocation Information for a Finished Job
 Viewing a Job’s Historical Information
Example 7-14 Using the bjobs Command Long Output
Example 7-15 Using the bhist Command Short Output
Output Provided by the bhist Command
 Working Interactively Within an LSF-HPC Allocation
Example 7-16 Using the bhist Command Long Output
Submitting an Interactive Job to Launch the xterm Program
 Example 7-20 View Job Details in LSF
Example 7-17 View Your Environment
Example 7-18 View Your Allocation in Slurm
Example 7-19 View Your Running Job in LSF
 $ hostname
$ srun hostname
$ srun -n2 hostname
Submitting an Interactive Job to Launch a Shell
Example 7-21 Running Jobs from an xterm Window
Example 7-22 Submitting an Interactive Shell Program
 srun Option, Description, LSF Equivalent
LSF Equivalents of Slurm srun Options
$ srun hostname
n1
$ exit
 Control the parallel job
 Using HP-MPI
 Building and Running an Example Application
Setting Environment Variables
HP-MPI Directory Structure
Compiling and Running Applications
 $ $MPI_ROOT/bin/mpirun -srun -n4 helloworld
Example Application helloworld
Building and Running helloworld
$ mpicc -o helloworld $MPI_ROOT/help/helloworld.c
 HP-MPI options allowed with -srun
Launching MPI Jobs
 $ mpirun -srun -n4 -N2 -O -m cyclic ./a.out host1 rank1
Creating Subshells and Launching Jobsteps
System Interconnect Selection
$ mpirun -srun -n6 -O -N2 -m cyclic ./a.out host1 rank1
 $ mpirun -subnet 192.168.1.1 -prot -srun -n4 ./a.out
Using LSF and HP-MPI
 Example 8-5 Allocating 12 Processors on 6 Nodes
$ /usr/sbin/ifconfig -a
System Interconnect Support
MPI Versioning
 Truncated Messages
32-Bit Builds on XC4000
Allowing Windows to Use Exclusive Locks
 $ mpirun -TCP -srun -N8 ./a.out
mpirun Command Options
 $MPI_ROOT/bin/mpirun -v -prot -np 2 /path/to/program.x
Environment Variables
 export MPI_MAX_WINDOW=10
export MPI_PHYSICAL_MEMORY=1048576
export MPI_PIN_PERCENTAGE=30
export MPI_PAGE_ALIGN_MEM=1
 $MPI_ROOT/bin/mpirun.mpich -np 2 ./prog.x
MPICH Object Compatibility
export MPI_USE_LIBELAN=0
export MPIUSELIBELANUSE=5
 Category Manpages Description
HP-MPI Documentation and Manpages
At http://docs.hp.com
HP-MPI Manpage Categories
 Additional Information, Known Problems, and Work-arounds
 Version 7 Fortran Compiler
Using HP Mlib
Intel Compiler Notes
Version 8 Fortran Compiler
 Library Support
HP Mlib for the HP XC6000 Platform
Mlib and Module Files
Platform Support
 Modulefiles and Mlib
Using Intel Compilers with HP Mlib
Compiling and Linking
 Linking SuperLU_DIST
HP Mlib for the HP XC4000 Platform
Licensing
Mlib Manpages
 $ pgcc options file
 Determining IP Address of Your Local Machine
Advanced Topics
Enabling Remote Execution with OpenSSH
Running an X Terminal Session from a Remote Node
 $ host mymachine
Running an X Terminal Session Using Slurm
Logging in to HP XC System
$ hostname
 $ bsub -n4 -Ip srun -n1 xterm -display
Running an X Terminal Session Using LSF
$ bjobs
 Building and Running a Serial Application
Examples
Launching a Serial Interactive Shell Through LSF
 Example 1. Two Processors on Any Two Nodes
Running LSF Jobs with a Slurm Allocation Request
 Example 2. Four Processors on Two Specific Nodes
Launching a Parallel Interactive Shell Through LSF
View the job
 $ bsub -Is -n4 -ext "SLURM[nodes=4]" /bin/bash
Check the running job’s information
 Display the script
Submitting a Simple Job Script with LSF
Check the finished job’s information
Show the environment
 $ bsub -n8 -Ip /bin/sh
Submitting an Interactive Job with LSF
Show the job allocation
Show the Slurm job ID
 Submitting an HP-MPI Job with LSF
 View the finished job
View the running job
$ bjobs -l
 $ bsub -n 8 -R "ALPHA5 || SLINUX64" -ext "SLURM[nodes=4-4]" myjob
Using a Resource Requirements String in an LSF Command
 Glossary
 First come first served
Extensible firmware interface
External network node
Fairshare
 Internet address
Image server
Integrated Lights Out
Interconnect
 Master host
Network Information Services
LSF master host
Management Processor
 Role
Root Administration Switch
Parallel application
Resource manager role
Symmetric multiprocessing
Index
gdb, 4-1
GNU
Modulefile
Resource manager, 7-1
role
Serial applications