Hewlett-Packard Company Palo Alto, California
June Product Version
 Contents
Developing Applications
 Tuning Applications
Using Slurm
Debugging Applications
 Using HP-MPI
Using LSF
Using HP Mlib
 Advanced Topics
Examples
Glossary Index Examples
 Tables
Figures
 About This Document
Intended Audience
Document Organization
 HP XC Information
Linux Administration Handbook
QuickSpecs for HP XC System Software
HP XC Program Development Environment
 Supplementary Information
For More Information
HP Message Passing Interface
HP Mathematical Library
 http://supermon.sourceforge.net
Manpages
http://systemimager.org
http://sourceforge.net/projects/modules
 Related Information
http://linuxvirtualserver.org
http://www-unix.mcs.anl.gov/mpi
 # cd /opt/hptc/config/sbin
Typographical Conventions
Bold text
 Ctrl/x
HP Encourages Your Comments
 System Architecture
Overview of the User Environment
Operating System
Node Specialization
 File System
Storage and I/O
SAN Storage
Local Storage
 File System Layout
System Interconnect Network
 User Environment
Commands
Network Address Translation (NAT)
LVS
 Application Development Environment
Parallel Applications
Serial Applications
 Slurm
Run-Time Environment
Load Sharing Facility (LSF-HPC)
How LSF-HPC and Slurm Interact
 HP-MPI
Components, Tools, Compilers, Libraries, and Debuggers
 Configuring Your Environment with Modulefiles
Using the System
LVS Login Routing
Using ssh to Log In
 Supplied Modulefiles
Modulefile Sets the HP XC User Environment
 Viewing Available Modulefiles
Modulefiles Automatically Loaded on the System
Viewing Loaded Modulefiles
Loading a Modulefile
 Unloading a Modulefile
Automatically Loading a Modulefile at Login
Modulefile Conflicts
Loading a Modulefile for the Current Session
 Viewing Modulefile-Specific Help
Creating a Modulefile
$ module load modules
$ man modulefile
$ module help totalview
 Introduction
Launching and Managing Jobs Quick Start
Getting Information About Queues
Getting Information About Resources
 Launching Jobs
Getting Information About the System’s Partitions
Submitting a Serial Job
Example 2-1 Submitting a Serial Job
 Submitting a Non-MPI Parallel Job
Using Slurm Options with the LSF External Scheduler
Example 2-2 Submitting a Non-MPI Parallel Job
$ bsub -n4 -I srun hostname
 Submitting an MPI Job
Example 2-4 Running an MPI Job with LSF
$ bsub -n4 mpirun -srun ./helloworld
 Submitting a Batch Job or Job Script
Example 2-6 Submitting a Job Script
$ bsub -I -n4 myjobscript.sh
 Getting System Help and Information
Performing Other Common User Tasks
 $ man -k keyword
$ man sinfo
 Overview
Developing Applications
 Standard Linux Compilers
Using Compilers
Intel Compilers
PGI Compilers
 Checking Nodes and Partitions Before Running Jobs
Setting Debugging Options
Interrupting a Job
Developing Serial Applications
 Developing Parallel Applications
Using Mlib in Serial Applications
Serial Application Build Environment
Building Serial Applications
 Modulefiles
Parallel Application Build Environment
HP-MPI
OpenMP
 Mlib Math Library
Quadrics Shmem
MPI Library
$ mpicc object1.o ... -pthread -o myapp.exe
 PGI Fortran and C/C++ Compilers
Intel Fortran and C/C++ Compilers
GNU C and C++ Compilers
GNU Parallel Make
 Building Parallel Applications
Reserved Symbols and Names
Compiling and Linking Non-MPI Applications
Compiling and Linking HP-MPI Applications
 Developing Libraries
Designing Libraries for XC4000
$ mpicc -c -g foo.c
 Advanced Topics
Using the GNU Parallel Make Capability
Example 3-1 Directory Structure
Example 3-2 Recommended Directory Structure
 $ cd subdir; srun -n1 -N1 $MAKE -j4
$ cd subdir; srun $MAKE
 Example Procedure
 $ make PREFIX='srun -n1 -N1' MAKEJ='-j4'
$ make PREFIX='srun -n1 -N1' MAKEJ='-j4'
 I/O Performance Considerations
Local Disks on Compute Nodes
Shared File View
Private File View
 Communication Between Nodes
 Debugging Serial Applications
Debugging Applications
Debugging Parallel Applications
TotalView
 Setting Up TotalView
Debugging with TotalView
SSH and TotalView
 Using TotalView with LSF-HPC
Using TotalView with Slurm
$ srun -Nx -A
$ mpirun -tv -srun application
$ bsub -nx -ext "SLURM[nodes=x]" \
  -Is /usr/bin/xterm
 $ totalview
Starting TotalView for the First Time
 TotalView Preferences Window
 Preferences window, click on the Launch Strings tab
 Debugging an Application
$ mpicc -g -o Psimple simple.c -lm
$ mpirun -tv -srun -n2 ./Psimple
 TotalView Process Window Example
 $ mpirun -srun -n2 Psimple
Debugging Running Applications
 $ scancel --user username
Exiting TotalView
$ squeue
 Using the Intel Trace Collector/Analyzer
Tuning Applications
Building a Program: Intel Trace Collector and HP-MPI
Example
 Running a Program: Intel Trace Collector and HP-MPI
Visualizing Data: Intel Trace Analyzer and HP-MPI
Example Running the vtjacobic Example Program
 Slurm Commands
Using Slurm
Command Function
Introduction
 Launching Jobs with the srun Command
Accessing the Slurm Manpages
srun Roles and Modes
Example 6-1 Simple Launch of a Serial Program
 srun Modes
srun Roles
 srun Run-Mode Options
srun Signal Handling
Batch
Allocate
 srun Resource-Allocation Options
 -p part, --partition=part
-c cpt, --cpus-per-task=cpt
-t minutes, --time=minutes
-T nthreads, --threads=nthreads
 srun Control Options
 srun I/O Options
 -i mode, --input=mode
-o mode, --output=mode
-e mode, --error=mode
Label
 -C clist, --constraint=clist
srun Constraint Options
 --mem=size
--contiguous=yes|no
--mincpus=n
--vmem=size
 Using srun with HP-MPI
Monitoring Jobs with the squeue Command
Using srun with LSF
srun Environment Variables
 Getting System Information with the sinfo Command
Killing Jobs with the scancel Command
 Job Accounting
Fault Tolerance
Security
$ sinfo -R
 Using LSF
Introduction to LSF in the HP XC Environment
Overview of LSF
 nodelist=list-of-nodes
exclude=list-of-nodes
Topology Support
 $ bsub -n 10 -ext "SLURM[nodes=10;exclude=n16]" srun myapp
$ bsub -n 10 -ext "SLURM[nodes=10]" srun myapp
 How LSF and Slurm Launch and Manage a Job
$ bqueues -l normal | grep JOB_STARTER
$ bsub -Is hostname
 $ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
How LSF-HPC and Slurm Launch and Manage a Job
 Differences Between LSF on HP XC and Standard LSF
 Determining Available System Resources
Determining Execution Host
Getting Status of LSF
Job Startup and Job Control
 Getting Host Load Information
Getting Information About LSF-HPC Execution Host Node
 Checking LSF System Queues
Submitting Jobs
Getting Information About the lsf Partition
$ sinfo -p lsf
 bsub [bsub-options] jobname [job-options]
Summary of the LSF bsub Command Format
 Slurm Arguments Function
LSF-SLURM External Scheduler
 Starting on lsfhost.localdomain n6
 Submitting a Job in Parallel
Submitting a Serial Job
Submitting an HP-MPI Job
Example 7-5 Submitting an Interactive Serial Job
 Submitting a Batch Job or Job Script
Example 7-6 Submitting an HP-MPI Job
$ bsub -n4 -I mpirun -srun ./helloworld
 Example 7-8 Submitting a Batch Job Script
Examples
$ bsub -n4 -I ./myscript.sh
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
 Submitting a Job from a Non-HP XC Host
$ bsub -n4 -I ./myscript.sh -n8 -O
type=SLINUX64
 Getting Job Allocation Information
Getting Information About Jobs
Job Allocation Information for a Running Job
$ bsub -R type=SLINUX64 -n4 -I srun hostname
 Example 7-13 Using the bjobs Command Short Output
Checking Status of a Job
Job Allocation Information for a Finished Job
 Example 7-15 Using the bhist Command Short Output
Example 7-14 Using the bjobs Command Long Output
Output Provided by the bhist Command
Viewing a Job’s Historical Information
 Example 7-16 Using the bhist Command Long Output
Working Interactively Within an LSF-HPC Allocation
Submitting an Interactive Job to Launch the xterm Program
 Example 7-18 View Your Allocation in Slurm
Example 7-17 View Your Environment
Example 7-19 View Your Running Job in LSF
Example 7-20 View Job Details in LSF
 Example 7-21 Running Jobs from an xterm Window
Submitting an Interactive Job to Launch a Shell
Example 7-22 Submitting an Interactive Shell Program
$ hostname
$ srun hostname
$ srun -n2 hostname
 LSF Equivalents of Slurm srun Options
$ srun hostname
n1
$ exit
srun Option Description LSF Equivalent
 Control the parallel job
 Using HP-MPI
 HP-MPI Directory Structure
Setting Environment Variables
Compiling and Running Applications
Building and Running an Example Application
 Building and Running helloworld
Example Application helloworld
$ mpicc -o helloworld $MPIROOT/help/helloworld.c
$ $MPIROOT/bin/mpirun -srun -n4 helloworld
 HP-MPI options allowed with -srun
Launching MPI Jobs
 System Interconnect Selection
Creating Subshells and Launching Jobsteps
$ mpirun -srun -n6 -O -N2 -m cyclic ./a.out
host1 rank1
$ mpirun -srun -n4 -N2 -O -m cyclic ./a.out
host1 rank1
 $ mpirun -subnet 192.168.1.1 -prot -srun -n4 ./a.out
Using LSF and HP-MPI
 System Interconnect Support
$ /usr/sbin/ifconfig -a
MPI Versioning
Example 8-5 Allocating 12 Processors on 6 Nodes
 32-Bit Builds on XC4000
Truncated Messages
Allowing Windows to Use Exclusive Locks
 $ mpirun -TCP -srun -N8 ./a.out
mpirun Command Options
 $MPIROOT/bin/mpirun -v -prot -np 2 /path/to/program.x
Environment Variables
 export MPIPINPERCENTAGE=30
export MPIPHYSICALMEMORY=1048576
export MPIPAGEALIGNMEM=1
export MPIMAXWINDOW=10
 export MPIUSELIBELAN=0
MPICH Object Compatibility
export MPIUSELIBELANUSE=5
$MPIROOT/bin/mpirun.mpich -np 2 ./prog.x
 At http://docs.hp.com
HP-MPI Documentation and Manpages
HP-MPI Manpage Categories
Category Manpages Description
 Additional Information, Known Problems, and Work-arounds
 Intel Compiler Notes
Using HP Mlib
Version 8 Fortran Compiler
Version 7 Fortran Compiler
 Mlib and Module Files
HP Mlib for the HP XC6000 Platform
Platform Support
Library Support
 Using Intel Compilers with HP Mlib
Modulefiles and Mlib
Compiling and Linking
 Licensing
HP Mlib for the HP XC4000 Platform
Mlib Manpages
Linking SuperLUDIST
 $ pgcc options file
 Enabling Remote Execution with OpenSSH
Advanced Topics
Running an X Terminal Session from a Remote Node
Determining IP Address of Your Local Machine
 Logging in to HP XC System
Running an X terminal Session Using Slurm
$ hostname
$ host mymachine
 Running an X terminal Session Using LSF
$ bsub -n4 -Ip srun -n1 xterm -display
$ bjobs
 Examples
Building and Running a Serial Application
Launching a Serial Interactive Shell Through LSF
 Example 1. Two Processors on Any Two Nodes
Running LSF Jobs with a Slurm Allocation Request
 Launching a Parallel Interactive Shell Through LSF
Example 2. Four Processors on Two Specific Nodes
View the job
 $ bsub -Is -n4 -ext "SLURM[nodes=4]" /bin/bash
Check the running job’s information
 Check the finished job’s information
Submitting a Simple Job Script with LSF
Show the environment
Display the script
 Show the job allocation
Submitting an Interactive Job with LSF
Show the Slurm job ID
$ bsub -n8 -Ip /bin/sh
 Submitting an HP-MPI Job with LSF
 View the running job
View the finished job
$ bjobs -l
 $ bsub -n 8 -R "ALPHA5 || SLINUX64" \
  -ext "SLURM[nodes=4-4]" myjob
Using a Resource Requirements String in an LSF Command
 Glossary
 External network node
Extensible firmware interface
Fairshare
First come first served
 Integrated Lights Out
Image server
Interconnect
Internet address
 LSF master host
Network Information Services
Management Processor
Master host
 Parallel application
Root Administration Switch
Resource manager role
Role
Symmetric multiprocessing
Index
gdb, 4-1
GNU
Modulefile
Resource manager, 7-1
role
Serial applications