HP XC System Software XC User Guide
Version 4.0
HP Part Number: A-XCUSR-40a
Published: February 2009
Contents
Table of Contents
Compiling and Linking Serial Applications
Configuring Your Environment with Modulefiles
Srun Roles and Modes
Debugging Applications
List of Figures
List of Tables
List of Examples
About This Document
Intended Audience
New and Changed Information in This Edition
Typographic Conventions
HP XC and Related HP Products Information
HP XC System Software User's Guide
Related Information
https://computing.llnl.gov/linux/slurm/documentation.html
http://systemimager.org
http://www-unix.mcs.anl.gov/mpi
Manpages
$ man discover
$ man 8 discover
$ man -k keyword
HP Encourages Your Comments
Node Specialization
… only to the administrator of the HP XC system.
Determining the Node Platform
The /proc/cpuinfo file is dynamic.
File System
Storage and I/O
SAN Storage
Local Storage
File System Layout
System Interconnect Network
HP XC System Interconnects
User Environment
Determining System Configuration Information
Network Address Translation (NAT)
LVS
Application Development Environment
Commands
Parallel Applications
Serial Applications
Run-Time Environment
Slurm
Load Sharing Facility (LSF)
How LSF and Slurm Interact
… requested by the HP-MPI mpirun command
HP-MPI
… nodes for the job
Components, Tools, Compilers, Libraries, and Debuggers
LVS Login Routing
Using the System
Using the Secure Shell to Log In
Logging In to the System
Getting Information About Queues
Introduction
Getting Information About Resources
$ bqueues
Performing Other Common User Tasks
Determining the LSF Cluster Name and the LSF Execution Host
Getting System Help and Information
$ lsid
$ man sinfo
Overview of Modules
Configuring Your Environment with Modulefiles
Supplied Modulefiles
HP-MPI
Viewing Available Modulefiles
Modulefiles Automatically Loaded on the System
Viewing Loaded Modulefiles
Loading a Modulefile
Viewing Modulefile-Specific Help
Unloading a Modulefile
Modulefile Conflicts
$ module load modules
$ man modulefile
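As an illustration, the common modulefile operations described in this chapter are typically performed with commands like the following (the mpi modulefile is used here only as an example):
$ module avail           # view the available modulefiles
$ module list            # view the modulefiles currently loaded
$ module load mpi        # load a modulefile
$ module help mpi        # view modulefile-specific help
$ module unload mpi      # unload a modulefile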
Creating a Modulefile
Developing Applications
Application Development Environment Overview
Compilers
Compiler Commands
MPI Compiler
Examining Nodes and Partitions Before Running Jobs
Setting Debugging Options
Interrupting a Job
Developing Serial Applications
Serial Application Build Environment
Building Serial Applications
Developing Parallel Applications
Parallel Application Build Environment
HP-MPI
Modulefiles
OpenMP
Pthreads
Intel Fortran and C/C++ Compilers
MPI Library
PGI Fortran and C/C++ Compilers
GNU C and C++ Compilers
Building Parallel Applications
Developing Libraries
To compile and link a C application using the mpicc command
Designing Libraries for the CP4000 Platform
linkcommand 64-bit: -L/opt/mypackage/lib/x86_64 -lmystuff
linkcommand 32-bit: -L/opt/mypackage/lib/i686 -lmystuff
To build a 64-bit application, you might enter
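For instance, a minimal sketch of compiling and linking a C application with the mpicc command against the 64-bit library shown above (the source and output names myapp.c and myapp are placeholders):
$ mpicc -o myapp myapp.c -L/opt/mypackage/lib/x86_64 -lmystuff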
Submitting Jobs
Overview of Job Submission
Submitting a Serial Job Using LSF
Submitting a Serial Job with the LSF bsub Command
Submitting a Serial Job Through Slurm Only
Example 5-1 Submitting a Job from the Standard Input
Example 5-2 Submitting a Serial Job Using LSF
$ bsub -I srun hostname
Submitting a Parallel Job
Following is the command line used to compile this program
Submitting a Non-MPI Parallel Job
To submit a parallel job
$ bsub -n4 -I srun hostname
Example 5-5 Submitting a Non-MPI Parallel Job
$ bsub -n4 -I mpirun -srun ./helloworld
Example 5-7 Submitting an MPI Job
Arguments for the Slurm External Scheduler
nodelist=list of nodes
$ bsub -n 10 -ext "SLURM[nodelist=n1-10]" srun hostname
$ bsub -n 10 -ext "SLURM[constraint=dualcore]" -I srun hostname
$ bqueues -l dualcore | grep Slurm
Submitting a Batch Job or Job Script
Example 5-14 Submitting a Job Script
$ bsub -n4 -I ./myscript.sh
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
Using a Makefile to Submit Multiple Jobs
Using a Script to Submit Multiple Jobs
Use the squeue command to acquire information on the jobs
Submitting Multiple MPI Jobs Across the Same Set of Nodes
$ cat mymake
The following command line makes the program and executes it:
$ tail 113.out
$ cat 117.out
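For illustration only, a small script along these lines (the script name, output names, and program are hypothetical) could submit several jobs through LSF and then check them with squeue:
$ cat submit_jobs.sh
#!/bin/sh
# Submit the same program several times as separate LSF jobs
for i in 1 2 3 4
do
    bsub -n2 -o run_$i.out srun ./myprogram
done
$ sh submit_jobs.sh
$ squeue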
Running Preexecution Programs
Submitting a Job from a Host Other Than an HP XC Host
type=SLINUX64
$ bsub -R "type=SLINUX64" -n4 -I srun hostname
Debugging Applications
Debugging Serial Applications
Debugging Parallel Applications
TotalView
Debugging with TotalView
Setting Up TotalView
SSH and TotalView
$ module load mpi
$ module load totalview
Using TotalView with LSF
Using TotalView with Slurm
Setting TotalView Preferences
$ srun -Nx -A
$ mpirun -tv -srun application
$ mpirun -tv -srun -n2 ./Psimple
Debugging an Application
Debugging Running Applications
$ scancel --user username
Exiting TotalView
Run the application
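Taken together, a typical TotalView debugging session on an HP XC system follows the commands shown above (Psimple and the process count are from the earlier example; username is your own user name):
$ module load mpi
$ module load totalview
$ mpirun -tv -srun -n2 ./Psimple
$ scancel --user username    # remove any remaining job steps after exiting TotalView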
Monitoring Node Activity
Xtools Utilities
Running Performance Health Tests
$ ovp -l
You can list the available tests with the ovp -l command.
$ ovp --verify=perfhealth/cpuusage
$ ovp --verbose --verify=perfhealth/cpuusage
home_directory/ovp_n16_mmddyy.log
home_directory/ovp_n16_mmddyy_r1.log
Using the Intel Trace Collector and Intel Trace Analyzer
Tuning Applications
Building a Program Intel Trace Collector and HP-MPI
Example 8-1 The vtjacobic Example Program
Running a Program Intel Trace Collector and HP-MPI
For more information, see the following Web site
Example 8-2 C Example Running the vtjacobic Example Program
Intel Trace Collector and Analyzer with HP-MPI on HP XC
Installation Kit
HP-MPI and the Intel Trace Collector
Following is a Fortran example called vtjacobif
Running a Program
Visualizing Data Intel Trace Analyzer and HP-MPI
Running a Program Across Nodes Using LSF
# bsub -n4 -I mpirun.mpich -np 2 ./vtjacobic
Using Slurm
Launching Jobs with the srun Command
Introduction to Slurm
Slurm Utilities
Using the srun Command with HP-MPI
Monitoring Jobs with the squeue Command
Using the srun Command with LSF
Srun Roles and Modes
Getting System Information with the sinfo Command
Terminating Jobs with the scancel Command
Example 9-5 cancels all pending jobs.
Job Accounting
Fault Tolerance
Security
$ sinfo -R
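As a brief illustration (the core count and job ID are arbitrary), the Slurm utilities covered in this chapter are used as follows:
$ srun -n4 hostname      # launch a job on 4 cores
$ squeue                 # monitor pending and running jobs
$ scancel 2354           # terminate the job with ID 2354
$ sinfo                  # report the state of partitions and nodes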
Information for LSF
Using LSF
$ squeue --jobs $SLURM_JOBID
Overview of LSF Integrated with Slurm
Example 10-1 Examples of LSF Job Launch
lsfadmin@n16 ~$ bsub -n4 -I srun hostname
Differences Between LSF and LSF Integrated with Slurm
Job Terminology
… are submitted with the bsub -I command.
… connection back to the terminal from which the job was submitted. This job may run immediately, or it may run …
Batch jobs are submitted with the srun -b command. By default, the output is written to …
… batch system scheduling policies
The lshosts and lsload commands display … for each of these items.
$ ssh n15 lshosts
Useful Commands
Using LSF Integrated with Slurm in the HP XC Environment
Submitting Jobs
Job Startup and Job Control
How LSF and Slurm Launch and Manage a Job
LSF-SLURM External Scheduler
LSF with Slurm Job Launch Exit Codes
How LSF and Slurm Launch and Manage a Job
User logs in to login node n16
Determining Available System Resources
Determining the LSF Execution Host
Examining System Core Status
The following example shows the output from the lshosts command:
Getting Information About the LSF Execution Host Node
Getting Host Load Information
Getting Information About Jobs
Examining System Queues
Getting Information About the lsf Partition
This allocation string has the following values
Getting Job Allocation Information
LSF job with Slurm-allocated resources
… than what the job requests
Examining the Status of a Job
Example 10-5 Using the bjobs Command Short Output
Example 10-4 Job Allocation Information for a Finished Job
Example 10-7 Using the bhist Command Short Output
Example 10-6 Using the bjobs Command Long Output
Output Provided by the bhist Command
Viewing the Historical Information for a Job
Use the bjobs command to view the Slurm Jobid
Example 10-8 Using the bhist Command Long Output
Translating Slurm and LSF JOBIDs
$ sacct -j
Working Interactively Within an Allocation
$ bsub -I -n4 -ext "SLURM[nodes=4]" /bin/bash
$ bjobs -l 124 | grep slurm
This example assumes 2 cores per node
Example 10-9 Launching an Interactive MPI Job
Alternatively, you can use the following
Table 10-3 describes the srun options and lists their LSF equivalents.
LSF Equivalents of Slurm srun Options
$ unset SLURM_JOBID
$ unset SLURM_NPROCS
Requests a specific list of nodes.
Suppress informational messages.
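Combining the commands above, a sketch of working interactively within an allocation (job ID 124 and the node count are illustrative) looks like this:
$ bsub -I -n4 -ext "SLURM[nodes=4]" /bin/bash    # start an interactive shell inside the allocation
$ bjobs -l 124 | grep slurm                      # find the Slurm job ID for LSF job 124
$ srun hostname                                  # run commands on the allocated nodes
$ exit                                           # leave the allocation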
Advanced Topics
Enabling Remote Execution with OpenSSH
Running an X Terminal Session from a Remote Node
Determining IP Address of Your Local Machine
Running an X Terminal Session Using LSF
Running an X Terminal Session Using Slurm
Logging In to the HP XC System
Options used in this command: allocate 4 cores, run the job on 1 core, and use the monitor's display server address.
Using the GNU Parallel Make Capability
$ cd subdir; srun -n1 -N1 $MAKE -j4
$ cd subdir; srun $MAKE
$ make PREFIX='srun -n1 -N1' MAKEJ=-j4
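A minimal sketch of how a Makefile might be modified to use the PREFIX and MAKEJ variables shown in this invocation (the targets and subdirectory names are hypothetical):
# Makefile fragment adapted for the GNU parallel make capability
# (recipe lines must be indented with a tab)
PREFIX =
MAKEJ  =

all: part1 part2

part1:
	cd subdir1; $(PREFIX) $(MAKE) $(MAKEJ)

part2:
	cd subdir2; $(PREFIX) $(MAKE) $(MAKEJ)
Invoked as make PREFIX='srun -n1 -N1' MAKEJ=-j4, each sub-make then runs on one node under Slurm while building its own directory in parallel.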
Example Procedure
The modified Makefile is invoked as follows:
$ make PREFIX='srun -n1 -N1' MAKEJ=-j4
Local Disks on Compute Nodes
11.5 I/O Performance Considerations
Using MPICH on the HP XC System
Communication Between Nodes
Shared File View
Using MPICH with LSF Allocation
Using MPICH with Slurm Allocation
The bsub command launches the wrapper script.
Building and Running a Serial Application
Examples
Launching a Serial Interactive Shell Through LSF
Examine the LSF execution host information
Running LSF Jobs with a Slurm Allocation Request
Example 2. Four Cores on Two Specific Nodes
Launching a Parallel Interactive Shell Through LSF
$ hostname
n16
$ srun hostname
n5
$ bjobs
Examine the running jobs information
Examine the finished jobs information
Submitting a Simple Job Script with LSF
Show the environment
Display the script
Run the job
Submitting an Interactive Job with LSF
Submit the job
Show the job allocation
Show the Slurm job ID
Run some commands from the pseudo-terminal
Exit the pseudo-terminal
View the interactive jobs
View the node state
Submitting an HP-MPI Job with LSF
View the running job
View the finished job
Using a Resource Requirements String in an LSF Command
$ bsub -n 8 -R "ALPHA5 SLINUX64" \
-ext "SLURM[nodes=4-4]" myjob
Network Availability set
Glossary
FCFS
IPMI
LVS
PXE
SVA
Index
LVS
PGI