HP XC System Software User's Guide
Table of Contents
Configuring Your Environment with Modulefiles
Compiling and Linking Serial Applications
Debugging Applications
Srun Roles and Modes
List of Figures
List of Tables
List of Examples
New and Changed Information in This Edition
About This Document
Intended Audience
Typographic Conventions
HP XC and Related HP Products Information
Related Information
HP XC System Software User's Guide
https://computing.llnl.gov/linux/slurm/documentation.html
http://systemimager.org
Manpages
http://www-unix.mcs.anl.gov/mpi
$ man -k keyword
HP Encourages Your Comments
$ man discover
$ man 8 discover
Determining the Node Platform
Only to the administrator of the HP XC system
Node Specialization
The /proc/cpuinfo file is dynamic
SAN Storage
Storage and I/O
File System
Local Storage
HP XC System Interconnects
System Interconnect Network
File System Layout
Network Address Translation (NAT)
Determining System Configuration Information
User Environment
LVS
Parallel Applications
Commands
Application Development Environment
Slurm
Run-Time Environment
Serial Applications
Load Sharing Facility (LSF)
HP-MPI
Requested by the HP-MPI mpirun command
How LSF and Slurm Interact
Nodes for the job
Components, Tools, Compilers, Libraries, and Debuggers
Using the Secure Shell to Log In
Using the System
LVS Login Routing
Logging In to the System
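As a minimal sketch, logging in over the Secure Shell might look like the following; the host name xc-head is illustrative, not taken from this guide:
$ ssh username@xc-head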
Getting Information About Resources
Introduction
Getting Information About Queues
$ bqueues
Performing Other Common User Tasks
$ lsid
Getting System Help and Information
Determining the LSF Cluster Name and the LSF Execution Host
$ man sinfo
Configuring Your Environment with Modulefiles
Overview of Modules
Supplied Modulefiles
HP-MPI
Viewing Loaded Modulefiles
Modulefiles Automatically Loaded on the System
Viewing Available Modulefiles
Loading a Modulefile
Modulefile Conflicts
Unloading a Modulefile
Viewing Modulefile-Specific Help
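The following is a hedged sketch of the common module operations covered in these sections; the modulefile name mpi follows the examples in this guide, but the names available vary by installation:
$ module list
$ module avail
$ module load mpi
$ module unload mpi
$ module help mpi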
Creating a Modulefile
$ module load modules
$ man modulefile
Developing Applications
Application Development Environment Overview
MPI Compiler
Compiler Commands
Compilers
Interrupting a Job
Setting Debugging Options
Examining Nodes and Partitions Before Running Jobs
Developing Serial Applications
Building Serial Applications
Developing Parallel Applications
Serial Application Build Environment
Parallel Application Build Environment
OpenMP
Modulefiles
HP-MPI
Pthreads
PGI Fortran and C/C++ Compilers
MPI Library
Intel Fortran and C/C++ Compilers
GNU C and C++ Compilers
Building Parallel Applications
Designing Libraries for the CP4000 Platform
To compile and link a C application using the mpicc command
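A minimal sketch, assuming an illustrative source file named hello.c:
$ mpicc -o hello hello.c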
Developing Libraries
To build a 64-bit application, you might enter
32-bit link: linkcommand -L/opt/mypackage/lib/i686 -lmystuff
64-bit link: linkcommand -L/opt/mypackage/lib/x86_64 -lmystuff
Developing Applications
Submitting Jobs
Submitting a Serial Job Using LSF
Submitting a Serial Job with the LSF bsub Command
Overview of Job Submission
Example 5-1 Submitting a Job from the Standard Input
Example 5-2 Submitting a Serial Job Using LSF
Submitting a Serial Job Through Slurm Only
$ bsub -I srun hostname
Submitting a Non-MPI Parallel Job
The following is the command line used to compile this program:
Submitting a Parallel Job
To submit a parallel job
Example 5-5 Submitting a Non-MPI Parallel Job
$ bsub -n4 -I srun hostname
Example 5-7 Submitting an MPI Job
$ bsub -n4 -I mpirun -srun ./helloworld
Arguments for the Slurm External Scheduler
nodelist=list of nodes
$ bsub -n 10 -ext "SLURM[nodelist=n[1-10]]" srun hostname
$ bsub -n 10 -ext "SLURM[constraint=dualcore]" -I srun hostname
Submitting a Batch Job or Job Script
Example 5-14 Submitting a Job Script
$ bqueues -l dualcore | grep SLURM
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
$ bsub -n4 -I ./myscript.sh
Use the squeue command to acquire information on the jobs
Using a Script to Submit Multiple Jobs
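A hedged sketch of such a script, reusing the myscript.sh job script from the examples above; the loop bounds and output file names are illustrative:
#!/bin/sh
# Submit the same job script several times, each on 4 cores,
# writing each job's output to its own file.
for i in 1 2 3 4
do
    bsub -n4 -o job_$i.out ./myscript.sh
done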
Using a Makefile to Submit Multiple Jobs
Submitting Multiple MPI Jobs Across the Same Set of Nodes
$ tail 113.out
The following command line makes the program and executes it:
$ cat mymake
$ cat 117.out
type=SLINUX64
Submitting a Job from a Host Other Than an HP XC Host
Running Preexecution Programs
$ bsub -R "type=SLINUX64" -n4 -I srun hostname
Debugging Parallel Applications
Debugging Applications
Debugging Serial Applications
TotalView
SSH and TotalView
Setting Up TotalView
Debugging with TotalView
$ module load mpi
$ module load totalview
Setting TotalView Preferences
Using TotalView with Slurm
Using TotalView with LSF
$ srun -Nx -A
$ mpirun -tv -srun application
Debugging an Application
$ mpirun -tv -srun -n2 ./Psimple
Exiting TotalView
$ scancel --user username
Debugging Running Applications
Run the application
Monitoring Node Activity
Xtools Utilities
Running Performance Health Tests
Where
$ ovp --verify=perfhealth/cpuusage
You can list the available tests with the ovp -l command
$ ovp -l
$ ovp --verbose --verify=perfhealth/cpuusage
HOME_DIRECTORY/ovp_n16_mmddyy.log
HOME_DIRECTORY/ovp_n16_mmddyy_r1.log
Building a Program - Intel Trace Collector and HP-MPI
Tuning Applications
Using the Intel Trace Collector and Intel Trace Analyzer
For more information, see the following Web site
Running a Program - Intel Trace Collector and HP-MPI
Example 8-1 The vtjacobic Example Program
Example 8-2 C Example Running the vtjacobic Example Program
HP-MPI and the Intel Trace Collector
Installation Kit
Intel Trace Collector and Analyzer with HP-MPI on HP XC
Running a Program
The following is a Fortran example called vtjacobif:
# bsub -n4 -I mpirun.mpich -np 2 ./vtjacobic
Running a Program Across Nodes Using LSF
Visualizing Data - Intel Trace Analyzer and HP-MPI
Introduction to Slurm
Using Slurm
Launching Jobs with the srun Command
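For example, a simple launch of a command on four cores might look like the following sketch (not taken from the guide):
$ srun -n4 hostname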
Slurm Utilities
Using the srun Command with LSF
Monitoring Jobs with the squeue Command
Using the srun Command with HP-MPI
Srun Roles and Modes
Example 9-5 cancels all pending jobs
Terminating Jobs with the scancel Command
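A hedged sketch of cancelling only a user's pending jobs; the --state option is standard scancel usage but its availability may vary by Slurm version:
$ scancel --state=PENDING --user username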
Getting System Information with the sinfo Command
Security
Fault Tolerance
Job Accounting
$ sinfo -R
Using LSF
Information for LSF
Overview of LSF Integrated with Slurm
$ squeue --jobs $SLURM_JOBID
[lsfadmin@n16 ~]$ bsub -n4 -I srun hostname
Example 10-1 Examples of LSF Job Launch
The lshosts and lsload commands display information for each of these items
Differences Between LSF and LSF Integrated with Slurm
Submitted. This job may run immediately, or it may run
Job Terminology
Connection back to the terminal from which the job was
$ ssh n15 lshosts
Are submitted with the bsub -I command
Batch jobs are submitted with the srun -b command. By default, the output is written to
Batch system scheduling policies
Submitting Jobs
Using LSF Integrated with Slurm in the HP XC Environment
Useful Commands
Job Startup and Job Control
LSF with Slurm Job Launch Exit Codes
LSF-SLURM External Scheduler
How LSF and Slurm Launch and Manage a Job
User logs in to login node n16
How LSF and Slurm Launch and Manage a Job
Determining the LSF Execution Host
Determining Available System Resources
Getting Information About the LSF Execution Host Node
The following example shows the output from the lshosts command:
Examining System Core Status
Examining System Queues
Getting Information About Jobs
Getting Host Load Information
Getting Information About the lsf Partition
LSF job with Slurm allocated resources
Getting Job Allocation Information
This allocation string has the following values
Than what the job requests
Example 10-4 Job Allocation Information for a Finished Job
Example 10-5 Using the bjobs Command Short Output
Examining the Status of a Job
Output Provided by the bhist Command
Example 10-6 Using the bjobs Command Long Output
Example 10-7 Using the bhist Command Short Output
Viewing the Historical Information for a Job
Translating Slurm and LSF JOBIDs
Example 10-8 Using the bhist Command Long Output
Use the bjobs command to view the Slurm Jobid
$ bsub -I -n4 -ext "SLURM[nodes=4]" /bin/bash
Working Interactively Within an Allocation
$ sacct -j
$ bjobs -l 124 | grep slurm
Alternatively, you can use the following
Example 10-9 Launching an Interactive MPI Job
This example assumes 2 cores per node
LSF Equivalents of Slurm srun Options
The following table describes the srun options and lists their LSF equivalents
$ unset SLURM_JOBID
$ unset SLURM_NPROCS
Requests a specific list of nodes.
Slurm
Suppress informational message
Running an X Terminal Session from a Remote Node
Advanced Topics
Enabling Remote Execution with OpenSSH
Determining IP Address of Your Local Machine
Logging in to HP XC System
Running an X terminal Session Using Slurm
Running an X terminal Session Using LSF
Run the job on 1 core
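A hedged sketch of starting an X terminal on a compute node through Slurm; the display address 192.0.2.10:0 is illustrative and should be replaced with your own workstation's display address:
$ srun -n1 -N1 xterm -display 192.0.2.10:0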
Using the GNU Parallel Make Capability
The options used in this command are: allocate 4 cores
Monitors display server address
$ cd subdir; srun $MAKE
$ cd subdir; srun -n1 -N1 $MAKE -j4
Example Procedure
$ make PREFIX='srun -n1 -N1' MAKEJ=-j4
Local Disks on Compute Nodes
The modified Makefile is invoked as follows:
Communication Between Nodes
Using MPICH on the HP XC System
I/O Performance Considerations
Shared File View
The bsub command launches the wrapper script
Using MPICH with Slurm Allocation
Using MPICH with LSF Allocation
Launching a Serial Interactive Shell Through LSF
Examples
Building and Running a Serial Application
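A minimal sketch of building and running a serial program through LSF; the source file hello.c and program name hello are illustrative:
$ cc -o hello hello.c
$ bsub -I srun ./hello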
Examine the LSF execution host information
Running LSF Jobs with a Slurm Allocation Request
Launching a Parallel Interactive Shell Through LSF
Example 2. Four Cores on Two Specific Nodes
Examine the running jobs information
$ hostname
n16
$ srun hostname
n5
$ bjobs
Show the environment
Submitting a Simple Job Script with LSF
Examine the finished jobs information
Display the script
Submit the job
Submitting an Interactive Job with LSF
Run the job
Show the job allocation
Exit the pseudo-terminal
Run some commands from the pseudo-terminal
Show the Slurm job ID
View the interactive jobs
View the running job
Submitting an HP-MPI Job with LSF
View the node state
$ bsub -n 8 -R "ALPHA5 SLINUX64" \
  -ext "SLURM[nodes=4-4]" myjob
Using a Resource Requirements String in an LSF Command
View the finished job
Glossary
Network Availability set
FCFS
IPMI
LVS
PXE
SVA
Index
LVS
PGI