HP XC System 3.x Software
Contents
HP XC System Software User's Guide
Table of Contents
Submitting Jobs
Configuring Your Environment with Modulefiles
Developing Applications
Tuning Applications
Using Slurm
Using LSF
Debugging Applications
Glossary
Index
Advanced Topics
Examples
List of Figures
List of Tables
  Determining the Node Platform
List of Examples
  Submitting a Job Script
About This Document
Intended Audience
Document Organization
This document is organized as follows:
HP XC Information
Supplementary Information
$ man lsfcommandname
For More Information
Related Information
Manpages
$ man discover
$ man 8 discover
$ man -k keyword
Related Linux Web Sites
Related MPI Web Sites
Related Compiler Web Sites
Additional Publications
Typographic Conventions
HP Encourages Your Comments
Environment Variable
User input
Overview of the User Environment
System Architecture
HP XC System Software
Operating System
Node Specialization
Storage and I/O
File System
SAN Storage
Local Storage
File System Layout
Network Address Translation (NAT)
Determining System Configuration Information
System Interconnect Network
Modules
Commands
User Environment
Application Development Environment
Run-Time Environment
Parallel Applications
Serial Applications
How LSF-HPC and Slurm Interact
Load Sharing Facility (LSF-HPC)
Standard LSF
The mpirun Command
Components, Tools, Compilers, Libraries, and Debuggers
Using the System
LVS Login Routing
Using the Secure Shell to Log In
Logging In to the System
Getting Information About Resources
Introduction
Getting Information About Queues
Performing Other Common User Tasks
Getting System Help and Information
$ man sinfo
Configuring Your Environment with Modulefiles
Overview of Modules
Supplied Modulefiles
Modulefiles Automatically Loaded on the System
Viewing Available Modulefiles
Viewing Loaded Modulefiles
Loading a Modulefile
Automatically Loading a Modulefile at Login
Unloading a Modulefile
Modulefile Conflicts
Loading a Modulefile for the Current Session
Creating a Modulefile
Viewing Modulefile-Specific Help
$ module load modules
$ man modulefile
$ module help totalview
Compilers
Developing Applications
Application Development Environment Overview
Examining Nodes and Partitions Before Running Jobs
Interrupting a Job
MPI Compiler
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
Setting Debugging Options
Developing Serial Applications
Serial Application Build Environment
Building Serial Applications
Developing Parallel Applications
Parallel Application Build Environment
Modulefiles
OpenMP
Pthreads
Quadrics Shmem
MPI Library
Intel Fortran and C/C++ Compilers
Building Parallel Applications
Examples of Compiling and Linking HP-MPI Applications
Developing Libraries
Designing Libraries for the CP4000 Platform
To build a 32-bit or a 64-bit application, you might enter, respectively:
linkcommand -L/opt/mypackage/lib/i686 -lmystuff
linkcommand -L/opt/mypackage/lib/x86_64 -lmystuff
-ext "SLURM[slurm-arguments]"
Submitting Jobs
Overview of Job Submission
Submitting a Serial Job Using Standard LSF
Submitting a Serial Job Using LSF-HPC
Submitting a Serial Job with the LSF bsub Command
$ bsub hostname
Submitting a Serial Job Through Slurm Only
Submitting a Non-MPI Parallel Job
$ bsub -n4 -I srun hostname
bsub -n num-procs [bsub-options] mpi-job
mpirun [mpirun-options] -srun [srun-options] mpi-jobname
bsub -n num-procs [bsub-options] script-name
Submitting a Batch Job or Job Script
$ bsub -n4 -I mpirun -srun ./helloworld
$ cat myscript.sh
#!/bin/sh
srun hostname
mpirun -srun hellompi
$ bsub -I -n4 ./myscript.sh
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
$ cat ./envscript.sh
#!/bin/sh
name=`hostname`
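The listing above preserves only the first lines of envscript.sh. A plausible completion, assuming the script simply reports the node it runs on (the greeting wording is an assumption, not the manual's exact text):

```shell
#!/bin/sh
# Hypothetical completion of envscript.sh: report which node this
# script executes on. Only the name=hostname line is from the guide.
name=`hostname`
echo "Hello! I am running on node $name"
```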
Running Preexecution Programs
$ bsub -n4 -I ./myscript.sh
/opt/hptc/bin/srun mypreexec
Debugging Applications
Debugging Serial Applications
Debugging Parallel Applications
TotalView
Setting Up TotalView
Using TotalView with Slurm
SSH and TotalView
$ module load mpi
$ module load totalview
Debugging an Application
Using TotalView with LSF-HPC
Setting TotalView Preferences
Sourcefile initfdte.f was not found, using assembler mode
Debugging Running Applications
Directories in File ⇒ Search Path
$ mpirun -srun -n2 Psimple
$ squeue
$ scancel --user username
Exiting TotalView
Building a Program Intel Trace Collector and HP-MPI
Tuning Applications
Using the Intel Trace Collector and Intel Trace Analyzer
Running a Program Intel Trace Collector and HP-MPI
Visualizing Data Intel Trace Analyzer and HP-MPI
LIBS
CLDFLAGS
Using Slurm
Launching Jobs with the srun Command
srun, squeue, scancel, sinfo, scontrol
Introduction to Slurm
Monitoring Jobs with the squeue Command
Using the srun Command with HP-MPI
Using the srun Command with LSF-HPC
srun Roles and Modes
Job Accounting
Terminating Jobs with the scancel Command
Getting System Information with the sinfo Command
# chmod a+r /hptccluster/slurm/job/jobacct.log
Fault Tolerance
Security
Using LSF-HPC
Using LSF
Using Standard LSF on an HP XC System
Introduction to LSF-HPC in the HP XC Environment
Overview of LSF-HPC
Differences Between LSF-HPC and Standard LSF
Hostname
Resources
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
Unknown Unknown
Job Terminology
$ ssh n15 lshosts
HP XC Compute Node Resource Support
SLURM[nodelist=nodelist] (if specified)
$ bsub -n 10 -I srun hostname
$ bsub -n 10 -ext "SLURM[nodes=10]" -I srun hostname
$ bsub -n 10 -ext "SLURM[nodes=10;exclude=n16]" -I srun hostname
$ bsub -n 10 -ext "SLURM[constraint=dualcore]" -I srun hostname
How LSF-HPC and Slurm Launch and Manage a Job
$ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
Job Startup and Job Control
#!/bin/sh
hostname
srun hostname
mpirun -srun ./hellompi
Determining the LSF Execution Host
Determining Available LSF-HPC System Resources
Getting the Status of LSF-HPC
Getting Information About LSF Execution Host Node
Getting Host Load Information
Examining LSF-HPC System Queues
Getting Information About the lsf Partition
SLINUX64
$ sinfo -p lsf -lNe
Summary of the LSF bsub Command Format
$ sinfo -p lsf
LSF-SLURM External Scheduler
For information about running scripts
bsub -n num-procs -ext "SLURM[slurm-arguments]" \
  [bsub-options] srun [srun-options] jobname [job-options]
Submitting a Job from a Non-HP XC Host
Starting on lsfhost.localdomain n6
Waiting for dispatch ... Starting on lsfhost.localdomain n1
Type=SLINUX64
Getting Information About Jobs
Getting Job Allocation Information
slurm_id=slurm_jobid;ncpus=slurm_nprocs;slurm_alloc=nodelist
$ bjobs -l
Examining the Status of a Job
$ bhist -l
Time stamp
$ bjobs
Viewing the Historical Information for a Job
$ bhist
Summary of time in seconds spent in various states
JOBID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
Translating Slurm and LSF-HPC JOBIDs
Working Interactively Within an LSF-HPC Allocation
$ bsub -I -n4 -ext "SLURM[nodes=4]" /bin/bash
$ bjobs -l 124 | grep slurm
$ srun --jobid=150 hostname
$ unset SLURM_JOBID
Alternatively, you can use the following
$ export SLURM_JOBID=150
$ export SLURM_NPROCS=4
$ unset SLURM_JOBID
$ unset SLURM_NPROCS
Job <125> is submitted to default queue <normal>.
LSF-HPC Equivalents of Slurm srun Options
$ srun --jobid=250 uptime
$ bsub -n4 -ext "SLURM[nodes=4]" -o %J.out sleep
bsub -i input_file
--mpi=mpitype
--quit-on-interrupt
Advanced Topics
Enabling Remote Execution with OpenSSH
Running an X Terminal Session from a Remote Node
Determining IP Address of Your Local Machine
Logging in to HP XC System
Running an X terminal Session Using Slurm
Running an X terminal Session Using LSF-HPC
Using the GNU Parallel Make Capability
$ bsub -n4 -Ip srun -n1 xterm -display
$ srun -n4 hostname
n46
$ srun -n2 hostname
n46
cd subdir; srun -n1 -N1 $(MAKE) -j4
Example Procedure
$ make PREFIX='srun -n1 -N1' MAKEJ=-j4
Local Disks on Compute Nodes
Performance Considerations
The modified Makefile is invoked as follows:
$ make PREFIX='srun -n1 -N1' MAKEJ=-j4
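The PREFIX mechanism can be exercised without an XC system: each recipe line is prefixed with $(PREFIX), so PREFIX='srun -n1 -N1' fans build steps out to allocated nodes, while an empty PREFIX runs them locally. A minimal sketch (Makefile.demo and its target are illustrative, not from the guide):

```shell
# Write a one-target Makefile whose recipe honors $(PREFIX); the \t in
# the printf format supplies the tab that make's recipe syntax requires.
printf 'all:\n\t$(PREFIX) echo building all\n' > Makefile.demo

# Local run: empty PREFIX, so the recipe runs on this host.
make -f Makefile.demo PREFIX=

# On an HP XC system one would instead run:
#   make -f Makefile.demo PREFIX='srun -n1 -N1'
```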
Communication Between Nodes
Shared File View
Private File View
fp = fopen("myfile", "a+");
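The fopen call above opens the file in append mode, which is what makes the shared and private file views behave differently. A shell sketch of the same pattern (myfile is the guide's illustrative name):

```shell
# ">>" appends, as fopen("myfile", "a+") does: repeated or concurrent
# writers add records instead of truncating each other's output. Under
# a shared file view all processes append to one file; under a private
# view each node appends to its own local copy.
echo "record one" >> myfile
echo "record two" >> myfile
```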
Appendix A: Examples
Building and Running a Serial Application
Launching a Serial Interactive Shell Through LSF-HPC
Examine the LSF execution host information
Running LSF-HPC Jobs with a Slurm Allocation Request
Launching a Parallel Interactive Shell Through LSF-HPC
Example 2. Four cores on Two Specific Nodes
r15s r1m r15m it tmp swp mem loadSched loadStop
SLURM[nodes=2]
124 Lsfad
Examine the running jobs information
$ hostname
n16
$ srun hostname
n5
$ bjobs
Submitting a Simple Job Script with LSF-HPC
Examine the finished jobs information
Show the environment
Display the script
Run some commands from the pseudo-terminal
Submitting an Interactive Job with LSF-HPC
Submit the job
Show the job allocation
Submitting an HP-MPI Job with LSF-HPC
Exit the pseudo-terminal
View the interactive jobs
View the finished jobs
lsfhost.localdomain
View the running job
View the finished job
$ bsub -n 8 -R "ALPHA5 SLINUX64" \
  -ext "SLURM[nodes=4-4]" myjob
Using a Resource Requirements String in an LSF-HPC Command
States by date and time
Glossary
FCFS: first-come, first-served.
global storage
LVS (Linux Virtual Server): receives login requests and directs them to a node with a login role.
load file
LSF master host
NIS (Network Information Services)
PXE booting: configured at the BIOS level.
Slurm backup
SMP (symmetric multiprocessing)
ssh
Index
PGI
Slurm commands
utilities