HP XC System 3.x Software Launching a Parallel Interactive Shell Through LSF-HPC, SLURMnodes=2

Page 101

SCHEDULING PARAMETERS:

 

 

 

 

 

 

 

 

 

r15s r1m r15m

ut

pg

io

ls

it tmp swp mem

loadSched

-

-

-

-

-

-

-

-

-

-

-

loadStop

-

-

-

-

-

-

-

-

-

-

-

EXTERNAL MESSAGES:

 

 

 

 

 

 

 

 

 

MSG_ID

FROM

POST_TIME

MESSAGE

ATTACHMENT

0

-

-

-

-

1

lsfadmin

date and time

SLURM[nodes=2]

N

Example 2. Four cores on Two Specific Nodes

This example submits a job that requests four cores on two specific nodes, on an XC system that has three compute nodes.

Submit the job:

$ bsub -I -n4 -ext "SLURM[nodelist=n[14,16]]" srun hostname Job <9> is submitted to default queue <normal>.

<<Waiting for dispatch ...>> <<Starting on lsfhost.localdomain>> n14

n14

n16

n16

View the job:

$ bjobs -l 9

Job <9>, User <smith>, Project <default>, Status <DONE>,

Queue <normal>, Interactive mode, Extsched <SLURM[nodelist=n[14,16]]>, Command <srun hostname>

date and time stamp: Submitted from host <lsfhost.localdomain>, CWD <$HOME>, 4 Processors Requested;

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=24;ncpus=4;<SLURM[nodelist=n[14,16]]>;

date and time stamp: Done successfully. The CPU time used is 0.0 seconds.

SCHEDULING PARAMETERS:

 

 

 

 

 

 

 

 

 

r15s r1m r15m

ut

pg

io

ls

it tmp swp mem

loadSched

-

-

-

-

-

-

-

-

-

-

-

loadStop

-

-

-

-

-

-

-

-

-

-

-

EXTERNAL

MESSAGES:

 

 

 

MSG_ID

FROM

POST_TIME

MESSAGE

ATTACHMENT

0

-

-

-

-

1

lsfadmin

date and time

SLURM[nodelist=n[14,16]]

N

Launching a Parallel Interactive Shell Through LSF-HPC

This section provides an example that shows how to launch a parallel interactive shell through LSF-HPC. The bsub -Iscommand is used to launch an interactive shell through LSF-HPC. This example steps through a series of commands that illustrate what occurs when you launch an interactive shell.

Examine the LSF execution host information:

$ bhosts

 

 

 

 

 

 

 

 

HOST_NAME

STATUS

JL/U

MAX

NJOBS RUN SSUSP

USUSP

RSV

lsfhost.localdomain

ok

-

12

0

0

0

0

0

Submit the job: 101

Image 101
Contents HP XC System Software Users Guide Page Table of Contents Submitting Jobs Configuring Your Environment with ModulefilesDeveloping Applications Using Slurm Tuning ApplicationsUsing LSF Debugging ApplicationsGlossary 109 Index 115 Advanced TopicsExamples List of Figures Page Determining the Node Platform List of TablesPage Submitting a Job Script List of ExamplesPage Intended Audience About This DocumentDocument Organization This document is organized as followsHP XC Information Supplementary Information $ man lsfcommandnameFor More Information Manpages Related Information$ man discover $ man 8 discover $ man -k keywordRelated MPI Web Sites Related Linux Web SitesRelated Compiler Web Sites Additional PublicationsHP Encourages Your Comments Typographic ConventionsEnvironment Variable User inputSystem Architecture Overview of the User EnvironmentHP XC System Software Operating SystemStorage and I/O Node SpecializationSAN Storage File SystemLocal Storage File System LayoutNetwork Address Translation NAT Determining System Configuration InformationSystem Interconnect Network Modules CommandsUser Environment Run-Time Environment Application Development EnvironmentParallel Applications Serial ApplicationsHow LSF-HPC and Slurm Interact Load Sharing Facility LSF-HPCStandard LSF Components, Tools, Compilers, Libraries, and Debuggers Mpirun commandLVS Login Routing Using the SystemUsing the Secure Shell to Log Logging In to the SystemGetting Information About Resources IntroductionGetting Information About Queues Performing Other Common User Tasks $ man sinfo Getting System Help and InformationOverview of Modules Configuring Your Environment with ModulefilesSupplied Modulefiles Viewing Available Modulefiles Modulefiles Automatically Loaded on the SystemViewing Loaded Modulefiles Loading a ModulefileUnloading a Modulefile Automatically Loading a Modulefile at LoginModulefile Conflicts Loading a Modulefile for the Current SessionViewing Modulefile-Specific Help Creating a Modulefile$ module load modules $ man modulefile $ module help totalviewPage Compilers Developing ApplicationsApplication Development Environment Overview Interrupting a Job Examining Nodes and Partitions Before Running JobsMPI Compiler Partition Avail Timelimit Nodes State NodelistDeveloping Serial Applications Setting Debugging OptionsSerial Application Build Environment Building Serial ApplicationsParallel Application Build Environment Developing Parallel ApplicationsModulefiles OpenMPQuadrics Shmem PthreadsMPI Library Intel Fortran and C/C++CompilersBuilding Parallel Applications Examples of Compiling and Linking HP-MPI Applications Developing LibrariesDesigning Libraries for the CP4000 Platform To build a 64-bit application, you might enter Linkcommand 32-bit -L/opt/mypackage/lib/i686 -lmystuffLinkcommand 64-bit -L/opt/mypackage/lib/x8664 -lmystuff ExtSLURMslurm-arguments Submitting JobsOverview of Job Submission Submitting a Serial Job Using LSF-HPC Submitting a Serial Job Using Standard LSFSubmitting a Serial Job with the LSF bsub Command $ bsub hostnameSubmitting a Serial Job Through Slurm only $ bsub -n4 -I srun hostname Submitting a Non-MPI Parallel JobBsub -nnum-procsbsub-optionsmpijob Mpirun mpirun--options-srunsrun-optionsmpi-jobnameBsub -nnum-procs bsub-optionsscript-name Submitting a Batch Job or Job Script$ bsub -n4 -I mpirun -srun ./helloworld Srun hostname mpirun -srun hellompi $ cat myscript.sh #!/bin/sh$ bsub -I -n4 Myscript.sh $ bsub -n4 -ext SLURMnodes=4 -I ./myscript.sh$ cat ./envscript.sh #!/bin/sh name=`hostname` Running Preexecution Programs$ bsub -n4 -I ./myscript.sh Opt/hptc/bin/srun Mypreexec Debugging Serial Applications Debugging ApplicationsDebugging Parallel Applications TotalViewUsing TotalView with Slurm Setting Up TotalViewSSH and TotalView Module load mpimodule load totalviewDebugging an Application Using TotalView with LSF-HPCSetting TotalView Preferences Debugging Running Applications Sourcefile initfdte.f was not found, using assembler modeDirectories in File ⇒ Search Path $ mpirun -srun -n2 Psimple$ squeue $ scancel --user usernameExiting TotalView Page Building a Program Intel Trace Collector and HP-MPI Tuning ApplicationsUsing the Intel Trace Collector and Intel Trace Analyzer Visualizing Data Intel Trace Analyzer and HP-MPI Running a Program Intel Trace Collector and HP-MPILibs CldflagsUsing the Intel Trace Collector and Intel Trace Analyzer Page Launching Jobs with the srun Command Using SlurmSrun Squeue Scancel Sinfo Scontrol Introduction to SlurmUsing the srun Command with HP-MPI Monitoring Jobs with the squeue CommandUsing the srun Command with LSF-HPC Srun Roles and ModesJob Accounting Terminating Jobs with the scancel CommandGetting System Information with the sinfo Command # chmod a+r /hptccluster/slurm/job/jobacct.log Fault ToleranceSecurity Using LSF-HPC Using LSFUsing Standard LSF on an HP XC System Overview of LSF-HPC Introduction to LSF-HPC in the HP XC EnvironmentHostname Differences Between LSF-HPC and Standard LSFResources Hostname Status JL/U MAX Njobs RUN Ssusp Ususp RSVUnknown Unknown Job Terminology$ ssh n15 lshosts SLURMnodelist =nodelist if specified HP XCCompute Node Resource Support$ bsub -n 10 -ext SLURMnodes=10 -I srun hostname $ bsub -n 10 -I srun hostname$ bsub -n 10 -ext SLURMnodes=10exclude=n16 -I srun hostname $ bsub -n 10 -ext SLURMconstraint=dualcore -I srun hostname$ bsub -n4 -ext SLURMnodes=4 -o output.out ./myscript How LSF-HPC and Slurm Launch and Manage a Job#!/bin/sh hostname srun hostname Mpirun -srun ./hellompi Job Startup and Job ControlDetermining Available LSF-HPC System Resources Determining the LSF Execution HostGetting the Status of LSF-HPC Getting Information About LSF Execution Host NodeExamining LSF-HPC System Queues Getting Host Load InformationGetting Information About the lsf Partition SLINUX6$ sinfo -p lsf -lNe Summary of the LSF bsub Command Format$ sinfo -p lsf For information about running scripts LSF-SLURM External SchedulerBsub -n num-procs-ext SLURMslurm-arguments \ Bsub-options srun srun-optionsjobname job-optionsStarting on lsfhost.localdomain n6 Submitting a Job from a Non-HP XC HostWaiting for dispatch ... Starting on lsfhost.localdomain n1 Type=SLINUX64Getting Job Allocation Information Getting Information About JobsSlurmid=slurmjobidncpus=slurmnprocsslurmalloc=nodelist $ bjobs -l$ bhist -l Examining the Status of a JobTime stamp $ bjobs$ bhist Viewing the Historical Information for a JobSummary of time in seconds spent Various States Jobid User Jobname Pend Psusp RUN Ususp Ssusp Unkwn TotalTranslating Slurm and LSF-HPC JOBIDs $ bsub -I -n4 -ext SLURMnodes=4 /bin/bash Working Interactively Within an LSF-HPC Allocation$ bjobs -l 124 grep slurm $ srun --jobid=150 hostnameAlternatively, you can use the following $ unset Slurmjobid$ export SLURMJOBID=150 $ export SLURMNPROCS=4 $ unset Slurmjobid $ unset SlurmnprocsLSF-HPC Equivalents of Slurm srun Options Job 125 is submitted to the default queue normal$ srun --jobid=250 uptime $ bsub -n4 -ext SLURMnodes=4 -o %J.out sleepBsub -iinputfile Mpi=mpitype Quit-on-interrupt Page Enabling Remote Execution with OpenSSH Advanced TopicsRunning an X Terminal Session from a Remote Node Determining IP Address of Your Local MachineLogging in to HP XC System Running an X terminal Session Using SlurmRunning an X terminal Session Using LSF-HPC $ bsub -n4 -Ip srun -n1 xterm -display Using the GNU Parallel Make Capability$ srun -n4 hostname n46 $ srun -n2 hostname n46$ cd subdir srun -n1 -N1 $MAKE -j4 $ make PREFIX=’srun -n1 -N1 MAKEJ=-j4 Example ProcedurePerformance Considerations Local Disks on Compute NodesModified Makefile is invoked as follows $ make PREFIX=srun -n1 -N1 MAKEJ=-j4Shared File View Communication Between NodesPrivate File View Fp = fopen myfile, a+Page Building and Running a Serial Application Appendix a ExamplesLaunching a Serial Interactive Shell Through LSF-HPC Examine the LSF execution host informationRunning LSF-HPC Jobs with a Slurm Allocation Request Example 2. Four cores on Two Specific Nodes Launching a Parallel Interactive Shell Through LSF-HPCR15s r1m r15m It tmp swp mem LoadSched LoadStop SLURMnodes=2124 Lsfad Examine the the running jobs information$ hostname n16 $ srun hostname n5 $ bjobs Examine the the finished jobs information Submitting a Simple Job Script with LSF-HPCShow the environment Display the scriptSubmitting an Interactive Job with LSF-HPC Run some commands from the pseudo-terminalSubmit the job Show the job allocationExit the pseudo-terminal Submitting an HP-MPI Job with LSF-HPCView the interactive jobs View the finished jobsLsfhost.localdomai View the running jobView the finished job $ bsub -n 8 -R ALPHA5 SLINUX64 \ -ext SLURMnodes=4-4 myjob Using a Resource Requirements String in an LSF-HPC CommandStates by date and time 108 Glossary First-come See Fcfs First-served Global storage To the queueAs local storage Are not appropriate for replicationLogin requests and directs them to a node with a login role Single commandLinux Virtual See LVS Server Load file LSF master hostRemotely. PXE booting is configured at the Bios level Network See NIS Information ServicesNotably to install and remove software packages Slurm backupSymmetric See SMP Multiprocessing Power available per unit of spaceSsh 114 Index Index PGI Utilities, 63 Slurm commands