5.2 Submitting a Serial Job Using LSF

49

5.2.1 Submitting a Serial Job with the LSF bsub Command

49

5.2.2 Submitting a Serial Job Through SLURM Only

50

5.3 Submitting a Parallel Job

51

5.3.1 Submitting a Non-MPI Parallel Job

51

5.3.2 Submitting a Parallel Job That Uses the HP-MPI Message Passing Interface

52

5.3.3 Submitting a Parallel Job Using the SLURM External Scheduler

53

5.4 Submitting a Batch Job or Job Script

56

5.5 Submitting Multiple MPI Jobs Across the Same Set of Nodes

58

5.5.1 Using a Script to Submit Multiple Jobs

58

5.5.2 Using a Makefile to Submit Multiple Jobs

58

5.6 Submitting a Job from a Host Other Than an HP XC Host

61

5.7 Running Preexecution Programs

61

6 Debugging Applications

63

6.1 Debugging Serial Applications

63

6.2 Debugging Parallel Applications

63

6.2.1 Debugging with TotalView

64

6.2.1.1 SSH and TotalView

64

6.2.1.2 Setting Up TotalView

64

6.2.1.3 Using TotalView with SLURM

65

6.2.1.4 Using TotalView with LSF

65

6.2.1.5 Setting TotalView Preferences

65

6.2.1.6 Debugging an Application

66

6.2.1.7 Debugging Running Applications

67

6.2.1.8 Exiting TotalView

67

7 Monitoring Node Activity

69

7.1 The Xtools Utilities

69

7.2 Running Performance Health Tests

70

8 Tuning Applications

75

8.1 Using the Intel Trace Collector and Intel Trace Analyzer

75

8.1.1 Building a Program – Intel Trace Collector and HP-MPI

75

8.1.2 Running a Program – Intel Trace Collector and HP-MPI

76

8.2 The Intel Trace Collector and Analyzer with HP-MPI on HP XC

77

8.2.1 Installation Kit

77

8.2.2 HP-MPI and the Intel Trace Collector

77

8.3 Visualizing Data – Intel Trace Analyzer and HP-MPI

79

9 Using SLURM

81

9.1 Introduction to SLURM

81

9.2 SLURM Utilities

81

9.3 Launching Jobs with the srun Command

81

9.3.1 The srun Roles and Modes

82

9.3.1.1 The srun Roles

82

9.3.1.2 The srun Modes

82

9.3.2 Using the srun Command with HP-MPI

82

9.3.3 Using the srun Command with LSF

82

9.4 Monitoring Jobs with the squeue Command

82

9.5 Terminating Jobs with the scancel Command

83

9.6 Getting System Information with the sinfo Command

83
