Running Preexecution Programs...............................................................................................................

51

6 Debugging Applications

 

Debugging Serial Applications...............................................................................................................

53

Debugging Parallel Applications.............................................................................................................

53

Debugging with TotalView................................................................................................................

53

SSH and TotalView.....................................................................................................................

54

Setting Up TotalView...................................................................................................................

54

Using TotalView with SLURM........................................................................................................

54

Using TotalView with LSF-HPC......................................................................................................

55

Setting TotalView Preferences.......................................................................................................

55

Debugging an Application..........................................................................................................

55

Debugging Running Applications..................................................................................................

56

Exiting TotalView........................................................................................................................

57

7 Tuning Applications

 

Using the Intel Trace Collector and Intel Trace Analyzer..............................................................................

59

Building a Program – Intel Trace Collector and HP-MPI........................................................................

59

Running a Program – Intel Trace Collector and HP-MPI..........................................................................

60

Visualizing Data – Intel Trace Analyzer and HP-MPI..............................................................................

60

8 Using SLURM

 

Introduction to SLURM............................................................................................................................

63

SLURM Utilities.....................................................................................................................................

63

Launching Jobs with the srun Command...................................................................................................

63

The srun Roles and Modes................................................................................................................

64

The srun Roles............................................................................................................................

64

The srun Modes..........................................................................................................................

64

Using the srun Command with HP-MPI................................................................................................

64

Using the srun Command with LSF-HPC...............................................................................................

64

Monitoring Jobs with the squeue Command..............................................................................................

64

Terminating Jobs with the scancel Command............................................................................................

65

Getting System Information with the sinfo Command..................................................................................

65

Job Accounting.....................................................................................................................................

65

Fault Tolerance.....................................................................................................................................

66

Security...............................................................................................................................................

66

9 Using LSF

 

Using Standard LSF on an HP XC System.................................................................................................

67

Using LSF-HPC......................................................................................................................................

67

Introduction to LSF-HPC in the HP XC Environment................................................................................

68

Overview of LSF-HPC..................................................................................................................

68

Differences Between LSF-HPC and Standard LSF..............................................................................

69

Job Terminology.........................................................................................................................

70

HP XC Compute Node Resource Support........................................................................................

71

Notes on LSF-HPC.......................................................................................................................

72

How LSF-HPC and SLURM Launch and Manage a Job.....................................................................

73

Notes About Using LSF-HPC in the HP XC Environment....................................................................

74

Job Startup and Job Control....................................................................................................

74

Preemption...........................................................................................................................

75

Determining the LSF Execution Host....................................................................................................

75

Determining Available LSF-HPC System Resources.................................................................................

75

Getting the Status of LSF-HPC.......................................................................................................

75

Getting Information About LSF Execution Host Node.......................................................................

75

Getting Host Load Information......................................................................................................

76

Examining LSF-HPC System Queues...............................................................................................

76
