9.7 Job Accounting .... 84
9.8 Fault Tolerance .... 84
9.9 Security .... 84

10 Using LSF .... 85
10.1 Information for LSF .... 85
10.2 Overview of LSF Integrated with SLURM .... 86
10.3 Differences Between LSF and LSF Integrated with SLURM .... 88
10.4 Job Terminology .... 89
10.5 Using LSF Integrated with SLURM in the HP XC Environment .... 91
10.5.1 Useful Commands .... 91
10.5.2 Job Startup and Job Control .... 91
10.5.3 Preemption .... 91
10.6 Submitting Jobs .... 91
10.7 LSF-SLURM External Scheduler .... 92
10.8 How LSF and SLURM Launch and Manage a Job .... 92
10.9 Determining the LSF Execution Host .... 94
10.10 Determining Available System Resources .... 94
10.10.1 Examining System Core Status .... 95
10.10.2 Getting Information About the LSF Execution Host Node .... 95
10.10.3 Getting Host Load Information .... 96
10.10.4 Examining System Queues .... 96
10.10.5 Getting Information About the lsf Partition .... 96
10.11 Getting Information About Jobs .... 96
10.11.1 Getting Job Allocation Information .... 97
10.11.2 Examining the Status of a Job .... 98
10.11.3 Viewing the Historical Information for a Job .... 99
10.12 Translating SLURM and LSF JOBIDs .... 100
10.13 Working Interactively Within an Allocation .... 101
10.14 LSF Equivalents of SLURM srun Options .... 103

11 Advanced Topics .... 107
11.1 Enabling Remote Execution with OpenSSH .... 107
11.2 Running an X Terminal Session from a Remote Node .... 107
11.3 Using the GNU Parallel Make Capability .... 109
11.3.1 Example Procedure 1 .... 111
11.3.2 Example Procedure 2 .... 111
11.3.3 Example Procedure 3 .... 112
11.4 Local Disks on Compute Nodes .... 112
11.5 I/O Performance Considerations .... 113
11.5.1 Shared File View .... 113
11.5.2 Private File View .... 113
11.6 Communication Between Nodes .... 113
11.7 Using MPICH on the HP XC System .... 113
11.7.1 Using MPICH with SLURM Allocation .... 114
11.7.2 Using MPICH with LSF Allocation .... 114

A Examples .... 115
A.1 Building and Running a Serial Application .... 115
A.2 Launching a Serial Interactive Shell Through LSF .... 115
A.3 Running LSF Jobs with a SLURM Allocation Request .... 116
A.3.1 Example 1. Two Cores on Any Two Nodes .... 116
A.3.2 Example 2. Four Cores on Two Specific Nodes .... 117

6 Table of Contents

HP XC System 4.x Software manual