examples of, 99

programming model, 39

shared file view, 97

signal

sending to a job, 65

Simple Linux Utility for Resource Management (see SLURM)

sinfo command, 65, 76

SLURM, 63

fault tolerance, 66

interaction with LSF-HPC, 73

job accounting, 65

lsf partition, 75

security model, 66

SLURM_JOBID environment variable, 80, 83

SLURM_NPROCS environment variable, 80

submitting a serial job, 47

utilities, 63

SLURM commands

sacct, 65

scancel, 65

sinfo, 65, 76

squeue, 64

srun, 38, 63

squeue command, 64

srun, 63

used with HP-MPI, 64

used with LSF-HPC, 64

srun command, 63

interrupting jobs, 38

LSF-HPC equivalents, 86

ssh, 27

ssh_create_shared_keys command, 27, 91

submitting a batch job, 49

submitting a job, 48

overview, 45

submitting a job script, 49

submitting a parallel job, 48

submitting a serial job, 46

through SLURM, 47

with the LSF bsub command, 46

submitting an HP-MPI job, 48

system, 27

architecture, 19

developing applications on, 37

information, 65

logging in, 27

overview, 19

T

TotalView, 53

debugging an application, 55

exiting, 57

setting preferences, 55

setting up, 54

tuning applications, 59

U

user environment, 31

V

VampirTrace/Vampir, 59

X

xterm

running from remote node, 91
