$ sacct
Jobstep      Jobname      Partition  Ncpus  Status      Error
------------ ------------ ---------- ------ ----------- ------
123          hptclsf@99   lsf        8      CANCELLED   0
123.0        hptclsf@99   lsf        0      COMPLETED   0
The status of a completed job handled by LSF-HPC appears as CANCELLED because of the way LSF-HPC runs the job through SLURM. For every job, LSF-HPC performs the following steps (a rough command-level sketch follows the list):
•Creates the allocation in SLURM.
•Submits the user job to SLURM.
•Waits for the user job to finish.
•Cancels the allocation in SLURM.
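As a rough sketch only, the sequence resembles the following SLURM commands. These are not the commands LSF-HPC actually issues; the salloc option spelling and the script name ./userjob.sh are assumptions and may differ on a given SLURM release.
# Rough illustration of the LSF-HPC job lifecycle; not literal LSF-HPC commands.
$ salloc --no-shell -N 4            # 1. create an allocation in SLURM
$ srun --jobid=123 ./userjob.sh     # 2. submit the user job into that allocation
                                    # 3. srun blocks until the user job finishes
$ scancel 123                       # 4. cancel the allocation; sacct then records CANCELLED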
The example shown above has two entries for this completed job. There are always at least two entries; the total number depends on the construction of the user job:
•The first entry represents the allocation created by LSF-HPC; its status is CANCELLED because LSF-HPC cancels the allocation after the user job finishes.
•The second entry, SLURM job step 0, represents the user job that LSF-HPC submitted to SLURM; its status (COMPLETED here) reflects how the user job itself ended.
•Further entries represent srun or mpirun commands launched by the user job; each one is recorded as an additional SLURM job step, as in the sketch below.
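For instance, a user job script along the following lines (the script name and contents are hypothetical, not taken from the example above) would add one job step to the accounting record, such as 123.1, for its single srun:
#!/bin/sh
# Hypothetical user job script submitted through bsub.
hostname          # runs on the first allocated node only; adds no job step
srun hostname     # launched across the allocation; recorded as an extra job step (for example, 123.1)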
Working Interactively Within an LSF-HPC Allocation
The best way to work interactively on HP XC is to separate the allocation from the interactive work. In one terminal, submit your allocation request to LSF-HPC:
$ bsub -I -n4 -ext "SLURM[nodes=4]" /bin/bash
Job <124> is submitted to the default queue <interactive>.
<<Waiting for dispatch...>>
<<Starting on lsfhost.localdomain>>
The bsub command requests 4 nodes and runs /bin/bash on the first allocated node. If resources are not immediately available, the terminal pauses at <<Waiting for dispatch...>>. When <<Starting on lsfhost.localdomain>> displays, the resources are allocated and /bin/bash is running.
To gather information about this allocation, run the following command in this first terminal (note there is no prompt from the /bin/bash process):
$ bjobs -l 124 | grep slurm
date and time stamp: slurm_id=150;ncpus=8;slurm_alloc=n[1-4];
This output shows that the SLURM JOBID of the allocation is 150, that 8 CPUs are allocated, and that the LSF allocated nodes are n[1-4].
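If you prefer to check the allocation from the SLURM side as well, a minimal query (assuming the standard SLURM squeue command is in your path) is:
$ squeue --jobs=150     # SLURM's view of the allocation that LSF-HPC created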
Begin your work in another terminal. Use ssh to log in to one of the compute nodes. If you want to run tasks in parallel, use the srun command with the --jobid option to run them within the allocation:
$ srun --jobid=150 hostname
n1
n2
n3
n4
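In this example, srun started one task on each of the four allocated nodes. To control the number of parallel tasks yourself, the standard srun -n option can be combined with --jobid (a sketch; the task count shown is arbitrary):
$ srun --jobid=150 -n 2 hostname    # run exactly two tasks inside the existing allocation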
You can simplify this by first setting the SLURM_JOBID environment variable to the SLURM JOBID of the allocation, as follows:
$ export SLURM_JOBID=150
$ srun hostname
n1
n2
n3
n4
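Because SLURM_JOBID now directs every srun in this shell at allocation 150, it is reasonable (a suggestion on our part, not a step from the procedure above) to unset the variable when you are done with the allocation:
$ unset SLURM_JOBID     # stop sending srun commands to the finished allocation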
Note