
1 This line attempts to submit a program that does not exist.
The following command line makes the program and executes it:
$ bsub
Job <117> is submitted to default queue <normal>.
The output file contains error messages related to the attempt to launch the nonexistent program.
$ cat 117.out
.
.
.
mpirun
slurmstepd: [19.3]: error: execve(): ./ping_bogus: No such file or directory
slurmstepd: [19.3]: error: execve(): ./ping_bogus: No such file or directory
srun: error: n14: task0: Exited with exit code 2
srun: Terminating job
slurmstepd: [19.3]: error: execve(): ./ping_bogus: No such file or directory
slurmstepd: [19.3]: error: execve(): ./ping_bogus: No such file or directory
make: *** [run3] Error 2
make: *** Waiting for unfinished jobs....
[0:n14]
100000 bytes: 99.06 usec/msg
100000 bytes: 1009.51 MB/sec
[0:n14]
100000 bytes: 99.76 usec/msg
100000 bytes: 1002.43 MB/sec
[1:n14]
100000 bytes: 1516.83 usec/msg
100000 bytes: 65.93 MB/sec
[1:n14]
100000 bytes: 1519.73 usec/msg
100000 bytes: 65.80 MB/sec
[2:n15]
100000 bytes: 108.65 usec/msg
100000 bytes: 920.38 MB/sec
[2:n15]
100000 bytes: 99.44 usec/msg
100000 bytes: 1005.65 MB/sec
[3:n15]
100000 bytes: 1877.35 usec/msg
100000 bytes: 53.27 MB/sec
[3:n15]
100000 bytes: 1888.22 usec/msg
100000 bytes: 52.96 MB/sec
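When an output file is long, the launch failures can be hard to pick out from the successful steps. The following is a minimal sketch, not part of the product, of a shell helper that summarizes them. The function name missing_programs is hypothetical, and the sketch assumes the error lines have the "error: execve(): PATH: No such file or directory" form shown in the output above.

```shell
# Hypothetical helper: for each executable that slurmstepd could not launch,
# print how many times the execve() error appears in the given output file.
missing_programs() {
    awk '/error: execve\(\):/ {
             # Keep only the path, which sits between "execve(): " and ": ".
             sub(/.*execve\(\): /, "")
             sub(/: .*/, "")
             count[$0]++
         }
         END { for (p in count) print count[p], p }' "$1"
}
```

For the job in this example, the helper could be called as missing_programs 117.out. Each line of its output is a count followed by the path that could not be executed.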
The sacct command, which displays SLURM accounting information, reflects the error:
[lsfadmin@n16 ~]$ sacct
Jobstep  Jobname      Partition  Ncpus  Status     Error
19       hptclsf@117  lsf        8      CANCELLED  2
19.0     hptclsf@117  lsf        0      FAILED     2
19.1     hptclsf@117  lsf        8      COMPLETED  0
19.2     hptclsf@117  lsf        8      COMPLETED  0
19.3     hptclsf@117  lsf        8      FAILED     2
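The accounting listing can also be filtered so that only the unsuccessful job steps remain. The following is a rough sketch under stated assumptions: the function name failed_steps is hypothetical, and the filter assumes the column layout shown above, with the status in the next-to-last column and a numeric error code in the last column. Other sacct versions or format options may arrange the columns differently.

```shell
# Hypothetical helper: read sacct-style rows on standard input and print the
# job step, status, and error code of every row whose error code is non-zero.
failed_steps() {
    # Drop any "|" column separators, then test the last field of each row.
    # Header and separator rows are skipped because their last field is not a number.
    tr -d '|' | awk '$NF ~ /^[0-9]+$/ && $NF != 0 { print $1, $(NF-1), $NF }'
}
```

With the listing above as input (for example, sacct | failed_steps), the filter would report steps 19, 19.0, and 19.3 and omit the two steps that completed.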
60 Submitting Jobs