
When a job is checkpointed, the checkpoint information is stored in checkpoint_dir/job_ID/file_name. Multiple jobs can checkpoint into the same directory. The system can create multiple files.
The checkpoint directory is used for restarting the job (see brestart(1)). The checkpoint directory can be any valid path.
Optionally, specifies a checkpoint period in minutes. Specify a positive integer. The running job is checkpointed automatically every checkpoint period. The checkpoint period can be changed using bchkpnt(1). Because checkpointing is a heavyweight operation, you should choose a checkpoint period greater than half an hour.
Optionally, specifies an initial checkpoint period in minutes. Specify a positive integer. The first checkpoint does not happen until the initial period has elapsed. After the first checkpoint, the job checkpoint frequency is controlled by the normal job checkpoint interval.
Optionally, specifies a custom checkpoint and restart method to use with the job. Use method=default to indicate to use the default LSF checkpoint and restart programs for the job, echkpnt.default and erestart.default.
The echkpnt.method_name and erestart.method_name programs must be in LSF_SERVERDIR or in the directory specified by LSB_ECHKPNT_METHOD_DIR (environment variable or set in lsf.conf).
If a custom checkpoint and restart method is already specified with
LSB_ECHKPNT_METHOD (environment variable or in lsf.conf), the method you specify with bsub
Process checkpointing is not available on all host types, and may require linking programs with a special libraries (see libckpt.a(3)). LSF invokes echkpnt (see echkpnt(8)) found in LSF_SERVERDIR to checkpoint the job. You can override the default echkpnt for the job by defining as environment variables or in lsf.conf LSB_ECHKPNT_METHOD and LSB_ECHKPNT_METHOD_DIR to point to your own echkpnt. This allows you to use other checkpointing facilities, including
The checkpoint method directory should be accessible by all users who need to run the custom echkpnt and erestart programs.
Only running members of a chunk job can be checkpointed.
Login shell is not supported on Windows.
By default, the limit is specified in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for the limit (MB, GB, TB, PB, or EB).
Platform LSF Command Reference 191