
BLCR

Checkpointing

Berkeley Lab Checkpoint/Restart (BLCR), a kernel-level checkpointing system, is being tested on a development machine against both the current Linux version and the updated version that is in preparation.

BLCR has the following advantages over other checkpointing schemes:

  • Can work on unmodified binaries
  • Can work in parallel environments (e.g. MPI)

Information on how to use BLCR will be provided once it has reached production status.

Checkpointing is the process of telling a program or job to save information about its progress to disk, so that it can be restarted from that point should the machine on which it is running fail. It is good practice to use checkpointing (and its associated technique of restart) for long-running jobs. For jobs that need more than one week, the maximum queue time allowed, it is the only way to run them to completion.

Updated example scripts for Sun Grid Engine will be provided. Please email support@wrg.york.ac.uk if you have questions.
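To illustrate the save-and-restart idea, here is a minimal sketch in Python of application-level checkpointing. It is not how BLCR works: BLCR checkpoints a whole process from the kernel without changes to the program, whereas in this sketch the program saves its own state. The function names, the JSON state layout, and the `stop_after` parameter (used only to simulate an interruption) are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state to disk, replacing any earlier checkpoint."""
    # Write to a temporary file in the same directory and then rename it,
    # so an interruption during the write does not corrupt the previous
    # checkpoint.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"next_step": 0, "total": 0}
    with open(path) as handle:
        return json.load(handle)


def run(path, steps, checkpoint_every=100, stop_after=None):
    """Sum the integers 0..steps-1, resuming from the last checkpoint.

    stop_after is a hypothetical knob that simulates a failure: the
    function returns early at that step without saving its latest state.
    """
    state = load_checkpoint(path)
    for step in range(state["next_step"], steps):
        if stop_after is not None and step >= stop_after:
            return state  # simulated interruption: unsaved work is lost
        state["total"] += step
        state["next_step"] = step + 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Calling `run("progress.json", 1000, stop_after=250)` and then `run("progress.json", 1000)` resumes from the last saved checkpoint rather than from the beginning. Only the work done since that checkpoint is repeated, which is the trade-off controlled by `checkpoint_every`.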

