Berkley kernel-level checkpointing is being tested on a development machine for both the current Linux version and the future, updated version that is in preparation.
BLCR has the advantages over other checkpoint schemes:
Information on how to use BLCR will be provided when BLCR has gone to production status.
Checkpointing is the process of telling a program or job to save information about its progress to disk, so that it can be restarted should the machine on which it is running fail. It is good practice to use checkpoint (and its associated technique of restart) for long running jobs. For jobs that last more than one week (the maximum queue time allowed) it is the only way to run jobs. Updated example scripts for Sun Grid Engine will be provided, and please email firstname.lastname@example.org if you have questions.
Please refer to the legal disclaimer covering content on this site.