Adaptive Checkpointing for Master-Worker Style Parallelism
Cooperman, G.; Ansel, J.; Xiaoqin Ma
Cluster Computing, 2005. IEEE International
Sept. 2005 Page(s):1 - 2
Abstract:

We present a transparent, system-level checkpointing solution for master-worker parallelism that automatically adapts, upon restore, to the number of processor nodes available. We call this adaptive checkpointing. This is important, since nodes in a cluster fail. It also allows a computation to adapt to multiple cluster partitions as they become available. Checkpointing a master-worker computation has the additional advantage that only the master process needs to be checkpointed. This is both fast (0.05 s in our case) and more economical of disk space. We describe a system-level solution: the application writer does not declare what data structures to checkpoint. Furthermore, the solution is transparent: the application writer need not add code to request a checkpoint at appropriate locations. The system-level strategy avoids the labor-intensive and error-prone work of explicitly checkpointing the many data structures of a large program.
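
The idea that only the master needs to be checkpointed, and that the worker count can change on restore, can be illustrated with a small sketch. This is not the authors' mechanism: the paper describes a transparent, system-level checkpoint of the master process, whereas the code below saves the master's state explicitly at application level, purely to show why worker state need not be saved. The names `Master`, `square`, and `demo` are hypothetical, and the sketch assumes worker tasks are stateless and safe to re-run.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Hypothetical worker task: stateless, so a lost task can simply be re-run.
    return x * x


class Master:
    def __init__(self, tasks):
        self.pending = list(tasks)   # tasks not yet completed
        self.results = {}            # task -> result

    def checkpoint(self):
        # Only master state is saved; workers hold nothing that must survive.
        return pickle.dumps({"pending": self.pending, "results": self.results})

    @classmethod
    def restore(cls, blob):
        state = pickle.loads(blob)
        master = cls(state["pending"])
        master.results = state["results"]
        return master

    def run(self, num_workers, max_tasks=None):
        # Process up to max_tasks pending tasks on a pool of num_workers workers.
        batch = list(self.pending if max_tasks is None else self.pending[:max_tasks])
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            for task, result in zip(batch, pool.map(square, batch)):
                self.results[task] = result
        self.pending = self.pending[len(batch):]


def demo():
    master = Master(range(10))
    master.run(num_workers=4, max_tasks=6)   # partial progress on 4 workers
    blob = master.checkpoint()               # save master state only
    restored = Master.restore(blob)          # e.g. after some nodes have failed
    restored.run(num_workers=2)              # continue on 2 workers
    return restored.results
```

Because the checkpoint records only the pending tasks and the results gathered so far, the restored master can hand the remaining work to however many workers are available at that point.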