Journal of Information Science and Engineering, Vol. 21 No. 2, pp. 239-257 (March 2005)

More Properties of Communication-Induced Checkpointing
Protocols with Rollback-Dependency Trackability

Jichiang Tsai*, Sy-Yen Kuo** and Yi-Min Wang+
*Department of Electrical Engineering
National Chung Hsing University
Taichung, 407 Taiwan
**Department of Electrical Engineering
National Taiwan University
Taipei, 106 Taiwan
+Microsoft Research, Microsoft Corporation
Redmond, Washington, U.S.A.

Rollback-Dependency Trackability (RDT) is a property stating that all rollback dependencies between local checkpoints are on-line trackable using a transitive dependency vector. In this paper, we introduce some properties of communication-induced checkpointing protocols possessing the RDT property. First, we demonstrate that wherever an RDT protocol detects a PCM-path in the checkpoint and communication pattern associated with a distributed computation, it can also detect an EPSCM-path there. Moreover, if the detected PCM-path is non-visibly doubled, its corresponding EPSCM-path is also non-visibly doubled. Next, we prove that if an RDT protocol breaks all EPSCM-cycles and non-visibly doubled EPSCM-paths, it breaks all visibly doubled EPSCM-paths as well. From these results, we find that some RDT protocols actually have the same behavior for all possible patterns. Furthermore, we construct patterns to show that a few RDT protocols are incomparable in terms of the number of forced checkpoints. Finally, we discuss a simulation study that verifies our theoretical results.

Keywords: distributed systems, fault tolerance, rollback-dependency trackability, communication-induced checkpointing protocols, rollback-recovery

Received April 5, 2004; revised August 3, 2004; accepted August 23, 2004.
Communicated by Chu-Sing Yang.