Journal of Information Science and Engineering, Vol.10 No.2, pp.259-269 (June 1994)
Fault Tolerant Distributed Simulation

Yi-Bing Lin
Bell Communications Research
Rm 2D297, 445 South Street, Morristown NJ 07960
U.S.A.

This paper presents a fault tolerant protocol for distributed Time Warp simulation. Based on the concept of global virtual time, we show that a distributed snapshot of Time Warp can be taken efficiently. A set of simple distributed snapshot algorithms and fault recovery algorithms is proposed. The distributed snapshot algorithms checkpoint the system states (distributed snapshots) from time to time. The fault recovery algorithms restore the system state from the most recent distributed snapshot taken by the distributed snapshot algorithms. This protocol is robust enough to tolerate failures occurring at any moment.

Keywords: discrete event simulation, distributed simulation, distributed snapshot, fault tolerance, global virtual time, time warp protocol

Received July 10, 1992; revised December 30, 1993.
Communicated by Y. S. Kuo.