Bell Communications Research
Rm 2D297, 445 South Street, Morristown NJ 07960
This paper presents a fault-tolerant protocol for distributed Time Warp simulation. Using the concept of global virtual time, we show that a distributed snapshot of a Time Warp simulation can be taken efficiently. A set of simple distributed snapshot algorithms and fault recovery algorithms is proposed. The distributed snapshot algorithms periodically checkpoint the system state as a distributed snapshot. The fault recovery algorithms restore the system state from the most recent distributed snapshot. The protocol is robust enough to tolerate failures occurring at any moment.
Keywords: discrete event simulation, distributed simulation, distributed snapshot, fault tolerance, global virtual time, Time Warp protocol
Received July 10, 1992; revised December 30, 1993.
Communicated by Y. S. Kuo.
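To make the idea in the abstract concrete, the following is a minimal sketch of how global virtual time (GVT) can anchor a snapshot and a later recovery. It is not the paper's algorithm: the names Process, global_virtual_time, take_snapshot, and recover are hypothetical, message passing is replaced by an in-memory list of in-transit timestamps, and a state stamped exactly at GVT is assumed to be committed. The sketch relies only on the standard definition of GVT as the minimum over all local virtual times and the timestamps of all messages in transit.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical Time Warp logical process (illustration only)."""
    name: str
    lvt: float  # local virtual time
    # Saved states keyed by the virtual time at which they were saved.
    saved_states: dict = field(default_factory=dict)


def global_virtual_time(processes, in_transit_timestamps):
    """GVT: minimum of all local virtual times and in-transit timestamps."""
    return min([p.lvt for p in processes] + list(in_transit_timestamps))


def take_snapshot(processes, gvt):
    """Record, per process, the latest saved state at or before GVT.

    Assumption: a state stamped exactly at GVT counts as committed;
    a stricter implementation might require t < gvt instead.
    """
    snapshot = {}
    for p in processes:
        eligible = [t for t in p.saved_states if t <= gvt]
        if not eligible:
            raise ValueError(f"{p.name} has no saved state at or before GVT")
        t = max(eligible)
        snapshot[p.name] = (t, p.saved_states[t])
    return snapshot


def recover(processes, snapshot):
    """Restore every process to the state recorded in the snapshot."""
    for p in processes:
        t, state = snapshot[p.name]
        p.lvt = t
        p.saved_states = {t: state}


# Usage: two processes and one message still in transit.
a = Process("a", lvt=30, saved_states={0: "a0", 10: "a10", 25: "a25"})
b = Process("b", lvt=12, saved_states={0: "b0", 8: "b8", 12: "b12"})
gvt = global_virtual_time([a, b], in_transit_timestamps=[15])
snap = take_snapshot([a, b], gvt)
recover([a, b], snap)
```

States later than GVT are left out of the snapshot because Time Warp may still roll them back, whereas no rollback can reach a time earlier than GVT. A fuller treatment would also have to save the unprocessed messages with timestamps at or after GVT, which this sketch omits.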