Shu-Yuen Hwang and Yi-Bing Lin+
Department of Computer Science and Information Engineering
National Chiao Tung University
Hsinchu, Taiwan, 300, R.O.C.
+Bell Communications Research
2d297, 445 South Street
Morristown, NJ 07962-1910,U.S.A.
The Time Warp mechanism is the most commom "optimistic" parallel simulation protocol. A process executes every message as soon as it arrives. If a message with a smaller timestamp subsequently arrives, the process rolls back its state to the time of the earlier message and re-executes from that point.
Clearly, the state of each process must be saved (checkpointed) regularly in case rollback is necessary. Although most existing Time Warp implementations checkpoint after every state transition, this is not necessary, and the checkpoint interval is in reality a tuning parameter of the simulation.
This paper analyzes the behavior of checkpointing in Time Warp simulation. We show how to derive the optimal frequency of checkpointing in a Time Warp simulation. We express the optimal checkpoint interval as a function of the average rollback distance, the average state saving overhead, the ratio of the total number of executed messages to the total number of undone messages, and the average cost of a simulation step. The results are consistent with the following intuition: A large checkpoint interval should be chosen if and only if the state saving overhead is large, and/or a large number of events are executed, on the average, between two consecutive rollbacks. This result also agrees with experiments reported in the literature. An nanlysis of the variance of the checkpoint interval is also shown. Comparisons to checkpointing in other domains are discussed.
Keywords: discrete event simulation, optimistic parallel simulation, time-warp protocol, checkpointing
Received July 15, 1991; revised September 30, 1992.
Communicated by Wen-Tsuen Chen.
*Based on the paper presented in 1990 International Computer Symposium . Reference