Shing-Tsaan Huang and Pei-Wen Kao
Institute of Computer Science
National Tsing-Hua University
Hsin Chu, Taiwan (30043), R.O.C.
This paper proposes two algorithms for detecting termination of distributed computations monitored by an external controlling agent. The first algorithm is based on the weighted throw counting scheme. Weights are assigned to each active process and to each message in transit. The agent has a weight, too. The algorithm maintains an invariant that the sum of all the weights equals one. The agent concludes the termination when its weight equals one. A space-efficient encoding of the weights is also proposed.
The second algorithm adopts the distributed snapshots scheme. When a process becomes idle, it takes a local snapshot and sends the snapshot to the agent. The agent puts the local snapshots together to form a global snapshot and determines the termination by checking the recorded state in the global snapshot.
By comparison, the first one is better if storage space is the major consideration, while the second is more suitable for real-time systems, because no waiting is employed on the processes due to termination detection. The second algorithm is also optimal in minimizing the message complexity: only one additional message carrying the local snapshot is needed per idleness.
Keywords: distributed computing, distributed snapshots, termination detection, weighted throw counting
Received February 16, 1990; revised December 8, 1990.
Communicated by Chin-Chen Chang.
*This work was supported in part by ATC, ERSO, Industrial Technology Institute of the Republic of China under the Contract SF-C-010-1.