Department of Computer Science
National Tsing Hua University
Hsinchu, TAIWAN 30043, R.O.C.
This paper studies strategies for partitioning programs on distributed-memory multicomputers. A technique called skewed partition is proposed. For certain applications the skewed partition method can reduce the amount of synchronization and provide greater control over the granularity than can the commonly used block partition method. To illustrate the idea, two examples”Šan image distance transformer and a linear equation solver”Šare examined. Results obtained from an Ncube-1 multicomputer show that the skewed partition method improves the performance of these programs more than 50¢H over the block partition method. Performance analysis of the skewed partition method is then studied. Expressions are derived to model and estimate the computation and communication costs resulting from a skewed partition. The expressions are based on loop-carried dependencies and take into account overlapping communications. From these analyses, the total execution time of parallel programs using the skewed partition method can be determined and the performance can be optimized.
Keywords: distributed-memory multicomputer, granularity, image distance transformation, performance analysis, program partition, systems of linear equations
Received June 26, 1991; revised November 1, 1991.
Communicated by Ferng-Ching Lin.
*Portions of this paper have appeared in "Skewed partition - Theory and practice," by C. T. King in Proc. Of COMPSAC'91, Sep. 1991, pp.18-24, and in "Modeling the performance of groupting on multicomputers," by C. T. King, W. T. u and I. R. Kau in Proc. Of Int'l Computer Symp., Dec. 1990, pp. 951-956.
This work was supported by National Science Foundation of U.S.A. under Grant CDA-8921063 and by National Science Council of R.O.C. under Grant NSC 81-0416-E-007.