Frédéric Desprez, Jack Dongarra#, +, Fabrice Rastello and Yves Robert#
LIP, École Normale Supérieure de Lyon
69364 Lyon Cedex 07, France
# Department of Computer Science
University of Tennessee
Knoxville, TN 37996-1301, U.S.A.
+ Mathematical Sciences Section
Oak Ridge National Laboratory
Oak Ridge, TN 37831, U.S.A.
In the framework of fully permutable loops, tiling has been studied extensively as a source-to-source program transformation. We build upon recent results by Högstedt, Carter, and Ferrante, who aimed at determining the cumulated idle time spent by all processors while executing the partitioned (tiled) computation domain. We propose new, much shorter proofs of all their results and extend them in several important directions. More precisely, we provide an accurate solution for all values of the rise parameter, which relates the shape of the iteration space to that of the tiles, and for all possible distributions of the tiles to processors. In contrast, those authors dealt only with a limited number of cases and provided upper bounds rather than exact formulas.
Keywords: distributed arrays, redistribution, block-cyclic distribution, scheduling, MPI (Message Passing Interface), HPF (High Performance Fortran)
Received May 1, 1997; revised October 31, 1997.
Communicated by Chau-Huang Huang