Kien A. Hua, Chiang Lee* and Jih-Kwon Peir#
Dept. of Computer Science
University of Central Florida
Orlando, FL 32816-0362 U.S.A.
*Inst. of Information Engineering
National Cheng-Kung University
Tainan, Taiwan R.O.C.
#Computer and Communication Lab.
Industrial Technology and Research Institute
Hsinchu, Taiwan, R.O.C.
The most debated architectures for parallel database processing are the Shared Nothing (SN) and Shared Everything (SE) structures. Although SN is considered the most scalable, it is very sensitive to the data skew problem. SE, on the other hand, allows the collaborating processors to share the workload more efficiently. It suffers, however, from the limitations of memory and disk I/O bandwidth.
In this paper, we present a hybrid architecture in which SE clusters are interconnected through a communication network to form an SN structure at the inter-cluster level. In this approach, processing elements are clustered into SE systems to minimize the skew effect. Each cluster, however, is kept small, within the limitations of memory and I/O technology, to avoid the data access bottleneck.
A generalized performance model was developed to perform sensitivity analysis for the hybrid structure and to compare it against the SE and SN organizations. The comparison results favor the hybrid structure. The selection of a configuration for a hybrid structure, however, depends on the cost of the hardware components and the available technology. The right combination allows one to design a parallel database system with optimal cost/performance.
Keywords: parallel architectures, query processing, parallel hash-join, hybrid architecture, sensitivity analysis, execution time
Received September 20, 1992; revised March 19, 1993.
Communicated by Chi-Chen Chang.