Journal of Information Science and Engineering, Vol. 27 No. 2, pp. 511-525 (March 2011)

An Efficient Frequent Pattern Mining Method and its Parallelization in Transactional Databases

S. M. FAKHRAHMAD1 AND GH. DASTGHAIBYFARD2
1Department of Computer Engineering
Islamic Azad University, Shiraz Branch
Shiraz, Iran
2Department of Computer Science and Engineering
Shiraz University
Shiraz, Iran
E-mail: {mfakhrahmad@cse.; dstghaib@}shirazu.ac.ir

One of the important and well-researched problems in data mining is mining association rules from transactional databases, where each transaction consists of a set of items. The main operation in this discovery process is computing the occurrence frequency of the itemsets of interest; that is, association rule mining algorithms search for all subsets of items that occur frequently across the database transactions. In practice, we are usually faced with large data warehouses, which contain a large number of transactions and an exponentially large space of candidate itemsets that have to be verified. A potential solution to this computational complexity is to parallelize the mining algorithm. In this paper, four parallel versions of a novel sequential mining algorithm for the discovery of frequent itemsets are proposed. The parallelized solutions are compared analytically and experimentally with respect to factors such as time complexity, communication rate, and load balancing.
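
To make the notion of occurrence frequency concrete, the following minimal sketch counts how many transactions contain each candidate itemset and keeps those that meet a minimum support threshold. It is a brute-force illustration only, not the sequential algorithm or any of the parallel versions proposed in the paper; the function name `frequent_itemsets`, the parameters `min_support` and `max_size`, and the sample transactions are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force support counting (illustrative; not the paper's method).

    Returns a dict mapping each frequent itemset (a sorted tuple of items)
    to the number of transactions that contain it.
    """
    counts = Counter()
    for transaction in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1

    # An itemset is frequent if it appears in at least this many transactions.
    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


# Hypothetical sample data: four market-basket transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=0.5)
```

Even in this small example the number of candidate itemsets grows combinatorially with the number of items per transaction, which is the cost that motivates dividing the counting work among several processors.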

Keywords: parallel processing, data mining, frequent itemsets, association rules, load balancing

Received May 25, 2009; revised August 24 & October 7, 2009; accepted November 18, 2009.
Communicated by Xiaodong Zhang.