

Journal of Information Science and Engineering, Vol. 26 No. 2, pp. 461-483 (March 2010)

HIC: A Robust and Efficient Hyper-Image-Based Clustering for Very Large Datasets*

KUN-CHE LU AND DON-LIN YANG
Department of Information Engineering and Computer Science
Feng Chia University
Taichung, 407 Taiwan

Most existing clustering approaches require several scans of a dataset and incur a very high computational cost. In this paper, we propose a novel, efficient, and effective clustering framework that requires only one scan of the input dataset. First, the original dataset is transformed and merged into a hyper-image. The dissimilarities between data points are then measured, once and for all, using various image-processing methodologies. Next, image segmentation techniques are applied to extract clusters from the hyper-image. The resulting clusters can be further processed to achieve fuzzy and/or hierarchical clustering effects. Moreover, the proposed framework can cluster incrementally and even dynamically with only one scan of the updated records. With this capability, it can also be used to effectively cluster streaming data. Experimental results show that our approach is robust and stable under various parameter settings and data distributions, and that it is more powerful and sophisticated than other methodologies.
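The abstract does not give the details of the transformation or segmentation steps, so the following is only a minimal sketch of the general idea, not the authors' HIC algorithm. It assumes 2-D points, a fixed square grid standing in for the hyper-image, and 4-connected component labeling standing in for image segmentation. The names build_grid, label_components, bins, and min_count are hypothetical and chosen for illustration.

```python
from collections import deque


def build_grid(points, bins, lo, hi):
    """Single scan: count the points that fall into each cell of a bins x bins grid.

    Assumption: every coordinate lies in [lo, hi]; the grid plays the role
    of a 2-D "image" whose pixel intensity is the point count.
    """
    grid = [[0] * bins for _ in range(bins)]
    scale = bins / (hi - lo)
    for x, y in points:
        col = min(int((x - lo) * scale), bins - 1)
        row = min(int((y - lo) * scale), bins - 1)
        grid[row][col] += 1
    return grid


def label_components(grid, min_count=1):
    """Label 4-connected regions of cells whose count is at least min_count.

    Returns (labels, n), where labels[r][c] is 0 for background cells and
    1..n for cells belonging to the n regions found.
    """
    bins = len(grid)
    labels = [[0] * bins for _ in range(bins)]
    n = 0
    for r in range(bins):
        for c in range(bins):
            if grid[r][c] < min_count or labels[r][c]:
                continue
            n += 1
            labels[r][c] = n
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed cell
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < bins and 0 <= nc < bins
                            and grid[nr][nc] >= min_count and not labels[nr][nc]):
                        labels[nr][nc] = n
                        queue.append((nr, nc))
    return labels, n


# Two small, well-separated groups of illustrative points.
points = [(0.12, 0.13), (0.17, 0.14), (0.23, 0.16), (0.82, 0.87), (0.88, 0.84)]
grid = build_grid(points, bins=10, lo=0.0, hi=1.0)
labels, n_clusters = label_components(grid)
```

In this sketch the input is read once to fill the grid, and all later work operates on the grid alone. New records could be handled by incrementing the affected cell counts and relabeling, which loosely mirrors the incremental behavior the abstract describes.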

Keywords: clustering framework, image processing, fuzzy set, hierarchical clustering, dynamic clustering


Received April 3, 2008; revised February 23, 2009; accepted July 10, 2009.
Communicated by Suh-Yin Lee.
* This work was supported by the National Science Council of Taiwan under grant Nos. NSC 96-2218-E-007-007 and NSC 95-2221-E-035-068-MY3.