Journal of Information Science and Engineering, Vol. 28 No. 1, pp. 83-97 (January 2012)

Aggregate Two-Way Co-Clustering of Ads and User Data for Online Advertisements*

Department of Computer Science and Information Engineering
National Central University
Taoyuan, 320 Taiwan

Clustering plays an important role in data mining, as many applications use it as a preprocessing step for data analysis. Traditional clustering groups similar objects, whereas two-way co-clustering groups both sides of dyadic data (objects and their attributes) simultaneously. In this research, we apply two-way co-clustering to the analysis of online advertising, where both ads and users need to be clustered. Besides the ad-user link matrix, which records the ads each user has linked, we have two additional matrices that carry extra information about users and ads. In this paper, we propose a three-stage clustering method that makes use of all three data matrices to enhance clustering performance. We also propose an Iterative Cross Co-Clustering (ICCC) algorithm for two-way co-clustering. Experiments are performed on advertisement and user data from Morgenstern, a financial social website that focuses on advertisement agency services. The results show that iterative cross co-clustering outperforms traditional clustering and completes the task more efficiently.

Keywords: co-clustering, decision tree, KL divergence, dyadic data analysis, clustering evaluation

Received February 22, 2011; revised August 20, 2011; accepted August 31, 2011.
Communicated by Irwin King.
* This paper was presented at the International Computer Symposium 2010 (ICS 2010), held in Tainan, Taiwan, on Dec. 16-18, 2010, and sponsored by IEEE, NSC Taiwan, and MOE Taiwan.