Journal of Information Science and Engineering, Vol. 27 No. 6, pp. 1855-1870 (November 2011)

Impact of Behavior Clustering on Web Surfer Behavior Prediction*

MALIK TAHIR HASSAN AND ASIM KARIM
Department of Computer Science
LUMS School of Science and Engineering
Lahore, 54792 Pakistan

We investigate Web surfer behavior prediction by building generative and discriminative models on the entire history of navigation paths and on behavior clustering of the history. The underlying question that we try to answer is: Does behavior clustering improve behavior prediction? For behavior clustering, we adapt the k-modes clustering algorithm by incorporating a new similarity measure that gives greater weight to matches at the beginning of the navigation path. The initial cluster representatives are selected from the set of most dissimilar paths, which also fixes the number of clusters. For generative prediction, we adopt Markov chain Bayesian classification models, whereas for discriminative prediction we build SVM models. Experiments are performed on two real-world data sets. Surprisingly, the results show that behavior clustering has no significant impact on Web surfer behavior prediction. We also investigate the impact of time of visit, the number of relevant clusters used in prediction models, and the use of cluster modes on Web surfer behavior prediction. We find that for limited-scope data, simpler approaches such as prediction using cluster modes can produce highly accurate predictions (less than 1% drop from the best prediction) with greater efficiency.

Keywords: clustering, navigation path prediction, order weighted similarity, sequence prediction, web usage mining, generative discriminative models
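
To make the idea of an order-weighted similarity concrete, the following is a minimal illustrative sketch in Python. The abstract does not give the paper's actual formula, so the linearly decaying weights, the function names `order_weighted_similarity` and `nearest_mode`, and the example page names are all assumptions made for illustration only, not the authors' method.

```python
def order_weighted_similarity(path_a, path_b):
    """Position-wise match score in [0, 1]; earlier positions weigh more.

    Assumption: weights decay linearly with position. The paper's own
    weighting scheme is not stated in the abstract.
    """
    length = max(len(path_a), len(path_b))
    if length == 0:
        return 1.0  # two empty paths are treated as identical
    # Position 0 gets weight `length`, the last position gets weight 1.
    weights = [length - i for i in range(length)]
    # zip stops at the shorter path, so unmatched trailing positions
    # contribute nothing and count as mismatches.
    matched = sum(w for w, a, b in zip(weights, path_a, path_b) if a == b)
    return matched / sum(weights)


def nearest_mode(path, modes):
    """Return the index of the cluster mode most similar to `path`."""
    return max(range(len(modes)),
               key=lambda k: order_weighted_similarity(path, modes[k]))


# Hypothetical navigation paths (page names are made up).
early_match = order_weighted_similarity(
    ["home", "news", "sports"], ["home", "news", "weather"])
late_match = order_weighted_similarity(
    ["home", "news", "sports"], ["index", "news", "sports"])
```

Under these assumed weights, two paths that agree on their first two pages score higher than two paths that agree on their last two, which is the behavior the abstract describes. A k-modes style assignment step would then place each path in the cluster whose mode it is most similar to, as `nearest_mode` does.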

Received July 14, 2010; revised November 14, 2010; accepted January 12, 2011.
Communicated by Chih-Jen Lin.
* We gratefully acknowledge the support from Lahore University of Management Sciences (LUMS) and Higher Education Commission (HEC) of Pakistan.