MALIK TAHIR HASSAN AND ASIM KARIM
Department of Computer Science
LUMS School of Science and Engineering
Lahore, 54792 Pakistan
We investigate Web surfer behavior prediction by building generative and discriminative
models on the entire history of navigation paths and on behavior clustering
of the history. The underlying question that we try to answer is: Does behavior clustering
improve behavior prediction? For behavior clustering, we adapt the k-modes clustering
algorithm by incorporating a new similarity measure that gives greater weight to matches
at the beginning of the navigation path. The initial cluster representatives are selected
from the set of most dissimilar paths, which also fixes the number of clusters. For generative
prediction, we adopt Markov chain Bayesian classification models, whereas for discriminative
prediction, we build SVM models. Experiments are performed on two real-world
data sets. Surprisingly, the results show that behavior clustering has no significant
impact on Web surfer behavior prediction. We also investigate the impact of time of visit,
the number of relevant clusters used in prediction models, and the use of cluster modes
on Web surfer behavior prediction. We find that for limited-scope data, simpler approaches
such as prediction using cluster modes can produce highly accurate predictions
(less than 1% drop from the best prediction) with greater efficiency.
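
The idea of a similarity measure that gives greater weight to matches at the beginning of a navigation path can be illustrated with a minimal sketch. The abstract does not state the actual weighting, so the linearly decaying weights below are an assumption made purely for illustration, and the function name `weighted_similarity` and the page labels are hypothetical.

```python
from typing import Sequence


def weighted_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Position-weighted match score in [0, 1]; earlier positions weigh more.

    ASSUMPTION: weights decay linearly (n, n-1, ..., 1). This is an
    illustrative choice, not the measure defined in the paper.
    """
    n = max(len(a), len(b))
    if n == 0:
        return 1.0  # two empty paths are treated as identical
    weights = [n - i for i in range(n)]
    # Sum the weights of positions where both paths have the same page.
    matched = sum(
        w
        for i, w in enumerate(weights)
        if i < len(a) and i < len(b) and a[i] == b[i]
    )
    return matched / sum(weights)


# Hypothetical paths: the first pair agrees early, the second pair agrees late.
early = weighted_similarity(["home", "news", "sports"], ["home", "news", "tech"])
late = weighted_similarity(["home", "news", "sports"], ["blog", "news", "sports"])
```

Under this assumed weighting, two paths that share their first pages score higher than two paths that share the same number of pages only at the end, which is the property a k-modes variant would need to group surfers by how their sessions start.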
Received July 14, 2010; revised November 14, 2010; accepted January 12, 2011.
Communicated by Chih-Jen Lin.
* We gratefully acknowledge support from the Lahore University of Management Sciences (LUMS) and the Higher
Education Commission (HEC) of Pakistan.