Journal of Information Science and Engineering, Vol. 30 No. 4, pp. 1585-1600 (September 2014)

Bayesian Bridging Topic Models for Classification

Computational Intelligence Technology Center
Industrial Technology Research Institute
Hsinchu, 310 Taiwan

We study the problem of constructing a topic-based model across different domains for text classification. In real-world applications, unlabeled documents are abundant but labeled documents are sparse. It is challenging to construct a reliable and adaptive model to classify a large number of documents spanning different domains. Classifiers trained on a source domain tend to perform poorly on test data from a target domain. The trained model is also prone to misclassification among ambiguous classes. In this study, we tackle the issues of domain mismatch and confusing classes and conduct discriminative transfer learning for text classification. We propose a Bayesian bridging topic model (BTM) learned from a variety of labeled and unlabeled documents and perform transfer learning for cross-domain text classification. A structural model is built and its parameters are estimated by maximizing the joint marginal likelihood of labeled and unlabeled data via a variational inference procedure. We also apply discriminative learning to the proposed model, adjusting its parameters using the minimum classification error criterion. We show that the proposed model achieves better cross-domain text classification performance than other models.

Keywords: transfer learning, topic model, cross-domain classification, latent Dirichlet allocation, Bayesian

Received October 4, 2012; revised December 7, 2012 & January 29, 2013; accepted February 19, 2013.
Communicated by Zhi-Hua Zhou.