Copyright © 2006 Published by Elsevier Ltd.
A hybrid generative/discriminative approach to text classification with additional information
Received 27 May 2006;
Abstract
This paper presents a classifier for text data samples, such as Web pages and technical papers, that consist of main text and additional components. We focus on multiclass, single-labeled text classification problems and design the classifier as a hybrid of probabilistic generative and discriminative approaches. Our formulation trains an individual generative model for each component and constructs the classifier by combining these trained models based on the maximum entropy principle. We use naive Bayes models as the component generative models for the main text and for additional components such as titles, links, and authors, so that the formulation applies to document and Web page classification problems. Experimental results on four test collections confirmed that the hybrid approach combined main text and additional components effectively and thus improved classification performance.
Keywords: Multiclass and single-labeled text classification; Multiple components; Maximum entropy principle; Naive Bayes model
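The combination described in the abstract can be illustrated with a minimal sketch. It assumes one common reading of a maximum-entropy combination: a log-linear model in which each component's naive Bayes log-likelihood is scaled by a combination weight and a per-class bias is added. The function names, the toy data, and the hand-picked weights and biases are hypothetical; in the paper the weights are estimated under the maximum entropy principle, a step this sketch does not implement.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model for ONE component.

    Returns {class: {word: log P(word | class)}} with add-alpha smoothing.
    """
    vocab = {w for d in docs for w in d}
    counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        counts[y].update(doc)
    model = {}
    for y, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        model[y] = {w: math.log((c[w] + alpha) / total) for w in vocab}
    return model


def log_lik(model, doc, y):
    """Component log-likelihood log P(doc | y); out-of-vocabulary words are skipped."""
    return sum(model[y][w] for w in doc if w in model[y])


def posterior(models, weights, bias, sample):
    """Log-linear combination (assumed form, not taken from the source):

    score_k = bias_k + sum_j weight_j * log P_j(x_j | k),
    normalized over classes with a softmax.
    """
    scores = {
        k: bias[k] + sum(w * log_lik(m, x, k)
                         for m, w, x in zip(models, weights, sample))
        for k in bias
    }
    top = max(scores.values())  # subtract the max for numerical stability
    expd = {k: math.exp(s - top) for k, s in scores.items()}
    z = sum(expd.values())
    return {k: v / z for k, v in expd.items()}


# Toy data: component 0 is the main text, component 1 is the title.
main_texts = [["goal", "match", "team"], ["election", "vote", "party"],
              ["team", "coach", "goal"], ["vote", "minister", "party"]]
titles = [["football"], ["politics"], ["football", "news"], ["politics", "news"]]
labels = ["sports", "politics", "sports", "politics"]

models = [train_nb(main_texts, labels), train_nb(titles, labels)]
weights = [1.0, 0.5]                       # hypothetical; not fitted here
bias = {"sports": 0.0, "politics": 0.0}    # hypothetical; not fitted here

post = posterior(models, weights, bias, (["goal", "team"], ["football"]))
predicted = max(post, key=post.get)
```

Keeping the component models separate and combining them only through a few weights is what lets a less reliable component (for example, sparse titles) be down-weighted without retraining the main-text model.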
Article Outline
- 1. Introduction
- 2. Conventional approaches
- 3. Proposed method
- 3.1. Hybrid approach
- 3.1.1. Component generative models
- 3.1.2. Discriminative class posterior design
- 3.1.3. Another class posterior by ME
- 3.2. Application to text classification
- 4. Experiments
- 4.1. Test collections
- 4.2. Experimental settings
- 4.2.1. Evaluation methods
- 4.2.2. Evaluation measure
- 4.3. Experiment 1
- 4.3.1. Compared classifiers
- 4.3.2. Results
- 4.4. Experiment 2
- 4.4.1. Compared classifiers
- 4.4.2. Results
- 4.4.3. Analysis of combination weights
- 4.5. Experiment 3
- 4.5.1. Compared classifiers
- 4.5.2. Results
- 5. Related work
- 6. Conclusion
- Appendix A. Hyperparameter tuning procedure
- References







