Abstract
Traditional approaches to pattern recognition tasks normally consider only the unilabel classification problem, that is, each observation (in both the training and test sets) has exactly one class label associated with it. Yet in many real-world tasks this is only a rough approximation, as one sample can be labeled with a set of classes, and thus techniques for the more general multi-label problem have to be explored. In this paper we review the techniques presented in our previous work and discuss their application to the field of text classification, using the multinomial (Naive Bayes) classifier. Results are presented on the Reuters-21578 dataset, where our proposed approach obtains satisfactory results.
This work has been partially supported by the Spanish CICYT under contracts TIC2002-04103-C03-03 and TIC2003-07158-C04-03.
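To make the multi-label setting concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier that may return more than one class per document. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not specify how label sets are produced, so the posterior-threshold rule, the Laplace smoothing, the `threshold` value, the function names, and the toy documents are all hypothetical choices made here.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label_set). A document with several labels
    contributes its word counts to every class it carries (an assumption)."""
    word_counts = defaultdict(Counter)  # class -> word -> count
    class_counts = Counter()            # class -> number of (doc, label) pairs
    vocab = set()
    for tokens, labels in docs:
        vocab.update(tokens)
        for c in labels:
            class_counts[c] += 1
            word_counts[c].update(tokens)
    return word_counts, class_counts, vocab

def posteriors(tokens, model):
    """Class posteriors under the multinomial model with Laplace smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for c in class_counts:
        denom = sum(word_counts[c].values()) + len(vocab)
        score = math.log(class_counts[c] / total)  # log prior
        for w in tokens:
            if w in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[c][w] + 1) / denom)
        log_scores[c] = score
    # Normalise in log space to avoid underflow on long documents.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def classify(tokens, model, threshold=0.25):
    """Return every class whose posterior reaches the threshold; the single
    best class is always included, so the result is never empty."""
    post = posteriors(tokens, model)
    best = max(post, key=post.get)
    return {c for c, p in post.items() if p >= threshold} | {best}

# Hypothetical toy corpus; the labels only imitate newswire topic names.
TOY_DOCS = [
    (["wheat", "grain", "export"], {"grain", "trade"}),
    (["oil", "crude", "barrel"], {"crude"}),
    (["trade", "export", "tariff"], {"trade"}),
]
model = train(TOY_DOCS)
print(classify(["oil", "crude"], model))
print(classify(["wheat", "export"], model))
```

Setting `threshold` to 1.0 reduces this sketch to the usual unilabel decision (only the best class is returned), while lower values allow progressively larger label sets; in practice the value would have to be tuned on held-out data.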
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vilar, D., Castro, M.J., Sanchis, E. (2004). Multi-label Text Classification Using Multinomial Models. In: Vicedo, J.L., Martínez-Barco, P., Muñoz, R., Saiz Noeda, M. (eds) Advances in Natural Language Processing. EsTAL 2004. Lecture Notes in Computer Science, vol 3230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30228-5_20
DOI: https://doi.org/10.1007/978-3-540-30228-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23498-2
Online ISBN: 978-3-540-30228-5
eBook Packages: Springer Book Archive