A Term Frequency Based Weighting Scheme Using Naïve Bayes for Text Classification

Term weighting assigns weights to terms to improve the performance of classifiers such as kNN and SVM in text classification. Supervised term weighting methods, which use information on the class membership of training documents, have received increasing attention. Most existing methods follow the framework in which a local weight is multiplied by a global weight, but the contribution of term frequency to term weighting has not been fully investigated. In this paper, we propose a weighting scheme named term frequency-relevance term frequency, based on a probabilistic model. After investigating two widely used naïve Bayes (NB) models, we employ the term-event multinomial NB model to capture term frequency information. The matching score function based on the prediction probability ratio can then be factorized. Finally, we obtain the weight for each term by replacing the parameter with an estimator; term frequency is used in formulating not only the local weight factor but also the global weight factor. Numerical experiments on two benchmark text datasets (Reuters-21578 and 20 Newsgroups) demonstrate that our proposed method outperforms representative term weighting methods.
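
To make the factorization concrete, here is a minimal Python sketch of the general idea, not the paper's exact scheme, since the abstract does not state the formula. It assumes the standard multinomial NB log probability ratio, which splits into a sum over terms of the in-document term frequency (a local factor) times log(p(t|c) / p(t|not c)) (a global factor), with both probabilities estimated from class-level term frequencies. The function names `term_weights` and `weighted_vector`, the Laplace smoothing parameter `alpha`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_weights(docs, labels, positive, alpha=1.0):
    """Global factor per term: log(p(t|c) / p(t|not c)).

    Both probabilities are multinomial NB estimates built from term
    frequencies (token counts), with Laplace smoothing `alpha`.
    This is an illustrative assumption, not the paper's formula.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        target = pos_counts if label == positive else neg_counts
        target.update(tokens)  # counts every occurrence, not just presence

    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values())
    neg_total = sum(neg_counts.values())
    v = len(vocab)

    weights = {}
    for term in vocab:
        p_pos = (pos_counts[term] + alpha) / (pos_total + alpha * v)
        p_neg = (neg_counts[term] + alpha) / (neg_total + alpha * v)
        weights[term] = math.log(p_pos / p_neg)
    return weights


def weighted_vector(tokens, weights):
    """Local factor (raw term frequency) times the global factor."""
    tf = Counter(tokens)
    return {t: tf[t] * weights[t] for t in tf if t in weights}


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["grain", "wheat", "wheat"],
    ["oil", "price"],
    ["wheat", "price"],
    ["oil", "oil", "barrel"],
]
labels = ["grain", "crude", "grain", "crude"]

weights = term_weights(docs, labels, positive="grain")
vector = weighted_vector(["wheat", "wheat", "oil"], weights)
```

In this toy corpus, a term concentrated in the positive class ("wheat") gets a positive weight, a term spread evenly across both classes ("price") gets a weight near zero, and a term concentrated in the other class ("oil") gets a negative weight. The resulting vectors could then be fed to a classifier such as kNN or SVM, as the abstract describes.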

Keywords: Naïve Bayes; Supervised Term Weighting; Term Event; Term Frequency; Text Classification

Document Type: Research Article

Affiliations: Key Laboratory of Intelligent Information Processing of Jilin Universities, School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, China

Publication date: 01 January 2016

More about this publication?
  • Journal of Computational and Theoretical Nanoscience is an international peer-reviewed journal with wide-ranging coverage that consolidates research activities in all aspects of computational and theoretical nanoscience into a single reference source. The journal publishes peer-reviewed original full papers, timely state-of-the-art reviews, and short communications covering fundamental and applied research in computational and theoretical nanoscience and nanotechnology in chemistry, physics, materials science, engineering and biology.