Prospecting the Effect of Topic Modeling in Information Retrieval

Aakanksha Sharaff, Jitesh Kumar Dewangan, Dilip Singh Sisodia
Copyright: © 2021 | Volume: 17 | Issue: 3 | Pages: 17
ISSN: 1552-6283 | EISSN: 1552-6291 | EISBN13: 9781799859727 | DOI: 10.4018/IJSWIS.2021070102

MLA

Sharaff, Aakanksha, et al. "Prospecting the Effect of Topic Modeling in Information Retrieval." IJSWIS vol.17, no.3 2021: pp.18-34. http://doi.org/10.4018/IJSWIS.2021070102

APA

Sharaff, A., Dewangan, J. K., & Sisodia, D. S. (2021). Prospecting the Effect of Topic Modeling in Information Retrieval. International Journal on Semantic Web and Information Systems (IJSWIS), 17(3), 18-34. http://doi.org/10.4018/IJSWIS.2021070102

Chicago

Sharaff, Aakanksha, Jitesh Kumar Dewangan, and Dilip Singh Sisodia. "Prospecting the Effect of Topic Modeling in Information Retrieval," International Journal on Semantic Web and Information Systems (IJSWIS) 17, no.3: 18-34. http://doi.org/10.4018/IJSWIS.2021070102

Abstract

Enormous volumes of records and data are gathered every day, and organizing this data is a challenging task. Topic modeling provides a way to categorize these documents, but the high dimensionality of the corpus affects the result of the topic model, which makes it important to apply feature selection or an information retrieval process for dimensionality reduction. Efficient topic modeling requires removing unrelated words, which might otherwise lead to spurious co-occurrence of unrelated words. This paper proposes an efficient framework for generating better topic coherence, in which term frequency-inverse document frequency (TF-IDF) and the parsimonious language model (PLM) are used for the information retrieval task. PLM extracts the important information and expels general words from the corpus, whereas TF-IDF re-estimates the weight of each word in the corpus. The work carried out in this paper improves the topic coherence measure, providing a better correlation between the actual topics and the topics generated from PLM.
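The two term-weighting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`background_model`, `tf_idf`, `parsimonious_lm`), the toy corpus, the mixing weight `lam=0.1`, the iteration count, and the pruning threshold are all assumptions made for illustration. The PLM part follows a commonly described expectation-maximization formulation, in which a document model is re-estimated against a corpus-wide background model so that probability mass moves away from general words. The TF-IDF part uses the plain `tf * log(N / df)` form, and every document is assumed to be a non-empty list of tokens.

```python
import math
from collections import Counter


def background_model(docs):
    """Corpus-wide unigram probabilities P(t|C) from tokenized documents."""
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def tf_idf(docs):
    """One {term: weight} dict per document: (tf / len(doc)) * log(N / df)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n_docs / df[t])
         for t, c in Counter(doc).items()}
        for doc in docs
    ]


def parsimonious_lm(doc, p_corpus, lam=0.1, iterations=50, threshold=1e-3):
    """Estimate a parsimonious document model P(t|D) with EM (illustrative)."""
    tf = Counter(doc)
    total = sum(tf.values())
    # Start from the maximum-likelihood document model.
    p_doc = {t: c / total for t, c in tf.items()}
    for _ in range(iterations):
        # E-step: share of each term count explained by the document model
        # rather than by the background (corpus) model.
        expected = {
            t: tf[t] * lam * p / ((1 - lam) * p_corpus[t] + lam * p)
            for t, p in p_doc.items()
        }
        # M-step: renormalize the expected counts into probabilities.
        norm = sum(expected.values())
        p_doc = {t: e / norm for t, e in expected.items()}
    # Drop terms whose probability fell below the (assumed) threshold,
    # then renormalize what is left.
    kept = {t: p for t, p in p_doc.items() if p >= threshold}
    kept_mass = sum(kept.values())
    return {t: p / kept_mass for t, p in kept.items()}


if __name__ == "__main__":
    # Toy corpus, invented for this example only.
    docs = [
        ["the", "topic", "model", "the", "corpus"],
        ["the", "email", "spam", "the", "filter"],
        ["the", "topic", "coherence", "the", "score"],
    ]
    print(tf_idf(docs)[0])
    print(parsimonious_lm(docs[0], background_model(docs)))
```

In this toy corpus the word "the" appears in every document, so its TF-IDF weight is zero and the PLM iterations drive its probability below the threshold, while document-specific words such as "model" keep most of the mass. A reduced vocabulary of this kind could then be passed to a topic model, though how the paper combines the two steps is not specified in the abstract.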
