Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model

  • Conference paper
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11707)

Abstract

A search for research papers should return comprehensive results related to the user’s query. Typically, a user inputs a Boolean query that reflects the information need, and the search engine ranks the research papers based on that query. However, it is difficult to anticipate all the terms that authors of relevant papers might have used. Moreover, general query-based ranking methods focus on placing relevant documents at the top of the results and do not by themselves guarantee that the results are comprehensive. Therefore, two ranking methods that consider the comprehensiveness of relevant papers are proposed. The first uses a topic-based Boolean query search. This search converts every word in the abstract set and in the query into a topic via topic analysis with Latent Dirichlet Allocation (LDA) and conducts the search at the topic level. The topic assigned to a synonym of a search term is expected to be the same as that assigned to the search term itself. The topic-based Boolean query search is executed under various LDA parameter settings, and each paper is ranked by the number of these searches in which it is matched. The second is a hybrid method that emphasizes the better results from our topic-based ranking and from a general query-based ranking. This method is based on the observation that the paper sets retrieved by our method and by a general ranking method will differ. Experiments using the NTCIR-1 and -2 datasets demonstrate the effectiveness of the topic-based and hybrid methods.
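
The topic-based ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each LDA run has already been reduced to a plain word-to-topic dictionary (the toy `runs` below are hand-written, not LDA output), that the Boolean query is given in conjunctive form (a list of OR-groups that are AND-ed together), and that a paper matches a clause when they share at least one topic. The function names `to_topics`, `matches`, and `rank_by_match_count` are hypothetical.

```python
from collections import Counter


def to_topics(words, word_topic):
    """Map words to topic ids, skipping words the run has no topic for."""
    return {word_topic[w] for w in words if w in word_topic}


def matches(query_cnf, paper_words, word_topic):
    """True if every AND-ed clause shares at least one topic with the paper."""
    paper_topics = to_topics(paper_words, word_topic)
    for clause in query_cnf:
        if not (to_topics(clause, word_topic) & paper_topics):
            return False
    return True


def rank_by_match_count(query_cnf, papers, runs):
    """Rank papers by the number of runs in which they match the query."""
    counts = Counter({pid: 0 for pid in papers})
    for word_topic in runs:  # one entry per assumed LDA parameter setting
        for pid, words in papers.items():
            if matches(query_cnf, words, word_topic):
                counts[pid] += 1
    # Highest match count first; ties broken by paper id for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy abstracts, already tokenized (assumed data).
papers = {
    "p1": ["automobile", "engine", "emission"],
    "p2": ["car", "market", "price"],
    "p3": ["protein", "folding"],
}

# Boolean query: car AND (engine OR motor).
query = [["car"], ["engine", "motor"]]

# Two hand-written word-to-topic assignments standing in for two LDA runs.
runs = [
    {"car": 0, "automobile": 0, "engine": 1, "motor": 1, "emission": 2,
     "market": 3, "price": 3, "protein": 4, "folding": 4},
    {"car": 0, "automobile": 0, "engine": 0, "motor": 0, "emission": 1,
     "market": 1, "price": 1, "protein": 2, "folding": 2},
]

ranking = rank_by_match_count(query, papers, runs)
print(ranking)  # [('p1', 2), ('p2', 1), ('p3', 0)]
```

In this toy data, "p1" never contains the literal term "car" but is still retrieved in both runs because "automobile" shares a topic with "car", which is the synonym effect the abstract expects. The coarser second run also retrieves "p2", so counting matches across runs places "p1" above "p2".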

Acknowledgements

This work was supported by JSPS KAKENHI Grant Number JP15H01721. We thank Stuart Jenkinson, PhD, from Edanz Group (www.edanzediting.com/ac) for editing a draft of this manuscript.

Author information

Corresponding author

Correspondence to Satoshi Fukuda.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Fukuda, S., Tomiura, Y., Ishita, E. (2019). Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11707. Springer, Cham. https://doi.org/10.1007/978-3-030-27618-8_5

  • DOI: https://doi.org/10.1007/978-3-030-27618-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27617-1

  • Online ISBN: 978-3-030-27618-8

  • eBook Packages: Computer Science, Computer Science (R0)
