
Reading Book by the Cover—Book Genre Detection Using Short Descriptions

  • Conference paper
Man-Machine Interactions 5 (ICMMI 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 659)


Abstract

This paper addresses short text classification, working with free-text book descriptions gathered by crawling the GoodReads portal. The descriptions are short, often incomplete, and highly biased towards the genre of their respective books, so establishing a notion of proximity between such texts is a challenging task. Each book is assigned multiple categories out of a total of 506, which makes the problem of genre distribution statistically significant. In addition, the number of descriptions varies from genre to genre, so the data are imbalanced. To choose the best text classification method for this task, we examine several methods, including baseline naive Bayes models and semantic enrichment methods that use neural distributional models. The algorithms are evaluated in terms of classification quality on a unique data set of almost two hundred thousand book descriptions.
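As an illustration of the multi-label setting described in the abstract, the following is a minimal sketch of a naive Bayes baseline that trains one binary classifier per genre (a one-vs-rest scheme). It is not the authors' implementation: the whitespace tokenizer, the Laplace smoothing, the one-vs-rest decomposition, the names `BinaryNB`, `train_genre_models` and `predict_genres`, and the five toy descriptions are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class BinaryNB:
    """Multinomial naive Bayes for one genre: 'has genre' vs 'does not'."""

    def fit(self, docs, labels):
        # Word counts and document counts for the positive and negative class.
        self.counts = {True: Counter(), False: Counter()}
        self.n_docs = {True: 0, False: 0}
        for tokens, has_genre in zip(docs, labels):
            self.counts[has_genre].update(tokens)
            self.n_docs[has_genre] += 1
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def log_score(self, tokens, has_genre):
        # Laplace-smoothed log prior plus log likelihood of the known tokens.
        total_docs = self.n_docs[True] + self.n_docs[False]
        score = math.log((self.n_docs[has_genre] + 1) / (total_docs + 2))
        total_words = sum(self.counts[has_genre].values())
        vocab_size = len(self.vocab)
        for token in tokens:
            if token in self.vocab:  # tokens unseen in training are skipped
                count = self.counts[has_genre][token]
                score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def predict(self, tokens):
        return self.log_score(tokens, True) > self.log_score(tokens, False)


def train_genre_models(descriptions, genre_sets):
    """Train one binary classifier per genre (hypothetical helper)."""
    docs = [tokenize(d) for d in descriptions]
    genres = sorted(set().union(*genre_sets))
    return {g: BinaryNB().fit(docs, [g in s for s in genre_sets]) for g in genres}


def predict_genres(models, description):
    """Return every genre whose classifier accepts the description."""
    tokens = tokenize(description)
    return {g for g, model in models.items() if model.predict(tokens)}


# Toy data invented for this sketch; each book may carry several genres.
descriptions = [
    "a wizard fights a dragon with magic",
    "a detective solves a murder in london",
    "a young wizard solves a murder with magic",
    "a dragon guards a magic kingdom",
    "a detective hunts a killer",
]
genre_sets = [
    {"fantasy"},
    {"mystery"},
    {"fantasy", "mystery"},
    {"fantasy"},
    {"mystery"},
]

models = train_genre_models(descriptions, genre_sets)
print(predict_genres(models, "magic dragon"))
print(predict_genres(models, "detective killer"))
```

Training a separate classifier per genre lets a description receive zero, one, or several genres, which matches the multi-category assignment mentioned above. The sketch makes no attempt to handle the class imbalance or the semantic enrichment step that the paper examines.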


Notes

  1. https://www.goodreads.com/

  2. https://code.google.com/p/word2vec/

  3. https://www.goodreads.com/


Author information

Corresponding author

Correspondence to Antoni Sobkowicz.


Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Sobkowicz, A., Kozłowski, M., Buczkowski, P. (2018). Reading Book by the Cover—Book Genre Detection Using Short Descriptions. In: Gruca, A., Czachórski, T., Harezlak, K., Kozielski, S., Piotrowska, A. (eds) Man-Machine Interactions 5. ICMMI 2017. Advances in Intelligent Systems and Computing, vol 659. Springer, Cham. https://doi.org/10.1007/978-3-319-67792-7_43


  • DOI: https://doi.org/10.1007/978-3-319-67792-7_43


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67791-0

  • Online ISBN: 978-3-319-67792-7

  • eBook Packages: Engineering, Engineering (R0)
