
Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text

  • Conference paper
Data Management, Analytics and Innovation (ICDMAI 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 662)


Abstract

Generating a meaningful summary for a given natural language text is one of the most challenging and popular tasks in the present era. Researchers have proposed various techniques for both abstractive and extractive summarization. This experimental study focuses on extractive summarization. In graph-based extractive text summarization techniques, the sentences of the input document serve as the nodes of the graph, and various similarity measures are used to weight the edges. Each node's rating is determined using graph ranking algorithms, and the top-ranked nodes (sentences) are then added to the output extractive summary. In this work, we first translate a publicly available dataset into Hindi text using the Google Translate service. Next, we apply a pre-trained multi-lingual transformer to generate an embedding vector for each sentence of the document. We use these embedding vectors as the nodes of the graph; the rest of the approach remains unchanged. Finally, we evaluate the generated extractive summaries on the basis of ROUGE scores. The evaluation results indicate that using a pre-trained multi-lingual transformer can be effective in generating more meaningful extractive summaries.
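As a concrete illustration of the pipeline described above, the following is a minimal sketch in Python. It assumes the sentence-transformers library with a pre-trained multi-lingual model (the model name paraphrase-multilingual-MiniLM-L12-v2 is chosen here purely for illustration), cosine similarity as the edge weight, and PageRank from networkx for node ranking; the paper's exact model, similarity measure, and ranking settings may differ.

```python
# Hedged sketch of graph-based extractive summarization with multilingual
# sentence embeddings. The model name and the choice of cosine similarity
# are illustrative assumptions, not necessarily the paper's configuration.
import networkx as nx
import numpy as np
from sentence_transformers import SentenceTransformer


def summarize(sentences, top_k=3):
    """Return the top_k sentences ranked by PageRank over a similarity graph."""
    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    embeddings = model.encode(sentences)  # one embedding vector per sentence (graph node)

    # Edge weights: cosine similarity between every pair of sentence embeddings.
    sims = np.inner(embeddings, embeddings)
    norms = np.linalg.norm(embeddings, axis=1)
    sims = sims / np.outer(norms, norms)

    graph = nx.Graph()
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            graph.add_edge(i, j, weight=float(sims[i, j]))

    # Rank the nodes (sentences) and keep the top-ranked ones,
    # preserving their original order in the document.
    scores = nx.pagerank(graph, weight="weight")
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]
```

The resulting extract could then be compared against a reference summary with an off-the-shelf ROUGE implementation (for example, the rouge-score package) to obtain the kind of evaluation reported in this study.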


Notes

  1. https://pypi.org/project/googletrans/
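     As a minimal sketch of the translation step mentioned in the abstract, the snippet below uses the googletrans package linked above. It assumes a release in which Translator.translate is synchronous, and the example sentence is purely illustrative.

```python
# Minimal sketch: translating an English sentence to Hindi with googletrans.
# Assumes a googletrans release with a synchronous Translator.translate API.
from googletrans import Translator

translator = Translator()
result = translator.translate(
    "Text summarization is a challenging task.", src="en", dest="hi"
)
print(result.text)  # Hindi text returned by the Google Translate service
```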


Author information


Corresponding author

Correspondence to Sawan Rai.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Rai, S., Belwal, R.C., Sharma, A. (2023). Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text. In: Sharma, N., Goje, A., Chakrabarti, A., Bruckstein, A.M. (eds) Data Management, Analytics and Innovation. ICDMAI 2023. Lecture Notes in Networks and Systems, vol 662. Springer, Singapore. https://doi.org/10.1007/978-981-99-1414-2_30
