
Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text

  • Conference paper
Data Management, Analytics and Innovation (ICDMAI 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 662)


Abstract

Generating a meaningful summary for a given natural language text is one of the most challenging and popular tasks in the present era. Researchers have proposed various techniques for both abstractive and extractive summarization. This experimental study focuses on extractive summarization. In graph-based extractive text summarization techniques, the sentences of the input document serve as the nodes of the graph, and various similarity measures are used to weight the edges. Each node's rating is determined using graph ranking algorithms, and the top-ranked nodes (sentences) are then added to the output extractive summary. In this work, we first translate a publicly available dataset into Hindi text using the Google Translate service. Next, we apply a pre-trained multi-lingual transformer to generate an embedding vector for each sentence of the document. We use these embedding vectors as the nodes of the graph; the rest of the approach remains unchanged. Finally, we evaluate the generated extractive summaries on the basis of ROUGE scores. The evaluation results indicate that using a pre-trained multi-lingual transformer can be effective in generating more meaningful extractive summaries.
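As a concrete illustration of the pipeline described above, the following is a minimal sketch in Python. It assumes the sentence-transformers library with a pre-trained multi-lingual model (the model name paraphrase-multilingual-MiniLM-L12-v2 is chosen here purely for illustration), cosine similarity as the edge weight, and PageRank from networkx for node ranking; the paper's exact model, similarity measure, and ranking settings may differ.

```python
# Hedged sketch of graph-based extractive summarization with multilingual
# sentence embeddings. The model name and the choice of cosine similarity
# are illustrative assumptions, not necessarily the paper's configuration.
import networkx as nx
import numpy as np
from sentence_transformers import SentenceTransformer


def summarize(sentences, top_k=3):
    """Return the top_k sentences ranked by PageRank over a similarity graph."""
    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    embeddings = model.encode(sentences)  # one embedding vector per sentence (graph node)

    # Edge weights: cosine similarity between every pair of sentence embeddings.
    sims = np.inner(embeddings, embeddings)
    norms = np.linalg.norm(embeddings, axis=1)
    sims = sims / np.outer(norms, norms)

    graph = nx.Graph()
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            graph.add_edge(i, j, weight=float(sims[i, j]))

    # Rank the nodes (sentences) and keep the top-ranked ones,
    # preserving their original order in the document.
    scores = nx.pagerank(graph, weight="weight")
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]
```

The resulting extract could then be compared against a reference summary with an off-the-shelf ROUGE implementation (for example, the rouge-score package) to obtain the kind of evaluation reported in this study.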


Notes

  1. https://pypi.org/project/googletrans/
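     As a minimal sketch of the translation step mentioned in the abstract, the snippet below uses the googletrans package linked above. It assumes a release in which Translator.translate is synchronous, and the example sentence is purely illustrative.

```python
# Minimal sketch: translating an English sentence to Hindi with googletrans.
# Assumes a googletrans release with a synchronous Translator.translate API.
from googletrans import Translator

translator = Translator()
result = translator.translate(
    "Text summarization is a challenging task.", src="en", dest="hi"
)
print(result.text)  # Hindi text returned by the Google Translate service
```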


Author information


Corresponding author

Correspondence to Sawan Rai.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Rai, S., Belwal, R.C., Sharma, A. (2023). Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text. In: Sharma, N., Goje, A., Chakrabarti, A., Bruckstein, A.M. (eds) Data Management, Analytics and Innovation. ICDMAI 2023. Lecture Notes in Networks and Systems, vol 662. Springer, Singapore. https://doi.org/10.1007/978-981-99-1414-2_30
