Abstract
Generating a meaningful summary of a given natural language text is a challenging and popular task. Researchers have proposed various techniques for abstractive and extractive summarization. This experimental study focuses on extractive summarization. In graph-based extractive text summarization, the sentences of the input document serve as the nodes of a graph, and similarity measures are used to weight the edges. A graph ranking algorithm assigns a score to each node, and the top-ranked nodes (sentences) are added to the output extractive summary. In this work, we first translate a publicly available dataset into Hindi using the Google Translate service. Next, we apply a pre-trained multi-lingual transformer to generate an embedding vector for each sentence of the document. We use these embedding vectors as the nodes of the graph; the rest of the approach remains unchanged. Finally, we evaluate the generated extractive summaries on the basis of ROUGE scores. The evaluation results indicate that a pre-trained multi-lingual transformer can be effective for generating more meaningful extractive summaries.
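The graph-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cosine similarity as the edge weight and a PageRank-style power iteration as the graph ranking algorithm, and it uses small hand-written vectors in place of real transformer embeddings. The function names (`cosine`, `rank_sentences`, `summarize`) and the parameter values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_sentences(embeddings, damping=0.85, iterations=100, tol=1e-8):
    """Score each sentence with a PageRank-style iteration over a
    similarity graph (assumed ranking algorithm, for illustration only)."""
    n = len(embeddings)
    # Edge weights: cosine similarity, clipped at 0, with no self-loops.
    weights = [
        [max(cosine(embeddings[i], embeddings[j]), 0.0) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score,
            # proportional to the weight of the edge (j, i).
            incoming = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def summarize(sentences, embeddings, k=2):
    """Return the k top-ranked sentences in their original document order."""
    scores = rank_sentences(embeddings)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Toy input: placeholder sentences and 2-D vectors standing in for
# the embeddings a multi-lingual transformer would produce.
sentences = ["s0", "s1", "s2", "s3"]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
summary = summarize(sentences, embeddings, k=3)
print(summary)
```

In this toy input the fourth vector points away from the other three, so its sentence receives little score from its neighbours and is the one left out of the summary. In practice the vectors would come from the pre-trained multi-lingual transformer, and the similarity measure and ranking algorithm would follow whichever graph-based method is being reproduced.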
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rai, S., Belwal, R.C., Sharma, A. (2023). Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text. In: Sharma, N., Goje, A., Chakrabarti, A., Bruckstein, A.M. (eds) Data Management, Analytics and Innovation. ICDMAI 2023. Lecture Notes in Networks and Systems, vol 662. Springer, Singapore. https://doi.org/10.1007/978-981-99-1414-2_30
DOI: https://doi.org/10.1007/978-981-99-1414-2_30
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1413-5
Online ISBN: 978-981-99-1414-2
eBook Packages: Intelligent Technologies and Robotics (R0)