ABSTRACT
Automatic text summarization is a crucial technology in natural language processing, used to meet the demands of processing large volumes of text by extracting key information effectively and thereby improving task efficiency. With the rise of pretrained language models, abstractive text summarization has progressively become mainstream, producing fluent summaries that capture a document's core content. Nonetheless, abstractive summarization inevitably suffers from inconsistencies with the original text. This paper introduces a sequence tagging task to enable multi-task learning for abstractive summarization models. For this sequence tagging task, we designed annotated datasets at both the entity and sentence levels, based on an analysis of the XSum dataset, with the aim of improving the factual consistency of generated summaries. Experimental results demonstrate that the optimized BART model performs favorably on both ROUGE and FactCC metrics.
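To make the setup concrete, the following is a minimal sketch of one way such multi-task learning could be wired up: a BART summarization model with an auxiliary token-classification head over the encoder, trained jointly with the generation objective. The class name, label count, and loss weight here are illustrative assumptions, not the paper's released implementation.

```python
# Illustrative sketch only: BART with an auxiliary sequence tagging head
# for multi-task learning. The label scheme, loss weight, and names are
# assumptions, not taken from the paper.
import torch.nn as nn
from transformers import BartForConditionalGeneration

class MultiTaskBart(nn.Module):
    def __init__(self, model_name="facebook/bart-large",
                 num_tag_labels=2, tag_loss_weight=0.5):
        super().__init__()
        self.bart = BartForConditionalGeneration.from_pretrained(model_name)
        # Token-level classifier over encoder states, e.g. for entity- or
        # sentence-level consistency labels on the source document.
        self.tag_head = nn.Linear(self.bart.config.d_model, num_tag_labels)
        self.tag_loss_weight = tag_loss_weight

    def forward(self, input_ids, attention_mask, labels, tag_labels):
        # Standard summarization pass; out.loss is the generation (LM) loss.
        out = self.bart(input_ids=input_ids,
                        attention_mask=attention_mask,
                        labels=labels)
        # The encoder's final hidden states feed the tagging head.
        tag_logits = self.tag_head(out.encoder_last_hidden_state)
        tag_loss = nn.functional.cross_entropy(
            tag_logits.view(-1, tag_logits.size(-1)),
            tag_labels.view(-1),
            ignore_index=-100)  # -100 masks padding / unannotated tokens
        # Joint objective: summarization loss plus weighted tagging loss.
        return out.loss + self.tag_loss_weight * tag_loss
```

The shared encoder is the point of the design: gradients from the tagging loss push the encoder to represent entity- and sentence-level factual signals that the decoder can then exploit during generation.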