Abstract
Multi-document summarization (MDS) is a challenging research topic in natural language processing. To obtain an effective summary, this paper presents a novel extractive approach based on a graph-based sub-topic partition algorithm (GSPSummary). In particular, a sub-topic model based on graph representation is presented, with emphasis on the implicit logical structure of the topic covered in the document collection. A new MDS framework built on sub-topic partition is then proposed. Furthermore, a novel scalable ranking criterion is adopted that integrates both word-based features and global features. Experimental results on DUC2005 show that the proposed approach can significantly outperform the approaches of the top-performing systems in DUC tasks.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, J., Cheng, X., Xu, H. (2008). GSPSummary: A Graph-Based Sub-topic Partition Algorithm for Summarization. In: Li, H., Liu, T., Ma, WY., Sakai, T., Wong, KF., Zhou, G. (eds) Information Retrieval Technology. AIRS 2008. Lecture Notes in Computer Science, vol 4993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68636-1_31
DOI: https://doi.org/10.1007/978-3-540-68636-1_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68633-0
Online ISBN: 978-3-540-68636-1
eBook Packages: Computer Science, Computer Science (R0)