Multi-document Summarization Using Weighted Similarity Between Topic and Clustering-Based Non-negative Semantic Feature

Park, Sun; Lee, Ju-Hong; Kim, Deok-Hwan; Ahn, Chan-Min

doi:10.1007/978-3-540-72524-4_14


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4505)


Abstract

This paper presents a new multi-document summarization method that uses a weighted similarity between the topic and non-negative semantic features to extract meaningful sentences relevant to a given topic. The proposed method decomposes a sentence into a linear combination of sparse non-negative semantic features, so that a sentence is represented as the sum of a few intuitively comprehensible semantic features. By using the weighted similarity measure between the topic and the semantic features, it avoids extracting sentences that are highly similar to the topic but meaningless. Clustering the sentences removes noise, which keeps biased semantics of the documents from being reflected in the summaries. In addition, the method enhances the coherence of the summaries by arranging the extracted sentences in order of their rank. Experimental results on DUC data show that the proposed method achieves better performance than the other methods it is compared with.
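
The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. It factors a term-by-sentence matrix A into non-negative matrices W (semantic features) and H (per-sentence feature weights) with the multiplicative updates of Lee and Seung, then scores each sentence by summing its feature weights, each scaled by the cosine similarity between the topic and that feature. The scoring rule, the toy matrix, and the names `nmf`, `cosine`, and `rank_sentences` are illustrative assumptions rather than the authors' definitions, and the sentence clustering step is omitted.

```python
import math
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def nmf(A, r, iters=200, seed=0, eps=1e-9):
    """Factor A (terms x sentences) into non-negative W (terms x r) and
    H (r x sentences) using multiplicative updates (Frobenius objective)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(r)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(m)]
    return W, H

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def rank_sentences(A, topic, r=2):
    """Assumed scoring rule: score(j) = sum_k cos(topic, feature_k) * H[k][j]."""
    W, H = nmf(A, r)
    features = transpose(W)  # each row is one semantic feature over the terms
    feat_sim = [cosine(topic, f) for f in features]
    n = len(A[0])
    scores = [sum(feat_sim[k] * H[k][j] for k in range(r)) for j in range(n)]
    order = sorted(range(n), key=lambda j: scores[j], reverse=True)
    return order, scores

# Toy data (hypothetical): 4 terms x 4 sentences, topic uses terms 0 and 1.
A = [[2, 1, 0, 0],
     [1, 2, 0, 0],
     [0, 0, 2, 1],
     [0, 0, 1, 2]]
topic = [1, 1, 0, 0]
order, scores = rank_sentences(A, topic)
print(order, [round(s, 3) for s in scores])
```

Here `order` lists sentence indices from highest to lowest score, which mirrors the abstract's point about arranging extracted sentences by rank. Because both factors stay non-negative, each sentence's score is an additive combination of feature contributions, which is what makes the features easy to interpret.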



Author information

Authors and Affiliations

Dept. of Computer Science & Information Engineering, Inha University, Incheon, Korea
Sun Park, Ju-Hong Lee & Chan-Min Ahn
Dept. of Electronics Engineering, Inha University
Deok-Hwan Kim


Editor information

Guozhu Dong, Xuemin Lin, Wei Wang, Yun Yang, Jeffrey Xu Yu


About this paper

Cite this paper

Park, S., Lee, J.-H., Kim, D.-H., Ahn, C.-M. (2007). Multi-document Summarization Using Weighted Similarity Between Topic and Clustering-Based Non-negative Semantic Feature. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds) Advances in Data and Web Management. APWeb/WAIM 2007. Lecture Notes in Computer Science, vol 4505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72524-4_14


DOI: https://doi.org/10.1007/978-3-540-72524-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72483-4
Online ISBN: 978-3-540-72524-4
eBook Packages: Computer Science, Computer Science (R0)
