
Recent advances in document summarization

  • Survey Paper
  • Published:
Knowledge and Information Systems

Abstract

The task of automatic document summarization aims at generating short summaries for originally long documents. A good summary should cover the most important information of the original document or a cluster of documents, while being coherent, non-redundant and grammatically readable. Numerous approaches for automatic summarization have been developed to date. In this paper we give a self-contained, broad overview of recent progress made in document summarization within the last 5 years. Specifically, we emphasize significant contributions made in recent years that represent the state of the art of document summarization, including progress on modern sentence extraction approaches that improve concept coverage, information diversity and content coherence, as well as attempts at summarization frameworks that integrate sentence compression, and more abstractive systems that are able to produce completely new sentences. In addition, we review progress made for document summarization in domains, genres and applications that are different from traditional settings. We also point out some of the latest trends and highlight a few possible future directions.

Notes

  1. www.summly.com.

  2. However, readers are still assumed to have some basic knowledge in natural language processing and text mining in general.

  3. The tf-idf weighting scheme is a well-known concept in information retrieval. It weights each term by its term frequency (tf) in the document, combined with a complementary weight, the inverse document frequency (idf), which penalizes terms found in many documents in the collection; the idf is based on the inverse of the number of documents that contain the term.

  4. There is an equivalent definition which provides less intuition in the context of document summarization: f is submodular if and only if \(f(A)+f(B)\ge f(A\cup B) + f(A\cap B)\) for all \(A,B\subseteq V\).

  5. A set function f is called monotone, if \(f(A)\le f(B)\) whenever \(A\subseteq B\).

  6. The original paper [116] incorrectly proved a better \((1-1/\sqrt{e})\) bound, as pointed out in a later work from a different research group [134].

  7. Available at http://www.cs.cornell.edu/~rs/sfour/.

  8. Starting from [70], all these papers curiously evaluate their systems only on query-focused datasets, although they are designed for the generic case.

  9. Nevertheless, in some specific domains and genres such as meeting summarization or opinion summarization, the system has to produce abstractive summaries. We briefly introduce the relevant work in the next section.

  10. That said, designing architectures that actually work is commonly reckoned to be equally labor-intensive.

  11. The authors of [119] use ROUGE-1 recall as the fitness function for measuring summarization quality. The discreteness of the objective function (ROUGE) hampers the use of linear programming solutions. In principle, other more advanced and more efficient global optimization techniques such as Bayesian optimization [173] may also be applicable.

  12. For a more specific, comprehensive discussion on opinion summarization, readers may refer to existing survey papers (e.g., [90, 120]).

  13. A scheme of information structure that classifies sentences in scientific text into categories (such as Aim, Background, Own, Contrast and Basis) based on their rhetorical status in scientific discourse.
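
Notes 3 to 5 can be made concrete with a small sketch. The Python below is an illustration only, not code from the survey: the toy sentences, the function names, and the choice of log(N/df) as the idf weight are all assumptions. It builds a weighted term-coverage function f, checks by brute force that f is monotone (note 5) and satisfies the submodularity inequality of note 4, and shows that a squared-cardinality function fails that inequality.

```python
import math
from itertools import combinations

# Hypothetical toy data: each "sentence" is reduced to its set of terms.
SENTENCES = [
    {"summary", "covers", "content"},
    {"summary", "short"},
    {"content", "redundant", "short"},
    {"coherent", "summary"},
]


def idf_weights(docs):
    """Assumed idf variant: log(N / df), where df counts docs containing the term."""
    n = len(docs)
    vocab = set().union(*docs)
    return {t: math.log(n / sum(t in d for d in docs)) for t in vocab}


WEIGHTS = idf_weights(SENTENCES)


def coverage(selected):
    """f(S): total idf weight of the distinct terms covered by sentence indices S."""
    covered = set().union(*(SENTENCES[i] for i in selected))
    return sum(WEIGHTS[t] for t in covered)


def squared_size(selected):
    """|S|^2: monotone, but not submodular (used as a counterexample)."""
    return len(selected) ** 2


def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotone(f, ground, tol=1e-12):
    """Check f(A) <= f(B) for every pair with A a subset of B."""
    subsets = powerset(ground)
    return all(f(a) <= f(b) + tol for a in subsets for b in subsets if a <= b)


def is_submodular(f, ground, tol=1e-12):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for every pair A, B."""
    subsets = powerset(ground)
    return all(f(a) + f(b) + tol >= f(a | b) + f(a & b)
               for a in subsets for b in subsets)


if __name__ == "__main__":
    ground = range(len(SENTENCES))
    print("coverage:", is_monotone(coverage, ground), is_submodular(coverage, ground))
    print("squared size:", is_monotone(squared_size, ground),
          is_submodular(squared_size, ground))
```

The brute-force checks enumerate every pair of subsets, so they are only feasible for a handful of sentences; they are meant to make the definitions tangible, not to be run on real document clusters.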

References

  1. Alfonseca E, Pighin D, Garrido G (2013) Heady: news headline abstraction through event pattern clustering. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1243–1253

  2. Almeida M, Martins A (2013) Fast and robust compressive summarization with dual decomposition and multi-task learning. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 196–206

  3. Ayana, Shen S, Liu Z, Sun M (2016) Neural headline generation with minimum risk training. CoRR abs/1604.01904

  4. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: International conference on learning representations (ICLR)

  5. Bairi R, Iyer R, Ramakrishnan G, Bilmes J (2015) Summarization of multi-document topic hierarchies using submodular mixtures. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 553–563

  6. Banerjee S, Mitra P, Sugiyama K (2015) Multi-document abstractive summarization using ILP based multi-sentence compression. In: International joint conference on artificial intelligence

  7. Barzilay R, Elhadad M (1999) Using lexical chains for text summarization. Advances in automatic text summarization, pp 111–121

  8. Barzilay R, Elhadad N (2002) Inferring strategies for sentence ordering in multidocument news summarization. J Artif Intell Res 17:35–55

  9. Barzilay R, McKeown K (2005) Sentence fusion for multidocument news summarization. Comput Linguist 31(3):297–328. doi:10.1162/089120105774321091

  10. Baumel T, Cohen R, Elhadad M (2014) Query-chain focused summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 913–922

  11. Baumel T, Cohen R, Elhadad M (2016) Topic concentration in query focused summarization datasets. In: AAAI Conference on Artificial Intelligence

  12. Berg-Kirkpatrick T, Gillick D, Klein D (2011) Jointly learning to extract and compress. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 481–490

  13. Bing L, Li P, Liao Y, Lam W, Guo W, Passonneau R (2015) Abstractive multi-document summarization via phrase selection and merging. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1587–1597

  14. Boudin F, Mougard H, Favre B (2015) Concept-based summarization using integer linear programming: From concept pruning to multiple optimal solutions. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1914–1918

  15. Cao Z, Wei F, Dong L, Li S, Zhou M (2015) Ranking with recursive neural networks and its application to multi-document summarization. In: AAAI conference on artificial intelligence

  16. Cao Z, Wei F, Li S, Li W, Zhou M, Wang H (2015) Learning summary prior representation for extractive summarization. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 2: short papers). Association for Computational Linguistics, Beijing, pp 829–833

  17. Cao Z, Chen C, Li W, Li S, Wei F, Zhou M (2016) Tgsum: build tweet guided multi-document summarization dataset. In: AAAI conference on artificial intelligence

  18. Cao Z, Li W, Li S, Wei F, Li Y (2016) Attsum: Joint learning of focusing and summarization with neural attention. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee. Osaka, pp 547–556

  19. Carbonell JG, Goldstein J (1998) The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: SIGIR ’98: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, August 24–28, 1998, Melbourne, Australia, pp 335–336. doi:10.1145/290941.291025

  20. Carenini G, Cheung JCK, Pauls A (2013) Multi-document summarization of evaluative text. Comput Intell 29(4):545–576. doi:10.1111/j.1467-8640.2012.00417.x

  21. Celikyilmaz A, Hakkani-Tur D (2010) A hybrid hierarchical model for multi-document summarization. In: Proceedings of the 48th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Uppsala, pp 815–824

  22. Celikyilmaz A, Hakkani-Tur D (2011) Discovery of topically coherent sentences for extractive summarization. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 491–499

  23. Ceylan H, Mihalcea R, Özertem U, Lloret E, Palomar M (2010) Quantifying the limits and success of extractive summarization systems across domains. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Los Angeles, pp 903–911

  24. Chakrabarti D, Punera K (2011) Event summarization using tweets. In: International AAAI conference on web and social media

  25. Chali Y, Hasan SA (2012) On the effectiveness of using sentence compression models for query-focused multi-document summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 457–474

  26. Chan W, Zhou X, Wang W, Chua TS (2012) Community answer summarization for multi-sentence question with group l1 regularization. In: Proceedings of the 50th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Jeju Island, pp 582–591

  27. Cheng G, Xu D, Qu Y (2015) Summarizing entity descriptions for effective and efficient human-centered entity linking. In: Proceedings of the 24th international conference on World Wide Web, WWW 2015, Florence, Italy, May 18–22, 2015, pp 184–194. doi:10.1145/2736277.2741094

  28. Cheng J, Lapata M (2016) Neural summarization by extracting sentences and words. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 484–494

  29. Cheung JCK, Penn G (2013) Towards robust abstractive multi-document summarization: a caseframe analysis of centrality and domain. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1233–1242

  30. Cheung JCK, Penn G (2014) Unsupervised sentence enhancement for automatic summarization. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 775–786

  31. Chopra S, Auli M, Rush AM (2016) Abstractive sentence summarization with attentive recurrent neural networks. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 93–98

  32. Christensen J, Mausam, Soderland S, Etzioni O (2013) Towards coherent multi-document summarization. In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Atlanta, pp 1163–1173

  33. Christensen J, Soderland S, Bansal G, Mausam (2014) Hierarchical summarization: Scaling up multi-document summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 902–912

  34. Clarke J, Lapata M (2008) Global inference for sentence compression: an integer linear programming approach. J Artif Intell Res 31:399–429. doi:10.1613/jair.2433

  35. Cohan A, Goharian N (2015) Scientific article summarization using citation-context and article’s discourse structure. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 390–400

  36. Cohen WW, Schapire RE, Singer Y (1999) Learning to order things. J Artif Intell Res 10:243–270. doi:10.1613/jair.587

  37. Conroy JM, O’Leary DP (2001) Text summarization via hidden Markov models. In: SIGIR 2001: proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, September 9–13, 2001, New Orleans, Louisiana, USA, pp 406–407. doi:10.1145/383952.384042

  38. Contractor D, Guo Y, Korhonen A (2012) Using argumentative zones for extractive summarization of scientific articles. In: Proceedings of COLING 2012, The COLING 2012 Organizing Committee. Mumbai, India, pp 663–678

  39. Das D, Martins AF (2007) A survey on automatic text summarization. Lit Surv Lang Stat II Course CMU 4:192–195

  40. Dasgupta A, Kumar R, Ravi S (2013) Summarization through submodularity and dispersion. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1014–1022

  41. Davis ST, Conroy JM, Schlesinger JD (2012) OCCAMS – an optimal combinatorial covering algorithm for multi-document summarization. In: 2012 IEEE 12th international conference on data mining workshops. IEEE, pp 454–463

  42. Delort JY, Alfonseca E (2012) Dualsum: a topic-model based approach for update summarization. In: Proceedings of the 13th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Avignon, pp 214–223

  43. Di Fabbrizio G, Stent A, Gaizauskas R (2014) A hybrid approach to multi-document summarization of opinions in reviews. In: Proceedings of the 8th international natural language generation conference (INLG). Association for Computational Linguistics, Philadelphia, pp 54–63

  44. Duan Y, Chen Z, Wei F, Zhou M, Shum HY (2012) Twitter topic summarization by ranking tweets using social influence and content quality. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 763–780

  45. Durrett G, Berg-Kirkpatrick T, Klein D (2016) Learning-based single-document summarization with compression and anaphoricity constraints. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1998–2008

  46. Elsner M, Santhanam D (2011) Learning to fuse disparate sentences. In: Proceedings of the workshop on monolingual text-to-text generation. Association for Computational Linguistics, Portland, pp 54–63

  47. Erkan G, Radev DR (2004) Lexrank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22:457–479

  48. Fang Y, Teufel S (2014) A summariser based on human memory limitations and lexical competition. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Gothenburg, pp 732–741

  49. Fang Y, Teufel S (2016) Improving argument overlap for proposition-based summarisation. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Berlin, pp 479–485

  50. Fang Y, Zhu H, Muszyńska E, Kuhnle A, Teufel S (2016) A proposition-based abstractive summariser. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee. Osaka, pp 567–578

  51. Filippova K (2010) Multi-sentence compression: Finding shortest paths in word graphs. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010). Coling 2010 Organizing Committee, Beijing, pp 322–330

  52. Fried D, Jansen P, Hahn-Powell G, Surdeanu M, Clark P (2015) Higher-order lexical semantic models for non-factoid answer reranking. Trans Assoc Comput Linguist 3:197–210

  53. Galanis D, Lampouras G, Androutsopoulos I (2012) Extractive multi-document summarization with integer linear programming and support vector regression. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 911–926

  54. Gambhir M, Gupta V (2016) Recent automatic text summarization techniques: a survey. Artif Intell Rev 47:1–66

  55. Ganesan K, Zhai C, Han J (2010) Opinosis: a graph based approach to abstractive summarization of highly redundant opinions. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010). Coling 2010 Organizing Committee, Beijing, pp 340–348

  56. Gao D, Li W, Zhang R (2013) Sequential summarization: a new application for timely updated Twitter trending topics. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 567–571

  57. Ge T, Pei W, Ji H, Li S, Chang B, Sui Z (2015) Bring you to the past: Automatic generation of topically relevant event chronicles. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 575–585

  58. Ge T, Cui L, Chang B, Li S, Zhou M, Sui Z (2016) News stream summarization using burst information networks. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 784–794

  59. Genest PE, Lapalme G (2012) Fully abstractive approach to guided summarization. In: Proceedings of the 50th annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Jeju Island, pp 354–358

  60. Gerani S, Mehdad Y, Carenini G, Ng RT, Nejat B (2014) Abstractive summarization of product reviews using discourse structure. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 1602–1613

  61. Gillenwater J, Kulesza A, Taskar B (2012) Discovering diverse and salient threads in document collections. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. Association for Computational Linguistics, Jeju Island, pp 710–720

  62. Gillick D, Favre B, Hakkani-Tur D (2008) The ICSI summarization system at TAC 2008. In: Proceedings of the text understanding conference

  63. Gillick D, Favre B, Hakkani-Tur D, Bohnet B, Liu Y, Xie S (2009) The ICSI/UTD summarization system at TAC 2009. In: Proceedings of the second text analysis conference. National Institute of Standards and Technology, Gaithersburg

  64. Gorinski PJ, Lapata M (2015) Movie script summarization as graph-based scene extraction. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1066–1076

  65. Graham Y (2015) Re-evaluating automatic summarization with BLEU and 192 shades of ROUGE. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 128–137

  66. Gu J, Lu Z, Li H, Li VO (2016) Incorporating copying mechanism in sequence-to-sequence learning. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1631–1640

  67. Gulcehre C, Ahn S, Nallapati R, Zhou B, Bengio Y (2016) Pointing the unknown words. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 140–149

  68. Haghighi A, Vanderwende L (2009) Exploring content models for multi-document summarization. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Boulder, pp 362–370

  69. He L, Li W, Zhuge H (2016) Exploring differential topic models for comparative summarization of scientific papers. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1028–1038

  70. He Z, Chen C, Bu J, Wang C, Zhang L, Cai D, He X (2012) Document summarization based on data reconstruction. In: AAAI conference on artificial intelligence

  71. Hirao T, Yoshida Y, Nishino M, Yasuda N, Nagata M (2013) Single-document summarization as a tree knapsack problem. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 1515–1520

  72. Hong K, Nenkova A (2014) Improving the estimation of word importance for news multi-document summarization. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Gothenburg, pp 712–721

  73. Hong K, Conroy J, Favre B, Kulesza A, Lin H, Nenkova A (2014) A repository of state of the art and competitive baseline summaries for generic news summarization. In: Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S (eds) Proceedings of the ninth international conference on language resources and evaluation (LREC’14). European Language Resources Association (ELRA), Reykjavik, pp 1608–1616, ACL Anthology Identifier: L14-1070

  74. Hong K, Marcus M, Nenkova A (2015) System combination for multi-document summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 107–117

  75. Hovy E, Lin CY, Zhou L, Fukumoto J (2006) Automated summarization evaluation with basic elements. In: Proceedings of the Fifth conference on language resources and evaluation (LREC 2006), Citeseer, pp 604–611

  76. Hu B, Chen Q, Zhu F (2015) LCSTS: a large scale Chinese short text summarization dataset. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1967–1972

  77. Hu P, Ji D, Teng C, Guo Y (2012) Context-enhanced personalized social summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 1223–1238

  78. Hu Y, Wan X (2015) Ppsgen: Learning-based presentation slides generation for academic papers. IEEE Trans Knowl Data Eng 27(4):1085–1097. doi:10.1109/TKDE.2014.2359652

  79. Huang X, Wan X, Xiao J (2011) Comparative news summarization using linear programming. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 648–653

  80. Iyer S, Konstas I, Cheung A, Zettlemoyer L (2016) Summarizing source code using a neural attention model. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 2073–2083

  81. Jayanth J, Sundararaj J, Bhattacharyya P (2015) Monotone submodularity in opinion summaries. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 169–178

  82. Jha R, Finegan-Dollak C, King B, Coke R, Radev D (2015) Content models for survey generation: a factoid-based evaluation. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 441–450

  83. Ji H, Favre B, Lin WP, Gillick D, Hakkani-Tur D, Grishman R (2013) Open-domain multi-document summarization via information extraction: challenges and prospects. In: Poibeau T, Saggion H, Piskorski J, Yangarber R (eds) Multi-source, multilingual information extraction and summarization. Springer, Berlin, pp 177–201

  84. Ji Y, Haffari G, Eisenstein J (2016) A latent variable recurrent neural network for discourse-driven language models. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 332–342

  85. Judd J, Kalita J (2013) Better Twitter summaries? In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Atlanta, pp 445–449

  86. Kedzie C, McKeown K, Diaz F (2015) Predicting salient updates for disaster summarization. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1608–1617

  87. Kedzie C, Diaz F, McKeown K (2016) Real-time web scale event summarization using sequential decision making. In: International joint conference on artificial intelligence, pp 3754–3760

  88. Kikuchi Y, Hirao T, Takamura H, Okumura M, Nagata M (2014) Single document summarization based on nested tree structure. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Baltimore, pp 315–320

  89. Kikuchi Y, Neubig G, Sasano R, Takamura H, Okumura M (2016) Controlling output length in neural encoder-decoders. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 1328–1338

  90. Kim HD, Ganesan K, Sondhi P, Zhai CX (2011) Comprehensive review of opinion summarization. UIUC Technical Report, USA

  91. Kobayashi H, Noguchi M, Yatsuka T (2015) Summarization based on embedding distributions. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1984–1989

  92. Kågebäck M, Mogren O, Tahmasebi N, Dubhashi D (2014) Extractive summarization using continuous vector space models. In: Proceedings of the 2nd workshop on continuous vector space models and their compositionality (CVSC). Association for Computational Linguistics, Gothenburg, pp 31–39

  93. Kulesza A, Taskar B (2011) Learning determinantal point processes. In: Proceedings of the 27th conference on uncertainty in artificial intelligence

  94. Kulesza A, Taskar B (2012) Determinantal point processes for machine learning. Found Trends Mach Learn 5(2–3):123–286

  95. Lei T, Barzilay R, Jaakkola T (2016) Rationalizing neural predictions. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 107–117

  96. Li C, Liu F, Weng F, Liu Y (2013) Document summarization via guided sentence compression. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 490–500

  97. Li C, Qian X, Liu Y (2013) Using supervised bigram-based ILP for extractive summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1004–1013

  98. Li C, Liu Y, Liu F, Zhao L, Weng F (2014) Improving multi-documents summarization by sentence compression based on expanded constituent parse trees. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 691–701

  99. Li C, Liu Y, Zhao L (2015) Improving update summarization via supervised ILP and sentence reranking. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1317–1322

  100. Li C, Liu Y, Zhao L (2015) Using external resources and joint learning for bigram weighting in ILP-based multi-document summarization. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 778–787

  101. Li C, Wei Z, Liu Y, Jin Y, Huang F (2016) Using relevant public posts to enhance news article summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee. Osaka, pp 557–566

  102. Li J, Cardie C (2014) Timeline generation: tracking individuals on Twitter. In: 23rd international world wide web conference, WWW ’14, Seoul, Republic of Korea, April 7–11, 2014, pp 643–652. doi:10.1145/2566486.2567969

  103. Li J, Li S (2013) Evolutionary hierarchical Dirichlet process for timeline summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 556–560

  104. Li J, Li S (2013) A novel feature-based Bayesian model for query focused multi-document summarization. Trans Assoc Comput Linguist 1:89–98

  105. Li J, Li S, Wang X, Tian Y, Chang B (2012) Update summarization using a multi-level hierarchical Dirichlet process model. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 1603–1618

  106. Li J, Gao W, Wei Z, Peng B, Wong KF (2015) Using content-level structures for summarizing microblog repost trees. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 2168–2178

  107. Li J, Luong T, Jurafsky D (2015) A hierarchical neural autoencoder for paragraphs and documents. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1106–1115

  108. Li JJ, Thadani K, Stent A (2016) The role of discourse units in near-extractive summarization. In: Proceedings of the 17th annual meeting of the special interest group on discourse and dialogue. Association for Computational Linguistics, Los Angeles, pp 137–147

  109. Li L, Zhou K, Xue G, Zha H, Yu Y (2009) Enhancing diversity, coverage and balance for summarization through structure learning. In: Proceedings of the 18th international conference on world wide web, WWW 2009, Madrid, Spain, April 20–24, 2009, pp 71–80. doi:10.1145/1526709.1526720

  110. Li P, Bing L, Lam W, Li H, Liao Y (2015) Reader-aware multi-document summarization via sparse coding. In: International joint conference on artificial intelligence

  111. Li Y, Li S (2014) Query-focused multi-document summarization: Combining a topic model with graph-based semi-supervised learning. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1197–1207

  112. Liakata M, Dobnik S, Saha S, Batchelor C, Rebholz-Schuhmann D (2013) A discourse-driven content model for summarising scientific articles evaluated in a complex question answering task. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 747–757

  113. Lin CY (2003) Improving summarization performance by sentence compression—a pilot study. In: Proceedings of the sixth international workshop on information retrieval with Asian languages. Association for Computational Linguistics, Sapporo, pp 1–8

  114. Lin CY, Hovy E (2000) The automated acquisition of topic signatures for text summarization. In: Proceedings of the 18th conference on computational linguistics—volume 1. Association for Computational Linguistics, pp 495–501

  115. Lin CY, Hovy E (2003) Automatic evaluation of summaries using n-gram co-occurrence statistics. In: Proceedings of the 2003 conference of the North American chapter of the Association for Computational Linguistics on human language technology—volume 1. Association for Computational Linguistics, pp 71–78

  116. Lin H, Bilmes J (2010) Multi-document summarization via budgeted maximization of submodular functions. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Los Angeles, pp 912–920

  117. Lin H, Bilmes J (2011) A class of submodular functions for document summarization. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 510–520

  118. Lin H, Bilmes JA (2012) Learning mixtures of submodular shells with application to document summarization. In: Proceedings of the 28th conference on uncertainty in artificial intelligence

  119. Litvak M, Last M (2013) Cross-lingual training of summarization systems using annotated corpora in a foreign language. Inf Retr 16(5):629–656. doi:10.1007/s10791-012-9210-3

  120. Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5(1):1–167

  121. Liu F, Flanigan J, Thomson S, Sadeh N, Smith NA (2015) Toward abstractive summarization using semantic representations. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1077–1086

  122. Liu H, Yu H, Deng ZH (2015) Multi-document summarization based on two-level sparse representation model. In: AAAI conference on artificial intelligence

  123. Liu X, Li Y, Wei F, Zhou M (2012) Graph-based multi-tweet summarization using social signals. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 1699–1714

  124. Liu Y, Zhong SH, Li W (2012) Query-oriented multi-document summarization via unsupervised deep learning. In: AAAI conference on artificial intelligence, pp 1699–1705

  125. Lloret E, Palomar M (2013) Towards automatic tweet generation: a comparative study from the text summarization perspective in the journalism genre. Expert Syst Appl 40(16):6624–6630. doi:10.1016/j.eswa.2013.06.021

  126. Louis A, Nenkova A (2013) Automatically assessing machine summary content without a gold standard. Comput Linguist 39(2):267–300

  127. Loza V, Lahiri S, Mihalcea R, Lai PH (2014) Building a dataset for summarization and keyword extraction from emails. In: Proceedings of the ninth international conference on language resources and evaluation (LREC’14). European Language Resources Association (ELRA), Reykjavik, Iceland, pp 2441–2446, ACL Anthology Identifier: L14-1028

  128. Luo W, Litman D (2015) Summarizing student responses to reflection prompts. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1955–1960

  129. Ma S, Deng ZH, Yang Y (2016) An unsupervised multi-document summarization framework based on neural document model. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1514–1523

  130. Mann WC, Thompson SA (1988) Rhetorical structure theory: toward a functional theory of text organization. Text Interdiscip J Study Discourse 8(3):243–281

  131. McDonald RT (2007) A study of global inference algorithms in multi-document summarization. In: Advances in information retrieval, 29th European conference on IR research, ECIR 2007, Rome, Italy, April 2–5, 2007, proceedings, pp 557–564

  132. Metzler D, Kanungo T (2008) Machine learned sentence selection strategies for query-biased summarization. In: SIGIR learning to rank workshop, pp 40–47

  133. Mihalcea R, Tarau P (2004) Textrank: bringing order into texts. In: Lin D, Wu D (eds) Proceedings of EMNLP 2004. Association for Computational Linguistics, Barcelona, pp 404–411

  134. Morita H, Sasano R, Takamura H, Okumura M (2013) Subtree extractive summarization via submodular maximization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1023–1032

  135. Nallapati R, Zhou B, Gulcehre C, Xiang B (2016) Abstractive text summarization using sequence-to-sequence rnns and beyond. In: Proceedings of the 20th SIGNLL conference on computational natural language learning. Association for Computational Linguistics, Berlin, pp 280–290

  136. Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions, I. Math Program 14(1):265–294

  137. Nenkova A, McKeown K (2012) A survey of text summarization techniques. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, Berlin, pp 43–76

  138. Nenkova A, Passonneau R (2004) Evaluating content selection in summarization: the pyramid method. In: Dumais S, Marcu D, Roukos S (eds) HLT-NAACL 2004: main proceedings. Association for Computational Linguistics, Boston, pp 145–152

  139. Nenkova A, McKeown K et al (2011) Automatic summarization. Found Trends Inf Retr 5(2–3):103–233

  140. Ng JP, Abrecht V (2015) Better summarization evaluation with word embeddings for rouge. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1925–1930

  141. Ng JP, Bysani P, Lin Z, Kan MY, Tan CL (2012) Exploiting category-specific information for multi-document summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 2093–2108

  142. Ng JP, Chen Y, Kan MY, Li Z (2014) Exploiting timelines to enhance multi-document summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 923–933

  143. Nichols J, Mahmud J, Drews C (2012) Summarizing sporting events using twitter. In: Proceedings of the 2012 ACM international conference on intelligent user interfaces. ACM, pp 189–198

  144. Nishikawa H, Arita K, Tanaka K, Hirao T, Makino T, Matsuo Y (2014) Learning to generate coherent summary with discriminative hidden semi-markov model. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1648–1659

  145. Nishino M, Yasuda N, Hirao T, Minato SI, Nagata M (2015) A dynamic programming algorithm for tree trimming-based text summarization. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 462–471

  146. Olariu A (2014) Efficient online summarization of microblogging streams. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics, volume 2: short papers. Association for Computational Linguistics, Gothenburg, pp 236–240

  147. Owczarzak K, Conroy JM, Dang HT, Nenkova A (2012) An assessment of the accuracy of automatic evaluation in summarization. In: Proceedings of workshop on evaluation metrics and system comparison for automatic summarization. Association for Computational Linguistics, Montréal, pp 1–9

  148. Oya T, Mehdad Y, Carenini G, Ng R (2014) A template-based abstractive meeting summarization: Leveraging summary and source text relationships. In: Proceedings of the 8th international natural language generation conference (INLG). Association for Computational Linguistics, Philadelphia, pp 45–53

  149. Parveen D, Strube M (2015) Integrating importance, non-redundancy and coherence in graph-based extractive summarization. In: International joint conference on artificial intelligence

  150. Parveen D, Ramsl HM, Strube M (2015) Topical coherence for graph-based extractive summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1949–1954

  151. Passonneau RJ, Chen E, Guo W, Perin D (2013) Automated pyramid scoring of summaries using distributional semantics. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 143–147

  152. Pei Y, Yin W, Fan Q, Huang L (2012) A supervised aggregation framework for multi-document summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 2225–2242

  153. Peyrard M, Eckle-Kohler J (2016) Optimizing an approximation of rouge - a problem-reduction approach to extractive multi-document summarization. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1825–1836

  154. Pighin D, Cornolti M, Alfonseca E, Filippova K (2014) Modelling events through memory-based, open-ie patterns for abstractive summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 892–901

  155. Qazvinian V, Radev DR, Mohammad S, Dorr BJ, Zajic DM, Whidby M, Moon T (2013) Generating extractive summaries of scientific paradigms. J Artif Intell Res 46:165–201. doi:10.1613/jair.3732

  156. Qian X, Liu Y (2013) Fast joint compression and summarization via graph cuts. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 1492–1502

  157. Radev DR, Jing H, Sty M, Tam D (2004) Centroid-based summarization of multiple documents. Inf Process Manag 40(6):919–938. doi:10.1016/j.ipm.2003.10.006

  158. Rankel PA, Conroy JM, Dang HT, Nenkova A (2013) A decade of automatic content evaluation of news summaries: reassessing the state of the art. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 131–136

  159. Ranzato M, Chopra S, Auli M, Zaremba W (2016) Sequence level training with recurrent neural networks. In: International conference on learning representations (ICLR)

  160. Ren P, Wei F, Chen Z, Ma J, Zhou M (2016) A redundancy-aware sentence regression framework for extractive summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 33–43

  161. Ren Z, de Rijke M (2015) Summarizing contrastive themes via hierarchical non-parametric processes. In: Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval, Santiago, Chile, August 9–13, 2015, pp 93–102. doi:10.1145/2766462.2767713

  162. Rioux C, Hasan SA, Chali Y (2014) Fear the reaper: A system for automatic multi-document summarization with reinforcement learning. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 681–690

  163. Rodriguez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344(6191):1492–1496

  164. Ross S, Zhou J, Yue Y, Dey D, Bagnell D (2013) Learning policies for contextual submodular prediction. In: Proceedings of the 30th international conference on machine learning, ICML 2013, Atlanta, GA, USA, 16–21 June 2013, pp 1364–1372

  165. Rush AM, Chopra S, Weston J (2015) A neural attention model for abstractive sentence summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 379–389

  166. Saggion H (2013) Unsupervised learning summarization templates from concise summaries. In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Atlanta, pp 270–279

  167. Schluter N, Søgaard A (2015) Unsupervised extractive summarization via coverage maximization with syntactic and semantic concepts. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 2: short papers). Association for Computational Linguistics, Beijing, pp 840–844

  168. Sharifi B, Hutton MA, Kalita J (2010) Summarizing microblogs automatically. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Los Angeles, pp 685–688

  169. Shen C, Li T (2011) Learning to rank for query-focused multi-document summarization. In: 2011 IEEE 11th international conference on data mining (ICDM). IEEE, pp 626–634

  170. Shen D, Sun JT, Li H, Yang Q, Chen Z (2007) Document summarization using conditional random fields. In: International joint conference on artificial intelligence, vol 7, pp 2862–2867

  171. Sidhaye P, Cheung JCK (2015) Indicative tweet generation: an extractive summarization problem? In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 138–147

  172. Sipos R, Shivaswamy P, Joachims T (2012) Large-margin learning of submodular summarization models. In: Proceedings of the 13th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Avignon, pp 224–233

  173. Snoek J, Larochelle H, Adams RP (2012) Practical bayesian optimization of machine learning algorithms. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., Lake Tahoe, Nevada, pp 2951–2959

  174. Sukhbaatar S, Szlam A, Weston J, Fergus R (2015) End-to-end memory networks. Adv Neural Inf Process Syst 28:2440–2448

  175. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Adv Neural Inf Process Syst 27:3104–3112

  176. Swisher K (2013) Yahoo paid $30 million in cash for 18 months of young summly entrepreneur’s time. http://allthingsd.com/20130325/yahoo-paid-30-million-in-cash-for-18-months-of-young-summly-entrepreneurs-time/. Accessed 30 Dec 2016

  177. Takamura H, Yokono H, Okumura M (2011) Summarizing a document stream. In: Advances in information retrieval—33rd European conference on IR research, ECIR 2011, Dublin, Ireland, April 18–21, 2011. Proceedings, pp 177–188

  178. Thadani K, McKeown K (2013) Supervised sentence fusion with single-stage inference. In: Proceedings of the sixth international joint conference on natural language processing. Asian Federation of Natural Language Processing, Nagoya, pp 1410–1418

  179. Toutanova K, Brockett C, Tran KM, Amershi S (2016) A dataset and evaluation metrics for abstractive compression of sentences and short paragraphs. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 340–350

  180. Tran G, Herder E, Markert K (2015) Joint graphical models for date selection in timeline summarization. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1598–1607

  181. Trione J, Favre B, Béchet F (2016) Beyond utterance extraction: summary recombination for speech summarization. Interspeech 2016:680–684

  182. Vanderwende L, Suzuki H, Brockett C, Nenkova A (2007) Beyond sumbasic: task-focused summarization with sentence simplification and lexical expansion. Inf Process Manag 43(6):1606–1618. doi:10.1016/j.ipm.2007.01.023

  183. Wan X (2011) Using bilingual information for cross-language document summarization. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 1546–1555

  184. Wan X (2012) Update summarization based on co-ranking with constraints. In: Proceedings of COLING 2012: posters. The COLING 2012 Organizing Committee, Mumbai, pp 1291–1300

  185. Wan X, Zhang J (2014) CTSUM: extracting more certain summaries for news articles. In: The 37th international ACM SIGIR conference on research and development in information retrieval, SIGIR ’14, Gold Coast, QLD, Australia, July 06–11, 2014, pp 787–796. doi:10.1145/2600428.2609559

  186. Wang D, Li T (2012) Weighted consensus multi-document summarization. Inf Process Manag 48(3):513–523

  187. Wang D, Zhu S, Li T, Gong Y (2013) Comparative document summarization via discriminative sentence selection. TKDD 7(1):2:1–2:18. doi:10.1145/2435209.2435211

  188. Wang L, Cardie C (2013) Domain-independent abstract generation for focused meeting summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1395–1405

  189. Wang L, Ling W (2016) Neural network-based abstract generation for opinions and arguments. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 47–57

  190. Wang L, Raghavan H, Castelli V, Florian R, Cardie C (2013) A sentence compression based framework to query-focused multi-document summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1384–1394

  191. Wang L, Raghavan H, Cardie C, Castelli V (2014) Query-focused opinion summarization for user-generated content. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1660–1669

  192. Wang WY, Mehdad Y, Radev DR, Stent A (2016) A low-rank approximation approach to learning joint embeddings of news stories and images for timeline summarization. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 58–68

  193. Wang X, Yoshida Y, Hirao T, Sudoh K, Nagata M (2015) Summarization based on task-oriented discourse parsing. IEEE/ACM Trans Audio Speech Lang Process 23(8):1358–1367. doi:10.1109/TASLP.2015.2432573

  194. Wang X, Nishino M, Hirao T, Sudoh K, Nagata M (2016) Exploring text links for coherent multi-document summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 213–223

  195. Woodsend K, Lapata M (2012) Multiple aspect summarization using integer linear programming. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. Association for Computational Linguistics, Jeju Island, pp 233–243

  196. Xiong W, Litman D (2014) Empirical analysis of exploiting review helpfulness for extractive summarization of online reviews. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1985–1995

  197. Xu H, Martin E, Mahidadia A (2015) Extractive summarisation based on keyword profile and language model. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 123–132

  198. Yan R, Kong L, Huang C, Wan X, Li X, Zhang Y (2011) Timeline generation through evolutionary trans-temporal summarization. In: Proceedings of the 2011 conference on empirical methods in natural language processing. Association for Computational Linguistics, Edinburgh, pp 433–443

  199. Yan R, Wan X, Otterbacher J, Kong L, Li X, Zhang Y (2011) Evolutionary timeline summarization: a balanced optimization framework via iterative substitution. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval, SIGIR 2011, Beijing, China, July 25–29, 2011, pp 745–754. doi:10.1145/2009916.2010016

  200. Yan R, Jiang H, Lapata M, Lin SD, Lv X, Li X (2013) I, poet: automatic chinese poetry composition through a generative summarization framework under constrained optimization. In: Proceedings of the twenty-third international joint conference on artificial intelligence. AAAI Press, pp 2197–2203

  201. Yan S, Wan X (2014) Srrank: leveraging semantic roles for extractive multi-document summarization. IEEE/ACM Trans Audio Speech Lang Process 22(12):2048–2058

  202. Yang L, Ai Q, Spina D, Chen RC, Pang L, Croft WB, Guo J, Scholer F (2016) Beyond factoid QA: effective methods for non-factoid answer sentence retrieval. In: European conference on information retrieval. Springer, Berlin, pp 115–128

  203. Yang Z, Cai K, Tang J, Zhang L, Su Z, Li J (2011) Social context summarization. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 255–264

  204. Yao J, Wan X, Xiao J (2015) Compressive document summarization via sparse optimization. In: International joint conference on artificial intelligence

  205. Yao J, Wan X, Xiao J (2015) Phrase-based compressive cross-language summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 118–127

  206. Yin W, Pei Y (2015) Optimizing sentence modeling and selection for document summarization. In: International joint conference on artificial intelligence

  207. Yogatama D, Liu F, Smith NA (2015) Extractive summarization by maximizing semantic volume. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1961–1966

  208. Yoshida Y, Suzuki J, Hirao T, Nagata M (2014) Dependency-based discourse parser for single-document summarization. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 1834–1839

  209. You O, Li W, Li S, Lu Q (2011) Applying regression models to query-focused multi-document summarization. Inf Process Manag 47(2):227–237. doi:10.1016/j.ipm.2010.03.005

  210. Yu N, Huang M, Shi Y, Zhu X (2016) Product review summarization by exploiting phrase properties. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1113–1124

  211. Zajic DM, Dorr B, Lin J, Schwartz R (2006) Sentence compression as a component of a multi-document summarization system. In: Proceedings of the 2006 document understanding workshop, New York

  212. Zhang J, Yao J, Wan X (2016a) Towards constructing sports news from live text commentary. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1361–1371

  213. Zhang J, Zhou Y, Zong C (2016b) Abstractive cross-language summarization via translation model enhanced predicate argument structure fusing. IEEE/ACM Trans Audio Speech Lang Process 24(10):1842–1853

  214. Zhang R, Li W, Gao D (2013) Towards content-level coherence with aspect-guided summarization. TSLP 10(1):2:1–2:22. doi:10.1145/2442076.2442078

  215. Zhang Y, Xia Y, Liu Y, Wang W (2015) Clustering sentences with density peaks for multi-document summarization. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1262–1267

  216. Zhao WX, Guo Y, Yan R, He Y, Li X (2013) Timeline generation with social attention. In: The 36th international ACM SIGIR conference on research and development in information retrieval, SIGIR ’13, Dublin, Ireland, July 28–August 01, 2013, pp 1061–1064. doi:10.1145/2484028.2484103

  217. Zopf M, Loza Mencía E, Fürnkranz J (2016a) Sequential clustering and contextual importance measures for incremental update summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1071–1082

  218. Zopf M, Loza Mencía E, Fürnkranz J (2016b) Beyond centrality and structural features: Learning information importance for text summarization. In: Proceedings of the 20th SIGNLL conference on computational natural language learning. Association for Computational Linguistics, Berlin, pp 84–94

Acknowledgements

This work was supported by the National Hi-Tech Research and Development Program (863 Program) of China (2015AA015403), the National Natural Science Foundation of China (61331011) and the IBM Global Faculty Award Program. We would like to thank the anonymous reviewers for their feedback and Jiwei Tan for reporting typos in an earlier draft of this paper.

Author information

Corresponding author

Correspondence to Xiaojun Wan.

About this article

Cite this article

Yao, Jg., Wan, X. & Xiao, J. Recent advances in document summarization. Knowl Inf Syst 53, 297–336 (2017). https://doi.org/10.1007/s10115-017-1042-4

  • DOI: https://doi.org/10.1007/s10115-017-1042-4