DOI: 10.1145/3340531.3411960
research-article

Quality-Aware Ranking of Arguments

Published: 19 October 2020

ABSTRACT

Argument search engines identify, extract, and rank the most important arguments for and against a given controversial topic. A number of such systems have recently been developed, usually relying on classic information retrieval ranking methods that are based on frequency information. An important aspect that search engines have ignored so far is the quality of arguments. We present a quality-aware ranking framework for arguments that have already been extracted from texts and represented as argument graphs, considering multiple established quality measures. An extensive evaluation on a standard benchmark collection demonstrates that taking quality into account significantly improves retrieval quality for argument search. We also publish a dataset in which human annotators laboriously rated arguments, with respect to their topics, along three widely accepted argument quality dimensions.

Supplemental Material

3340531.3411960.mp4 (mp4, 97.3 MB)

Published in

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
October 2020
3619 pages
ISBN: 9781450368599
DOI: 10.1145/3340531

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions, 22%
