research-article

Phrase detectives: Utilizing collective intelligence for internet-scale language resource creation

Published: 24 April 2013

Abstract

We are witnessing a paradigm shift in Human Language Technology (HLT) that may well have an impact on the field comparable to the statistical revolution: acquiring large-scale resources by exploiting collective intelligence. An illustration of this new approach is Phrase Detectives, an interactive online game with a purpose for creating anaphorically annotated resources that makes use of a highly distributed population of contributors with different levels of expertise.

This article gives an overview of all aspects of Phrase Detectives, from the design of the game and the HLT methods we used to the results we have obtained so far. It also summarizes the lessons we learned in developing the game, which should help other researchers design and implement similar games.


    • Published in

      ACM Transactions on Interactive Intelligent Systems  Volume 3, Issue 1
      Special section on internet-scale human problem solving and regular papers
      April 2013
      140 pages
      ISSN:2160-6455
      EISSN:2160-6463
      DOI:10.1145/2448116

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 24 April 2013
      • Accepted: 1 April 2012
      • Revised: 1 January 2012
      • Received: 1 June 2011
