Abstract
We are witnessing a paradigm shift in Human Language Technology (HLT) that may well have an impact on the field comparable to the statistical revolution: acquiring large-scale resources by exploiting collective intelligence. An illustration of this new approach is Phrase Detectives, an interactive online game with a purpose for creating anaphorically annotated resources that makes use of a highly distributed population of contributors with different levels of expertise.
This article first gives an overview of all aspects of Phrase Detectives, from the design of the game and the HLT methods we used to the results we have obtained so far. It then summarizes the lessons we have learned in developing the game, which should help other researchers design and implement similar games.