ABSTRACT
The so-called “redundancy-based” approach to question answering represents a successful strategy for mining answers to factoid questions such as “Who shot Abraham Lincoln?” from the World Wide Web. Through contrastive and ablation experiments with Aranea, a system that has performed well in several TREC QA evaluations, this work examines the underlying assumptions and principles behind redundancy-based techniques. Specifically, we develop two theses: that stable characteristics of data redundancy allow factoid systems to rely on external “black box” components, and that despite embodying a data-driven approach, redundancy-based methods encode a substantial amount of knowledge in the form of heuristics. Overall, this work attempts to address the broader question of “what really matters” and to provide guidance for future researchers.