A Method for Supporting Document Selection in Cross-language Information Retrieval and its Evaluation

Computers and the Humanities

Abstract

It is important to give useful clues for selecting desired content from a number of retrieval results obtained (usually) from a vague search request. Compared with monolingual retrieval, such a support framework is inevitable and much more significant for filtering given translingual retrieval results. This paper describes an attempt to provide appropriate translation of major keywords in each document in a cross-language information retrieval (CLIR) result, as a browsing support for users. Our idea of determining appropriate translation of major keywords is based on word co-occurrence distribution in the translation target language, considering the actual situation of WWW content where it is difficult to obtain aligned parallel (multilingual) corpora. The proposed method provides higher quality of keyword translation to yield a more effective support in identifying the target documents in the retrieval result. We report the advantage of this browsing support technique through evaluation experiments including comparison with conditions of referring to a translated document summary, and discuss related issues to be examined towards more effective cross-language information extraction.
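The general idea of selecting keyword translations from co-occurrence statistics in the target language alone can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given in the abstract. It assumes a toy monolingual target-language corpus, a hypothetical bilingual dictionary that maps each source keyword to candidate translations, and a simple scoring rule (sum of sentence-level pair counts). The names `cooccurrence_counts`, `choose_translations`, `kw_bank` and `kw_interest` are invented for illustration.

```python
from collections import Counter
from itertools import combinations, product


def cooccurrence_counts(target_sentences):
    """Count how often each unordered word pair appears in the same sentence."""
    counts = Counter()
    for sentence in target_sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):  # pairs come out in sorted order
            counts[(a, b)] += 1
    return counts


def pair_score(counts, a, b):
    """Look up the co-occurrence count of a pair, regardless of order."""
    return counts[tuple(sorted((a, b)))]


def choose_translations(candidates, counts):
    """Pick one translation per keyword so that the chosen words
    co-occur with each other as often as possible in the target corpus.

    candidates: dict mapping a source keyword to its candidate translations
                (hypothetical dictionary output, assumed for this sketch).
    """
    keywords = list(candidates)
    best, best_score = None, -1
    # Exhaustive search over all combinations; fine for a handful of keywords.
    for combo in product(*(candidates[k] for k in keywords)):
        score = sum(pair_score(counts, a, b) for a, b in combinations(combo, 2))
        if score > best_score:
            best, best_score = combo, score
    return dict(zip(keywords, best))


# Toy target-language (English) corpus; no parallel text is needed.
corpus = [
    "the bank raised its interest rate",
    "interest rate set by the central bank",
    "the river shore was muddy",
    "curiosity about the river",
]
# Placeholder source keywords with ambiguous candidate translations.
candidates = {
    "kw_bank": ["bank", "shore"],
    "kw_interest": ["interest", "curiosity"],
}
result = choose_translations(candidates, cooccurrence_counts(corpus))
print(result)
```

The exhaustive search grows exponentially with the number of keywords, so a real system would need pruning or a greedy strategy. The sketch only shows why monolingual co-occurrence data can stand in for aligned parallel corpora when disambiguating translations.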

Author information

Corresponding author

Correspondence to Masami Suzuki.

About this article

Cite this article

Suzuki, M., Inoue, N. & Hashimoto, K. A Method for Supporting Document Selection in Cross-language Information Retrieval and its Evaluation. Computers and the Humanities 35, 421–438 (2001). https://doi.org/10.1023/A:1011877503081

  • DOI: https://doi.org/10.1023/A:1011877503081
