DOI: 10.3115/1220355.1220469

Improving statistical machine translation in the medical domain using the unified medical language system

Published: 23 August 2004

ABSTRACT

Handling texts from the medical domain is an important task for natural language processing. This paper investigates the usefulness of a large medical database (the Unified Medical Language System) for translating dialogues between doctors and patients with a statistical machine translation system. We show that extracting a large dictionary and using semantic type information to generalize the training data significantly improve translation performance.


          Published in

          COLING '04: Proceedings of the 20th international conference on Computational Linguistics
          August 2004
          1411 pages

          Publisher

          Association for Computational Linguistics

          United States


          Qualifiers

          • Article

          Acceptance Rates

          COLING '04 Paper Acceptance Rate: 1,411 of 1,411 submissions, 100%. Overall Acceptance Rate: 1,537 of 1,537 submissions, 100%.
