ABSTRACT
We present a joint probability model for statistical machine translation, which automatically learns word and phrase equivalents from bilingual corpora. Translations produced with parameters estimated using the joint model are more accurate than translations produced using IBM Model 4.
- A phrase-based, joint probability model for statistical machine translation