
Open-Source Portuguese–Spanish Machine Translation

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2006)

Abstract

This paper describes the current status of development of an open-source shallow-transfer machine translation (MT) system for the [European] Portuguese \(\leftrightarrow\) Spanish language pair, developed using the OpenTrad Apertium MT toolbox (www.apertium.org). Apertium uses finite-state transducers for lexical processing, hidden Markov models for part-of-speech tagging, and finite-state-based chunking for structural transfer. It is based on a simple rationale: to produce fast, reasonably intelligible and easily correctable translations between related languages, it suffices to use an MT strategy that refines word-for-word MT with shallow parsing techniques. The paper briefly describes the MT engine, the formats it uses for linguistic data, and the compilers that convert these data into an efficient format used by the engine, and then describes in more detail the pilot Portuguese \(\leftrightarrow\) Spanish linguistic data.
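
The rationale stated in the abstract, word-for-word translation refined by shallow pattern-based rules, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three small dictionaries, the function names, and the single determiner-noun agreement rule are invented here and are not taken from the Apertium engine or its linguistic data. Part-of-speech disambiguation, which the system handles with hidden Markov models, is omitted, and plain Python dictionaries stand in for the finite-state transducers.

```python
# Toy sketch only: entries, names, and the single rule are hypothetical and
# are not taken from the Apertium engine or its Portuguese-Spanish data.

# Source-language lexicon: surface form -> (part of speech, gender)
SOURCE_LEXICON = {
    "o": ("det", "m"),
    "a": ("det", "f"),
    "viagem": ("n", "f"),  # feminine in Portuguese
    "leite": ("n", "m"),   # masculine in Portuguese
    "casa": ("n", "f"),
}

# Bilingual dictionary: source noun -> (target noun, target gender)
BILINGUAL = {
    "viagem": ("viaje", "m"),  # masculine in Spanish
    "leite": ("leche", "f"),   # feminine in Spanish
    "casa": ("casa", "f"),
}

# Target-language determiner forms, indexed by gender
TARGET_DETERMINER = {"m": "el", "f": "la"}


def analyse(sentence):
    """Lexical processing: attach part of speech and gender to each word."""
    return [(word, *SOURCE_LEXICON[word]) for word in sentence.lower().split()]


def word_for_word(analysed):
    """Translate each unit independently of its neighbours."""
    units = []
    for word, pos, gender in analysed:
        if pos == "det":
            # The determiner keeps the gender it had in the source text.
            units.append((TARGET_DETERMINER[gender], pos, gender))
        else:
            target, target_gender = BILINGUAL[word]
            units.append((target, pos, target_gender))
    return units


def apply_det_noun_rule(units):
    """Shallow transfer: a determiner agrees with the noun that follows it."""
    fixed = list(units)
    for i in range(len(fixed) - 1):
        _, pos, _ = fixed[i]
        _, next_pos, next_gender = fixed[i + 1]
        if pos == "det" and next_pos == "n":
            fixed[i] = (TARGET_DETERMINER[next_gender], "det", next_gender)
    return fixed


def translate(sentence, use_transfer=True):
    """Run the toy pipeline, optionally skipping the transfer rule."""
    units = word_for_word(analyse(sentence))
    if use_transfer:
        units = apply_det_noun_rule(units)
    return " ".join(form for form, _, _ in units)


if __name__ == "__main__":
    for phrase in ("a viagem", "o leite", "a casa"):
        print(phrase, "|", translate(phrase, use_transfer=False), "|", translate(phrase))
```

Nouns whose gender differs between the two languages, such as "viagem" and "leite", are where the word-for-word output is wrong and the local rule repairs it. A phrase such as "a casa" passes through unchanged.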



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Armentano-Oller, C. et al. (2006). Open-Source Portuguese–Spanish Machine Translation. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_6


  • DOI: https://doi.org/10.1007/11751984_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34045-4

  • Online ISBN: 978-3-540-34046-1

  • eBook Packages: Computer Science, Computer Science (R0)
