An Efficient Pre-determinization Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2759)

Abstract

We present a general algorithm, pre-determinization, that makes an arbitrary weighted transducer over the tropical semiring or an arbitrary unambiguous weighted transducer over a cancellative commutative semiring determinizable by inserting in it transitions labeled with special symbols. After determinization, the special symbols can be removed or replaced with ε-transitions. The resulting transducer can be significantly more efficient to use. We report empirical results showing that our algorithm leads to a substantial speed-up in large-vocabulary speech recognition. Our pre-determinization algorithm makes use of an efficient algorithm for testing a general twins property, a sufficient condition for the determinizability of all weighted transducers over the tropical semiring and unambiguous weighted transducers over cancellative commutative semirings. It inserts new transitions just when needed to guarantee that the resulting transducer has the twins property and thus is determinizable. It also uses a single-source shortest-paths algorithm over the min-max semiring for carefully selecting the positions for insertion of new transitions to benefit from the subsequent application of determinization. These positions are proved to be optimal in a sense that we describe.
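The abstract mentions a single-source shortest-paths computation over the min-max semiring. As a rough illustration only, the sketch below assumes that semiring takes ⊕ = min and ⊗ = max over non-negative weights, so that the "distance" to a state is the smallest achievable maximum transition weight along any path from the source. The function name, the graph encoding, and the Dijkstra-style queue are assumptions made for this example; they are not taken from the paper, whose actual procedure for choosing insertion positions is not shown in this preview.

```python
import heapq
import math


def minmax_shortest_distance(graph, source):
    """Single-source shortest distance in the (min, max) semiring.

    graph: dict mapping a node to a list of (neighbor, weight) pairs,
           with non-negative weights (hypothetical encoding).
    Returns a dict: node -> min over paths of the max weight on the path.
    """
    dist = {node: math.inf for node in graph}  # inf is the semiring zero
    dist[source] = 0  # 0 is the identity for max on non-negative weights
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v, w in graph.get(u, ()):
            cand = max(d, w)  # semiring product: extend the path by one edge
            if cand < dist.get(v, math.inf):  # semiring sum: keep the minimum
                dist[v] = cand
                heapq.heappush(heap, (cand, v))
    return dist


example = {
    "a": [("b", 5), ("c", 2)],
    "c": [("b", 3)],
    "b": [("d", 1)],
    "d": [],
}
result = minmax_shortest_distance(example, "a")
```

In this toy graph the direct edge from a to b costs 5, but the path through c has maximum weight 3, so b receives distance 3. A greedy queue is valid here because extending a path with max never decreases its value.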

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Allauzen, C., Mohri, M. (2003). An Efficient Pre-determinization Algorithm. In: Ibarra, O.H., Dang, Z. (eds) Implementation and Application of Automata. CIAA 2003. Lecture Notes in Computer Science, vol 2759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45089-0_9

  • DOI: https://doi.org/10.1007/3-540-45089-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40561-0

  • Online ISBN: 978-3-540-45089-4

  • eBook Packages: Springer Book Archive
