Abstract
Most modern implementations of regular expression engines allow the use of variables (also called backreferences). The resulting extended regular expressions (which, in the literature, are also called practical regular expressions, rewbr, or regex) are able to express non-regular languages.
The present paper demonstrates that extended regular expressions cannot be minimized effectively (neither with respect to length nor with respect to the number of variables), and that the tradeoff in size between extended and “classical” regular expressions is not bounded by any recursive function. In addition, we prove the undecidability of several decision problems (universality, regularity, and cofiniteness) for extended regular expressions. Furthermore, we show that all these results hold even if the extended regular expressions contain only a single variable.
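As a small illustration of why a single variable already leaves the regular languages, the following sketch uses Python's `re` module, which supports backreferences. It is not taken from the paper; the function names `is_square` and `is_composite_unary` are hypothetical and chosen only for this example. Both patterns use exactly one capture group and one backreference to it.

```python
import re

# One variable (capture group 1) and one backreference to it.
# Matches exactly the words of the form ww with w a nonempty word over {a, b},
# a textbook example of a non-regular language.
SQUARE = re.compile(r"([ab]+)\1")

# One variable again: a block of at least two 1s, repeated at least twice.
# Matches 1^n exactly when n is composite, which is also non-regular.
COMPOSITE_UNARY = re.compile(r"(11+)\1+")


def is_square(word: str) -> bool:
    """Return True if word == w + w for some nonempty w over {a, b}."""
    return SQUARE.fullmatch(word) is not None


def is_composite_unary(n: int) -> bool:
    """Return True if the unary string 1^n matches, i.e. n is composite."""
    return COMPOSITE_UNARY.fullmatch("1" * n) is not None


if __name__ == "__main__":
    for w in ["abab", "abba", "aa", "aba"]:
        print(w, is_square(w))
    print([n for n in range(1, 20) if is_composite_unary(n)])
```

Neither language can be described by a classical regular expression, so these two patterns show the gain in expressive power that the results above are concerned with. Note that Python matches such patterns by backtracking, so running time can grow quickly on long inputs.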
References
Abigail: Re: Random number in perl. Posting in the newsgroup comp.lang.perl.misc, October 1997. Message-ID slrn64sudh.qp.abigail@betelgeuse.wayne.fnx.com
Aho, A.: Algorithms for finding patterns in strings. In: van Leeuwen, J. (ed.) Handbook of Theoretical Computer Science, vol. A, Chap. 5, pp. 255–300. Amsterdam, Elsevier (1990)
Aho, A., Hopcroft, J., Ullman, J.: The Design and Analysis of Computer Algorithms, Chap. 10.6, pp. 395–400. Addison-Wesley, Reading (1974)
Albert, J., Wegner, L.: Languages with homomorphic replacements. Theor. Comput. Sci. 16, 291–305 (1981)
Bordihn, H., Dassow, J., Holzer, M.: Extending regular expressions with homomorphic replacements. RAIRO Theor. Inform. Appl. 44(2), 229–255 (2010)
Bremer, J., Freydenberger, D.D.: Inclusion problems for patterns with a bounded number of variables. In: Proc. 14th International Conference on Developments in Language Theory, DLT 2010. LNCS, vol. 6224, pp. 100–111. Springer, Heidelberg (2010)
Câmpeanu, C., Santean, N.: On the intersection of regex languages with regular languages. Theor. Comput. Sci. 410(24–25), 2336–2344 (2009)
Câmpeanu, C., Yu, S.: Pattern expressions and pattern automata. Inf. Process. Lett. 92(6), 267–274 (2004)
Câmpeanu, C., Salomaa, K., Yu, S.: A formal study of practical regular expressions. Int. J. Found. Comput. Sci. 14, 1007–1018 (2003)
Carle, B., Narendran, P.: On extended regular expressions. In: Proc. Language and Automata Theory and Applications, Third International Conference, LATA 2009. LNCS, vol. 5457, pp. 279–289. Springer, Heidelberg (2009)
Cassaigne, J.: Unavoidable patterns. In: Lothaire, M. (ed.) Algebraic Combinatorics on Words, Chap. 3, pp. 111–134. Cambridge University Press, Cambridge (2002)
Currie, J.: Open problems in pattern avoidance. Am. Math. Mon. 100(8), 790–793 (1993)
Cutland, N.: Computability. Cambridge University Press, Cambridge (1980)
Della Penna, G., Intrigila, B., Tronci, E., Zilli, M.V.: Synchronized regular expressions. Acta Inform. 39(1), 31–70 (2003)
Freydenberger, D.D.: Extended regular expressions: Succinctness and decidability. In: 28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011). Leibniz International Proceedings in Informatics (LIPIcs), vol. 9, pp. 507–518. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl (2011)
Friedl, J.: Mastering Regular Expressions, 3rd edn. O’Reilly Media, Sebastopol (2006)
Glaister, I., Shallit, J.: A lower bound technique for the size of nondeterministic finite automata. Inf. Process. Lett. 59(2), 75–77 (1996)
Goldstine, J., Kappes, M., Kintala, C.M.R., Leung, H., Malcher, A., Wotschke, D.: Descriptional complexity of machines with limited resources. J. Univers. Comput. Sci. 8(2), 193–234 (2002)
Hartmanis, J.: On Gödel speed-up and succinctness of language representations. Theor. Comput. Sci. 26(3), 335–342 (1983)
Holzer, M., Kutrib, M.: The complexity of regular(-like) expressions. In: Proc. 14th Conference on Developments in Language Theory, DLT 2010. LNCS, vol. 6224, pp. 16–30. Springer, Heidelberg (2010)
Holzer, M., Kutrib, M.: Descriptional complexity—an introductory survey. In: Martín-Vide, C. (ed.) Scientific Applications of Language Methods, pp. 1–58. Imperial College Press, London (2010)
Hopcroft, J., Ullman, J.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (1979)
Kleene, S.: Representation of events in nerve nets and finite automata. In: Shannon, C.E., McCarthy, J., Ashby, W.R. (eds.) Automata Studies, pp. 3–42. Princeton University Press, Princeton (1956)
Kutrib, M.: The phenomenon of non-recursive trade-offs. Int. J. Found. Comput. Sci. 16(5), 957–973 (2005)
Larsen, K.: Regular expressions with nested levels of back referencing form a hierarchy. Inf. Process. Lett. 65(4), 169–172 (1998)
Meyer, A.R., Fischer, M.J.: Economy of description by automata, grammars, and formal systems. In: 12th Annual Symposium on Switching and Automata Theory, SWAT (FOCS), pp. 188–191. IEEE Computer Society, Washington (1971)
Minsky, M.L.: Computation: Finite and Infinite Machines. Prentice-Hall, Upper Saddle River (1967)
Odifreddi, P.: Classical Recursion Theory, vol. I. Elsevier, Amsterdam (1989)
Odifreddi, P.: Classical Recursion Theory, vol. II. Elsevier, Amsterdam (1999)
Reidenbach, D., Schmid, M.L.: A polynomial time match test for large classes of extended regular expressions. In: Proc. 15th International Conference on Implementation and Application of Automata, CIAA 2010. LNCS, vol. 6482, pp. 241–250. Springer, Heidelberg (2010)
Acknowledgements
The author wishes to thank Nicole Schweikardt and the anonymous referees for the conference version [15] and the present version for their helpful remarks.
Additional information
A preliminary version of this article appeared as [15].
About this article
Cite this article
Freydenberger, D.D. Extended Regular Expressions: Succinctness and Decidability. Theory Comput Syst 53, 159–193 (2013). https://doi.org/10.1007/s00224-012-9389-0
DOI: https://doi.org/10.1007/s00224-012-9389-0