A Randomized Algorithm for Approximate String Matching

Algorithmica

Abstract.

We give a randomized algorithm in deterministic time O(N log M) for estimating the score vector of matches between a text string of length N and a pattern string of length M, i.e., the vector obtained when the pattern is slid along the text, and the number of matches is counted for each position. A direct application is approximate string matching. The randomized algorithm uses convolution to find an estimator of the scores; the variance of the estimator is particularly small for scores that are close to M, i.e., for approximate occurrences of the pattern in the text. No assumption is made about the probabilistic characteristics of the input, or about the size of the alphabet. The solution extends to string matching with classes, class complements, "never match" and "always match" symbols, to the weighted case and to higher dimensions.
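The idea of a convolution-based estimator can be illustrated with a minimal sketch. The abstract does not say how symbols are randomized, so the sketch assumes each alphabet symbol is mapped to an independent random sign in {-1, +1}; equal symbols then contribute exactly 1 to the correlation, and unequal symbols contribute 0 in expectation. The function names (`fft`, `correlate`, `estimate_scores`, `exact_scores`) and the `trials` parameter are hypothetical. The sketch pads the whole text for a single FFT, so it does not attempt to reach the O(N log M) bound stated above.

```python
import cmath
import random


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def correlate(t, p):
    """Return c[i] = sum_j t[i+j] * p[j] for every alignment i, via FFT."""
    n, m = len(t), len(p)
    size = 1
    while size < n + m - 1:
        size *= 2
    ft = fft([complex(x) for x in t] + [0j] * (size - n))
    # Reversing the pattern turns the convolution into a correlation.
    fp = fft([complex(x) for x in reversed(p)] + [0j] * (size - m))
    prod = fft([x * y for x, y in zip(ft, fp)], invert=True)
    return [prod[i + m - 1].real / size for i in range(n - m + 1)]


def estimate_scores(text, pattern, trials=32, rng=None):
    """Average `trials` random-sign correlations (assumed randomization)."""
    rng = rng or random.Random()
    alphabet = sorted(set(text) | set(pattern))
    total = [0.0] * (len(text) - len(pattern) + 1)
    for _ in range(trials):
        # Hypothetical mapping: one independent random sign per symbol.
        sign = {ch: rng.choice((-1, 1)) for ch in alphabet}
        c = correlate([sign[ch] for ch in text], [sign[ch] for ch in pattern])
        total = [acc + x for acc, x in zip(total, c)]
    return [x / trials for x in total]


def exact_scores(text, pattern):
    """Reference score vector computed by direct comparison, O(N * M)."""
    m = len(pattern)
    return [sum(a == b for a, b in zip(text[i:i + m], pattern))
            for i in range(len(text) - m + 1)]
```

Under this assumed mapping, a position where the pattern matches exactly always yields an estimate of M up to floating-point error, because every term of the correlation is a squared sign. That is consistent with the abstract's remark that the variance is small for scores close to M. Comparing `estimate_scores(text, pattern)` against `exact_scores(text, pattern)` on small inputs gives a quick sanity check.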

Additional information

Received July 20, 1997; revised April 20, 1998, and June 1, 1999.

About this article

Cite this article

Atallah, M., Chyzak, F. & Dumas, P. A Randomized Algorithm for Approximate String Matching. Algorithmica 29, 468–486 (2001). https://doi.org/10.1007/s004530010062

  • DOI: https://doi.org/10.1007/s004530010062
