Copyright © 2006 Elsevier B.V. All rights reserved.
Fast string matching by using probabilities: On an optimal mismatch variant of Horspool's algorithm
Received 1 April 2005.
Abstract
The string matching problem, i.e. the task of finding all occurrences of one string as a substring of another, is a fundamental problem in computer science. Recently, this problem has received a great deal of attention due to numerous applications in computational biology. In this paper we address a modified version of Horspool's string matching algorithm that uses the probabilities of the different symbols to speed up the search. We show that the modified algorithm has a linear average running time, and we prove a precise asymptotic representation of that running time. A comparison of the average running time of the modified algorithm with well-known results for the original method shows that a substantial speed-up is achieved for most symbol distributions. Finally, we show that the distribution of the symbols can be approximated to a high precision using a random sample of sublinear size.
Keywords: Average-case analysis; String-matching algorithms
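
The idea outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "optimal mismatch" variant keeps Horspool's shift table but compares pattern positions in order of increasing symbol probability, so that a mismatch tends to be detected early. The function names `estimate_probabilities` and `horspool_rarest_first`, the uniform position sampling, and the fixed seed are hypothetical choices made for illustration only.

```python
import random
from collections import Counter


def estimate_probabilities(text, sample_size, seed=0):
    """Estimate symbol probabilities from a random sample of text positions.

    Assumption: positions are drawn uniformly with replacement; the abstract
    only states that a sample of sublinear size suffices.
    """
    rng = random.Random(seed)
    sample = [text[rng.randrange(len(text))] for _ in range(sample_size)]
    counts = Counter(sample)
    return {symbol: count / sample_size for symbol, count in counts.items()}


def horspool_rarest_first(text, pattern, prob):
    """Return all start positions of pattern in text.

    Uses the standard Horspool shift table, but checks pattern positions
    in order of increasing estimated probability (rarest symbol first).
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Horspool shift: distance from the rightmost occurrence of each symbol
    # in pattern[:-1] to the end of the pattern; unseen symbols shift by m.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    # Comparison order: pattern positions sorted by symbol probability.
    # Symbols missing from the estimate are treated as probability 0.
    order = sorted(range(m), key=lambda i: prob.get(pattern[i], 0.0))
    hits = []
    pos = 0
    while pos <= n - m:
        # all() stops at the first mismatch, which is the point of the ordering.
        if all(text[pos + i] == pattern[i] for i in order):
            hits.append(pos)
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

A possible use is `horspool_rarest_first(text, pattern, estimate_probabilities(text, 100))`. The comparison order affects only how quickly a mismatch is found, not which occurrences are reported, so a rough probability estimate changes the running time but not the result.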











