Theoretical Computer Science
Volume 359, Issues 1-3, 14 August 2006, Pages 329-343
 
doi:10.1016/j.tcs.2006.05.028
Copyright © 2006 Elsevier B.V. All rights reserved.

Fast string matching by using probabilities: On an optimal mismatch variant of Horspool's algorithm

Markus E. Nebel (corresponding author)

Fachbereich Informatik, Technische Universität Kaiserslautern, Gottlieb-Daimler-Straße, D-67663 Kaiserslautern, Germany

Received 1 April 2005; accepted 5 May 2006.
Communicated by H. Prodinger. 
Available online 30 June 2006.


Abstract

The string matching problem, i.e. the task of finding all occurrences of one string as a substring of another, is a fundamental problem in computer science. Recently, this problem has received a great deal of attention due to numerous applications in computational biology. In this paper we address a modified version of Horspool's string matching algorithm that uses the probabilities of the different symbols to speed up the search. We show that the modified algorithm has a linear average running time and prove a precise asymptotic representation of that running time. A comparison of the average running time of the modified algorithm with well-known results for the original method shows that a substantial speedup is achieved for most symbol distributions. Finally, we show that the distribution of the symbols can be approximated to high precision using a random sample of sublinear size.

Keywords: Average-case analysis; String-matching algorithms
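
The general idea summarized in the abstract can be illustrated with a short Python sketch. It is not taken from the paper. It assumes that the "optimal mismatch" modification means comparing pattern positions in ascending order of symbol probability, so that a mismatch tends to be detected early, while the shift is the usual Horspool shift on the last symbol of the current window. The function names `estimate_probabilities` and `horspool_rarest_first`, and the uniform position sampling, are hypothetical choices made for illustration only.

```python
import random
from collections import Counter


def estimate_probabilities(text, sample_size, seed=0):
    """Estimate symbol probabilities from a random sample of text positions.

    Assumption: positions are drawn uniformly with replacement; the paper's
    sampling scheme may differ.
    """
    rng = random.Random(seed)
    sample = [text[rng.randrange(len(text))] for _ in range(sample_size)]
    counts = Counter(sample)
    return {symbol: count / sample_size for symbol, count in counts.items()}


def horspool_rarest_first(text, pattern, probabilities):
    """Return the start indices of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Standard Horspool shift table: for each symbol in pattern[:-1], the
    # distance from its rightmost occurrence to the end of the pattern.
    shift = {symbol: m - 1 - i for i, symbol in enumerate(pattern[:-1])}
    # Assumed modification: test the least probable pattern symbols first.
    # Symbols missing from the estimate are treated as probability 0.
    order = sorted(range(m), key=lambda i: probabilities.get(pattern[i], 0.0))
    matches = []
    pos = 0
    while pos <= n - m:
        # all() stops at the first mismatching position in the chosen order.
        if all(text[pos + i] == pattern[i] for i in order):
            matches.append(pos)
        # Shift by the table entry of the window's last symbol (default m).
        pos += shift.get(text[pos + m - 1], m)
    return matches


text = "abracadabra"
probabilities = estimate_probabilities(text, sample_size=6)
print(horspool_rarest_first(text, "abra", probabilities))  # → [0, 7]
```

The comparison order only affects how quickly a mismatch is found, not which positions are reported, so the result is the same for any probability estimate.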


 