Information Processing Letters
Volume 71, Issues 3-4, 27 August 1999, Pages 107-113
 
 
doi:10.1016/S0020-0190(99)00092-7
Copyright © 1999 Elsevier Science B.V. All rights reserved.

Fast practical multi-pattern matching

Maxime Crochemore (a, 1), A. Czumaj (b), L. Gąsieniec (c), T. Lecroq (d, corresponding author), W. Plandowski (e, 2) and W. Rytter (c, e, 2)

a Institut Gaspard-Monge, Université de Marne-la-Vallée, 77454 Marne-la-Vallée Cedex 2, France
b Department of Mathematics and Computer Science, University of Paderborn, Fürstenallee 11, 33102 Paderborn, Germany
c Department of Computer Science, The University of Liverpool, Chadwick Building, Peach Street, Liverpool L69 7ZF, UK
d Laboratoire d'Informatique Fondamentale et Appliquée de Rouen, Atelier Biologie Informatique Statistique Sociolinguistique, Faculté des Sciences et des Techniques, Université de Rouen, 76821 Mont-Saint-Aignan Cedex, France
e Institute of Informatics, Warsaw University, ul. Banacha 2, 00-913 Warsaw 59, Poland

Received 5 October 1998; revised 6 July 1999. Communicated by L. Boasson. Available online 14 October 1999.


Abstract

The multi-pattern matching problem consists of finding all occurrences of the patterns from a finite set X in a given text T of length n. We present a new and simple algorithm combining the ideas of the Aho–Corasick algorithm and directed acyclic word graphs. The algorithm runs in linear time in the worst case (it makes at most 2n symbol comparisons) and has good average-case time complexity, assuming the shortest pattern is sufficiently long. Denote the length of the shortest pattern by m and the total length of all patterns by M. Assume that M is polynomial in m, that the alphabet contains at least 2 symbols, and that the text (in which the patterns are to be found) is random: at each position, each letter occurs independently with the same probability. Then the average number of comparisons is O((n/m)·log m), which matches the lower bound for the problem. For sufficiently large values of m the algorithm behaves well in practice.
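
To make the problem statement concrete, the following is a minimal sketch of the classic Aho–Corasick automaton, one of the two ingredients the abstract names. It is not the algorithm of this paper: the paper's method additionally uses directed acyclic word graphs so that parts of the text can be skipped, and that part is not reproduced here. The function names `build_automaton` and `find_all` are illustrative assumptions, and patterns are assumed to be non-empty strings.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links and output lists for the pattern set X."""
    goto = [{}]   # goto[state][symbol] -> next state (trie edges)
    fail = [0]    # fail[state] -> state of the longest proper suffix in the trie
    out = [[]]    # out[state] -> patterns that end at this state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first traversal: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure state.
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence in the text T."""
    goto, fail, out = build_automaton(patterns)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


if __name__ == "__main__":
    print(find_all("ushers", ["he", "she", "his", "hers"]))
```

This baseline reads every text symbol once, so it is linear in n but never sublinear; the O((n/m)·log m) average case stated in the abstract requires skipping text symbols, which this sketch does not attempt.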

Author Keywords: Formal languages; Combinatorial problems

1 Supported in part by programme “Génomes” of CNRS.

2 Supported by the grant KBN 8T11C03915.

Corresponding author. Supported in part by programme “Génomes” of CNRS; email: Thierry.Lecroq@dir.univ-rouen.fr


 