Theoretical Computer Science
Volume 385, Issues 1-3, 15 October 2007, Pages 152-166
 
doi:10.1016/j.tcs.2007.06.006
Copyright © 2007 Elsevier Ltd. All rights reserved.

Languages with mismatches

C. Epifanio (a), A. Gabriele (a, corresponding author), F. Mignosi (b), A. Restivo (a) and M. Sciortino (a)

(a) Dipartimento di Matematica e Applicazioni, Università di Palermo, Italy
(b) Dipartimento di Informatica, Università dell’Aquila, Italy

Received 14 December 2005; revised 19 January 2007; accepted 24 June 2007. Communicated by M. Crochemore. Available online 30 June 2007.


Abstract

In this paper we study combinatorial properties of a class of languages representing the sets of words that occur in a text S up to some errors. More precisely, we consider the sets of words that occur in S with at most k mismatches in any window of size r. The study focuses mainly on two objects: a parameter called the repetition index, and the set of minimal forbidden words of the language of factors of S with errors. The repetition index of a string S is defined as the smallest integer such that every string of this length occurs in at most one position of S up to errors. We prove that there is a strong relation between the repetition index of S and the maximal length of the minimal forbidden words of the language of factors of S with errors. Moreover, the repetition index plays an important role in the construction of an indexing data structure. More precisely, given a text S over a fixed alphabet, we build a data structure for approximate string matching that has average size $O(|S| \cdot \log^{k+1} |S|)$ and answers queries in time $O(|x| + |occ(x)|)$ for any word x, where occ(x) is the list of all occurrences of x in S up to errors.
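
The two definitions in the abstract can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions and is not the indexing structure built in the paper. The function names are hypothetical. The sketch assumes that a word shorter than r is treated as a single window, and it enumerates candidate words over the letters of S only (a letter absent from S can only add mismatches, so this should not change the result). The enumeration is exponential in the word length, so it is usable only on very short texts.

```python
from itertools import product


def occurs_at(word, text, pos, k, r):
    """True if word matches text[pos:pos+len(word)] with at most k
    mismatches in every window of r consecutive positions.
    Assumption: a word shorter than r counts as one single window."""
    n = len(word)
    if pos < 0 or pos + n > len(text):
        return False
    mismatches = [a != b for a, b in zip(word, text[pos:pos + n])]
    width = min(r, n)
    return all(sum(mismatches[i:i + width]) <= k
               for i in range(n - width + 1))


def occurrences(word, text, k, r):
    """All start positions where word occurs in text up to errors."""
    return [p for p in range(len(text) - len(word) + 1)
            if occurs_at(word, text, p, k, r)]


def repetition_index(text, k, r):
    """Smallest length h such that every word of length h occurs in at
    most one position of text up to errors (brute force, exponential)."""
    alphabet = sorted(set(text))  # assumption: letters of the text suffice
    for h in range(1, len(text) + 1):
        if all(len(occurrences("".join(w), text, k, r)) <= 1
               for w in product(alphabet, repeat=h)):
            return h
    return len(text)
```

For example, with S = "abab", k = 1 and r = 2, the word "bbb" occurs at positions 0 and 1 up to errors, so length 3 is not yet enough to make occurrences unique, whereas with exact matching (k = 0) length 3 is enough.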

Keywords: Combinatorics on words; Formal languages; Approximate string matching; Indexing

