Approximate word sequence matching over Sparse Suffix Trees

  • Session II
  • Conference paper
Combinatorial Pattern Matching (CPM 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1448)

Abstract

In this paper, we discuss word sequence matching and adapt the common edit distance metric for approximate string matching to searching for words and sequences of words. We furthermore create a variant of the Sparse Suffix Tree [3] and adapt algorithms for approximate word and word sequence matching to this variant. The algorithms have been implemented and tested in a WWW information retrieval environment, and performance data are presented.
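
As a rough illustration of lifting edit distance from characters to words, the following minimal sketch computes a unit-cost Levenshtein distance [4] over word tokens. It is not the paper's algorithm: the tokenization (lowercasing and whitespace splitting), the unit costs, and the function names `edit_distance` and `word_sequence_distance` are assumptions made for illustration, and no suffix tree is involved.

```python
from typing import Hashable, Sequence


def edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Unit-cost Levenshtein distance between two sequences.

    Works on strings (character level) and on lists of words (word level).
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def word_sequence_distance(query: str, text: str) -> int:
    """Edit distance counted in whole words (hypothetical helper).

    Assumption: words are lowercased, whitespace-separated tokens.
    """
    return edit_distance(query.lower().split(), text.lower().split())


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(word_sequence_distance("approximate word matching",
                                 "approximate string matching"))
```

Because the same routine accepts either a string or a list of tokens, it can be applied at both levels: once to compare individual words character by character, and once to compare sequences of words. How the two levels are combined into a single metric is left open in this sketch.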

References

  1. Cobbs A.L. (1995) “Fast Approximate Matching using Suffix Trees,” In Proceedings of the Sixth Symposium on Combinatorial Pattern Matching (CPM'95), Springer Verlag, pp. 41–54.

  2. Gonnet G.H., Baeza-Yates R.A., Snider T. (1991) “Lexicographical indices for text: Inverted files vs. PAT trees,” Technical Report OED-91-10, Center for the New OED, University of Waterloo.

  3. Kärkkäinen J., Ukkonen E. (1996) “Sparse Suffix Trees,” In Proceedings of the Second Annual International Computing and Combinatorics Conference (COCOON '96), Springer Verlag, pp. 219–230.

  4. Levenshtein V.I. (1965) “Binary codes capable of correcting deletions, insertions, and reversals,” (Russian) Doklady Akademii nauk SSSR, Vol. 163, No. 4, pp. 845–848 (also Cybernetics and Control Theory, Vol. 10, No. 8, pp. 707–710, 1966).

  5. Morrison D.R. (1968) “PATRICIA — Practical Algorithm To Retrieve Information Coded in Alphanumeric,” Journal of the ACM, 15, pp. 514–534.

  6. Shang H., Merrett T.H. (1996) “Tries for Approximate String Matching,” IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 4, pp. 540–547.

  7. Ukkonen E. (1985) “Finding Approximate Patterns in Strings,” Journal of Algorithms, vol. 6, pp. 132–137.

  8. Weiner P. (1973) “Linear pattern matching algorithms,” In Proceedings of the IEEE 14th Annual Symposium on Switching and Automata Theory, pp. 1–11.

Editor information

Martin Farach-Colton

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Risvik, K.M. (1998). Approximate word sequence matching over Sparse Suffix Trees. In: Farach-Colton, M. (eds) Combinatorial Pattern Matching. CPM 1998. Lecture Notes in Computer Science, vol 1448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030781

  • DOI: https://doi.org/10.1007/BFb0030781

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64739-3

  • Online ISBN: 978-3-540-69054-2

  • eBook Packages: Springer Book Archive
