An Improved Algorithm for Sequence Comparison with Block Reversals

  • Conference paper
LATIN 2002: Theoretical Informatics (LATIN 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2286)

Abstract

Given two sequences X and Y that are strings over some alphabet set, we consider the distance d(X, Y) between them, defined to be the minimum number of character replacements and block (substring) reversals needed to transform X into Y (or vice versa). This is the “simplest” sequence comparison problem we know of that allows natural block edit operations. Block reversals arise naturally in genomic sequence comparison; they are also of interest in matching music data. We present an improved algorithm for exactly computing the distance d(X, Y); it takes time O(|X| log² |X|) and hence is near-linear. The trivial approach takes quadratic time, and the best known previous algorithm for this problem takes time ω(|X| log³ |X|).
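To make the definition of d(X, Y) concrete, the following is a minimal brute-force sketch in Python. It is not the near-linear algorithm of the paper, nor the quadratic approach the abstract mentions. It reads the definition literally: a breadth-first search over strings, where one step is either a single-character replacement or the reversal of one substring. The function name `block_reversal_distance` is hypothetical, and two things are assumptions. The alphabet is taken to be the characters occurring in X or Y. Operations may be applied in any order and to any positions; the abstract does not say whether the paper restricts how operations interact. The search is exponential in the string length, so it is only usable on very short inputs.

```python
from collections import deque


def block_reversal_distance(x: str, y: str) -> int:
    """Brute-force d(x, y): fewest replacements and substring reversals turning x into y.

    Illustrative only; assumes the alphabet is the set of characters in x or y.
    """
    if len(x) != len(y):
        # Neither operation changes the length, so unequal lengths are unreachable.
        raise ValueError("x and y must have the same length")

    alphabet = sorted(set(x) | set(y))  # assumed alphabet
    seen = {x}
    queue = deque([(x, 0)])

    while queue:
        s, dist = queue.popleft()
        if s == y:
            return dist

        neighbours = []
        # Operation 1: replace one character by a different one.
        for i, ch in enumerate(s):
            for c in alphabet:
                if c != ch:
                    neighbours.append(s[:i] + c + s[i + 1:])
        # Operation 2: reverse one block s[i:j] of length at least 2.
        for i in range(len(s)):
            for j in range(i + 2, len(s) + 1):
                neighbours.append(s[:i] + s[i:j][::-1] + s[j:])

        for t in neighbours:
            if t not in seen:
                seen.add(t)
                queue.append((t, dist + 1))

    # y is always reachable by replacements alone, so this is never hit.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    print(block_reversal_distance("abcd", "dcba"))  # one reversal of the whole string
    print(block_reversal_distance("abc", "bca"))    # two reversals: abc -> bac -> bca
```

Since every operation is its own inverse or can be undone by one operation of the same kind, the distance computed this way is symmetric, which matches the "(or vice versa)" in the definition.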

Supported in part by a grant from the Charles B. Wang Foundation.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Muthukrishnan, S., Sahinalp, S.C. (2002). An Improved Algorithm for Sequence Comparison with Block Reversals. In: Rajsbaum, S. (ed.) LATIN 2002: Theoretical Informatics. LATIN 2002. Lecture Notes in Computer Science, vol 2286. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45995-2_30

  • DOI: https://doi.org/10.1007/3-540-45995-2_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43400-9

  • Online ISBN: 978-3-540-45995-8

  • eBook Packages: Springer Book Archive
