On incremental computation of transitive closure and greedy alignment

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1264)

Abstract

Several heuristic algorithms have been proposed for the multiple alignment of sequences. The most time-efficient are often greedy algorithms. At each step, a greedy alignment algorithm must know whether two characters are alignable, given the characters that have already been definitively aligned. We show that this problem reduces to finding paths in a directed graph. We give an incremental algorithm that maintains the transitive closure of a graph for which a spanning set of k disjoint paths is known. For a graph with n vertices and m edges (in the final state), the algorithm maintains the transitive closure in $O(k^2 m + n \min\{m, n\})$ time and $O(kn)$ space. We show that any greedy alignment algorithm can use it to decide in constant time whether two characters are alignable, by maintaining the transitive closure of an alignment graph in $O(k^2 n + n^2)$ time and $O(kn)$ space, for k sequences of total length n. As an example application, we have implemented TwoAlign, an efficient multiple alignment program based on the greedy computation of pairwise local alignments.
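The idea of a constant-time alignability test backed by an O(kn) table can be illustrated with a minimal sketch. It is not the paper's algorithm and does not achieve the stated time bounds: each edge insertion here simply scans all vertices. The model is an assumption made for illustration: vertices are characters written as (sequence, position) pairs, each sequence is an implicit path, an aligned pair contributes two opposite edges, and two characters are treated as alignable when reachability between them is symmetric. The class name `AlignmentClosure` and its methods are hypothetical.

```python
import math


class AlignmentClosure:
    """Illustrative reachability table over k sequences (assumed model)."""

    def __init__(self, lengths):
        self.lengths = list(lengths)
        k = len(self.lengths)
        # succ[(s, i)][t] is the smallest position on sequence t reachable
        # from character i of sequence s (math.inf if none).
        # The table holds k entries per character, i.e. O(kn) in total.
        self.succ = {
            (s, i): [i if t == s else math.inf for t in range(k)]
            for s, n in enumerate(self.lengths)
            for i in range(n)
        }

    def reaches(self, a, b):
        """Constant-time query: is there a path from a to b?"""
        return self.succ[a][b[0]] <= b[1]

    def add_edge(self, u, v):
        """Insert edge u -> v, updating every vertex that reaches u."""
        row_v = list(self.succ[v])  # snapshot of v's row before updating
        for row_x in self.succ.values():
            if row_x[u[0]] <= u[1]:  # this vertex reaches u
                for t, p in enumerate(row_v):
                    if p < row_x[t]:
                        row_x[t] = p

    def alignable(self, a, b):
        # Assumption: a and b can share a column only if neither strictly
        # precedes the other, i.e. reachability between them is symmetric.
        return self.reaches(a, b) == self.reaches(b, a)

    def align(self, a, b):
        """Record that a and b are aligned (two opposite edges)."""
        if not self.alignable(a, b):
            raise ValueError("characters are not alignable")
        self.add_edge(a, b)
        self.add_edge(b, a)


closure = AlignmentClosure([3, 3, 3])
closure.align((0, 1), (1, 1))
print(closure.alignable((0, 0), (1, 2)))  # crossing pair
print(closure.alignable((0, 2), (1, 2)))  # pair on the same side
```

In this sketch the query cost is a single table lookup and comparison per direction, which is the property a greedy aligner relies on; all the work is shifted to `add_edge`.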

Editor information

Alberto Apostolico, Jotun Hein

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abdeddaïm, S. (1997). On incremental computation of transitive closure and greedy alignment. In: Apostolico, A., Hein, J. (eds) Combinatorial Pattern Matching. CPM 1997. Lecture Notes in Computer Science, vol 1264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63220-4_58

  • DOI: https://doi.org/10.1007/3-540-63220-4_58

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63220-7

  • Online ISBN: 978-3-540-69214-0

  • eBook Packages: Springer Book Archive
