Randomized fast design of short DNA words

Published: 06 November 2009
Abstract

We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA self-assembly and DNA computing. Previous work has extended results from coding theory to obtain bounds on code size for new biologically motivated constraints and has applied heuristic local search and genetic algorithm techniques for code design. This article proposes a natural optimization formulation of the DNA code design problem in which the goal is to design n strings that satisfy a given set of constraints while minimizing the length of the strings. For multiple sets of constraints, we provide simple randomized algorithms that run in time polynomial in n and any given constraint parameters, and output strings of length within a constant factor of the optimal with high probability. To the best of our knowledge, this work is the first to consider this type of optimization problem in the context of DNA code design.
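As a rough illustration of the randomized design idea, the sketch below draws n uniformly random words over {A, C, G, T} and accepts the set if every pair differs in at least k positions. This is not the article's algorithm: the abstract does not specify the constraints, so the pairwise Hamming-distance constraint, the retry loop, and all names (`design_words`, `satisfies_min_distance`, `max_tries`) are assumptions made for illustration only.

```python
import itertools
import random

ALPHABET = "ACGT"  # the four DNA bases


def hamming(u, v):
    """Number of positions at which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))


def random_word(length, rng):
    """Draw one word uniformly at random from ALPHABET^length."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def satisfies_min_distance(words, k):
    """Assumed constraint: every pair of words has Hamming distance >= k."""
    return all(hamming(u, v) >= k for u, v in itertools.combinations(words, 2))


def design_words(n, k, length, rng, max_tries=1000):
    """Sample n random words until the assumed constraint holds.

    Returns a list of n words, or None if no valid set is found
    within max_tries attempts (hypothetical cutoff).
    """
    for _ in range(max_tries):
        words = [random_word(length, rng) for _ in range(n)]
        if satisfies_min_distance(words, k):
            return words
    return None


if __name__ == "__main__":
    code = design_words(n=8, k=4, length=16, rng=random.Random(0))
    print(code)
```

For a fixed n and k, longer words make a random set more likely to pass the check, so under this assumed constraint the question of interest is how short `length` can be while sampling still succeeds with high probability.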

Reviews

Pietro Hiram Guzzi

The design of short DNA words, that is, of sets of DNA strings that satisfy a number of constraints, is a key problem in many areas. In this paper, Kao, Sanghi, and Schweller present a novel formulation of this problem as an optimization problem and propose randomized algorithms to solve it. The proposed solution has a sound formulation in terms of optimization and is computationally tractable, being polynomial in most cases. The main contribution of the paper is the original formulation of the problem, which could motivate further discussion. Considering that the paper is designed to be read by an expert audience, it has a clear structure and a clear formulation.

Online Computing Reviews Service

Published in

ACM Transactions on Algorithms, Volume 5, Issue 4
October 2009
281 pages
ISSN: 1549-6325
EISSN: 1549-6333
DOI: 10.1145/1597036

Copyright © 2009 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 6 November 2009
• Accepted: 1 March 2009
• Revised: 1 February 2009
• Received: 1 October 2005
Qualifiers

• research-article
• Research
• Refereed
