Abstract
We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA self-assembly and DNA computing. Previous work has extended results from coding theory to obtain bounds on code size for new biologically motivated constraints and has applied heuristic local search and genetic algorithm techniques for code design. This article proposes a natural optimization formulation of the DNA code design problem in which the goal is to design n strings that satisfy a given set of constraints while minimizing the length of the strings. For multiple sets of constraints, we provide simple randomized algorithms that run in time polynomial in n and any given constraint parameters, and output strings of length within a constant factor of the optimal with high probability. To the best of our knowledge, this work is the first to consider this type of optimization problem in the context of DNA code design.
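As an illustration of the optimization formulation only, the sketch below draws n random words over {A, C, G, T} and grows the word length until one sample satisfies a constraint. It is not the article's algorithm and carries no constant-factor guarantee. The constraint used here (pairwise Hamming distance at least k), the function names, and the retry parameters are all assumptions chosen for the example.

```python
import random
from itertools import combinations

ALPHABET = "ACGT"  # the four DNA bases


def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def is_valid_code(words, k):
    """Assumed constraint: every pair of words differs in at least k positions."""
    return all(hamming(a, b) >= k for a, b in combinations(words, 2))


def random_code(n, length, k, rng, max_tries):
    """Sample n uniformly random words; return the first sample that is valid."""
    for _ in range(max_tries):
        words = ["".join(rng.choice(ALPHABET) for _ in range(length))
                 for _ in range(n)]
        if is_valid_code(words, k):
            return words
    return None


def design_short_code(n, k, rng=None, max_tries=20, max_length=256):
    """Try increasing lengths, returning the first valid random code found.

    Hypothetical helper: a heuristic search for a short length, not a
    proof-backed approximation of the optimal length.
    """
    rng = rng or random.Random(0)
    for length in range(max(k, 1), max_length + 1):
        words = random_code(n, length, k, rng, max_tries)
        if words is not None:
            return words
    raise ValueError("no valid code found up to max_length")


if __name__ == "__main__":
    code = design_short_code(n=8, k=3, rng=random.Random(1))
    print(len(code[0]), code)
```

Verification costs O(n² · length) per sample, so each attempt is polynomial in n and the word length. The length at which the search stops depends on the random seed and on `max_tries`.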