Distribution of Base Pair Repeats in Coding and Noncoding DNA Sequences

Nikolay V. Dokholyan, Sergey V. Buldyrev, Shlomo Havlin, and H. Eugene Stanley
Phys. Rev. Lett. 79, 5182 – Published 22 December 1997

Abstract

We analyze the histograms for the lengths of the 16 possible distinct repeats of identical dimers, known as dimeric tandem repeats, in DNA sequences. For coding regions, the probability of finding a repetitive sequence of ℓ copies of a particular dimer decreases exponentially as ℓ increases. For the noncoding regions, the distribution functions for most of the 16 dimers have long tails and can be approximated by power-law functions, while for coding DNA, they can be well fit by a first-order Markov process. We propose a model, based on known biophysical processes, which leads to the observed probability distribution functions for noncoding DNA. We argue that this difference in the shape of the distribution functions between coding and noncoding DNA arises from the fact that noncoding DNA is more tolerant to evolutionary mutational alterations than coding DNA.
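
As an illustration of the kind of histogram described above, the following is a minimal sketch, not the authors' procedure. It assumes a repeat is a maximal run of back-to-back copies of one dimer, found independently for each of the 16 dimers, and it records how many runs of each copy number occur. The function name dimer_repeat_histograms and the toy input sequence are hypothetical.

```python
import re
from collections import Counter
from itertools import product


def dimer_repeat_histograms(seq):
    """Return {dimer: Counter({copies: number_of_runs})} for all 16 dimers.

    Assumption (not from the paper): a repeat is a maximal run of
    consecutive copies of the dimer, located with a greedy regex scan.
    """
    seq = seq.upper()
    histograms = {}
    for x, y in product("ACGT", repeat=2):
        dimer = x + y
        counts = Counter()
        # (?:xy)+ matches one or more back-to-back copies of the dimer.
        for match in re.finditer(f"(?:{dimer})+", seq):
            copies = len(match.group()) // 2
            counts[copies] += 1
        histograms[dimer] = counts
    return histograms


# Toy input, chosen only to show the output format.
hist = dimer_repeat_histograms("ACACACGTTTTT")
print(hist["AC"])  # one run of three AC copies
print(hist["TT"])  # five T's contain one run of two TT copies
```

In a first-order Markov sequence the probability of a run falls off geometrically with the number of copies, so such a histogram would look roughly linear on a semi-log plot, while a power-law tail would look roughly linear on a log-log plot.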

  • Received 24 April 1997

DOI:https://doi.org/10.1103/PhysRevLett.79.5182

©1997 American Physical Society

Authors & Affiliations

Nikolay V. Dokholyan (1), Sergey V. Buldyrev (1), Shlomo Havlin (1,2), and H. Eugene Stanley (1)

  • (1) Center for Polymer Studies, Physics Department, Boston University, Boston, Massachusetts 02215
  • (2) Gonda-Goldschmied Center and Department of Physics, Bar-Ilan University, Ramat Gan, 52900 Israel

Issue

Vol. 79, Iss. 25 — 22 December 1997
