Protein domains of low sequence complexity—dark matter of the proteome

Steven L. McKnight

Department of Biochemistry, UT Southwestern Medical Center, Dallas, Texas 75390-9152, USA

Corresponding author: steven.mcknight@utsouthwestern.edu

Abstract

This perspective begins with a speculative consideration of the properties of the earliest proteins to appear during evolution. What did these primitive proteins look like, and how were they of benefit to early forms of life? I proceed to hypothesize that primitive proteins have been preserved through evolution and now serve diverse functions important to the dynamics of cell morphology and biological regulation. The primitive nature of these modern proteins is easy to spot. They are composed of a limited subset of the 20 amino acids used by traditionally evolved proteins and thus are of low sequence complexity. This chemical simplicity limits protein domains of low sequence complexity to forming only a crude and labile type of protein structure currently hidden from the computational powers of machine learning. I conclude by hypothesizing that this structural weakness represents the underlying virtue of proteins that, at least for the moment, constitute the dark matter of the proteome.

Footnotes

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genesdev.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

Genes & Dev. 38: 205-212 © 2024 McKnight; Published by Cold Spring Harbor Laboratory Press