Large-scale identification of novel transcripts in the human genome

  1. Brock A. Peters1,2,5,
  2. Brad St. Croix3,
  3. Tobias Sjöblom1,
  4. Jordan M. Cummins1,
  5. Natalie Silliman1,
  6. Janine Ptak1,
  7. Saurabh Saha1,
  8. Kenneth W. Kinzler1,2,
  9. Christos Hatzis4, and
  10. Victor E. Velculescu1,6
  1. 1 The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA;
  2. 2 Department of Pharmacology and Molecular Sciences, Johns Hopkins University, Baltimore, Maryland 21231, USA;
  3. 3 Tumor Angiogenesis Section, Mouse Cancer Genetics Program, National Cancer Institute, Frederick, Maryland 21702, USA;
  4. 4 Nuvera Biosciences, Woburn, Massachusetts 01801, USA

Abstract

Although the sequencing of the human genome has been completed, the number and identity of the genes it contains remain to be fully determined. We used LongSAGE to analyze 660,357 transcripts from human brain mRNA and identified expression of 17,409 known genes and >15,000 different transcripts that were not annotated in genome databases. Analysis of a subset of these unannotated transcripts suggests that 85% were differentially expressed in various tissue types and that fewer than 20% would have been detected by ab initio gene predictions. These studies suggest that the human genome contains on the order of twice as many transcribed regions as are currently annotated and that experimental approaches will be required to fully elucidate the novel genes corresponding to these transcripts.

Footnotes

  • 5 Present address: Department of Molecular Biology, Genentech, Inc., South San Francisco, CA 94080, USA.

  • 6 Corresponding author. E-mail velculescu{at}jhmi.edu; fax (410) 955-0548.

  • [Supplemental material is available online at www.genome.org]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5486607

    • Received May 11, 2006.
    • Accepted December 1, 2006.