Large-scale production of SAGE libraries from microdissected tissues, flow-sorted cells, and cell lines

  1. Jaswinder Khattra¹,
  2. Allen D. Delaney¹,
  3. Yongjun Zhao¹,
  4. Asim Siddiqui¹,
  5. Jennifer Asano¹,
  6. Helen McDonald¹,
  7. Pawan Pandoh¹,
  8. Noreen Dhalla¹,
  9. Anna-liisa Prabhu¹,
  10. Kevin Ma¹,
  11. Stephanie Lee¹,
  12. Adrian Ally¹,
  13. Angela Tam¹,
  14. Danne Sa¹,
  15. Sean Rogers¹,
  16. David Charest²,
  17. Jeff Stott¹,
  18. Scott Zuyderduyn¹,⁴,
  19. Richard Varhol¹,
  20. Connie Eaves³,
  21. Steven Jones¹,
  22. Robert Holt¹,
  23. Martin Hirst¹,
  24. Pamela A. Hoodless³, and
  25. Marco A. Marra¹,⁵

  ¹ Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 4S6, Canada;
  ² Genome British Columbia, Vancouver, British Columbia V5Z 1C6, Canada;
  ³ Terry Fox Laboratory, BC Cancer Research Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada

Abstract

We describe the details of a serial analysis of gene expression (SAGE) library construction and analysis platform that has enabled the generation of >298 high-quality SAGE libraries and >30 million SAGE tags primarily from sub-microgram amounts of total RNA purified from samples acquired by microdissection. Several RNA isolation methods were used to handle the diversity of samples processed, and various measures were applied to minimize ditag PCR carryover contamination. Modifications in the SAGE protocol resulted in improved cloning and DNA sequencing efficiencies. Bioinformatic measures to automatically assess DNA sequencing results were implemented to analyze the integrity of ditag structure, linker or cross-species ditag contamination, and yield of high-quality tags per sequence read. Our analysis of singleton tag errors resulted in a method for correcting such errors to statistically determine tag accuracy. From the libraries generated, we produced an essentially complete mapping of reliable 21-base-pair tags to the mouse reference genome sequence for a meta-library of ∼5 million tags. Our analyses led us to reject the commonly held notion that duplicate ditags are artifacts. Rather than the usual practice of discarding such tags, we conclude that they should be retained to avoid introducing bias into the results and thereby maintain the quantitative nature of the data, which is a major theoretical advantage of SAGE as a tool for global transcriptional profiling.
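The bias argument for retaining duplicate ditags can be illustrated with a small Python sketch. This is not the authors' pipeline: the function name `tag_counts`, the placeholder tag labels, the toy ditag list, and the choice to treat a ditag and its reversed orientation as the same ditag are all assumptions made for illustration. The sketch only shows that collapsing identical ditags to a single copy can lower the apparent share of an abundant tag.

```python
from collections import Counter

def tag_counts(ditags, keep_duplicates=True):
    """Count tags from (left_tag, right_tag) ditag pairs (hypothetical helper)."""
    if not keep_duplicates:
        # Conventional filtering: keep one copy of each distinct ditag.
        # Assumption: orientation is ignored, so (a, b) and (b, a) are the same ditag.
        ditags = {tuple(sorted(pair)) for pair in ditags}
    counts = Counter()
    for left, right in ditags:
        counts[left] += 1
        counts[right] += 1
    return counts

# Toy library with placeholder tag labels; "A" stands in for an abundant transcript,
# so ditags containing it are expected to recur by chance alone.
ditags = [("A", "A"), ("A", "A"), ("A", "A"), ("A", "B"), ("A", "B"), ("B", "C")]

retained = tag_counts(ditags, keep_duplicates=True)
filtered = tag_counts(ditags, keep_duplicates=False)

# Fraction of all tags that are "A" under each policy.
share_retained = retained["A"] / sum(retained.values())
share_filtered = filtered["A"] / sum(filtered.values())
print(f"A share, duplicates retained:  {share_retained:.2f}")
print(f"A share, duplicates discarded: {share_filtered:.2f}")
```

In this toy example the abundant tag accounts for 8 of 12 tags when duplicate ditags are retained but only 3 of 6 after they are discarded, which is the kind of distortion of relative abundance the abstract refers to.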

Footnotes

  • ⁴ Present address: Department of Cancer Genetics, BC Cancer Research Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada

  • ⁵ Corresponding author. E-mail mmarra{at}bcgsc.ca; fax (604) 877-6085.

  • [Supplemental material is available online at www.genome.org.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5488207

    • Received May 12, 2006.
    • Accepted October 3, 2006.