Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

  1. Li Li1,
  2. Brian P. Brunk2,
  3. Jessica C. Kissinger1,3,
  4. Deana Pape4,
  5. Keliang Tang5,
  6. Robert H. Cole5,
  7. John Martin4,
  8. Todd Wylie4,
  9. Mike Dante4,
  10. Steven J. Fogarty5,
  11. Daniel K. Howe6,
  12. Paul Liberator7,
  13. Carmen Diaz7,
  14. Jennifer Anderson7,
  15. Michael White8,
  16. Maria E. Jerome8,
  17. Emily A. Johnson8,
  18. Jay A. Radke8,
  19. Christian J. Stoeckert, Jr.2,
  20. Robert H. Waterston4,
  21. Sandra W. Clifton4,
  22. David S. Roos1, and
  23. L. David Sibley5,9
  1Department of Biology, 2Center for Bioinformatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; 3Department of Genetics, University of Georgia, Athens, Georgia 30602, USA; 4Genome Sequencing Center, Department of Genetics, 5Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri 63108, USA; 6Department of Veterinary Sciences, University of Kentucky, Lexington, Kentucky 40546, USA; 7Human and Animal Infectious Diseases, Merck Research Laboratories, Rahway, New Jersey 07065, USA; 8Veterinary Molecular Biology, Montana State University, Bozeman, Montana 59717, USA

Abstract

Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. The gene assemblies in this database were compared with the public protein databases by BLAST similarity searches to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs at a conservative cutoff of p < 10⁻⁹, thus identifying many conserved genes that are likely to share common functions with genes in other well-studied organisms. Gene assemblies were also used to identify strain polymorphisms, examine stage-specific expression, and identify gene families. Also identified was an interesting class of genes that is confined to members of this phylum and not shared by plants, animals, or fungi. These genes likely mediate the novel biological features of members of the Apicomplexa and hence offer great potential for biological investigation and as possible therapeutic targets.
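
As an illustration of how a similarity cutoff of this kind can be applied, the following Python sketch keeps the best database hit per gene assembly when its BLAST p-value falls below 1e-9, the threshold quoted in the abstract. It is a minimal sketch, not the authors' pipeline: the `Hit` record, its field names, and the functions `putative_homologs` and `homolog_fraction` are hypothetical, and the hits are assumed to have already been parsed from BLAST output.

```python
from typing import Dict, Iterable, NamedTuple

# Cutoff quoted in the abstract (p < 10^-9).
P_CUTOFF = 1e-9


class Hit(NamedTuple):
    """One BLAST hit; all field names are hypothetical, for illustration only."""
    assembly_id: str  # identifier of the query gene assembly
    subject_id: str   # identifier of the matched protein database entry
    p_value: float    # BLAST probability reported for the match


def putative_homologs(hits: Iterable[Hit], cutoff: float = P_CUTOFF) -> Dict[str, Hit]:
    """Return the lowest-p hit per assembly, keeping only hits with p < cutoff."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.p_value >= cutoff:
            continue  # not significant at the chosen cutoff
        current = best.get(hit.assembly_id)
        if current is None or hit.p_value < current.p_value:
            best[hit.assembly_id] = hit
    return best


def homolog_fraction(hits: Iterable[Hit], total_assemblies: int,
                     cutoff: float = P_CUTOFF) -> float:
    """Fraction of all assemblies that have at least one hit below the cutoff."""
    if total_assemblies <= 0:
        raise ValueError("total_assemblies must be positive")
    return len(putative_homologs(hits, cutoff)) / total_assemblies
```

The comparison is strict, so a hit with a p-value exactly equal to the cutoff is not counted. Keeping one hit per assembly means the reported fraction counts assemblies and not individual alignments.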

[The sequence data from this study have been submitted to the dbEST division of GenBank under accession nos.: Toxoplasma gondii: BG657138–BG661027, BI921045–BI921090, BI946571–BI946588, BM003839–BM004582, BM039066–BM040645, BM131233–BM133172, BM174962–BM176879, BM188953–BM189923, BM271559–BM271694. Plasmodium falciparum: BI670521–BI670830, BI813842–BI816393, BI936022–BI936312, BM273300–BM276553. Sarcocystis neurona: BE574328, BE574347, BE574384, BE574386, BE574409, BE574465, BE574508, BE574543, BE574561, BE574633, BE574689, BE574694, BE574723, BE635418–BE636244, BE574288–BE574724, BF323572–BF324064, BM252128–BM253024, BM303125–BM305293. Eimeria tenella: AI755306–AI758088, AI759179–AI759181, AI759254–AI759304, AI759182–AI759253, AI759305–AI759387, AI759463–AI759546, AI759388–AI759462, AI759547–AI759621, BE027133–BE028807, BF023640–BF023711, BF023609–BF023639, BG235514–BG235880, BG413067–BG413336, BG466192–BG467045, BG515959–BG517044, BG560819–BG562379, BG724474–BG725148, BI895002–BI896127, BM305294–BM306971, BM321464–BM322026. Neospora caninum: BF248514–BF249435, BF716421–BF717094, BF823742, BF823805–BF823813, BF823743–BF824633, BG235070–BG235513.]

Footnotes

  • 9 Corresponding author.

  • E-MAIL sibley{at}borcim.wustl.edu; FAX (314) 362-3203.

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.693203.

    • Received August 5, 2002.
    • Accepted December 6, 2002.