Inferring protein function from genomic sequence: Giardia lamblia expresses a phosphatidylinositol kinase-related kinase similar to yeast and mammalian TOR☆,☆☆

https://doi.org/10.1016/S1096-4959(02)00218-X

Abstract

Functional assays of genes have historically led to insights about the activities of a protein or protein cascade. However, the rapid expansion of genomic and proteomic information for a variety of diverse taxa offers an alternative and powerful means of predicting function by comparing the enzymes and metabolic pathways used by different organisms. As part of the Giardia lamblia genome sequencing project, we routinely survey the complement of predicted proteins and compare those found in this putatively early diverging eukaryote with those of prokaryotes and more recently evolved eukaryotic lineages. Such comparisons reveal the minimal composition of conserved metabolic pathways, suggest which proteins may have been acquired by lateral transfer, and, by their absence, hint at functions lost in the transition from a free-living to a parasitic lifestyle. Here, we describe the use of bioinformatic approaches to investigate the complement and conservation of proteins in Giardia involved in the regulation of translation. We compare an FK506 binding protein homologue and a phosphatidylinositol kinase-related kinase present in Giardia with those found in other eukaryotes for which complete genomic sequence data are available. Our investigation of the Giardia genome suggests that PIK-related kinases are of ancient origin and are highly conserved.

Introduction

The availability of complete genomes from many species, including several eukaryotes, allows the identification, modeling and comparison of regulatory pathways for diverse organisms. Starting with a nucleic acid sequence, we can infer the encoded protein structure, identify key features, assess its distribution in phylogenetically divergent organisms, and predict its metabolic function. The proper use of bioinformatics enables new techniques for high-throughput measurements of gene expression, e.g. DNA microarray expression profiling. It is now possible to rapidly identify proteins in specific pathways and subsets of genes that are induced in response to a specific treatment or environmental condition.
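
The first step named above, inferring an encoded protein from a nucleic acid sequence, can be sketched in a few lines of Python. This is a minimal illustration and not the software used in this study: the function name `translate` is hypothetical, and the sketch assumes the standard genetic code (NCBI translation table 1) and a sequence that is already in frame.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq: str) -> str:
    """Translate an in-frame nucleotide sequence, stopping at the first stop codon.

    Assumption for illustration: codons containing ambiguous bases are
    reported as 'X', and trailing bases that do not fill a codon are ignored.
    """
    seq = nt_seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)
```

A predicted protein obtained this way would then serve as the query for homology searches and domain detection.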

Our ongoing homology-based analyses of genome sequence data have revealed a great deal about Giardia's evolution and metabolic capabilities. For example, it is apparent that Giardia has retained functional spliceosomal machinery. Unless Giardia acquired this machinery and its rare introns through lateral transfer, which is unlikely, these results push back the date of intron acquisition to before the divergence of the diplomonads (Nixon et al., 2002b). We have also detected homologues of proteobacterial enzymes, possibly relics of secondary mitochondrial loss or acquired in a separate endosymbiotic event (Morrison et al., 2001). One surprising discovery is a protein that displays nearly 95% similarity at the amino acid level with a cDNA that is expressed in embryonic mouse and human placental tissue (unpublished data). The function of this protein is unknown in any system, but in Giardia it represents one of the major transcripts in trophozoites. We are currently investigating its cellular location in Giardia. This is a good example of how genome analysis of phylogenetically divergent organisms can influence which of the many open reading frames (ORFs) in mammalian genomes should be selected for detailed analysis. This gene is either highly conserved and therefore functionally important in eukaryotes, or it represents a genetic element that has been laterally transferred between genomes. Other sequences of interest, originally detected by homology, include an Fe-hydrogenase, with important implications for the hydrogen hypothesis about the origins of eukaryotes (Martin and Müller, 1998), several ferredoxins, ferredoxin-nitroreductases, and oxygen-insensitive nitroreductases, some of which appear to have been laterally transferred from different bacterial lineages (Nixon et al., 2002a), and genes involved in intermediary metabolism, information processing, and cellular structure (e.g. Henze et al., 1998, Sanchez et al., 1999, Bouzat et al., 2000, Wu et al., 2000, McArthur et al., 2001). In this paper, we demonstrate the investigation of previously unrecognized giardial proteins through a bioinformatics approach.

Using BLAST (Altschul et al., 1990, Altschul et al., 1997) to search the non-redundant GenBank protein database, we detected a phosphatidylinositol kinase-related kinase (PIKK) in the Giardia lamblia genome. These kinases are key components in signaling pathways that regulate cell growth, cell cycle progression, differentiation and cell to cell communication. PIKKs have been implicated in DNA damage response pathways, maintenance of telomere length, recombination and nonsense-mediated mRNA decay (Keith and Schreiber, 1995, Kuruvilla and Schreiber, 1999, Schmelzle and Hall, 2000, Denning et al., 2001). As their name suggests, phosphatidylinositol kinase-related kinases were recognized by their similarity to phosphatidylinositol kinases (PIK), which are lipid kinases. The PIKKs contain a conserved domain that is strikingly similar to PIK, but are actually serine/threonine protein kinases (Hunter 1995). The carboxy terminus of PIKKs contains conserved functional domains. These are the FAT domain (Bosotti et al., 2000), PI3_PI4 kinase domain and FATC domain. Despite the conserved PI3_PI4K domain, no intrinsic lipid kinase activity has been reliably demonstrated. In contrast, the amino terminus is less well conserved. HEAT repeat elements (Andrade and Bork, 1995, Andrade et al., 2001b) are detected within and upstream of the FAT domain in some PIKKs.

Perhaps the best studied member of this family is the TOR (target of rapamycin) protein. In other eukaryotes, TOR proteins respond to environmental nutrient levels and regulate protein synthesis, possibly through receptor-mediated pathways (Herbert et al., 2002). TOR was first identified in Saccharomyces cerevisiae, through characterization of mutations that conferred rapamycin resistance (Heitman et al., 1991a). Yeast has two TOR proteins, Tor1p and Tor2p, which share 70% amino acid identity, but are not functionally equivalent. While both proteins influence the G1 to S phase progression in the cell cycle, Tor2p has a rapamycin-resistant activity involved in actin scaffolding (Zheng and Schreiber 1997). Rapamycin diffuses into the cell and binds to both the large (∼2500 aa) TOR protein and to FKBP12 (FK506 binding protein, 109–114 aa), a prolyl-isomerase (Heitman et al., 1991a, Heitman et al., 1991b). FKBP12 functions in protein folding and is highly conserved among eukaryotes (Heitman et al., 1992). The FKBP12-rapamycin complex targets a binding domain in TOR known as the FKBP12-rapamycin binding (FRB) domain (Choi et al., 1996). The rapamycin-FKBP12 complex inhibits TOR's kinase activity, arresting cell cycling in the G1 phase, which mimics the effect of cell starvation (Heitman et al., 1991a, Barbet et al., 1996, Zaragoza et al., 1998). TOR homologues in mammals are called mTOR or FRAP (FKBP12-rapamycin associated protein). Two homologues are present in the human genome (Brown et al., 1994, Sabatini et al., 1994, Onyango et al., 1998). A single TOR protein occurs in the completed genomes of Caenorhabditis elegans (C. elegans Sequencing Consortium, 1998) and Drosophila melanogaster (FlyBase Consortium, 2002). 
Investigations of the response of cells to nutrient levels and rapamycin exposure and the characterization of rapamycin-resistant mutants (Hara et al., 1998, Beck and Hall, 1999, Dennis et al., 2001) have identified many components of TOR/FRAP regulatory pathways and the links between them (reviewed in Gingras et al., 2001a, Gingras et al., 2001b, Raught et al., 2001).

Studies suggest that TOR's role in mammalian translational regulation is to phosphorylate the eukaryotic translation initiation factor (eIF) 4E binding proteins and release eIF4E for interaction with eIF4G (Gingras et al., 1999, Mothe-Satney et al., 2000, Raught et al., 2000). The interaction of these components is critical in the recruitment of translational machinery to (capped) mRNA. In yeast, TOR appears to regulate the analogous eIF4E binding protein Eap1 (Cosentino et al., 2000). A second role of mammalian TOR is its ability to indirectly upregulate S6 kinase (Jefferies et al., 1997, Dufner and Thomas, 1999, Tang et al., 2001). S6K1 activity drives preferential translation of 5′ TOP (terminal oligopyrimidine tract) mRNAs, such as those encoding ribosomal proteins and components of the translational apparatus. TOR influences similar pathways in yeast, controlling protein synthesis and degradation by signaling to the TAP42 and NPR1 proteins (Schmidt et al., 1998, Beck and Hall, 1999, Shamji et al., 2000, Jacinto et al., 2001, Kimball, 2001). In the absence of adequate ATP or amino acids or in the presence of rapamycin, TOR is inactive, phosphatases are activated and dephosphorylate target proteins, and translation is repressed. TOR is not part of a simple linear pathway, but is a key element in a complex network of effector cascades, which regulate not only translation, but also transcription, protein degradation and ribosome biogenesis.

The detection of a putative FRAP/TOR protein homologue in preliminary BLASTX annotation of Giardia lamblia single-pass reads prompted our current investigation of proteins involved in translational regulation in this putatively early branching eukaryote. Unlike the genomes of most other eukaryotes, Giardia's genome is extremely compact, with apparently rare introns (Nixon et al., 2002b) and small intergenic regions (Smith et al., 1998, Iwabe and Miyata, 2001). Untranslated regions of mRNAs are generally quite short (Adam 2001), which would reduce the potential for translational regulation through secondary structures. Elmendorf et al. conclude that Giardia exhibits the potential for relaxed transcriptional regulation because of the sequence diversity of its short (8–10 bp) initiator elements and the occurrence of a large and diverse population of sterile, antisense transcripts, representing 20% of the total cellular polyadenylated RNAs (Elmendorf et al., 2001a, Elmendorf et al., 2001b). These observations make Giardia a valuable target for comparative studies of regulatory pathways, particularly those governing transcription and translation.

Database homology searches

Sequence data from the Giardia lamblia genome project are publicly available at http://www.mbl.edu/Giardia (McArthur et al., 2000). Although the Giardia genome project is still in progress, we have generated approximately a 6X coverage of the genome, which we estimate accounts for >95% of its coding capacity. The genome project provides preliminary annotations based upon BLASTX queries against the non-redundant GenBank protein databases for single-pass reads. For more exhaustive analyses, we
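
The relationship between sequencing depth and the fraction of a genome sampled can be approximated with the Lander-Waterman expectation, in which a given base is left uncovered with probability e^(-c) at average coverage c. The Python sketch below is an illustration only: the text above does not state how the >95% estimate was derived, the function name `expected_fraction_covered` is hypothetical, and the idealized model assumes uniformly random reads with no cloning or assembly bias.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases sampled at least once under the
    idealized Lander-Waterman model: 1 - exp(-coverage)."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # Compare a few depths, including the approximately 6X quoted above.
    for depth in (1, 3, 6):
        print(f"{depth}X -> {expected_fraction_covered(depth):.4f}")
```

Under these assumptions a 6X shotgun would be expected to sample more than 99% of bases, which is compatible with, although not the same claim as, the >95% of coding capacity estimated above.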

Characterization of Giardia TOR protein

Sequences homologous to both the highly conserved carboxy half and the less-well conserved amino half of S. cerevisiae TOR proteins were identified in assembled contigs based upon single-pass reads posted as of February 2002. After amplifying and sequencing a 1400 bp fragment to link two large contigs, we assembled a full-length PIKK homologue. The predicted open reading frame is 7812 nt, encoding a 2604 aa protein. The stop codon is followed by a putative polyadenylation site, TGTAAA (Que et
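
The reported reading frame length and protein length can be checked against each other with simple arithmetic. The Python sketch below is illustrative only; the helper `codons_in_orf` is hypothetical, and the interpretation that the 7812 nt figure excludes the stop codon is an inference from the numbers, not a statement from the text.

```python
def codons_in_orf(orf_length_nt: int) -> int:
    """Return the number of complete codons in an open reading frame.

    Raises ValueError when the length is not a positive multiple of three,
    since such a sequence cannot be a complete in-frame ORF.
    """
    if orf_length_nt <= 0 or orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_nt // 3


ORF_LENGTH_NT = 7812      # reported ORF length
REPORTED_LENGTH_AA = 2604  # reported protein length

codons = codons_in_orf(ORF_LENGTH_NT)
# If the codon count equals the residue count, every counted codon is a
# sense codon, so the stop codon is presumably not part of the 7812 nt.
stop_codon_excluded = codons == REPORTED_LENGTH_AA
```

Here 7812 / 3 gives 2604 codons, matching the 2604 aa reported, so the two figures are internally consistent.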

Discussion

Both the domain and sequence similarity provide strong evidence that gTOR is a FRAP/TOR-like protein with functional roles that are analogous to FRAP/TOR proteins in other eukaryotes. The gFKBP may bind rapamycin, but that complex is unlikely to bind TOR. This should allow the detection of gFKBP functions independent of the ‘accidental’ complex formed with rapamycin and TOR, especially if gFKBP function involves other interactions with the conserved FKBP12 surface implicated in rapamycin

Acknowledgements

We thank the many ongoing eukaryotic genome project teams that have made their preliminary sequence data available for use. We wish to thank the scientists and funding agencies comprising the international Malaria Genome Project for making sequence data from the genome of P. falciparum (3D7) public prior to publication of the completed sequence. The Sanger Centre (UK) provided sequence for chromosomes 1, 3–9 and 13, with financial support from the Wellcome Trust. A consortium composed of The

References (72)

  • T.P. Herbert et al. The extracellular signal-regulated kinase pathway regulates the phosphorylation of 4E-BP1 at multiple sites. J. Biol. Chem. (2002)
  • T. Hunter. When is a lipid kinase not a lipid kinase? When it is a protein kinase. Cell (1995)
  • N. Iwabe et al. Overlapping genes in parasitic protist Giardia lamblia. Gene (2001)
  • E. Jacinto et al. TIP41 interacts with TAP42 and negatively regulates the TOR signaling pathway. Mol. Cell. (2001)
  • L.A. Knodler et al. Novel protein-disulfide isomerases from the early-diverging protist Giardia lamblia. J. Biol. Chem. (1999)
  • A.G. McArthur et al. The Giardia genome project database. FEMS Microbiol. Lett. (2000)
  • I. Mothe-Satney et al. Mammalian target of rapamycin-dependent phosphorylation of PHAS-I in four (S/T)P sites detected by phospho-specific antibodies. J. Biol. Chem. (2000)
  • P. Onyango et al. Molecular cloning and expression analysis of five novel genes in chromosome 1p36. Genomics (1998)
  • X. Que et al. Developmentally regulated transcripts and evidence of differential mRNA processing in Giardia lamblia. Mol. Biochem. Parasitol. (1996)
  • D.M. Sabatini et al. RAFT1: a mammalian protein that binds to FKBP12 in a rapamycin-dependent fashion and is homologous to yeast TORs. Cell (1994)
  • S.L. Salzberg et al. Interpolated Markov models for eukaryotic gene finding. Genomics (1999)
  • L.B. Sanchez et al. Cloning and sequencing of an acetyl-CoA synthetase (ADP-forming) gene from the amitochondriate protist, Giardia lamblia. Gene (1999)
  • R.A. Sayle et al. RASMOL: biomolecular graphics for all. Tr. Biochem. Sci. (1995)
  • T. Schmelzle et al. TOR, a central controller of cell growth. Cell (2000)
  • A.F. Shamji et al. Partitioning the transcriptional program induced by rapamycin among the effectors of the Tor proteins. Curr. Biol. (2000)
  • M.W. Smith et al. Sequence survey of the Giardia lamblia genome. Mol. Biochem. Parasitol. (1998)
  • D.C. Yu et al. Protein synthesis in Giardia lamblia may involve interaction between a downstream box (DB) in mRNA and an anti-DB in the 16S-like ribosomal RNA. Mol. Biochem. Parasitol. (1998)
  • R.D. Adam. Biology of Giardia lamblia. Clin. Microbiol. Rev. (2001)
  • S.F. Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. (1997)
  • M.A. Andrade et al. HEAT repeats in the Huntington's disease protein. Nat. Genet. (1995)
  • M.A. Andrade et al. Automated genome sequence analysis and annotation. Bioinformatics (1999)
  • J.H. Badger et al. CRITICA: coding region identification tool invoking comparative analysis. Mol. Biol. Evol. (1999)
  • N.C. Barbet et al. TOR controls translation initiation and early G1 progression in yeast. Mol. Biol. Cell. (1996)
  • A. Bateman et al. The Pfam protein families database. Nucl. Acids Res. (2002)
  • T. Beck et al. The TOR signalling pathway controls nuclear localization of nutrient-regulated transcription factors. Nature (1999)

☆ Contribution to a special issue on CBP on Comparative Functional Genomics.

☆☆ Sequences described have been deposited in GenBank under accession numbers AY095368 and AY095369.
