Abstract
The Arabidopsis Genome Initiative has so far released more than 80% of the genome sequence of Arabidopsis thaliana. About 70% of the identified genes have at least one paralogue. To understand the biological function of an individual gene, it is essential to study the structure, expression and organization of the entire multigene family to which it belongs. A systematic analysis of multigene families, made possible by the amount of genomic sequence data now available, provides important clues to genome evolution and plasticity. In this paper, four multigene families of A. thaliana are characterized: LCAD, HD-GL2, LGT and MYST. Members of HD-GL2 and LCAD have already been reported in plants. The LGT genes encode proteins containing glycosyl transferase motifs; no plant genes similar to the LGT genes have been reported to date. The novel MYST family, most likely plant-specific, encodes proteins with no identified function. Sequencing and in silico analysis led to the characterization of 29 novel genes belonging to these four families. The organization, structure, evolution and chromosomal location of all members of the four families are discussed. Expression data for some of the paralogues of each family are also presented.
Tavares, R., Aubourg, S., Lecharny, A. et al. Organization and structural evolution of four multigene families in Arabidopsis thaliana: AtLCAD, AtLGT, AtMYST and AtHD-GL2. Plant Mol Biol 42, 703–717 (2000). https://doi.org/10.1023/A:1006368316413
DOI: https://doi.org/10.1023/A:1006368316413