Introduction

Cytochrome P450 (CYP) enzymes, many of which catalyze the metabolism of xenobiotic compounds, constitute a superfamily of hemoproteins (Ding and Kaminsky 2003). CYP genes are classified into families and subfamilies on the basis of sequence similarities, and numerous polymorphisms among them have been reported [Human Cytochrome P450 (CYP) Allele Nomenclature Committee, http://www.imm.ki.se/CYPalleles/]. The products of the eight CYP genes and the eleven other genes selected for the work reported here are described in the following paragraphs.

CYP2A6 is a major player in the oxidation of nicotine and coumarin in human liver microsomes (Nakajima et al. 1996a; 1996b). Polymorphisms of CYP2A6 that might affect enzymatic activity (Ariyoshi et al. 2001; Kitagawa et al. 2001; Pitarque et al. 2001; Daigo et al. 2002; Oscarson et al. 2002; Xu et al. 2002) or susceptibility to lung cancer (Pianezza et al. 1998; London et al. 1999; Miyamoto et al. 1999) have been reported. However, some of those variants may be rare substitutions or limited to specific ethnic groups (Kitagawa et al. 2001; Oscarson et al. 1999, 2002; Xu et al. 2002).

CYP2A13 may play important roles in xenobiotic toxicity and tobacco-related tumorigenesis in the respiratory tract (Su et al. 2000). Zhang et al. (2002) have identified a C-to-T polymorphism (Arg257Cys) in exon 5 of the gene, and the product of this variant is 37% to 56% less active than the wild-type protein toward all substrates tested.

CYP2B6 is involved in the metabolism of several clinically important drugs (Ekins and Wrighton 1999). Lang et al. (2001) have identified five polymorphisms that would affect amino acid sequences; among them, a C-to-T polymorphism (Arg487Cys) in exon 9 of the gene appears to be associated with enzymatic activity. However, some of those five variants could also be rare substitutions or limited to specific ethnic groups (Hiratsuka et al. 2002).

CYP2E catalyzes the conversion of ethanol to acetaldehyde and to acetate and also metabolizes the pre-mutagenic nitrosamines present in cigarette smoke (Guengerich et al. 1991). Polymorphisms have been associated with increased risk of alcohol-related liver disease (Tanaka et al. 1997; Sun et al. 1999), lung cancer (el-Zein et al. 1997; Oyama et al. 1997; Wu et al. 1997), nasopharyngeal carcinoma (Hildesheim et al. 1997), and oral cancer (Hung et al. 1997).

By screening a database of expressed-sequence tags, Rylander et al. (2001) have identified CYP2S1, a P450 enzyme that is expressed mainly in trachea, lung, stomach, small intestine, and spleen. Rivera et al. (2002) have reported that CYP2S1 is inducible by 2,3,7,8-tetrachlorodibenzo-p-dioxin (dioxin) in a cell line derived from human lung epithelium.

Thromboxane A synthase (TBXAS1, CYP5A1) catalyzes the conversion of prostaglandin endoperoxide into thromboxane A2 (Shen and Tai 1986; Jones and Fitzpatrick 1991). TBXAS1 plays an important role in hemostasis and in cardiovascular diseases (FitzGerald et al. 1990). Although eleven polymorphisms have been identified in the promoter region, coding sequences, or 3'-untranslated region (3'UTR) of the TBXAS1 gene, the biological effects of these variants are currently unknown (Chevalier et al. 2001).

CYP7A1 encodes cholesterol 7-alpha-hydroxylase, the rate-limiting enzyme for the conversion of cholesterol to bile acids in the liver (Jelinek et al. 1990). The promoter region of this gene contains a potential DNA-binding site for the transcription factor CPF; mutation of the CPF-binding site abolishes hepatic-specific expression in transient transfection assays (Nitta et al. 1999). Wang et al. (1998) have identified two linked polymorphisms in the 5' flanking region of CYP7A1; the allele defined by these polymorphisms is associated with increased concentrations of low-density lipoprotein cholesterol in plasma.

CYP7B1 encodes oxysterol 7-alpha-hydroxylase (Setchell et al. 1998). This enzyme not only participates in the synthesis of primary bile acids from cholesterol but may also be involved in neurosteroid metabolism, synthesis of sex hormones, and detoxification of oxysterols (Setchell et al. 1998; Wu et al. 1999). Mutation in the CYP7B1 gene causes severe neonatal liver disease, an inborn error of bile acid synthesis (Setchell et al. 1998).

Arylacetamide deacetylase (AADAC) is an esterase involved in the metabolic activation of arylamine substrates that ultimately become carcinogenic (Probst et al. 1991). The AADAC gene is expressed in liver, adrenal cortex, adrenal medulla, and pancreas (Trickett et al. 2001).

Carboxyl-ester lipase (CEL), also called cholesterol esterase, plays an important role in the hydrolysis and absorption of cholesterol and lipid-soluble vitamin esters (Lombardo et al. 1980). The 3' portion of the CEL gene is characterized by a GC-rich region (Nilsson et al. 1990) and by a variable-number tandem-repeat sequence (Higuchi et al. 2002).

Carboxylesterases (CESs) constitute a group of serine-dependent esterases (Munger et al. 1991). These enzymes catalyze the hydrolysis of many different endogenous and xenobiotic compounds and play roles in the metabolism of numerous drugs that contain ester and amide bonds (Satoh and Hosokawa 1998). CES1 and CES2, two human-liver carboxylesterases selected for this study, differ in their substrate specificity (Dean et al. 1991; Brzezinski et al. 1994).

Esterase D (ESD), a member of a group of nonspecific esterases, is especially abundant in liver and kidney (Lee et al. 1986). The gene encoding human ESD is a useful genetic marker for retinoblastoma (Lee and Lee 1986).

Granzymes are cytotoxic T-lymphocyte-associated serine esterases (Masson and Tschopp 1987). Granzymes A (GZMA) and B (GZMB) are the most abundantly expressed of the granzymes (Henkart 1994). Both are involved in apoptotic processes, but each uses a distinct pathway (Beresford et al. 1999; Shresta et al. 1999). The GZMA pathway slowly induces apoptosis of target cells, whereas GZMB appears to facilitate the induction of apoptosis (Shi et al. 1992a, 1992b).

Interleukin 17 (IL17), also known as cytotoxic T-lymphocyte-associated serine esterase, is secreted by activated memory CD4+ T cells and modulates the early stage of the immune response (Rouvier et al. 1993; Broxmeyer 1996). High levels of IL-17 may be associated with several chronic inflammatory diseases (Kotake et al. 1999; Molet et al. 2001; Laan et al. 2002).

Ubiquitin carboxyl-terminal esterase L3 (UCHL3) catalyzes the hydrolysis of C-terminal esters and amides of ubiquitin (Wilkinson et al. 1989). This enzyme is thought to be involved in ubiquitin recycling to maintain the pools of monomeric ubiquitin necessary for proteolysis (Larsen et al. 1998). Johnston et al. (1997) have determined the crystal structure of human UCHL3 and identified its active sites.

GGT1 encodes gamma-glutamyltransferase, an enzyme involved in glutathione metabolism (Curthoys and Hughey 1979). Inborn deficiency of GGT1 causes glutathionuria (Schulman et al. 1975; Wright et al. 1980).

TGM1, also known as transglutaminase, is expressed during terminal differentiation of keratinocytes (Candi et al. 1995); this enzyme synthesizes the cornified envelope by a cross-linking reaction (Melino et al. 2000). Mutations in the TGM1 gene cause a skin disease, lamellar ichthyosis (Huber et al. 1995; Russell et al. 1995).

To investigate in detail the nature of apparent genotype/phenotype correlations among the 19 human genes described above, we began by searching for additional SNPs in their promoter regions, exons, and introns (except for repetitive elements) and report here a total of 680 genetic variations, of which 405 had not previously been reported.

Subjects and methods

After informed consent was obtained from each participant, total genomic DNA was isolated from peripheral leukocytes of 48 unrelated Japanese individuals by the standard phenol/chloroform extraction method. On the basis of sequence information in GenBank, we designed polymerase chain reaction (PCR) primers to amplify DNA from all 19 genes in their entirety, except that repetitive elements were excluded by using the RepeatMasker computer program (http://ftp.genome.washington.edu/cgi-bin/RepeatMasker). PCR experiments and DNA sequencing were performed according to methods described previously (Iida et al. 2001; Saito et al. 2001; Sekine et al. 2001). All SNPs detected by the PolyPhred computer program (Nickerson et al. 1997) were confirmed by sequencing both strands of each PCR product.

Results

We defined exon-intron boundaries within each of the 19 genes examined by comparing genomic sequences with cDNA sequences. The accession numbers of the genomic sequences and the cDNA sequences used for this study are listed in Table 1. We screened 96 Japanese chromosomes for SNPs in eight CYP genes, nine esterase genes, GGT1, and TGM1 by direct DNA sequencing. The re-sequencing of a total of about 342 kb of genomic DNA (153.7 kb for the CYP genes, 171.6 kb for the esterase genes, and 16.3 kb for the other two genes) identified 607 SNPs (284 in CYP genes, 302 in esterase genes, and 21 in the other two genes) and 73 insertion/deletion polymorphisms (35 in CYP genes and 38 in esterase genes; Table 2). Among the 680 genetic variations identified in our screening, including insertion/deletion polymorphisms, 405 (60%) had not been reported previously.
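The totals in the preceding paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the dictionary layout and variable names are assumptions introduced here, and the figures are simply copied from the text. The zero entered for insertion/deletion polymorphisms in GGT1 and TGM1 is inferred from the statement that all 73 fall in the CYP and esterase genes.

```python
# Figures copied from the Results text; the data layout is an illustrative assumption.
counts = {
    "CYP genes":      {"kb": 153.7, "snps": 284, "indels": 35},
    "esterase genes": {"kb": 171.6, "snps": 302, "indels": 38},
    # No insertion/deletion polymorphisms are reported for these two genes
    # (35 + 38 already accounts for all 73).
    "GGT1 and TGM1":  {"kb": 16.3,  "snps": 21,  "indels": 0},
}

total_kb = sum(group["kb"] for group in counts.values())
total_snps = sum(group["snps"] for group in counts.values())
total_indels = sum(group["indels"] for group in counts.values())
total_variations = total_snps + total_indels

novel = 405  # variations not previously reported, as stated in the text
novel_fraction = novel / total_variations

print(f"{total_kb:.1f} kb re-sequenced")
print(f"{total_snps} SNPs + {total_indels} insertion/deletion polymorphisms "
      f"= {total_variations} variations")
print(f"novel fraction: {novel_fraction:.0%}")
```

Run as written, the sums agree with the reported figures: 607 SNPs, 73 insertion/deletion polymorphisms, 680 variations in total, and a novel fraction that rounds to 60%.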

Table 1. Accession numbers for the genomic and cDNA sequences used in this study
Table 2. Summary of genetic variations in 19 genes (SNP single-nucleotide polymorphism)

CYP genes

Figure 1 illustrates the location of each variation among the CYP genes. Detailed information about nucleotide positions and substitutions is summarized in Table 3; the numbers of SNPs are summarized in Table 4. Among the 284 SNPs found in CYP genes, 13 were located in 5' flanking regions, 231 in introns, 33 in exons, and seven in 3' flanking regions. Among the SNPs detected in exons, 23 were located in coding regions and ten were in 3'UTRs. Among the former, 15 would cause substitution of an amino acid, and seven of those were novel. Of the eight SNPs that were synonymous, three were novel (Table 5).

Fig. 1a–h. Locations of single-nucleotide polymorphisms (SNPs), shown as vertical lines, in the CYP2A6 (a), CYP2A13 (b), CYP2B6 (c), CYP2E (d), CYP2S1 (e), TBXAS1 (f), CYP7A1 (g), and CYP7B1 (h) genes. Open boxes: exons; hatching: unsequenced regions of repetitive elements; ATG: initiation codon; TGA, TAG: stop codons

Table 3. Summary of genetic variations detected in the CYP2A6 gene, CYP2A13 gene, CYP2B6 gene, CYP2E gene, CYP2S1 gene, TBXAS1 gene, CYP7A1 gene, and CYP7B1 gene (CYP2A6 Cytochrome P450, subfamily IIA, polypeptide 6, CYP2A13 Cytochrome P450, subfamily IIA, polypeptide 13, CYP2B6 Cytochrome P450, subfamily IIB, polypeptide 6, CYP2E Cytochrome P450, subfamily IIE, CYP2S1 Cytochrome P450, subfamily IIS, polypeptide 1, TBXAS1 Thromboxane A synthase 1, CYP7A1 Cytochrome P450, subfamily VIIA, polypeptide 1, CYP7B1 Cytochrome P450, subfamily VIIB, polypeptide 1, NCBI National Center for Biotechnology Information, UTR untranslated region, del deletion, ins insertion)
Table 4. Number and regions of SNPs detected in 19 genes (SNP single-nucleotide polymorphism, UTR untranslated region)
Table 5. Novel SNPs detected in exons of 19 genes (SNP single-nucleotide polymorphism, UTR untranslated region)

Esterase genes

Figure 2 illustrates the location of each variation found among the esterase genes examined; detailed information regarding nucleotide positions and substitutions is summarized in Table 6. Among the 302 SNPs, 21 were located in 5' flanking regions, 252 in introns, 17 in exons, and 12 in 3' flanking regions. Of the 17 SNPs detected in exons, one was located in a 5'UTR, ten were in coding regions, and six were in 3'UTRs. Among the SNPs detected in coding regions, five would cause substitution of an amino acid, and two of those were novel. Among the five synonymous SNPs, three were novel (Table 5).

Fig. 2a–i. Locations of single-nucleotide polymorphisms (SNPs), shown as vertical lines, in the AADAC (a), CEL (b), CES1 (c), CES2 (d), ESD (e), GZMA (f), GZMB (g), IL17 (h), and UCHL3 (i) genes. Open boxes: exons; hatching: regions of repetitive elements; ATG: initiation codon; TGA, TAG, TAA: stop codons

Table 6. Summary of genetic variations detected in the AADAC gene, CEL gene, CES1 gene, CES2 gene, ESD gene, GZMA gene, GZMB gene, IL17 gene, UCHL3 gene (AADAC arylacetamide deacetylase, CEL carboxyl-ester lipase, CES1 carboxylesterase 1, CES2 carboxylesterase 2, ESD esterase D, GZMA granzyme A, GZMB granzyme B, IL17 interleukin 17, UCHL3 ubiquitin carboxyl-terminal esterase L3, NCBI National Center for Biotechnology Information)

Other genes

Figure 3 illustrates the location of each variation found in the GGT1 and TGM1 genes; detailed information regarding nucleotide positions and substitutions is summarized in Table 7. Among the 21 SNPs, three were located in 5' flanking regions, 13 in introns, and five in exons; three of these five were located in coding regions and the other two in 3'UTRs. Of the three SNPs detected in coding regions, one would cause the substitution of an amino acid, and the other two were synonymous. All three were novel (Table 5).
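The regional breakdowns reported for the three gene groups can be checked for internal consistency: the regional counts should sum to each group's SNP total, the exonic subdivisions to the exon count, and the non-synonymous and synonymous counts to the coding-region count. The following Python sketch does this with the numbers given in the text; the nested-dictionary layout and the helper name `is_consistent` are illustrative assumptions, not part of the study.

```python
# SNP counts copied from the Results text. The nested-dictionary layout and the
# helper name `is_consistent` are illustrative assumptions.
groups = {
    "CYP genes": {
        "total": 284,
        "regions": {"5' flanking": 13, "intron": 231, "exon": 33, "3' flanking": 7},
        "exon_parts": {"coding": 23, "3'UTR": 10},
        "coding_parts": {"non-synonymous": 15, "synonymous": 8},
    },
    "esterase genes": {
        "total": 302,
        "regions": {"5' flanking": 21, "intron": 252, "exon": 17, "3' flanking": 12},
        "exon_parts": {"5'UTR": 1, "coding": 10, "3'UTR": 6},
        "coding_parts": {"non-synonymous": 5, "synonymous": 5},
    },
    "GGT1 and TGM1": {
        "total": 21,
        "regions": {"5' flanking": 3, "intron": 13, "exon": 5},
        "exon_parts": {"coding": 3, "3'UTR": 2},
        "coding_parts": {"non-synonymous": 1, "synonymous": 2},
    },
}


def is_consistent(group: dict) -> bool:
    """Return True if every level of the breakdown sums to its parent count."""
    regions_ok = sum(group["regions"].values()) == group["total"]
    exons_ok = sum(group["exon_parts"].values()) == group["regions"]["exon"]
    coding_ok = sum(group["coding_parts"].values()) == group["exon_parts"]["coding"]
    return regions_ok and exons_ok and coding_ok


for name, group in groups.items():
    print(name, "consistent" if is_consistent(group) else "inconsistent")
```

With the counts as reported, all three groups pass the check, and the group totals add up to the 607 SNPs given for the study as a whole.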

Fig. 3a, b. Locations of single-nucleotide polymorphisms (SNPs), shown as vertical lines, in the GGT1 (a) and TGM1 (b) genes. Open boxes: exons; hatching: regions of repetitive elements; ATG: initiation codon; TGA, TAG: stop codons

Table 7. Summary of genetic variations detected in the GGT1 gene and TGM1 gene (GGT1 gamma-glutamyltransferase 1, TGM1 Transglutaminase 1, NCBI National Center for Biotechnology Information)

Discussion

We identified a total of 680 genetic variations (607 SNPs and 73 insertion/deletion polymorphisms) among the 19 enzyme-encoding genes selected for this study by screening DNA from 48 unrelated Japanese individuals across the entire relevant genomic regions, excluding repetitive sequences. The genes examined included eight cytochrome P450 (CYP) genes and nine esterase genes, plus two others. All data for the genetic variations reported here are available on our website (http://snp.ims.u-tokyo.ac.jp/).

CYP enzymes play central roles in the oxidative metabolism of numerous endogenous substrates, such as steroid hormones, and of xenobiotics, including various carcinogens and toxins (Ding and Kaminsky 2002). Among the CYP genes examined here, other investigators have previously detected 27 polymorphisms that would affect amino acid sequences [ten in CYP2A6, six in CYP2B6, three in CYP2E, and eight in TBXAS1; Human Cytochrome P450 (CYP) Allele Nomenclature Committee, http://www.imm.ki.se/CYPalleles/]. Zhang et al. (2002) have detected an additional SNP (Arg257Cys) in the coding region of CYP2A13. However, of the 28 polymorphisms reported previously, we have found only six in our Japanese population sample (Ile471Thr in CYP2A6; Arg257Cys in CYP2A13; Arg22Cys, Gln172His, and Arg487Cys in CYP2B6; Glu450Lys in TBXAS1). On the other hand, we have found seven novel non-synonymous substitutions (one in CYP2A6, two in CYP2A13, two in TBXAS1, and two in CYP7A1; Table 5). Our results should contribute to a better understanding of ethnic differences in drug responses or possible correlations between genotypes and phenotypes of disease susceptibility.

The promoter region of the CYP7A1 gene contains a potential binding site for a hepatic-specific transcription factor, CPF (CYP7A1 promoter binding factor; Nitta et al. 1999). Although mutation of the CPF site abolishes hepatic-specific expression of the gene in transient transfection assays (Nitta et al. 1999), we have failed to find any variant, including insertion/deletion polymorphisms, in the CPF-binding region among the 96 Japanese chromosomes examined.
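When interpreting the absence of a variant in a sample of this size, it can help to estimate how likely the variant was to be observed at all. The sketch below is a minimal illustration, not an analysis performed in this study. It assumes that each of the 96 chromosomes is an independent draw from the population and that a variant has allele frequency p, so that the probability of seeing it at least once is 1 − (1 − p)^96. The function name `detection_probability` and the example frequencies are hypothetical.

```python
def detection_probability(allele_freq: float, n_chromosomes: int = 96) -> float:
    """Probability of observing an allele at least once among n_chromosomes,
    assuming independent sampling (an illustrative simplification)."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie between 0 and 1")
    # Complement of the probability that every sampled chromosome lacks the allele.
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


# Example allele frequencies chosen purely for illustration.
for freq in (0.01, 0.05, 0.10):
    print(f"allele frequency {freq:.2f}: "
          f"P(detected) = {detection_probability(freq):.3f}")
```

Under these assumptions, a variant with a 1% allele frequency would be seen in roughly 62% of samples of 96 chromosomes, whereas one at 5% would be seen in more than 99%. Failure to detect a previously reported variant is therefore compatible with its being rare in the population sampled.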

Although we have found 302 SNPs among nine esterase genes, only three represent novel changes that would cause substitutions of amino acids (Table 5). In the AADAC, CES1, CES2, and UCHL3 genes, other research groups have determined presumed active-site residues (Johnston et al. 1997; Pindel et al. 1997; Humerickhouse et al. 2000; Trickett et al. 2001); however, we have found no variations in these regions. As the promoter region of the AADAC gene contains a potential response element for aryl hydrocarbons, which could allow the induction of the gene in response to xenobiotics (Trickett et al. 2001), polymorphisms in the 5' flanking region should be investigated intensively.