Introduction

Quantitative differences in gene expression provide a significant source of variation in natural populations, thereby representing an important substrate for evolution and accounting for a considerable fraction of phenotypic differences. The abundance of a gene transcript could be directly affected by polymorphism in regulatory elements. Therefore, the analysis of DNA sequence variations, related to variations in gene expression, results in a better understanding of transcriptional regulation. Variants that affect gene expression (expression quantitative trait loci, eQTL) may have a substantial impact on quantitative traits, whereas qualitative traits, including hereditary disease, may more often depend on structural variation (Farrall, 2004; Yan and Zhou, 2004). A literature survey revealed that of 107 genes with experimentally verified functional cis regulatory polymorphisms 82% had proven effects on phenotypes (Rockman and Wray, 2002). In human, more than 100 cis regulatory polymorphisms were associated with phenotypic variation (Wray, 2007). Moreover, in livestock current findings of quantitative trait nucleotides suggest high importance of cis regulatory polymorphisms. Transcript levels can be thought of as a quantitative phenotype that can be subjected to statistical genetic analyses in an effort to identify genomic regions that are likely to contain at least one causal gene with regulatory effects on the expression level, termed expression QTL or eQTL (Jansen and Nap, 2001; Schadt et al., 2003; Yvert et al., 2003). Using microarray technology, the abundance of many transcripts can be assessed simultaneously. However, the high cost of microarrays limits the sample size for global genetic mapping of transcript abundance. To overcome this limitation, several methods based on the use of selective phenotyping/profiling strategies, maximum recombination or other selection strategies (Jin et al., 2004; Nettleton and Wang, 2006; Cardoso et al., 2008) have been proposed. Most of these studies were performed using statistical simulations. Many eQTL studies in human or mice were limited in sample size (57–195 individuals) (Schadt et al., 2003, 2005; Morley et al., 2004; Cheung et al., 2005; Stranger et al., 2005), including our previous eQTL study using the microarray expression profiles of 74 F2 animals. We propose combining eQTL analyses based on microarrays and on real-time RT-PCR in a two-step procedure with increasing numbers of samples to deal with the limitations of microarrays (high costs and limited sensitivity) in eQTL studies. We report experimental data from eQTL analyses obtained by quantitative real-time RT-PCR on 276 F2 animals and compare this to our previous eQTL study based on microarrays. The relatively large sample size of this study provides the means to assess the relationship between genetic variants and gene expression with improved statistical power. The candidate genes were selected to include genes with cis- and trans-acting eQTL (according to the microarray expression profiling study) and genes that were differentially expressed in M. longissimus dorsi samples that differed in meat waterholding capacity (measured as so-called drip loss) and in pH (Ponsuksili et al., 2008a,). We demonstrate that a global microarray eQTL analysis of a limited number of samples can be used for exploring functional and regulatory gene networks and scanning for cis eQTL. Subsequent analysis of a subset of likely cis-regulated genes by real-time RT-PCR in a larger number of samples serves for verification and narrowing down the QTL regions by targeting these positional candidate genes. Additionally, polymorphisms of cis-acting genes (SNPs) were identified and, when used as fixed effects for eQTL analysis, confirmed the impact of the respective polymorphic genes.

Materials and methods

Animals and tissue collection

In this study, a total of 276 animals obtained from the F2 generation of an experimental cross between the commercial breeds Duroc and Pietrain (DuPi population) were used (Liu et al., 2007, 2008). The grandparental purebred F0 animals were reciprocally mated, and 32 F1 animals were used to produce the F2 generation. Here, the F2 progeny that were used formed 20 full-sib families.

Microarray analysis

The expression patterns of M. longissimus dorsi samples taken between the 13th and 14th ribs of 74 F2 animals were assessed using Affymetrix GeneChip Porcine Genome Arrays according to the Affymetrix protocols as previously described (Ponsuksili et al., 2008a, 2008b). In brief, 5 μg of total RNA was used to prepare antisense biotinylated RNA target products. Hybridization, washing and scanning were performed using Affymetrix equipment (Hybridization Oven 645, Fluidics Station 450, Scanner 3000 7G; Affymetrix UK Ltd, High Wycombe, UK). The quality of hybridization was assessed in all samples following the manufacturer's recommendations. Data were analysed with Affymetrix GCOS 1.1.1 software using global scaling to a target signal of 500 and were processed with MAS5.0 to generate cell intensity files (present or absent). Quantitative expression levels for the present transcripts were estimated using PLIER (probe logarithmic intensity error) for normalization purposes, as implemented in the Array Assist software (Stratagene Europe, Amsterdam, the Netherlands). The microarray data related to all samples were deposited in the Gene Expression Omnibus (GEO (http://www.ncbi.nlm.nih.gov/geo/)) public repository (GEO accession number: GSE10204).

Quantitative real-time RT-PCR

For real-time RT–PCR analysis, 13 genes were selected to represent candidate genes to be further analysed for possible effects on meat quality (Srikanchai et al., 2009). The criteria used to select the genes included their differential gene expression depending on meat quality traits (Ponsuksili et al., 2008b) and the involvement of both cis- and trans-acting eQTL (Ponsuksili et al., 2008a). Four of the selected genes (VTN, unknown transcript probe set no. SSC.25503.1.S1.at, CYP3A and LYZ) were differentially expressed between high and low drip loss in our previous study (Ponsuksili et al., 2008b). Two genes showed a differential expression between high and low pH at 24 h post mortem (LAMA4, KERA). The chromosomal positions of the genes are shown in Table 1, which were used to distinguish cis or trans-acting eQTL. The mapping information in the linkage group was accessed through PigQTLdb (http://www.animalgenome.org/QTLdb/pig.html). The rest of the genes (AHNAK, SLC3A2, MAP4K4, COQ9, ANKRD13, ZDHHC5, CS) exhibited trait correlated expression and possessed cis- or trans-acting eQTL according to microarray expression profiling (Ponsuksili et al., 2008a). To normalize the variation in RT-PCR efficiency and initial RNA input, RPL32 and HPRT were used as internal standards. Both of the reference genes were first checked for lack of variation and eQTL, on the basis of microarray data. All 15 transcripts were quantified by real-time RT-PCR using the iCycler apparatus (Bio-Rad Laboratories GmbH, Munich, Germany) and iQ SYBR Green Supermix (Bio-Rad). Real-time RT-PCR was performed in duplicate using 276 animals from full-sib families including 56 individual animals that were previously used for microarray analysis. RNA was isolated as described above. In all, 2 μg of RNA was reverse transcribed to cDNA using SuperScript III MMLV reverse transcriptase (Invitrogen, Karlsruhe, Germany) in a reaction containing 500 ng oligo (dT)11VN primer and 500 ng random hexamer primer according to the manufacturer's protocol. Templates were amplified, using the gene-specific primers (Supplementary Table 1), by 45 cycles of 95 °C for 15 s (denaturation), 60 °C for 30 s (annealing), and 72 °C for 30 s (extension) preceded by an initial denaturation step of 95 °C for 10 min as a universal thermal cycling parameter. On the basis of the analysis of PCR product melting curves, a high-temperature fluorescence acquisition point was estimated and included in the amplification cycle programme. For all assays, a standard curve was generated by amplifying serial dilutions of specific PCR products. After completing the qPCR melting curve analysis, agarose gel electrophoresis was performed to confirm the specificity of the amplification. Normalization of variation in RT-PCR efficiency and initial RNA input was performed using the RPL32 and HPRT genes as internal standards by dividing the calculated mRNA copy numbers by a mean normalization factor derived from the expression of the reference genes (relative abundance).

Table 1 Genes used for RT-PCR and eQTL analysis

Linkage analysis

The eQTL estimation was performed using the F2 option of QTLexpress (Seaton et al., 2002). The model included full-sib family and sex as fixed effects and carcass weight (kg) and age at slaughter (days) as covariates. The genotypes of 106 microsatellite markers were analysed. The linkage maps, according to Liu et al. (2007), were constructed using CRIMAP version 2.4 (Green et al., 1990). The single QTL model analysis with additive and dominance effects was fitted at 1-cM intervals along every single chromosome. Chromosome-wide and genome-wide significance levels were estimated by permutation tests using 5000 permutations (Churchill and Doerge, 1994). The 5% chromosome-wide threshold corresponds approximately to the suggestive linkage threshold proposed by Lander and Kruglyak (1995). Average significance thresholds were 4.94 and 5.20 for relative transcript abundance and microarray expression levels, respectively, corresponding to LOD 2.0 and 1.9. The fraction of the phenotypic variance indicated by the QTL was defined as the percentage reduction of the residual variance resulting from inclusion of the QTL in the model (Knott et al., 1996).

Genotyping and association study

Polymorphisms were identified in the genes AHNAK, SLC3A2, ZDHHC5, MAP4K4, COQ9 and CS and in the unknown transcript SSC.25503.1.S1.at by comparative sequencing of PCR fragments in an SNP discovery panel of commercial breed animals (German Landrace, German Large White, Pietrain, Duroc). The primer sequences and the method of genotyping are shown in Supplementary Table 2. The associations between transcript abundance and their polymorphisms were calculated using family, sex and genotype as fixed effects and carcass weight (kg) and age at slaughter (d) as covariates. Furthermore, the SNPs of these genes were fitted as fixed effects in the eQTL analysis.

Results

eQTL analysis based on quantitative RT-PCR

Expression levels for the 13 genes were measured by quantitative RT-PCR in 276 animals from the F2 generation of a cross between the commercial breeds Duroc and Pietrain (Liu et al., 2007, 2008). For the quantification of transcripts by real-time RT-PCR, the selected genes were functional candidate genes for meat or carcass quality traits that showed trait-associated expression. Compared to the microarray-based eQTL analysis, a greater sample size was used for real-time RT-PCR to assess the relationship between genetic variation and gene expression with improved statistical power.

Using the relative abundance of transcripts, a total of 16 eQTL for the expression level of 11 genes were mapped with LOD scores of at least 2, that is, the 5% chromosome-wide significance level. There were no eQTL for LYZ and KERA. Plots of F-values obtained for the relative abundance of each transcript across all chromosomes are shown in Figure 1. Five eQTL were mapped to the map position of the corresponding gene (cis-acting) with relatively high LOD scores (2.4–15.5) (Table 2), whereas one eQTL was mapped on the same chromosome as the gene itself, although in a more distant position (COQ9). The expression level of ZDHHC5 was mapped with the highest LOD score, having a value of 15.5. The eQTL for the expression level of AHNAK was mapped (LOD score 3.26) in the same vicinity. The eQTL for ZDHHC5 and AHNAK were assigned as cis-acting on SSC2. These eQTL explained 26.05 and 6.16% of the expression variation for ZDHHC5 and AHNAK, respectively. The eQTL for the transcript abundance of MAP4K4 was assigned to its own mapping interval on SSC3 (LOD score 2.48). On SSC5 a cis-acting eQTL of CS (LOD score 2.41) was found. The transcript SSC.25503.1.S1.at had a cis-acting eQTL located on SSC7 (LOD score 2.45). The eQTL for the expression level of COQ9 was mapped distal on SSC6, whereas COQ9 mapped in the neighbouring marker interval proximal on SSC6.

Figure 1
figure 1

F-ratio obtained for the relative abundance of each transcript, plotted at 1 cM distances along all chromosomes. x axis—boundaries of chromosome, SSC1-18; y axis—F-ratio from interval mapping.

Table 2 Details of eQTL detected based on relative transcript abundance of 11 genes obtained by real-time RT-PCR (n=276)

Consistency in eQTL between microarray data and quantitative RT-PCR

To examine whether the evidence for eQTL was consistent between the different methods, eQTL data from microarray experiments and quantitative RT-PCR were compared. In total, an analysis of the microarray data revealed 21 eQTL from 12 genes with LOD scores that were higher than 1.9, corresponding to 5% chromosome-wide significance. There was no eQTL for LYZ. The percent variance of the expression level given by eQTL analysis when using microarrays was higher than the corresponding quantitative RT-PCR value.

Every trans-acting eQTL that was found using the 74 porcine Affymetrix chips was different from the trans-acting eQTL found from the 276 quantitative RT-PCR samples (Tables 2, 3). Linkage analysis using either microarray data or real-time RT-PCR data from the 13 genes revealed a total of seven eQTL mapping in close vicinity to the corresponding gene, that is, cis-acting eQTL. For AHNAK, ZDHHC5 and SSC.25503.1.S1.at the eQTL F-values reached chromosome-wide significance within overlapping intervals, whereas MAP4K4 and COQ9 consistently showed eQTL on the same chromosome, although in more distant positions. The eQTL for SLC3A2 only reached significance with the analysis using micorarray data, yet the shapes of the F-value plots derived from real-time RT-PCR were similar, with a peak at the same position on SSC2. Only real-time RT-PCR revealed chromosome-wide significant eQTL for CS at its own mapping position.

Table 3 Details of eQTL detected based on expression levels of 12 genes obtained by microarray analysis (n=74)

Effects of SNPs at cis-acting eQTL and association with their transcript abundance

All genes with cis-acting eQTL or eQTL that were mapped on their own chromosome according to real-time RT-PCR analysis were used to identify SNPs. Owing to a lack of sequence information for the promoter regions, the 5′- and 3′-UTRs, and the coding regions of the genes were used for SNP identification and for further genotyping. The detected SNPs were subsequently used as fixed effects in the eQTL analysis. In total, five SNPs from AHNAK and one SNP each for ZDHHC51, CS, MAP4K4, SSC.25503.1.S1.at, and COQ9 were identified. The eQTL effect for AHNAK transcript abundance remained significant when three of the five SNPs were included, whereas the other two SNPs had a significant influence on the QTL effect. No eQTL for the AHNAK transcript level could be detected if either of these two SNPs or the haplotype of all five SNPs were used as fixed effects in the model to estimate eQTL. The expression level of ZDHHC5, which was mapped to SSC2, was maintained after its own SNP was used as a fixed effect. The SNPs of COQ9, CS and MAP4K4 cause a reduction in the F-values at the eQTL peaks (Figure 2). An association between the haplotype or genotype of AHNAK, COQ9 and CS and their transcript abundances was found with P<0.001, 0.01 and 0.05, respectively.

Figure 2
figure 2

F-ratio plots of eQTL analyses for six genes using (grey) and not using (black) genotypes of the respective genes as fixed effects. x axis—map position in cM; y axis—F-ratio from interval mapping. Physical location of the gene is indicated by an arrowhead.

Discussion

Microarray technology that allows the simultaneous assessment of the expression of all known transcripts throughout the genome has recently become available. However, the high cost of microarrays limits the sample size for global genetic mapping of transcript abundance. Many investigations have reported the successful mapping of eQTL in rats, mice or humans with different sample sizes (Morley et al., 2004; Doss et al., 2005; Hubner et al., 2005; MacLaren and Sikela, 2005; Stranger et al., 2005). Mapping of eQTL to the specific gene indicates that cis changes are responsible for the different expression levels, whereas mapping positions of eQTL that are different from the positions of the corresponding genes indicate trans regulation. These data allow the derivation of regulatory networks for genes. However, to link an eQTL to the genetic background of a classical phenotypic trait of interest, it is necessary to establish a relationship between the variation of that classical phenotypic trait and its corresponding ‘pQTL’ (pheneQTL) position on one hand and either the expression level of that particular transcript or the mapping position of the eQTL on the other hand. Genes that are differentially expressed at variable levels of a trait of interest, where eQTL mapping indicates cis regulation for the genes, are more likely to represent the ‘cause,’ that is, the genetic background of the trait of interest, whereas the differential expression of genes showing trans regulation may represent the ‘effect,’ that is, pathways that are affected by causal variation. There are numerous publications highlighting the possible merits of ‘genetical genomics’ applications in livestock breeding (Pomp et al., 2004; Walsh and Henderson, 2004; Kadarmideen et al., 2006; Kadarmideen and Reverter, 2007) and/or addressing issues of data evaluation and experimental design in silico (Perez-Enciso et al., 2003, 2004, 2007; De Koning and Haley, 2005; De Koning et al., 2007; Cardoso et al., 2008). Experimental data from genetic investigations of gene expression in farm animals are still limited. For the first time, results from an eQTL study performed with 74 animals of a F2 pig population, with a focus on drip loss in meat, were presented by our group (Ponsuksili et al., 2008a). Here, we present results from genetic analyses of the transcriptional level in M. longissimus dorsi muscle in a much larger sample of individuals using RT-PCR data along with a comparison to microarray data. Although a global microarray analysis allows the exploration of functional and regulatory gene networks reflecting interactions at the level of DNA, RNA or proteins, an analysis of a subset of genes is relevant to narrow down a QTL region through the targeting positional candidate genes.

The expression trait can be inferred from the proximity of the gene physical location to the eQTL. Moreover, the potential to distinguish cis and trans eQTL, and thus the number of either kind of eQTL, depends on the resolution of the map. Choices of cis-window sizes vary across eQTL mapping studies, ranging from 10 kb in yeast (Brem et al., 2002) to 20 Mb in mouse (Bystrykh et al., 2005). In this study, the criterion for a predicted cis-acting eQTL was that the interval of microsatellite markers flanking the genes on the genetic maps overlaps with the interval flanked by the cross points of the F-ratio plot for a particular eQTL, with the corresponding chromosome-wide 5% significance level. This relaxed measure was chosen to avoid the exclusion of any true cis-acting eQTL that may have been mapped non-precisely because of the low resolution of the genetic map.

The analysis of eQTL for transcripts with expression levels that correlate with the low-heritable trait of drip loss revealed a proportion of trans eQTL that is substantially higher than that in other studies of humans and model organisms with no focus on low-heritable traits (Ponsuksili et al., 2008a). This is confirmed here, where only one out of six genes, which were differentially expressed at high vs low drip loss (VTN, SSC.25503.1.S1.at, CYP3A, LYZ) and pH at 24 h post mortem (LAMA4 and KERA), showed a cis-acting eQTL (SSC.25503.1.S1.at). This suggests that the low heritability (that is, the high dependency on non-genetic factors) of some traits coincides with the fact that they are under the control of several trans-regulated genes, that is, genes that are controlled by other genes. These genes represent members of a regulatory network that might contribute to the flexibility of the organism in regards to expressing low-heritable traits in response to different stimuli. Moreover, it was shown that the expression level of the trans-regulated genes was less heritable than that of cis-regulated genes (Petretto et al., 2006). Overall, this implies that the expression of genes associated with traits that have low heritability, like drip loss or pH, is largely dependent on trans regulation and is low in heritability. As we observed in the previous study, a number of differentially expressed genes with extremely high or low drip loss or pH phenotypes belong to the functional group ‘response to stimuli’ (Ponsuksili et al., 2008b). The trait-associated expression levels of these genes may be an effect of the external stimuli rather than a cause of phenotypic variation in the target trait. Variations in proteins or RNAs that interact with the cis-regulatory sequences of these genes (trans-acting effects) may lead to alterations in their expression that are not because of genetic polymorphisms within the regulatory sequences, such as promoters (cis-acting). Expression QTL mapping, with its potential to categorize cis and trans effects, provides the means to discriminate between ‘effect’ and ‘cause’ with respect to trait-associated differential expression.

Despite numerous differences between microarray data and quantitative RT-PCR data, the designation of transcripts as cis regulated was relatively consistent. This finding suggests that cis regulation is a stable characteristic of individual transcripts, which is detectable, independent of the number of animals or environmental influences. In contrast, for trans eQTL the analysis of a high number of individuals by real-time RT-PCR and a more limited number of individual samples by microarrays did not show any common results. The finding may be explained by the presence of type 1 or type 2 errors, which may occur with any of the approaches because of the small number of individuals, limited mapping resolution and/or small effects from the underlying eQTL. Petretto et al. (2006) reported that trans eQTL generally have effects that are smaller in size and have higher false discovery rates than cis eQTL. In our study, estimates of the effects of eQTL detected by microarrays were much larger than those of eQTL derived from real-time RT-PCR. A bias exists, when estimating QTL effects, towards an overestimation of the phenotypic variances associated with the QTL at lower numbers of individuals (Beavis effect) (Beavis, 1998; Xu, 2003). An important stipulation in expression genetics is that the size of the mapping panel is an important determinant of success. The number of animals in a study has a significant role in the identification of trans effects, but it has a relatively small effect on the discovery of cis eQTL. The observation that cis eQTL was more consistently detected than trans eQTL was also made by comparing the results from different studies of eQTL with different numbers of animals and different tissue types (Brem et al., 2002; Yvert et al., 2003; Morley et al., 2004; Göring et al., 2007). Göring et al. (2007) found that none of their trans eQTL could be confirmed between their study and a previous study of Morley et al. (2004).

We also attempted to identify SNPs in genes that could give rise to strong cis eQTL in a porcine population. Typically, cis-acting eQTL are considered to include SNPs that are within 100 kb upstream and downstream of the gene that is affected by that eQTL. On the basis of earlier work involving human genes with promoter polymorphisms, it has been estimated that a sizable fraction of cis-variants (about one-third of the SNPs in promoters) may alter gene expression (Hoogendoorn et al., 2003; Buckland, 2004). One study recently reported on the investigation of cis-regulatory variations in 3′-UTRs for a set of genes showing cis-acting regulation in a panel of mouse recombinant congenic strains (Lee et al., 2006). In farm animals, annotation of the promoter regions for most genes is incomplete. Accordingly, our SNPs are located in the coding region or at the 5′- or 3′-UTR. It is not likely that these SNPs are responsible for differential expressions, that is, causal SNPs. However, in the F2 population, regions of linkage disequilibrium are often large and thus SNPs may serve as linkage disequilibrium markers for any causal SNP. SNPs, in genes that give rise to strong cis eQTL in a F2 population, have a role in disrupting eQTL when they are used as fixed effects (except for the SNP in ZDHHC5). In our study, one haplotype of the AHNAK gene and two other single SNPs are associated with the variation in their expression in the F2 population. The identification of causative polymorphisms and the determination of their functional roles are even more challenging because there are several different molecular mechanisms through which mRNA levels in cells can be regulated. Such putative candidate polymorphisms that affect the transcription of a given gene may be prioritized for experimental validation, and hypotheses can be generated for the possible biological roles of candidate regulatory SNPs based on an examination of the data.

In summary, this is to our knowledge the first report to address the relative sensitivity of the detection of cis and trans eQTL effects, based on the sample size and the methods used to detect expression levels. Sample size has a significant role with regard to statistical power, not only in the detection of eQTL but also in the determination of the kind of eQTL (cis, trans effects), as shown in this study. Cis regulation was more consistent than trans eQTL. The expression level of genes associated with low-heritable traits is predominantly controlled by trans eQTL.