1. Introduction
The domestic chicken is the most popular and widely distributed domestic fowl worldwide [
1] and hence it is a stable source of protein, including meat and eggs [
2]. In contrast to the advances made by high-throughput nuclear genotype analyses in chicken [
3,
4], the insights into mitochondrial (mt) genome diversity are rather sparse. For many animals it is known that mt genome diversity can have a remarkable impact on a variety of traits, such as meat quality in pigs [
5] and the metabolic capacity of dairy cows [
6]. In chickens, previous studies have shown that mutations in the mitochondrial genome can have strong physiological effects. This includes the adaptation to high altitudes of the Tibetan chicken [
7,
8] and economically important traits such as egg quality and body weight [
9]. Even though the whole mitochondrial genome has been used in many studies of domesticated animals such as goats [
10], horses [
11] and pigs [
12], most studies of the chicken relied on partial mitochondrial sequences or the protein coding sequences [
1] and thus, studies covering the whole mt genome are underrepresented. Given the importance of the genetic background on organismic physiological performance, beside the nuclear genome, the in-depth study of mitochondrial genomes and their variation are of utmost interest.
Mitochondria are commonly known as the powerhouse of the cell since they contribute to the energy metabolism by generating ATP [
13]. During the process of oxidative phosphorylation (OXPHOS) they are a major producer of reactive oxygen species (ROS) and thus contribute to oxidative stress [
14]. Therefore, the mitochondrial energy metabolism is directly linked to the availability of phosphorus (P).
Additional to their role as the cellular energy supplier, mitochondria contribute to several other key cellular processes such as the biosynthesis of important macromolecules [
13], calcium (Ca) homeostasis [
15] and ageing [
16] that have been mainly studied within genetic model organisms such as fruit flies or mice [
17,
18].
Mitochondria carry their own genome, which in chickens is approximately 16 kb long and contains 13 protein coding genes, 22 tRNA genes, two rRNA genes and one regulatory control region [
19]. Due to their maternal inheritance, simple molecular structure [
1] and relative high copy number [
20] they traditionally serve as popular genetic markers in molecular biology [
21].
As all living organisms, chickens need P for their growth, health and energy metabolism [
22]. For chickens, P in plant-based feedstuff is mainly present as phytic-acid (InsP
6) [
23,
24], which consists of six phosphate groups and one
myo-inositol ring. While passing through the digestive tract, stepwise InsP
6 dephosphorylation results in the release of phosphate and
myo-inositol [
25]. However, the ability of poultry to degrade InsP
6 is limited. Feed supplements produced from rock phosphates maintain the P supply. The availability of rock phosphate is finite [
26], which makes reducing its use one of the major challenges in food production.
To improve P utilization efficiency, we need to understand the genetic and nongenetic background for the processes influencing the formation of inositol derivates during InsP
6 degradation and their relevance for P utilization. It is furthermore important to understand how these processes change over time, since the P demand changes during a chicken’s lifetime, depending on factors such as growth or egg laying [
27].
We aim to understand the mitochondrial genetic variation in laying hens and hypothesize that different haplotypes lead to physiologically distinct phenotypes. To study this impact, we obtained whole mitochondrial genomes from a large experimental setup of laying hens that had the abovementioned aspects, P and Ca utilization and phytate degradation in focus (Sommerfeld 2020a, 2020b [
27,
28]). We analyse the body weight and feed intake and hypothesize that the performance of individuals with different mitochondrial haplotypes differs within and between production periods according to the animals’ needs. Thus, we can gain insights into the connection between the genetic variation of mitochondrial genomes and their physiological effects and contribute to a better understanding of these effects in laying hens. We enriched our data by nuclear genetic information of a subset of individuals that enable us to gain important insights into the nuclear genetic structure of the population and potential effects of nuclear and mitochondrial interactions [
29,
30,
31].
2. Materials and Methods
2.1. Animal Experiments
The animal experiments were performed at the Agricultural Experiment Station of the University of Hohenheim, Germany. They were approved by the Regierungspräsidium Tübingen, Germany (Project no. HOH50/17TE) in accordance with the German Animal Welfare Legislation.
We used 180 laying hens: 90 brown (Lohmann Brown Classic) and 90 white (Lohmann LSL-Classic) leghorn hybrids obtained from Lohmann Tierzucht GmbH (Cuxhaven, Germany). The hens originated from two experiments addressing the changes of utilization of P and Ca in different periods of the hens’ life (production periods, experiment 1) and under adequate or reduced P and Ca supply during the peak of egg production (experiment 2). The first experiment provided 100 individuals and is described in detail in Sommerfeld et al. 2020a [
27]; the second one provided 80 animals and is described in Sommerfeld et al. 2020b [
28]. We briefly summarize them in the following.
In experiment 1 (
Figure 1), the animals were raised as one group in floor pens on deep litter bedding, with diets according to the specific requirements in each period based on soybean and corn meal, but no difference to the recommendations as given in detail within
Table 1 in Sommerfeld et al. 2020a [
27]. After 8, 14, 22, 28, and 58 weeks 10 hens per father and strain were chosen and moved into the metabolism units (1 m × 1 m × 1 m, arranged in a randomized block design, where 2 units formed one block, each metabolism unit contained one individual) where after five days of rest excreta and feed intake was measured for four days for each individual. Feed was available for ad libitum consumption and no changes in the diet were made. The animals were weighed and after two days of rest the hens were killed by decapitation following anaesthesia using a gas mixture [
32] and samples of blood and liver tissue were taken, thus the hens were 10, 16, 24, 30, and 60 weeks old. In the following the five sampling points will be named as period 1 to period 5.
For experiment 2 (
Figure 1), the animals were raised together under the same conditions as in experiment 1. After 27 weeks hens from 20 fathers (10 per strain) were separated into four groups in which each group contained one hen per father and the hens were placed into the metabolism units (1 m × 1 m × 1 m, arranged in a randomized block design, where 8 units formed one block and each metabolism unit contained one individual). For three weeks the individuals received feed for ad libitum consumption with specific diets for each group based on soybean and corn meal, containing recommended or lower concentrations of Ca and P (
Table 1). The excreta of each hen were collected and the feed intake was monitored for four days in week 30. The hens were killed at 31 weeks of age (peak of egg production). As in experiment 1 blood and liver tissue were sampled.
2.2. DNA Samples and Extraction
We extracted DNA twice per animal: from liver tissue for the haplotype analyses and from blood for genotyping. For the haplotype reconstruction all 180 individuals were used, while for the nuclear genotyping 52 samples from individuals of the first experiment (26 brown and 26 white, covering period 1 to period 4) and 55 from the second experiment (27 brown and 28 white, covering all four diets) were included. The animals for the genotyping were selected randomly; care was taken to include an even amount of brown and white animals.
Liver samples were taken after slaughtering as described above and directly transferred to dry ice. We extracted high molecular DNA from frozen tissue using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) following the purification of total DNA from Animal Tissues protocol.
For genotyping, the DNA from 53 brown and 54 white hens was extracted from blood which was directly taken at the sampling sessions as described above, stored in EDTA and further processed using the Maxwell 16 Blood DNA Purification Kit on a Maxwell 16 MDx instrument (Promega, Madison, WI, USA).
2.3. Analysis of Mitochondrial Haplotypes
2.3.1. Long Range PCR and Next Generation Amplicon Sequencing
The mt genome was individually amplified by two overlapping fragments of 8935 bp and 9522 bp length using a Long Range (LR) PCR Kit (Qiagen, Hilden, Germany). Primers were designed and tested if they amplify other fragments of the same size using Primer BLAST [
33]. Primer sequences for fragment 1 are: F: ACGCCACAGCTAAAACCCCCAGC R: TGGATGGTGGAGAGGCGGTTGT and fragment 2: F: TCCTGCTTGCCCTCCCCTCCCT R: CGCACCCGCACTGTGAAGGGCAA. The reactions were set up according to the protocol of the manufacturer with primer annealing temperatures of 63 °C for fragment 1 and 60 °C for fragment 2. The elongation phase was set to 9 min for both fragments. Successful amplifications were verified via gel electrophoresis on a 1% agarose gel (VWR International GmbH, Darmstadt, Germany) stained with gel-red (Biotium Inc., Fremont, CA, USA). The PCR products were precipitated with ethanol; quality and quantity of the products were measured using a NanoDrop 2000/2000c Spectrophotometer and Qubit Fluorometer (Thermo Fisher Scientific Inc., Waltham MA, USA). Resulting purified fragments were equimolarly pooled for sequencing. The samples were sequenced by CeGaT (CeGaT GmbH, Tübingen, Germany), using the Nextera XT Kit (Illumina Inc., San Diego, CA, USA) for library preparation, on an Illumina NovaSeq 6000 MiSeq expecting 150 bp paired-end reads.
2.3.2. Sequence Analysis
The sequencing reads were demultiplexed and adapter-trimmed by CeGaT using Illumina bclfastq (v2.20; Illumina Inc., San Diego, CA, USA) and Skewer (Version 0.2.2 [
34]), followed by a quality-check in FastQC [
35] of files containing comparatively high and low numbers of reads. After the removal of primer sequences with Cutadapt (version 2.5 [
36]) the reads were quality-trimmed in Trimmomatic (version 0.32 [
37]) using the following parameters: SLIDINGWINDOW:4:15 MINLEN:50 CROP:148. The used settings cut the read if the average quality of 4 bases is below 15, discard reads which are shorter than 50 bases and cut the reads that are longer than 148 bases.
We mapped the reads against a published white leghorn mitochondrial genome (AP003317 [
38]) using the algorithm implemented in Geneious Prime (version 2020.0.2
https://www.geneious.com, accessed on 5 November 2019). Duplicates were removed before mapping using the Dedupe Duplicate Read Remover 38.37 by Brian Bushnell as implemented in Geneious Prime using default parameters, as well as the variant call with default settings except for minimum coverage, which was set to five. The resulting output was then used to create consensus sequences using GATK FastaAlternateReferenceMaker [
39].
The resulting consensus sequences included a poly-C site with 8 to 13 Cytosins (position 3940 and following). The site did not only vary in its length without a pattern between individuals but also within the reads of most individuals. The issue could not be resolved as in both sequencing approaches (Illumina and Sanger-technology) these sites led to read termination. Consequently, we followed the most conserved approach and restricted the length down to 8 Cytosins for all individuals.
For those individuals that showed coverage close to zero in one of the two PCR fragments, the missing parts were sequenced using a PCR based Sanger-sequencing approach. A list of the corresponding individuals, used primers, reagents and PCR protocol can be found in
Tables S1–S3, Supplementary Materials. Sanger-sequencing reactions were performed by Microsynth AG (Balgach, Switzerland) and the resulting sequences were edited in Geneious Prime.
2.3.3. Validation of Sequencing Results
To approve the resulting consensus sequences, the primer- and quality-trimmed reads were additionally mapped against a more distant mt genome to the leghorn variant (
Gallus gallus, NCBI Accession AP003322, [
40]) using the same parameters as described above.
To exclude LR-PCR biased artefacts, we resequenced the whole mt genome of one individual with Sanger-technology using mt DNA and a set of 22 oligonucleotides that result in 22 overlapping fragments. Furthermore, the long-range PCR fragments that were used for the library construction were Sanger sequenced using the same set of oligonucleotides (
Supplementary Table S1).
2.3.4. Alignment and Haplotype Reconstruction
The consensus sequences were aligned using the implemented algorithm in Geneious Prime with default settings.
Haplotype networks were generated based on the alignment using the TCS algorithm [
41] implemented in PopART v. 1.7 [
42]. Additionally, a phylogenetic analysis was conducted via Maximum Likelihood, for which we initially identified the nucleotide substitution model that best fit to the data, based on the program Model Test implemented in MEGA X [
43]. Based on the lowest Bayesian Information Criterion (BIC) score, the Hasegawa–Kishino–Yano model [
44] was chosen for tree construction with 500 bootstrap replicates and default settings in MEGA X.
2.4. Analysis of Nuclear Genotype Data
After the identification of the mt haplotypes, we aimed to gain additional insights into the nuclear genome diversity and by analysing nuclear genotype data obtained with the Illumina 60K chicken Infinium iSelect chip (Illumina Inc., San Diego, CA, USA). For the following analyses we used 54 white and 53 brown individuals from both experiments.
2.4.1. Filtering
The resulting SNPs were filtered for cluster separation ≥0.4 in the GenomeStudio software (v. 2011.1, Illumina Inc., San Diego, CA, USA) and then exported for downstream analyses. Prior to the next filtering steps, the SNP positions were transferred to the newest chicken genome version (GRCg6a, Accession: GCF_000002315.6) and the two strains were separated using vcftools [
45]. The SNPs were filtered in vcftools for missingness (95%), minor allele count (3) and minor allele frequency (0.03) prior to further analysis. For the population structure analysis with ADMIXTURE [
46] Linkage disequilibrium (LD) pruning was performed after filtering as recommended using PLINK version 1.9 [
47], where all SNPs with a R
2 value greater than 0.1 with any other SNP within a 50-SNP sliding window (advanced by 10 SNPs each time) is removed as described in the manual. The dataset was filtered without previous separation of the strains for additional ADMIXTURE analyses as described above.
2.4.2. Analysis of Nuclear Genetic Diversity
To gain insights into the nuclear genome diversity and relationships between individuals within the strains, a genomic relationship matrix (G-Matrix) was performed with the
AGHmatrix [
48] package in R (R Core Team 2020, Version 3.0.3) using the VanRaden method [
49]. Note that the G-Matrix is usually used to study complex traits, but was used here to obtain some insight of genomic relationships. The two strains were analyzed separately, since we look at them as two distinct lines. After the calculation of the matrices, statistical analyses were performed in JMP Pro (Version 13. SAS Institute Inc., Cary, NC, USA, 1989–2019) to detect the impact of the paternal and possible maternal background and differences between the strains and mt haplotypes. The data was tested for normality using Shapiro–Wilk test. Wilcoxon/Kruskal–Wallis tests were used for the comparison of two groups and Steel–Dwass tests for multiple comparisons.
The differentiation between both strains was estimated using Weir and Cockerham’s F-statistics fixation-index (F
ST) [
50] as implemented in vcftools based on the dataset including both strains filtered together. To analyze the differentiation between the mt haplotypes the F
ST between the different haplotypes was estimated the same way based on the file only containing the brown strain.
To estimate population structure ADMIXTURE was used on (1) the merged data set obtained from both strains, and (2) separate data sets for each strain, to gain deeper insights into the strain structure. The number of ancestral populations (K) was set from 2 to 6 for all runs. The best K was estimated using cross validation as suggested in the manual.
2.5. Phenotypic Traits
We included measurements of phenotypic traits from the same individuals, following the overarching hypothesis that mt haplotype variation may impact phenotypic performances in laying hens.
2.5.1. Measurements of Body Weight, Feed Intake and Phosphorus Utilization
Body weight, feed intake and P utilization were measured for all hens in the first experiment (
n = 100). Feed intake was calculated over the course of 4 days, by measuring the amount of feed in the beginning and the end of the excreta collection phase that is described in Sommerfeld et al. 2020a [
27]. Body weight was measured on the last day of the excreta collection phase. P utilization was calculated as the proportion of intake, which was not recovered in excreta (based on quantitative data, the amount of remaining P in the excreta) for the same periods as feed intake was measured [
27]. P utilization was additionally measured for all hens of experiment 2 and analyzed for both experiments separately.
2.5.2. Statistical Analyses
Statistical analysis of body weight and feed intake were performed for all individuals from experiment 1 while the distribution of mt haplotypes over the four diets in the second experiment was heterogeneous, limiting robust statistical tests for these individuals.
The overall impact of the haplotypes on body weight and feed intake was evaluated by a statistical model, derived from a linear mixed model developed by Sommerfeld et al. 2020a [
27].
where Y is the response variable, ε is the residual error, period and haplotype are fixed effects, with block, metabolism unit and father as random effects. Statistical significance was declared at
p < 0.05.
All modelling was performed in R (R Core Team 2019, Version 3.6.1) using the
lmerTest package [
51]. The interaction of period and haplotype was removed, since the model was rank-deficient due to the absence of one haplotype in period 5. To detect the overall influence of period and haplotype on the response variable, a three factorial analysis of variance (ANOVA) was used. A pairwise Tukey post hoc test (package
emmeans [
52]) was performed to detect differences between the haplotypes independent from the periods and between the periods independent of the haplotype.
To detect significant differences in body weight, feed intake and P utilization between the haplotypes within the periods (and for P utilization within the diets of experiment 2), pairwise Tukey–Kramer HSD or Steel–Dwass tests were performed using JMP Pro (Version 15. SAS Institute Inc., Cary, NC, USA, 1989–2019) after testing for normality using Shapiro–Wilk test.