Characterization of the Transcriptional Complexity of the Receptive and Pre-receptive Endometria of Dairy Goats

Zhang, Lei; An, Xiao-Peng; Liu, Xiao-Rui; Fu, Ming-Zhe; Han, Peng; Peng, Jia-Yin; Hou, Jing-Xing; Zhou, Zhan-Qin; Cao, Bin-Yun; Song, Yu-Xuan

doi:10.1038/srep14244

Article
Open access
Published: 16 September 2015

Scientific Reports volume 5, Article number: 14244 (2015)

Abstract

Endometrium receptivity is essential for successful embryo implantation in mammals. However, the lack of genetic information remains an obstacle to understanding the mechanisms underlying the development of a receptive endometrium from the pre-receptive phase in dairy goats. In this study, more than 4 billion high-quality reads were generated and de novo assembled into 102,441 unigenes; these unigenes were annotated using published databases. A total of 3,255 unigenes that were differentially expressed (DEGs) between the PE and RE were discovered in this study (P-values < 0.05). In addition, 76,729–77,102 putative SNPs and 12,837 SSRs were discovered in this study. Bioinformatics analysis of the DEGs revealed a number of biological processes and pathways that are potentially involved in the establishment of the RE, notably including the GO terms proteolysis, apoptosis and cell adhesion and the KEGG pathways Cell cycle and extracellular matrix (ECM)-receptor interaction. We speculated that ADCY8, VCAN, SPOCK1, THBS1 and THBS2 may play important roles in the development of endometrial receptivity. The de novo assembly provided a good starting point and will serve as a valuable resource for further investigations into endometrium receptivity in dairy goats and future studies on the genomes of goats and other related mammals.

Introduction

Embryo implantation is a complex initial step in the establishment of successful pregnancy in mammals¹ and consists of apposition, adhesion and invasion². The synchronized differentiation of the receptive endometrium (RE) from the pre-receptive endometrium (PE) is essential for embryo implantation³. The development of endometrial receptivity is known as the “window of implantation” because it is a spatially and temporally restricted stage⁴. During this period, the endometrium undergoes pronounced structural and functional changes induced by the ovarian steroids oestrogen and progesterone, which prepare it to be receptive to adhesion and subsequent invasion by the embryo^5,6. Studies have shown that infertility is partly caused by dysfunction of the receptive endometrium⁷. Furthermore, impaired uterine receptivity is one of the major reasons for the failure of embryo transplantation in humans and other mammals during assisted reproduction with good-quality embryos^8,9.

The development of novel, high-throughput sequencing techniques has provided new strategies for analysing the functional complexity of the transcriptome¹⁰. Several high-throughput sequencing methods can be used for transcriptomic studies, including the classical 454 pyrosequencing method and the low-cost Solexa sequencing method; these methods have been employed frequently over the past few years¹¹, although Illumina sequencing is now the most widely used. The RNA sequencing (RNA-Seq) approach, which was developed to help analyse global gene expression, is an efficient method to map and quantify the transcriptome¹². The holistic view of the transcriptome and its organization provided by the RNA-Seq method has revealed many novel transcribed regions, splice isoforms and single nucleotide polymorphisms (SNPs) and has allowed the refinement of gene structures^13,14,15,16,17. Finally, RNA-Seq generates absolute rather than relative gene expression measurements, thereby providing greater insight and accuracy than microarrays^18,19.

Notably, recent studies have reported that the attainment of endometrial receptivity is a complex process involving numerous molecular mediators⁴. Molecular studies have extensively investigated the possible factors involved in the establishment of the receptive endometrium²⁰, such as hormones^21,22, cytokines²³ and growth factors²⁴. Nevertheless, the molecular mechanisms involved in the development of the endometrium from the pre-receptive state to the receptive state remain largely unknown and the complexity of the goat transcriptome has not yet been fully elucidated. Drawing on the experience of previous studies, in this study we adopted the Illumina RNA-Seq approach to obtain a larger and more reliable transcriptomic dataset²⁵ from the PE (gestational day 5) and RE (gestational day 15) in dairy goats. We then performed a comprehensive analysis of the endometrial transcriptional profiles at the global level to compare the genes expressed in the PE and RE, to identify differentially expressed genes (DEGs), single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs), and to characterize the DEGs using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. The results of the present study may therefore provide essential information in support of further research on the development of endometrial receptivity in dairy goats. Furthermore, our transcriptomic study will provide good reference data for gene expression profiling of goats.

Results

Sequencing Results

Summary of sequencing

This study used RNA-Seq to compare the transcriptomic landscapes of the endometrium from the PE (gestational day 5) and RE (gestational day 15) phases of 20 healthy, 24-month-old multiparous dairy goats. Total RNA from the receptive and pre-receptive endometria was used to construct RNA libraries for Illumina sequencing. Reads containing adapters and low-quality reads were removed prior to assembly. In total, we acquired 46,514,662 and 44,185,646 clean reads from the PE and RE libraries, respectively. Approximately 99.86% of the total reads were valid for further analysis (Table 1).

Table 1 Overview of the sequencing reads and reads after preprocessing.

De novo assembly of sequencing data

The Trinity software (http://trinityrnaseq.sourceforge.net/) was used for the de novo assembly of our valid reads²⁶. The preprocessed sequencing reads were assembled into 102,441 unigenes using the optimized parameters. The assembled unigenes in the present study were evaluated using the following standard metrics: Min length, Median length, Mean length, N50, Max length and Total length (Table 2). N50 represents a weighted median statistic such that 50% of the entire assembly is contained in unigenes equal to or larger than this value in base pairs. The N50 was no less than 1,874 bp in this study, while the mean length of the unigenes was approximately 896 bp. Thus, a total of 91,787,136 bases were assembled and the sequencing depth was about 98×, which helps ensure that low-abundance sequences could be detected. The size distribution of the reads is shown in Fig. 1.
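The N50 definition and the depth estimate above can be illustrated with a minimal sketch. The function names and the toy length list are hypothetical; the depth calculation assumes that every clean read is 100 bp long and simply divides the total sequenced bases by the total assembled bases, which may differ from how the authors computed it.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def sequencing_depth(n_reads, read_length, assembly_bases):
    """Average depth = total sequenced bases / total assembled bases."""
    return n_reads * read_length / assembly_bases


# Toy unigene lengths (hypothetical, not taken from the goat assembly).
toy_n50 = n50([100, 200, 300, 400, 500])

# Figures quoted in the text: 102,441 unigenes with a mean length of
# about 896 bp, and 46,514,662 + 44,185,646 clean reads.
# Assumption: all clean reads are 100 bp long.
assembly_bases = 102_441 * 896
depth = sequencing_depth(46_514_662 + 44_185_646, 100, assembly_bases)
print(toy_n50, assembly_bases, round(depth, 1))
```

With these inputs the assembled base count reproduces the 91,787,136 bases reported above, and the estimated depth falls between 98× and 99×.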

Table 2 Assembly results of unigenes.

Unigene Annotation

To exclude interference from alternative splicing of transcripts, we first clustered all of the transcripts that matched the same reference gene; we then removed redundant transcripts and preserved only the longest transcript from each cluster to represent a unigene²⁷. The unigenes were searched with BLAST against public databases including SWISS-PROT (a manually annotated and reviewed protein sequence database), nr (NCBI non-redundant protein sequences), KEGG (Kyoto Encyclopedia of Genes and Genomes), KOG (euKaryotic Ortholog Groups) and Pfam (a widely used protein family and structure domain database). The valid reads were assembled into 102,441 unigenes (Table S1), of which 36,308 (35.44%) had BLAST hits to known proteins in SWISS-PROT, 15,220 (14.86%) in nr, 29,835 (29.12%) in Pfam, 34,265 (33.45%) in KEGG and 37,219 (36.33%) in KOG (Table 3). Because one unigene might be annotated in more than one public database, a total of 43,127 coding genes were found after annotation.
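The "keep the longest transcript per cluster" step can be sketched as follows. This is only an illustration of the selection rule described above: the function name, the record layout and the example clusters and lengths are hypothetical, and the clustering itself (matching transcripts to a reference gene) is assumed to have been done already.

```python
def longest_per_cluster(transcripts):
    """Pick one representative (the longest transcript) per cluster.

    transcripts: iterable of (transcript_id, cluster_id, length_bp).
    Returns a dict mapping cluster_id -> transcript_id of the longest member.
    Ties keep the transcript seen first.
    """
    best = {}
    for transcript_id, cluster_id, length in transcripts:
        if cluster_id not in best or length > best[cluster_id][1]:
            best[cluster_id] = (transcript_id, length)
    return {cluster: tid for cluster, (tid, _length) in best.items()}


# Hypothetical records in Trinity-style naming; lengths are made up.
records = [
    ("compA_c0_seq1", "geneA", 850),
    ("compA_c0_seq2", "geneA", 1420),
    ("compB_c0_seq1", "geneB", 600),
]
unigenes = longest_per_cluster(records)
print(unigenes)
```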

Table 3 Annotation result statistics of unigenes in different databases.

To study the sequence conservation of the endometrial unigenes across animal species, we used BLAST²⁸ to align the unigenes to the NCBI non-redundant database (nr) using an E-value of 1e-10 as the threshold. A total of 15,220 unique sequences (14.86%) had BLAST hits in the nucleotide sequence database in nr. The majority of the annotated sequences corresponded to known nucleotide sequences of animal species, with 30.3%, 17.4%, 9.1%, 3.6% and 2.5% matching Ovis aries, Bos taurus, Bos grunniens, Homo sapiens and Orcinus sequences, respectively (Fig. 2). KOG (clusters of orthologous groups for eukaryotic complete genomes) is a classification system based on orthologous genes²⁹. In this study, 31,778 unigenes were annotated to 26 groups by the KOG database (Fig. 3). The General function prediction only (R) group contained the most unigenes (6,009), and no unigenes were annotated to the Unnamed protein (X) group. Cell cycle control, cell division, chromosome partitioning (D) was annotated with 854 unigenes and Extracellular structures (W) with 113 unigenes.

Detecting SNP and SSR

Next-generation sequencing provides a range of new potential applications for evolutionary and ecological-genetic studies in non-model species³⁰. Putative SNPs were identified separately in the PE and RE datasets (Table 4). A total of 76,729 putative SNPs were identified for the PE dataset (Table S2), of which 55,044 (71.74%) were transitions and 21,685 (28.26%) were transversions. Similarly, 77,102 putative SNPs were identified from the RE dataset (Table S3), of which 72.13% were transitions and 27.87% were transversions.
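The transition/transversion split reported above follows from the standard definition: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch, with hypothetical function names, that also rechecks the PE percentages from the counts given in the text:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def percentage(part, total):
    """Percentage of total, rounded to two decimals."""
    return round(100 * part / total, 2)


# PE counts quoted in the text.
pe_transitions = percentage(55_044, 76_729)
pe_transversions = percentage(21_685, 76_729)
print(classify_snp("A", "G"), classify_snp("A", "C"))
print(pe_transitions, pe_transversions)
```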

Table 4 Result statistics of putative SNPs in pre-receptive and receptive endometrium.

SSRs consist of tandem repeats of short (1–6 bp) nucleotide motifs³¹ that are distributed throughout the genome^32,33. After screening for SSRs in the 102,441 unique sequences using the MISA software, we identified 12,837 SSRs distributed in 10,330 sequences (Table S4). A total of 1,556 sequences contained more than one SSR. Based on the repeat motifs, the SSR loci were divided into monomers (47.12%), dimers (25.36%), trimers (23.58%), tetramers (1.47%), pentamers (1.72%) and hexamers (0.75%) (Fig. 4).
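To make the notion of an SSR concrete, the sketch below scans a sequence for perfect tandem repeats of 1–6 bp motifs with regular expressions. It is not MISA and does not reproduce its algorithm; the minimum repeat counts in `MIN_REPEATS` are assumed values chosen for illustration, since the thresholds used in this study are not stated here, and the function names are hypothetical. Compound or interrupted repeats are not handled.

```python
import re

# Assumed thresholds (motif size -> minimum number of repeats);
# illustrative only, not taken from the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit
    (e.g. 'AA' or 'ATAT' are rejected)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # A motif of `unit` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


# Hypothetical test sequence containing a monomer, a dimer and a trimer SSR.
example = "GG" + "A" * 12 + "CCT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
print(ssrs)
```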

Differential gene expression and functional characterization

Analysis of Unigene Expression

The RNA-Seq technique allowed the analysis of differential expression profiles via transcript abundance, with a high sensitivity for transcripts expressed at low levels^34,35. We generated 90 million paired-end reads 100 bp in length, yielding approximately 9 Gb of sequence. Thus, the sequencing depth in this study was sufficient to detect transcripts expressed at low levels. To better categorize the unigenes that presented differential expression levels, unigene expression values, measured as RPKM (reads per kilobase of exon model per million mapped reads), were divided into three groups: high (>500 RPKM), medium (10 to 500 RPKM) and low (<10 RPKM) (Table 5). Unigenes that are highly expressed in a specific tissue may be responsible for the basic metabolism and functions of that tissue¹². A total of 192 and 187 genes were found to be highly expressed in the PE and RE libraries, respectively.
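The RPKM measure and the three expression groups can be sketched as follows. The formula is the usual definition of RPKM (read count scaled per kilobase of transcript and per million mapped reads); the function names and the example counts are hypothetical, and how the boundary values of exactly 10 and 500 RPKM are assigned is an assumption.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expression_class(value):
    """Assign the high / medium / low group used in the text.

    Assumption: exactly 500 RPKM counts as medium and exactly 10 RPKM
    counts as medium.
    """
    if value > 500:
        return "high"
    if value >= 10:
        return "medium"
    return "low"


# Hypothetical unigene: 1,000 mapped reads, 2,000 bp long,
# in a library with 10 million mapped reads.
value = rpkm(1_000, 2_000, 10_000_000)
print(value, expression_class(value))
```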

Table 5 RNA-Seq gene expression results for the RE and PE libraries (RPKM).

A total of 3,255 unigenes differed significantly in expression level (P < 0.05) between the PE and RE libraries; the full list of DEGs is provided in Table S5. There were 208 differentially expressed unigenes that were down-regulated in the RE compared to the PE in goats, while 618 unigenes were up-regulated, with fold-changes greater than or equal to 2 (Fig. 5). Additionally, 5 unigenes were specifically expressed in the receptive endometrium, with expression values at the medium level. The top 10 unigenes that were up-regulated in the RE compared to the PE are shown in Table 6. comp34258_c0_seq1 (MMP1) was the most up-regulated DEG (13.61-fold increase in the RE compared to the PE), followed by comp41692_c0_seq2 (MMP12, 10.89-fold) and comp19823_c0_seq3 (Fxyd4, 8.48-fold). The top 10 down-regulated unigenes are shown in Table 7. The most down-regulated DEG was comp24892_c0_seq2 (NNAT, with a fold-change value of -5.32 in the RE compared with the PE), followed by comp43544_c1_seq3 (MYH11, -4.41). A heat map of the Pearson’s correlation and a dendrogram of the correlation between transcript tags are provided in Fig. 6. The up-regulated DEG with the highest level of expression (RPKM = 1525.24) was comp9210_c0_seq2 (S100G), with a 5.34-fold increase in the RE, and the down-regulated DEG with the highest level of expression (RPKM = 1985.32) was comp39532_c0_seq7 (Tes), with a fold-change value of -4.09 in the RE (Table 8).

Table 6 The top 10 unigenes that were up-regulated in the RE compared with the PE.

Table 7 The top 10 unigenes that were down-regulated in the RE compared with the PE.

Table 8 DEGs with high expression levels in the RE or PE libraries (|log2 (RE/PE)|>2).

Gene Ontology Analysis of the DEGs

The DEGs were analysed by running queries for each DEG against the GO database, which provides information related to three ontologies: molecular function, cellular component and biological process. In this study, GO enrichments of the DEGs were categorized into 426 functional groups that met the criterion of P < 0.001. Of the 133 terms that were significantly enriched in molecular functions (Table S6), the most significantly enriched GO term was protein binding (GO: 0005515) with 154 genes annotated, followed by ATP binding (GO: 0005524), calcium ion binding (GO: 0005509), metal ion binding (GO: 0046872) and sequence-specific DNA binding transcription factor activity (GO: 0003700). In the cellular component GO category, 88 terms were significantly enriched (Table S7). The most significantly enriched GO term was integral to membrane (GO: 0016021) with 338 genes annotated, followed by nucleus (GO: 0005634), cytoplasm (GO: 0005737) and extracellular region (GO: 0005576). In the biological process category, 205 GO terms were significantly enriched and were related to various processes (Table S8) such as proteolysis (GO: 0006508), apoptosis (GO: 0006915), cell adhesion (GO: 0007155), protein transport (GO: 0015031), protein folding (GO: 0006457), multicellular organismal development (GO: 0007275) and cell differentiation (GO: 0030154). The top 20 GO functional annotations for the DEGs are shown in Fig. 7. The annotations for insulin-like growth factor binding (GO: 0005520) and clathrin sculpted gamma-aminobutyric acid transport vesicle membrane (GO: 0061202) were of particular interest, because previous studies have reported that IGF-1 (insulin-like growth factor) and GABA (gamma-aminobutyric acid) may play important roles in the development of the receptive endometrium in mice and humans^36,37,38,39.
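The statistical test behind the enrichment P-values is not specified in this section. One common choice for GO term enrichment is the one-sided hypergeometric test, sketched below under that assumption; the function name and the toy numbers are hypothetical and do not come from this study.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """One-sided hypergeometric P-value, P(X >= k).

    N: number of annotated background genes
    K: background genes carrying the GO term
    n: number of DEGs
    k: DEGs carrying the GO term
    """
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


# Toy example: 10 background genes, 4 with the term, 3 DEGs, 2 with the term.
p_toy = hypergeom_pvalue(k=2, n=3, K=4, N=10)
print(p_toy)
```

A term would then be called enriched when its P-value falls below the chosen cut-off (P < 0.001 in the text), usually after some form of multiple-testing correction.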

KEGG Pathway Analysis of the DEGs

Various genes cooperate with each other to exercise their biological functions. Accordingly, KEGG analysis helps us to further understand the biological functions of DEGs⁴⁰. Overall, the DEGs were significantly enriched in 82 KEGG pathways, meeting the criteria of P-values < 0.001 (Table S9), suggesting that these pathways may play important roles in the development of endometrial receptivity.

The KEGG pathway showing the highest level of significance was the MAPK signalling pathway (ko04010) with 88 DEGs enriched, followed by Pathways in cancer (ko05200, 76), Oxidative phosphorylation (ko00190, 74), Phagosome (ko04145, 70), Alzheimer’s disease (ko05010, 52), Focal adhesion (ko04510, 50), Cytokine-cytokine receptor interaction (ko04060, 48) and Apoptosis (ko04210, 48). The top 17 KEGG pathways are shown in Fig. 8. These results indicate that diverse metabolic processes are active in the development of a receptive endometrium from the pre-receptive phase and that a variety of metabolites are synthesized in the receptive endometrium.

Genes Possibly Involved in the Development of Receptive Endometria

Our analysis identified many DEGs enriched in terms related to calcium (GO: 0005509, GO: 0008294 and ko04020) and cell adhesion (GO: 0045785, GO: 0007155 and ko04514), which were significant in both the GO and KEGG analyses (P < 0.001). Based on these results, we analysed the mRNA expression levels of some coding genes related to calcium and cell adhesion and found that ADCY8, VCAN, NA, SPOCK1, CGREF1, THBS1, THBS2, S100G, S100A1, S100A2, S100A4, S100A10, S100A13, MMP1, MMP3, MMP11, MMP12 and MMP19 exhibited significantly different expression levels between the two endometrial phases (Fig. 9). These results therefore suggest that these genes may play roles in the differential regulation of goat endometrial development during the receptive and pre-receptive phases; however, this hypothesis requires validation in further studies.

Discussion

Investigating the transcriptome profile of the receptive and pre-receptive endometrium will contribute to our understanding of the biochemical and physiological development of endometrial receptivity during the “window of implantation”. RNA-Seq offers an unprecedented level of sensitivity and high throughput deep sequencing and has been widely used to detect gene expression patterns^10,12. In the present study, large-scale transcriptome data were obtained using Illumina RNA-Seq as the first step of our endeavour to provide clear insight into the molecular mechanism of endometrial receptivity in dairy goats.