Introduction

Sonic Hedgehog (Shh) signaling cascade is one of the intricate signal transduction mechanism that is conserved among vertebrates and invertebrates and participates in multiple distinct developmental processes including ventralization of central nervous system (CNS) and anterior/posterior patenting (A-P) of the limbs (Allahyari et al. 2019; Tickle and Towers 2017; Symmons et al. 2016). In invertebrate animals, like Drosophila and C. elegans Shh signaling is mediated by zinc-finger containing DNA-binding protein Cubitus interruptus (Ci) (Methot and Basler 2001). Vertebrates have achieved the additional complexity in Shh signaling pathway through ancient gene duplication events that had created the three homologous copies of Ci like ancestral gene, i.e., Gli1, Gli2, and Gli3 (Karlstrom et al. 2003). During the course of vertebrate evolution, three Gli paralogs (Gli1, Gli2, and Gli3) have acquired some of the distinct features of their ancestral gene Ci (Abbasi et al. 2009). For instance, Gli2 and Gli3 proteins preserved transcriptional bi-functionality present in their Drosophila homolog Ci (Aoto et al. 2002). In contrast, Gli1 appears to function only as transcriptional activator and its role in embryonic development is widely reported as dispensable (Mill et al. 2003).

Among Gli proteins, Gli3 has dynamic expression pattern and unique characteristics features during early embryonic development and postnatally (Tyurina et al. 2005). It has both a repressor and an activator domain and could in principle function both as transcriptional activator and repressor (Hu et al. 2006). For instance, Gli3xt mice have polydactyly (Bai et al. 2004) and dorsal central nervous system defects, associated with ectopic Shh expression (Haddad-Tóvolli et al. 2015), suggesting Gli3 plays a role in both transducing and repressing Shh signals (Veistinen et al. 2012). Moreover, genetic analysis in mice revealed that Gli3 null mutants die before birth with defects in several tissues including brain, spinal cord, limbs, craniofacial skeleton, vertebrae, lungs, and other internal organs (Cao et al. 2013). Mutations of GLI3 gene in human causes a variety of dominant developmental defects syndromes conjointly referred to as “GLI3 morphaphathies” (Abbasi et al. 2007), including Greig cephalopolysyndactyly syndrome (GCPS) (Vortkamp et al. 1992), Pallister-Hall syndrome (PHS) (Böse et al. 2002), preaxial polydactyly type IV (PPD-IV), and postaxial polydactyly type A (PAPA) (Memi et al. 2018). In addition to craniofacial anomalies with intellectual disability and mental retardation (Kalff-Suske et al. 1999; Biesecker 2006; Démurger et al. 2015), all these GLI3 morphaphathies also depict congenital malformations of autopod with distinct preaxial, postaxial, and syndactyly phenotypes (Démurger et al. 2015)

Biochemical, molecular, animal model and human disease data demonstrates that correct expression of Gli3 is crucial to ensure appropriate deployment of Shh signaling pathway for normal embryogenesis in vertebrates (Abbasi et al. 2007; Karlstrom et al. 2003). Precise expression of a typical developmental gene in vertebrates is governed by non-coding cis-regulatory DNA sequences that often reside within close proximity or hundreds of kilobase pairs (Kbp) away from the concerned gene body (Davis et al. 2016; Leal and Cohn 2017). Previously we and others have identified a number of conserved non-coding elements (CNEs) in intronic and intergenic genomic regions of human GLI3 gene (Abbasi et al. 2013; Coy et al. 2011). These GLI3 associated CNEs when tested in in vivo and in vitro models, acted as tissue and developmental stage specific enhancers (Abbasi et al. 2007; Coy et al. 2011; Alvarez-Medina et al. 2008).

Mutations in enhancer modules can potentially affect the normal availability of associated gene transcripts during embryogenesis (Ali et al. 2016). For instance, deletion of Gli3 limb specific enhancers (mm1179/hs1586) has significantly reduced the Gli3 expression in the embryonic hand plate and exhibits forelimb specific polydactyly (Osterwalder et al. 2018). A recent study has systematically investigated the functional impact of single nucleotide mutagenesis in one of the best-characterized enhancer elements, the zone of polarizing activity regulatory sequence (ZRS) (Kvon et al. 2020). The ZRS is a limb specific enhancer of the Sonic Hedgehog (Shh) gene and is located at an extreme physical distance (of nearly one million base pairs) from its target promoter (Johnson et al. 2014). ZRS is active in the posterior margin of developing fore- and hind limb buds, where it controls digit number and identity (Al-Qattan et al. 2012). Kvon and colleagues reported that 71% of single nucleotide mutations in ZRS enhancer drives abnormal reporter gene expression in the limbs (Kvon et al. 2020). An investigation of human specific substitutions in a neuro-developmental enhancer interval of FZD8 in transgenic mice assay suggests the vital role of non-coding variations in evolving human brain size (Boyd et al. 2015). Over the last two decades, a comprehensive catalog of independently acting cis-regulators, governing dynamic aspects of endogenous GLI3 expression during vertebrate development has been characterized by employing in vitro and in vivo models. However, to best of our knowledge, the crucial roles of GLI3 associated enhancers in developmental trait evolution, in particular central nervous system (CNS) evolution have not received any significant attention.

In the present study we have investigated the pattern of polymorphism and divergence of Gli3 associated functionally confirmed CNS-specific enhancer modules. We have characterized among them a subset of enhancers with an accelerated rate of sequence divergence in human lineage. Among these positively selected enhancers, we found an assorted set of putative TFBSs, possessing human unique single nucleotide polymorphisms. This work not only reveals the molecular evolutionary patterns of Gli3 associated CNS specific enhancers, but also providing hints for the putative genetic basis of population specificity of gene expression regulation.

Materials and methods

GLI3 associated functionally confirmed CNS-specific enhancers

We initiated our search for GLI3 associated functionally confirmed central nervous system (CNS) enhancers through literature searches (Abbasi et al. 2007; Coy et al. 2011; Alvarez-Medina et al. 2008) and VISTA enhancer browser (Visel et al. 2007). Among the previously reported GLI3 associated functionally confirmed and independently acting 19 tissue specific cis-regulatory modules, we shortlisted nine CNS-specific enhancers for further investigations (Fig. 1).

Fig. 1
figure 1

GLI3 associated central nervous system associated enhancer intervals. Human GLI3 gene is located on chromosome 7; its genomic coordinates and its graphical representation of exons and introns are depicted on the top panel. Black boxes represent the coding exons and functionally confirmed CNS specific enhancers are shown as vertical red bars. The upstream intergenic region is represented by horizontal green line. Enhancers are the non-coding genomic regions that regulate the Gli3 expression during CNS development. Enhancers depicted in this figure have been characterized previously through functional experiments in several species including human cell lines, mice, zebrafish, and chicken. The corresponding studies reporting the expression pattern of respective enhancer elements in brain and spinal cord are given in the bibliographic section. CNE, conserved non-coding element; CNS, central nervous system; Mb, midbrain; Hb, hindbrain; Fb, forebrain; Sc, spinal cord. Figures of transgenic mice and transgenic zebrafish expressing reporter genes (lacZ and GFP respectively) under the influence of GLI3 associated CNS-specific enhancers were adopted from Abbasi et al. (2010), Coy et al. (2011), and Anwar et al. (2015)

Estimation of signatures of selection

To estimate signatures of selection on selected subset of GLI3 associated enhancers, previously described methodology by Moon and colleagues was employed (Moon et al. 2019). The unphased VCF (Variant Call Format) files of human GLI3 gene (Chr7: 42,000,548-42,277,469/GRCH37) were downloaded from 1000 Genome Browser phase 3. The downloaded VCF files were then computed through VCF tools (Danecek et al. 2011) and PopGenome (R package) (Pfeifer et al. 2014) to calculate Tajima’s D and mean FST values for each of the enhancer region. To measure the signatures of selection on individual polymorphic sites of GLI3 associated CNS specific enhancers, we adopted the strategy of Schaschl et al. (2015) that aimed to capture signatures of positive selection on individual polymorphic sites (Schaschl et al. 2015). Methodology devised by Schaschl and colleagues includes three approaches, FDIST, BayeScan and extended Lewontin and Krakauer test (FLK). Firstly, FDIST (fixation index (FST) vs expected heterozygosity (He)) calculation was done in program LOSITON (Looking for Selection in a Tangled Dataset) (Antao et al. 2008). Secondly, we used the Bayesian statistical approach, implemented in the program BayeScan, that detect the outlier loci under selection (Foll 2012). We included an additional approach to verify the SNPs detected under the positive selection by FDIST and BayeScan, the “Lewontin and Krakauer test (FLK)” (Bonhomme et al. 2010). FLK test calculations were performed using hapFLK and P values were computed as explained in the software documentation (Fariello et al. 2013). The SNPs corresponding to P value of 0.05 or less were considered significant, having the signatures of positive selection (Bonhomme et al. 2010).

Prediction of allele specific transcription factor binding motifs

Transcriptional factor binding sites (TFBSs) were predicted using de novo motif discovery tool AliBaba 2.1, based on the TRANSFAC database (Grabe 2002). The resultant TFBSs were then confirmed via literature for their role in brain development and were cross checked with the list of 142 brain-specific TFs collected from MGI and Human Protein Atlas by Zehra and Abbasi (Zehra and Abbasi 2018).

Haplotype level analysis

To better characterize the polymorphism of human specific positively selected SNPs in human populations, haplotypic analyses were performed following the methodology of Sabeti and colleagues (Sabeti et al. 2002). Unphased VCF files of human GLI3 gene were converted to phased haplotype files using fast PHASE with default settings (Scheet and Stephens 2006). In order to generate analysis that highlights the segregating alleles to be under the influence of positive selection in certain population, extended haplotype homozygosity (EHH) plot and bifurcation diagrams were generated through package “rehh” (version 2.0) (Gautier and Vitalis 2012).

Results

Detection of signatures of selection on GLI3 associated CNS specific enhancers

From the literature survey, a total of nine Gli3 associated CNS-specific enhancers were shortlisted (Abbasi et al. 2007; Coy et al. 2011; Alvarez-Medina et al. 2008) (Fig. 1). Majority of these nine CNS- specific enhancers are embedded within intronic intervals of Gli3, one in each introns 1, 3, 4, and 11 and two elements in each of introns 2, and 9, whereas only one enhancer region was positioned in the upstream intergenic interval of Gli3 (Fig. 1). To emphasize the physical positioning of enhancer contents, we schematize the entire map of human GLI3 containing locus that depicts intron-exon structure, conservation depth of enhancers, methods for their functional testing and activity domains in CNS (Fig. 1).

To determine the evolutionary rate with which the human GLI3 associated CNS-specific enhancer intervals evolved during the course of human evolution, we estimated the Tajima’s D and fixation index (FST) (Table 1) following the previous methodology used by Moon and co-workers (Moon et al. 2019). Because of ascertainments bias of Tajima’s D and FST towards common polymorphism, following the previous methodologies we chose P ≤ 0.05 as cut-off criterion for significance (Fariello et al. 2013; Korneliussen et al. 2013; Schaschl et al. 2015). Among the analyzed subset of enhancers, values of mean FST and Tajima’s D for CNE1 (Tajima’s D = −0.53, FST = 0.13), and HCNR4 (Tajima’s D = −1, FST = 0.23) with significant P value of 0.02 and 0.003 respectively is indicative of significant signatures of recent positive selection (Table 1). However, Tajima’s D and mean FST values for HCNE1, Mm652, CNE2, CNE6, CNE9, CNE11, and CNE13 failed to reject neutrality with non-significant P values (Table 1). The positional information, number of SNPs, mean FST, and Tajima’s D for each human GLI3 CNS specific enhancers are illustrated in Table 1.

Table 1 Positional information and summary statistics of GLI3 gene associated CNS-specific enhancers

It has been suggested that, when positive selection is acting on an enhancer region, there is likely to be negative selection simultaneously acting on the surrounding region therefore, scanning such genomic regions for signatures of positive selection will yield false negative results (Fariello et al. 2013; Capellini et al. 2017). To avoid this problem, selection on individual polymorphic sites was evaluated (Table 2). For this reason, we adopted the strategy of Schaschl et al. (2015) that aimed to capture signatures of positive selections on individual polymorphic sites both as a result of hard and soft selection sweep (Schaschl et al. 2015). SNP data of the human CNS specific GLI3 enhancers for twenty-six human populations belonging to five super populations (Africa, Europe, East Asia, South Asia, and America) was downloaded from the 1000 Genome Browser phase 3 (Auton et al. 2015). SNPs with low MAF score were filtered out, a quality control step implemented in genomic studies (Malomane et al. 2018). Methodology devised by Schaschl and colleagues includes three approaches, FDIST, BayeScan, and extended Lewontin and Krakauer test (FLK). A genomic position under observation can be classified as directional if P ≤ 0.05 for FDIST (Schaschl et al. 2015). FDIST analysis revealed that SNP rs3801232 (FST = 0.15, P = 0.004) within CNE1 and SNP rs17640804 (FST = 0.20, P = 0.004) positioned within HCNR4 are significantly deviating from the neutrality, which might suggest that these two SNPs are under positive selection. Similarly, based on FDIST values it can be inferred that SNP rs28458974 within CNE6 of GLI3 is evolving under balancing selection (FST = 0, P = 0.053). To further verify these FDIST based results, next we employed the Bayesian statistical approach which assumes that genetic differentiation among populations in different environment is different for the loci that are under significant directional pressure (Fischer et al. 2011). This approach scan for SNPs under selection by decomposing FST into population specific component, the Bayes Factor (BF) and locus specific component alpha (α) (Nielsen et al. 2018). A loci is considered under positive selection, if log10(BF) >1.5 and ((α) = 0.91 → 0.97) (Foll 2012). The SNP rs3801232 within GLI3 associated CNE1 depicts BayeScan log10(BF) = 2.39 and ((α = 0.93), whereas the SNP rs17640804 residing within HCNR4 depicts log10(BF) = 1.7 and ((α = 0.96) providing strong evidence that both of these SNPs filtered earlier by FDIST analysis are under positive selection (Table 2). However, BayeScan did not confirm the signatures of balancing selection on SNP rs28458974 within CNE6 (Fig. 2). To further validate the FDIST and BayeScan based data, an additional approach, “extended Lewontin and Krakauer test (FLK),” was employed (Table 2). FLK test statistics revealed that SNP rs17640804 within HCNR4 and SNP rs3801232 within CNE1 are evolving under positive selection with significant P value, i.e., 0.034 and 0.041 respectively (Table 2).

Table 2 Identification of outlier loci with significantly higher or lower genetic differentiation than expected under neutrality in the GLI3 gene associated CNS-specific enhancers
Fig. 2
figure 2

Selective outlier loci in the human GLI3 gene associated CNS-specific enhancers using FDIST and BayeScan. (A) Depicts FDIST analysis. It evaluates the relationship between fixation index (FST) and expected heterozygosity (He), describing the expected distribution of FST vs He (FDIST) under an island model of migration with neutral markers. Value of heterogenicity (expected) HE is shown along x-axis while the y-axis represents the fixation index (FST) values. The red area indicates positive selection, gray area neutrality, and yellow area balancing selection. SNPs are represented as blue dots. According to FDIST analysis, SNP rs3801232 (chr7:42213714-42213714) within CNE1 and SNP rs17640804 (chr7:42091791-42091791) positioned within HCNR4 are significantly deviating from the neutrality, which might suggest that these two SNPs are under positive selection (falling in the red zone). Similarly, based on FDIST interpretations, it can be inferred that SNP rs28458974 (chr7:42012170-42012170) within GLI3 associated CNS-specific enhancer region CNE6 is evolving under balancing selection (falling in the yellow zone). (B) Results of BayeScan algorithm for identification of outlier loci with significantly higher or lower genetic differentiation than expected under neutrality. The q value of a given locus is the minimum FDR (false discovery rate). The vertical dotted line represents the threshold FDR = −1.5 and the red dots indicate the SNPs in the GLI3 gene associated CNS-specific enhancers. SNPs on the right side of the vertical dotted line are putatively under positive selection

Prediction of Allele specific transcription factor binding motifs

FDIST, BayeScan and FLK statistics reveals positive selection on SNP rs17640804 within HCNR4 and SNP rs3801232 within CNE1. To gauge whether these positively selected SNPs are unique to human, a comparative analysis was performed by employing orthologous sequences from human, other mammals, birds, and fish (Supplementary file: Figure S1). Careful inspection of multiple sequence alignments revealed that rs17640804 and rs3801232 are unique to human lineage (Supplementary file: Figure S1). The SNPs rs3801232 (chr7: 42131390, GRCh37/hg19) within human GLI3 CNS-specific enhancer CNE1 carries cytosine (C) residues until teleost fish (Supplementary file; Figure S1). However, in modern human, this cytosine residue is replaced by thymine (T) (Supplementary file: Figure S1A). The substitution of thymine (T) by cytosine (C) in human DNA segment leads to the creation of a novel binding site for the transcription factor SP1 (Fig. 3). Transcriptional factor SP1 is known to express widely in the mammalian CNS and participates in memory and oxidative stress in brain (Ryu et al. 2003; Eppig et al. 2015).Co-immunoprecipitation data suggests that SP1 functions by recruiting the basal TF machinery and by affecting the chromatin state (O’Connor et al. 2016). An example of SP1 based regulation of gene expression, is the previously reported cooperation between SP1 and p300 during neuronal differentiation, where both of these proteins cooperatively activate the p21 promoter to bring about withdrawal of the progenitor neural cells from cell cycle (Chuang et al. 2016). Single nucleotide polymorphism can alter the binding affinity of SP1 transcription factor (Zhang et al. 2009). Hata and colleagues demonstrated that the genomic fragment of AGTRL1 gene having the “G” allele of the SNP rs9943582 has much higher binding affinity for SP1 and, consequently, higher enhancer activity than that of “A” allele (Hata et al. 2007). Previously, a functional “T” to “G” polymorphism at nucleotide 309 in the promoter region of the MDM2 gene (rs2279744) has been identified, which results in the enhancement for binding affinity of SP1 transcription factor (Onat et al. 2006). The SNP rs17640804 (chr7: 42131390, GRCh37/hg19) within HNCR4 possesses cytosine (C) residues that is evolutionarily conserved until birds (Supplementary file: Figure S1B). In modern human lineage this cytosine (C) residue is replaced by thymine (T) (Supplementary file: Figure S1). This single nucleotide change in humans leads to the creation of binding sites for Activating transcription factors 2 (ATF2) (Fig. 3). ATF2 is a member of the activator protein-1 family of transcription factors and is highly expressed in human brain and lever (Pearson et al. 2005). Neuronal cell proliferation and differentiation, learning, memory, synaptic plasticity and some other neuropathological conditions are known to be modulated by ATF 2 transcription factor (Yang et al. 2019). Consistent with the diversity of cellular processes reported to be transcriptionally regulated by ATF2, the number of disease pathologies associated with alterations in ATF2 are also numerous. For instance, the altered expression and phosphorylation state of ATF2 has been associated with neurogenerative pathologies (Watson et al. 2017). Similarly, Xu and colleagues also demonstrated the role of ATF2 in Schizophrenia and depression in Northern Chinese Hans population (Xu et al. 2020).

Fig. 3
figure 3

Effects of SNPs on transcription factor binding sites. (A) Comparison between the ancestral allele “C” and derived allele “T” of polymorphism rs3801232 within CNE1. The substitution of thymine (T) by cytosine (C) in human DNA segment leads to the creation of new binding site for the transcription factor SP1. (B) Depicts the comparison between the ancestral allele “C” and derived allele “T” of rs17640804 within HCNR4 for the putative transcription factor binding sites. This single nucleotide change in humans leads to the creation of binding sites for activating transcription factors 2 (ATF2). Seq: the input sequence length for TFBSs analysis. Start: depict the starting position of the transcription factor binding motifs. End: ending position of the transcription factor binding motifs. Arrowhead: the position of the SNP is shown by a red arrowhead

Haplotypic analysis of SNP rs3801232 and rs17640804

The allelic frequency spectrum of positively selected SNPs rs3801232 and rs17640804 within the human GLI3 associated CNS-specific enhancers is different in different human populations (Supplementary file: Figure S2). For instance, the frequency of ancestral allele “C” for SNP rs3801232 is highest in African population (68%) and least in East Asian (32%) (Supplementary file: Figure S2). Therefore, to investigate whether the selected alleles (rs3801232 and rs17640804) has approached or achieve fixation in one population but remains polymorphic in the remaining human populations, we performed haplotypic analyses, following the methodology of Sabeti and colleagues (Sabeti et al. 2002). They used the extended haplotype homozygosity (EHH) approach along with the haplotype bifurcation diagram to identify regions in the human genome that are being under positive selection in different human populations (Sabeti et al. 2002; Fariello et al. 2013). In our case, extended haplotype homozygosity (EHH) plot and bifurcation diagrams for SNP rs3801232 (Fig. 4A) and rs17640804 (Fig. 4B) across different human populations (America, East Asia, South Asia, Europe and Africa) indicates that the derived alleles “T” for rs3801232 and rs17640804 are segregating under the clear influence of positive selection (then their respective ancestral counterpart), in all human populations analyzed except Africa (Fig. 4 A and B).

Fig. 4
figure 4

Extended haplotype homozygosity (EHH) plot and bifurcation diagrams for SNPs rs3801232 and rs17640804. EHH plots showing the homozygosity of the most frequent extended haplotype around the SNP rs3801232 and rs17640804 in the panel (A) and (B) respectively. The vertical dotted line in EHH plot depicts the position of the SNP. X-axis of the EHH plot shows the upstream and downstream genomic coordinates from the locus of interest. Homozygosity scale is shown on the y-axis, ranging from 0 to 1 (0 implying no homozygosity and 1 complete homozygosity). The five vertical panels depict different human populations (Africa, America, East Asia, Europe, South Asia, and Africa). Haplotype bifurcation diagrams in panel (A) and (B) showing breakdown of linkage disequilibrium in individuals carrying the ancestral and derived allele of the polymorphism rs3801232 and rs17640804 respectively. It is bidirectional with the root representing a core SNP depicted by a vertical dotted line. Thickness of the line corresponds to the number of individuals in a population with shared haplotype. Each vertical panel shows the bifurcation pattern for SNP rs3801232 and rs17640804 in different human populations (Africa, America, East Asia, Europe, South Asia, and Africa)

Discussion

The accurate, precise and robust regulation of gene expression during development is crucial for the normal functioning of cellular events pertaining to cellular homeostasis and differentiation (Burgess 2016; Long et al. 2016). The majority of variants affecting such diverse cellular processes reside outside protein coding regions (Carullo and Day 2019). It has been suggested that ~ 90% of common variants associated with complex human traits do not affect a protein directly but instead reside within cis-regulatory circuits that control gene expression (Ward and Kellis 2016). Therefore, to fully comprehend the disease and evolutionary relevance of the non-coding portion of human genome, it is quintessential to understand the role of non-coding variants in altering the developmental genes expression (Nishizaki et al. 2019.)

The non-coding cis-acting enhancers that regulate mammalian Gli3 gene expression during CNS development have been characterized through functional experiments in several species including human cell lines, mice, zebrafish, and chicken (Abbasi et al. 2007; Coy et al. 2011; Alvarez-Medina et al. 2008) (Fig. 1). Given the fact that GLI3 has been implicated in several disease phenotypes, embryonic development and various cancers types, human specific regulation of this gene might have profound evolutionary implications (Matissek and Elsawa 2020). However, to best of our knowledge, the potential significance of human GLI3 associated non-coding cis-regulators in developmental trait evolution, in particular central nervous system (CNS) evolution have not received enough attention.

Among the functionally characterized set of CNS-specific GLI3 cis-regulators, the CNE1 and HCNR4 were found to show the signatures of positive selection (Fig. 2 A and B). Both of these enhancers possess single nucleotide polymorphisms which are unique to modern human lineage (Supplementary file: Figure S1). It is well known that single nucleotide substitutions within mammalian enhancers often result in the creation or removal of TFBSs, causing alteration in gene expression patterns (Douglas and Hill 2014). For instance, Prabhakar and colleagues have reported, gain of an additional TFBS due to human specific substitutions in the conserved non-coding enhancer element of cut-like homeobox (CUX1) gene, known to have a role in autism (Prabhakar et al. 2006). It has been shown that non-coding substitutions within TFBSs of this enhancer might impact the expression patterns of the concerned gene body (CUX1) (Levchenko et al. 2018). These findings were further corroborated by Zehra and Abbasi (2018). They reported human specific modifications in SOX2, RUNX, and FOS/JUND binding sites within brain exclusive human accelerated enhancers (Zehra and Abbasi 2018). In line with these expectations, we observed that SNPs within CNE1 and HCNR4 (of GLI3) have led to the creation of putative novel binding sites for transcriptional factors SP1 and Activating transcription factors 2 (ATF2) respectively (Fig. 3 A and B). Conceivably, evolution of novel TFBSs might have altered the spatiotemporal deployment of GLI3 transcripts within human CNS, possibly leading to the evolutionary elaboration of human brain. Therefore, the discovery of positively selected nucleotide residues within a subset of GLI3 associated enhancers offer an exciting opportunity to directly implicate GLI gene family members in developmental trait evolution through further experimental work.

Allelic frequency spectrum of SNPs rs3801232 and rs17640804 within CNE1 and HCNR4 (of GLI3) is different among different human populations (America, East Asia, South Asia, Europe and Africa) (Supplementary file: Figure S2). Natural selection may act to increase the frequency of de-novo beneficial mutations (Deng and Xu 2017). However, the standing genetic variants, those which have been segregating in a population for some time, have also been observed to undergo selection regime and thereby known to contribute to several phenotypic adaptations (Capellini et al. 2017). For instance, evolutionary loss of pelvis structures in freshwater stickleback populations has been associated with mutations that inactivate the fin specific enhancer for the Pitx1 gene (Chan et al. 2010). These Pitx1 associated regulatory substitutions showed signatures of positive selection in pelvic reduced stickleback populations (Coyle et al. 2007). Similarly, the population genetics experiments revealed tremendous diversity of traits among modern human populations, e.g., from lactase persistent to skin pigmentations (Deng and Xu 2017). Much of this trait diversity in human populations arose from genetic adaptations to different environmental pressures (Hancock et al. 2010). However, our findings that SNPs within subset of GLI3 enhancers have undergone positive selection in non-African human populations require further validations. It could be due to many different and non-mutually exclusive reasons, such as variations in the demography, difference in pattern of linkage disequilibrium and different rates of adaptations (Capellini et al. 2017). For instance, the promoter region of a gene that encodes the precursor of opioid neuropeptide (PDYN) exhibits a significant degree of population differentiation between human populations, which is suggestive of local adaptions (Moon et al. 2019). Similarly, patterns of variations in the regulatory region of human LCT gene that confer lactase persistence in African are consistent with the action of selective sweep (Hancock et al. 2010).

Conclusion

To assess the interspecific and intraspecific pattern of genetic variations in functionally characterized GLI3 associated CNS-specific enhancers, the present study integrates various selection and rate analysis approaches and information from the 1000 genome project. A subset of GLI3 associated CNS-specific enhancers was found to evolve under positive selection in all human populations analyzed except Africa. In this subset of positively selected GLI3 associated CNS-specific enhancers, the single nucleotide polymorphisms led to the changes in transcriptional factor binding sites. Evolution of TFBSs, in pre-existing enhancers may allow a gene to acquire a novel expression patterns without compromising the ancestral expressions/functions (Wong et al. 2017). Therefore, the results of this study open avenues for further studies aimed at understanding the possible roles of GLI3 associated enhancers in brain evolution and inter-population variability of gene expression in humans.