Elsevier

Neuropeptides

Volume 38, Issue 1, February 2004, Pages 1-15
News and reviews
Post-genomic approaches to exploring neuropeptide gene mis-expression in disease

https://doi.org/10.1016/j.npep.2003.09.004

Abstract

In the past, a great deal of time and effort has been spent in the analysis of mutations and polymorphisms of gene coding sequence and their relationships to neurological disorders. Unfortunately, many studies of genes that have been strongly implicated in the development of neuronal conditions have failed to identify any significant coding sequence alterations in affected individuals. It is only relatively recently that mutations affecting gene regulation have been seriously considered as factors in the initiation and exacerbation of neurological disease. In this review, we will examine evidence from our own and other labs demonstrating how mutational or polymorphic changes in the primary structure of non-coding DNA sequences implicated in the development of disease are able to affect the expression of neuronally expressed genes. In addition, we will describe freely available methods of rapidly and accurately identifying likely neuropeptide gene regulatory regions using computer analysis of newly available mouse, rat, human and pufferfish (Fugu) genomic sequence. We will also describe how, in the absence of suitable cell lines, these identified sequences can be analysed in vivo using transgenic analysis. In silico analysis of the available genome sequences of different vertebrates, in combination with cell and transgenic analysis, has the potential to significantly accelerate the identification and characterisation of conserved neuropeptide gene regulatory regions and the identification of the transcription factors (TFs) that bind to them. Only by taking full advantage of these technologies and combining them with the huge and valuable resource represented by human polymorphic linkage analysis and association studies will neurobiologists gain a better understanding of how inappropriate regulation of neuropeptide gene expression can contribute to the progression of neurological disease.

Introduction

Since the discovery of substance P (SP) in 1931 (von Euler and Gaddum, 1931) huge efforts have been made to elucidate the roles of an ever-growing number of identified neuropeptides in regulating a wide range of processes that include pain perception, appetite, digestion, blood pressure, inflammation, mood and placental function (Hokfelt et al., 2000). Further studies have been carried out on human patients suffering a range of painful, debilitating and often fatal diseases to determine the possible roles of neuropeptides in the progression of these diseases. Many of these studies have taken the form of linkage studies or association studies whereby identifiable mutations and polymorphisms of neuropeptide and neuropeptide receptor genes have been associated with the incidence of particular diseases (Mayer and Hollt, 2001; Reiterova et al., 2001; Gard, 2002; Wahle et al., 2002). The vast majority of publications that relate the incidence of certain polymorphisms to disease risk have described how these disease-causing polymorphisms/mutations have affected the coding regions of the genes under study. However, many polymorphisms and mutations that have also been strongly associated with the progression of certain diseases have been found in DNA sequence outside gene coding sequence, often at some distance from the transcriptional start site of any gene. It is now becoming accepted that many of these polymorphisms may indeed be mutations that can affect mRNA splicing, mRNA stability (Mendell and Dietz, 2001; Faustino and Cooper, 2003) or transcriptional regulation.

Because of the ease of access to available cDNA sequence over the previous two decades many biologists have focused on exonic sequence mutations and polymorphisms in an attempt to correlate function with disease. However, it is now becoming accepted that the normal development and functioning of a highly complex organ such as the human brain is every bit as reliant on the correct regulation of the genes involved as on correct protein function. Cis-regulatory elements (CisREs) are short lengths of DNA sequence that are bound by transcription factor (TF) proteins and are found on the same molecule of DNA (often at some distance) as the genes whose expression they regulate. CisREs may increase gene expression (enhancers) or down-regulate gene expression (repressors). Previously, problems in identifying regions of DNA responsible for regulating gene expression have been formidable as a result of the difficulties faced by researchers in finding and characterising the large number of small and often widely distributed CisREs required for normal gene regulation within the brain. Very recently, this situation has changed and analysis of regulatory regions is now increasingly possible because of the huge amount of genomic data available from the human, rodent and pufferfish (Fugu) genome projects. In this review we will describe how a combination of post-genomic bio-informatics, using the genomes of a number of different species, and transgenic analyses can be used to identify and characterise important cis-regulatory regions thereby allowing the exploration of how regulatory mutations may contribute to disease progression. To illustrate our approach we will use specific examples from fields outside neuroscience including examples from our own laboratories in addition to examples from studies of developmental biology and immunology.

Changes in the genome can occur at any time in a variety of ways, ranging from point mutations induced by chemical deamination events and DNA polymerase replication mistakes to large-scale events, often affecting megabases of DNA, such as chromosomal rearrangements and deletions. Coding polymorphisms of key genes have often been linked to particular diseases, mostly due to the relative ease with which coding mutations can be mechanistically linked to diseases such as cystic fibrosis, Duchenne muscular dystrophy and sickle cell anaemia. However, the majority of these dramatic diseases do not affect a high proportion of the population in industrial countries who, instead, tend to suffer diseases associated with middle and old age such as rheumatic illness, cardiovascular disease, cancers and neuro-degenerative disease. All of these diseases have been shown to have strong, but complex, genetic components. Although significant progress has been made in discovering the genetic basis of susceptibility to these diseases, in many cases, no coding mutation has been identified that can account for their progression. In many cases problems associated with mRNA splicing (Faustino and Cooper, 2003) and stability (Mendell and Dietz, 2001) have compromised the function of the protein. However, in many other cases it has been shown that gene mis-regulation due to point mutations, polymorphisms and chromosomal deletions/recombinations within critical cis-regulatory domains may be another important factor in the susceptibility of certain individuals to many diseases (Amouyel, 2002; Cheung and Spielman, 2002; Eichner et al., 2002; Howell et al., 2002; Kiyohara et al., 2002; Vercelli, 2002; Laws et al., 2003).

Several cases of regulatory polymorphisms have been observed within the regulatory regions of neuropeptides. Perhaps one of the best-defined examples is the promoter of the AVPR1A gene that encodes the receptor for arginine vasopressin (AVP). Interesting studies have been carried out in relation to the AVP receptor whereby it was shown that particular promoter variations were able to define the social behaviour of prairie voles (Wang and Young, 1997; Wang et al., 1997). These studies have now been extended to look at AVP promoter variation in autism (Kim et al., 2002) and predisposition to depression (Landgraf, 2003).

Point mutations within CisREs important in maintaining or up-regulating the expression of critical genes often result in reduced expression levels of the proteins encoded by these genes. For example, it has been found that a mutation within the promoter of the connexin-32 gene reduced expression of the gene in Schwann cells, where it plays an important role in nerve cell myelination. This mutation caused many symptoms associated with the pathogenesis of X-linked dominant Charcot-Marie-Tooth disease (Wang et al., 2000). Furthermore, many regulatory mutations have been discovered in the promoters of the human LDL and lipoprotein lipase genes which have been found to affect levels of circulating lipids, thus influencing susceptibility to atherosclerotic vascular disease (Kontula and Ehnholm, 1996).

A further example of how polymorphisms can alter gene expression both quantitatively and spatially has been demonstrated following studies of intron 2 polymorphisms within the human serotonin transporter (SERT) gene that have been associated with affective disorders (Reif and Lesch, 2003). It was demonstrated that intron 2 of the SERT gene contains an enhancer element (STIN2) that can drive marker gene expression in embryonic stem cells (Fiskerstrand et al., 1999a). Interestingly, these studies showed that two different polymorphisms of this intronic sequence, STIN 2.10 and STIN 2.12, had significantly different transcriptional properties (Fiskerstrand et al., 1999a). We extended this study and produced mice transgenic for reporter constructs containing STIN 2.10 and STIN 2.12 (MacKenzie and Quinn, 1999). Both polymorphisms produced similar expression patterns within the developing brain but differed dramatically in an area of the rostral hindbrain in which the SERT gene is expressed during early embryonic development, where it may play a role in modulating morphogenic levels of serotonin (Fig. 1C). In addition, the rostral hindbrain is an area of the embryonic brain destined to give rise to the first cells of the serotoninergic system (Fig. 1D). These observations demonstrate that polymorphic differences within regulatory regions of genes critical to emotional modulation are capable of affecting the expression of these genes quantitatively and in a spatially specific manner. The results also support the hypothesis that many neurological disorders affecting the adult population may arise as a result of differences in gene expression that occur during embryonic development.

The products of many genes may cause disease symptoms when expressed inappropriately or at elevated levels (Magdaleno and Curran, 1999). In certain cases the expression of these genes is actively repressed (Love et al., 2000). For example, substance P (SP), when injected into living tissue, causes an inflammatory response and acts as a cytokine by recruiting components of the immune response (Hokfelt et al., 2001). If we consider that inappropriate expression of SP may be harmful to health then it is interesting to find that the transcriptional start site of the rat preprotachykinin-A (PPTA; encodes substance P) can be bound by a repressor protein called Neuronal Restrictive Silencing Factor (NRSF) that, when bound, strongly represses PPTA expression (Millward-Sadler et al., 2003). Intriguingly, we were able to show that removal of the NRSF binding site from 3 kb of the rPPTA promoter resulted in inappropriate marker gene expression in osteogenic tissues (Fig. 2) (Millward-Sadler et al., 2003). Considering the known roles of SP in inflammation and pain, the possibility that perturbations within the PPTA NRSF binding site may play a role in the development of rheumatic disease has not escaped our notice.

Individual nucleotide differences may be sufficient to alter the affinity of a CisRE for a particular TF. For example, a relationship between a single nucleotide polymorphism (SNP) in the TNFα promoter (TNF−857T) and increased risk of inflammatory bowel disease (IBD) has been recently established by linkage analysis (van Heel et al., 2002). This elegant study used a combination of population genetics and molecular biology to show that the TNF−857T polymorphism resulted in higher levels of TNFα expression from blood cultures taken from patients exposed to bacterial lipopolysaccharides. It was also observed that OCT1 binds the TNF−857T polymorphism but not TNF−857C, and interacts with the pro-inflammatory NFκB TF. It was hypothesised from this study that OCT1 binding to the TNFα promoter resulted in elevated TNFα protein expression by circulating monocytes and a higher susceptibility to IBD (van Heel et al., 2002). All these examples support the contention that regulatory mutations and polymorphisms play an important role in the development of disease symptoms.
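The idea that a single base change can alter TF affinity can be illustrated with a position weight matrix, a standard way of scoring candidate binding sites. The following is a minimal sketch only: the count matrix, the motif and the two allele sequences are invented for illustration and do not represent OCT1, the TNFα promoter or any data from the studies cited above.

```python
import math

# Hypothetical base counts for a made-up 6 bp motif (consensus ATGCAA).
# These numbers are illustrative assumptions, not measured binding data.
COUNTS = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 1, "G": 1, "T": 8},
    {"A": 1, "C": 0, "G": 8, "T": 1},
    {"A": 1, "C": 8, "G": 1, "T": 0},
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 7, "C": 1, "G": 1, "T": 1},
]


def log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds scores."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((col[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, site):
    """Sum the log-odds score of each base in a site of motif length."""
    return sum(col[base] for col, base in zip(pwm, site))


def best_score(pwm, seq):
    """Return the highest-scoring window of motif length within seq."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(score(pwm, seq[i:i + width])
               for i in range(len(seq) - width + 1))


pwm = log_odds(COUNTS)
allele_1 = "GGATGCAAGG"  # hypothetical allele carrying the consensus site
allele_2 = "GGATGTAAGG"  # same sequence with a single C-to-T change
difference = best_score(pwm, allele_1) - best_score(pwm, allele_2)
```

A positive `difference` means the first allele is predicted to be the better match to the motif. A score of this kind is only a prediction; binding differences such as those reported for the two TNF−857 alleles still have to be demonstrated experimentally.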

Current attempts at characterising CisREs often start with the cloning of large areas of gene flanking sequences, derived from genomic phage/cosmid clones or high fidelity PCR amplification, into plasmid constructs carrying easily identifiable marker genes. What usually follows is the laborious task of sequentially deleting chunks of that DNA and testing each of the deletion constructs for transcriptional activity within either a cell culture based or transgenic system, a process often referred to as “promoter crunching” (MacKenzie et al., 1997). This type of “promoter crunching” approach has been effective in the past for identifying regulatory regions lying close to the loci of the genes they regulate. However, “promoter crunching” can not only be extremely tedious and wasteful of time and resources but frequently fails to isolate many of the cis-regulatory regions required for gene expression. The following paragraphs describe the advantages and shortcomings of several approaches that have been used for studying the transcriptional activity of gene flanking sequences.
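The logic of a sequential deletion series can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a protocol from the cited work: the function names, the fixed deletion step and the 50% activity threshold are all hypothetical, and the activities in the usage note are invented numbers.

```python
def deletion_series(flank, step):
    """Return nested 5' truncations of a flanking sequence, longest first.

    Each construct keeps the 3' end (nearest the start site) and loses a
    further `step` bases from the 5' end.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return [flank[start:] for start in range(0, len(flank), step)]


def locate_activity_loss(lengths, activities, drop_fraction=0.5):
    """Find the first adjacent pair of constructs where reporter activity
    falls below `drop_fraction` of the longer construct's activity.

    Returns (longer_length, shorter_length), or None if no drop is seen.
    The 0.5 default is an arbitrary illustrative threshold.
    """
    pairs = list(zip(lengths, activities))
    for (len_a, act_a), (len_b, act_b) in zip(pairs, pairs[1:]):
        if act_a > 0 and act_b < act_a * drop_fraction:
            return (len_a, len_b)
    return None
```

For instance, `locate_activity_loss([3000, 2000, 1000], [100.0, 95.0, 20.0])` returns `(2000, 1000)`, suggesting that a positively acting element lies in the interval removed between the 2000 bp and 1000 bp constructs. Every construct in the series must be built and assayed separately, and an element lying outside the cloned fragment can never be found this way, which is why the approach is described above as laborious.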

Attempts at analysing the transcriptional activity of gene flanking regions often involve the use of transformed or isolated primary cell lines. This approach is often undertaken using marker genes whose products can be quantifiably assayed such as chloramphenicol acetyl-transferase (CAT) or, more recently, luciferase. One advantage of this approach is that, within the cell line being used, the degrees of transcriptional activity produced by the deletion reporter construct can be analysed rapidly and quantitatively, often over several orders of magnitude. These transformed cell-based analyses are highly suitable for determining the transcriptional activity of the regulatory regions of “housekeeper” or constitutively expressed genes. However, many of the genes currently studied for their involvement in human disease display tissue-specific expression that depends on multiple interactions between diverse cell types within a defined three-dimensional matrix (such as the brain) that cannot be easily recreated in a cell culture dish. This is particularly true of neuronally expressed genes whose expression may be regulated by multiple cell interactions involving cell bodies at some considerable distance. Moreover, many questions surround the usefulness of data produced using transformed neuronal cell lines that, due to extensive genomic rearrangements, often express genes in an inappropriate manner. One solution to the problems encountered using transformed neuronal cell lines has been the use of rodent primary neurons. However, the problem of transforming reporter plasmids into primary neuronal cell cultures that are highly refractory to DNA transformation often limits their use for studying brain-specific gene regulation. Furthermore, primary cultures are often heterogeneous in nature and are contaminated with various non-neuronal cell types. Further methods to introduce reporter genes into primary neurons often require a severe challenge to the cell, e.g. microinjection, that makes it difficult to monitor changes in promoter function in response to more subtle challenges such as NGF. To address some of these problems we have successfully used virus vectors to introduce reporter gene constructs into both primary dorsal root ganglion derived cell lines and in vivo prime (Fiskerstrand and Quinn, 1996; Fiskerstrand et al., 1999b; Quinn et al., 2000).

To circumvent the problems of using DNA constructs in cell lines, as listed above, many groups, including our own, have included the analysis of reporter constructs in transgenic animals. This is done by injecting linearised reporter construct DNA directly into the pronuclei of one-cell mouse embryos to produce transgenic animals (MacKenzie et al., 1997; MacKenzie and Quinn, 1999; MacKenzie et al., 2000; MacKenzie and Quinn, 2002; Millward-Sadler et al., 2003). Reporter genes frequently used include the bacterial β-galactosidase gene (LacZ) and the gene for green fluorescent protein (GFP) and derivatives thereof. Both reporter genes have their positive and negative aspects. For example, the LacZ reporter gives excellent resolution of expression at both the gross and cellular level but is subject to problems relating to the depth of tissue that the X-gal stain used to detect LacZ activity can penetrate and, being a bacterial gene, is often susceptible to inactivation by methylation as a transgene within the mouse genome. GFP can be viewed within living tissues but requires specialised equipment to allow its visualisation and does not resist the standard histological processing required for sectional analysis. From a four-dimensional perspective the data produced from transgenic analysis are much more representative of how the native promoter works in its native environment because, once injected into a living embryo, the reporter construct integrates into the genome and becomes part of a living, growing system. Furthermore, the generation of transgenic animals need not be a major hurdle to the analysis of regulatory regions as, in the hands of a skilled operator, many transgenic animals may be generated from a single day's microinjection. Therefore, transgenic analysis of potential CisREs need not take a long time and transgenic animals, generated by microinjection, can be analysed for marker gene expression at any time during gestation or following birth without the need for breeding transgenic lines.

It must be mentioned, however, that the random nature of transgene (reporter construct) integration following pronuclear injection necessitates the generation of at least three separate individuals (representing three different integration events) per reporter construct that show similar transgene expression before the expression produced can be attributed solely to the transgene. As a possible way around the problem of random integration, many groups have inserted the reporter cassette into specific locations within the stem cell genome (HPRT locus; www.genoway.com) prior to transgenic production. This technology is hoped to reduce the effects of integration position on the transgene.

Many cis-regulatory domains important to the regulation of critical genes lie at distances from the start site of the gene that frequently exceed 50 kb (Carter et al., 2002; Griffin et al., 2002) and may indeed be up to 1 Mb away (Lettice et al., 2002). If we now consider that plasmid constructs become extremely hard to produce over about 16 kb by conventional cloning then it is not surprising that previous attempts at defining gene regulatory regions using plasmids have often met with little success. For example, previous attempts to recapitulate the expression of the PPTA gene in transgenic mice with a reporter gene construct containing up to 5.6 kb of rat 5′ flanking DNA were consistently unsuccessful despite repeated efforts (T. Harmar, personal communication).

The use of yeast artificial chromosomes (YACs), P1 artificial chromosomes (PACs) and bacterial artificial chromosomes (BACs) in finding distant gene regulatory regions largely overcomes the problems associated with plasmid based transgenesis as these constructs can hold very large fragments of DNA (PACs and BACs can hold 250 kb and YACs up to 1 megabase; 1 Mb = 1 × 10⁶ bp) (Monaco and Larin, 1994; Magdaleno and Curran, 1999). Furthermore, by virtue of their large size, they are believed to be able to insulate gene regulatory regions from the insertional effects often encountered using a plasmid based transgenic approach. Using a YAC transgenic approach we were able to recreate the expression of the PPTA gene in transgenic animals using a 380 kb human YAC into which a LacZ reporter gene had been spliced (Fig. 3; MacKenzie et al., 2000; MacKenzie and Quinn, 2002). This experiment demonstrated that artificial chromosome constructs derived from human DNA could be used to generate relevant gene expression patterns within transgenic animals. In addition, this study also demonstrated that all the transcriptional machinery required for expression of the human PPTA gene exists in the mouse and that the human CisREs and mouse TFs involved are highly compatible (MacKenzie et al., 2000; MacKenzie and Quinn, 2002). Although the distance at which regulatory elements can be analysed can now be increased by up to two orders of magnitude compared to analyses using plasmids, researchers are still faced with the problem of finding the regulatory regions of interest within the YAC/BAC construct. Because of their large size YAC and BAC constructs cannot be easily truncated using restriction enzymes. Therefore a truncation approach based on homologous recombination is most often employed. However, even whilst using YAC and BAC technologies, researchers are still faced with trying to find what could be a 100 bp regulatory element within up to 1 megabase (1 Mb = 1 × 10⁶ bp) of DNA. This is not an ideal situation as the vast majority of scientists in the neuroscience community do not have the time or resources to devote to long distance genetic analysis or “promoter crunching” to find cis-regulatory regions and would prefer not to rely on luck. What is urgently needed is a simpler and more widely accessible method of detecting the whereabouts of cis-regulatory regions. The following section will go on to outline some of the exciting discoveries that have resulted from sequence comparison of the human, rat, mouse and pufferfish genomes and will describe how anybody with a computer and access to the internet can start to define gene regulatory regions either flanking or within their genes of interest.

Cis-regulatory regions have been highly conserved through evolution

It has become widely accepted that one of the best measures of the importance of a sequence of DNA to the proper development and functioning of an organism is its level of evolutionary conservation. Certain genomic sequences have been highly conserved because changes in their primary structure reduce the fitness of individuals, who are then selected against by evolution and do not propagate their genes. It has been known for some time that the coding regions of the vast majority of animal genes

Detecting CisREs “on line”

We have reviewed several of the methods previously used to identify and characterise CisREs. A major shortcoming of these approaches is their empirical nature that requires the expenditure of a great deal of time and effort to locate the regulatory elements being sought. What is required to overcome these problems is a method of streamlining and accelerating the detection of cis-regulatory regions by providing some method of predicting where CisREs reside prior to undertaking an involved cell
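One common way to predict where conserved non-coding regions lie is to slide a window along an alignment of orthologous sequence from two species and flag windows of high identity. The sketch below assumes the two sequences have already been aligned by an external tool (with "-" marking gaps); the window size, step and 70% threshold are illustrative assumptions and are not values taken from this review.

```python
def window_identity(aligned_a, aligned_b, window=50, step=10):
    """Percent identity in sliding windows over two pre-aligned sequences.

    Gap characters ("-") never count as matches. Returns a list of
    (window_start, percent_identity) tuples.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    results = []
    for start in range(0, len(aligned_a) - window + 1, step):
        a = aligned_a[start:start + window]
        b = aligned_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        results.append((start, 100.0 * matches / window))
    return results


def conserved_windows(aligned_a, aligned_b, window=50, step=10,
                      threshold=70.0):
    """Return only the windows whose identity meets the threshold."""
    return [(start, pid)
            for start, pid in window_identity(aligned_a, aligned_b,
                                              window, step)
            if pid >= threshold]
```

As a toy usage example, `conserved_windows("ACGTACGTACGT", "ACGTTTTTACGT", window=4, step=4, threshold=75.0)` returns the first and last windows and skips the diverged middle one. Windows flagged in non-coding sequence are only candidates for CisREs and still need experimental testing.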

Analysing candidate gene regulatory elements

The comparative analysis of different vertebrate genome sequences, although able to produce compelling data on the possible locations of hypothetical CisREs, must only be used as an indicator of where likely CisREs may lie in relation to the genes they may regulate and cannot be taken as conclusive evidence of their existence. What should follow are a series of experiments that can be used to better define the characteristics of these possible CisREs and to further explore the identities of the

Conclusions

Although the protein products of genes make individuals, it is highly probable that how these genes are regulated plays a much more important role in what makes us individual. This may extend to individual susceptibility to disease where particular polymorphisms/mutations within neuropeptide gene regulatory domains may alter the spatial, temporal or quantitative expression of certain neuropeptides or their receptors, leading to potentially life-threatening illnesses. The search to find and

Acknowledgements

We would like to thank Steve Davies for critically reading our manuscript. This work was funded by Tenovus Scotland, the BBSRC, the Wellcome Trust and the MRC.

References (73)

  • A MacKenzie et al.

    A yeast artificial chromosome containing the human preprotachykinin-A gene expresses substance P in mice and drives appropriate marker gene expression during early brain embryogenesis

    Mol. Cell. Neurosci.

    (2002)
  • A MacKenzie et al.

    Two enhancer domains control early aspects of the complex expression pattern of Msx1

    Mech. Dev.

    (1997)
  • A MacKenzie et al.

    The human preprotachykinin-A gene promoter has been highly conserved and can drive human-like marker gene expression in the adult mouse CNS

    Mol. Cell. Neurosci.

    (2000)
  • P Mayer et al.

    Allelic and somatic variations in the endogenous opioid system of humans

    Pharmacol. Ther.

    (2001)
  • J.T Mendell et al.

    When the message goes awry: disease-producing mutations that influence mRNA content and performance

    Cell

    (2001)
  • A.P Monaco et al.

    YACs, BACs, PACs and MACs: artificial chromosomes as research tools

    Trends Biotechnol.

    (1994)
  • J.P Quinn et al.

    Molecular models to analyse preprotachykinin-A expression and function

    Neuropeptides

    (2000)
  • A Reif et al.

    Toward a molecular architecture of personality

    Behav. Brain. Res.

    (2003)
  • T Takahashi et al.

    A minimal murine Msx-1 gene promoter. Organization of its cis-regulatory motifs and their role in transcriptional activation in cells in culture and in transgenic mice

    J. Biol. Chem.

    (1997)
  • H.L Wang et al.

    Point mutation associated with X-linked dominant Charcot-Marie-Tooth disease impairs the P2 promoter activity of human connexin-32 gene

    Brain Res. Mol. Brain Res.

    (2000)
  • Z Wang et al.

    Ontogeny of oxytocin and vasopressin receptor binding in the lateral septum in prairie and montane voles

    Brain Res. Dev. Brain Res.

    (1997)
  • W.W Wasserman et al.

    Identification of regulatory regions which confer muscle-specific gene expression

    J. Mol. Biol.

    (1998)
  • S.B Carroll

    Genetics and the making of Homo sapiens

    Nature

    (2003)
  • D Carter et al.

    Long-range chromatin regulatory interactions in vivo

    Nat. Genet.

    (2002)
  • J Castelli-Gair

    Implications of the spatial and temporal regulation of Hox genes on development and evolution

    Int. J. Dev. Biol.

    (1998)
  • S.S Cheah et al.

    Gene-targeting strategies

    Methods Mol. Biol.

    (2000)
  • V.G Cheung et al.

    The genetics of variation in gene expression

    Nat. Genet.

    (2002)
  • J.E Eichner et al.

    Apolipoprotein E polymorphism and cardiovascular disease: a HuGE review

    Am. J. Epidemiol.

    (2002)
  • N.A Faustino et al.

    Pre-mRNA splicing and human disease

    Genes Dev.

    (2003)
  • C.E Fiskerstrand et al.

    Novel cell lines for the analysis of preprotachykinin A gene expression identify a repressor domain 3′ of the major transcriptional start site

    Biochem. J.

    (1999)
  • B Gao et al.

    DNase I footprinting analysis of transcription factors recognizing adrenergic receptor gene promoter sequences

    Methods Mol. Biol.

    (2000)
  • R.E Harlan et al.

    Cellular localization of substance P- and neurokinin A-encoding preprotachykinin mRNA in the female rat brain

    J. Comp. Neurol.

    (1989)
  • T Hokfelt et al.

    Substance P: a pioneer amongst neuropeptides

    J. Intern. Med.

    (2001)
  • W.M Howell et al.

    Gene polymorphisms, inflammatory diseases and cancer

    Proc. Nutr. Soc.

    (2002)
  • S.J Kim et al.

    Transmission disequilibrium testing of arginine vasopressin receptor 1A (AVPR1A) polymorphisms in autism

    Mol. Psychiatry.

    (2002)
  • K Kontula et al.

    Regulatory mutations in human lipoprotein disorders and atherosclerosis

    Curr. Opin. Lipidol.

    (1996)