Introduction

All cells of a multicellular organism carry the same genetic material coded in their DNA sequence, but cells obviously display a broad morphological and functional diversity. This heterogeneity is caused by differential expression of genes. Epigenetics can be defined as the study of heritable changes of a phenotype such as the gene expression patterns of a specific cell type that are not caused by changes in the nucleotide sequence of the genetic code itself.

These changes are mitotically and in some cases meiotically heritable. Epigenetic regulation mediates genomic adaptation to an environment, thereby ultimately contributing to the phenotype. Epigenetic mechanisms “bring the phenotype into being”, as the developmental biologist Conrad H. Waddington put it in the 1940s [1].

Epigenetic phenomena are mediated by a variety of molecular mechanisms including post-translational histone modifications, histone variants, ATP-dependent chromatin remodeling complexes, polycomb/trithorax protein complexes, small and other non-coding RNAs including siRNAs and miRNAs, and DNA methylation, as described in detail in [2]. These diverse molecular mechanisms have all been found to be closely intertwined and to stabilize each other to ensure the faithful propagation of an epigenetic state over time and especially through cell division. Nonetheless, epigenetic states are not definitive, and changes occur with age in a stochastic manner as well as in response to environmental stimuli. Chromatin modulations play a central role in shaping the epigenome and delineate a functional chromatin topology which serves as the platform for forming regulatory circuits in all cells [3]. Open (euchromatin) and closed (heterochromatin) chromatin states are controlled by histone modifications and histone composition in close crosstalk with the binding of a myriad of non-histone proteins. The basic building block of chromatin is the nucleosome, which is formed of an octamer of histone proteins containing an H3–H4 tetramer flanked on either side by an H2A–H2B dimer, around which 146 bp of DNA are wrapped in 1.65 left-handed super-helical turns. The protruding N-terminal tails of these histones are extensively modified, for example by acetylation, methylation, phosphorylation, and ubiquitylation [4]. The combination of different N-terminal modifications and the incorporation of different histone variants, which have distinct roles in gene regulation, have led to the proposition of a regulatory histone code which determines at least partly the transcriptional potential of a specific gene or genomic region [5, 6].
DNA methylation is closely related to certain chromatin modifications, and enzymes that modify DNA and histones have been shown to interact directly and constitute links between local DNA methylation and regional chromatin structure [7].

In this review, an introduction to the multiple facets of the biology of DNA methylation and their potential use in clinical applications is given. The DNA methylation landscape and the enzymes responsible for adding and potentially removing methyl groups from the DNA will be described, and the various biological processes in which DNA methylation plays a key role are discussed. Different pathologies for which changes in DNA methylation patterns have been investigated will be mentioned, with a certain emphasis on cancer, as most of the disease-associated DNA methylation literature concerns changes in tumorigenesis. The article concludes with a paragraph on the reasons why DNA methylation is a very promising biomarker. Due to space restrictions and the large field of research described in this introduction, over-simplifications and omissions are inevitable, and many of the references refer the interested reader to more detailed reviews.

The Biology of DNA Methylation

DNA methylation is the only genetically programmed DNA modification in mammals. This post-replication modification is almost exclusively found on the 5 position of the pyrimidine ring of cytosines in the context of the dinucleotide sequence CpG [8]. 5-methylcytosine accounts for ~1% of all bases, varying slightly in different tissue types and the majority (75%) of CpG dinucleotides throughout mammalian genomes are methylated. Other types of methylation such as methylation of cytosines in the context of CpNpG or CpA sequences have been detected in mouse embryonic stem cells and plants, but are generally rare in somatic mammalian/human tissues. CpGs are underrepresented in the genome, probably because they act as a mutation hotspot (deamination of methylated CpGs to TpGs). Mutation rates at CpG sites have been estimated to be about 10–50 times higher than other transitional mutations, as the mutation product is a naturally occurring DNA base which may not be appropriately repaired. The elevated mutation rate has led to depletion of the dinucleotide during evolution. Despite this general trend, relatively CpG-rich clusters of approximately 1–4 kb in length—so called CpG islands—are found in the promoter region and first exons of many genes [9]. They are mostly non-methylated corresponding to the maintenance of an open chromatin structure and a potentially active state of transcription. There are around 30,000 CpG islands in the human genome. As CpG islands are mainly unmethylated in the germline, they are less susceptible to deamination and have therefore retained the expected frequency of CpGs. It should be noted that a growing number of CpG islands has been identified that are methylated in non-pathological somatic tissues [10, 11]. 
Depending on the employed set of parameters, a CpG island is defined as having a G + C content of more than 50% (55%), an observed versus expected ratio for the occurrence of CpGs of more than 0.6 (0.65) and a minimum size of 200 (500)  bp [12, 13]. About three quarters of transcription start sites and 88% of active promoters are associated with CpG-rich sequences and might be regulated by DNA methylation. Promoter CpG islands of genes differ in their susceptibility to become methylated during normal development as well as during carcinogenesis, which might be due to intrinsic sequence properties [14]. This regulation is controlled in a tissue- and developmental-stage-specific manner and is maintained throughout the life of an individual.
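The criteria above can be illustrated with a minimal Python sketch. The text gives thresholds but no formula for the observed versus expected ratio; the sketch assumes the commonly used form (number of CpG × sequence length) / (number of C × number of G). The function names and the test sequences are hypothetical and chosen only for illustration; real CpG island detection typically scans a genome with a sliding window rather than testing one fixed sequence.

```python
def cpg_island_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed formula: expected CpG count is (C * G) / N, so obs/exp = CpG * N / (C * G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=0.50, min_obs_exp=0.60):
    """Apply the relaxed criteria by default; pass 500, 0.55, 0.65 for the stricter set."""
    n, gc_fraction, obs_exp = cpg_island_stats(seq)
    return n >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    cpg_rich = "CG" * 100    # artificial 200 bp sequence made only of CpGs
    cpg_poor = "CATG" * 50   # artificial 200 bp sequence without any CpG
    print(is_cpg_island(cpg_rich))                       # relaxed criteria
    print(is_cpg_island(cpg_poor))                       # relaxed criteria
    print(is_cpg_island(cpg_rich, 500, 0.55, 0.65))      # stricter criteria
```

Under the stricter parameter set the same 200 bp sequence no longer qualifies because it is shorter than 500 bp, which shows how the reported number of CpG islands depends on the chosen definition.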

DNA Methyltransferases and Methyl-Binding Proteins

The composition of the genome is reflected in and dictates the epigenetic machinery to establish particular local and global epigenetic patterns using both CpG spacing as well as sequence motifs and DNA structure [15, 16]. Mammalian one-carbon metabolism provides the methyl group for all biological methylation reactions. These are dependent on methyl donors (methionine and choline) and co-factors (folic acid, vitamin B12, pyridoxal phosphate) to synthesize the universal methyl donor S-adenosyl-L-methionine (SAM) [17]. During the methylation reaction a methyl group is transferred from SAM to the DNA, leaving S-adenosylhomocysteine, which at high concentrations inhibits the action of DNA methyltransferases.

So far four DNA methyltransferases have been identified (DNMT1, DNMT2, DNMT3A, and DNMT3B) as well as a DNMT-related protein (DNMT3L) [18, 19]. They catalyze the transfer of a methyl group from SAM to the cytosine base. With the exception of DNMT2 which acts probably as RNA methyltransferase in vivo, all Dnmts are essential for embryonic viability as homozygous mutant mice die early. Simplified, DNMT1 acts as maintenance methyltransferase as it prefers hemi-methylated templates. It is located at the replication fork during the S phase of the cell cycle and methylates the newly synthesized DNA strand using the parent strand as a template. Consequently, it passes the epigenetic information through cell generations. De novo methylation is carried out by the methyltransferases DNMT3A and DNMT3B. These enzymes have certain preferences for specific targets (e.g., Dnmt3a together with Dnmt3L methylates maternal imprinted genes and Dnmt3b localizes at minor satellite repeats) but also work cooperatively to methylate the genome. Possible trigger mechanisms to initiate de novo methylation include preferred target DNA sequences, RNA interference, certain chromatin structures induced by histone modifications and other protein–protein interactions [18, 19].

DNA-methyl-binding domain (MBD1-4) proteins or methyl CpG-binding proteins (MeCP2) recognize and bind to methylated DNA [18, 20]. They recruit transcriptional corepressors such as histone deacetylating complexes, polycomb proteins, and chromatin remodeling complexes and attract chromodomain-binding proteins. Besides the structurally related MBD proteins, methylated DNA can also be bound by some zinc finger proteins such as Kaiso and the more recently discovered ZBTB4 and ZBTB38 proteins that are also able to repress transcription in a methylation-dependent manner [20, 21].

Although active demethylation of DNA undoubtedly occurs during development, the exact mechanisms for global as well as for gene-specific demethylation events are still unclear and subject to much debate [22]. Demethylation might be caused by the replacement of methylated cytosines through an enzymatic process in which a glycosylase plays a major role [23] or by a deamination-induced repair process as the AID cytidine deaminase has been shown to deaminate cytidine in RNA as well as 5-methylcytosine in DNA [24]. Other proposed mechanisms include a deamination of the methylated cytosine to thymine by the DNA methyltransferase DNMT3B itself in the absence of the universal methyl donor S-adenosylmethionine [25] or more recently in the zebrafish model of the Aid cytidine deaminase in conjunction with the thymine glycosylase Mbd4 and the Gadd45a protein [26]. However, all postulated mechanisms are not very efficient indicating that the underlying mechanisms might be more complex than currently suggested [22, 27].

Development

Cytosine methylation is essential for mammalian embryogenesis, during which methylation levels change dynamically [28, 29]. During development and differentiation, the mammalian organism creates a number of cell-type-specific differentially marked epigenomes, whose identity is inter alia defined by their respective DNA methylation patterns. Thus, the human body with one genome contains approximately 180 different epigenomes. Mammalian development is characterized by two waves of genome-wide epigenetic reprogramming, in the zygote and in the primordial germ cells. In mice (and probably other mammals), the genome becomes demethylated during pre-implantation, probably to initiate cellular differentiation. Most of the paternal genome is actively and rapidly demethylated leading to the erasure of most paternal germ line methylation marks, while the maternal genome remains methylated or may undergo further de novo methylation. After completion of the first cell cycle, loss of methylation on the maternal allele occurs passively through cell divisions until blastocyst formation. This demethylation removes most of the pre-existing patterns of methylation inherited from the parental DNA. Around implantation, when cells start to commit to different developmental lineages, DNA methylation levels are then restored by de novo methylation. Disruption of any of the DNA methyltransferases results in embryonic lethality and hypomorphic alleles of Dnmt1 result in genome-wide deregulation of gene expression. The second reprogramming event also occurs during embryogenesis but only in the primordial germ cells, where DNA methylation patterns are erased at all single-copy genes (including imprinted genes) and some repetitive elements [30]. Depending on the sex of the newly formed germ line, imprints at paternally methylated loci are restored shortly after birth, while imprints at maternally methylated loci are established only during the last stages of oogenesis.
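The difference between maintained methylation and passive loss through cell divisions can be made concrete with a small toy calculation. This is an illustrative sketch under simplifying assumptions that are not taken from the text: at each replication the parental strand keeps its marks, the newly synthesized strand acquires each parental mark with a fixed probability (here called maintenance_efficiency, a hypothetical parameter), and there is neither de novo methylation nor active demethylation. Under these assumptions the mean strand-level methylation m changes per division as m × (1 + efficiency) / 2.

```python
def mean_methylation(m0, divisions, maintenance_efficiency):
    """Mean strand-level methylation after a number of cell divisions.

    Toy model (an assumption for illustration): old strands keep their marks,
    new strands copy each parental mark with probability
    `maintenance_efficiency`; no de novo methylation, no active demethylation.
    """
    m = m0
    for _ in range(divisions):
        # Half of all strands are old (level m), half are new (level efficiency * m)
        m = m * (1 + maintenance_efficiency) / 2
    return m


if __name__ == "__main__":
    start = 0.8  # hypothetical starting methylation level
    for efficiency in (1.0, 0.95, 0.0):
        levels = [round(mean_methylation(start, n, efficiency), 3) for n in range(6)]
        print(efficiency, levels)
```

With perfect maintenance the level stays constant, whereas without any maintenance it halves at every division, which is the dilution behavior expected for passive demethylation.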

Modifications to the environment during early development can also lead to permanent changes in the patterns of epigenetic modifications (see also the paragraph on environmental and nutritional epigenetics). For example, differences in the cell culture medium lead to differences in cleavage kinetics and blastocyst formation and to disturbed epigenetic profiles at imprinted gene loci [31]. This might also partly account for the increased incidence of imprinting disorders in children born from assisted reproduction technologies (ART) [32]. It should be noted that reduced fertility has also been linked to epigenetic changes. An alternative explanation is therefore that people with incorrect epigenetic information need to resort more often to reproductive technologies and that the incorrectly programmed gametes, rather than ART, increase the risk for epigenetic disorders in their children [33, 34]. The truth probably lies in a combination of both explanations. An incomplete erasure and reprogramming of the epigenetic patterns might also be one of the reasons for the low success rate of cloning using somatic cell nuclear transfer, i.e., the fusion of a somatic cell with an enucleated oocyte [35]. Global as well as gene-specific DNA methylation patterns, in particular at imprinted gene loci (see below), are disturbed in cloned animals and lead to a multitude of pre- and perinatal developmental abnormalities.

Epigenetic changes are also an integral part of aging and cellular senescence, whereby the overall content of DNA methylation in the mammalian and human genome decreases with age. Simultaneously, distinct genes acquire methylation at specific sites such as their promoters, a situation that strikingly resembles the DNA methylation changes that are found in cancer [36].

Transcription

Transcription does not occur on naked DNA, but in the context of chromatin which critically influences the accessibility of the DNA to transcription factors and the DNA polymerase complexes. DNA methylation, histone modifications, and chromatin remodeling are closely interwoven and constitute multiple layers of epigenetic modifications to control and modulate gene expression through chromatin structure [37]. DNMTs and histone deacetylases (HDACs) are found in the same multi-protein complexes; and MBDs interact with HDACs, histone methyltransferases as well as with the chromatin remodeling complexes. Furthermore, mutations or loss of members of the SNF2 helicase/ATPase family of chromatin remodeling proteins such as ATRX or LSH lead to genome-wide perturbations of DNA methylation patterns and inappropriate gene expression programs.

Cytosine methylation of CpG dinucleotides is found in close proximity to critically important cis-elements within promoters and is often associated with a repressed chromatin state and inhibition of transcription. In many cases, methylated and silenced genes can be reactivated using DNA methylation inhibitors such as 5-azacytidine. However, it should be noted that an unmethylated state of a CpG island does not necessarily correlate with the transcriptional activity of the gene, but rather that the gene can be potentially activated. On the other hand, the simple presence of methylation does not necessarily induce silencing of nearby genes. Only when a specific core region of the promoter that is often—but not necessarily—spanning the transcription start site becomes hypermethylated, the expression of the associated gene is modified [38]. The methylation status at specific CpG dinucleotides in the core region might therefore better correlate with the expression of the gene than the overall methylation level of the entire CpG island. CpG islands are also found outside promoter regions and these appear generally to be more susceptible to methylation than the respective promoter sequences in various cancers as well as during cell culture. However, methylation of these CpG islands does usually not diminish transcription. It has been proposed that methylation begins in exonic regions and then progressively spreads to CpG islands in other locations including promoter regions. The exact mechanism is still to be elucidated, but it could be that the protection from methylation is lost through the absence of transcription facilitating the access of DNMTs to the DNA as transcription factors and/or transcription/initiation complexes are absent. 
In some cases the methylation density in a promoter core region seems to be crucial to induce transcriptional silencing [39], while in other cases the demethylation of specific CpG sites is sufficient for transcriptional re-activation [40].

Methylation can interfere with transcription in several ways [8]. It can inhibit the binding of transcriptional activators such as Sp1 and Myc to their cognate DNA recognition sequences through steric hindrance. The above-described MBD proteins and the DNMTs themselves bind to methylated DNA and thereby prevent the binding of potentially activating transcription factors. These two protein families also recruit additional proteins with repressive function such as histone deacetylases and chromatin remodeling complexes to the methylated DNA to establish a repressive chromatin configuration.

In many cases DNA methylation follows changes in the chromatin structure and is used as the molecular mechanism to permanently and thus heritably lock the gene in its inactive state [8]. Recent results have also shown that active histone marks such as H3K4Me3 at the transcription start site might permit transcription of a gene when stimulated even in the presence of a partly methylated CpG island immediately adjacent to the transcription start site [41].

Genome Stability

5-Methylcytosine and other modified bases are also found in bacteria, where they constitute an integral part of the restriction-modification system that allows distinguishing between self and invading foreign DNA. DNA methylation plays an important role in the maintenance of genome integrity by transcriptional silencing of repetitive DNA sequences and endogenous transposons. DNA methylation might prevent the potentially deleterious recombination events between non-allelic repeats caused by these mobile genetic elements. In addition, methylation increases the mutation rate, leading to a faster divergence of identical sequences and disabling of many retrotransposons [42].

Imprinting

In mammals, the maternal and paternal genomes are functionally not equal and both are required for normal development. A subset of genes is asymmetrically expressed from only the maternal or the paternal allele in a parent-of-origin-specific manner in all somatic cells of the offspring [43]. These imprinted genes are generally located in clusters and the alleles are differentially marked by DNA methylation, histone acetylation/deacetylation, and histone methylation and often associated with antisense RNAs [44, 45]. About 50 imprinted genes are known in mouse and man, but up to 200 imprinted genes have been computationally predicted [46]. Imprints are established in the gametes by Dnmt3a and at least for maternally imprinted genes the regulatory cofactor Dnmt3L in a parent-of-origin-specific manner. These epigenetic marks in imprinting control regions are not erased in the zygote. Imprinted genes are probably the most important buffering factors for regulating the day-to-day flux between mother and fetus in placental mammals. The H19/Igf2 locus is one paradigm for imprinting and has been extensively studied in mice demonstrating that the physical contacts between differentially methylated regions, containing insulators, silencers, and activators, create a higher-order chromatin structure leading to transcriptional regulation of both H19 and Igf2 [47].

X Inactivation

Random silencing of one of the two X chromosomes in embryonic tissues of female mammals to achieve dosage compensation is another paradigm for a stable and heritable epigenetic state in somatic cells [48]. DNA methylation occurs quite late during the inactivation process. Only after expression of the large non-coding Xist RNA, its coating of the future inactive chromosome, changes in the patterns of histone modifications and variants, and gene silencing are DNA methylation patterns established on the inactive X chromosome, where they are necessary to maintain the inactive X chromosome in its silent state.

Environmental and Nutritional Epigenetics

Epigenetics holds the promise to explain at least a part of the influences the environment has on a phenotype. Studies have demonstrated that epigenetic differences in genetically identical humans (monozygotic twins) accumulate with age and that different environments create different patterns of epigenetic modifications [49]. Differences are therefore largest in twin pairs of old age that have been raised separately. Transient nutritional or chemical stimuli occurring at specific ontogenic stages may have long-lasting influences on gene expression by interacting with epigenetic mechanisms and altering chromatin compaction and transcription factor accessibility. Developmental stages in multicellular organisms proceed according to a tightly regulated temporal and spatial pattern of gene expression accompanied by changes in DNA methylation patterns as described in the above paragraph on development. These changes occur in response to transient stimuli. Therefore, epigenetics provides a mechanism by which physiological homeostasis could be developmentally programmed and inherited.

DNA methylation is dependent on the diet-ingested methyl donor folate. DNA methylation levels correlate with the levels of available folate as well as the genotype-dependent activity of involved enzymes such as 5,10-methylenetetrahydrofolate reductase [50]. In mice, an increase in folic acid intake leads to increased DNA methylation of an allele of the agouti locus, causing gene silencing and a modification of the phenotype [51]. Disorders like intra-uterine growth retardation and neural tube defects as well as the adult onset of many complex diseases have been linked to aberrant methyl metabolism in utero. This modulation of epigenetic patterns in utero has given rise to the developmental origin of disease hypothesis, which postulates that the in utero environment can cause permanent changes to metabolic processes that directly affect postnatal phenotype, confer susceptibility to multifactorial disease at adult age and may also be transmitted to subsequent generations [52]. Both chemical and environmental toxins have been shown to induce changes to DNA methylation patterns without altering the genetic sequence, leading to epimutation-associated phenotypes [53, 54]. Environmental toxins such as benzpyrene and dioxin do appear to promote a transgenerational susceptibility to disease that remains unexplained by genetic means. Endocrine disruptors such as the anti-androgenic fungicide vinclozolin have been shown to alter the DNA methylation patterns in sperm and the effects persist for at least four generations [55, 56]. Long after the stimulus is gone, ‘cellular memory’ mechanisms enable cells to remember their chosen fate; thus, perturbations at an early stage have long-lasting consequences.

Transgenerational Epigenetic Inheritance

Transgenerational epigenetic inheritance refers to the transfer of epigenetic information across generations, i.e., through meiosis [57]. This mechanism would explain the inheritance of a phenotype in addition to the DNA sequence from the parents. The strongest evidence comes from a phenomenon called paramutation in plants where the epigenetic state at one locus is conferred to the homologous allele in a meiotically heritable manner inducing a change in gene expression in the absence of a genetic mutation. Two models have been proposed either based on the pairing of homologous chromosomes or on the RNA-mediated silencing. A recent transgene model in mice lent support to the RNA model [58]. Two loci in mice have been shown to exhibit transgenerational epigenetic inheritance, the agouti viable yellow and the axin fused allele. In both cases the variable phenotype (coat color or presence or absence of a kinked tail, respectively) corresponds to the extent of DNA methylation of an IAP retrotransposon inserted at the respective locus. However, due to the clearing of DNA methylation patterns in primordial germ cells it is not the DNA methylation itself that is responsible for these metastable epialleles. So far transgenerational epigenetic inheritance has not been clearly identified in humans despite some epidemiological evidence [59].

DNA Methylation and Disease

DNA methylation and chromatin structure are strikingly altered in many pathological situations, particularly cancer and various mental retardation syndromes, and altered levels of folate and homocysteine have been repeatedly linked to disease. Although a number of genetic variations have recently been identified that confer susceptibility to a certain disease, in most cases even the worst combination of alleles of several disease susceptibility loci explains only a small percentage of disease occurrence. Consequently, environmental factors undoubtedly play a large role in the actual occurrence of disease. Epigenetic modifications constitute a memory of an organism of all the stimuli or insults it has ever been exposed to.

Disease-associated changes in epigenetic modifications can be classified into changes in genes that are epigenetically regulated and genes that are part of the molecular machinery responsible for correct establishment and propagation of the epigenetic modifications through development and cell division. Aberrant methylation patterns have been reported in various neurodevelopmental disorders including ATRX (X-linked α-thalassemia and mental retardation), Fragile X, and ICF (Immune deficiency, centromeric instability, and facial abnormalities) [60]. The latter is caused by mutations in the DNA methyltransferase 3B. Mutations in the methyl-binding protein MeCP2 are found in Rett syndrome. Imprinting anomalies lead to disorders such as Prader–Willi, Angelman and Beckwith–Wiedemann syndrome, or transient neonatal diabetes [61, 62].

DNA Methylation Changes in Cancer

Cancer is probably the best-studied disease with a strong epigenetic component [63, 64]. In tumors, a global loss of DNA methylation (hypomethylation) of the genome is observed [65] and has been suggested to initiate and propagate oncogenesis by inducing chromosome instabilities and transcriptional activation of oncogenes and pro-metastatic genes such as r-ras [66]. The overall decrease in DNA methylation is accompanied by a region- and gene-specific increase of methylation (hypermethylation) of multiple CpG islands [63, 64]. Hypermethylation of CpG islands in the promoter region of a tumor suppressor or otherwise cancer-related gene is often associated with transcriptional silencing of the associated gene. The number of gene-associated promoters that are known to become hypermethylated during carcinogenesis is rapidly growing. Genes of numerous pathways involved in signal transduction (APC), DNA repair (MGMT, MLH1, BRCA1), detoxification (GSTP1), cell cycle regulation (p15, p16, RB), differentiation (MYOD1), angiogenesis (THBS1, VHL), and apoptosis (Caspases, p14, DAPK) are often inappropriately inactivated by DNA methylation. It should be noted that so far no single gene has been identified that is always methylated in a certain type of cancer. Both hypo- and hypermethylation are found in the same tumor, but the underlying mechanisms for both phenomena have not yet been elucidated. A new dimension has recently been added to epigenetic cancer research with the demonstration of long-range gene silencing by epigenetic modifications [67]. Long-range epigenetic silencing seems to be a prevalent phenomenon during carcinogenesis as a recent survey identified 28 regions of copy-number independent transcriptional deregulation in bladder cancer that are potentially regulated through epigenetic mechanisms [68]. 
While the contribution of genetic factors to carcinogenesis such as the high-penetrance germ line mutations in genes such as BRCA1 and p53 in familial cancers has long been recognized, it has become evident that epigenetic changes leading to transcriptional silencing of tumor suppressor genes constitute an at least equally contributing mechanism. For example, microarray expression profiles of breast tumors with BRCA1 mutations are very similar to those of sporadic breast cancer cases with BRCA1 promoter hypermethylation demonstrating that disruption of BRCA1 function by either genetic or epigenetic pathways leads to the same perturbations [69]. With the exception of haploinsufficient genes, ‘two hits’ are necessary to inactivate the two alleles of a gatekeeper tumor suppressor gene to enable oncogenic progression according to Knudson’s two-hit hypothesis [70]. DNA methylation can act as one hit having the same functional effect as a genetic mutation or deletion as proven by numerous experiments in which re-establishing expression of tumor suppressor genes could be achieved through drugs inducing demethylation. Epimutations can inactivate one of the two alleles, while the other is lost through genetic mechanisms or silence both alleles [71]. Epigenetic changes occur at higher frequency when compared to genetic changes and might be especially important in early stage human neoplasia. They often precede malignancy as extensive CpG island hypermethylation can be detected in benign polyps of the colon, in low- as well as in high-grade tumors [72, 73]. It has therefore been suggested that epigenetic lesions in normal tissue set the stage for neoplasia. DNA hypermethylation could for example not only be detected in dysplastic epithelium of patients with ulcerative colitis, a condition associated with an increased risk for the development of colon cancer, but also in histological normal epithelium [74]. 
Aberrant DNA methylation patterns are therefore probably not a consequence or by-product of malignancy but contribute directly to cellular transformation. It has been extrapolated that aberrant promoter methylation is initiated at ~1% of all CpG islands and as much as 10% become methylated during the multistep process of tumorigenesis [72]. Detection can be carried out in the tissue itself, but, more importantly, recent reports have demonstrated a high level of concordance of DNA methylation patterns in tumor biopsies and matched DNA samples extracted from body fluids such as serum, plasma, urine, and sputum. DNA methylation-based markers are therefore promising tools for non-invasive detection of different tumor types. The most effective way to detect the aberrant methylation is to analyze fluids that have been in physical contact with the site of the respective cancer. A large number of novel sources have been successfully tested including nipple aspirate fluid, breast fine-needle washings, bronchial brush samples, buccal cells, needle biopsies, pancreatic juice, peritoneal fluid, prostate fluid or ejaculate, bronchoalveolar lavages, saliva, exfoliative cells from bladder or cervix, urine, or stool samples [75]. Tumors release a substantial amount of genomic DNA into the systemic circulation and this freely circulating DNA contains the same genetic and epigenetic alterations that are specific to the primary tumor [76]. As the analyzed gene-specific methylation patterns are in most cases absent in control patients, methylation analysis of DNA recovered from plasma and serum can be used as a biomarker for molecular diagnosis and prognosis in various types of malignancies [77]. Besides early detection, the methylation status of CpG islands can be used to characterize and classify cancers.
While for example head and neck, breast, or testicular tumors show overall low levels of methylation, some other tumor types such as colon tumors, acute myeloid leukemias, or gliomas are characterized by high levels of methylation, although some heterogeneity is observed in almost all tumor types. Methylation patterns can be shared by different types of tumors as well as being tumor-type-specific, and methylation profiling can therefore identify distinct subtypes of human cancers [72]. Other important applications of DNA methylation analysis in cancer are the detection of tumor recurrence as well as the prediction and monitoring of a patient's response to, and the effectiveness of, a given anti-cancer therapy [78]. As DNA methylation is a non-mutational and therefore, at least in principle, a reversible modification, it can be used as point of departure for anti-neoplastic treatment by chemical or antisense oligonucleotide-induced demethylation [79].

DNA Methylation and Complex Disease

While most of the interest has so far been focused on epigenetic changes in cancer, it is probable that epigenetic changes directly or indirectly contribute to the susceptibility to and development of many complex or multifactorial diseases [80]. Epigenetic mechanisms are consistent with various non-Mendelian features of multifactorial diseases such as the relatively high degree of discordance in monozygotic twins. Only a few diseases have been studied in some detail. The promoter of the membrane-bound form of COMT is found to be hypomethylated in schizophrenia and bipolar disorder, leading to hyperactivity of the gene, while the RELN promoter displays concomitant hypermethylation in schizophrenia [81]. The expression of the methyl-binding protein MeCP2 is reduced in the frontal cortex of autistic patients and this correlates with altered methylation patterns of the MeCP2 promoter and probably other X-linked genes [82, 83]. There is also some evidence for epigenetic dysregulation in cognitive disorders such as Alzheimer’s and Huntington’s disease [84]. DNA methylation patterns are also globally disturbed in autoimmune diseases such as lupus erythematosus [85] or rheumatoid arthritis [86] and possibly play a role in the pathogenesis of asthma [87]. Epigenetic changes are probably also involved in the pathogenesis of diabetes, metabolic syndrome, and intermediate phenotypes, where disease susceptibility seems to be influenced by the maternal in utero environment, and recent epidemiological evidence also implicates paternal behavior [88]. To further underline the scope of epigenetic alterations in disease, it is interesting to point out that recent research has shown that so-called monogenic diseases such as α-thalassemia, which have previously been attributed solely to genetic alterations, can also be caused by epigenetic alterations at the same locus [89]. 
The field of epigenetics of complex diseases is still in its infancy, but epigenetics might provide the missing link between the genetic susceptibility and the phenotype by mediating and modulating environmental influences differentially depending on the epigenotype of a disease susceptibility locus.

DNA Methylation as a Biomarker

Research has so far mainly focused on the hypermethylation of promoter-associated CpG islands, where hypermethylation is inversely correlated with the transcriptional activity of the associated genes. Correlation between DNA methylation and gene inactivation is a prerequisite for the identification and validation of novel functionally important genes, namely tumor suppressor genes. However, a large number of promoters become hypermethylated during carcinogenesis where there is no evidence that the corresponding gene acts as a tumor suppressor gene. In this case, DNA methylation might still be a useful biomarker for tumor diagnosis or risk assessment if the methylation pattern is specific for a certain tumor type and/or correlates with clinically important parameters. A good example is the classic panel for the detection of the CpG island methylator phenotype, which defines a subtype of colorectal cancers with a distinct phenotype; this panel comprises three MINT (Methylated IN Tumors) fragments [90]. These fragments were identified through differential screening processes and were only later mapped to specific genomic loci. As described above, the analysis of DNA methylation patterns is complicated by the fact that some changes are due to exposure to environmental influences as well as to the accumulation of DNA methylation at some promoters during aging [54]. For DNA methylation to be useful as a biomarker, age-associated changes therefore have to be distinguished from cancer-predisposing alterations.

Biomarkers capable of distinguishing diseased or malignant cells from normal ones must be specific, sensitive, and detectable in specimens obtained through minimally invasive procedures to be clinically applicable. Many biomarkers on the protein, RNA, or DNA level fulfilling these criteria have been discovered. In routine clinical practice, most tumor diagnostics is carried out by biochemical assays determining the presence and/or quantity of enzymes, receptors, growth factors, or hormones. Despite the widespread use of (microarray-based) RNA detection techniques in research facilities, there are some potential pitfalls associated with their use in routine clinical diagnostics, such as the required preservation of mRNA from the tissue, tissue heterogeneity, and the need for normalization. Close attention has to be paid to numerous details of sample extraction, storage, and handling to ensure intra- and interlaboratory reproducibility [91]. DNA-based molecular biomarkers can be more easily transferred from a research laboratory setting into routine clinical diagnostics due to the amplifiable and stable nature of DNA. Methyl groups on cytosines are part of the covalent structure of the DNA, in contrast to other epigenetic marks such as chromatin modifications. Once methylation is acquired, it is in most cases chemically and biologically stable over time, while expression of mRNA and/or proteins can be modified by non-disease-related environmental conditions and can vary over the cell cycle. As most methods determine the ratio between methylated and unmethylated CpGs, DNA methylation analysis is independent of the total amount of starting material. It provides a binary and positive signal that can be detected independently of expression levels. It is therefore easier to detect than negative signals such as loss of heterozygosity. 
If the core region of a promoter CpG island that controls transcriptional activity is defined, the stable DNA-based analyte can be used as a proxy to monitor the (re-)activation of gene expression during treatment. DNA methylation can be analyzed with a growing number of methods that are amenable to high throughput, and quantitative assays eliminate the need for normalization. These methods are applicable to formalin-fixed paraffin-embedded clinical specimens and other archived material. Epigenetic changes, similar to genetic alterations, lead to an altered phenotype of a cell and confer a selective advantage on it. However, in contrast to genetic DNA-based alterations such as point mutations that can theoretically occur at any position in the coding regions of a gene, DNA methylation changes are always confined to the same small regions of a gene (usually the promoter-associated CpG island). Also, the assessment of the methylation status examines the lesion itself, which is the epigenetic inactivation of the promoter, rather than the effect of this alteration such as loss of protein expression or modified enzyme activity. A further advantage is the potential reversal of epigenetic changes by treatment with pharmacological agents, while genetic changes are irreversible [79]. One of the most important criteria for a clinically useful biomarker that enables screening of individuals at potential risk and monitoring of therapy response or disease recurrence is that it can be reliably analyzed in surrogate tissues such as blood or body fluids that can be obtained through minimally invasive procedures. Similar approaches based on the detection of RNA have been complicated by the inherent instability of these molecules and by difficulties in detecting changes in the level of tumor-derived RNA against the background of a large number of molecules derived from normal cells. 
The sensitive and specific detection of tumor-specific DNA methylation patterns at distal sites makes DNA methylation a biomarker of choice for the clinical management of cancer patients.
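
The point made above, that methods reporting the ratio between methylated and unmethylated CpGs do not depend on the total amount of starting material, can be illustrated with a minimal sketch. The function name `methylation_fraction` and the signal values are hypothetical and chosen only for illustration; they are not taken from any assay or data set discussed here, and real assays add their own calibration and noise handling.

```python
def methylation_fraction(methylated: float, unmethylated: float) -> float:
    """Return the fraction of methylated signal at one CpG position.

    Both arguments are hypothetical assay read-outs (e.g. counts or
    intensities) for the methylated and unmethylated state.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signals must be non-negative")
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no signal at this position")
    return methylated / total


# Illustrative (made-up) signals for one CpG, not measured data.
low_input = methylation_fraction(30, 70)

# Ten times more starting material scales both signals equally,
# so the ratio, and hence the reported methylation level, is unchanged.
high_input = methylation_fraction(300, 700)

print(low_input, high_input)
```

Both calls yield the same fraction because the total signal cancels out in the ratio, whereas an absolute expression measurement would need to be normalized to the input amount.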

Although epigenetics in general and DNA methylation research in particular are advancing at a breathtaking speed, we are probably seeing only the tip of the iceberg, and we will see a large increase in the number of identified epigenetic changes over the next few years together with the first genome-wide mammalian epigenome maps at single-nucleotide resolution. Elucidation of the epigenetic changes occurring during development, investigation of their subtle but persistent alteration in response to environmental and chemical insults at doses far below those required for visible changes of the phenotype in toxicity tests, and analysis and mapping of the changes to DNA methylation patterns occurring during tumorigenesis and in other pathologies will enhance our understanding of the importance of epigenetic changes in development and disease. This knowledge might ultimately improve existing treatments and create new options to prevent, slow the progression of, or eventually cure some diseases.