Introduction

Telomeres

Human chromosomes are capped and stabilised by telomeres. They comprise several thousand copies of a hexamer repeat sequence (TTAGGG)n, a single-stranded 3′ G-rich overhang, and a plethora of generic DNA-binding proteins, tankyrases, and specific telomere-binding proteins collectively termed the ‘shelterin’ complex (Baird 2006; Moyzis et al. 1988; Verdun and Karlseder 2007). Telomeres prevent chromosome ends from being recognised as damaged DNA in need of double-strand break repair and, as a result, protect against chromosome–chromosome fusions and rearrangements, helping maintain genomic integrity (Wright and Shay 2005; Murnane 2006; Blackburn 2001). The telomere has been likened to the plastic tip (aglet) at the end of a shoelace as it prevents degradation and ‘fraying’ of the lace or chromosome. Telomeres are built up in embryonic cells by telomerase, a ribonucleoprotein consisting of an oestrogen-responsive reverse transcriptase component (TERT) and an RNA subunit (TERC) (Greider and Blackburn 1996). They are heterogeneous in length, varying between chromosomes and individuals, and show remarkable sequence homology throughout higher organisms (Blasco 2007; Lansdorp et al. 1996; Londono-Vallejo 2004; Meyne et al. 1989).

Telomeres tend to shorten with each cell division, as this highly repetitive stretch of DNA is inefficiently copied. This is referred to as ‘the end replication problem’ (Valdes et al. 2005; Shay and Wright 2005; Levy et al. 1992); the leading 5′–3′ strand in the synthesis of new DNA can be successfully made to the end, but the lagging 3′–5′ strand cannot as its synthesis is more complex (Verdun and Karlseder 2006, 2007; Broccoli 2004; Petraccone et al. 2008). This leads to a progressive loss in mean telomere length (TL) of 15–66 bp per year (Valdes et al. 2005; Allsopp et al. 1992; Slagboom et al. 1994; Hastie et al. 1990; Mayer et al. 2006). The rate of telomere attrition is greatest in the first year of life [almost ten times the rate of loss compared to age 1–18 and 28 times the rate compared to age 19 and over (Aubert et al. 2012)], but the rate of attrition has also been shown to increase over the age of 50 years (Baird 2006; Cawthon et al. 2003).

Rare mutations in telomere maintenance genes, such as TERT, RTEL1, DKC1 and WRN, can cause dramatically shortened telomeres and increase the risk of several rare diseases, including dyskeratosis congenita, which is characterised by short telomeres and premature ageing (Gupta and Kumar 2010; Knight et al. 1999; Stuart et al. 2015; Crabbe et al. 2007). This contributes to the hypothesis of TL as a measure of ‘biological age’ of both the cell and the organism. The ‘Hayflick Limit’ of 52 mitoses, as measured in cell culture, is the approximate limit of replicative capacity for human cells (Weinstein and Ciszek 2002; Benetti et al. 2007; Hayflick 2003). Beyond this point, telomeres will be below a critical length and gross chromosomal rearrangements through repeated chromosomal breakage–fusion–bridge cycles will occur (Murnane 2006). In response to this ‘crisis’, as signalled via the telomere-associated proteins, the cell cycle will arrest (Wong et al. 2009; von Zglinicki 2003; Verdun et al. 2005). The vast majority of these arrested cells will undergo apoptosis or senesce, protecting the organism from creating a potentially tumourigenic cell (Wong et al. 2009; d’Adda di Fagagna et al. 2003). On rare occasions, a cell can escape apoptosis or senescence and completely bypass arrest. Approximately one in every ten million cells in ‘crisis’ can then re-lengthen their telomeres, to recap and protect whatever gross or localised DNA damage has occurred, through the derepression of telomerase (Stampfer and Yaswen 2003; Bodnar et al. 1998). More than 90 % of cancer cells show such renewed telomerase activity, indicating that this may play a key role in malignant transformation (Cawthon et al. 2003; Kolquist et al. 1998; Meeker 2006; Wu et al. 2003). It has therefore been hypothesised that shorter mean TL may predispose to a number of common diseases of ageing, including cardiovascular disease (Murnane 2006; van der Harst et al. 2010) and cancer (Weischer et al. 2013; Mirabello et al. 2010; Risques et al. 2007; Pooley et al. 2010), and thus could be used as a biomarker of disease risk.

Measurement of telomere length

One of the major challenges has been the reliable measurement of TL to properly test such hypotheses. Blood leukocytes yield high-quality DNA that is suitable for TL assays, and blood is a convenient tissue to collect for epidemiological studies. Leukocyte TL is thus generally measured as a marker of overall TL, under the assumption that within an individual TL is generally strongly correlated across tissue types. After the isolation and characterisation of the telomere sequence in the late 1980s (Moyzis et al. 1988; Meyne et al. 1989), the first and only method of length determination for many years was TRF (terminal restriction fragment) Southern blotting (Bryant et al. 1997; Cherkas et al. 2008; Bataille et al. 2007). Although the method generated an absolute value (in kb) of mode TL for each sample, it was not particularly sensitive and used a large amount of DNA per sample (~1 µg).

In recent years, quantitative PCR (Q-PCR) assays to measure mean TL have been developed. Q-PCR assays can be used in high-throughput laboratories, since they are simple and rapid to perform and require minimal quantities of DNA (<100 ng per sample) (Cawthon 2002). They have been performed on tens of thousands of samples to date to study TL with respect to smoking (McGrath et al. 2007), obesity and dietary factors (Cassidy et al. 2010), stress (Surtees et al. 2011), general physical health (Harris et al. 2006), oxidative damage (Shen et al. 2009) and cancer risk (Pooley et al. 2010, 2013; Bojesen et al. 2013). The discovery of genetic variants that are significantly associated with Q-PCR-determined TL has provided proof that this is a valid and sensitive measurement tool (Codd et al. 2010; Pooley et al. 2013; Bojesen et al. 2013). However, debate over the relative merits of the different assays continues. In a recent evaluation, inter-laboratory coefficients of variation were higher for Q-PCR than for other methods (Martin-Ruiz et al. 2014). One disadvantage of Q-PCR is that absolute values are not generated, so it is difficult (but not impossible) to compare assay results from batch to batch, study to study and across different laboratories.
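The relative nature of Q-PCR measurements can be illustrated with a minimal sketch. It assumes the common 2^-ΔΔCt calculation with perfect (100 %) amplification efficiency for both the telomere and single-copy-gene reactions; the function name, argument names and Ct values are hypothetical and are not taken from any of the cited studies.

```python
def relative_ts_ratio(ct_telomere, ct_single_copy,
                      ref_ct_telomere, ref_ct_single_copy):
    """Relative telomere/single-copy-gene (T/S) ratio of a sample.

    Assumes 100 % amplification efficiency (a doubling per cycle), so the
    ratio relative to a reference DNA is 2 ** -(delta_sample - delta_ref).
    """
    delta_sample = ct_telomere - ct_single_copy        # sample dCt
    delta_ref = ref_ct_telomere - ref_ct_single_copy   # reference dCt
    return 2.0 ** -(delta_sample - delta_ref)


# Hypothetical Ct values: the sample reaches threshold one cycle earlier
# than the reference in the telomere reaction, so its T/S ratio is doubled.
print(relative_ts_ratio(14.0, 20.0, 15.0, 20.0))
```

Because the result is expressed relative to whichever reference DNA is run on the plate, ratios from different batches or laboratories are only comparable if the same reference (or a calibration against it) is used.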

Other techniques used to investigate TL include (a) STELA: a PCR-based technology amplifying specific chromosomes (Xu and Blackburn 2007); (b) Q-FISH: quantitative fluorescence in situ hybridisation, whereby fluorescently labelled probes complementary to the telomere repeats are hybridised to a metaphase spread of chromosomes (Zheng et al. 2009); and (c) Flow-FISH: an adaptation of Q-FISH incorporating flow cytometry, whereby the TL of multiple cell types can be measured and compared (Baerlocher et al. 2006). These other techniques, although incredibly specific and accurate, are very labour intensive and as yet unsuitable for large-scale studies of TL and disease. Most recently, the ‘TelSeq’ method has been shown to produce results that correlate well with existing technologies (Ding et al. 2014): whole-genome next-generation sequencing data are mined for reads that are rich in telomere sequence, and relative length is determined. With the potential to be relatively high-throughput, this may overtake Q-PCR as the method of choice in future studies.
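The idea of mining sequencing reads for telomere-rich sequence can be sketched as follows. This is a simplified illustration and not the published TelSeq implementation: the repeat threshold `min_repeats`, the function names and the toy reads are assumptions, and the further normalisation steps needed to turn a read fraction into a length estimate are not reproduced here.

```python
import re

# The telomere hexamer and its reverse complement.
TELOMERE_REPEAT = re.compile(r"TTAGGG|CCCTAA")


def count_repeats(read):
    """Count non-overlapping telomere hexamers in a single read."""
    return len(TELOMERE_REPEAT.findall(read.upper()))


def telomeric_read_fraction(reads, min_repeats=7):
    """Fraction of reads with at least `min_repeats` telomere hexamers.

    `min_repeats` is an illustrative threshold, not a validated setting.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    telomeric = sum(1 for r in reads if count_repeats(r) >= min_repeats)
    return telomeric / len(reads)


# Toy reads: two are telomere-rich, two are not.
toy_reads = [
    "TTAGGG" * 8,
    "ACGT" * 12,
    "CCCTAA" * 7,
    "TTAGGG" * 3 + "ACGTAC" * 5,
]
print(telomeric_read_fraction(toy_reads))
```

A higher fraction of telomere-rich reads at a given sequencing depth suggests longer telomeres on average, which is why the output is a relative rather than an absolute measure.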

Telomere length

Inter- and intra-individual variation in telomere length

There is considerable variation between individuals both in absolute TL and the rate of telomere shortening (Chen et al. 2011), even from birth (Okuda et al. 2002; Akkad et al. 2006). Much of this variation is attributable to either measurement error or variation in TL between cells in the same individual. Although consistent differences in both absolute TL and rate of attrition have been observed between different cell types, TLs have been found to be strongly correlated across different cell types within the same individual (Takubo et al. 2002; Aubert et al. 2012; Daniali et al. 2013).

Demographic factors associated with telomere length

Various factors have been associated with inter-individual TL. Unsurprisingly, given the gradual erosion of telomeres with each cell division, age is by far the strongest predictor of individual TL, explaining an estimated 17.5 % of the inter-individual variation in TL (Daniali et al. 2013). Other demographic characteristics, such as sex and ethnicity, have been associated to a lesser extent. Men have significantly shorter telomeres than women, and their telomeres decline more rapidly with age (Aubert et al. 2012; Möller et al. 2009; Weischer et al. 2012; Mayer et al. 2006).

Environmental factors associated with telomere length

Many environmental factors, particularly those indicating an “unhealthy lifestyle” (including obesity, smoking, lack of exercise and alcohol use) have been frequently, though somewhat inconsistently, associated with shorter telomeres (Cherkas et al. 2008; Cassidy et al. 2010; Weischer et al. 2012; Mirabello et al. 2009; Strandberg et al. 2011; Nawrot et al. 2004; LaRocca et al. 2010) and with the rate of telomere shortening (Chen et al. 2011). Their estimated effect on TL is much less than that of ageing; for example telomeres have been estimated to be 240 bp shorter in obese women and 5 bp shorter for every pack year smoked (Valdes et al. 2005). Given that many of these factors are known to lead to oxidative stress [e.g. obesity and smoking (Burke and Fitzgerald 2003; Dandona et al. 2004)] and that oxidative stress is known to accelerate shortening of telomeres (von Zglinicki 2000, 2002; Houben et al. 2008), they may well be directly affecting TL. However, given that both TL and these factors are changing and having an effect over the entire lifespan of an individual, it is difficult to establish directionality of effect, particularly when the various factors are so highly correlated. Evidence for the causality of a trait may be strengthened by showing that genetic predictors of that trait are also predictors of TL; for example, in a large study, a genetic predictor of body mass index was investigated, but not shown to be associated with TL (Du et al. 2013).

Hereditary factors associated with telomere length

In addition to the rare genetic mutations discussed earlier, at least some of the population variation in TL is explained by common genetic polymorphisms. Before any genotyping was conducted, various studies had shown that TL was heritable, with a strong correlation (r² = 0.25) between maternal and newborn TLs (Akkad et al. 2006); heritability estimates range from 36 to 86 % in twin and other familial studies (Slagboom et al. 1994; Vasa-Nicotera et al. 2005; Njajou et al. 2007; Bakaysa et al. 2007; Atzmon et al. 2009). It should be remembered that much of this heritability may be due to shared environment, which is notably not modelled in those studies that estimate heritability to be over 50 %. Family-based linkage analyses have also been conducted in two studies, finding significant evidence for linkage at 12q12.22 (Vasa-Nicotera et al. 2005; Mangino et al. 2008) and 14q23.2 (Andrew et al. 2006), although neither study appears to replicate the other’s findings.

The natural next step is to try to identify specific genetic regions that are associated with TL in a genome-wide association study (GWAS). The first GWAS to reach genome-wide significance identified a locus that includes TERC (which encodes the telomerase RNA component), with each copy of the minor allele conferring a roughly 75 bp reduction in TL (Codd et al. 2010). Subsequent GWAS have identified further single-nucleotide polymorphisms (SNPs) associated at genome-wide significance with TL including in the vicinity of OBFC1, which encodes the human homolog of a yeast protein involved in the replication and capping of telomeres (Levy et al. 2010), CTC1, which is also involved in telomere maintenance (Mangino et al. 2012), and ZNF676, a zinc finger protein whose role in telomere biology is unknown (Mangino et al. 2012). By far the biggest GWAS of TL was a meta-analysis including 37,684 individuals with replication in 10,739 individuals (Codd et al. 2013). This replicated the associations with TERC and OBFC1, revealed novel associations between TL and SNPs in the region of 3 genes known to be involved in telomere biology (TERT, NAF1 and RTEL1) and found novel associations at two further loci (19p12 and 2p16.2) with no obvious candidate telomere-related genes. Despite the size of the study, the total variance in TL explained by the seven variants reaching genome-wide significance was only about 1 %, leaving the majority of the genetic variation influencing TL unexplained.

Pooley et al. (2013) also reported variants of genome-wide significance with TL in TERC, TERT and OBFC1 in the analysis of a custom genotyping array (“iCOGS”) in breast cancer cases (n = 11,024) and healthy controls (n = 15,065). There was also supportive evidence (p < 5 × 10⁻⁴) of associations with SNPs in NAF1, RTEL1 and at 2p16.2. However, at 2p16.2, they found the minor allele of a surrogate SNP (rs10165485) to be associated with longer TL, in contrast to the published rs11125529 association with shorter TL (Codd et al. 2013) (pairwise r² between SNPs = 0.98). The reportedly associated SNP at 19p12 was not directly genotyped, nor was there any good surrogate, so this association could not be tested.

Epidemiological evidence of the relationship between telomere length and common disease

Epidemiological studies have established an association between shorter TL and the risk of various age-related common diseases.

Cardiovascular and metabolic disease

Haycock et al. (2014) recently reviewed the evidence for such an association with cardiovascular disease (coronary heart disease and cerebrovascular disease), conducting a meta-analysis of 24 studies. Evidence was seen of a relationship with coronary heart disease: estimated relative risk (RR) (comparing the shortest versus the longest third of TL) 1.54, 95 % confidence interval (1.30, 1.83), with moderate between-study heterogeneity (I² = 64 %), and the association remained when adjusting for publication bias or restricting to prospective studies. A similar effect size was seen for the association with cerebrovascular disease [RR 1.42 (1.11, 1.81)], although there was no real evidence for an effect when restricting the meta-analysis to prospective studies.
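As a worked illustration of how a reported RR and confidence interval translate into a test statistic, the sketch below recovers the standard error of the log RR from the coronary heart disease estimate quoted above. It assumes the interval is symmetric on the log scale and was constructed with the normal critical value 1.96; the function name is hypothetical.

```python
import math


def log_rr_and_se(rr, lower, upper, z_crit=1.96):
    """Return (log RR, standard error) from an RR and its 95 % CI.

    Assumes the CI was built as exp(log_rr +/- z_crit * se).
    """
    log_rr = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return log_rr, se


# Coronary heart disease estimate quoted in the text: 1.54 (1.30, 1.83).
log_rr, se = log_rr_and_se(1.54, 1.30, 1.83)
z = log_rr / se
print(round(log_rr, 3), round(se, 3), round(z, 2))
```

Working on the log scale in this way is also what allows estimates from separate studies to be pooled with inverse-variance weights.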

In a similar meta-analysis, Zhao et al. (2013) concluded that shorter telomere length is associated with an increased risk of type 2 diabetes, although this meta-analysis did not distinguish between prospective and retrospective studies. In a more recently reported analysis (Willeit et al. 2014) based on the prospective Bruneck Study, TL was measured on three occasions, spanning 15 years, but only a small number of participants (44/606) developed type 2 diabetes. When correcting for regression dilution through analysis of the repeated TL measures, there was evidence of increased risk comparing the bottom to the top quarter of TL [RR 3.24 (1.29, 8.15)]. Upon meta-analysis with two other prospective studies, the estimated RR was 1.31 (1.07, 1.60), with moderate between-study heterogeneity (I² = 69 %).

Cancer

The relationship with cancer seems to be more complex. Wentzensen et al. (2011) and Ma et al. (2011) conducted similar meta-analyses of 25–29 epidemiological studies of TL and cancer risk published prior to 2010, including 13 different cancers and overall incident cancer. From a random effects meta-analysis of all studies, the RR of cancer for those with telomeres in the shortest quarter of TL compared with the longest was 1.96 (1.37, 2.81) (Wentzensen et al. 2011), but the between-study heterogeneity was substantial (I² = 94 %). As the authors note, this may be due, at least in part, to differences between specific cancers. Strong evidence of association was seen between shorter telomeres and increased risk of bladder and gastric cancer, but no evidence of an association was seen with various other cancers, including breast cancer. Indeed there is accumulating evidence that, in contrast to the common pattern, longer telomeres are associated with increased risk of certain cancers, including melanoma (Nan et al. 2011; Burke et al. 2013), soft tissue sarcoma (Xie et al. 2013), B cell lymphoma (Hosnijeh et al. 2014) and lung adenocarcinoma (Sanchez-Espiridion et al. 2014), suggesting perhaps that telomere maintenance inhibits apoptosis and increases the likelihood of malignancy in the development of these cancers. A number of recent studies have suggested there may be a non-monotonic relationship between TL and the risk of certain cancers, with higher risk at both extremes of TL (Qu et al. 2013; Skinner et al. 2012; Cui et al. 2012; Wang et al. 2014), but to date there has not been a sufficient number of large high-quality studies to confirm this pattern. Heterogeneity between studies may thus be due to combining related, but distinct, diseases that are influenced quite differently by TL.
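To make the heterogeneity statistic concrete, the following is a minimal sketch of a random effects meta-analysis with I², using the DerSimonian-Laird estimator of between-study variance. The choice of estimator is an assumption (the text does not state which was used in the cited meta-analyses), and the input values are invented purely for illustration.

```python
import math


def random_effects_meta(log_rrs, ses):
    """DerSimonian-Laird random effects pooling of log relative risks.

    Returns the pooled log RR, its standard error, the between-study
    variance tau2 and the heterogeneity statistic I2 (as a proportion).
    """
    w = [1.0 / se ** 2 for se in ses]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))  # Cochran's Q
    df = len(log_rrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]   # random effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_rrs)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se_pooled, "tau2": tau2, "i2": i2}


# Invented example: two equally precise studies with very different estimates.
result = random_effects_meta([0.0, 1.0], [0.1, 0.1])
print(math.exp(result["pooled"]), result["i2"])
```

A value of I² close to 1 (100 %), as in this invented example and in the cancer meta-analysis above, indicates that almost all of the observed variation between study estimates reflects genuine differences rather than sampling error, so a single pooled RR should be interpreted cautiously.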

Retrospective and prospective studies

Heterogeneity may also arise through differences in study design and conduct. Even when there is emerging evidence of consistent association between TL and disease, the reasons for the association remain uncertain.

Some of the meta-analyses have allowed comparisons to be made between retrospective studies (where TL is measured after disease diagnosis) and prospective studies (where TL may be measured a considerable time before diagnosis). Where estimates from these types of study are reported separately (e.g. Haycock et al. 2014; Wentzensen et al. 2011), the effect sizes tend to be larger from the retrospective studies, suggesting that reverse causality or other aspects of residual confounding contribute to the estimates. Pooley et al. (2010) observed far smaller estimates of risk from prospective compared with retrospective studies of TL and the risk of breast and colorectal cancer. Weischer et al. (2013) analysed a prospective cohort of over 47,000 individuals from Denmark, 3142 of whom received a cancer diagnosis during follow-up. Although the unadjusted hazard ratio for cancer risk was 1.74 (1.58, 1.93) for shortest versus longest quartile of TL, of similar magnitude to the meta-analyses (Wentzensen et al. 2011; Ma et al. 2011), the association disappeared after adjusting for other risk factors, primarily age. Reverse causality could be due to the disease process itself (for example increased levels of oxidative stress among cases) or to treatment effects.

Even in prospective studies, confounding by joint risk factors remains a strong possibility. Although in some studies, adjustment is made for potential confounders, residual confounding cannot be ruled out, especially as several common risk factors for these diseases are also related to TL, as outlined earlier.

Genetic studies

To overcome concerns about both reverse causality and confounding, an increasing number of studies have examined the influence of telomere-related genes on disease risk, the rationale for which is considered more fully below. Polymorphisms in genes known to be involved in telomere maintenance have been investigated in genetic association studies of common disease. Most notably, polymorphisms in or around TERT (encoding telomerase reverse transcriptase) are strongly associated with various cancers, including breast, bladder and prostate cancers and melanoma (Rafnar et al. 2009; Bojesen et al. 2013).

However, the most strongly associated SNPs and even the direction of association differ between cancers. The effects may reflect differences in the direction of the associations with TL itself. For example, TERT SNP alleles associated with longer TL in Bojesen et al. (2013) were also associated with increased risk of melanoma (Barrett et al. 2015), mirroring the observed association between longer TL and melanoma risk. In contrast, the minor allele of SNP rs2736108, associated with longer telomeres, is associated with lower risks for oestrogen receptor (ER)-negative (p = 10⁻⁸) and BRCA1 mutation carrier (p = 10⁻⁵) breast cancers (Bojesen et al. 2013). For hormonal cancers, the functional variants (rs10069690 and rs2242652) with the biggest effects on risk have no direct effects on TL, and, conversely, variants (rs7705526 and rs2736108), which are clearly drivers of TL, have smaller, secondary effects on breast and ovarian cancers (Bojesen et al. 2013; Terry et al. 2012; Pellatt et al. 2013). Thus, the TERT gene clearly has multiple, pleiotropic effects; 5′ variants affect promoter activity and TL, whilst more 3′ variants, affecting RNA splicing and a TERT silencer element, have roles in hormonal cancer development but not via changes to TL.

Since the recent GWASs of telomere length, and in particular the meta-analysis by Codd et al. (2013), a more systematic approach has been possible, combining the effects of all SNPs known to be associated with TL into a single polygenic score. Codd et al. (2013) showed modest evidence for an effect of a polygenic score based on the 7 genome-wide significant telomere-associated SNPs identified in their meta-analysis on the risk of coronary artery disease (p = 0.014, from a study of over 22,000 cases and 64,000 controls). Using similar approaches, effects have also been observed for various cancers, including bladder cancer (Chang et al. 2012). In particular, Iles et al. (2014) showed very strong evidence for an effect of the polygenic score from the same 7 SNPs (Codd et al. 2013) on the risk of melanoma (p < 10⁻⁸). Individuals with a polygenic score in the highest quartile were at almost 30 % increased risk of melanoma compared with those with a score in the lowest quartile.
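A polygenic score of this kind is typically a weighted sum of allele counts, which can be sketched as below. The SNP identifiers, weights and genotypes are placeholders invented for illustration; they are not the variants or effect sizes reported by Codd et al. (2013), and the exact weighting scheme used in the cited studies is not specified here.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts across a set of SNPs.

    allele_counts: dict mapping SNP id -> number of effect alleles (0, 1 or 2)
    weights:       dict mapping SNP id -> per-allele effect on TL
    """
    missing = set(weights) - set(allele_counts)
    if missing:
        raise KeyError(f"genotypes missing for: {sorted(missing)}")
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


# Hypothetical per-allele effects on TL and one individual's genotypes.
example_weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
example_genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(polygenic_score(example_genotype, example_weights))
```

Scores computed in this way for cases and controls can then be compared, for example by quartile, as in the melanoma analysis described above.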

Study design

Although findings are sometimes inconsistent, there is clear and emerging evidence of association between TL and common disease, and of differential effects on different diseases. Many different study designs have been used, some measuring leukocyte TL directly and others considering TL-associated genotypes.

Studies measuring leukocyte telomeres directly exploit the strong correlation between TL in different tissues within an individual, although the evidence for this is not extensive and concerns remain about the stability of telomere measurements. TL varies considerably throughout a person’s lifetime, due to both ageing and specific environmental or host stimuli, and using current technology there is considerable measurement error. The timing and method of TL measurement are therefore both important factors. Telomeres measured after diagnosis may well be affected by treatment or by the disease process itself. Thus, prospective studies, where TL is measured well before disease onset, are necessary to ensure the association is not due to reverse causality.

Many disease-related factors, including diet, sun exposure and smoking, are themselves associated with TL. Thus, even in prospective studies, it is difficult to rule out confounding. In many of the prospective studies conducted to date, adjustment is made for known confounders, but residual confounding cannot be ruled out.

These considerations, along with the high heritability of TL and recent discovery of TL-related genetic variants, have led to an interest in studies where the relationship between genetic predictors of TL and disease risk is investigated. Clearly such studies avoid any possibility of reverse causality. Confounding is also unlikely, provided population stratification is properly accounted for. Is it therefore possible to invoke Mendelian randomisation principles (Katan 1986; Davey Smith and Ebrahim 2003) and infer that an association between the genetic risk factors and disease demonstrates a causal role for TL in disease risk?

One of the assumptions behind Mendelian randomisation is that the genetic risk variants do not have a direct effect on disease risk other than through the putative causal factor (TL). However, some observations cast doubt on this assumption. Firstly, as described above in relation to TERT, genes can clearly have pleiotropic effects. It is also possible that specific genetic variants influence not only TL but also other aspects of telomere biology or DNA repair, and their association with disease risk may not be due to TL per se. A second observation relates to the relationship between genetic predictors of TL and risk of melanoma (Iles et al. 2014). The association is much stronger than would be expected if the risk were mediated through TL alone, given that the genetic predictors only explain about 1 % of the variation in measured TL (Codd et al. 2013). This may well be partly explained by measurement error and intra-individual variability of TL, but it could also be due to pleiotropic effects of the variants.
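The logic of comparing the observed genetic association with what would be expected through TL alone can be sketched with the simplest Mendelian randomisation estimator, the Wald ratio. This is an illustration only: it is valid only if the variant affects disease solely through TL (the very assumption questioned above), the standard error is a first-order approximation that ignores uncertainty in the variant's effect on TL, and all numbers are invented.

```python
def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """Wald ratio estimate of the effect of an exposure on an outcome.

    beta_outcome:  per-allele effect of the variant on disease (log odds)
    se_outcome:    standard error of beta_outcome
    beta_exposure: per-allele effect of the variant on the exposure (TL)
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)  # ignores uncertainty in beta_exposure
    return estimate, se


# Invented values: a variant shifting TL by 0.5 units and log odds by 0.10.
estimate, se = wald_ratio(0.10, 0.02, 0.5)
print(estimate, se)
```

If the implied effect of TL on disease from such a ratio is implausibly large compared with estimates based on measured TL, that discrepancy points towards measurement error in TL, pleiotropy, or both.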

Conclusions

The observed associations between TL and the risk of many different diseases suggest a role that is fundamental to health at the level of the cell and the organism. However, despite these strong associations, it is still unclear whether TL is itself causal or is a biomarker of underlying disease-related mechanisms. Interventions aimed at increasing TL for the purpose of halting or reversing the ageing process or preventing disease are thus not supported by current evidence. Even as a biomarker of disease, TL does not currently have clinical utility in a population setting; associations with disease are complex, and TL alone is not sufficiently predictive of risk and is subject to considerable measurement error. As understanding develops and evidence accumulates, TL may in the future provide a useful biomarker of risk or of disease progression.

Ideally, large prospective studies would be conducted with longitudinal measures of TL, extensive genotyping and detailed measures of phenotype and exposure to further understanding of the relationship with health. Since the measurement of TL is unstable, a key requirement is that sample collection and processing should be as uniform as possible. Such prospective studies would allow investigation of the relationship between genetic predictors of TL and disease risk, while adjusting for measured TL.

Further insights are also needed into telomere biology. It seems likely that the sole focus on TL, rather than other features of telomere maintenance and stability, is too simplistic. Similarly, the assumption that leukocyte TL is reflected in disease-relevant tissue needs further investigation, and consideration should be given to measuring TL at more than one time point. The observation that longer telomeres are associated with greater risk of some cancers also complicates the prevailing view that longer telomeres are always advantageous.

Large-scale prospective studies of disease that consistently and reliably measure TL are rare and expensive to conduct. However, large-scale datasets specifically designed to study the relationship between germline variation and disease risk have been established for most common disorders. The majority of these will not be suitable for the measurement of TL, given the sensitivity of TL measurement to sample handling and the variability of TL over time. Thus, at least until more reliable cost-effective methods of TL measurement become available, one of the most promising approaches to understanding the relationship between telomere features and disease is to study the genetic factors that underlie them.