Key Summary Points

Why carry out this study?

CLL is the most common form of leukemia in Western countries and remains incurable.

There have been major advances in development of ‘small molecule’ targeted drugs; however, treatment failures and resistance to new targeted therapies are common.

Our hypothesis is that expression of potential target genes changes with progression of normal B-lymphocytes through pre-malignant MBL cells to malignant CLL.

What was learned from this study?

Levels of GRASP and AC016745.3 mRNAs were progressively lower and C11orf80, ROR1, METTL8, and LEF1 mRNAs progressively higher in B lymphocytes from F-MBL and F-CLL cases compared to F-Controls. PARP3 was increased in F-MBL compared to F-Controls but decreased in F-CLL compared to F-MBL.

The findings for F-CLL were the same in S-CLL, except for PARP3, which was higher in S-CLL.

Multiple CLL case families, though limited by small numbers of patients, can be studied to identify differentially abundant mRNAs in normal B lymphocytes, MBL and CLL cells and provide new molecular signatures for targeted therapies.

Introduction

Chronic lymphocytic leukemia (CLL) accounts for > 25% of all leukemia cases in Western countries [1], and a family history is found in up to 10% of cases [2]. Familial clustering of CLL (F-CLL) has been consistently demonstrated in epidemiological studies [3], and a higher frequency of monoclonal B-cell lymphocytosis (MBL), a precursor to CLL, is found in CLL families [4, 5].

To detect patterns of multiple low-risk loci, genome-wide association studies (GWAS) have analyzed large numbers of F-CLL and sporadic CLL (S-CLL) cases and controls using dense-coverage single nucleotide polymorphism (SNP) arrays [6,7,8]. Over 40 risk mutations have been identified to have a role in the etiology of CLL [9], and 30 of these account for ~ 19% of CLL heritability [8], suggesting that a significant proportion of genetic susceptibility has not been detected. Some of this “missing heritability” may be associated with non-DNA sequence-based inheritance factors that affect gene expression, including epigenetic variations, which have been found in several familial cancers [10, 11]. The simultaneous presence of F-CLL and familial MBL (F-MBL) in families provides an opportunity to study changes in mRNA levels associated with progression to CLL against similar genetic backgrounds.

DNA microarray studies have identified differential mRNA expression among normal B lymphocytes, MBL and CLL cells from unrelated individuals [12]. However, this is the first study of gene expression from normal B lymphocytes, F-MBL and F-CLL from within one family. We previously performed a genome-wide linkage scan of the family using high-density SNP markers; however, there was no significant evidence for a single gene model of disease susceptibility, suggesting that susceptibility to CLL has a more complex basis [13]. Although individual family studies are limited by low subject numbers, background genetic variation is reduced, increasing the detection of epigenetic and environmental modifiers associated with variation in gene expression and phenotype [14].

To identify differential mRNAs associated with B lymphocytes, F-MBL and F-CLL, blood samples were collected from members of one of the largest multiple-case CLL kindreds reported in the literature [13]. In the present study, DNA microarrays were used to compare mRNAs in enriched B lymphocytes to determine whether mRNA abundances of genes differed among B lymphocytes from control subjects, F-MBL, F-CLL and S-CLL cases.

Methods

Patients and Samples

The experimental protocol was approved by the Nepean and Blue Mountains Local Health District Human Research Ethics Committee (01/70). Peripheral blood samples (40 ml) were collected from six patients (2 with F-CLL and 4 with F-MBL) and three unaffected members (F-Controls) of a family with multiple cases of CLL (Fig. 1) [13]. In addition, samples were collected from six unrelated S-CLL cases and three non-kindred controls (NK-Controls). All CLL subjects were treatment naïve. The diagnosis of CLL was based on the presence of a clonal B lymphocyte count ≥ 5 × 10⁹/l for ≥ 3 months, co-expression of CD19, CD5 and CD23, and weak or no expression of CD20, CD79b, CD22 and surface IgM [15]. The diagnosis of F-MBL was based on the same immunophenotype but with clonal B cells < 5 × 10⁹/l. For comparisons of mRNA levels in F-MBL, F-CLL and S-CLL, we sought to reduce the effect of genetic relatedness by combining NK-Controls with F-Controls (Combined Controls).
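The count-based part of these diagnostic criteria can be illustrated with a minimal Python sketch. The function name and arguments are hypothetical and are not part of the study protocol; the sketch assumes that a clonal B-cell population with the CLL immunophenotype has already been confirmed by flow cytometry.

```python
def classify_clonal_b_cell_count(count_e9_per_l: float, months_persistent: float) -> str:
    """Classify a case from its clonal B-lymphocyte count (in units of 10^9 cells/l).

    Assumption: the CLL immunophenotype (CD19+, CD5+, CD23+) is already confirmed.
    """
    if count_e9_per_l < 0 or months_persistent < 0:
        raise ValueError("count and duration must be non-negative")
    if count_e9_per_l < 5.0:
        # Same immunophenotype, but clonal B cells below 5 x 10^9/l
        return "MBL"
    if months_persistent >= 3:
        # Count >= 5 x 10^9/l sustained for at least 3 months
        return "CLL"
    # Count is high enough but has not yet persisted for 3 months
    return "indeterminate"
```

For example, a count of 6.2 × 10⁹/l sustained for 4 months would be labeled CLL, whereas 1.5 × 10⁹/l would be labeled MBL.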

Fig. 1

Pedigree of the family. The pedigree in abbreviated form showing segregation of CLL. Blackened symbols denote family members affected with F-CLL; half-shaded symbols F-MBL; clear symbols unaffected; ticked symbols individuals studied from whom mRNA was collected; diamonds represent grouped siblings. The pedigree numbering system corresponds to the original report of this family [13], where each generation is identified by a Roman numeral and each child and cousin in the same generation is identified by an Arabic numeral. Ages (in years) at diagnosis of F-MBL or F-CLL are shown

B lymphocytes were enriched using a RosetteSep™ B Cell isolation cocktail (StemCell Technologies Inc., Vancouver, BC, Canada) to provide > 95% B lymphocyte purity confirmed by flow cytometry [16].

IgVH Usage and Mutation Analysis

Genomic DNA was extracted using the Wizard® Genomic DNA Purification kit (Promega, Madison, WI, USA) and quantified using a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Amplification by polymerase chain reaction (PCR) and sequence analysis of IgVH rearrangements were conducted according to BIOMED-2 protocols [17, 18], using IgVH gene clonality master mixes (InVivoScribe Technologies, San Diego, CA, USA). Purified PCR products were sequenced at the Australian Genome Research Facility, Brisbane, Australia. IgBLAST (GenBank) and the IMGT/V-QUEST portal for immunoglobulin and T cell receptor sequences (International ImMunoGeneTics Information System) were used to analyze and align IgVH sequences [19]. Sequences with germline homology ≥ 98% were considered unmutated and those with < 98% homology were considered mutated [19].
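The 98% germline homology cut-off can be expressed as a short Python sketch. The function name is hypothetical, and the percent identity is assumed to come from an alignment tool such as IgBLAST or IMGT/V-QUEST; it is not computed here.

```python
def ighv_mutation_status(germline_identity_percent: float) -> str:
    """Return the IgVH mutation status from percent identity to the germline sequence."""
    if not 0.0 <= germline_identity_percent <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    # >= 98% homology to germline is considered unmutated; below that, mutated
    return "unmutated" if germline_identity_percent >= 98.0 else "mutated"
```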

Interphase Fluorescence In Situ Hybridization (FISH)

FISH analyses for common abnormalities associated with CLL were performed in affected individuals using the following probes: DLEU/LAMP at 13q14, chromosome 12 centromere, ATM at 11q22 and TP53 at 17p13. Interphase FISH studies were performed based on techniques adapted from the Cytogenetics and the Molecular Genetics Laboratory, the Children’s Hospital at Westmead, Sydney, NSW, Australia. Two hundred images of interphase nuclei were captured for every probe set according to the manufacturer’s instructions. Results were abnormal when the percentage of cells with any given abnormality exceeded 5% in 200 interphase nuclei for trisomy 12 and 8% for deletions of 13q, 11q and 17p.
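As an illustration of the cut-offs described above, the following Python sketch flags a FISH result as abnormal. The dictionary keys and function name are hypothetical labels chosen for this example; only the 5% and 8% thresholds and the 200-nuclei denominator are taken from the text.

```python
# Cut-offs (percent of interphase nuclei) above which a result is called abnormal
FISH_CUTOFFS_PERCENT = {
    "trisomy_12": 5.0,
    "del_13q": 8.0,
    "del_11q": 8.0,
    "del_17p": 8.0,
}


def fish_is_abnormal(abnormality: str, abnormal_nuclei: int, nuclei_scored: int = 200) -> bool:
    """Return True when the percentage of abnormal nuclei exceeds the cut-off."""
    if nuclei_scored <= 0 or not 0 <= abnormal_nuclei <= nuclei_scored:
        raise ValueError("invalid nuclei counts")
    percent = 100.0 * abnormal_nuclei / nuclei_scored
    # "Exceeded" is read as strictly greater than the cut-off
    return percent > FISH_CUTOFFS_PERCENT[abnormality]
```

Under this reading, 10 of 200 nuclei with trisomy 12 (exactly 5%) would not be called abnormal, whereas 11 of 200 would.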

RNA Extraction

RNA was extracted using the Isolate II RNA mini kit (Bioline, Taunton, MA, USA). Samples were quantified and purity determined using a Nanodrop 2000 (Thermo Fisher Scientific, Waltham, MA, USA). RNA purity was assessed by measuring absorbances at 260 and 280 nm (A260 and A280, respectively). Samples with concentrations between 50 and 100 ng/µl and with A260/A280 > 1.8 were analyzed using Affymetrix gene expression microarrays (Affymetrix Inc, Santa Clara, CA, USA). An additional RNA quality assessment was performed using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) to determine the ratio of two ribosomal RNAs (rRNA; 28S/18S) and the RNA integrity number (RIN). Only RNA preparations with a 28S/18S rRNA ratio > 2 and RIN > 7 were used for microarray analyses.
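The acceptance criteria above can be summarized in a minimal Python sketch. The `RnaSample` class and `passes_rna_qc` function are hypothetical names introduced for illustration, and the concentration range is assumed to be inclusive at both ends, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    concentration_ng_per_ul: float
    a260: float
    a280: float
    rrna_28s_18s_ratio: float
    rin: float  # RNA integrity number


def passes_rna_qc(sample: RnaSample) -> bool:
    """Apply the concentration, purity and integrity criteria to one RNA preparation."""
    if sample.a280 <= 0:
        raise ValueError("A280 must be positive")
    purity_ratio = sample.a260 / sample.a280
    return (
        50.0 <= sample.concentration_ng_per_ul <= 100.0  # assumed inclusive range
        and purity_ratio > 1.8
        and sample.rrna_28s_18s_ratio > 2.0
        and sample.rin > 7.0
    )
```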

Transcriptome Profiling

RNA was prepared as described for the GeneChip® WT Pico Reagent Kit (Affymetrix Inc, Santa Clara, CA, USA) and analyzed using Affymetrix GeneChip® Human Transcriptome 2.0 Arrays. Gene expression intensity was calculated for each sample as Tukey's bi-weight average of the intensities of all eligible exons in that gene, expressed on a log2 scale. The quality of each Affymetrix Human Transcriptome Array was determined using Affymetrix spike-in controls, perfect match expression and relative log expression during data summarization and normalization in the Affymetrix Expression Console software, version 1.4.1. The Affymetrix Transcriptome Analysis Console (TAC 3.0) software was used to perform statistical analyses and generate a list of differentially expressed mRNAs. The fold change in expression between CLL and controls was calculated on the log2 scale as log2(CLL/control) = log2(CLL) − log2(control) and converted to a linear-scale fold-change value as 2^[log2(CLL/control)]. Quantitative reverse transcription PCR (qRT-PCR) was used to confirm GRASP mRNA levels. mRNA was converted to cDNA using a Tetro cDNA synthesis kit (Bioline, Taunton, MA, USA), and qRT-PCR was performed using a Rotor-Gene 2000 cycler (Corbett Life Science; Qiagen, Hilden, Germany) with validated primer pairs (Supplementary Material, Table S1) [20]. Gene expression of GRASP relative to GAPDH was calculated using the delta cycle threshold (delta Ct) method [21].
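The fold-change conversion and the delta Ct calculation can be sketched in Python as follows. The function names are hypothetical. The `relative_expression` helper uses the conventional 2^(−delta Ct) transformation, which is an assumption here because the text reports delta Ct values directly.

```python
def linear_fold_change(log2_case: float, log2_control: float) -> float:
    """Convert a difference of log2 signals into a linear-scale fold change."""
    log2_ratio = log2_case - log2_control  # log2(case/control)
    return 2.0 ** log2_ratio


def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Delta Ct of a target gene (e.g. GRASP) relative to a reference gene (e.g. GAPDH)."""
    return ct_target - ct_reference


def relative_expression(delta_ct_value: float) -> float:
    """Assumed conventional transformation: a higher delta Ct means lower abundance."""
    return 2.0 ** (-delta_ct_value)
```

For instance, a log2 signal of 10 in cases against 8 in controls gives a linear fold change of 4, and a target Ct of 28 against a reference Ct of 18 gives a delta Ct of 10.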

Electronic-Database Information

URLs for programs and data presented herein are as follows: US National Library of Medicine, National Center for Biotechnology Information (accessed 31 March 2017) available from https://www.ncbi.nlm.nih.gov/igblast; the International ImMunoGeneTics Information System (accessed 31 March 2017) available from http://www.imgt.org/IMGT_vquest/vquest; The R Project for Statistical Computing (accessed 15 October 2017) available at http://www.R-project.org; National Genetics Reference Laboratory, Manchester, UK (accessed 1 December 2017) available from http://www.ngrl.org.uk/Manchester/projects/snpcheck.html.

Statistical and Bioinformatic Analyses

Identification of differentially abundant mRNAs was performed using one-way analysis of variance (ANOVA) tests, and to correct for multiple comparisons, false discovery rate (FDR) P-values were calculated [22,23,24]. Hierarchical clustering was performed using Affymetrix transcriptome analysis console version 3.0 software (Affymetrix Inc, Santa Clara, CA, USA). Distances between clusters were computed using the complete linkage method (maximum distance between a pair of objects in the two clusters), and results are displayed in a heat map and dendrogram. To determine mRNAs with FDR < 0.05 that differed among Combined controls, F-MBL, F-CLL and S-CLL, one-way ANOVA with Tukey’s post hoc tests were performed using GraphPad Prism version 7.00 for Windows (GraphPad Software, La Jolla, CA, USA).
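To illustrate how raw ANOVA P-values are corrected for multiple comparisons, the sketch below implements the Benjamini-Hochberg step-up procedure in plain Python. It is an assumption that this FDR procedure matches the one applied by the analysis software; the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the original input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonic adjusted values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An mRNA would then be retained when its adjusted value falls below the chosen threshold (0.05 in this study).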

Results

Clinical and Laboratory Attributes of Patients

The attributes of two F-CLL, four F-MBL and six S-CLL, including Binet stage [25], are shown in Table 1, and an abbreviated family pedigree is shown in Fig. 1. There were no significant differences in mean ages among F-Controls (n = 3, mean 48 years; SD 6 years), F-MBL (n = 4, mean 62 years; SD 10 years) and F-CLL (n = 2, mean 54 years; SD 0 years), and no difference in mean ages among F-MBL, F-CLL and S-CLL (n = 6, mean 73, SD 12, one-way ANOVA with Tukey’s post hoc test); however, there was a significant difference in age between combined F-Controls and NK-Controls and S-CLL (mean age 49 vs 73; P = 0.005). To reduce the effect of genetic relatedness, NK-Controls were added to F-Controls for analyses of non-kindred S-CLL, F-MBL and F-CLL.

Table 1 Clinical and B-cell phenotype of subjects with F-MBL, F-CLL and S-CLL

Comparison of mRNA Levels in F-Controls, F-MBL and F-CLL

RNA extracts of enriched B lymphocytes were prepared and analyzed to identify differences in abundance of mRNAs. Flow cytometry showed no difference in the purity of CD20+, CD5+ cells between F-MBL and F-CLL cases (mean purity 83% versus 94%, respectively; P > 0.05, Student's t-test), and there was no difference in CD5 mRNA expression between F-MBL and F-CLL cases (mean log2 bi-weight average signal 8.7 versus 9.5, respectively; P > 0.05, Student's t-test). The levels of 2095 mRNAs (1794 coding, 301 non-coding) differed among F-Controls, F-MBL and F-CLL (ANOVA P < 0.01) (Fig. 2 and Supplementary Material, Table S2). After correcting for multiple comparisons (FDR P-value < 0.05), seven mRNAs were identified that segregated F-Controls from F-MBL and F-CLL (Table 2; Fig. 3). Compared to F-Control B lymphocytes, levels of GRASP mRNA and the novel transcript AC016745.3 were decreased in F-MBL and further decreased in F-CLL (Fig. 4a, b). C11orf80 and METTL8 levels were higher in F-MBL and further increased in F-CLL (Fig. 4c, d). The mean mRNA level for PARP3 was increased in F-MBL compared to F-Controls but was less elevated in F-CLL (Fig. 4e). Compared to F-Controls, ROR1 and LEF1 mRNA levels were increased in both F-MBL and F-CLL (Fig. 4f, g); however, there were no differences between F-MBL and F-CLL.

Fig. 2

Hierarchical clustering of B lymphocyte mRNA abundance in F-Controls, F-MBL and F-CLL. Data are displayed as a heat map where rows represent mRNAs and columns represent samples from patients. Colored pixels indicate the magnitude of the response for each gene, where shades of red and blue represent induction and repression, respectively, relative to the median for all genes. Differential expression analysis identified 2095 differentially expressed mRNAs (1794 coding, 301 non-coding) among F-Controls, F-MBL and F-CLL (ANOVA P < 0.01; not corrected for multiple comparisons). The range of differential expression (log2) was 1.58 suppression to 19.27 increased expression. The cluster dendrograms at the right segregate F-Controls, F-MBL and F-CLL

Table 2 mRNAs differentially abundant in F-Controls, F-MBL and F-CLL
Fig. 3

Hierarchical clustering of B lymphocyte mRNA levels in F-Controls, F-MBL and F-CLL cases. Array elements that significantly varied between groups (FDR < 0.05) were included (7 mRNAs). The range of differential expression (log2) was 4.89 suppression (LEF1) to 11.87 increased expression (C11orf80). The cluster dendrograms at right segregate F-Controls, F-MBL and F-CLL

Fig. 4

Comparison of seven mRNAs differentially abundant among F-Controls, F-MBL and F-CLL cases. a-g Bi-weight average signal (log2) intensity for each of the seven mRNAs found to be differentially abundant among F-Controls, F-MBL and F-CLL. Significances were determined using one-way ANOVA with Tukey’s post hoc test. P ≤ 0.05 values are summarized with 1 asterisk, P ≤ 0.01 with 2 asterisks, P ≤ 0.001 with 3 asterisks and P ≤ 0.0001 with 4 asterisks. h Real-time reverse transcription-PCR (qRT-PCR) validation of microarray results for GRASP. qRT-PCR was performed in F-Controls (n = 3), F-MBL (n = 4) and F-CLL (n = 1). Changes in expression were determined relative to GAPDH (delta Ct)

qRT-PCR was used to measure mRNA levels for GRASP in F-Controls (n = 3), F-MBL (n = 4) and F-CLL (n = 1), and changes in expression were determined relative to GAPDH using the delta Ct method. The delta Ct for GRASP was highest (mRNA less abundant) in F-CLL, intermediate in F-MBL and lowest in normal F-Controls (Fig. 4h), consistent with the microarray data.

Comparison of mRNA Levels in Related and NK-Controls, F-MBL, F-CLL and S-CLL

S-CLL cases were analyzed to determine whether mRNA abundances of genes differed from those in familial cases and whether the same mRNAs that showed changes in abundance among F-Controls, F-MBL and F-CLL were also differentially abundant among combined familial and NK-Controls (Combined Controls), F-MBL, F-CLL and S-CLL (Fig. 5). For six of the genes, there were no differences in mRNA levels between S-CLL cases and F-CLL (Supplementary Material, Table S3). However, there was a difference in PARP3 levels between S-CLL cases and F-CLL. These results were the same when F-Controls were removed from the Combined Controls group (Supplementary Material, Fig. S1).

Fig. 5

Comparison of seven mRNAs among combined F-Controls and NK-Controls, F-MBL, F-CLL and S-CLL cases. a-g Bi-weight average signal (log2) intensity for each of the seven mRNAs among combined familial and NK-Controls (Combined Controls), F-MBL and F-CLL compared to S-CLL. Significances were determined using one-way ANOVA with Tukey’s post hoc test. P ≤ 0.05 values are summarized with 1 asterisk, P ≤ 0.01 with 2 asterisks, P ≤ 0.001 with 3 asterisks, and P ≤ 0.0001 with 4 asterisks. ns: P > 0.05

The abundances of mRNAs for LEF1, GRASP, ROR1 and METTL8 were different between S-CLL and F-MBL; however, there were no differences among C11orf80, PARP3 and AC016745.3.

Discussion

In this article we report that mRNA levels of GRASP and AC016745.3 were lower and of C11orf80, PARP3, ROR1, METTL8 and LEF1 were higher in enriched B lymphocytes from F-MBL and F-CLL cases compared to F-Control subjects. Furthermore, there were no differences in mRNA levels of GRASP, AC016745.3, C11orf80, ROR1, METTL8 and LEF1 between F-CLL and S-CLL. PARP3 was differentially abundant but increased in F-CLL and S-CLL compared to F-Controls and combined F- and NK-Controls. Previous studies have found changes in mRNA levels in both sporadic MBL and early-stage S-CLL cases compared to normal B lymphocytes [26], including a prognostic seven-gene signature (FMOD, PIK3C2B, LEF1, CKAP4, PFTK1, BCL-2 and GPM6a) [12]. Furthermore, mRNA levels of genes involved in MAPKinase, protein kinase A and proliferation pathways have been found to differentiate normal B lymphocytes from sporadic MBL and S-CLL cases [27].

More than 40 mutations have been associated with an inherited risk of CLL [9, 28]. Significantly, susceptibility alleles and haplotypes are enriched in regulatory elements including B-cell transcription factor binding sites, and it is likely that a proportion of the genetic susceptibility to CLL results from mutations that affect gene regulation [28]. Furthermore, non-DNA sequence-based inheritance factors, including epigenetic variations, that regulate gene expression have been described for hereditary cancers [10, 11]. In the present study, the simultaneous presence of F-MBL and F-CLL in a single family provided an opportunity to study changes in mRNA associated with progression to F-CLL against similar genetic backgrounds. In this family, the mRNA levels of GRASP and AC016745.3 were decreased in F-MBL (2.6- and 2.1-fold, respectively) compared to F-Controls and further decreased in F-CLL (21.1- and 2.6-fold, respectively), whereas C11orf80 and METTL8 mRNA levels were increased in F-MBL (2.8- and 5.7-fold, respectively) and further increased in F-CLL (3.7- and 9.2-fold, respectively). The mRNA levels of ROR1 and LEF1 were also higher in F-CLL compared to F-Controls (27.9- and 73.5-fold, respectively); however, there were no differences between F-CLL and F-MBL, and for PARP3, levels were higher in F-MBL (1.3-fold) but less so for F-CLL (1.1-fold) compared to F-Controls.

The incidence of CLL increases with age; however, familial cases are more likely to be younger (≤ 55 years) than sporadic cases [29], and consequently age-matching of cases and F-Controls was difficult for this single family-based study. The younger F-Controls may develop F-CLL in the future, which would be expected to reduce differences in expression among the seven mRNAs that were differentially abundant between F-Controls and F-MBL or F-CLL cases.

Of the seven differentially expressed genes identified, LEF1 and ROR1 have previously been associated with either the development of CLL or progression of MBL to CLL [30,31,32,33,34,35,36]. The transcription factor LEF1 is involved in the development of B lymphocytes and is highly expressed in mouse pro-B and pre-B lymphocytes but downregulated in mature B cells [37]. LEF1 functions in the Wnt/β-catenin signaling pathway, recruiting β-catenin to activate transcription of several target genes in response to constitutive Wnt pathway activation, which regulates B lymphocyte proliferation and survival [31]. CLL cells aberrantly express LEF1 compared to normal B lymphocytes, and LEF1 knockdown or LEF1 inhibition by a small molecule decreases CLL B-cell survival [31, 38].

ROR1 signaling is involved in cell proliferation and differentiation, and over-expression of ROR1 on the surface of B-CLL has been documented in several studies [33, 39]. ROR1 acts as a receptor for Wnt5 signaling, which increases CLL cell survival, proliferation and migration [40]. These effects are blocked by cirmtuzumab, a humanized anti-ROR1 monoclonal antibody [40]. siRNA silencing of ROR1 in CLL cells induces apoptosis of B-CLL cells but not control B cells [41]. Consequently, ROR1 has been considered as a target for new CLL therapies [42].

This study identified five novel associations, of which PARP3, GRASP, METTL8 and C11orf80 may be plausible candidate genes associated with the neoplastic transformation of B lymphocytes. PARP3 facilitates the formation and maintenance of the mitotic spindle and genome integrity [43] and is a potential target for cancer therapy [44]. The GRASP gene encodes the general receptor for phosphoinositide 1-associated scaffold protein, which promotes ADP ribosylation factor-to-Rac signaling and cell migration [45, 46]. Consistent with our findings, GRASP has been found to be downregulated in CLL compared to control B lymphocytes [47]. METTL8 encodes a methyltransferase associated with CLL [48] and may be responsible for epigenetic effects in CLL. C11orf80 encodes a component of a topoisomerase 6 complex specifically required for meiotic recombination and may be a potential target for treatment if overexpressed in CLL cells.

Although CLL and F-MBL samples were not 100% pure and contained contaminating CD20+, CD5− B lymphocytes, comparisons of mRNA expression are unlikely to have been affected: flow cytometry showed no difference in the purity of CD20+, CD5+ cells between F-MBL and F-CLL cases (mean purity 83% versus 94%, respectively; ns, Student's t-test), and CD5 mRNA expression did not differ between F-MBL and F-CLL cases (mean log2 bi-weight average signal 8.7 versus 9.5, respectively; ns, Student's t-test). Furthermore, the possibility of activating downstream pathways was reduced by using a negative selection method to purify CLL and F-MBL cells rather than positively sorting CD5+ cells, which induces protein kinase C signaling [49].

Conclusions

In conclusion, although studies of single families are limited by small numbers, identification of differentially abundant mRNAs in normal B lymphocytes, F-MBL and CLL cells has provided new molecular signatures for targeted therapies. Significantly, the similarities between F-CLL and S-CLL in this study and previous studies [50] indicate that findings from familial studies may translate to sporadic cases.

This study was limited by the small sample sizes, especially for F-CLL, and inability to standardize the collection times of samples after diagnosis. In addition, all F-MBL samples were IgVH mutated, which may alter expression of downstream effectors.