Introduction

Inflammation is a key pathway in many diseases including cancer1, a leading cause of death worldwide2. More specifically, inflammation is one of the enabling hallmarks of cancer3 and is thought to influence several phases of cancer development, including initiation and progression4, through both intrinsic (genetic events) and extrinsic (soluble mediators) pathways5. Epidemiological, pharmacological and genetic research provides solid evidence that inflammation can increase cancer risk and promote tumour progression6,7. Epidemiological studies have already examined the link between inflammation and the risk of several cancers, including bladder, oesophageal, ovarian, prostate and thyroid cancer6, as well as colorectal cancer8 and others7.

While breast cancer (BC) is the most commonly diagnosed cancer among women worldwide, B-cell lymphomas (BCLs) are the most common hematopoietic cancers in both men and women in the developed world. Few epidemiological studies have investigated the association between blood levels of cytokines9,10, C-reactive protein (CRP)11,12,13 and/or immune-related factors9 and risk of BC in prospective settings, and their results were inconsistent9,10,11,12,13,14,15. Several prospective studies have also examined the association between B-cell Non-Hodgkin Lymphoma (B-cell NHL) and circulating levels of inflammatory markers, providing strong evidence of a subtle role of inflammation in B-cell NHL development and progression16,17,18,19,20,21,22,23,24,25,26. These studies have suggested an increased risk of B-cell NHL and/or its subtypes with increased blood levels of interleukins17,18,19,20,23,24,25, chemokines16,25, immune activation markers16,17,22 and growth factors19, while other studies reported a negative association of interleukins with risk of B-cell NHL18,23.

The prevalence of these two cancers and the conflicting results regarding the role of inflammation make BC and B-cell NHL particularly relevant for exploring the role of inflammation in cancer initiation.

Previous studies have generally focused on the concentrations of individual biomarkers and, since these biomarkers are systemic, associations between individual biomarkers and cancer may not always be apparent. The use of lower-resolution summary scores may improve statistical power and reveal associations with cancer risk27,28. In the present work, we investigate inflammatory scores to ascertain their link with cancer onset. We used data from the EnviroGenoMarkers (EGM) project, designed as two nested case-control studies including breast cancer (BC) and B-cell Non-Hodgkin Lymphoma (B-cell NHL, including multiple myeloma) cases within two cohorts. Specifically, we constructed our inflammatory score using a large panel of cytokines, chemokines and growth factors measured in prospectively collected peripheral blood samples from 268 participants in the Italian component of the European Prospective Investigation into Cancer and Nutrition (EPIC-Italy) and 492 participants in the Northern Sweden Health and Disease Study (NSHDS). Making the most of the EGM dataset, we examined participants’ inflammatory status through the inflammatory score (and a corresponding unsupervised alternative) in relation to cancer onset as a common mechanism in disease initiation and progression. These associations were investigated for each type of cancer (BC and B-cell NHL separately), and after stratification by time-to-diagnosis (TtD).

Results

Study population

Key characteristics of the study population are reported in Table 1. Participants were predominantly from Sweden (60.6%), with a mean age of 52.8 years (SD = 7.7). Participants from EPIC-Italy, mostly women (74.3%), tended to report a lower education level, more physical activity and higher alcohol consumption than participants from NSHDS. Our study population includes 90 breast cancer (54.4% from EPIC-Italy) and 248 B-cell NHL (33.8% from EPIC-Italy) prospective cases, which were diagnosed from 2 to 15.5 years after inclusion.

Table 1 Comparison of baseline characteristics of EPIC-Italy and NSHDS.

Pre-diagnostic inflammatory score and breast cancer

The breast cancer population includes a total of 167 women (56.3% from EPIC-Italy) with a slightly higher proportion of cases in both cohorts after excluding individuals with missing data (Supplementary Table S1). The median time to diagnosis was 5.84 years and showed limited heterogeneity between cohorts (5.35 years in EPIC-Italy and 6.43 years in NSHDS). BC cases from both cohorts were more often inactive than controls (p = 0.05 in EPIC-Italy and NSHDS). In NSHDS, BC cases had lower BMI and experienced fewer pregnancies, whereas in EPIC-Italy no significant differences between cases and controls were observed for any of the other baseline variables investigated (Supplementary Table S1). Women with BC from EPIC-Italy differed from those from NSHDS in age at menarche, parity, use of hormone replacement therapy and menopausal status (Supplementary Table S2).

Our analyses based on the full BC population suggested a lower inflammatory score in BC cases compared to controls (model 1: β = −0.96, p = 0.333, Table 2A), but that difference did not reach statistical significance. Consistent estimates were obtained, and similar conclusions can be drawn, from the analyses stratified by cohort (model 1: β = −1.12, p = 0.436 and β = −0.64, p = 0.640 for EPIC-Italy and NSHDS, respectively, Table 2A). After adjusting for behavioural, socioeconomic or hormonal factors, and in the fully adjusted models, effect size estimates showed consistent signs and associations did not reach statistical significance (p > 0.09 across all models investigated, Table 2A, Supplementary Fig. S1). Stratification by time to diagnosis revealed a statistically significantly lower inflammatory score in BC cases diagnosed less than 6 years after enrolment (β = −2.88, p = 0.032, Table 2A). Results for each cohort separately showed a consistent direction of association, but the lower inflammatory score in BC cases diagnosed less than 6 years after enrolment was borderline statistically significant in EPIC-Italy (β = −3.52, p = 0.057) and non-significant in NSHDS (β = −3.58, p = 0.196, Table 2A, Supplementary Fig. S1). No associations were observed in the pooled population or by cohort for BC cases diagnosed more than 6 years after enrolment (β = −0.06, p = 0.970, Table 2A).

Table 2 Association of pre-diagnostic inflammatory score (A) and the first PC (B) with breast cancer case/control.

Analyses using the first principal component (PC1, explaining 32.6% of the variance) as an alternative inflammatory score showed that our conclusions were robust to our definition of the inflammatory score and we identified the same significant association indicating a lower inflammatory score in the BC cases diagnosed less than 6 years after enrolment (β = −1.55, p = 0.029, Table 2B, Supplementary Fig. S2).

Pre-diagnostic inflammatory score and B-cell NHL

Key characteristics of the B-cell NHL study population are given in Supplementary Table S3. Of the 248 B-cell malignancy cases diagnosed during follow-up, 84 arose from the Italian population and 164 from Sweden. The median time to diagnosis was 6.09 years (5.53 in EPIC-Italy and 6.38 in NSHDS). Cases and controls were on average 53.1 years old (Supplementary Table S3).

Our analyses of all B-cell NHL cases (i.e. pooling all histological subtypes) indicated a lower inflammatory score compared to controls (model 1: β = −1.28, p = 0.012, Table 3A). Consistent effect size estimates were observed in both cohorts separately, but the lower inflammatory score in B-cell NHL cases was statistically significant only in the Swedish population (model 1: β = −0.89, p = 0.310 and β = −1.48, p = 0.019 for EPIC-Italy and NSHDS, respectively, Table 3A). Adjustment for socioeconomic variables, and for behavioural factors, slightly weakened these associations, but in the fully adjusted model the lower inflammatory score in B-cell NHL cases was still significant in the pooled population (Table 3A, Supplementary Fig. S3).

Table 3 Association of pre-diagnostic inflammatory score (A) and the first PC (B) with B-cell non-Hodgkin lymphoma case/control.

After stratification by time to diagnosis, we observed a borderline significant lower inflammatory score in cases diagnosed less than 6 years after inclusion in the pooled B-cell NHL population (β = −1.27, p = 0.051, Table 3A). A consistent estimate of the effect size was observed in cases diagnosed more than 6 years after enrolment, without the difference reaching statistical significance (β = −1.06, p = 0.108). B-cell NHL cases diagnosed more than 6 years after enrolment showed a significantly lower inflammatory score only in the NSHDS samples. As before, results using PC1 as an alternative inflammatory score were highly consistent (Table 3B, Supplementary Fig. S4).

Discussion

Our analyses identified a generally lower burden of inflammation in prospective BC and B-cell NHL cases. Although the difference was non-significant, BC cases had a lower inflammatory score compared to controls in the pooled population and in both cohorts. This difference was mainly driven by BC cases diagnosed less than 6 years after enrolment, for whom the inflammatory score was significantly lower than in controls. No significant differences were observed after stratification by BC histological subtype, in either oestrogen receptor positive or negative BC (Supplementary Table S4).

B-cell NHL cases also had a significantly lower inflammatory score compared to controls in the pooled population and in NSHDS. Additional analyses showed a lower inflammatory score in cases of all histological subtypes, which was statistically significant in the largest subtype group, multiple myeloma (27.4% of B-cell NHL cases, Supplementary Table S5). Our results were robust to the definition of the inflammatory score and were not affected by adjusting for the main potential confounders.

Several limitations of this study should be considered. Our study population remains limited in size, especially for the analyses stratified by cohort, histological subtype and time to diagnosis. Our results showed overall similar trends but reduced strength of associations, which might indicate limited statistical power. The two cohorts we used may be heterogeneous in terms of population genetics, exposure profiles, lifestyle factors and study design-related sources of variability, which may explain some of the differences in results between the two cohorts. Only BC and B-cell NHL cases were included in the EGM project, limiting our ability to investigate the role of inflammation in other cancer types. Additionally, a minimum of two years elapsed between enrolment and diagnosis. Since cancer development is likely to have a long latency period, extending over many years, we cannot rule out the presence of abnormal cells, and cancer cases may already have been in a preclinical state at enrolment. Furthermore, we measured the inflammatory proteins at a single time point, which may not reflect the long-term inflammatory status of an individual and does not allow us to consider the speed of cancer progression in relation to longitudinal inflammatory burden. Finally, we cannot rule out the possibility that other factors contribute to the association between inflammation and cancer risk; in particular, variables relating to the inflammatory system, such as infection status and/or medication, could not be accounted for.

Our study has a number of strengths. First, the prospective design limits both recall and reverse causation bias which can be induced by the disease itself, treatments or lifestyle changes after diagnosis. We also had access to a wide range of information about lifestyle factors and hormonal status which allowed us to adjust for important potential confounders. Additionally, the availability of two cohorts allowed for independent confirmation of the observed signals and comparison of the inflammatory score and cancer association results between two countries. Moreover, compared to most previous prospective studies, we measured and combined into a score a large panel of inflammatory proteins to capture the inflammatory load. We know of two studies which have examined several markers of inflammation both individually and jointly as an inflammatory score and have reported that the use of a score is a robust method to explore the association between inflammatory load and future risk of cancer27,29. Allin et al. reported a positive association between an inflammatory score derived from three inflammatory biomarkers (CRP, fibrinogen and whole blood leukocyte count) and future risk of colorectal, lung and BC during a median follow-up period of 4.8 years in a Danish general population27. To test the robustness of the definition of the inflammatory score, we defined, as an unsupervised alternative, a score using the first principal component obtained from the 28 inflammatory related proteins. Our results and conclusions remained stable with either score.

As a general pattern, behavioural factors and education did not seem to explain a substantial part of case-control differences in the inflammatory score. An unhealthy lifestyle, including smoking30, heavy drinking31, physical inactivity32 and high body mass index, is related to higher chronic inflammation and adverse health outcomes33. In our study, after controlling for behavioural factors and education, no significant changes were observed in the associations, indicating that these variables do not act as confounders in the observed association between inflammation and cancer and suggesting that inflammation may be causally linked to cancer risk.

Our study, however, remains restricted to a limited set of potential confounders and mediators, which hampers the generalisability of our conclusions.

A vast majority of the studies looking at the association between inflammation and BC risk used CRP, and a recent meta-analysis concluded that there is a modest, statistically significant positive association between CRP concentration and breast cancer risk11,34. Our results may appear contradictory. This may be due to our approach, which summarises the association of 28 inflammation-related proteins with disease risk using a score to better capture the inflammatory status and improve statistical power, but it may also be attributable to the timing of blood sample collection relative to cancer diagnosis, or to other factors or confounders. Several epidemiological studies have explored the link between circulating markers of adiposity and inflammation in relation to BC risk, with the underlying idea that obesity-associated inflammation may increase the risk of cancer9,34,35,36. The release of inflammatory mediators may then take place in the adipocytes and might not be apparent at the same level in the circulatory system36,37. The time from blood collection to cancer diagnosis varies between studies and may influence results. In our results, a lower inflammatory score in BC cases compared to controls was only observed in those diagnosed less than 6 years after enrolment. Early BC cases may present inflammatory differences compared to those developing BC later, or may already be in a pre-disease stage characterised by a lower inflammatory score, suggesting that inflammatory status differs according to the time course of BC development.

Numerous studies have investigated the association between pre-diagnostic inflammation and NHL using cytokines, chemokines and other immune markers independently. These studies showed that deregulation of these biomarkers was associated with future risk of NHL, suggesting subclinical dysfunction as a consistent risk factor16,17,18,19,20,21,24,38. In a previous study in a subset of the EPIC-Italy population, the authors reported an association between lower levels of IL-2 and IFN-γ, higher levels of ICAM, and risk of NHL18. Another study showed that lower levels of leptin were associated with an increased risk of NHL and that higher levels of IL-10 increased the risk of B-NHL; it also found no association between a composite score from a principal component and future risk of NHL20. A case-control study in the PLCO trial17 reported an increased risk of NHL with elevated serum levels of TNF-R1 and sCD27. Our results demonstrate that future cases of NHL had a lower inflammatory score compared to controls. This provides further support for a subtle deregulation of the immune response, or a failure to modulate the immune response appropriately, in relation to NHL risk. It is also established that NHL is one of the most common malignancies occurring after the use of immunosuppressive therapy for transplantation39,40.

Inflammation has a multifaceted role in cancer, including initiation, promotion and progression. Deregulation of biological processes can lead towards chronic inflammation or play immunosuppressive roles. Our results from a case-control study nested within two European cohorts and focusing on two specific cancers (BC and B-cell NHL) suggest an association between the presence of cancer and a lower inflammatory score, and support the complex role of chronic inflammation in cancer development. The use of PC1 aimed to measure the inflammation variability captured by the 28 proteins. These findings call for further studies, involving larger populations, a larger variety of cancer types and repeated measures of larger panels of inflammatory markers, to better understand how inflammatory changes evolve.

Methods

Study population

The Envirogenomarkers study (EGM, www.envirogenomarkers.net) was designed as two nested case-control studies and includes participants from the Italian component of the European Prospective Investigation into Cancer and Nutrition41 (EPIC-Italy) and the Northern Sweden Health and Disease Study42 (NSHDS), which have been described previously43,44. The Envirogenomarkers project and its associated studies, experimental protocols and methods were approved by the Regional Ethical Review Board of the Umeå Division of Medical Research for the Swedish cohort, and by the Florence Health Unit Local Ethical Committee for the Italian cohort. All methods were carried out in accordance with the approved guidelines.

The aim of the EGM research project was to identify novel biomarkers associated with chronic diseases in which the environment might play an important role. The EGM project therefore focused on breast cancer and B-cell NHL. Data on exposure to environmental pollutants, including polychlorinated biphenyls, polycyclic aromatic hydrocarbons, cadmium and lead, were also collected.

EPIC-Italy includes 47,749 healthy participants aged 35–70 years from 5 different areas: Turin, Varese, Florence, Naples, and Ragusa. Anthropometric measurements, lifestyle factors and blood samples were collected at recruitment (1993–1998). Standardized procedures were used to identify newly diagnosed cases of cancer based on automated linkages to cancer and mortality registries, municipal population offices and hospital discharge systems. In Naples, follow-up information was collected through periodic personal contact. All participants signed a written informed consent, and the study was approved by the ethical review boards of the International Agency for Research on Cancer and of the collaborating institutions responsible for subject recruitment in each of the EPIC recruitment centers.

NSHDS is a prospective cohort study that included 95,000 participants from the general population from 1990 up to January 2008. It includes three sub-cohorts, of which the largest is the Västerbotten Intervention Programme (VIP), which mainly recruits individuals aged 40, 50 or 60 years. At initial recruitment, subjects were asked to complete a self-administered questionnaire to collect demographic, medical and lifestyle information, and a separate self-administered food frequency questionnaire. Informed consent was obtained from all participants, and a medical examination was conducted during which a blood sample was taken. Incident cancers occurring during follow-up were identified by linkage with the Swedish Cancer Registry and the local Northern Sweden Cancer Registry. The NSHDS was approved by the Local Ethical Committee in Umeå, Sweden. The sample selection strategy is described in Supplementary Fig. S5.

Outcome

Our study population includes breast cancer (BC) and B-cell non-Hodgkin lymphoma (B-cell NHL) cases who were healthy at blood collection and were clinically diagnosed from 2 (minimum duration chosen to avoid the inclusion of prevalent cancer cases at enrolment) to 15.5 years after inclusion. Only invasive breast cancer cases were included in the study. All eligible B-cell NHL cases, including multiple myeloma, were included. For each case, one suitable control was selected among participants in the same cohort who were alive and free of cancer, matched by center, gender, date of blood collection (+/−6 months), and age at recruitment (+/−2.5 years). We considered all B-cell NHL subtypes together, and in the text B-cell NHL includes multiple myeloma.
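The matching criteria above can be written as a simple eligibility check. The sketch below is only an illustration and not the procedure used in the EGM project: the field names (`cohort`, `center`, `gender`, `blood_date`, `age`, `alive`, `has_cancer`) are hypothetical, and the "+/−6 months" window is approximated here as 183 days, which is an assumption.

```python
from datetime import date

# Assumed tolerances: "+/-6 months" is approximated as 183 days.
MAX_DATE_GAP_DAYS = 183
MAX_AGE_GAP_YEARS = 2.5


def is_eligible_control(case, candidate):
    """Return True if `candidate` meets the stated matching criteria for `case`.

    Both arguments are dictionaries with hypothetical keys; the study's
    actual data structures are not described in the text.
    """
    # Controls must be alive and free of cancer.
    if candidate["has_cancer"] or not candidate["alive"]:
        return False
    same_stratum = (
        candidate["cohort"] == case["cohort"]
        and candidate["center"] == case["center"]
        and candidate["gender"] == case["gender"]
    )
    date_gap = abs((candidate["blood_date"] - case["blood_date"]).days)
    age_gap = abs(candidate["age"] - case["age"])
    return (
        same_stratum
        and date_gap <= MAX_DATE_GAP_DAYS
        and age_gap <= MAX_AGE_GAP_YEARS
    )
```

In a one-to-one design such as the one described, a single control would then be drawn from the candidates passing this check; how ties were resolved is not stated in the text.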

Laboratory analyses

For each participant, a blood sample was collected at enrolment and stored in citrate (Italy) or EDTA (Sweden). Within two hours, plasma was separated and placed in cold storage. Biosamples underwent inflammatory profiling in two distinct phases, and 32 inflammation-related proteins were measured using the Milliplex HCYTOMAG-60K and HSCYTMAG-60SK kits (Millipore, Billerica, MA), according to the manufacturer’s protocol45. Four analytes were excluded from further statistical analyses due to high rates of non-detection (>75%). In total, 12 cytokines, 10 chemokines and 6 growth and angiogenic factors were measured for all participants (Supplementary Table S6). Protein levels below the limit of detection were imputed using a maximum likelihood estimation method that exploits the correlation structure across proteins to draw the missing values46. Protein levels were log-transformed to normalise their distributions. As previously proposed47, linear mixed models can efficiently correct for technically induced variation, which may dilute the effects of interest. In practice, we adopted a two-step procedure. In the first step, we fitted a linear mixed model with protein levels as the response and a random intercept for the technical covariate (microtiter plate). In the second step, we subtracted from the measured protein levels the random effect estimates capturing the variation linked to technical covariates, to obtain denoised data (i.e. without the potential technical artefact)48.
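The two-step denoising idea can be illustrated for a single protein with a random intercept per plate. The sketch below is a simplification under stated assumptions and not the authors' implementation: instead of a full linear mixed model fit, it estimates the variance components with one-way ANOVA (method-of-moments) estimators, includes no other covariates, and shrinks each plate mean towards the grand mean accordingly. The function name `denoise_by_plate` is hypothetical, and the function expects at least two plates and more observations than plates.

```python
from collections import defaultdict
from statistics import mean


def denoise_by_plate(values, plates):
    """Subtract a shrunken per-plate intercept from log protein levels.

    Step 1 estimates plate-level random intercepts (approximated with
    ANOVA variance components, an assumption of this sketch).
    Step 2 subtracts those estimates from the measured values.
    """
    groups = defaultdict(list)
    for value, plate in zip(values, plates):
        groups[plate].append(value)

    n_total, n_plates = len(values), len(groups)
    grand_mean = mean(values)
    plate_mean = {p: mean(vs) for p, vs in groups.items()}

    # Between-plate and within-plate mean squares.
    ms_between = sum(
        len(vs) * (plate_mean[p] - grand_mean) ** 2 for p, vs in groups.items()
    ) / (n_plates - 1)
    ms_within = sum(
        (v - plate_mean[p]) ** 2 for p, vs in groups.items() for v in vs
    ) / (n_total - n_plates)

    # Effective plate size for unbalanced designs.
    n0 = (n_total - sum(len(vs) ** 2 for vs in groups.values()) / n_total) / (
        n_plates - 1
    )
    var_plate = max(0.0, (ms_between - ms_within) / n0)

    # Shrunken plate effects: small or noisy plates are pulled towards zero.
    effect = {}
    for p, vs in groups.items():
        denom = var_plate + ms_within / len(vs)
        shrink = var_plate / denom if denom > 0 else 0.0
        effect[p] = shrink * (plate_mean[p] - grand_mean)

    return [v - effect[p] for v, p in zip(values, plates)]
```

Because the plate effects are shrunken rather than fully removed, plate means in the denoised data move close to, but not exactly onto, the grand mean.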

Inflammatory measure

As proposed previously48, and under the working hypothesis that each inflammatory marker contributes consistently to the overall inflammatory burden, we summarized individual inflammatory status as a score derived from the 28 proteins assayed. For each protein, we defined a dichotomized indicator (“high concentration” = 1, “low concentration” = 0) based on the highest quartile of the denoised protein concentrations, and summed these binary indicators across the 28 proteins48. Principal component analysis (PCA) is a mathematical algorithm that reduces the dimensionality of the data while retaining most of the variation in the data set49. PCA of the 28 inflammatory marker concentrations showed that 19 PCs explained more than 90% of the total variation in the dataset (Supplementary Fig. S6). As a continuous and hypothesis-free alternative score, we used the scores of the first principal component (PC1, explaining 32.6% of the total variance of the 28 proteins).
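Both summary measures can be sketched in a few lines. The code below is an illustration under assumptions, not the authors' code: `inflammatory_score` and `first_pc` are hypothetical names, the quartile cut point uses Python's "inclusive" quantile definition (the text does not say which definition was used), values strictly above the cut point are counted as "high", and PC1 is obtained by power iteration on the covariance matrix without scaling the proteins (whether the data were standardized before PCA is not stated). Rows are participants and columns are proteins.

```python
import math
from statistics import quantiles


def inflammatory_score(rows):
    """Count, per participant, the proteins falling in the top quartile."""
    scores = [0] * len(rows)
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        # Third quartile cut point (quantile definition is an assumption).
        q3 = quantiles(column, n=4, method="inclusive")[2]
        for i, value in enumerate(column):
            if value > q3:
                scores[i] += 1
    return scores


def first_pc(rows, n_iter=1000):
    """Return (PC1 scores, share of variance explained) by power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [
        [sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Deterministic, non-uniform start vector to reduce the chance of
    # starting orthogonal to the leading eigenvector.
    start = [float(j + 1) for j in range(p)]
    start_norm = math.sqrt(sum(x * x for x in start))
    vec = [x / start_norm for x in start]
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    eigenvalue = sum(
        vec[a] * cov[a][b] * vec[b] for a in range(p) for b in range(p)
    )
    total_variance = sum(cov[a][a] for a in range(p))
    pc_scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return pc_scores, eigenvalue / total_variance
```

With 28 proteins, the first function returns an integer between 0 and 28 per participant, while the second returns a continuous value whose sign is arbitrary, as for any principal component.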

Statistical analysis

We compared baseline characteristics of the total population and by cancer sub-population using the Chi-squared test or Fisher’s exact test for categorical variables and the t-test or Wilcoxon rank test for continuous variables. Categorical variables are reported as percentages, while continuous variables are reported as mean (and standard deviation).

Our inflammatory score and its continuous alternative (PC1) were regressed against the disease outcome (case-control status, in the BC and B-cell NHL populations separately) in the pooled population. For each type of cancer, we used the same model adjusting for potential confounders collected at enrolment: age, gender, phase, cohort and center (Model 1). Further potential confounders included behavioural factors: body mass index (BMI, continuous variable, kg/m2); smoking status (categorical: current, former, never); alcohol consumption (continuous, g/day); physical activity (categorical: inactive, moderately inactive, moderately active, active); and education (categorical: none/primary, professional/technical, secondary, university/college) as a proxy for socioeconomic status. Building on model 1, we controlled separately for behavioural factors (hereafter referred to as model 1 + behaviours) and for participants’ educational level (hereafter referred to as model 1 + socioeconomic position). The fully adjusted model included the behavioural and socioeconomic factors listed above and, for BC analyses only, was additionally adjusted for reproductive/hormonal variables. The latter comprised menopausal status (categorical: post-menopausal, pre-menopausal, unknown); contraceptive usage (categorical: yes, no); age at menarche (binary: ≤12 years old or >12 years old); menopausal hormone use (categorical: yes, no) and parity (quantitative discrete: 0, 1, 2, 3, >4).
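One way to read the reported β values is as the adjusted difference in score between cases and controls from a linear regression of the score on case-control status and covariates. The sketch below illustrates that reading only; it is an assumption about the model form, not the authors' code. The names `ols` and `adjusted_case_effect` are hypothetical, categorical covariates would need to be dummy-coded before use, and a collinear design matrix makes the solver fail with a division by zero.

```python
def ols(design, response):
    """Least-squares coefficients via normal equations and Gauss-Jordan."""
    p = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(row[a] * row[b] for row in design) for b in range(p)]
        + [sum(row[a] * y for row, y in zip(design, response))]
        for a in range(p)
    ]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * z for x, z in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def adjusted_case_effect(score, is_case, covariates):
    """Coefficient for case status, adjusted for numeric covariates.

    `covariates` is a list of rows, one per participant, holding numeric
    (or dummy-coded) adjustment variables such as age.
    """
    design = [
        [1.0, float(case)] + [float(x) for x in covs]
        for case, covs in zip(is_case, covariates)
    ]
    return ols(design, score)[1]
```

A negative coefficient then corresponds to a lower score in cases than in controls at equal covariate values, matching the direction of the estimates reported in the Results.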

Sensitivity analyses

We fitted our models on data for each cohort separately (EPIC-Italy and NSHDS). To evaluate the impact of follow-up time and consider a potential effect of the preclinical phase of the disease, analyses were also stratified by time to diagnosis, defined as the time elapsed between blood collection and clinical diagnosis: below or above the median time to diagnosis (6 years) across the entire population of cancer cases. We used the same time to diagnosis cut-off (below or above 6 years) for each type of cancer since the median time to diagnosis varies little across cancer types (5.84 years in BC and 6.09 years in B-cell NHL). In these analyses, cases were compared to all controls to preserve statistical power. Analyses were also performed by BC and B-cell NHL histological subtypes.
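The stratification by time to diagnosis, with each stratum of cases compared to all controls, can be sketched as follows. This is a minimal illustration with hypothetical field names (`case`, `ttd_years`); placing cases diagnosed at exactly 6 years in the later stratum is an assumption, since the text only says "below or above".

```python
def stratify_by_ttd(participants, cutoff_years=6.0):
    """Build the two analysis sets used for the time-to-diagnosis strata.

    Each set contains the cases of one stratum plus ALL controls, so the
    controls are reused across strata to preserve statistical power.
    """
    controls = [p for p in participants if not p["case"]]
    early_cases = [
        p for p in participants if p["case"] and p["ttd_years"] < cutoff_years
    ]
    late_cases = [
        p for p in participants if p["case"] and p["ttd_years"] >= cutoff_years
    ]
    return {"early": early_cases + controls, "late": late_cases + controls}
```

Each of the two returned sets would then be analysed with the same regression models as the full population.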

Statistical analyses were performed in R v3.250 using the RStudio environment v.0.99.484. P-values below 0.05 were considered statistically significant.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.