Introduction

Low back pain (LBP) constitutes one of the most common global causes of disability regardless of age [1]. Prevalence of LBP has been reported to reach adult levels by the end of puberty [2], and some studies have identified childhood LBP as a significant risk factor for LBP in adulthood [3].

Despite extensive research, the etiology of LBP remains unknown, resulting in unpredictable treatment outcomes. Imaging findings of disc degeneration (DD) become more prevalent with age and are frequently seen in asymptomatic individuals [4]. Findings of DD on magnetic resonance imaging (MRI) in early adulthood seem to predispose the individual to a more rapid progression of degeneration [5]. A systematic review found strong evidence for the association of disc herniation on MRI with progression of DD, while evidence for the extent of disc degeneration or the presence of Modic 1 changes was insufficient [6].

MRI findings such as disc degeneration, disc protrusion, bulge, and extrusion have been associated with LBP [7]; the association of high intensity zones (HIZ) or Modic changes has proven inconsistent [7,8,9]. Moreover, the value of MRI findings in predicting future LBP seems low [10]. In a 30-year follow-up study of young males with LBP, degenerated lumbar intervertebral discs (LIVD) in early adulthood demonstrated progressive degeneration over time but did not predict future pain or disability [11].

Our present understanding of the natural history of the LIVD is insufficient to differentiate age-related changes from possible pathologic findings related to LBP. The few long-term MRI studies published to date lack clinical information [5] or report on specific populations [11, 12] or older subjects [13]. In the present study, our objective was twofold: first, to describe the natural history of LIVD from childhood to adulthood; and second, to investigate whether findings of DD were associated with the clinical symptom of LBP.

Methods

Study design and subject recruitment (Fig. 1)

Fig. 1 Flow diagram of the recruitment process

In 1994, we randomly recruited healthy 8-year-old school children from six primary schools in the urban capital area of Helsinki to investigate the normal growth of the lumbar spine from childhood to early adulthood. The subjects (n = 94 at the beginning of the study) were examined at the ages of 8, 11, and 18. In January 2021, an invitation letter for a follow-up examination was sent to all subjects who were enrolled in 1994 and whose contact details were available from the national population data service agency (n = 89). Subjects who did not respond to the first invitation letter were contacted again by mail.

All four investigations included a semi-structured interview, a clinical examination and a lumbar spine MRI. The final examination in 2021 additionally included selected patient-reported outcome measures (PROM).

Semi-structured interview

The semi-structured interview included the following information: physical demands of work, sports activities and smoking habits; history of LBP without specific trauma (last week/last month/last year/earlier); usage of pain medication; and history of contacting a healthcare provider due to LBP.

Patient-reported outcome measures

The participants filled in the following standardized PROMs before the clinical examination: International Physical Activity Questionnaire-Short Form (IPAQ-SF) to assess their physical activity level [14], the EQ-5D-5L for quality of life [15], Oswestry Disability Index (ODI) as a symptom-specific measure [16] and Numeric Rating Scale (NRS, 11-point scale) for pain intensity during the past week including back, neck, arm and leg pain [17].

Clinical examination

The subjects’ weight was measured with a balance-beam scale; height was self-reported. Body Mass Index (BMI) was calculated using the standard formula for adults (weight in kilograms divided by the square of height in meters). In addition, a standard clinical examination was performed.
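The BMI calculation described above can be sketched as follows. This is a minimal illustration only; the function name `body_mass_index` and the example values are hypothetical and are not taken from the study data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical example values, not study data.
example_bmi = body_mass_index(weight_kg=72.0, height_m=1.80)
print(round(example_bmi, 1))
```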

MRI investigation

The MRI investigations at the ages of 8 and 11 were performed with a high-field 1.0 T MRI scanner using a dedicated spine coil, while the subsequent investigations were obtained with a high-field 1.5 T MRI scanner. All MRI investigations were performed before 10 am to prevent possible diurnal variations in the signal intensity (SI). Specifically, the MRIs in 2021 were performed using a 1.5 T Canon Vantage Orian scanner (Canon Medical Systems Corporation, Otawara, Japan), a dedicated spine coil and the following four sequences:

T2 sagittal sequence (TR 4950 ms, TE 120 ms, FOV 300, slice thickness 3.5 mm, slice interval 0.3, acq2).

T1 sagittal sequence (TR 500 ms, TE 9 ms, FOV 300, slice thickness 3.5 mm, slice interval 0.3, acq1).

STIR sagittal sequence (TR 4600 ms, TE 50 ms, TI 125, FOV 300, slice thickness 3.5 mm, slice interval 0.3, acq1).

T2 axial sequence with imaging planes in the direction of the three lowest discs 1 + 1 + 1 (TR 3150 ms, TE 120 ms, FOV 220, slice thickness 3 mm, slice interval 0.3, acq1).

The morphology of the three lowest LIVDs on sagittal T2-weighted images was assessed both qualitatively and quantitatively. For the qualitative visual evaluation, the MRI images, including the historical ones, were assessed using the Pfirrmann classification [18]. A musculoskeletal radiologist (second author) and a spine surgeon (last author) independently graded the discs without any knowledge of the subject's clinical characteristics. The inter-rater reliability (IRR, kappa) between the two evaluators for the three lowest intervertebral discs ranged from 0.73 to 0.84, indicating substantial agreement [19]. In case of discrepancy, we used the assessment of a third evaluator (fifth author) for consensus. The individual grades (1–5) of the three LIVDs were added up for a Pfirrmann summary score (range 3–15) [20]. The visual analysis included assessment of any possible HIZ or Modic changes, as well as disc protrusions and herniations.
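The summary score and the agreement statistic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the kappa shown is the unweighted Cohen's kappa (the text does not state whether a weighted variant was used), and the example grades are invented, not study data.

```python
from collections import Counter
from typing import Sequence


def pfirrmann_summary_score(grades: Sequence[int]) -> int:
    """Sum the Pfirrmann grades (1-5) of the three lowest lumbar discs (range 3-15)."""
    if len(grades) != 3:
        raise ValueError("expected grades for exactly three discs")
    if any(g not in (1, 2, 3, 4, 5) for g in grades):
        raise ValueError("Pfirrmann grades must be integers from 1 to 5")
    return sum(grades)


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two raters grading the same discs (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of discs on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Invented example grades for illustration only.
print(pfirrmann_summary_score([2, 3, 4]))
print(round(cohen_kappa([1, 2, 2, 3, 3, 4], [1, 2, 3, 3, 3, 4]), 2))
```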

For the computerized quantitative evaluation of the SI of the three lowest LIVDs, an ellipsoid region of interest (ROI) was digitally marked in midline on T2-weighted sagittal images from each nucleus pulposus. As an internal reference, the SI of the adjacent cerebrospinal fluid (CSF, lat. Liquor cerebrospinalis) was used resulting in a disc to CSF-SI ratio (SINDL) [21].
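The disc-to-CSF ratio described above can be sketched as follows. This is an illustration under assumptions: each ROI is summarized by its mean pixel intensity (the text does not specify the summary statistic), the function name `disc_to_csf_ratio` is hypothetical, and the pixel values are invented.

```python
from statistics import mean
from typing import Sequence


def disc_to_csf_ratio(disc_roi: Sequence[float], csf_roi: Sequence[float]) -> float:
    """Mean signal intensity of the disc ROI divided by that of the adjacent CSF ROI."""
    if not disc_roi or not csf_roi:
        raise ValueError("both ROIs must contain at least one pixel value")
    csf_mean = mean(csf_roi)
    if csf_mean <= 0:
        raise ValueError("CSF reference intensity must be positive")
    return mean(disc_roi) / csf_mean


# Invented pixel intensities for illustration only.
print(disc_to_csf_ratio([200, 220, 180], [400, 400]))
```

Because both intensities come from the same image, the ratio is unitless, which is what allows comparison across scanners and time points.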

This manuscript follows the STROBE guidelines for reporting observational studies when applicable.

Data analysis

Data are presented as means with standard deviation (SD), medians with interquartile range (IQR) or counts with percentages. Statistical comparison between the groups was performed by t-test, Mann–Whitney test or Chi-square test, when appropriate. When the assumptions were violated (e.g., non-normality), a bootstrap-type method was used for continuous variables and Monte Carlo p-values were used for categorical variables (small number of observations). Repeated measures in Pfirrmann summary score and SINDL between the groups were analyzed using mixed-effects models with an unstructured covariance structure (Kenward–Roger method to calculate the degrees of freedom). Fixed effects included group, time and group × time interactions. The repeated measurements were taken at four ages: 8, 11, 18, and 34 years. Mixed models allowed analysis of unbalanced datasets without imputation; therefore, we analyzed all available data with the full analysis set. The normality of variables was evaluated graphically and by using the Shapiro–Wilk W test. Stata 17.0 (StataCorp LP, College Station, TX, USA) was used for the analysis.
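As a rough sketch of the model structure described above, the repeated-measures model could be written as follows. The notation is an assumption introduced here for illustration and is not taken from the analysis itself: $y_{ij}$ is the outcome (Pfirrmann summary score or SINDL) of subject $i$ at measurement occasion $j$ (ages 8, 11, 18, and 34), $g_i$ indicates the LBP group, and time is treated as a categorical factor with the first occasion as reference.

```latex
y_{ij} = \beta_0 + \beta_1 g_i
       + \sum_{k=2}^{4} \gamma_k \,\mathbb{1}[j = k]
       + \sum_{k=2}^{4} \delta_k \, g_i \,\mathbb{1}[j = k]
       + \varepsilon_{ij},
\qquad
\boldsymbol{\varepsilon}_i = (\varepsilon_{i1}, \ldots, \varepsilon_{i4})^{\top}
\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})
```

Here $\beta_1$ is the group effect, $\gamma_k$ are the time effects, $\delta_k$ are the group × time interactions, and $\boldsymbol{\Sigma}$ is an unstructured $4 \times 4$ covariance matrix, meaning that each variance and covariance among the four occasions is estimated freely.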

Results

Fifty-one of the 89 invited subjects consented to this follow-up. The final study group comprised 48 subjects (24 males and 24 females). Two subjects only wanted to attend the clinical examination; one additional MRI was cancelled due to pregnancy. Mean age of the subjects at the time of the final investigation was 34.2 years (SD 0.6). LBP without specific trauma was reported by 35 subjects (73%). Statistically significant differences in the use of pain medication (n = 17 vs n = 0, p < 0.001) and the ODI (25.9 vs 23.3, p = 0.045) were seen between subjects with and without LBP, respectively. Table 1 shows a more detailed description of the study population.

Table 1 Characteristics of 48 subjects according to LBP at the age of 34

Fourteen subjects had contacted a healthcare provider due to their LBP symptom. None of the subjects had a history of spine surgery. Of the 19 subjects who reported LBP at the age of 18, 16 still reported low back pain at the age of 34; of the 29 subjects who did not report LBP at the age of 18, 10 remained asymptomatic at the age of 34 with the remaining 19 reporting new-onset LBP.
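The transition counts in the paragraph above can be cross-checked with a short tabulation. This is a sketch for the reader's convenience; the variable names are hypothetical, and the only inputs are the counts stated in the text.

```python
# Counts taken from the paragraph above (LBP status at age 18 -> status at age 34).
transitions = {
    ("lbp_18", "lbp_34"): 16,
    ("lbp_18", "no_lbp_34"): 19 - 16,
    ("no_lbp_18", "lbp_34"): 19,
    ("no_lbp_18", "no_lbp_34"): 10,
}

total = sum(transitions.values())
lbp_at_34 = sum(n for (_, later), n in transitions.items() if later == "lbp_34")
lbp_at_18 = sum(n for (earlier, _), n in transitions.items() if earlier == "lbp_18")

# Share of subjects with LBP at 18 who still reported LBP at 34.
persistence = transitions[("lbp_18", "lbp_34")] / lbp_at_18
# Share of subjects without LBP at 18 who reported LBP at 34.
new_onset = transitions[("no_lbp_18", "lbp_34")] / (total - lbp_at_18)

print(f"total={total}, LBP at 34={lbp_at_34}")
print(f"persistent LBP: {persistence:.0%}, new-onset LBP: {new_onset:.0%}")
```

The totals agree with the 48 subjects in the final study group and the 35 subjects reporting LBP at the age of 34.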

Pfirrmann summary score and its association with LBP

The Pfirrmann summary score significantly increased with age (p < 0.001) with the mean score increasing by 3.7 (95% CI: 3.3 to 4.1) during the study period from the initial 3.8 (SD 0.8) at the age of 8. A statistically significant difference was seen in the Pfirrmann summary score at the age of 18 and 34 among those subjects who reported LBP at the age of 34 compared to those who did not report LBP at the age of 34 (p = 0.004 at age 18, p = 0.039 at age 34), Fig. 2.

Fig. 2 The mean Pfirrmann summary score during the study period for all subjects and according to LBP at the age of 34. Error bars are for 95% confidence intervals

Visual assessment of disc changes and their association with LBP

Table 2 illustrates the prevalence of Modic changes, HIZ and disc protrusions in our study population at the age of 34. No statistically significant differences were noticed between subjects with or without LBP.

Table 2 Prevalence of Modic, HIZ, and disc protrusions in the study population at the age of 34

SINDL and its association with LBP

When assessing SINDL for the three lowest LIVDs separately, a statistically significant decrease with age was noticed at all disc levels (p < 0.001 for all levels). No significant differences in SINDL emerged at any of the disc levels between subjects with or without LBP at the age of 34, Fig. 3a–c.

Fig. 3 The mean SINDL for the three lowest LIVDs (a L3/L4, b L4/L5, c L5/S1) separately during the study period for all subjects and according to LBP at the age of 34. Error bars are for 95% confidence intervals

Discussion

Our objective was to describe the natural development of LIVDs (L3/4, L4/5, L5/S1) from childhood to adulthood in a population randomly selected from healthy school children in 1994. We further wanted to investigate whether disc changes on MRI were associated with LBP. Our main finding was that subjects who reported LBP at the age of 34 had statistically significantly higher Pfirrmann summary scores already at the age of 18. The difference remained statistically significant at the age of 34, although it was less pronounced than earlier because degenerative changes had also progressed in the asymptomatic subjects. In the clinical setting, this finding suggests that either more severe DD in a single LIVD or more extensive DD in multiple LIVDs after the pubertal growth spurt may be associated with LBP in adulthood.

When the SI of the three lowest LIVDs relative to the SI of the adjacent CSF (SINDL) was calculated separately for each level, no statistically significant differences were noticed between subjects with or without LBP at the age of 34. This may indicate that changes detected in a single LIVD are not as consequential as more widespread DD. The Pfirrmann summary score introduced by Määttä et al. [20] enabled us to analyze the disc morphology of the lower lumbar spine in its entirety for a more comprehensive understanding.

Some evidence suggests that LBP in childhood or adolescence may predict LBP in adulthood [3]. While most of our subjects with LBP at the age of 18 reported continued LBP at the age of 34, many subjects who were asymptomatic at the age of 18 also developed new LBP by the age of 34. Thus, our results do not allow for any conclusions on the predictive value of early-onset LBP per se for LBP symptoms in adulthood.

Sääksjärvi et al. [11] recently published a 30-year follow-up on 26 male subjects who were initially examined by MRI at the age of 20 due to LBP severe enough to exclude them from military service. In their study population, LIVDs with even slightly decreased SI at baseline were more likely to have severely decreased SI at follow-up compared to healthy discs. However, contrary to our results, they did not notice any significant association between the severity of DD at baseline and current pain or disability. Their study population differed from ours in that all their subjects had LBP at baseline, which might have affected their results. Further, although they used a relative measure of disc SI, their reference was the least degenerated lumbar disc with the highest SI; this might have affected their results, especially in older subjects. In our study, we used the adjacent CSF, which has proven useful as an SI reference [22].

Our study has several limitations. The concept of self-reported LBP should be taken with certain reservations. It is noteworthy that our asymptomatic population also reported some LBP over the past week in NRS. Subjects who reported LBP nevertheless used significantly more pain medication, either regularly or occasionally, suggesting that their symptoms were more bothersome. The difference between the groups in ODI was statistically significant but did not reach clinical significance. Further, in our inquiry of lifetime prevalence of LBP, memory decay might affect the results, although this is less likely with more severe or recurrent symptoms. It is also possible that, due to loss to follow-up, our study population was biased toward subjects with a current or previous history of LBP. Nineteen of our 48 subjects (40%) reported LBP at the age of 18, indicating that the majority of subjects in this long-term follow-up did not experience LBP during childhood or adolescence.

For the visual assessment of LIVDs, we used the Pfirrmann classification. The IRR between the two primary evaluators was substantial and comparable to previous literature [23]. The assessment of the third evaluator was used for consensus. The quantitative disc signal measurement (SI) from mid-sagittal MRI has proven a reliable measure of DD [24]. Our study spanned nearly three decades (1994 to 2021), covering the evolution of MRI technology. The relative measure between the SI of the disc and the adjacent CSF (SINDL) allowed us to compare LIVDs within a single individual and between individuals over time. We only assessed the three lowest LIVDs because we did not expect significant DD in the upper lumbar spine in our young study population.

Our study is probably underpowered to detect some possible associations between disc morphology and LBP; for example, Modic 1 changes were present in only four subjects. On the other hand, the association between the Pfirrmann summary score at the ages of 18 and 34 and LBP at the age of 34 was statistically significant. Despite loss to follow-up, to our knowledge, this is the first longitudinal study describing the natural history of LIVDs on MRI from childhood to adulthood with concurrent data on the clinical symptom of LBP.

In conclusion, subjects who reported LBP at the age of 34 presented with more severe or widespread DD already at the age of 18 compared with asymptomatic subjects; the difference was still present at the age of 34. Our finding is clinically important considering the high prevalence of both LBP and MRI findings in adolescents [25], and might offer some clues toward a better understanding of “normal” age-related versus “pathologic” DD.