Introduction

Lumbar spinal stenosis (LSS) is caused by structural and degenerative changes of the lumbar spine leading to an anatomical narrowing of the spinal canal or the neural foramina. Symptoms of LSS include low back pain, radicular pain in the lower extremities and neurogenic claudication. LSS may be treated conservatively or surgically, but available literature offers no firm guidelines for the choice of treatment [1, 2]. However, surgical treatment of LSS has become the most frequently performed spinal surgery in adults [3]. Evidence suggests that patients with LSS and predominant leg pain or neurogenic claudication are more likely to benefit from decompression surgery [4]. Patients with physical comorbidities, depression, smoking habits and spinal deformities show inferior postoperative outcomes after surgery for LSS [5, 6]. Imaging findings such as grade of spinal canal stenosis, degeneration of the intervertebral discs and fatty infiltration (FI) of the paraspinal muscles may influence patient outcomes after surgery for LSS as well [5, 7, 8].

FI of the paraspinal muscles is a frequent imaging finding in patients with degenerative LSS. Mainly formed by the multifidus and the erector spinae, the paraspinal muscles are important in movement and stability of the lumbar spine. It has been suggested that FI of these muscles may be associated with increased risk of late complications after stabilising lumbar spine surgery, such as hardware loosening, proximal junctional kyphosis and adjacent segment degeneration [9, 10]. Preoperative assessment of paraspinal muscles’ FI in patients with LSS may thus provide useful information for more accurate selection of patients and decision-making before surgery and may further play a role in improvement of the postoperative outcomes by means of rehabilitation [11].

Systematic reviews report conflicting evidence for associations between FI of the paraspinal muscles and pre- and postoperative pain and disability in various degenerative diseases of the lumbar spine [12,13,14]. The differing results may partly be explained by the use of different imaging methods (e.g. quantitative vs. qualitative) for assessing FI or by the varying psychometric properties of these methods [15].

Magnetic resonance imaging (MRI) is the imaging method of choice for the evaluation of LSS. MRI is also an excellent non-invasive method for the assessment of muscular FI. Both quantitative and semiquantitative MRI methods have been used for this assessment; quantitative methods have demonstrated higher reliability compared to the semiquantitative methods [16].

In a previous publication, we demonstrated high reliability for a simplified MRI method for the assessment of FI of the paraspinal muscles called the muscle fat index (MFI) [16]. In this method, the T2-signal intensity of the paraspinal and the psoas muscles is used as a surrogate for the amount of muscular FI. This simplified quantitative method can be performed on routine lumbar MRI examinations using a standard picture archiving and communication system (PACS) solution without a need for additional software. In the current study, we hypothesised that preoperative FI of the paraspinal muscles assessed by the MFI may be associated with postoperative pain and disability in LSS. Thus, the aim of the current study was to evaluate the association between preoperative FI of the paraspinal muscles, assessed on MRI by the MFI, and pain or disability 2 years after surgery for LSS.

Methods

This prospective observational study was approved by the Norwegian regional committees for medical research ethics (reference number 2011/2034 central region) and adhered to the Declaration of Helsinki. All patients provided written informed consent. The study is compliant with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement.

Patients

Inclusion and exclusion criteria for this study are summarised in Table 1. The included patients were enrolled from the NORwegian Degenerative spondylolisthesis and spinal STENosis-the spinal stenosis trial (NORDSTEN-SST) [17]. Symptomatic patients with LSS without degenerative spondylolisthesis were included in this prospective multicentre trial. The diagnosis and severity of spinal stenosis were confirmed and documented on preoperative MRI examinations. All included patients were scheduled for surgical decompression of the spinal canal and randomised to surgery by either unilateral laminotomy with cross-over technique, bilateral laminotomy or spinous process osteotomy. All three techniques led to comparable improvement of the postoperative outcomes without any significant differences [18].

Table 1 Inclusion and exclusion criteria

MRI examinations and assessments

The NORDSTEN is a multicentre study, and the MRI examinations were performed at several study sites on 1.5 or 3.0 Tesla units from different manufacturers between February 2013 and August 2016. At the outset of the study, the performing institutions were provided with a standardised MRI protocol including sagittal T1- and T2-weighted, and axial T2-weighted images. The MRI examinations were performed within 6 months before surgery. For the current study, we used preoperative axial T2-weighted images obtained from the L2–L5 levels (repetition time 1500–6548 ms; echo time 82–126 ms; slice thickness 3–4 mm and field of view from 160 × 160 to 220 × 220 mm²). Figure 1 shows the flowchart for inclusion and exclusion of the patients in the current study.

Fig. 1

Flowchart showing the selection process of the patients. NORDSTEN-SST Norwegian degenerative spondylolisthesis and spinal stenosis-spinal stenosis trial

FI of the paraspinal muscles (erector spinae and multifidus) was assessed using the MFI. This quantitative MRI method was previously proposed by the authors and demonstrated high interobserver and intraobserver reliability (intraclass correlation coefficient [ICC] 0.79 and 0.86–0.91, respectively) [16]. Using this method, the investigator segments the muscles by manually drawing one region of interest around the psoas muscle and another region of interest around the erector spinae and the multifidus muscles (together). The measurements are performed on a single axial T2-weighted slice at the level of the upper endplate of the lower vertebra at each lumbar level. The MFI is then calculated as a continuous variable by dividing the signal intensity of the psoas by the signal intensity of the erector spinae and multifidus at the same level and side of the spine. For a detailed description of this method, please see Fig. 2. Studies have suggested that the psoas muscle is less prone to FI [16, 19]; we therefore used this muscle as a natural control. MFI values close to 1.0 suggested near equal fat content in the paraspinal muscles and the psoas muscle, while values close to zero suggested a high fat content in the paraspinal muscles compared to the psoas muscle. The MFI calculations from the index level (the most stenotic level) and from the side (left or right) with the higher grade of FI (i.e. lower MFI) were used in the statistical analyses.

Fig. 2

Axial T2-weighted MR image obtained at the L3/L4 level (the most stenotic level). The muscle fat index (MFI) was calculated by dividing the mean signal intensity (SI) of the psoas (P) by the mean SI of the erector spinae and multifidus (ES + MF) muscles on the same side. In this case, the mean SI of P on the right and the left sides was measured as 854 and 731, respectively. Corresponding measures for the ES and MF (together in the same region of interest) were 1202 and 946, respectively. The MFI (SI of P divided by SI of ES and MF) was calculated as 0.71 on the right and 0.77 on the left. The MFI from the right side (suggesting higher fatty infiltration) was used in the statistical calculations
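The arithmetic in the Fig. 2 example can be reproduced with a short sketch. The function and variable names are illustrative assumptions and are not part of the published method or of any PACS tool; the signal intensities are the values quoted in the caption.

```python
def muscle_fat_index(psoas_si: float, paraspinal_si: float) -> float:
    """MFI = mean SI of the psoas / mean SI of erector spinae + multifidus (same side)."""
    if paraspinal_si <= 0:
        raise ValueError("paraspinal signal intensity must be positive")
    return psoas_si / paraspinal_si


def side_with_more_fat(mfi_by_side: dict) -> tuple:
    """Return (side, MFI) for the side with the lower MFI, i.e. more fatty infiltration."""
    side = min(mfi_by_side, key=mfi_by_side.get)
    return side, mfi_by_side[side]


# Mean signal intensities quoted in the Fig. 2 caption (L3/L4 level).
mfi = {
    "right": muscle_fat_index(854, 1202),
    "left": muscle_fat_index(731, 946),
}
side, value = side_with_more_fat(mfi)
print({k: round(v, 2) for k, v in mfi.items()}, "-> analysed side:", side)
```

The lower of the two values is selected because a lower MFI denotes more fat in the paraspinal muscles relative to the psoas.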

In the current study, we used the same study population and observers as in the previous publication where the reliability of the MFI method was documented [16]. Hence, no additional reliability analysis was performed for this parameter. One of the three independent investigators, a radiologist with 13 years of experience in spine imaging (HB), evaluated the MRI examinations of 300 consecutive patients from the NORDSTEN-SST, blinded to the clinical symptoms of the patients. The two other investigators were spine surgeons with 6 and 10 years of experience (EH and JA). The number of observations needed to detect a small to medium effect size (Cohen’s d = 0.4) is 200; to account for dropouts related to the suitability of the axial T2-weighted images for the evaluation of the paraspinal muscles, we enrolled 300 patients in the initial evaluation. In this initial evaluation by HB, examinations in which the axial T2-weighted images did not cover the psoas and the paraspinal muscles or were otherwise inadequate (such as marked volume reduction of the psoas on one side) were excluded (Fig. 1).
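The text does not state the significance level, power or test behind the figure of 200 observations. As a rough plausibility check only, the sketch below applies the standard normal-approximation formula for comparing the means of two equal-sized groups, assuming a two-sided alpha of 0.05 and 80% power. Both settings and the function name are assumptions made here, not details taken from the study.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size per group for a two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


n = n_per_group(0.4)  # Cohen's d quoted in the text
print(n, "per group,", 2 * n, "in total")
```

Under these assumed settings the formula gives 99 per group (198 in total), which is of the same order as the 200 observations cited; an exact t-based calculation would give a slightly larger number.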

All the investigators also graded the severity of spinal canal stenosis, intervertebral disc degeneration and facet joint osteoarthritis as categorical parameters using the Schizas [20], Pfirrmann [21] and Weishaupt [22] classifications, respectively. For better clinical relevance, these parameters were dichotomised into mild or severe categories, and we performed a reliability study between the three investigators. The definitions of these parameters and the reliability findings are summarised in Table 3. For these parameters, we used the two surgeons’ gradings for further analysis if they agreed, and the radiologist’s gradings if the surgeons disagreed. We defined the index level as the narrowest level between the L2 and L5 vertebrae, as assessed by the cross-sectional area of the dural sac (measured by HB). All the MRI measurements were performed using the integrated tools in a PACS application (Sectra IDS7, Linköping, Sweden) on personal laptops with integrated non-diagnostic monitors (for all the investigators).
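The two selection rules described above (which grading enters the analysis, and which level is the index level) can be written out as a minimal sketch. The function names, level labels and numeric values are hypothetical and serve only to illustrate the rules as stated in the text.

```python
def consensus_grade(surgeon_a: str, surgeon_b: str, radiologist: str) -> str:
    """Use the surgeons' grade when they agree, otherwise the radiologist's grade."""
    return surgeon_a if surgeon_a == surgeon_b else radiologist


def index_level(dural_sac_area_mm2: dict) -> str:
    """Return the level with the smallest dural sac cross-sectional area."""
    return min(dural_sac_area_mm2, key=dural_sac_area_mm2.get)


# Hypothetical measurements for one patient, for illustration only.
areas = {"L2/L3": 95.0, "L3/L4": 48.0, "L4/L5": 71.0}
print(index_level(areas))
print(consensus_grade("severe", "severe", "mild"))   # surgeons agree
print(consensus_grade("mild", "severe", "severe"))   # surgeons disagree
```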

Assessment of patient-reported outcomes

We used patient-reported outcome measures, assessed both preoperatively and 2 years after surgery (here called the 2 year follow-up outcome measures). The outcome measures used in this study included:

  • The Oswestry disability index (ODI version 2.0), a pain and disability index used in low back pain (scale ranging from 0 to 100, where 0 denotes no disability, and 100 indicates complete disability).

  • The Zurich claudication questionnaire (ZCQ) for symptom severity and physical function, a disease-specific questionnaire for LSS with three sub-scores including severity of the symptoms (scale ranging from 1 to 5, where 1 indicates the best clinical outcome) and physical function (scale ranging from 1 to 4, where 1 indicates the best clinical outcome).

  • Numeric rating scale (NRS) for back pain and leg pain intensity (ranging from 0 to 10, where 0 indicates no pain, and 10 indicates the worst pain imaginable).

To make the results more relevant from a clinical viewpoint, we also dichotomised each continuous outcome measure into success or failure, considering a minimum of 30% improvement from before surgery to 2 year follow-up score as an acceptable postoperative result (success) [23]. Although there are certain drawbacks with dichotomising continuous outcomes (such as loss of information and power of relationship), this method is frequently used and given a priori definition of the categories, it can enhance clinical relevance of the results of a study, and make the results more understandable [24].
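The success criterion can be expressed as a small sketch, assuming that improvement is computed as the relative decrease from the preoperative score (lower scores are better on all the scales used). The function names and example scores are hypothetical, and the handling of a zero baseline is not described in the text; the sketch simply rejects it.

```python
def relative_improvement(pre: float, post: float) -> float:
    """Fractional improvement from baseline; positive when the score decreases."""
    if pre <= 0:
        raise ValueError("baseline score must be positive to compute a relative change")
    return (pre - post) / pre


def is_success(pre: float, post: float, threshold: float = 0.30) -> bool:
    """Success = at least `threshold` (30% by default) improvement from baseline."""
    return relative_improvement(pre, post) >= threshold


# Hypothetical ODI scores, for illustration only.
print(is_success(40, 20))  # 50% improvement
print(is_success(40, 35))  # 12.5% improvement
```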

Statistical analyses

We used STATA software (StataCorp. LLC 2017. Stata Statistical Software: Release 17. College Station, TX, USA) for the statistical analyses. The MFI values (continuous variable) were described as means and standard deviations. Paired-sample t-tests were used to compare differences in means of the outcome measure scores (continuous variables) before and after surgery. P values < 0.050 were considered statistically significant. Multivariate linear and logistic regression analyses were performed with the 2 year follow-up outcome measures as continuous and dichotomous response variables, respectively. Associations between the MFI (at the index level) and the 2 year follow-up outcome measure scores in the linear and logistic regression analyses were adjusted for the preoperative outcome measure scores, age and body mass index (both treated as continuous variables), sex (male or female), smoking status (regular smoker yes or no), and grade of spinal stenosis, disc degeneration and facet joint osteoarthritis (each dichotomised into mild or severe). These covariates and possible confounders were deemed appropriate based on the clinical experience of the spine surgeons, radiologists and physical therapists in our research group, and on the published literature on the most relevant patient-related and MRI findings in postoperative outcomes of LSS [5, 25]. In the logistic regression models, the odds ratio (OR) of a minimum 30% improvement of the outcome measure scores at 2 year follow-up was calculated; 95% confidence intervals (CI) were calculated for the regression coefficients and the OR values.

Results

Patient characteristics, frequency distribution of the categorical MRI measurements and both preoperative and 2 year follow-up outcome measures are summarised in Table 2. Of 300 patients initially enrolled in the study, 57 patients were excluded. The causes of exclusions are presented in Fig. 1. At the end, MRI examinations and clinical data of 243 patients were evaluated. Mean age of the study participants at the baseline was 66.6 (standard deviation 8.5) years, 119 were females (49%) and 51 were regular smokers (21%). All the outcome measures showed significant improvement 2 years after surgery (p < 0.001, Table 2). We found substantial interobserver reliability for the Schizas classification, almost perfect for the Pfirrmann and moderate for the Weishaupt classification (Table 3).

Table 2 Patient characteristics and outcome measures
Table 3 Definitions and reliability values for categorical MRI evaluations

Associations between the preoperative MFI and the 2 year follow-up outcome measures

The results of unadjusted linear and logistic regression analyses are presented in Table 4. In the adjusted linear regression analyses, the preoperative MFI demonstrated a significant inverse association with the NRS leg pain 2 years after surgery (coef. −3.20, 95% CI −5.61, −0.80, p = 0.009) (Table 5). Thus, more severe preoperative FI in the paraspinal muscles was associated with less improvement of leg pain at 2 years. MFI was not significantly associated with the other outcome measures. Both female sex and smoking status were also associated with worse postoperative outcome measure scores (Table 5).

Table 4 Unadjusted linear and logistic regression values
Table 5 Adjusted linear and logistic regression values

In the multivariate logistic regression analyses, the MFI was again significantly related to postoperative NRS leg pain (OR = 1.51, 95% CI 1.17, 1.95, p = 0.002), with higher odds of achieving a minimum of 30% improvement of leg pain 2 years after surgery when the preoperative MFI was higher (denoting less FI). The associations between the preoperative MFI and improvement of ZCQ function (OR 1.21, 95% CI 0.99, 1.49) and NRS back pain (OR 1.22, 95% CI 0.99, 1.50) showed non-significant trends. The MFI was not significantly associated with any of the other dichotomised outcome measures. Among the covariates, female sex and smoking status showed statistically significant associations with some of the 2 year follow-up outcome measures (suggesting less favourable outcomes). Severe spinal stenosis (Schizas grades C and D) showed significant association with ZCQ symptoms and NRS back pain, but disc degeneration and facet joint osteoarthritis did not show significant associations with any of the outcome measures (Table 5). A graphic demonstration of the associations between the preoperative MFI and changes in the 2 year follow-up outcome measures is presented in Fig. 3.
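As a reader's consistency check (not part of the reported analysis), the p values for leg pain can be approximately recovered from the published estimates and 95% confidence intervals. The sketch assumes Wald-type intervals and a normal approximation; the function name is hypothetical. Odds ratios are handled on the log scale, where their interval is symmetric.

```python
from math import log
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def wald_p_from_ci(estimate: float, lower: float, upper: float) -> float:
    """Two-sided p value implied by an estimate and its 95% CI on an additive scale."""
    se = (upper - lower) / (2 * Z95)  # back-calculated standard error
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Logistic model for NRS leg pain: OR 1.51 (95% CI 1.17, 1.95), taken to the log scale.
p_logistic = wald_p_from_ci(log(1.51), log(1.17), log(1.95))
# Linear model for NRS leg pain: coefficient -3.20 (95% CI -5.61, -0.80).
p_linear = wald_p_from_ci(-3.20, -5.61, -0.80)
print(round(p_logistic, 3), round(p_linear, 3))
```

Under these assumptions the implied values are approximately 0.002 and 0.009, in line with the reported p values.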

Fig. 3

Forest plot demonstrating the estimated odds ratios (dots) and corresponding 95% confidence intervals (whiskers) for achieving a minimum of 30% improvement in the 2 year follow-up patient-reported outcome measures. NRS numeric rating scale, ODI Oswestry disability index and ZCQ Zurich claudication questionnaire

Discussion

In this study, we investigated the association between preoperative FI of the paraspinal muscles assessed by the MRI-based MFI, and postoperative pain as well as disability in patients treated with minimally invasive decompression surgery for LSS. We found a clear association between the preoperative MFI and persistent leg pain in these patients 2 years after surgery. This finding suggests that patients with FI of the paraspinal muscles may show less improvement of leg pain 2 years after surgery compared to those without FI. The MFI did not show any statistically significant associations with either the ODI, ZCQ symptoms or ZCQ function. Considering the overall improvement of the 2 year follow-up PROMs in our study, it is unclear whether less improvement of leg pain in patients with FI has any effect on the overall results of surgery.

The prognostic value of FI in the paraspinal muscles and its relationship to postoperative outcomes of LSS has previously been studied with different MRI methods. In a study by Betz et al. [14] using a categorical grading of muscular FI on MRI (Goutallier classification), the authors did not find any association with the 1 year follow-up ZCQ (symptoms and function) or NRS leg pain after instrumented fixation or minimally invasive surgery for LSS. In another study using a quantitative MRI method for the assessment of muscular FI (thresholding technique), the authors did not find any association between FI of the multifidus and visual analogue scale scores (for back and leg pain) or the ODI in patients undergoing posterior lumbar interbody fusion for LSS [26]. On the other hand, Storheim et al. [10] found that less fat in the multifidus muscle (assessed by a visual grading on MRI) was associated with better 2 year postoperative ODI after disc replacement surgery in patients with chronic low back pain. Liu et al. [27] also demonstrated an association between FI of the multifidus muscle on MRI (using thresholding technique) and ODI, 6 and 18 months after fusion surgery on patients with LSS. To our knowledge, the current study is the first to examine the prognostic value of quantitatively assessed FI of the paraspinal muscles on MRI in postoperative outcomes of patients undergoing minimally invasive decompression surgery for LSS.

It has been suggested that LSS patients with predominant leg pain symptoms are more likely to benefit from surgical decompression [4]. However, the pathomechanism of leg pain in LSS is complex, and based on the results of the current study, we cannot infer any causal link between FI of the paraspinal muscles and leg pain in these patients. Leg pain in patients with LSS may be a radiating pain from the lumbar nerves (radicular pain), a symptom of neurogenic claudication [28], or may originate from the supporting structures of the lumbar spine including vertebral bodies, ligaments, facet joints and the paraspinal muscles [29]. The latter is called “referred pain”, and to our knowledge, its aetiology and its possible effect on the treatment outcomes of LSS are not understood [30]. In the authors’ opinion, all three mechanisms mentioned above may be involved in the pathogenesis of leg pain in LSS. Different psychometric properties of the PROMs used in this study may also partly explain the stronger association between the MFI and leg pain. For instance, leg pain may not be measured similarly by the different PROMs (e.g. pain now in the ODI, during the last week in the NRS leg pain and during the last month in the ZCQ), and patients’ understanding of pain location may differ between these PROMs. Preoperative targeted rehabilitation of the paraspinal muscles has shown promising results in reduction of both preoperative [11] and postoperative [31] leg pain in patients with LSS, supporting the idea of a relationship between weakness in the paraspinal muscles and leg pain in these patients. Exercise may reduce FI of the paraspinal muscles in the short term [32], but to our knowledge, the long-term effect of rehabilitation on muscular FI in LSS is not known.

Another factor that may contribute to muscular FI in patients with LSS is limited physical activity. Studies have shown a link between FI of the skeletal muscles and functional disabilities [33]. We did not assess FI of leg muscles and cannot rule out the possible effect of immobility-induced FI in leg muscles on leg pain.

While the suggested diagnostic MRI criteria for LSS consider mainly factors affecting the anatomical narrowing of the spinal canal or the neural foramina, imaging findings of extraspinal factors such as FI of the paraspinal muscles are not emphasised [34]. We suggest that FI of the paraspinal muscles observed on preoperative lumbar MRI should be considered as a prognostic factor in patients undergoing surgery for LSS. However, due to the explorative design of the current study with multiple analyses, there is a chance that the results are spurious, and they need confirmation by other studies.

Female sex and older age have been identified as risk factors for increased FI of the paraspinal muscles [35]. After adjusting for these factors, the association between FI of the paraspinal muscles and leg pain remained significant in the current study. In line with the previous studies, the effect of smoking as a risk factor for postoperative leg pain, back pain and disability was significant in the current study [6, 36]. On the other hand, studies have not identified smoking as a risk factor for increased FI in the paraspinal muscles [35], supporting independent roles of smoking and FI regarding pain and disability in LSS.

Strengths and limitations

The main strengths of the current study were the large sample size and the inclusion of patients from a prospective multicentre trial. By excluding patients with conditions that may influence the signal intensity of the paraspinal muscles (such as inflammation, previous surgery or scoliosis), we reduced the chance of selection bias. A limitation was the inclusion of symptomatic patients with LSS scheduled for surgery, limiting the generalisability of the results to other LSS patients with milder symptoms who are not candidates for surgery. Another limitation was that we used a single image slice at each level to estimate the amount of fat in the whole volume of the paraspinal muscles, which stretch over several lumbar levels. This limitation applies to the other available MRI methods for the assessment of muscular FI as well. Further, despite the statistically significant association between the MFI and postoperative leg pain, we cannot draw any causal conclusions since FI of the paraspinal muscles is more likely to be a consequence of LSS and not the cause of leg pain.

More sophisticated MRI methods such as chemical-shift imaging and MR spectroscopy, which assess FI of the muscles directly, have shown high accuracy and association with back pain [37]. Assessment of the lean muscle mass by thresholding techniques is also a highly reliable method with varying degrees of association with symptoms of LSS [19, 27]. However, these quantitative MRI methods demand additional imaging and more time and resources for analysis. The main advantage of the MFI used in the current study is that it can be implemented in clinical practice without a need for time-consuming MR sequences or exporting the images into an external application.

Although patients with inflammatory or postoperative changes in the lumbar spine were excluded in this study, it is possible that mild oedema could have contributed to the increased signal intensity of the paraspinal muscles. It should be noted, however, that paraspinal muscle oedema in combination with fatty infiltration may indicate denervation resulting from injury to the dorsal rami of the lumbar nerves in LSS. Some of the patients in our study cohort had hip arthroplasty that may affect the paraspinal and psoas muscles. We did not assess this relationship in our study cohort. However, by adjusting for the preoperative PROMs in the regression analyses, possible effects of such changes on pain or disability were adjusted for.

Finally, acquisition of MRI examinations from different units and field strengths and variations in the imaging parameters in the current study may have obscured associations between the imaging findings and clinical outcomes. However, in everyday clinical practice, MR images are acquired from different units, and these variations are inevitable.

In conclusion, this study showed a statistically significant association between preoperative FI of the paraspinal muscles and less improvement of leg pain 2 years after decompression surgery for LSS. However, this association was not statistically significant for either the ODI or the ZCQ. Although LSS patients with FI of the paraspinal muscles may experience less improvement of leg pain after surgery, the overall influence of this MRI finding on the postoperative clinical outcomes is uncertain.