Introduction

Degenerative change of the lumbar spine is a major source of low back pain and disability in older people [1]. The processes of degeneration are complex and involve deterioration of both the anterior and posterior spinal columns. Kirkaldy-Willis [2] categorized the degenerative cascade into three phases (dysfunction, instability, and restabilization) and identified the correlation of pathological changes within the disk and facet joints. Recently, degeneration of the posterior spinal ligament has been considered to be one of the causes of low back pain [3, 4], prompting numerous studies to further evaluate the role of the interspinous ligament. These studies include investigations of its mechanical role in spinal stability using biomechanical testing [5, 6] and of anatomical, biochemical, and pathological changes in the degenerative aging spine [6, 7].

Magnetic resonance imaging (MRI) has emerged as the diagnostic method of choice for studying spinal degenerative pathology [8]. Because MRI provides excellent soft tissue evaluation and multiplanar capabilities, midline structures such as the spinous processes and the supraspinous and interspinous ligaments can be clearly viewed on mid-sagittal images [9]. MRI characteristics of interspinous ligament degeneration have been considered in previous studies [9, 10]. Although the clinical importance of the interspinous ligaments has been revealed, reproducibility studies of MRI evaluation of interspinous ligaments have not been performed. A standardized nomenclature is required for the comparison of data from different investigations [11]. Additionally, the reliability of the assessment tool has a critical influence on the validity of acquired data.

In this study, we propose a simple classification system for degenerative changes of the lumbar interspinous ligament as seen with MRI. The reliability of this system was evaluated by examining both intra- and interobserver reproducibility.

Materials and methods

Participants

In this prospective consecutive data collection, 118 positional MRI scans of the lumbar spine were collected from July to December 2007. All patients were referred for lumbar MRI for evaluation of symptoms of back and/or leg pain. MRI scans of 50 patients (26 males and 24 females) were randomly selected. The mean age was 48.8 years (range 23–85 years) and the mean body weight was 183.93 lb (range 115–240 lb). None of the patients had previously undergone lumbar spine surgery. The Institutional Review Board approved this study and informed consent was obtained from all participants. Four lumbar spinal levels (L2–3, L3–4, L4–5, L5–S1) were assessed in each patient, and a total of 200 interspinous ligament levels were evaluated on T1- and T2-weighted mid-sagittal images.

Grading system for interspinous ligament degeneration

MR classification for interspinous ligament degeneration was developed on the basis of a comprehensive literature review of previously published studies examining anatomical, radiological, and histological aspects [3, 4, 9, 10, 12]. We classified the degeneration into four grades according to the signal intensities and characteristics of structural changes of interspinous ligament and surrounding tissues using mid-sagittal images of T1- and T2-weighted MR images (Table 1).

Table 1 Classification of interspinous ligament degeneration

MRI technique

MRI of the lumbar spine was performed using a 0.6 Tesla MRI scanner (Upright Multi-Position MRI; Fonar Corp, Melville, New York). We examined the mid-sagittal T1-weighted spin echo images (repetition time 671 ms, echo time 17 ms, thickness 4.0 mm, field of view 30 cm, matrix 256 × 224, NEX 2) and mid-sagittal T2-weighted fast spin echo images (repetition time 3,000 ms, echo time 140 ms, thickness 4.0 mm, field of view 30 cm, matrix 256 × 224, NEX 2) with a quad channel planar coil.

Image assessment

The lumbar spine MR images were assessed independently by three spine surgeons with 3, 7, and 9 years of experience with lumbar spine MRI. Each reader analyzed the images on separate occasions after the selected images were randomly reordered, with a minimum interval of 1 week. Instructions explaining the classification system and a set of sample images were given to all readers during the review (Fig. 1). All readers were instructed to precisely follow the classification algorithm.

Fig. 1

Example of each grade of interspinous ligament degeneration (arrowhead). a Grade A low- to iso-signal intensity on T1- and T2-weighted images. b Grade B high signal intensity on T1- and T2-weighted images. c Grade C low signal intensity on T1-weighted images and high signal intensity on T2-weighted images. d Grade D low or iso-signal intensity on T1- and T2-weighted images with marked narrowing of the interspinous interval
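The grade definitions in the caption above can be sketched as a small decision function. This is only an illustrative sketch based on the Fig. 1 caption: the names `Signal` and `grade_interspinous_ligament` are hypothetical, the signal categories are assumed to be assigned visually by the reader, and Table 1 may contain further criteria that are not reproduced here. Signal combinations that the caption does not describe return `None` rather than a guessed grade.

```python
from enum import Enum
from typing import Optional


class Signal(Enum):
    """Visually assessed signal intensity category (hypothetical encoding)."""
    LOW = "low"
    ISO = "iso"
    HIGH = "high"


def grade_interspinous_ligament(
    t1: Signal, t2: Signal, marked_narrowing: bool
) -> Optional[str]:
    """Map T1/T2 signal and interspinous narrowing to a grade A-D.

    Follows the Fig. 1 caption only; returns None for combinations
    that the caption does not cover.
    """
    low_or_iso = {Signal.LOW, Signal.ISO}

    # Grades A and D share the same signal pattern; D adds marked narrowing.
    if t1 in low_or_iso and t2 in low_or_iso:
        return "D" if marked_narrowing else "A"
    # Grade B: high signal on both sequences.
    if t1 is Signal.HIGH and t2 is Signal.HIGH:
        return "B"
    # Grade C: low on T1, high on T2.
    if t1 is Signal.LOW and t2 is Signal.HIGH:
        return "C"
    return None


if __name__ == "__main__":
    print(grade_interspinous_ligament(Signal.LOW, Signal.HIGH, False))
```

Returning `None` for uncovered combinations is a design choice of this sketch; it keeps the function from silently assigning a grade that the published definitions do not support.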

Data analysis

The percentage of each grade assigned by each reviewer was determined. For these ordinal-level measurements, kappa statistics were used to evaluate agreement within readers (intraobserver agreement) and between readers (interobserver agreement) [13]. Kappa values were interpreted as suggested by Landis and Koch [14]: kappa 0–0.2 indicated slight agreement, 0.21–0.4 fair agreement, 0.41–0.6 moderate agreement, 0.61–0.8 substantial agreement, and 0.81–1.0 excellent agreement. The frequency of disagreement was calculated for each grade. All statistical analyses were performed using SPSS (version 15, SPSS, Chicago, IL, USA).
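As an illustration of the agreement statistic described above, the following minimal sketch computes an unweighted Cohen's kappa for two sets of grades and maps it to the interpretation bands listed in the text. It is not the SPSS procedure used in the study, and the text does not state whether a weighted or unweighted kappa was applied, so the unweighted form is an assumption. The function names and the six example grades are hypothetical and are not study data.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(ratings_a: Sequence[str], ratings_b: Sequence[str]) -> float:
    """Unweighted Cohen's kappa for two equally long lists of grades."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal grade frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Interpretation bands as listed in the text above."""
    if kappa < 0:
        # Not listed in the text; "poor" is the label in the Landis and Koch scale.
        return "poor"
    if kappa <= 0.2:
        return "slight"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "excellent"


if __name__ == "__main__":
    # Hypothetical grades for six levels (illustration only, not study data).
    first_read = list("AABBCD")
    second_read = list("AABACD")
    k = cohen_kappa(first_read, second_read)
    print(round(k, 3), interpret_kappa(k))
```

For intraobserver agreement the two lists would be one reader's first and second readings; for interobserver agreement they would be the readings of two different readers.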

Results

A total of 200 interspinous levels were analyzed in this study. The overall grades of interspinous degeneration assessed by each reader are shown in Table 2.

Table 2 Overall grading from three readers

Grade A was seen in 85–101 levels (42.5–50.5%), grade B in 76–92 (38–46%), grade C in 6–11 (3–5.5%), and grade D in 14–18 (7–9%), depending on the reader.

Intraobserver and interobserver reliability

The results of intra- and interobserver reliability are summarized in Table 3. The intraobserver agreement was excellent for all readers, with kappa values ranging from 0.840 to 0.901. Complete agreement within readers ranged from 181 (90.5%) to 188 (94%) of the 200 levels. As expected, interobserver agreement was lower than intraobserver agreement, ranging from substantial to excellent, with kappa values from 0.726 to 0.818. Complete interobserver agreement was achieved in 167 (83.5%) to 178 (89%) of the 200 levels.

Table 3 Kappa statistics of intraobserver and interobserver reliability
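The complete-agreement percentages quoted above follow directly from the counts out of 200 levels. The short sketch below reproduces that arithmetic; the function name `percent_agreement` is hypothetical, and only the four counts reported in the text are used.

```python
TOTAL_LEVELS = 200  # interspinous levels evaluated in the study


def percent_agreement(n_agree: int, total: int = TOTAL_LEVELS) -> float:
    """Percentage of levels on which two readings gave the same grade."""
    if total <= 0 or not 0 <= n_agree <= total:
        raise ValueError("n_agree must lie between 0 and total")
    return 100.0 * n_agree / total


if __name__ == "__main__":
    # Counts reported in the text (lowest and highest values in each range).
    reported = {
        "intraobserver, lowest": 181,
        "intraobserver, highest": 188,
        "interobserver, lowest": 167,
        "interobserver, highest": 178,
    }
    for label, count in reported.items():
        print(f"{label}: {count}/{TOTAL_LEVELS} = {percent_agreement(count)}%")
```

Raw percent agreement does not correct for agreement expected by chance, which is why the kappa values are reported alongside it.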

Evaluation of disagreement

The overall intraobserver and interobserver agreement and disagreement are shown in Table 4. As expected, disagreement of 3 grades (between grades A and D) was less frequent than disagreement of 1 or 2 grades in both intraobservation and interobservation. Also, disagreement of 2 grades was less frequent than disagreement of 1 grade. The relationship between different interspinous ligament degeneration grades and the frequency of disagreement is shown in Table 5. In the groups with a disagreement of 1 grade, a difference between grades A and B was much more frequent than between grades B and C or between grades C and D in both intraobservation and interobservation, and represented the highest rate of disagreement in the study. In the groups with a disagreement of 2 grades, the percentage of differences between grades A and C was slightly higher than between grades B and D in both intraobservation and interobservation. Interestingly, disagreement was less frequent when the difference involved grade D (grades A and D, B and D, and C and D).

Table 4 Intraobserver and interobserver agreement and disagreement
Table 5 The relation of different interspinous ligament degeneration grades and frequency of disagreement
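The disagreement analysis described above amounts to tallying, for each pair of readings, how many grades apart they are and which two grades were involved. The following is a minimal sketch of that tally; the function name `tally_disagreements` and the example grades are hypothetical and do not reproduce the values in Tables 4 and 5.

```python
from collections import Counter
from typing import Sequence, Tuple

GRADE_ORDER = "ABCD"  # ordinal order of the four grades


def tally_disagreements(
    first: Sequence[str], second: Sequence[str]
) -> Tuple[Counter, Counter]:
    """Count readings by grade distance (0-3) and disagreements by grade pair."""
    if len(first) != len(second):
        raise ValueError("both readings must cover the same levels")
    by_distance: Counter = Counter()
    by_pair: Counter = Counter()
    for a, b in zip(first, second):
        distance = abs(GRADE_ORDER.index(a) - GRADE_ORDER.index(b))
        by_distance[distance] += 1
        if distance > 0:
            # Store the pair in grade order so "AB" and "BA" count together.
            by_pair["".join(sorted((a, b)))] += 1
    return by_distance, by_pair


if __name__ == "__main__":
    # Hypothetical grades for six levels (illustration only, not study data).
    reading_1 = list("AABBCD")
    reading_2 = list("ABBACD")
    distances, pairs = tally_disagreements(reading_1, reading_2)
    print(dict(distances))
    print(dict(pairs))
```

A distance of 0 corresponds to complete agreement, and a distance of 3 can only arise from the pair A and D.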

Discussion

Many classifications of degenerative change of the lumbar intervertebral disk and facet joint osteoarthritis have been proposed. Additionally, the reliability of these grading systems has been tested [15, 16]. Posterior spinal ligaments, especially the interspinous ligament, significantly contribute to the stability of the spine [5, 6]. Recent studies have shown that significant pain relief can be achieved after interspinous ligament injections, supporting its possible role in low back pain [17, 18]. Although research regarding interspinous ligament degeneration has been increasing, few studies have focused on the MRI characteristics of the interspinous ligament [3, 7, 10]. We have developed a classification system for interspinous ligament degeneration based on a modified version of that proposed by Fujiwara et al. [10]. The modifications are based on a thorough radio-anatomic-histological literature review [3, 4, 9, 10, 12]. Our classification is based on the signal intensity of the interspinous ligament and specific characteristic changes within the ligaments using mid-sagittal T1- and T2-weighted MRI. In this study we focused on the reliability and reproducibility of this classification system and also on the frequency of disagreement at each grade.

The cascade of interspinous ligament degeneration has not been defined well. Prior studies have evaluated MRI findings in asymptomatic control subjects and non-pathologic cadaveric lumbar spines [7, 10], revealing low signal intensity on both T1- and T2-weighted images to correspond with the earliest stages of degeneration. This correlates with grade A interspinous ligament degeneration in our study. Prior radiologic–pathologic investigations have shown the correlation of the changes in radiographic images of human interspinous ligaments and histological findings [9, 10, 12, 19]. Although the histological examination revealed various degenerative changes within the interspinous ligament, the dominant characteristics were identified in MR findings. Marked fatty replacement with a high signal intensity on both T1- and T2-weighted images (staged as grade B in our study) might represent fatty degeneration within the ligament. Low signal intensity on T1- and high signal intensity on T2-weighted images (staged as grade C in our study) was found to correlate with a dominant extensive proliferation of cells and vascular invasion. This signal intensity was controversially considered to be “interspinous bursitis” (Baastrup’s disease) with a pathological correlation of increased vascularity, eburnation, and formation of bursae [3, 12, 19]. The overlap of these pathological findings was suggested to represent this stage’s association with inflammation [3, 9, 10]. Massive fibrosis with chondrometaplasia was predominantly observed as low signal intensity on both T1- and T2-weighted images with hypertrophy of the spinous process. Progressive loss of interspinous space, hyperplasia, sclerosis, and marrow changes within spinous processes were also considered to reflect severe ligament degeneration (staged as grade D in our study) [9]. Our investigation did not have a pathologic correlation since it would be extremely difficult to obtain pathologic specimens from the subjects. However, based on prior well-correlated radio-pathologic studies, it is likely that our MRI classification represents the interspinous ligament degeneration cascade.

The distribution of MRI characteristics of interspinous ligament degeneration has not been documented well. In our study, most interspinous ligament degeneration was grade A or B. Fujiwara et al. also found that two-thirds of their study population presented with an interspinous ligament signal intensity similar to that of grades A and B in our study. The signal change in grade C mimics that considered to be interspinous bursitis. The prevalence of this condition has been described in previous MRI studies as 8.2%, which is comparable to the percentage of grade C in our study [3]. Grade D, which may represent the most severe stage of degeneration, was also identified in a small number in our population.

Using our proposed classification, we found intraobserver reliability to be excellent in all observers. Interobserver reliability was lower; however, the values remained within substantial to excellent agreement. There was no obvious difference in kappa values between the three readers, who are all spine surgeons with different levels of clinical experience. Disagreement was relatively less frequent when the difference involved grade D (grades A and D, B and D, and C and D). This may be because discriminating signal intensity is more difficult than identifying interspinous interval narrowing or marrow or bony changes within the spinous processes. As expected, a difference of 1 grade occurred more often than a difference of 2 grades. Most of these differences were between grades A and B. This may be explained by the disproportionately high percentage of grades A and B, resulting in more frequent misinterpretation between these grades. A second possible cause is the increased difficulty of distinguishing bright and intermediate signal intensities, which are characteristic of grade B and grade A, respectively. This indicates that this classification may require higher-resolution imaging or objective measures of signal intensity.

There are a number of limitations of this type of study. First, we used 0.6 T MR imaging, which is not the dominant system in use today and may provide low-resolution images. Nonetheless, we found sufficient agreement with these images. Second, we cannot definitively define the clinical correlation of this classification system. Since this is a retrospective analysis, we could not confirm the presence of ligamentous pain with diagnostic injections. The primary focus of our study was to determine the reliability of this classification system. Clinical-radiographic relationship investigations need a standardized and reproducible imaging classification in order to compare outcomes. Additional clinical studies may be conducted using our proposed classification system.

In conclusion, we have described a classification system for interspinous ligament degeneration using mid-sagittal T1- and T2-weighted MRI. We have tested this classification system and found it to provide sufficient reliability and reproducibility. We believe that this classification is easy to apply and comprehend and may be used as a standardized nomenclature for clinical and radiographic investigations of interspinous ligament pathology.