Introduction

Sciatica caused by lumbar disc herniation has a prevalence of 2–5% in adults and has been estimated to be the reason for 6% of the population’s work disability, leading to a consumption of health care resources in up to 95% of the affected patient population [1, 2]. The majority of the affected patients recover spontaneously with non-operative treatment, but in case of persisting symptoms, surgery usually gives relief [3, 4].

The surgical procedure for lumbar disc herniation is standardized [3]. Earlier studies have shown large variations in the incidence of surgery for lumbar disc herniation between countries, ranging from 14/100,000 in Great Britain to 70/100,000 in the USA [5,6,7]. Even among the neighbouring Scandinavian countries, there was a twofold variation in surgical incidence: 29/100,000 inhabitants in Sweden, 46/100,000 inhabitants in Denmark and 58/100,000 inhabitants in Norway during the data collection period of the present study (2011–2013) [8,9,10]. With this variation in mind, we asked whether differences in surgical incidence were associated with differences in surgical selection criteria (preoperative patient characteristics) and treatment effectiveness (patient-reported outcomes). We used the core data set suggested by the International Consortium for Health Outcomes Measurement (ICHOM) for degenerative disorders of the spine, which was developed to facilitate comparisons between countries and surgical units [11].

Materials and methods

The protocol for this retrospectively designed study on prospectively collected data is registered on ClinicalTrials.gov (NCT02889484). This study has been prepared in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement.

We compared patients undergoing primary surgery for lumbar disc herniation in Sweden, Denmark and Norway. Similar core data sets exist in the national spine registers in these countries. In addition, the three countries have similar health care systems and similar socio-economic, ethnic and genetic backgrounds for the majority of the population, which would facilitate the interpretation of the results [12,13,14]. Outcome was assessed after 1, 2, 5 and 10 years in Sweden and Denmark, but only after 1 year in Norway. Hence, only 1-year data were used in this study. Studies comparing 1- and 2-year data after surgery for lumbar disc herniation have not shown relevant differences in outcome [15, 16].

The registers

The three national spine registers have similar processes of data collection. At admission for surgery (baseline), the patient completes a questionnaire including physical function, pain and health-related quality of life, without the assistance of health professionals. On a separate form, the surgeon registers diagnosis, type of surgical procedure and any complications occurring during the hospital stay. Co-morbidity is physician-reported in the Norwegian register and patient-reported in the Swedish and Danish registers. After one year, the patient completes the same questionnaire.

The Swedish spine register, Swespine, has included patients treated with surgery for lumbar disc herniation since 1998. Coverage, the proportion of operating centres using Swespine during the study period, was approximately 90%. Completeness, the proportion of operated patients reported to Swespine, was approximately 75%, and the follow-up at one year was approximately 70% [15].

The Danish spine register, DaneSpine, is based on Swespine and has successively been implemented since 2009. Coverage during the study period was approximately 80%, completeness was approximately 64%, and the follow-up at one year was approximately 57% [9, 17].

The Norwegian spine register, NORspine, was founded in 2007 and is based on experience from the Swespine register and on previous validation studies from a local clinical registry. Coverage during the study period was approximately 95%, completeness was approximately 65%, and the follow-up at one year was approximately 66% [18].

Informed consent

The registers in Sweden and Denmark apply the opt-out method, and therefore informed consent is not required, but answering the questionnaire is voluntary. In Norway, the opt-in method is used and patients provide informed consent.

Patients’ inclusion and exclusion criteria

From the three registers, we included patients with recorded sex who underwent primary surgery for lumbar disc herniation in Sweden, Denmark or Norway during 2011–2013. Other baseline data included self-assessed data on anthropometrics, smoking, co-morbidity and duration of leg and back pain. Exclusion criteria were age under 18 years or over 65 years, previous lumbar spine surgery, surgery other than discectomy only and anthropometrics outside normal ranges (Fig. 1). In a non-responder analysis, we compared the baseline characteristics of the 3497 patients (35%) who did not respond at the one-year follow-up with those of the 6468 patients who responded (Fig. 1).

Fig. 1 Flow chart of the study

Data collection

Anonymized individual-level data were acquired from all three national registers and merged on a common data server provided by Swespine.

Outcome instruments

The primary outcome measure was pain-related disability assessed with the Oswestry Disability Index (ODI) version 2.1 (from 0; no disability to 100; maximum disability) [19]. Secondary outcome measures were the Numerical Rating Scale (NRS) for leg and back pain (both ranging from 0; no pain to 10; maximum pain) and EQ-5D-3L, according to the British tariff UK-TTO (from −0.59; worst possible health to 1; perfect health) [20, 21].

The Norwegian spine register used the NRS for leg and back pain, while the Swedish and Danish spine registers used the visual analogue scale (VAS) scoring from 0 (no pain) to 100 (worst imaginable pain) [20]. To compare the data, conversion was done by dividing the VAS score by 10 with stochastic approximation of decimals to the closest integer.
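The conversion described above can be illustrated with a minimal Python sketch. It is not the registers' or the authors' code (the analyses were run in R), and it rests on an assumption: "stochastic approximation of decimals" is read here as stochastic rounding, in which the fractional part of VAS/10 is the probability of rounding up. The function name `vas_to_nrs` is hypothetical.

```python
import random


def vas_to_nrs(vas, rng=random):
    """Convert a 0-100 VAS score to an integer 0-10 NRS score.

    Assumption: the fractional part of vas / 10 is used as the
    probability of rounding up (stochastic rounding).
    """
    if not 0 <= vas <= 100:
        raise ValueError("VAS must lie between 0 and 100")
    scaled = vas / 10
    lower = int(scaled)            # floor, because scaled is non-negative
    fraction = scaled - lower      # 0.7 for a VAS of 47
    return lower + (1 if rng.random() < fraction else 0)


# A VAS of 47 becomes 5 with probability 0.7 and 4 with probability 0.3.
rng = random.Random(2024)          # seeded only to make the sketch repeatable
converted = [vas_to_nrs(47, rng) for _ in range(10)]
```

Under this reading the converted scores are integers, like native NRS responses, while the expected value of the converted score still equals VAS/10.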

Statistics

Descriptive data are presented as mean (95% confidence interval of the mean; 95% CI) or number (%). Analysis of variance (ANOVA), Student’s t test, Pearson’s Chi-square, likelihood ratio Chi-square test and linear regression tests were used for statistical comparisons, and the crude (unadjusted) data are presented. Unadjusted p values and adjusted p values obtained after case-mix adjustments are presented. Case-mix adjustments were made with baseline age, sex, BMI, smoking, co-morbidity, duration of leg pain and the preoperative value of the dependent variable. Missing data were excluded analysis by analysis.
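As an illustration of what case-mix adjustment by linear regression involves, the following Python sketch fits an ordinary least-squares model in which a country indicator is entered together with baseline covariates, so that the country coefficient is estimated with the other covariates held fixed. It is a sketch under stated assumptions, not the authors' R code: the function `fit_ols`, the reduced covariate set and the toy data are all hypothetical.

```python
def fit_ols(covariates, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    covariates: list of rows, one list of numbers per patient (no intercept).
    outcome: list of outcome values, one per patient.
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0] + [float(v) for v in row] for row in covariates]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, outcome))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("covariates are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical toy rows: [country indicator, baseline ODI, smoker (0/1)].
covariates = [[0, 40, 0], [1, 50, 1], [0, 30, 1],
              [1, 45, 0], [0, 55, 1], [1, 35, 0]]
# Outcome built from known coefficients so the fit can be checked.
one_year_odi = [5 + 2 * c + 0.3 * b + 4 * s for c, b, s in covariates]

intercept, country_effect, baseline_effect, smoking_effect = fit_ols(
    covariates, one_year_odi)
```

In this toy example `country_effect` is the adjusted between-country difference in one-year ODI. The study's adjustment set was larger (age, sex, BMI, smoking, co-morbidity, duration of leg pain and the preoperative value of the dependent variable).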

Comparisons of the baseline variables and comparisons of the change from baseline to one year, as well as the final value of the outcome variables at one year, were performed.

In an extended analysis of ODI, we calculated proportions of patients achieving a clinically relevant outcome, i.e., an acceptable symptom state (ODI score ≤ 22), a substantial improvement in ODI (≥ 30%) or a minimally important ODI improvement (≥ 15) at follow-up [22,23,24,25].
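The three cut-offs can be expressed as a short Python sketch. It is illustrative only and assumes that improvement is computed as baseline minus follow-up and that the 30% threshold is relative to the baseline score; the function names are hypothetical.

```python
def odi_outcomes(baseline, follow_up):
    """Classify one patient's ODI result against the three cut-offs."""
    change = baseline - follow_up  # positive values mean less disability
    return {
        # Acceptable symptom state: ODI of 22 or lower at follow-up.
        "acceptable_symptom_state": follow_up <= 22,
        # Substantial improvement: at least 30% of the baseline score.
        "substantial_improvement": baseline > 0 and change / baseline >= 0.30,
        # Minimally important improvement: at least 15 ODI points.
        "minimally_important_improvement": change >= 15,
    }


def outcome_proportions(pairs):
    """Proportion of patients meeting each criterion.

    pairs: list of (baseline ODI, follow-up ODI) tuples.
    """
    results = [odi_outcomes(b, f) for b, f in pairs]
    return {key: sum(r[key] for r in results) / len(results)
            for key in results[0]}


# Hypothetical patients: (baseline ODI, one-year ODI).
example = outcome_proportions([(50, 20), (40, 30), (30, 20)])
```

The criteria are not nested: a patient can reach an acceptable symptom state without a 15-point improvement, as the third hypothetical patient does, which is why the proportions are reported separately for each cut-off.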

Associations between possible risk factors recorded at baseline and the absolute values of ODI, NRS leg pain and NRS back pain at one-year follow-up were assessed in a linear regression model.

The statistician (DN) performing the analyses was unaware of group membership (i.e., country). The code was revealed after the analyses were performed. Statistical analyses were performed in R version 3.3.3 (R Foundation for Statistical Computing). All tests were two-sided, and a p value less than 0.05 was considered significant.

Study approval

This study was approved by the Ethical Review Board in Linköping, Sweden (number 2015/181-31), the Regional ethical committee for medical research in South-East Norway (number 2014/2219), the Regional Committees on Health Research Ethics for Southern Denmark (number S-20160091) and by the boards of each register.

Results

Preoperative variables

There were statistically significant differences between the countries in all baseline variables (Table 1 and Fig. 2). The mean difference in ODI at baseline was up to 4 points. In Denmark, the patients had the lowest NRS for both leg and back pain, the highest EQ-5D-3L and, together with the patients in Norway, the lowest ODI. In Sweden, the patients had the highest ODI, the lowest EQ-5D-3L and the highest NRS leg pain. The Norwegian patients had a mean NRS back pain that was 1.4–1.5 points higher than in Sweden and Denmark. In Sweden, the patients had a longer duration of leg pain and a lower proportion were smokers than in Denmark and Norway (16% vs 33% and 30%).

Table 1 Baseline characteristics
Fig. 2 Preoperative and post-operative absolute values. Comparison of absolute outcome values at baseline (blue) and at follow-up (red). Data are presented as mean and 95% confidence interval. p values are given for the ANOVA F test for the comparison between the countries

Outcome at one year—absolute value

At one year, statistically significant differences were seen between the countries in all outcome variables except for EQ-5D-3L (Fig. 2). ODI was lower (better) in Norway than in Sweden and Denmark (15 vs 18 and 19, respectively), and NRS leg pain was lower (better) in Sweden than in Denmark and Norway (2.0 vs both 2.2), while NRS back pain was higher (worse) in Norway than in Denmark and Sweden (2.8 vs both 2.4). Case-mix adjustment did not change the statistically significant difference seen for ODI, but attenuated the differences for NRS leg pain and back pain. The adjustment also made the difference in EQ-5D-3L statistically significant between countries (Fig. 2).

Improvement from baseline to one-year post-operative

There were statistically significant differences in improvement in all outcomes between the countries at one-year follow-up. These differences were attenuated by case-mix adjustment, except for ODI (Table 2). Danes had smaller mean improvements in all outcomes when compared to Swedes and Norwegians, and the differences were still statistically significant for ODI and EQ-5D-3L after case-mix adjustment (Table 2).

Table 2 Change in outcome from baseline to one-year post-operative

Extended analysis for ODI, the primary outcome measure

After one year, there were significant differences in the proportions reaching a clinically relevant outcome. In these analyses, the outcomes in Sweden and Norway tended to be more favourable than in Denmark (Table 3). Case-mix adjustment did not change the strength or direction of these differences.

Table 3 Proportions of patients achieving a clinically relevant outcome from baseline to the one-year follow-up using different cut-offs for Oswestry Disability Index (ODI)

Outcome predictors

All possible risk factors in the linear regression were independently associated with the ODI result at one year (eTable 1). Smoking, co-morbidity and duration of leg pain > 3 months at baseline had the strongest association with more disability (ODI) at one-year follow-up. Similar results were found when using NRS leg pain and back pain as dependent variables (eTables 2 and 3).

Non-responders at baseline

At baseline, non-responders were 3.3 years younger (p < 0.001) and had a 0.3 higher BMI (p = 0.001), a 4% lower proportion of women (p < 0.001), a 9% higher proportion of smokers (p < 0.001), a 2% higher proportion of patients with preoperative pain more than three months for both leg (p = 0.002) and back pain (p = 0.004), a 0.2 lower NRS leg pain (p = 0.004) and a 0.2 higher NRS back pain (p < 0.001) compared to responders (eTable 4).

Discussion

We asked whether differences in surgical incidence were associated with differences in surgical selection criteria. In Sweden, with its relatively low surgical incidence, the patients had the greatest back-related disability and leg pain and the lowest quality of life preoperatively. We hypothesize that surgical selection in Sweden is influenced by a more conservative treatment tradition, which is associated with more preoperative symptoms in Swedish patients. However, the association between surgical selection criteria and surgical incidence was not evident, since the relatively high surgical incidence in Norway was not associated with better preoperative disability, pain and quality of life than in Denmark.

We also asked whether differences in surgical incidence were associated with differences in treatment effectiveness. There were large and clinically significant improvements in disability, pain and quality of life within all countries. At the one-year follow-up, there was no clear pattern when comparing the different outcomes in relation to the incidence of surgery. Therefore, no clear relationship between treatment effectiveness and surgical incidence could be seen. Even though there were statistically significant differences in baseline data and one-year outcome between the countries, none of these differences reached the minimum clinically important difference at the patient level, and all were within the limits of the measurement error [23, 25].

Case-mix adjustments attenuated some of the variations in outcome. In Sweden, the negative effect of a longer duration of leg pain was counteracted by the positive effect of less smoking. This might explain why the outcome in Denmark and Norway was not better than in Sweden. There are likely several other, possibly unknown, confounding factors that could alter the results, such as education and ethnicity [11]. Case-mix adjustment could be more pivotal in comparisons between patients from countries that are more dissimilar and with other diagnoses and should therefore always be considered.

The longer preoperative duration of leg pain in Sweden could reflect differences in the health care system, such as accessibility to surgical health care, and/or more conservative treatment traditions, and might lead to the lower incidence of lumbar disc herniation surgery observed in Sweden. Irrespective of the reason, the “wait for natural course” may not necessarily be associated with long-term adverse effects. Evidence from randomized controlled trials and a systematic review indicates that surgery is not superior to non-surgical treatment in the longer term but may be more cost-effective [4, 26]. However, the optimal time point for surgical intervention is not known and there are conflicting opinions [17, 27, 28]. In this study, a longer duration of leg pain preoperatively was a predictor of poorer outcome.

Other predictors for worse outcome were smoking and co-morbidity, with smoking as the most important. Others have shown that smoking is associated with a worse outcome after surgery for spinal degenerative disorders [29]. Our data further support this.

Advantages of this study include the large patient sample, which gives high precision of the estimates. The results are likely to have high external validity since they reflect real-life practice and are based on consecutive data from three national registers, all with relatively high coverage, completeness and follow-up. In addition, validated, well-known outcome instruments advocated by ICHOM were used [11]. To avoid any country-specific differences in EQ-5D-3L, we used the same scoring algorithm to convert everyone’s health profile to the EQ-5D-3L index score [21].

There are limitations with the current study. The main limitations are the lumbar disc herniation diagnosis which is assessed only by the operating surgeon, any cultural and language differences that could affect questionnaire responses, conversion of pain scales to NRS in two of the countries, and the loss to follow-up [20]. The registered co-morbidity rate was higher in Norway, compared to Sweden and Denmark, which probably reflects varying data collection (physician-reported vs patient-reported co-morbidity) rather than a true difference. A higher co-morbidity rate in Norway could possibly have had a negative impact on patient-reported outcome, which was not observed. Unknown factors and unrecorded differences in preoperative assessment and treatment may exist. In all countries, conservative treatment is advocated before surgery is attempted.

We have relied upon the treating surgeon registering the correct diagnosis in the register. Recent studies from both the Swespine and NORspine registers showed that the diagnosis in the register and the surgical file was the same in 97% of the cases [30, 31]. We consider the number of incorrectly included patients to be very small and randomly distributed between the countries and, therefore, unlikely to bias the estimates.

Previous between-country comparisons have mostly not addressed patient-reported outcome, although some studies do exist [32, 33]. In the current study, the countries used the same questionnaires translated into their own languages. Even if these have been cross-validated against other languages, we cannot exclude that translation may have had an impact on the results in this study [33]. The direction of such a bias is unknown, but any such bias is likely to be small.

Pain according to the NRS was not available in the questionnaires in Sweden and Denmark and has therefore been converted from the VAS. This could possibly be a source of bias, but considering the similar behaviour of the NRS, ODI and EQ-5D-3L postoperatively we find it unlikely that the results have been affected in any major way [20].

We cannot be certain that the proportion of non-responders did not bias our results. The most important factors are probably the higher proportion of smokers and longer duration of symptoms among the non-responders, which possibly could indicate a slightly poorer outcome among non-responders. However, Solberg et al. and Højmark et al. did not find loss to follow-up to bias conclusions after spine surgery [34, 35].

Conclusion

In this international register study, we found no clear association between the incidence of surgery for lumbar disc herniation and either preoperative patient characteristics or outcome, and the differences between the countries were smaller than the minimal clinically important difference in all outcomes. Other factors, such as the outcome of non-operatively treated patients and cost-effectiveness analyses, would need to be taken into account to determine the optimal surgical incidence and will be the future direction of this collaborative effort.