Introduction

Open reduction with dorsal spondylodesis is an established standard operation in spinal surgery and was performed more than 463,200 times in the U.S. in 2014 [1, 2]. Indications range from degenerative spondylolisthesis with instability to traumatic B-type fractures with a further need for ventral stabilization [3, 4]. A severe complication after spondylodesis is surgical site infection (SSI). Its devastating short- and long-term consequences for the patient, let alone the resulting socio-economic burden, have led to the development of strategies to prevent SSI in patients undergoing spine surgery [5, 6]. Despite widely accepted routines such as antibiotic prophylaxis and wound irrigation in prolonged spinal surgery, the rate of early SSI after dorsal spondylodesis has been reported to be around 1.9–11.9% [7,8,9,10,11], whereas comparable routine orthopedic operations report lower values of around 2.2% [12, 13]. In view of these alarming data, several systematic reviews and meta-analyses have attempted to develop multifactorial risk stratification systems to assess the individual patient’s risk of acquiring an SSI [6, 14,15,16].

CRP was shown to be superior to both ESR and WBC in detecting an SSI [17, 18]. The actual dilemma regarding serologic parameters is that the “normal” course of CRP values after dorsal spondylodesis has not been defined, so it cannot be distinguished from a pathological course indicating an infection and the need for revision. Although observing the CRP course has been a useful tool for decades to indicate either normal or pathological processes, a distinctive postoperative value or cut-off above which an infection becomes most likely has not been established so far.

Thus, the overall objective of this study was to help develop an algorithm leading to earlier débridement and targeted antibiotic therapy, which in turn could help cure these patients before the SSI leads to implant removal and the abovementioned devastating consequences.

Methods

Patients

We retrospectively analyzed 192 patients who underwent open reduction and dorsal spondylodesis at a level-I trauma center in Germany from 2016 to 2018. Indications were degenerative spondylolisthesis as well as traumatic fractures. The study was approved by the local ethics committee (IRB number 16/7/19). All research was performed in accordance with the principles expressed in the Declaration of Helsinki. All study participants took part voluntarily and gave informed consent.

Patient-related criteria (age and sex) were recorded. Additionally, C-reactive protein (CRP), white blood cell count (WBC), hemoglobin (Hb) and thrombocytes were assessed. The cause of operation (traumatic or degenerative), the number of bridged segments and the region of operation (cervical, thoracic or lumbar) were also recorded.

Blood serum was collected in the morning of the respective day, consistently before surgery. In every operation, preoperative intravenous antibiotics were administered one hour before surgery (usually cefazolin 2 g, or clindamycin 600 mg in case of penicillin allergy; topical intrawound application of vancomycin powder was not performed). Drains were routinely removed on the second day after surgery. The wound was inspected every two days after surgery. Blood samples were taken on average every second day (mean of 0.40 samples/day) and only in the inpatient setting until the patient was discharged from hospital.

The patients were divided into two groups: an “infection” group and a “non-infection” group. Patients were assigned to the “infection” group if a bacterium was detected during revision surgery after multiple aerobic and anaerobic tissue biopsies were cultured (at least five, incubation for at least three weeks) and histopathology was obtained according to the recommendations of the AAOS and Dowdell et al. [6, 19]. CRP levels from the day after this revision operation onward were excluded, since a second peak after revision surgery would distort further calculations.

Inclusion criteria

  • Patients > 18 years.

  • Open reduction and dorsal spondylodesis in the cervical, thoracic or lumbar spine from 01/01/2016 until 12/31/2018.

  • One of three senior operators had performed the operation.

Exclusion criteria

  • Patients < 18 years.

  • Open fractures.

  • Tumorous diseases.

  • A priori infection of the spine.

  • Manifested infection prior to surgery or inflammatory disease.

  • Incomplete patient recordings.

Determination of peak value

A postoperative peak was defined as the earliest CRP level increase (normal: < 5 mg/l) after the operation which was preceded and followed by lower CRP values. A second peak was defined as another CRP value, occurring after the first postoperative peak, which was preceded and followed by lower CRP values.
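The peak definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the series is assumed to be ordered in time and to start with the day-of-surgery value, and the example values are invented rather than taken from the cohort.

```python
from typing import List, Optional, Tuple


def find_peaks(crp: List[float]) -> List[int]:
    """Return indices of values that are preceded and followed by lower values."""
    return [
        i
        for i in range(1, len(crp) - 1)
        if crp[i - 1] < crp[i] > crp[i + 1]
    ]


def first_and_second_peak(crp: List[float]) -> Tuple[Optional[int], Optional[int]]:
    """Return the index of the earliest peak and of the next peak, if present."""
    peaks = find_peaks(crp)
    first = peaks[0] if peaks else None
    second = peaks[1] if len(peaks) > 1 else None
    return first, second


# Invented example series in mg/l, one value per successive measurement
uncomplicated = [12, 90, 140, 95, 60, 35, 20]
suspicious = [12, 90, 140, 95, 60, 110, 70]

print(first_and_second_peak(uncomplicated))
print(first_and_second_peak(suspicious))
```

Because samples were taken roughly every second day, the indices refer to successive measurements, not necessarily to consecutive calendar days.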

Postoperative kinetics

A postoperative peak was determined and recorded in patients with an uncomplicated course and no infection. CRP data from these patients were displayed as a scatter plot. Subsequently, CRP values were normalized as a percentage of the peak value, and time was expressed as the number of days after the peak. A one-phase exponential decay regression was used to approximate these data.

A failure to decrease after the (first) postoperative peak was assessed for the third and fourth day. According to the calculated one-phase exponential decay (see below), the expected decline on days 3 and 4 was to 52.7% and 44.8% of the peak value, respectively. Adding a 15% threshold gave cut-offs of 67.7% and 59.8% of the peak value.
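A minimal sketch of how the failure-to-decline criterion might be checked is given below. It rests on assumptions that the text does not spell out: days are counted from the peak, a missing measurement on a given day is simply skipped, and exceeding either cut-off counts as a failure. The function and variable names are hypothetical.

```python
from typing import Dict

# Cut-offs from the text: expected fraction of the peak plus the 15% threshold
CUTOFFS = {3: 0.677, 4: 0.598}


def failure_to_decline(peak_crp: float, crp_by_day: Dict[int, float]) -> bool:
    """Return True if an available day-3 or day-4 value exceeds its cut-off.

    crp_by_day maps the number of days after the peak to a CRP value in mg/l
    (assumed convention, not stated in the text).
    """
    for day, cutoff in CUTOFFS.items():
        value = crp_by_day.get(day)
        if value is not None and value / peak_crp > cutoff:
            return True
    return False


# Invented examples: peak of 150 mg/l
print(failure_to_decline(150.0, {3: 110.0}))          # 73.3% of peak on day 3
print(failure_to_decline(150.0, {3: 80.0, 4: 60.0}))  # 53.3% and 40.0% of peak
```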

The resulting predictive test which we conducted was defined positive if:

  1) The existence of a second peak was recognized, or

  2) The existence of a second peak or a failure to decline occurred,

     and, additionally in the second measurement,

  3) The maximum CRP exceeded a certain cut-off.

Resulting sensitivity, specificity, positive and negative predictive values are reported.

Statistics

For a statistical power of 0.95 (α error = 0.05) and to detect medium effect sizes (|ρ| = 0.3), a sample size of 134 was needed. Whether the two groups (infection and non-infection) were normally distributed was assessed by the D’Agostino-Pearson test. If so, Student’s two-tailed t test was performed. Otherwise, the Mann–Whitney test was performed and reported accordingly. For group comparisons of categorical variables, Fisher’s exact test was performed. Differences between the groups in CRP kinetics were assessed by the Wilcoxon signed rank test. A binary logistic regression model was used to find predictors of an infection. Within the frame of an exploratory study, predictors were chosen by forward inclusion [20]. Multiple linear regression was used to detect predictors elevating CRP values. In the reoperation group, all values after the second operation were excluded.

Statistical analysis was conducted with GraphPad Prism 8.02 (GraphPad Software, San Diego, USA), SPSS Statistics software version 26.0 (IBM SPSS Inc., Chicago, IL, USA) and R 4.0.0 (The R Foundation for Statistical Computing, Vienna, Austria). Unless stated otherwise, values are presented as mean ± standard deviation.

Results

Cohort characteristics

One hundred and seventy-eight patients were included in the non-infection group and 14 individuals in the infection group. Age in the non-infection (control) group (67.30 ± 15.73 years) and the infection group (67.64 ± 10.13 years) did not differ significantly (p = 0.690, Mann–Whitney test). The sex distribution (control: 50.6% male; infection: 71.4% male) was not significantly different (p = 0.170, Fisher’s exact test). The region and underlying reason of operation did not differ between groups (Suppl. Table 1). The incidence of SSI differed depending on the number of operated segments.

For logistical reasons, the CRP measurements were not strictly performed on a daily or weekly basis. On average, we measured the CRP every second day (0.40 times per day). To control for selection bias, the average number of CRP measurements per day was assessed in each group, and no difference was detected (mean non-infection: 0.40, infection: 0.42, p = 0.706, Mann–Whitney test).

Patients were discharged after 10.3 ± 4.1 days (non-infection group: 9.7 ± 3.6 days, infection group: 18.0 ± 2.2 days, p < 0.001).

Kinetics of C-reactive protein

Eighty-two out of 178 patients in the non-infection group (46.1%) and 9 of 14 patients in the infection group (64.3%) had preoperative CRP values above the reference range (≥ 5 mg/l). The mean preoperative CRP values were 20.25 mg/l (± 34.35) and 26.2 mg/l (± 30.45), respectively (p = 0.250, Mann–Whitney test).

The maximum CRP value was reached on day 3 with 134.89 mg/l (± 64.22) in the non-infection group. In the infection group, the maximum CRP value was reached on day 2 with 215.38 mg/l (± 69.05, p = 0.894).

The only significant differences between the non-infection and infection groups were found on day 7 (66.5 ± 48.3 vs. 131.4 ± 79.7 mg/l, respectively) and day 8 (55.4 ± 45.5 vs. 121.0 ± 57.2 mg/l, respectively) (day 7 adjusted p = 0.044 and day 8 adjusted p = 0.007, Holm-Sidak method, Fig. 1). When CRP was related to the maximum CRP (which was reached on day 3 or 2 after the operation), a steady decline from day 3 until day 8 was observed in the non-infection group, whereas the infection group did not show this steady decline (Fig. 2).

Fig. 1

Course of C-reactive protein levels in cohorts with and without infection. CRP levels were determined 3 days before as well as on the day of surgery, with follow-up measurements on every second day. The peak of CRP was reached on day 3 in the non-infection group and on day 2 in the infection group. Statistically significant differences between the groups were detected on days 7 and 8, on which the non-infection group had lower values than the infection group (Holm-Sidak method, day 7: p = 0.044, day 8: p = 0.007). Mean ± SEM

Fig. 2

Kinetics of CRP in relation to the day of peak CRP in the non-infection (blue) and infection (red) group. Setting the maximum of CRP value on day 0, the course of CRP followed a one-phase decay exponential regression in the non-infection group (red, R² = 0.7562, half-life of 2.135 days). This approximation was not adequate in the infection group (blue, R² = 0.3977)

Nonetheless, the decline in the non-infection group followed neither a linear regression (R² = 0.4423) nor an exponential regression (one-phase decay, R² = 0.4805) when the maximum was measured individually for each patient. After the maximum CRP was set to day 0 and the subsequent CRP values were aligned to this day 0 maximum, a one-phase exponential decay fitted the non-infection group satisfactorily, but not the infection group (y = 0.7537 × e^(−0.3246 × x) + 0.2423; R² = 0.7562, Fig. 2a, b).
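As a worked check, the fitted one-phase decay can be evaluated directly. The sketch below uses only the reported coefficients; the constant and function names are illustrative, and x is assumed to be the number of days after the peak, with y the fraction of the peak value.

```python
import math

# Coefficients of the reported fit: y = SPAN * exp(-RATE * x) + PLATEAU
SPAN = 0.7537
RATE = 0.3246      # per day
PLATEAU = 0.2423


def expected_fraction(days_after_peak: float) -> float:
    """Expected CRP as a fraction of the peak value, x days after the peak."""
    return SPAN * math.exp(-RATE * days_after_peak) + PLATEAU


# Half-life of the decaying component
half_life = math.log(2) / RATE

for day in (3, 4):
    print(day, round(100 * expected_fraction(day), 1))  # 52.7 and 44.8
print(round(half_life, 3))                              # 2.135
```

The values for days 3 and 4 reproduce the 52.7% and 44.8% used for the failure-to-decline criterion, and the half-life matches the 2.135 days given in the caption of Fig. 2.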

Diagnostic value of “day 2” and “day 3” CRP

Since the peak value of postoperative CRP was seen on day 2 and 3 in the non-infection group, different cut-off values were tested with regard to their prognostic value. A cut-off value of 200 mg/l for day 2 resulted in a sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of 75%, 86%, 6.5% and 98%, respectively. A cut-off value of 100 mg/l for day 3 resulted in 75%, 38%, 20% and 92%, respectively (Suppl. Figure 1a–d).

Determinants of peak CRP

In the non-infection group, 175 out of 178 had a postoperative peak (98%), while in the infection group, 14 of 14 displayed a peak (100%). Next, determinants of a higher peak CRP were sought. Multiple linear regression identified age (p = 0.002), number of addressed segments (p = 0.009) and preoperative CRP (p = 0.021) as determinants. Judged by the β-coefficient, age showed the strongest association with maximum CRP (Table 1). Binary logistic regression for an infection based on these variables, however, did not yield a statistically significant prediction model (R² = 0.078).

Table 1 Multiple linear regression analysis reveals a significant relation between age, preoperative CRP and number of addressed segments, but not sex or region

Predicting an infection: use of single variables

A second peak in CRP was considered to be indicative of an infectious complication. In the non-infection group, however, 15 out of 178 patients (8.4%) presented a second peak, while in the infection group, 6 out of 14 did (42.9%). Consequently, the sensitivity of a second peak was 42.9%, specificity 91.6%, PPV 28.6% and NPV 95.3%. A second peak was observed after 9.4 ± 2.9 days on average (non-infection group: 8.4 ± 2.4 days after surgery, infection group: 12 ± 2.3 days after surgery, p = 0.006).
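The four test characteristics follow from the 2×2 counts given above. The short sketch below recomputes them; the function name is illustrative, and the counts are those stated in the text (6 of 14 infected and 15 of 178 non-infected patients with a second peak).

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from a 2x2 table, as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Second peak as the test: 6 of 14 infected, 15 of 178 non-infected positive
metrics = diagnostic_metrics(tp=6, fp=15, fn=14 - 6, tn=178 - 15)
print({name: round(100 * value, 1) for name, value in metrics.items()})
# {'sensitivity': 42.9, 'specificity': 91.6, 'ppv': 28.6, 'npv': 95.3}
```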

Another indication of an infection was the failure of CRP to decrease according to the abovementioned one-phase exponential decay (Fig. 2a). In the non-infection group, 24 patients (13.5%) did not decline according to the day 3 or day 4 cut-offs, while in the infection group, six patients (42.9%) did not.

Combining the presence of a second peak with the failure to decrease, the sensitivity, specificity, PPV and NPV were 71.4%, 78.7%, 20.8% and 97.2%, respectively. To further improve these values, another parameter, maximum CRP, was selected empirically. After testing various cut-off values, a maximum CRP of 225 mg/l was chosen to further improve the abovementioned algorithm. Combining the existence of a second peak, failure to decrease and a maximum CRP of more than 225 mg/l, a sensitivity, specificity, PPV and NPV of 85.7%, 70.2%, 18.4% and 98.4%, respectively, were reached. Ultimately, we restricted the failure to decrease and counted it only if the maximum CRP was above 100 mg/l. Together with the maximum CRP above 225 mg/l, we reached a sensitivity, specificity, PPV and NPV of 92.9%, 78.2%, 25% and 99.3%, respectively.
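One way to read the final combined rule is sketched below. The logical structure is an assumption inferred from the reported rise in sensitivity with each added criterion: the test is taken to be positive if any of the three criteria is met, with the failure to decrease counted only when the maximum CRP exceeds 100 mg/l. The function and parameter names are hypothetical.

```python
def screening_positive(
    second_peak: bool,
    failed_to_decline: bool,
    max_crp: float,
    decline_min_max_crp: float = 100.0,  # mg/l, restriction on failure to decline
    max_crp_cutoff: float = 225.0,       # mg/l, empirically chosen cut-off
) -> bool:
    """Assumed reading: positive if any of the three criteria is fulfilled."""
    restricted_failure = failed_to_decline and max_crp > decline_min_max_crp
    return second_peak or restricted_failure or max_crp > max_crp_cutoff


# Invented examples
print(screening_positive(False, True, 90.0))    # failure ignored, max CRP <= 100
print(screening_positive(False, True, 150.0))   # restricted failure counts
print(screening_positive(False, False, 230.0))  # maximum CRP above 225 mg/l
```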

Predicting an infection: binary logistic regression

To establish a mathematical infection model with clinical applicability, binary logistic regression analyses for an infection were calculated on the imputed data with abovementioned parameters:

  (1) The qualitative CRP value as used above (mean of the CRP values of days 1 to 7 divided by the maximum CRP; a value below 0.5 is considered indicative of an infection).

  (2) Decline on day 3 or 4.

  (3) Empirically determined CRP values: a sum of CRP on days 1–7 above 1200 mg/l and a mean CRP value of days 3 and 4 above 150 mg/l.

With these parameters, a binary logistic regression model showed that the entire model as well as each coefficient of the predictors was significant in predicting an infection (Nagelkerke R²: 0.37, p < 0.001):

$$ y = -2.609 + 4.292 \times \text{qualitative CRP} - 1.833 \times \text{decline day 3 or 4} + 3.132 \times \text{empiric CRP} $$

If the values of the qualitative CRP, the failure to decline on day 3 or 4, and the empirically determined CRP as stated above rise by 1 point, the likelihood of an infection will rise by 4.2%, 1.8%, and 2.6%, respectively. Considering the Nagelkerke R² of 0.37, Cohen’s effect size is very strong at 0.75. To evaluate the approach, a ROC curve was modeled, yielding an area under the curve (AUC) of 0.847 (Fig. 3).

Fig. 3

ROC curve and table for sensitivity and specificity of binary logistic regression of selected values. a ROC curve with an AUC of 0.847. b Sensitivity and specificity for selected results of the binary logistic regression, e.g., a sensitivity of 50% was paralleled by a specificity of 97.2%, while a sensitivity of 85.7% went along with a specificity of 69.7%

Application of the binary logistic regression in our cohort resulted in values between 2.68 (I) and −5.44 (VII). The higher the resulting value of the binary logistic regression, the higher the specificity and, consequently, the lower the sensitivity.
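For illustration, the value y of a binary logistic regression can be converted into a probability with the standard logistic function. The sketch below uses the reported coefficients; how each predictor is coded numerically is not specified in the text, so the inputs are treated as plain numbers, and all names are hypothetical.

```python
import math

# Coefficients of the reported regression equation
INTERCEPT = -2.609
B_QUALITATIVE_CRP = 4.292
B_DECLINE_DAY_3_OR_4 = -1.833
B_EMPIRIC_CRP = 3.132


def linear_predictor(qualitative_crp: float,
                     decline_day_3_or_4: float,
                     empiric_crp: float) -> float:
    """Value y of the regression equation (predictor coding is an assumption)."""
    return (INTERCEPT
            + B_QUALITATIVE_CRP * qualitative_crp
            - abs(B_DECLINE_DAY_3_OR_4) * decline_day_3_or_4
            + B_EMPIRIC_CRP * empiric_crp)


def probability(y: float) -> float:
    """Standard logistic transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-y))


# Extremes of y reported for the cohort
for y in (2.68, -5.44):
    print(y, round(probability(y), 4))
```

Under this transform, the reported extremes of 2.68 and −5.44 correspond to model probabilities of roughly 0.94 and 0.004, which illustrates why a higher cut-off on y trades sensitivity for specificity.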

Patients with infections

The fourteen patients with infections after dorsal spondylodesis were conspicuous due to a second peak or failure to decrease in 10 cases (71.4%). The reoperation took place on average on day 10.71 (± 3.28 days). In nine cases, a bacterium could be detected, while the other five cases displayed observable amounts of pus during revision surgery or persistent discharge, but no bacteria detectable by culture or histology (Table 2).

Table 2 Patients with infectious complications with underlying bacteria, CRP peak and predicted CRP on specific postoperative days

Discussion

Early detection and multimodal treatment of postoperative spinal infections are the only way to reduce these often prolonged and calamitous complications [14, 21]. Their proper and timely detection poses a major challenge for the operating surgeon. In recent years, a number of screening tools have been assessed and validated.

We created a mathematical approach to provide the clinician with a tool that uses broadly available parameters to make decisions regarding the probability of an infection after a dorsal spondylodesis with a relatively high sensitivity (92.9%) and specificity (78.2%), whereas the major drawback is the mixture of degenerative and traumatic cases in our cohort.

The gold standard for infection detection remains the microbiological isolation of the causative pathogen, which is often a time-consuming process and unfortunately an “a posteriori” method. The “race for the surface” begins at the time of implantation, and early revision surgery in case of bacterial contamination positively influences the outcome for the patient [22, 23]. Besides blood serum samples, instrument-based tools are part of the clinical case-finding procedure. Although magnetic resonance imaging can be of help, it has not been established as a screening tool due to high costs and metal artifacts [24].

Typically, WBC as well as erythrocyte sedimentation rate (ESR) and CRP are used to detect postoperative infection [25]. While WBC and ESR are well characterized in orthopedic and geriatric surgery, CRP with the right cut-off values has also become established as an appropriate tool for early detection due to its high sensitivity [26,27,28,29]. The occurrence of a “second peak” has been demonstrated to indicate an infectious complication in arthroplasty and microdiscectomy [27]. In spinal surgery, the natural kinetics of CRP has not yet been examined in depth in comparison with the course in infected patients.

A “peak” of CRP was detected on day 3 in our non-infection group. Chung et al. reported a peak on day 2 in 103 patients who underwent elective spinal surgery. We found that age, preoperative CRP value and the number of operated segments correlated significantly with the maximum CRP; the association with age was also reported by Chung et al. [30]. A similar course was described by Aono et al., who analyzed 168 patients without infection who underwent posterior lumbar interbody fusion, with CRP peaking on day 4 (in 94% of patients) and decreasing to 5 mg/l on day 14 [31].

In support of our data, Mok et al. analyzed 149 patients, mostly operated in the lumbosacral region, who underwent posterior surgery with a mean of 4.7 fused levels. In contrast, we analyzed just one procedure and approach. They emphasized that postoperative kinetics follows a typical decline, while a “second rise or failure to decrease” was sensitive for infection, resulting in a sensitivity and specificity (second peak or failure to decrease) of 71% and 51%, respectively. This sensitivity matches that in our study (71.4%), while the specificity reported here was higher (78.7%) [32]. The authors calculated a CRP kinetics curve which follows a first-order elimination and predicted 70% of the variance of CRP values (R² = 0.7011). In our study, the CRP values could be predicted by exponential regression with a comparable fit (R² = 0.7562).

Interestingly, Houten et al. demonstrated an influence of the number of operated levels on the maximum postoperative CRP value [33]. Accordingly, we show that the maximum CRP value is associated with the number of addressed segments. Since incision lengths were not collected in either study, these can only serve as indirect parameters of the operative extent. Although this quantitative influence on the maximum CRP level is validated, an effect on the infection rate has not been properly shown, and the literature on this subject is inconclusive [34, 35].

A further peak after discharge of patients cannot be excluded. However, patients of the non-infection group who developed a second peak did so on average 8.4 ± 2.4 days after surgery and were discharged after 9.7 ± 3.6 days, which is less than the average length of stay after lumbar spine surgery in Germany [36]. None of the patients from the non-infection group developed a wound infection after discharge within at least one year after surgery. In the end, the study shows that it is the combination of different CRP kinetics (qualitatively high CRP, failure to decline, and a second peak) that is essential for predicting an infection with high sensitivity. In line with efforts to reduce routine postoperative blood sampling [37, 38], the results of this study extract more information from standard blood samples in order to avoid further expensive and extensive diagnostics.

To our knowledge, a binary logistic regression with a resulting sensitivity of 85.7% and specificity of 69.7% and a ROC curve with an AUC of 0.847 for predicting an SSI after spinal surgery have not been reported so far.

Conclusions

The study determined that specific kinetics of postoperative CRP are of great value in detecting postoperative infections after open reduction and dorsal spondylodesis. On days 7 and 8 after the operation, the infection group shows significantly higher CRP values than the non-infection group. With a second peak, restricted failure to decline and a maximum CRP of more than 225 mg/l, a prognostic sensitivity and specificity of 92.9% and 78.2% to detect an infection were established, while a binary logistic regression leads to values of 85.7% and 69.7%, respectively. A one-phase decay exponential regression predicts the “natural” course of CRP after the initial peak (R² = 0.7562). With this approach, focusing on typical kinetics of the CRP course caused by an infection, clinical decision making is supported by an inexpensive and easily assessed biomarker.

Limitations

The combination of traumatic and degenerative patients is the major drawback in our cohort. Although Suppl. Table 1 shows no differences between both groups in terms of the operated region or underlying reason of the operative procedure, pooling these cases, while broadening applicability, may reduce the power of our mathematical approach for either the traumatic or the degenerative patients considered alone.

The retrospective study design is prone to selection bias. A weakness of this study is the low number of patients (178) and infections (14) as well as the exclusion of previously existing infectious diseases, which might be of special interest regarding the aging cohort of patients with spinal operative interventions. Common infections such as pneumonia and urinary tract infections were treated and led to exclusion only if they were evident at the time of surgery, so undetected infections could have been present. The duration of surgery and further patient characteristics (BMI, comorbidities) were not assessed. The combination of traumatic and degenerative patients was needed since an isolated consideration would have impaired the power of our analyses. With regard to the binary logistic regression analysis using a forward inclusion approach, it is noteworthy that this method is at higher risk of depending on the researchers’ knowledge of and sensitivity to the issues studied [20].