INTRODUCTION

Patients on long-term opioid therapy (LTOT) are at increased risk for opioid overdose.1,2,3 Leveraging electronic health records (EHR) data to identify patients at particularly high-risk represents one strategy to target preventive efforts to reduce overdoses.4 Opioid overdose risk prediction models can identify candidate patients for individual-level clinical interventions, such as naloxone prescribing and increased frequency of clinical visits to evaluate the risks and benefits of continued opioid therapy. Misapplications of models in practice, however, may result in patient harm, such as increased overdose risk due to unwarranted and abrupt tapering or discontinuation of opioids.5,6,7 Labeling patients as “high-risk” may also reinforce existing opioid-related stigma.8 To maximize clinical benefit and minimize potential patient harm, development and deployment of models require careful consideration.

Ongoing assessment of overdose risk prediction models is important to ensure their continued clinical relevance. Models previously developed may not perform optimally in current contexts and require updating before they can be successfully integrated in clinical care.9,10 Amid an evolving opioid overdose epidemic,11,12 our understanding of characteristics associated with overdose may improve over time, such as the dangers of combining opioids with other controlled medications.2,13 The overdose risk environment may evolve due to changes in clinical practices and prescribing policies, such as guidelines recommending initiation of patients on immediate- rather than extended-release opioids.2,14,15,16 Clinical operational needs for risk models can also shift, and characteristics of the ideal risk model require reevaluation.

In this study, our objectives were to update and validate a model of opioid overdose for use to risk-stratify LTOT patients in collaboration with clinical partners from a pain management program in one health system. We re-evaluated a 5-variable opioid overdose risk prediction model previously developed at Kaiser Permanente Colorado (KPCO) that used data from 2006 to 201417, which we refer to as the KP Colorado Opioid Overdose Risk (KPCOOR) model. This model was developed to help identify patients who might benefit from being prescribed or dispensed naloxone. We compared the performance of the original KPCOOR model to updated models using data between 2015 and 2019 from a study cohort in the same health system. We also discuss necessary modifications to ensure clinical relevance and considerations for moving the model into practice.

METHODS

Study Design and Setting

We updated and validated a risk prediction model of opioid overdose in a retrospective cohort of adult patients prescribed LTOT between 2015 and 2019 in a single health system at KPCO. KPCO is an integrated health plan and care delivery system serving over 500,000 members. Study data were extracted from patient EHR data, pharmacy records, and medical claims. These records contain socio-demographic and clinical data, including diagnoses (International Classification of Diseases, Ninth (ICD-9) and Tenth Revision, Clinical Modification (ICD-10)) from outpatient, emergency department, and inpatient settings. Prescription data were captured for medications dispensed from outpatient KPCO pharmacies and from external pharmacies using claims data. Cause-of-death data were obtained through a linkage with state vital records. The data-only study was approved by the KPCO Institutional Review Board with a waiver of informed consent.

In 2020, researchers at KPCO began collaborating with clinical leaders in the pain management program, who provided feedback on clinical relevance of predictors and feasibility of model implementation, to revalidate the original KPCOOR model. The original model was developed using data from a cohort of 42,828 patients prescribed LTOT at KPCO between 2006 and 2014 and was validated externally at Denver Health, a safety-net health system.17 The original model included five predictors (age (and age-squared), mental health diagnosis, substance use disorders, tobacco use or use disorder, and long-acting or extended-release (LA/ER) opioid formulation) and had a C-statistic of 0.73 (95% CI, 0.69–078), sensitivity of 66.1%, specificity of 66.6%, and PPV of 0.56%.17 The original model was developed to inform naloxone prescribing and dispensing. Clinical partners anticipated using the updated model to identify LTOT patients who could benefit from increased frequency of clinical visits to assess ongoing opioid therapy.

Study Cohort

We identified a cohort of 33,625 patients prescribed LTOT between January 1, 2015, and June 30, 2019. LTOT was defined as three or more opioid analgesic prescriptions dispensed within 90 days. National Drug Codes were used to identify prescriptions from pharmacy dispensing and claims records. Buprenorphine and methadone formulations indicated for treatment of opioid use disorder (OUD) at the time of this study were excluded from the cohort definition. The first set of three opioid prescriptions within a 90-day qualifying period during 2015–2019 study period was used to establish the start of LTOT, and the date of the third prescription was considered the index date. We required at least 1 year of continuous enrollment in KPCO prior to the index date to collect data on baseline characteristics. We excluded patients who continuously met the LTOT definition in 2014 because their index date occurred before study start. We also excluded patients < 18 years old on the index date and those who had a cancer diagnosis and/or a do-not-resuscitate order in the year before the index date. A DNR order was considered a proxy for end-of-life care. Patients were followed for 2 years after the index date or until first observed opioid overdose, disenrollment from KPCO, death from causes other than opioid overdose, or end of the study period on June 30, 2021, whichever occurred first.

Outcome

The outcome of interest was non-fatal or fatal opioid overdose observed within 2 years after the index date. Non-fatal opioid overdoses were identified from drug poisoning diagnoses in emergency department and inpatient settings based on KPCO in- and out-of-network medical claims (ICD-9 codes 965.0, 965.00, 965.1, 965.02, 965.09, E850.0, E850.1, E850.2; ICD-10 codes T40.0, T40.1, T40.2, T40.3, T40.4, and T40.6). Fatal opioid overdoses were flagged from death records that had drug poisoning as the underlying cause (ICD-10 codes X40–X44, X60–X64, X85, and Y10–Y14) and opioid involvement as a contributing cause (T40.0, T40.1, T40.2, T40.3, T40.4, and T40.6).

Candidate Predictors

Potential predictors included those identified in the original study13,17,18,19,20 and several additional candidates based on emerging evidence and input from clinical partners.18,21 Candidate predictors included socio-demographic characteristics (age, sex, self-reported race and ethnicity, and Medicaid coverage); medication exposures; tobacco use or use disorder; substance use disorders (opioid, other drug, or alcohol use disorders were combined due to few events); mental health disorders (mood, anxiety, non-mood psychotic, and behavioral disorders); and medical diagnoses. For opioid exposures, we examined drug formulation (LA/ER), dose (< 50, 50 to < 90, and ≥ 90 mean daily morphine milligram equivalents (MME)), and history of one or more opioid dispensings in the year prior to the index date. We considered other medications dispensed concurrently with opioids, defined as one or more prescriptions for anticonvulsants (e.g., gabapentin), antidepressants, benzodiazepines, hypnotics, muscle relaxants, or stimulants in the 90 days before the LTOT index date. We also assessed past-year number of unique opioid prescribing clinicians, number of unique opioid dispensing pharmacies, and history of early opioid medications refill (one or more dispensings ≥ 5 days before running out of days supply). Variables except for socio-demographic characteristics, opioid formulation, opioid dose, and concurrent medications were measured in the 365 days before the index date. Socio-demographic characteristics were measured on the index date. Individuals with unknown race and ethnicity data (7.1%) were grouped in a separate missing category. Formulation and dose were based on opioid prescriptions dispensed on or with a days supply covering the index date.

Statistical Analyses

We compared three Cox proportional hazards regression models predicting the 2-year opioid overdose risk. First, we tested a model that included all 5 predictors selected in the KPCOOR model (base model 1). Second, to assess changes in predictors, we repeated variable selection and evaluated previously and newly identified candidate predictors. We examined frequencies and bivariate associations of candidate predictors and the outcome. We included a quadratic term for age to account for non-linearity. Multicollinearity was assessed by examining a correlation matrix of coefficients. Based on clinical judgment and statistical significance, we selected more than 20 final candidate predictors for consideration (Table 1). To obtain a parsimonious model, we performed backward variable elimination in 1000 bootstrapped samples.22 In each sample, variables were sequentially removed from a full model of all candidate predictors until only those with a significance level P < 0.05 remained. Variables retained in at least 50% of bootstrapped samples were ultimately selected for inclusion (updated model 2). Third, based on input from clinical partners, we selected and modified predictors to obtain a preferred model for clinical integration (adapted model 3). For efficiency, all models were tested in the full study cohort and internally validated via bootstrapping with 1000 samples.23

Table 1 Summary Statistics of Candidate Predictors in the Study Cohort of Patients Prescribed Long-Term Opioid Therapy

We evaluated the performance of each model by examining discrimination and calibration. We calculated the bootstrap-corrected Harrell’s concordance (C) statistic via 1000 samples to evaluate how well models discriminate between patients with and without the outcome, with higher values indicating better discrimination.24,25,26 For any given pair of patients with and without overdose, a model with good discrimination will assign a greater predicted risk to the patient with an overdose. We assessed calibration by comparing plots of Kaplan–Meier (KM) and Cox regression-based estimates of survival probabilities stratified by tertile of predicted risk.27 We also performed the Greenwood-Nam-D’Agostino (GND) test to assess the difference between observed KM and predicted survival probabilities.28 For the final adapted model, we calculated sensitivity, specificity, and positive predictive value (PPV) where “high-risk” was defined as the highest third tertile of predicted risk. We verified the proportional hazards assumption using a nonzero slope test and plots of Schoenfeld residuals. In sensitivity analyses, we estimated a competing risks model, calculated the PPV for hypothetical overdose rates, and examined an alternative cut-off for “high-risk” based on the Youden Index.29,30 Statistical analyses were performed using Stata version 16.1 (StataCorp., College Station, TX), and the GND tests were conducted in SAS version 9.4 (SAS Institute Inc., Cary, NC). Statistical significance was based on two-sided tests with a threshold of P < 0.05.

RESULTS

Characteristics of the Study Cohort

In the cohort, the mean age (SD) was 58.2 (16.3), and 59.0% were women, 70.8% were White, 13.3% were Hispanic of any race, 4.5% were Black, 4.4% were of other or multiple-race, and 7.1% had unknown race and ethnicity (Table 1). Most cohort members were prescribed a low dose of opioids on the index date (73.8% < 50 MME, 19.5% 50 to < 90 MME, and 6.7% ≥ 90 MME), and half (50%) had a history of one or more opioid dispensings prior to starting LTOT. In the cohort, 12.8% were prescribed opioids concurrently with anticonvulsants, 34.3% with antidepressants, 14.1% with benzodiazepines, and 12.2% with muscle relaxants. A large proportion had a history of one or more mental health diagnoses (42.3%). There were 65 incident opioid overdoses observed during the study (111.4 overdoses per 100,000 person-years), of which 11 were fatal. Means and proportions of most candidate predictors differed between patients with and without an overdose, except for medical diagnoses, sex, history of early opioid refills, and concurrent muscle relaxant prescriptions.

Model Update and Validation

To establish a comparison for the updated model, we validated the KPCOOR model in the study cohort. All five predictors were significantly associated with opioid overdose (model 1, Table 2). The model showed adequate discrimination with a bootstrap-corrected C-statistic of 0.73 (95% CI, 0.64–0.85). The plot of survival probabilities shows clear separation between tertile of predicted risk, further demonstrating model discrimination (Fig. 1, model 1). It also illustrates agreement between observed and predicted values, indicating good calibration. The GND test indicated no significant difference (P = 0.589). Model performance was on par with original study.17

Table 2 Adjusted Cox Regression Models of 2-Year Opioid Overdose Risk
Figure 1
figure 1

Observed vs. expected survival probabilities by tertile of predicted risk.

Eight variables were selected in at least 50% of bootstrapped samples in the model update. Two opioid prescription indicators (high dose of ≥ 90 MME and history of opioid dispensing) and three concurrent prescription measures (anticonvulsants, antidepressants, and benzodiazepines) not selected in the original study were identified as significant predictors in model 2 (Table 2). Two predictors in the KPCOOR model (LA/ER opioid formulation and mental health disorder diagnoses) were not selected in model 2. The bootstrap-corrected C-statistic of 0.80 (95% CI, 0.70–0.88) was higher but not statistically different from model 1. The model was also well-calibrated (GND test, P = 0.585; Fig. 1, model 2).

Model Adaptation with Input from Clinical Partners

With input from clinical partners, we made two modifications to model 2 to ensure that it would be clinically relevant and practicably implemented. Due to concern about the model being misapplied to discontinue treatment for depression, we replaced concurrent antidepressant prescriptions with history of mental health disorders, which was included in the original KPCOOR model.17 The measure of opioid dispensing history prior to LTOT was excluded due to limited operational ability to accurately identify prior dispensing history, unlike in our retrospective cohort study. Compared to model 2, model 3 showed no significant decline in performance, demonstrating good discrimination with a C-statistic of 0.77 (95% CI, 0.66–0.87) (Table 2) and calibration (GND test, P = 0.329; Fig. 1, model 3). Using the top tertile of predicted risk to define “high-risk” patients, model 3 had a sensitivity of 66.2%, a specificity of 66.7%, and a PPV of 0.38%. Individuals in the top tertile of predicted risk by model 3 had a substantially higher observed overdose rate (230.0 per 100,000 person-years) compared to the middle (87.1 per 100,000 person-years) and lowest (24.8 per 100,000 person-years) tertiles (Table 3).

Table 3 Overdose Risk by Tertile of Predicted Risk

In sensitivity analyses, we estimated a competing risks model, which yielded nearly identical results (Supplemental Table 1). We also estimated the PPV for hypothetical overdose rates ranging from 0.5 to 5.0%. Results show that the PPV would increase in contexts where the prevalence of overdose is higher than in this study (Supplemental Table 2). An alternative cut-off for “high-risk” based on the Youden Index corresponded approximately to the 85th percentile and had a lower sensitivity (52.3%) but higher specificity (85.2%).

DISCUSSION

We updated and reassessed the performance of an opioid overdose prediction model. Updating the KPCOOR model resulted in a higher but not statistically different C-statistic compared to the original model. Findings suggest that while the original model remains relevant, ongoing assessment of opioid overdose prediction models may be needed to verify performance related to changes in risks and clinical needs amid an evolving opioid epidemic and increasing understanding of overdose risk factors.

The adapted KPCOOR model included three predictors of opioid overdose not in the original model: concurrent benzodiazepines with opioids, concurrent anticonvulsants, and opioid dose of 90 MME or greater. Concurrent benzodiazepine use with opioids and high opioid dose have been identified as strong predictors of opioid overdose in LTOT patients.2,20,31 Recent research shows that use of anticonvulsants (primarily gabapentin) in combination with opioids may compound risks due to central nervous system and respiratory depression.13,32,33 While the objective of the model was to predict future overdose and not to establish causality, it suggests that use of anticonvulsants, in conjunction with or as an alternative to opioids, to treat neuropathic pain warrants further scrutiny to reduce overdose risk.34,35,36 Our model also included age, which has been shown to be associated with overdose risk in LTOT patients.37,38 The appropriateness of including non-clinical factors, such as race and ethnicity, in overdose risk models may depend on the intended clinical intervention. Demographic factors could inform harm reduction strategies, such as naloxone prescribing, but would likely be inappropriate to consider for determining opioid treatment. Further, potential unintended consequences, such as exacerbation of existing disparities, may result from incorporating demographic factors into models.39

The adapted KPCOOR model can support clinical decisions regarding minimally disruptive interventions to mitigate overdose risk. The model could be used to identify patients at increased overdose risk for naloxone prescribing and dispensing and for increased frequency of clinical visits to weigh the benefits and risks of continuing opioid therapy. Naloxone co-prescribing and ongoing treatment evaluation have been recommended by the Centers for Disease Control and Prevention in its opioid prescribing guideline.16 While the risk model can be used to improve clinical operations, it has a low PPV, and stigma and misapplications are possible. The low PPV implies that most patients classified as “high-risk” will not experience an overdose and that interventions would need to be implemented widely in health systems to have a measurable impact. The cost of such interventions could be balanced against potential years of life lost due to overdose deaths. Given potential for misclassification of “high-risk” patients, caution must be exercised when using the model in practice. False positives may reinforce opioid-related stigma,8 and misapplications of the model may lead to unwarranted disruptions in care. Although we adapted the model to blunt potential inappropriate clinician-initiated discontinuation of antidepressants, misapplications may still occur. For example, the tool should not be used to reduce or discontinue opioid treatment, including rapid tapering of opioid doses—which itself can increase overdose risk.5,6,7 To avoid these potential pitfalls, input from clinical partners and ongoing education regarding appropriate uses of the model are advisable. Future research should assess the impacts of implementing overdose risk models, including the consequences of misclassifying low-risk patients.

Our study has limitations. Risk models were assessed in a single health system and may require validation and refinement in other settings. While we considered a broad set of candidate predictors, we lack data or power to investigate other potentially important risk factors, such as use of opioid medications in a manner not as prescribed and history of drug overdose.14 There were 111 patients with a history of drug overdose who together only had 1 outcome event during follow-up. The number of patients with an overdose was much lower than patients without, which contributed to high false positive rates. We lack sufficient data to evaluate model performance by subpopulations, including by race and ethnicity. Finally, we identified overdoses that only resulted in death or came to medical attention in emergency department or inpatient settings and did not capture overdoses that occurred elsewhere.

CONCLUSION

Amid evolving opioid overdose risks and clinical needs, we developed an updated 7-variable opioid overdose risk prediction model with performance comparable to the original model that could be implemented to mitigate overdose risk in LTOT patients.