INTRODUCTION

The Centers for Medicare and Medicaid Services (CMS) launched the Bundled Payments for Care Improvement (BPCI) program in 2013.1 BPCI is an alternative payment model under which hospitals take responsibility for the costs of an “episode” of care for patients admitted with one or more of 48 different conditions. BPCI is entirely voluntary, and hospitals may choose the conditions for which they participate. Cost targets are based on hospitals’ own historical spending rather than assessed against a formal control group. If cost targets are achieved (i.e., if hospitals are able to reduce their costs from baseline), participating hospitals keep a portion of the savings. If cost targets are exceeded, hospitals reimburse Medicare for part of the difference.

Despite a number of publications examining the impact of BPCI on spending and outcomes,2,3,4,5,6 we know little about what makes individual hospitals more or less likely to save money under the program. It is possible that certain conditions may be more amenable to reducing spending over time, because practice redesign is easier or new technology permits more efficient care. For example, joint replacement is an elective condition, and hospitals may carefully select patients and ensure they are optimized preoperatively. By contrast, patients admitted for medical conditions may be more complex or variable, making it harder to optimize their care. It is also possible that it is easier to save for certain conditions because of secular trends in care for that condition (for example, if a new valve replacement technique became available that shortened length of stay across the board). Since there is no control group in BPCI, savings are attributed to the participating hospital, even if non-participants also save.

Alternatively, saving under the program might be about patient population: hospitals with higher proportions of patients who live in poverty, are of advanced age, or are frail may be less able to achieve savings.7,8 BPCI is not risk-adjusted, so patient characteristics that could influence spending are not taken into account. Market characteristics, such as the availability of post-acute care (PAC) services, could also influence savings under BPCI. Savings in the program have been driven by savings on PAC, and hospitals better positioned to affect PAC spending due to market conditions may outperform others.9,10,11

Understanding which conditions are likely to be treated optimally under a bundled payment model, and whether certain types of patients or hospitals are particularly well-suited to such a program, has implications for BPCI-Advanced, the recently launched, next iteration of the BPCI alternative payment model. To our knowledge, these issues have not previously been explored.

Therefore, our study aimed to address this gap in four ways. First, we identified “savers” and “non-savers” and examined whether hospitals’ choice of condition was associated with savings in BPCI. Second, we examined patient, hospital, and market characteristics to determine to what extent these variables were associated with savings. Third, we compared changes in Medicare payments for different components of care (inpatient, post-acute, etc.) to see whether savers succeeded by focusing on a particular area. Fourth, we explored whether hospitals that saved money also had a decrement in clinical outcomes.

METHODS

Overview of BPCI and Selection of Primary Outcome

There were four payment models in BPCI, but nearly 90% of hospitals enrolled in model 2, which is the focus of this analysis. Target prices in BPCI were generated based on historical institutional spending minus a discount of 2%; hence, we defined our primary outcome as achieving that level of reduction in Medicare payments (hereafter termed savers vs. non-savers). Medicare has not publicly released payment targets or information on whether hospitals received penalties or bonuses under the program. This analysis therefore focuses on whether hospitals were able to decrease Medicare-allowed spending from their own baseline; it does not address whether they received bonuses or penalties, or whether they saved more than non-participating hospitals over the same time frame.
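The saver definition above can be illustrated with a minimal sketch. The function name, its arguments, and the dollar amounts are hypothetical and chosen only for illustration; the sketch assumes the comparison is made on mean per-episode payments for one hospital-condition pair.

```python
def is_saver(baseline_mean: float, intervention_mean: float,
             discount: float = 0.02) -> bool:
    """Return True if mean payments fell by at least `discount` from baseline.

    Hypothetical helper: `discount` defaults to the 2% figure in the text.
    """
    if baseline_mean <= 0:
        raise ValueError("baseline_mean must be positive")
    relative_change = (intervention_mean - baseline_mean) / baseline_mean
    return relative_change <= -discount


# Hypothetical hospital-condition pairs (mean payment per episode, US dollars).
print(is_saver(20_000.0, 19_000.0))  # 5% reduction from baseline
print(is_saver(20_000.0, 19_900.0))  # 0.5% reduction from baseline
```

Under this rule, a hospital whose payments fall by less than 2% counts as a non-saver even though its spending declined.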

BPCI Participants

We used publicly available enrollment data from the Center for Medicare and Medicaid Innovation to identify all hospitals participating in BPCI. The first participants joined BPCI on October 1, 2013, and there was ongoing quarterly enrollment continuing through October 1, 2015. These data were linked to the 2014 American Hospital Association file to ascertain hospital characteristics and the Area Resource File to obtain market characteristics at the county level.

Study Period

We used Medicare claims data from January 1, 2013, through December 30, 2016. We defined “baseline” as 12 to 3 months before each hospital’s BPCI start date, though hospitals that started in October 2013 only had a 6-month baseline period due to data availability. This time was used to determine baseline spending for the condition that was chosen for each hospital. Recognizing that it takes time to redesign clinical care and optimize performance in an alternative payment model, we left a 1-year gap from 3 months before through 9 months after the start date. The duration of the intervention was 9 months for each hospital (months 10–18 of participation), though hospitals that started on 7/1/15 had only 6 months of intervention time and hospitals that started on 10/1/15 had only 3 months of intervention time due to data availability (Appendix Figure A1).
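The timing rules above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the analysis: the function names are hypothetical, windows are half-open, start dates are assumed to fall on the first of a month, and the upper bound of October 1, 2016 is our reading of the requirement that index admissions end on 9/30/2016.

```python
from datetime import date

DATA_START = date(2013, 1, 1)       # first claims date stated in the text
INDEX_END = date(2016, 10, 1)       # assumed exclusive bound (admissions through 9/30/2016)


def add_months(d: date, months: int) -> date:
    """Shift a first-of-month date by a whole number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


def study_windows(start: date):
    """Return half-open (from, to) baseline and intervention windows."""
    baseline = (max(add_months(start, -12), DATA_START), add_months(start, -3))
    intervention = (add_months(start, 9), min(add_months(start, 18), INDEX_END))
    return baseline, intervention


for start in (date(2013, 10, 1), date(2014, 4, 1), date(2015, 7, 1), date(2015, 10, 1)):
    print(start, study_windows(start))
```

With these assumptions, an October 2013 start yields a 6-month baseline, and July 2015 and October 2015 starts yield 6-month and 3-month intervention windows, matching the exceptions described above.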

Patient Identification and Cost of Care

Index admissions at BPCI hospitals for the condition(s) selected by each participant were identified between 1/1/2013 and 9/30/2016 to allow full 90-day episodes to accrue within our data. We included only beneficiaries who were continuously enrolled in Medicare Parts A and B during their episodes of care and excluded those with end-stage renal disease. Medicare claims contained demographic characteristics, principal discharge diagnoses, comorbidities, and service use.

For each index hospitalization, we calculated standardized Medicare-allowed payments from 100% inpatient, skilled nursing facility, home health agency, and durable medical equipment claims from the index hospitalization to 90 days after discharge, which was the episode duration chosen by more than 95% of BPCI participants. To standardize payments, we followed CMS specifications and removed differences in payment based on wage index, disproportionate share hospital payments, medical education, and other special payments. Total Medicare-allowed payments were Winsorized at the 95th percentile of national episode payments and adjusted for inflation to 2016 prices. We did not include physician and outpatient payments because they were available for only 20% of the population and in past work accounted for only 10–15% of Medicare standardized allowed costs.
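The Winsorization and inflation adjustment can be sketched as below. All names and values are hypothetical. The nearest-rank percentile rule and the use of a generic price index are assumptions made for illustration; the text does not state which percentile definition or index was used.

```python
def percentile_nearest_rank(values, pct: int) -> float:
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def winsorize_upper(values, cap: float):
    """Replace every value above `cap` with `cap`."""
    return [min(v, cap) for v in values]


def to_2016_dollars(amount: float, index_in_year: float, index_2016: float) -> float:
    """Rescale a payment using a hypothetical price index."""
    return amount * index_2016 / index_in_year


# Hypothetical national episode payments: 1, 2, ..., 100 (arbitrary units).
national = list(range(1, 101))
cap = percentile_nearest_rank(national, 95)
print(cap)
print(winsorize_upper([50, 95, 120], cap))
```

Capping at a national percentile limits the influence of a few very expensive episodes on hospital-level averages.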

Determining if Condition Choice Was Associated With Saving

To assess whether particular conditions or types of conditions drove success in the BPCI program, we examined the individual conditions across all hospitals and then classified conditions as “medical” or “surgical” as well as “mostly elective” or “mostly urgent/emergent” and “high-cost condition” versus “non-high-cost condition.” The elective variable was based on the proportion of patients who had elective admission status versus urgent/emergent admission status coded on their Medicare claims. Conditions were considered elective if >50% of the Medicare claims were coded as elective (Appendix Table A1). The high-cost versus non-high-cost variable was assigned by dividing conditions into two groups based on their average baseline episode costs (Appendix Table A1).
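The elective classification rule can be written as a short sketch. The condition names and claim counts are hypothetical, and the helper name is ours.

```python
def is_mostly_elective(n_elective: int, n_total: int) -> bool:
    """Apply the >50% rule from the text to one condition's claim counts."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 2 * n_elective > n_total  # strictly more than half


# Hypothetical claim counts per condition: (elective claims, all claims).
counts = {
    "condition_a": (800, 1000),
    "condition_b": (120, 1000),
    "condition_c": (500, 1000),  # exactly 50% does not meet the >50% rule
}
labels = {
    name: "mostly elective" if is_mostly_elective(e, t) else "mostly urgent/emergent"
    for name, (e, t) in counts.items()
}
print(labels)
```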

For this set of analyses, we performed regressions for each condition and the groups of condition classifications noted above (medical/surgical, urgent/elective, high-cost/non-high-cost). The outcome was the proportion of hospitals participating for a condition or a group of conditions that were savers. We used linear regression to examine relationships between our predictors and the proportion of savers for each set of conditions.

Determining if Case Mix, Hospital, or Market Characteristics Were Associated with Saving

To assess to what extent case mix, hospital, and market characteristics were associated with success in BPCI, we evaluated several variables that were present in Medicare claims data, including age, proportion female, minority, and Medicaid, and average Chronic Conditions Warehouse scores (a Medicare-provided comorbidity index ranging from 0 to 27, with higher scores indicating more comorbidities), and aggregated these at the hospital level. We characterized each hospital’s for-profit status, teaching status, geographic location, disproportionate share hospital payments, and the number of conditions that hospitals chose to participate in. We further evaluated their market’s median household income, Medicare Advantage penetration, availability of skilled nursing facility beds, market share, and market competitiveness. Market share was calculated as the proportion of all admissions in the county for the conditions of interest. Market competitiveness was calculated by the Herfindahl-Hirschman Index (HHI), a common measure of market competitiveness.12 The HHI was calculated by squaring the market share of each competing hospital and summing the results. The HHI approaches zero when many hospitals compete for market share and approaches 1 when the market is dominated by a single hospital.
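The market share and HHI calculations described above can be sketched as follows. The hospital labels and admission counts are hypothetical, and the function names are ours.

```python
def market_shares(admissions_by_hospital: dict) -> dict:
    """Each hospital's share of all admissions in the county (fractions summing to 1)."""
    total = sum(admissions_by_hospital.values())
    if total <= 0:
        raise ValueError("total admissions must be positive")
    return {h: n / total for h, n in admissions_by_hospital.items()}


def hhi(shares) -> float:
    """Herfindahl-Hirschman Index: sum of squared market shares."""
    return sum(s ** 2 for s in shares)


# Hypothetical county with three hospitals.
shares = market_shares({"A": 50, "B": 30, "C": 20})
print(round(hhi(shares.values()), 2))  # 0.38

# Limiting cases: a monopoly, and four equally sized hospitals.
print(hhi([1.0]))        # 1.0
print(hhi([0.25] * 4))   # 0.25
```

With n equally sized hospitals the index equals 1/n, which is why it approaches zero as the number of competitors grows.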

For this set of analyses, we examined the association between case mix, hospital, and market characteristics and saver status at the hospital-condition pair level. We used logistic regression to calculate the odds of being a saver. These analyses controlled for condition, as our intent was to determine whether there were hospital characteristics beyond the choice of condition that were associated with saving.
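For readers less familiar with this approach, a logistic model of this kind can be written as below. The notation is an illustrative assumption rather than a specification taken from the analysis: $Y_{hc}$ indicates that hospital $h$ was a saver for condition $c$, $\alpha_c$ is a condition-specific term that controls for condition, and $x_h$ is a single characteristic of interest.

```latex
\[
\operatorname{logit}\Pr(Y_{hc}=1) = \alpha_c + \beta\, x_h,
\qquad
\mathrm{OR} = e^{\beta}
\]
```

An odds ratio above 1 indicates higher odds of being a saver when the characteristic is present (or one unit higher), and an odds ratio below 1 indicates lower odds.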

Determining if Certain Areas of Spending Drove Savings

To determine whether there were certain areas of spending that drove savings, we examined changes over time in Medicare payments, both overall and in key subcategories such as spending on readmissions or post-acute care. We used a patient-level, difference-in-differences approach to assess changes from baseline to intervention for savers versus non-savers. We used generalized estimating equations in a linear model with robust standard errors, accounting for clustering within hospitals, and including age, gender, and medical comorbidities for risk adjustment defined using Medicare’s Chronic Conditions Warehouse data.
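A difference-in-differences model of this general form can be written as below. This is a sketch of the standard specification, assumed here for illustration; the symbols are ours. $Y_{ij}$ is the payment for patient $i$ at hospital $j$, $\mathrm{Post}_{ij}$ indicates the intervention period, and $X_{ij}$ holds the risk-adjustment covariates (age, gender, comorbidities).

```latex
\[
\mathbb{E}[Y_{ij}] = \beta_0 + \beta_1\,\mathrm{Saver}_j + \beta_2\,\mathrm{Post}_{ij}
  + \beta_3\,(\mathrm{Saver}_j \times \mathrm{Post}_{ij}) + \gamma^{\top} X_{ij}
\]
```

In this form, $\beta_3$ is the difference-in-differences estimate: the change from baseline to intervention among savers minus the corresponding change among non-savers.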

Determining if Saver Status Was Associated with Worsening Clinical Outcomes

Finally, we examined whether savers and non-savers differed in clinical outcomes, including days at home, 90-day readmissions, and 90-day mortality. For this set of analyses, differences in differences in clinical outcomes were analyzed at the patient level, again using a generalized estimating equations model with robust standard errors, accounting for clustering within hospitals, and including age, gender, and medical comorbidities for risk adjustment.

This study was approved by the Office of Regulatory Affairs and Research Compliance at Harvard University. Analyses were performed using SAS v.9.4.

RESULTS

Hospital, Market, and Case Mix Characteristics

A total of 421 hospitals participated in BPCI, resulting in 2974 hospital-condition combinations. Of these, 2848 had data in both the baseline and intervention period for at least one patient and were therefore included in our analysis. Most hospitals were nonprofit (79.3%), urban (99.2%), teaching hospitals (57.9%), with over 350 beds (Table 1).

Table 1 Hospital, Market, and Patient Characteristics

Condition Choice and Savings

The prevalence of savers differed greatly across the 48 clinical conditions. Major joint replacement of the lower extremity had the highest proportion of savers (77.6%), with an average savings of −$2235, and “complex non-cervical spinal fusion” had the lowest (22.2%), with an average increase from baseline of +$8106 (Fig. 1 and Appendix Table A1).

Figure 1 Likelihood of saving, by condition, urgency, and costs. Panel a: medical (yellow) versus surgical (blue). Panel b: mostly elective (purple) (>50%) versus mostly urgent/emergent (orange) (>50%). Panel c: high-cost conditions (red) versus low-cost conditions (green).

Medical conditions had a higher proportion of savers than surgical conditions (11% more likely to save, 95% confidence interval [4%, 17%], P=0.001, Fig. 1a). Conditions that were mostly urgent/emergent had a higher proportion of savers than conditions that were mostly elective (6% more likely to save [−17%, −3%], P=0.007, Fig. 1b). There was no significant association between whether a condition was high cost and the proportion of savers (high-cost conditions: 6% more likely to save, [−13%, 1%], P=0.12, Fig. 1c). In multivariable regression analyses, none of these associations retained statistical significance (medical versus surgical: difference 9.2%, P=0.06; urgent/emergent versus elective: difference 3.8%, P=0.42; high-cost versus non-high-cost: difference 1.7%, P=0.69).

Associations Between Case Mix, Hospital, and Market Characteristics and Saving

A total of 1605 hospital-condition pairs (56%) were classified as savers, having saved at least 2% from their condition-specific baseline costs, and 1243 (44%) were classified as non-savers (Table 2). In regressions evaluating predictors of saving at the hospital-condition pair level and controlling for condition, having higher-than-median baseline costs was associated with saving (OR: 3.02, P<0.001), as were nonprofit status (OR: 1.36, P=0.009) and higher-than-median annual discharges (OR: 1.36, P=0.001). Hospitals that participated in 2 or more bundles were less likely to be savers (OR: 0.46, P<0.001), as were hospitals with patients with a higher mean number of comorbidities (OR: 0.77, P=0.003). Hospitals in the West were most likely to be savers. None of the hospital market characteristics we evaluated, including market competitiveness, was associated with saving.

Table 2 Hospital, Market, and Case Mix Predictors of Saving

Changes in Total and Setting-Specific Medicare Payments for Savers vs. Non-Savers

At baseline, total risk-adjusted Medicare payments per episode were higher for savers versus non-savers ($28,092 versus $26,351, P <0.001, Table 3). As expected, for savers, payments decreased from baseline to intervention ($28,092 to $25,926, difference: –$2,166; P<0.001). For non-savers, payments increased ($26,351 to $27,261, difference +$911, P<0.001; difference in differences: –$3,077; P<0.001). Savings accrued across multiple settings including the index hospitalization and all forms of post-acute care spending. Non-savers increased payments for the index hospitalization and most forms of post-acute care spending.
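As a rough arithmetic check, the difference-in-differences figure can be approximated from the four rounded payments reported above. This sketch is an unadjusted contrast of means with a hypothetical helper name; it is not the risk-adjusted model used in the analysis.

```python
def diff_in_diff(pre_a: float, post_a: float, pre_b: float, post_b: float) -> float:
    """Change in group A minus change in group B."""
    return (post_a - pre_a) - (post_b - pre_b)


# Rounded per-episode payments reported in the text (US dollars).
savers_change = 25_926 - 28_092       # savers: baseline to intervention
non_savers_change = 27_261 - 26_351   # non-savers: baseline to intervention
estimate = diff_in_diff(28_092, 25_926, 26_351, 27_261)

print(savers_change, non_savers_change, estimate)  # -2166 910 -3076
```

The result differs by $1 from the reported values (+$911 and –$3,077), which would be consistent with rounding of the underlying adjusted estimates.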

Table 3 Changes in Medicare Payments for Savers Vs. Non-Savers

Association Between Saver Status and Changes in Clinical Outcomes

There was no evidence of decrements in clinical outcomes among savers compared to non-savers over the study period. Both savers and non-savers increased patients’ days at home in the intervention period relative to baseline (Table 4). However, the magnitude of the increase was higher for savers than for non-savers (difference-in-differences: 2.39 days [1.94, 2.84], P<0.001). Savers decreased 90-day readmission rates (–0.9%, P<0.001), while they increased among non-savers (+1.7%, P<0.001; difference in differences: −2.6% [−3.1%, −2.0%], P<0.001). There was no difference in 90-day mortality over time between savers and non-savers.

Table 4 Patient Outcomes by Saver Status

DISCUSSION

Examining predictors of successfully reducing Medicare payments during BPCI, we found that the clinical condition chosen had a large impact on the likelihood of savings for hospitals. Additionally, hospitals that were high-cost at baseline were more likely to save. Hospitals that saved did so across all clinical settings, including inpatient and post-acute care. Finally, we found no evidence that hospitals that saved money did so at the expense of days at home, 90-day readmissions, or 90-day mortality.

At the condition level, we hypothesized that elective surgical conditions would have an advantage in BPCI, as these conditions may be more amenable to standardized care pathways, and patients may be carefully chosen and optimized preoperatively. However, aside from major joint replacement of the lower extremity, in which more than 75% of participants were savers, non-elective conditions and medical conditions had a higher proportion of savers than elective and surgical conditions. Particularly given the rise in hospital medicine and multidisciplinary inpatient care, it is possible that patients cared for on medical services had enhanced access to discharge planning, social work services, or other interventions that allowed for savings to occur even in these complex scenarios. Savings may also reflect spillover from ongoing efforts around readmission reduction for conditions in the Hospital Readmissions Reduction Program.13 Further research is needed to understand the specific processes that might be associated with better outcomes.

Among hospital and market characteristics, the strongest predictor of savings was having high baseline costs compared to other hospitals participating in the program for the same condition. Higher baseline hospital costs have been previously associated with savings in major joint replacement of the lower extremity under BPCI.14 High-cost hospitals may have more opportunity to save due to underlying inefficiencies that are relatively easy to address. Regression to the mean may explain some of this finding as well, given that costs for savers and non-savers moved towards one another from baseline. This is a concern for policymakers, who seek to reward real improvements in care rather than normal shifts over time. Using historical costs as a baseline rather than setting a single target for all participants has the advantage of implicit risk adjustment, but policymakers worry about rewarding providers just because they were historically inefficient.15 The next iteration of BPCI, BPCI-Advanced, incorporates regional costs into the formula for setting target costs, which may remedy this concern, albeit with the tradeoff of concerns about risk adjustment methodology.

There have been previous concerns that alternative payment models might encourage providers to overtreat healthier patients, avoid caring for vulnerable populations, or give substandard care in an effort to reduce costs.4,8,16,17 The fact that outcomes remained stable for savers while their costs of care decreased reinforces notions of significant waste or unnecessary care in the US healthcare system. However, this is an area that warrants ongoing evaluation, since Medicare claims data and current quality measures may be inadequately sensitive to pick up differences in functional status, quality of life, or long-term outcomes.

We are not aware of prior work focused on determining predictors of success in the program or patient outcomes for savers. While there have been studies evaluating various aspects of the BPCI program, these studies have typically compared BPCI participants to controls.18 Multiple prior studies have reported success in BPCI for joint replacement6,19,20 and mixed results for medical conditions,2,5,21 but those patterns could be due to differential changes in the control group rather than among participants. A federally contracted evaluation of BPCI by the Lewin Group15,22,23,24,25 also found that the impact of BPCI on savings seemed to vary with clinical conditions.

Our study has limitations. First, as mentioned above, CMS has not released data on target pricing or hospital savings and losses under BPCI. Therefore, we can draw no conclusions about the program’s impact on penalties or bonuses under the program. Second, BPCI is a voluntary program and results may not generalize to mandatory bundled payment programs. Third, we focused on hospital participants. Although they are the majority of model 2 participants, patterns may differ for physician group practices that participate in BPCI. Fourth, we had a limited intervention period, though it was 9 months into program participation. Longer-term follow-up will be necessary to see how care evolves. Fifth, for some individual hospitals, sample sizes were small, and our estimates may be unstable. Sixth, we used claims data to characterize patients and aggregate hospital case mix, which does not include important variables such as caregiver or social support. Finally, while we evaluated all hospital participants in BPCI, BPCI hospitals were largely urban, nonprofit, teaching hospitals, and our results cannot be extrapolated to predict how hospitals with different characteristics may respond to bundled payment programs.

In conclusion, the clinical condition chosen by hospitals participating in BPCI had a large impact on the likelihood that they would succeed in achieving savings, and high baseline costs were the most significant hospital-level predictor of achieving savings in BPCI. As alternative payment models such as bundled payments become more common, there may be increasing pressure across care settings, including for general internists in both the inpatient and outpatient settings, to control spending.