The influence of multiple coexisting conditions (multimorbidity) on patient outcomes is well established, but the optimal method for measuring multimorbidity is not. In this issue of JGIM, Radomski and colleagues present findings from a national cohort of veterans over age 65 with diabetes, assembled from Veterans Affairs (VA) and non-VA administrative files, used to estimate and compare health outcomes for those using dual sources of care.1 The primary outcome was mortality, and the major finding was that the estimated odds of death differed depending on which measure of multimorbidity was used for risk adjustment. For example, in the model controlling for multimorbidity based on International Classification of Diseases (ICD) diagnostic codes, the dual-use groups had lower odds of death than VA-predominant users. On the other hand, in the model based on pharmacy records, the dual-use groups had higher odds of death than VA-predominant users. The authors point out the importance of measuring multimorbidity and outline the need for reliable risk adjustment methods that are not susceptible to measurement differences across health systems.

We agree that appropriately accounting for multimorbidity is essential, and we think that the widening adoption of electronic health records (EHRs) strongly suggests a new approach to measurement. When Dr. Mary Charlson first developed her comorbidity index, it was based on inpatient clinical data collected prospectively during the course of care.2,3 In both development and validation studies, the weighted index performed well in discriminating risk of mortality.2,3 However, when investigators adapted the index to use ICD diagnostic codes, its discrimination declined dramatically.4 Over the past 20 years, additional indices using ICD codes have been developed in an attempt to improve the performance of these measures. For example, the Elixhauser index (used in the Radomski article) was expanded to 30 unweighted outpatient conditions, including mental health conditions.5 However, the inaccuracy and measurement bias of ICD codes for clinical conditions are well documented. Unfortunately, providers vary in their coding practices within and across systems and may code for only one diagnosis even when patients are seen for multiple conditions.
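
To make the mechanics concrete, the sketch below shows how an ICD-code-based index is typically assembled: billed codes are mapped to condition flags, and the unweighted index is simply the number of flags. The code prefixes and condition names are illustrative placeholders, not the published Charlson or Elixhauser definitions.

```python
# Minimal sketch of an ICD-code-based comorbidity count (illustrative only;
# the prefixes and condition names are NOT the Charlson or Elixhauser definitions).
from typing import Iterable

# Hypothetical mapping from ICD-10 code prefixes to condition flags.
ICD_CONDITION_MAP = {
    "E11": "diabetes",
    "I50": "heart_failure",
    "J44": "copd",
    "N18": "chronic_kidney_disease",
    "F32": "depression",
}

def comorbidity_flags(icd_codes: Iterable[str]) -> set[str]:
    """Return the set of flagged conditions for a patient's billed ICD codes."""
    flags = set()
    for code in icd_codes:
        for prefix, condition in ICD_CONDITION_MAP.items():
            if code.upper().startswith(prefix):
                flags.add(condition)
    return flags

# An unweighted index is simply the number of flagged conditions.
patient_codes = ["E11.9", "I50.22", "E11.65"]  # duplicate diabetes codes count once
print(len(comorbidity_flags(patient_codes)))   # -> 2
```

Note that the resulting count depends entirely on which codes a provider happens to record, which is exactly the measurement bias described above.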

A growing body of literature has examined the use of pharmacy records to create a comorbidity measure when prescription medications are indicative of the disease condition. For example, the RxRisk-V (used in the Radomski article) uses pharmacy records to identify 45 conditions and is weighted to predict mortality.6,7 Although these pharmacy-based measures avoid the measurement bias of ICD diagnostic codes, and pharmacy records are fairly well standardized and portable across US healthcare systems, prescriptions reflect individual provider decision making and behavior: one provider may be more likely than another to prescribe medications for a given condition.
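
A pharmacy-based measure works analogously, substituting dispensed drugs for diagnostic codes and summing condition weights. The sketch below is hedged: the drug-to-condition mapping and the mortality weights are invented stand-ins, not the published RxRisk-V categories or weights.

```python
# Sketch of a weighted, pharmacy-based comorbidity score (illustrative;
# drug classes and weights are placeholders, not the RxRisk-V values).
DRUG_TO_CONDITION = {
    "metformin": "diabetes",
    "insulin": "diabetes",
    "furosemide": "heart_failure",
    "tiotropium": "copd",
    "sertraline": "depression",
}

# Hypothetical weights reflecting each condition's association with mortality.
CONDITION_WEIGHTS = {
    "diabetes": 1.0,
    "heart_failure": 2.5,
    "copd": 1.5,
    "depression": 0.5,
}

def rx_comorbidity_score(dispensed_drugs: list[str]) -> float:
    """Sum the weights of conditions implied by a patient's filled prescriptions."""
    conditions = {
        DRUG_TO_CONDITION[d.lower()]
        for d in dispensed_drugs
        if d.lower() in DRUG_TO_CONDITION
    }
    return sum(CONDITION_WEIGHTS[c] for c in conditions)

print(rx_comorbidity_score(["Metformin", "Insulin", "Furosemide"]))  # -> 3.5
```

The score is only as good as the prescribing behavior behind it: a condition that a provider chooses to manage without medication never enters the index.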

Thus, whether a multimorbidity measure is based upon ICD codes or pharmacy records, there is significant potential for bias. One must ask: in the era of paperless EHRs, why are we still talking about billing data (ICD codes or pharmacy records) for risk adjustment? Health services researchers would do well to look beyond billing data to the direct clinical data becoming widely available in EHRs. Prior work in prognosis has demonstrated the superiority of clinical data over billing data for predicting mortality.8 We would like to highlight three sources of better data on burden of disease: routine laboratory data, modern-day chart review via text processing, and directly collected patient behaviors.

As demonstrated by the Veterans Aging Cohort Study (VACS) Index, it is possible to combine demographic data with routine, Clinical Laboratory Improvement Amendments (CLIA)-certified and standardized laboratory data, such as hemoglobin, platelets, hepatitis C tests, creatinine, and transaminases, to develop and validate a prognostic index that is highly predictive of mortality and hospitalization among aging populations with and without HIV infection.9 (See also the vacohort.org website.) As in clinical practice, laboratory values are considered accurate until newer values are available within a reasonable time window. The benefit of this approach is substantial. CLIA certification and standardization mean that a hemoglobin measurement in New Haven, Connecticut, and one in Orlando, Florida, have the same interpretation. Laboratory data are not subject to provider or clinic biases in diagnosis, and routine laboratory tests are widely available. Finally, the discrimination achieved rivals or exceeds that obtained from more difficult-to-measure variables.8
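
A laboratory-based index can be sketched in the same spirit: take the most recent value within a lookback window, assign points to bands of routine lab values and age, and sum. The variables below echo those used in the VACS Index, but the cut points and point values are invented for illustration and are not the published VACS weights.

```python
# Sketch of a lab-based prognostic score in the spirit of the VACS Index.
# Cut points and point values are invented for illustration only.
from datetime import date, timedelta

def most_recent(labs: list[tuple[date, float]], as_of: date,
                window_days: int = 365) -> float | None:
    """Return the most recent value within a lookback window, as in clinical practice."""
    eligible = [(d, v) for d, v in labs if as_of - timedelta(days=window_days) <= d <= as_of]
    return max(eligible)[1] if eligible else None

def lab_risk_score(age: int, hemoglobin: float, creatinine: float) -> int:
    """Toy point system: higher points imply higher predicted mortality risk."""
    points = 0
    points += 10 if age >= 65 else 0
    points += 15 if hemoglobin < 10 else (5 if hemoglobin < 12 else 0)
    points += 20 if creatinine > 2.0 else (10 if creatinine > 1.3 else 0)
    return points

hgb = most_recent([(date(2024, 3, 1), 11.2)], as_of=date(2024, 9, 1))
cr = most_recent([(date(2024, 6, 15), 1.6)], as_of=date(2024, 9, 1))
print(lab_risk_score(age=72, hemoglobin=hgb, creatinine=cr))  # -> 25
```

Because CLIA-certified values carry the same meaning across systems, a score built this way travels with the patient in a way that coding practices do not.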

With the growth of EHRs and expanded capacity for computer processing, it is time to move beyond administrative coded diagnoses and manual chart review.10 Data from clinical encounters, including medications, laboratory results, and clinically meaningful diagnostic results extracted from clinical notes with computational and natural language processing (NLP) methods, could create more predictive risk adjustment indices. As demonstrated in a recent manuscript, extracting the ejection fraction from text notes in the EHR using NLP was critical to determining the risk of heart failure with preserved versus reduced ejection fraction among HIV-infected individuals.11 Although there are benefits in using NLP to extract clinically meaningful pieces of information from the EHR, no one system makes complete use of EHR information. Multiple systems or approaches will be needed.
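
A hedged sketch of the kind of extraction involved is shown below. Production NLP pipelines handle negation, templated text, and far more phrasings; the regular expression here is only a toy illustration of pulling an ejection fraction out of free-text notes.

```python
# Toy illustration of extracting a left ventricular ejection fraction (EF)
# from free-text notes with a regular expression. Real NLP pipelines are
# far more robust than this sketch.
import re

EF_PATTERN = re.compile(
    r"\b(?:ejection\s+fraction|LVEF|EF)\b\D{0,15}?(\d{1,2})\s*(?:-|to)?\s*(\d{1,2})?\s*%",
    re.IGNORECASE,
)

def extract_ef(note_text: str) -> float | None:
    """Return the (midpoint of the) first EF percentage found in a note, if any."""
    match = EF_PATTERN.search(note_text)
    if not match:
        return None
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    return (low + high) / 2

note = "Echo today: LVEF 40-45%, mild MR. Continue current regimen."
print(extract_ef(note))  # -> 42.5
```

Values extracted this way can then sit alongside laboratory results in the same risk adjustment models.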

Clinically, we measure exposure to controlled substances (via urine toxicology screening) because patients’ reports of this behavior may not be reliable. Yet we continue to base our assessment of tobacco and alcohol use on patient reporting. Moving forward, we should also measure these exposures at the point of care and include this retrievable information in the EHR. Likewise, we could retrieve information on physical activity and sleep quality from wearable devices. Multiple assays and applications have been validated and are available to measure these exposures. Accurate measures of these health behaviors could significantly improve risk adjustment.

Thus, the major limitation of the study by Radomski and colleagues is its reliance on administrative data. Another significant limitation is that the analyses are not adjusted for demographic characteristics such as education, and educational level is strongly correlated with health outcomes. Finally, these data are from 2008 to 2011 and may no longer be applicable, since the VA has recently implemented many changes in dual-use care.

The measurement of multimorbidity matters to patients, policymakers, researchers, and clinicians, and the use of administrative coded diagnoses is inherently limited. The next steps are to improve multimorbidity risk adjustment measures and to base them on providers’ notes, laboratory data, and point-of-care testing.