Introduction

Acute respiratory distress syndrome (ARDS) is a life-threatening manifestation of acute lung injury, and the associated mortality remains high1 despite developments in modern medicine. ARDS reportedly has various clinical phenotypes with different risks and prognostic factors2,3,4. Moreover, certain drugs have been reported to cause ARDS5,6. Although diagnostic criteria for drug-associated lung injury (DALI) have been proposed7,8, there is still no gold standard and its diagnosis remains challenging. It is particularly difficult to diagnose DALI in patients with ARDS.

Recently, we reported that drug-associated ARDS (DARDS) has clinical characteristics that are different from those of non-drug-associated ARDS (non-DARDS) and that DARDS has a more favourable prognosis9. In that study, patients with DARDS had a lower Acute Physiology and Chronic Health Evaluation (APACHE) II score and a higher arterial oxygen tension (PaO2)/fractional inspired oxygen (FiO2) ratio than those without DARDS. However, high-resolution computed tomography (HRCT) scans showed more extensive lung damage and fibroproliferation in the patients with DARDS. Furthermore, another retrospective cohort study reported that patients with ARDS, who did not have typical risk factors, including those with DARDS, were more frequently referred to the intensive care unit for acute respiratory failure, had a lower severity score on admission to the intensive care unit, and were less likely to develop non-pulmonary organ failure10.

Although corticosteroid therapy for ARDS is controversial, it is generally used for DALI. There may be a therapeutic advantage of using steroids in patients with DARDS if they could be diagnosed. We believe that we should consider the clinical features and treatment strategy used for DARDS separately from that of non-DARDS because the effect of treatment, e.g., with corticosteroids, and the prognosis of these disease entities may be different. We hypothesized that it would be possible to use a scoring system to predict drug-associated DARDS using clinical variables. The purpose of this study was to develop and validate a scoring system for prediction of DARDS.

Results

There were 29 patients in the DARDS group and 200 in the non-DARDS group (Fig. 1). The baseline characteristics of the patients in the two groups are shown in Table 1.

Figure 1
figure 1

Flow of patients through the study. CPR, clinical prediction rule; HRCT, high-resolution computed tomography.

Table 1 Baseline characteristics of patients in the DARDS and non-DARDS groups

The aetiological agents associated with DARDS are shown in Table 2. The causative agents were anticancer drugs in 7 cases (24%), Chinese herbal (“kampo”) medicine in 5 (17%), antibiotics in 4 (14%), amiodarone in 4 (14%), anti-rheumatic agents in 3 (10%), non-steroidal anti-inflammatory drugs in 3 (10%), and others in 3 (10%).

Table 2 Agents known to cause drug-associated acute respiratory distress syndrome identified in 29 patients.

Clinical prediction rule was developed using 2 multivariable logistic regression models: a model that included the APACHE II score, DIC score, and PaO2/FiO2 ratio; and a model in which the HRCT was added to the APACHE II score, DIC score, and PaO2/FiO2 ratio. The receiver-operating characteristic (ROC) curves for the clinical and HRCT added models are shown in Fig. 2 and the area under the ROC curve (AUC-ROC), bootstrap AUC-ROC, and optimism in Table 3.

Figure 2
figure 2

Area under the receiver-operating characteristic curve for each model.

Table 3 AUC-ROC, bootstrap AUC-ROC, and optimism values in the multivariable logistic regression models.

We assessed the additive predictive ability of HRCT in diagnostic ability between the clinical model and HRCT added model using net reclassification improvement (NRI) and integrated discrimination improvement (IDI; Table 4) and both were statistically significant (P < 0.01). We found that adding HRCT appeared to be superior to the model including only the clinical variables. The relationship between the predicted probability of DARDS and each continuous variable are shown in Fig. 3. Next, some of the variables in the HRCT added model were categorized according to their ease of use in the clinical setting (Table 5) to create a clinical prediction rule for diagnosing DARDS. The ROC curve for the final model is shown in Fig. 4. The AUC-ROC was 0.866 (95% confidence interval 0.814–0.920). The bootstrap method was used to evaluate the internal validity of the final model. The AUC-ROC was 0.853 when adjusted for optimism, which was 0.136.

Table 4 Additive diagnostic ability of DARDS with a model including HRCT score compared with a model including APACHE II score, DIC score, and PaO2/FiO2 ratio using net reclassification improvement (NRI) and integrated discrimination improvement (IRI).
Figure 3
figure 3

The relationship between DARDS and each variable in the HRCT added model.

Table 5 Variables in the clinical prediction rule.
Figure 4
figure 4

Area under the receiver-operating characteristic curve for the final model.

Using a potential cut-off score of >34.8, the sensitivity was 96.6%, the specificity was 70.3%, the positive predictive value was 32.9%, and the negative predictive value was 99.3%.

Discussion

In this study, we have developed a clinical prediction rule for DARDS in patients with a known diagnosis of ARDS. There are four variables in this clinical prediction rule, which we have named the “DHAP” (DIC score, HRCT score, APACHE II score, PaO2/FiO2) score. Using this clinical prediction rule, the higher the DIC and APACHE II scores the lower the DHAP score, and the higher the HRCT score and PaO2/FiO2 ratio the higher the DHAP score. The Japanese Association for Acute Medicine has proposed a DIC scoring system that reportedly predicts multiorgan failure and a poor prognosis in patients with severe sepsis11. There is a correlation between the findings on HRCT findings and the pathologic stage of diffuse alveolar damage (DAD) as well as an association of HRCT scores with a poor outcome, prolonged mechanical ventilation, multiple organ dysfunction, and ventilator-associated complications (barotrauma and ventilator-associated pneumonia)12. In view of the above findings, the clinical prediction rule devised in the present study suggests that extensive lung damage with fibroproliferation and a less serious general and respiratory condition are important indicators of DARDS.

A DHAP score >34.8 had high sensitivity and a negative predictive value. It also had a high AUC-ROC and bootstrap AUC-ROC, and its optimism was reasonable.

ARDS has recently been reported to include a variety of clinical phenotypes with divergent risk factors and prognostic features2,3,4. In those studies, the response of the hyperinflammatory sub-phenotype, characterized by an increase in levels of biomarkers of inflammation, more severe shock and acidosis, and a significantly poorer prognosis, to positive end-expiratory pressure, fluid therapy, and pharmacotherapy was found to be different to that of the hypoinflammatory sub-phenotype. As already mentioned, ARDS is a heterogeneous syndrome, and it would be helpful to be able to categorize it according to its cause and tailor treatment to the sub-phenotype.

In previous reports, DALI/DARDS accounted for about 10% of all cases of ALI/ARDS7,9,10. Recently, we reported that the clinical features of DARDS were different from those of non-DARDS9. In that study, patients with DARDS received higher doses of corticosteroids and had a better prognosis than those with non-DARDS. Moreover, Gibelin et al.10 reported that the cytology of bronchoalveolar lavage (BAL) fluid and findings on chest CT might help to detect patients with ARDS, including those with drug-induced ARDS, who do not have the common risk factors, and that their mortality risk might be decreased by treatment with an anti-inflammatory agent.

In Japan, the Practical Guidelines of the Japanese Respiratory Society13 recommend that patients with acute respiratory failure caused by DALI be treated initially with high-dose corticosteroids and gradually tapered. It is possible that steroid therapy could be beneficial in patients with DARDS if they could be identified.

Histopathologic features of DAD were reportedly found at autopsy in only 45% of patients with ARDS diagnosed by the Berlin criteria14. This suggests that not all patients with ARDS have the histopathologic features of DAD. One possible explanation for this observation is that there might be some pathologic differences in patients with DARDS who develop fibroproliferative changes, and corticosteroid therapy could have a better effect in DARDS.

It was reported that patients with ARDS and predominantly haemorrhagic or lymphocytic BAL fluid had a better response to corticosteroid therapy and a better outcome than those with predominantly macrophagic or neutrophilic BAL fluid10. Recently, we reported that 62% of patients with DARDS in whom BAL was performed had lymphocytic or eosinophilic BAL fluid9. These findings suggest that many patients with DARDS have lymphocytic BAL fluid and that corticosteroid therapy may be effective in these patients. The value of corticosteroid therapy for ARDS remains controversial. If corticosteroids are withheld in all patients with ARDS, those with DARDS may miss the opportunity for potentially effective treatment. DARDS cannot be diagnosed definitively using the DHAP score alone. However, when combined with BAL and lymphocyte predominance, this score may help to diagnose DARDS. We believe that it is important to be able to determine whether ARDS is drug-associated and that the DHAP score can help to make the diagnosis.

There are some limitations to this study. First, it had a single-centre, retrospective design, which may limit use of the DHAP score at other facilities. Further prospective multicentre validation studies are necessary to determine if use of this tool could be extended to a more general patient population. Second, the number of patients with DARDS was relatively small, so our clinical prediction rule may have the problem of overfitting. However, the accuracy of this rule was sufficiently high, and the optimism of the final model was adequate. Third, only Japanese patients participated in the study, and we may have to consider ethnic differences when using this rule. Fourth, it was difficult to judge whether some drugs were causative or not in the DARDS group. However, this problem is unavoidable in such research because specific markers, histologic features, and clinical findings are generally not diagnostic, and there is no gold standard method for diagnosis of drug-associated lung disease. Fifth, the DHAP score includes the HRCT score, which is not used worldwide and is only used when a patient with ARDS has recently been exposed to a suspected culprit drug.

In conclusion, we have developed a simple scoring system for clinical prediction of DARDS. We believe that using this rule will help early detection, guide appropriate treatment, and improve the prognosis in patients with DARDS. A further external validation study for this clinical prediction rule is now needed.

Methods

Study design

This study was a post-hoc analysis of data collected during an ongoing prospective, single-centre, cohort study of ARDS using HRCT, parts of which have been published previously9,12,15,16,17,18. Our hospital is a tertiary academic teaching institution with 400 beds.

Patients

Two hundred and twenty-nine Japanese patients with ARDS were admitted to our institution from October 2004 to December 2017. The ARDS diagnosis was based on the American-European Consensus Conference19 before June 2012 and on the Berlin definition20 after July 2012. The study participants were enrolled from both the intensive care unit and other departments within the hospital. An HRCT scan was performed on the day of diagnosis of ARDS in all cases. Patients with chronic interstitial lung disease, including idiopathic pulmonary fibrosis and underlying collagen vascular disease, and those with conditions that mimic ARDS (pulmonary alveolar haemorrhage resulting from vasculitis syndrome, acute organizing pneumonia, acute eosinophilic pneumonia, and acute hypersensitivity pneumonitis) were excluded from the analysis.

The study was approved by the institutional review board of Saiseikai Kumamoto Hospital (permission number 238) and conducted in accordance with the ethical standards of the Declaration of Helsinki. Written informed consent was obtained from all patients included in the study or their families. The results of the study are reported in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement21.

Definition of drug-associated ARDS

We classified the patients with ARDS into a DARDS group and a non-DARDS group according to aetiology. We used the definition of DALI devised by Dhokarh et al.7 and the traditional risk factors for acute lung injury (ALI), i.e., sepsis, septic shock, pneumonia, pancreatitis, trauma, massive blood transfusion, and gastric aspiration. Probable DALI was considered in patients with no established risk factors for ALI except for specific drug exposure within the previous year. Patients with possible DALI had at least one risk factor for ALI and a history of specific drug exposure within the previous year. Those with conditional DALI had received drugs not previously reported to cause ALI but with similarity to known causative agents. Patients without DALI had not been exposed to drugs reported or assumed to cause ALI. The patients with ARDS and probable or possible DALI were classified into a DARDS group while those with ARDS and conditional DALI or without DALI were classified into a non-DARDS group. The data for drug exposure history in the year before onset of ARDS were available in “medicine notebooks” that list all drugs prescribed to patients and are unique to Japan.

Data collection and definitions

Patient age and sex, APACHE II score, Sequential Organ Failure Assessment (SOFA) score, HRCT score (indicating the extent of fibroproliferation12), DIC score, McCabe score, arterial oxygen tension (PaO2/FiO2) ratio, and blood test results were recorded at the time of diagnosis of ARDS. HRCT was performed at the start of mechanical ventilation in all patients. As previously reported22, the HRCT score was graded on a scale of 1−6 (1, normal attenuation; 2, ground-glass attenuation; 3, consolidation; 4, ground-glass attenuation with traction bronchiolectasis or bronchiectasis; 5, consolidation with traction bronchiolectasis or bronchiectasis; and 6, honeycombing). The presence of each of these six abnormalities in the upper, middle, and lower segments of each lung was assessed independently. The extent of each abnormality was determined by visually estimating the proportion of lung parenchyma affected in each segment. Each abnormality score was calculated by multiplying the percentage area by each individual score. The six segment scores were averaged to determine the total score for each abnormality. The overall HRCT score was obtained by adding the six averaged scores.

The DIC score was calculated according to the diagnostic criteria recommended by the Japanese Association of Acute Medicine23. The McCabe score for severity of underlying disease was recorded (1, non-fatal; 2, near-fatal; 3, fatal)24.

Statistical analysis

Continuous variables are described as the median (interquartile range [IQR]) and the categorical variables as the frequency and percentage. The chi-square test or Fisher’s exact test was used to compare the categorical variables between the groups and the Mann-Whitney U test was used to compare the continuous variables between patients with and without DARDS. Variables with clinical relevance, including the APACHE II score, DIC score, HRCT score, and PaO2/FiO2 ratio, were selected a priori and included in multivariable logistic regression models where the binary variable of present or absence of DARDS was the dependent variable. We created 2 models for development of a clinical prediction rule for DARDS: a clinical model consisting of the APACHE II score, DIC score, and PaO2/FiO2 ratio; and an HRCT added model consisting of the APACHE II score, DIC score, PaO2/FiO2 ratio, and HRCT score. To allow for nonlinear associations, the APACHE II and DIC scores were modelled using restricted cubic splines.

Given that the number of patients with missing data was small (n = 8), the prediction model was developed using data only for patients in whom all of the study variables had been assessed (n = 221). The discrimination performance of each potential predictor was assessed by the AUC-ROC. Bootstrap validation was performed with 150 resamples to validate and calibrate each prediction model. The bootstrap bias-corrected AUC (bootstrap AUC-ROC) was reported as the measure of the predictive performance of the model. The optimism25 of each model was estimated using 150 bootstrap resamples. Optimism assesses the magnitude of overfitting of logistic regression model (a value less than 0.3 is considered as good), and was calculated using C-statistics by bootstrap samples. We evaluated the bootstrap AUC-ROC and overfitting of each model and chose two parsimonious models with acceptable diagnostic ability and the least number of parameters. Using the NRI and IDI, we assessed whether there was a difference in diagnostic ability between the 2 models. The total NRI was the summation of the accurate reclassifications of patients with and without DARDS. In the patients with DARDS, improvement of reclassification was the difference between the percentage of patients reclassified as a higher risk group and that of patients reclassified as a lower group. Similarly, in the patients without the DARDS, improvement of reclassification was the difference between the percentage of patients reclassified as a lower risk group and that of patients reclassified as a higher group. The total IDI provides the difference in mean predicted probabilities, representing the amount by which addition of a variable to a model increases the separation of the mean predicted probabilities for DARDS and non-DARDS26. To make the final model easier to use in a clinical setting, the variables were scored, the bootstrap AUC-ROC of the final model was calculated, and its diagnostic ability was assessed. The sensitivity, specificity, positive predictive value, and negative predictive value were calculated by using the best cut-off score for the clinical prediction rule with the Youden index for the ROC.

The statistical analyses were performed using R version 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria). A two-sided p-value < 0.05 was considered statistically significant.