Introduction

Airway obstruction, often secondary to laryngeal edema, is one of the primary causes of respiratory distress after extubation [1]. The frequency of this complication in patients in the intensive care unit is estimated to range between 3 and 30%, of which between 1 and 5% require reintubation [2].

The occurrence of upper airway obstruction after extubation is difficult to predict. As the presence of an endotracheal tube precludes direct visualization of the upper airway prior to extubation, a cuff-leak test might be useful in an effort to screen for an airway obstruction before extubation. This test consists of deflating the balloon cuff of the endotracheal tube in order to assess the air leak around the tube, which can indirectly assess the upper airway patency. A small leak or the complete absence of one would be suggestive of an airway obstruction. The result of the test can be expressed as qualitative (presence or not of leak around the tube) or quantitative. Several studies have assessed the ability of the cuff-leak test to predict upper airway obstruction secondary to laryngeal edema. The results of these studies, individually considered, lack statistical precision to be used for making medical decisions. A systematic review and a meta-analysis might improve the precision of the individual studies and provide information on the consistency of results and sources of heterogeneity.

We performed a systematic review with meta-analysis to assess the diagnostic accuracy of the cuff-leak test in the diagnosis of upper airway obstruction secondary to laryngeal edema and for reintubation secondary to upper airway obstruction.

Methods

Search strategy and study selection

We searched the following electronic databases: Medline, Embase, CINAHL, CANCERLIT, Pascal-Biomed, ACP Journal Club, Cochrane Library (CDSR, DARE, CCTR), ISI Proceedings, Current Contents and Web of Science. The search was updated to December 2008. For the electronic search, we used the following terms or MeSH subject headings: “respiration, artificial,” “intubation, intratracheal,” “airway, obstruction,” “laryngeal, edema,” “weaning,” “reintubation,” “cuff leak,” and “diagnosis,” “sensitivity,” “specificity,” “predictive value,” “likelihood ratio,” “false positive” and “false negative.” The Medline search was limited to “adult-all” and “human.” No language restriction was used. Further searches were performed by manually reviewing abstracts, conference proceedings and review articles.

We included all studies that met the following criteria: including more than 50 patients, assessing the diagnostic accuracy of the cuff-leak test for upper airway obstruction secondary to laryngeal edema and/or reintubation due to upper airway obstruction, and providing sufficient information to construct the 2 × 2 contingency table for individual study subjects. Two reviewers (MEO and MCM) independently judged study eligibility while screening the references.

Data extraction

Three reviewers (MEO, MCM and FFV) independently extracted data from each study to obtain information on patient demographics, sample size, test methods, diagnostic cutoff points, participant characteristics, sensitivity and specificity of the data, and methodological quality. Each reviewer extracted the data to construct a 2 × 2 table. Any disagreements were resolved by consensus between them.

Quality assessment

The methodological quality of each study was assessed by two authors (MCM and FFV), using a checklist based on criteria adapted from the quality assessment for studies of diagnostic accuracy (QUADAS) tool (maximum score 14 × 2) [3, 4].

Statistical analysis

For each study, the sensitivity, specificity, positive and negative likelihood ratios, and diagnostic odds ratio (OR) were calculated according to the following formulas: sensitivity = [number of true positives/(number of true positives + number of false negatives)]; specificity = [number of true negatives/(number of true negatives + number of false positives)]; positive likelihood ratio = [sensitivity/(1 − specificity)]; negative likelihood ratio = [(1 − sensitivity)/specificity]; and diagnostic odds ratio = [sensitivity/(1 − sensitivity)]/[(1 − specificity)/specificity].
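As an illustration only, the per-study indices defined above can be sketched in a few lines of Python. The container `TwoByTwo`, the function `diagnostic_indices` and the counts in the example are hypothetical and are not taken from any included study; the sketch also assumes that no cell of the 2 × 2 table is zero.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of a 2 x 2 table (hypothetical container for illustration)."""
    tp: int  # positive test, obstruction present
    fp: int  # positive test, obstruction absent
    fn: int  # negative test, obstruction present
    tn: int  # negative test, obstruction absent


def diagnostic_indices(t: TwoByTwo) -> dict:
    """Apply the formulas given in the text; assumes no zero cells."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "dor": (sens / (1 - sens)) / ((1 - spec) / spec),
    }


# Made-up counts, chosen only to show the arithmetic.
example = diagnostic_indices(TwoByTwo(tp=8, fp=10, fn=2, tn=80))
```

With these made-up counts the diagnostic odds ratio equals (8 × 80)/(10 × 2) = 32, which is also the ratio of the positive to the negative likelihood ratio.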

The threshold effect derived from using different cutoff points was assessed using the method of Littenberg and Moses [5]. This method allows summary receiver-operating characteristic (SROC) curves to be drawn that summarize the study results, and allows the impact of individual variables, such as study quality, on test accuracy to be assessed. To detect heterogeneity, the likelihood ratios and diagnostic odds ratios were analyzed using Cochran’s Q test. To quantify the extent of heterogeneity, the I² statistic was used to measure the percentage of variability between summary indices that was due to heterogeneity rather than chance. An I² greater than 50% was considered to indicate substantial heterogeneity. Pooling of the individual indices was performed using DerSimonian and Laird’s random-effects model. Publication bias was examined visually by inspecting funnel plots and statistically by using the Egger regression model [6]. Analyses were performed using Stata, MetaDisc [7] and StatsDirect.
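The heterogeneity statistics and the random-effects pooling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analyses (which were run in Stata, MetaDisc and StatsDirect): the function names are hypothetical, effects are assumed to be on the log scale (for example, log diagnostic odds ratios), and the variance of a log odds ratio is taken as the sum of the reciprocals of the four cell counts, which requires non-zero cells.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its variance (assumes non-zero cells)."""
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def pool_random_effects(effects, variances):
    """Cochran's Q, I-squared (%) and DerSimonian-Laird pooled estimate."""
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Method-of-moments estimate of the between-study variance.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"Q": q, "I2": i2, "tau2": tau2, "pooled": pooled, "se": se}
```

Exponentiating `pooled` and `pooled ± 1.96 × se` returns the pooled ratio and an approximate 95% confidence interval on the original scale.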

Results

Studies included

With the defined search strategy, 26 studies were identified as potentially eligible. After abstract review we selected 13 studies [8–20] that evaluated the cuff-leak test. In a further analysis, one study [15], whose only objective was an evaluation of the methodology to perform the cuff-leak test, was excluded. Another study [20] was excluded from the analysis because it included fewer than 50 patients. A total of 11 studies [8–14, 16–19], including 2,303 patients, were included in the analysis (Fig. 1). Table 1 shows the characteristics of patients included in the studies and the methodological quality of the studies. The median QUADAS score was 12 points (interquartile range: 9, 14; for a maximum score of 28 points). Table 2 shows the methodology used to perform the cuff-leak test and the results reported in each study.

Fig. 1

Flow chart of the study identification, inclusion and exclusion for meta-analysis

Table 1 Characteristics of included studies in the systematic review
Table 2 Leak assessment and criteria for diagnosis of upper airway obstruction

Prediction of upper airway obstruction secondary to laryngeal edema

The leak cutoff value was different in every study, but the analysis of regression effect did not find a threshold effect (Spearman correlation coefficient = 0.55; P = 0.125 and beta coefficient in the Moses model = 0.38; P = 0.16).

Nine studies assessed upper airway obstruction, defined in eight studies [9–11, 13, 14, 16, 18, 19] as the presence of an inspiratory stridor and in one study [17] as laryngeal edema observed on fibrobronchoscopy. The overall incidence of upper airway obstruction was 6.9% (range: 0.6–36.8%).

Operative characteristics (sensitivity, specificity, positive likelihood ratio, negative likelihood ratio) of the cuff-leak test in each study and overall are shown in Fig. 2. Although all characteristics showed a significant statistical heterogeneity (P < 0.05), the positive likelihood ratios were always higher than 3, indicating an increased risk of laryngeal edema in patients with a positive test (pooled positive likelihood ratio 6.79). The clinical relevance of a negative test, however, was more limited (pooled negative likelihood ratio 0.46).

Fig. 2

Operating characteristics (sensitivity, specificity, positive likelihood ratio and negative likelihood ratio) of cuff-leak test for the prediction of upper airway obstruction after extubation

The pooled diagnostic odds ratio was 18.78 (95% confidence interval: 7.36–47.92). There was significant heterogeneity (P = 0.001; I² = 69%; 95% confidence interval: 24–83%).

The area under the SROC curve (Fig. 3) was 0.92 (95% confidence interval: 0.89–0.94) and the Q point 0.85, indicating a moderate level of overall accuracy.

Fig. 3

Summary receiver-operating characteristics (SROC) curve of cuff-leak test for the prediction of upper airway obstruction after extubation

Whereas the Egger tests for the diagnostic odds ratio and the positive likelihood ratio were not significant, the funnel plot of negative likelihood ratios was clearly asymmetric (Egger test P = 0.0023), strongly suggesting the presence of publication bias (Fig. 4).

Fig. 4

Funnel plot for the assessment of potential publication bias. The funnel graph plots the log of the negative likelihood ratio against the standard error of the log of the negative likelihood ratio. The result of the Egger test for publication bias was significant (P = 0.002)
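The regression behind the Egger test can be sketched as below: the standardized effect (log ratio divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept far from zero points to funnel-plot asymmetry. This is a minimal sketch with a hypothetical function name; it returns only the fitted intercept and slope and does not compute the P value reported here, which requires the standard error of the intercept and a t distribution.

```python
def egger_regression(log_effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, slope). The intercept measures asymmetry;
    the slope estimates the underlying effect on the log scale.
    """
    x = [1 / se for se in std_errors]                      # precision
    y = [e / se for e, se in zip(log_effects, std_errors)]  # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

When every study estimates the same effect regardless of its precision, the fitted line passes through the origin and the intercept is zero.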

Meta-regression analysis did not show a significant association between methodological quality (QUADAS score) and diagnostic accuracy of the cuff-leak test (diagnostic odds ratio 0.93; 95% confidence interval: 0.65–1.35; P = 0.67).

Prediction of reintubation secondary to upper airway obstruction

Only three studies [8, 11, 12] have evaluated the cuff-leak test to predict reintubation secondary to upper airway obstruction. The overall incidence of reintubation was 7%. The operating characteristics of the cuff-leak test for this outcome are shown in Fig. 5. Heterogeneity was only found for the specificity (P = 0.001). The overall diagnostic odds ratio was 10.37 (95% confidence interval: 3.70–29.13) without significant heterogeneity (P = 0.90).

Fig. 5

Operating characteristics (sensitivity, specificity, positive likelihood ratio and negative likelihood ratio) of cuff-leak test for the prediction of reintubation secondary to upper airway obstruction

Discussion

In our systematic review, the cuff-leak test shows moderate accuracy for predicting upper airway obstruction and low accuracy for predicting reintubation secondary to upper airway obstruction. In the analysis of the studies that evaluated post-extubation upper airway obstruction, we found significant statistical heterogeneity in all the operative descriptors. This result would make the application of this test difficult for medical decision-making. However, despite the statistical heterogeneity, the absence of leak (positive cuff-leak test) showed an association with the presence of post-extubation upper airway obstruction (positive likelihood ratio between 5 and 10). If we take a pre-test probability of 15% (mean incidence reported in the control group of studies that have evaluated the effect of steroids in the prevention of stridor [2]) and a positive likelihood ratio of 5.90, the post-test probability is increased to 51%. With a negative likelihood ratio of 0.48, the use of the cuff-leak test would reduce the post-test probability to 8%. However, from a clinical point of view, the outcome of most interest is reintubation secondary to upper airway obstruction. This outcome has been evaluated in four studies [8, 11, 12, 20], from which we excluded one [20] because it had a sample size of fewer than 50 patients. In addition, in this study [20], no patient was reintubated because of stridor, so there were neither false negatives nor true positives. We observed less heterogeneity in the operative characteristics, but the magnitude of the association between the cuff-leak test and reintubation was lower than with upper airway obstruction. Thus, for a reported incidence of reintubation secondary to upper airway obstruction of 5% [2], the absence of leak increases the probability of reintubation to 17%, and the presence of leak decreases the probability of reintubation to 2%.
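The post-test probabilities quoted above follow from converting the pre-test probability to odds, multiplying by the likelihood ratio, and converting back to a probability. A minimal sketch of that arithmetic, using a hypothetical helper name and the figures given in the text (pre-test probability 15%, likelihood ratios 5.90 and 0.48):

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Positive test (no leak): 15% pre-test probability, LR+ = 5.90
print(round(post_test_probability(0.15, 5.90), 2))  # 0.51
# Negative test (leak present): 15% pre-test probability, LR- = 0.48
print(round(post_test_probability(0.15, 0.48), 2))  # 0.08
```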

Adderley and Mullins [21] had the original idea of using this test in a study including 31 scheduled extubations in 28 children with croup. After extubation, 13% of children with an audible leak required reintubation vs. 38% of children without a leak. In adults, after first being described in a case reported by Potgieter [22], the cuff-leak test has been evaluated in several studies with methodological limitations and design differences. Firstly, the method, originally described by Miller and Cole [9], is not standardized. Few of the authors reported the inspiratory tidal volume, which can influence the amount of the leak. Other ventilatory parameters (compliance, inspiratory flow) that can influence the result of the test were not mentioned either [15]. Secondly, the leak was expressed in some studies as an absolute value (milliliters) and in others as a proportion. The predictive cutoff point also differed in each study. Thirdly, there were differences in the cohorts included. The study with the lowest incidence of post-extubation stridor [10] included patients in the cardiovascular surgery postoperative period with a short intubation time (median time: 13 h). Some of the studies included a proportion of females higher than that reported in studies on the epidemiology of mechanical ventilation [23]. In several studies, it was observed that female gender is a factor associated with a higher probability of post-extubation stridor [19, 24–26]. Lastly, the outcome evaluated in each study varies from stridor of any severity to reintubation secondary to upper airway obstruction. The lack of an operative and objective definition of stridor has led to significant differences in the incidence of upper airway obstruction among studies. In addition, except for three studies [8, 12, 17], the presence of laryngeal edema as a cause of the upper airway obstruction was not confirmed by an objective diagnostic test. In a recent study [27], it was shown that laryngeal ultrasound can be a reliable, non-invasive method for the evaluation of vocal cords, laryngeal morphology and the ease of airflow that passes through vocal cords or subglottic area because of laryngeal edema.

Our systematic review has some limitations. First, the number of eligible studies was small, especially for the prediction of reintubation, and this in turn limits the precision of the study. Second, the validity of the review is related to the quality of the included studies. It is significant that none of the studies could be given more than 10 points (cutoff point to define a good quality study) in the QUADAS score. Also, we have observed a significant publication bias that could overestimate the negative predictive value of the cuff-leak test.

In conclusion, in our systematic review of the cuff-leak test for prediction of upper airway obstruction, we have found significant statistical heterogeneity between studies. Despite these limitations, the presence of a positive cuff-leak test (absence of leak) should alert the clinician to a higher risk of upper airway obstruction and reintubation. On the other hand, the presence of a detectable leak has a low predictive value and does not rule out the occurrence of upper airway obstruction or the need for reintubation.