Inter-rater reliability of a novel objective endpoint for benign central airway stenosis interventions: Segmentation-based volume rendering of computed tomography scans

  • Ankush P. Ratwani,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    ankush.ratwani@vumc.org

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Heidi Chen,

    Roles Formal analysis, Methodology

    Affiliation Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Leah Brown,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Evan A. Schwartz,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Medicine, Division of Pulmonary, Allergy and Critical Care Medicine, Duke University School of Medicine, Durham, NC, United States of America

  • Khushbu Patel,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Adam Guttentag,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Radiology and Radiological Science, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Thomas A. McLaren,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Radiology and Radiological Science, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Kim L. Sandler,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Radiology and Radiological Science, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Otis B. Rickman,

    Roles Writing – review & editing

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Samira Shojaee,

    Roles Conceptualization, Writing – review & editing

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Robert J. Lentz,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

  • Fabien Maldonado

    Roles Conceptualization, Investigation, Methodology, Supervision, Visualization, Writing – review & editing

    Affiliation Department of Medicine, Division of Allergy, Pulmonary and Critical Care, Vanderbilt University Medical Center, Nashville, TN, United States of America

Abstract

Objectives

To evaluate the reliability of a novel segmentation-based volume rendering approach for quantification of benign central airway obstruction (BCAO).

Design

A retrospective single-center cohort study.

Setting

Data were ascertained using electronic health records at a tertiary academic medical center in the United States.

Participants and inclusion

Patients with airway stenosis located within the trachea on two-dimensional (2D) computed tomography (CT) imaging and documentation of suspected benign etiology were included. Four readers with varying expertise in quantifying tracheal stenosis severity were selected to manually segment each CT using a volume rendering approach with the available free tools in the medical imaging viewing software OsiriX (Bernex, Switzerland). Three expert thoracic radiologists were recruited to quantify the same CTs using traditional subjective methods on a continuous and categorical scale.

Outcome measures

The interrater reliability for continuous variables was calculated by the intraclass correlation coefficient (ICC) using a two-way mixed model with 95% confidence intervals (CI).

Results

Thirty-eight patients met the inclusion criteria, and fifty CT scans were selected for measurement. The most common etiology of BCAO was iatrogenic in 22 patients (58%). There was an even distribution of chest and neck CT imaging within our cohort. The average ICC across all four readers for the volume rendering approach was 0.88 (95% CI, 0.84 to 0.93), suggesting good to excellent agreement. The average ICC for thoracic radiologists for subjective methods on the continuous scale was 0.38 (95% CI, 0.20 to 0.55), suggesting poor to fair agreement. The kappa for the categorical approach was 0.26, suggesting a slight to fair agreement amongst the raters.

Conclusion

In this retrospective cohort study, agreement was good to excellent for raters with varying expertise in airway cross-sectional imaging using a novel segmentation-based volume rendering approach to quantify BCAO. This proposed measurement outperformed our expert thoracic radiologists using conventional subjective grading methods.

Introduction

Benign central airway obstruction (BCAO) comprises a complex and multifactorial set of conditions [1]. Patients typically present with signs and symptoms of airflow limitation (dyspnea, cough, wheezing, stridor). However, given the relatively late onset of symptoms, up to half of patients present in respiratory distress [2]. The most common etiology is post-traumatic from prolonged endotracheal intubation or tracheostomy [3]. With the recent worldwide pandemic of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), there have been reports of increasing BCAO cases following prolonged intubation [4, 5] with an expected increase in the coming years.

The burden of BCAO on patients and the healthcare system has recently been examined [6–9]. One study examining healthcare records reported that patients with tracheal stenosis from prolonged intubation had an increased hospital stay (6.3 days; 95% CI, 6.0 to 6.3), in addition to an increase in hospital costs ($10,375; 95% CI, $9762 to $10,988). Another study reported that patients with post-intubation tracheal stenosis (PITS) who underwent nonsurgical treatments (Montgomery T-tube, silicone stent, or tracheostomy) had a decreased quality of life in the domains of physical limitation and bodily pain, as well as increased emotional distress, following the procedure. Thus, early identification and shared decision-making regarding management are vital to prevent further patient morbidity.

However, during the last decade, comparative effectiveness studies evaluating novel therapeutic interventions for BCAO have been lacking, with management currently based on expert opinion, small retrospective cohort studies, and case reports [10–12]. In addition, these studies have primarily used conventional subjective grading and classification systems as endpoints of disease recurrence, making interpretation of the results challenging due to uncertain reliability. The field urgently needs more objective methods to assess airway luminal narrowing that are reliable and not overly complex.

OsiriX (Bernex, Switzerland) is a Digital Imaging and Communications in Medicine (DICOM) viewer with the ability to perform advanced post-processing techniques on two-dimensional (2D) computed tomography (CT) scans. In recent years, the ability of the software to create three-dimensional (3D) reconstructions of solid organs with volume rendering and segmentation-based techniques that are free of charge has generated excitement for preoperative planning and trainee simulation [13–15]. However, its use within the trachea has not been well described. Currently, it is being explored as an objective endpoint to quantify stenosis recurrence in an ongoing pilot randomized clinical trial (NCT04996173) evaluating the utility of adding spray cryotherapy to standard-of-care interventions in BCAO.

The objectives of our study were twofold. First, we sought to evaluate the reliability of this novel objective manual segmentation-based volume rendering approach to quantify BCAO in raters with varying expertise in cross-sectional airway imaging. Second, we aimed to evaluate agreement amongst expert thoracic radiologists using the conventional, subjective methods for assessing stenosis severity currently used in clinical practice and research.

Methods

Study subjects

We utilized an ongoing interventional pulmonary procedural database at Vanderbilt University Medical Center (VUMC) to identify patients for inclusion. Demographic data, including age, sex, smoking status, etiology of stenosis, CT imaging type (chest or neck), and axial slice thickness, were ascertained from the electronic health record (EHR). We included only patients with airway stenosis localized to the trachea and documentation of a suspected benign etiology within six months of the selected CT scan. The deidentified data were collected and managed using the Research Electronic Data Capture (REDCap) system [16, 17]. This study was approved by our local institutional review board (IRB #211567).

CT image acquisition, segmentation, and volumetric analysis

All images were uploaded from the local picture archiving and communication system (PACS) to version 12.0 of OsiriX. After reviewing multiple sets of imaging before collection, a 3 cm measurement was felt to adequately capture the entire length of a stenotic segment in 95% of patients. The nadir stenosis point was identified in the soft tissue window and marked in the sagittal plane. We then measured 1.5 cm above and below this point. The airway lumen boundaries were then circumferentially marked using the closed polygon tool in the axial window. Four additional levels were manually segmented: the proximal end, one-third down the stenotic segment, two-thirds down the stenotic segment, and the most distal end (Fig 1). Finally, using the built-in repulsor function, the boundaries of the missing segments were manually adjusted to achieve a luminal fit. The volumetric reconstruction was then generated; it can be manipulated in 3D space and is displayed together with the resulting volume measurement (Fig 2).

Fig 1. The trachea is manually segmented for 3 cm along the superior-inferior axis in the axial view.

(A) Represents the most proximal segmentation. (B) Shows segmentation at the focal nadir point of stenosis. (C) Shows the two-thirds segmentation point. (D) Represents the most distal portion of the stenotic segment to be measured.

https://doi.org/10.1371/journal.pone.0290393.g001

Fig 2. The final 3D reconstruction of the trachea based on the measured regions of interest.

The resultant volume is displayed in cm³ with surrounding point statistics.

https://doi.org/10.1371/journal.pone.0290393.g002
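The arithmetic behind a contour-based volume can be illustrated with a minimal sketch. It assumes each axial contour is a closed polygon with vertices in centimetres and that the volume is approximated by trapezoidal integration of cross-sectional area along the airway axis. The function names and the integration rule are assumptions made for illustration; the algorithm OsiriX uses internally is not described in this article.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in cm within one axial slice


def polygon_area_cm2(vertices: Sequence[Point]) -> float:
    """Area of a closed polygon via the shoelace formula (vertices in cm)."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a closed polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def lumen_volume_cm3(areas_cm2: Sequence[float], z_cm: Sequence[float]) -> float:
    """Trapezoidal integration of lumen area over axial position.

    Assumption: a simple stand-in for a volume-rendering computation,
    not the method implemented by any particular DICOM viewer.
    """
    if len(areas_cm2) != len(z_cm) or len(z_cm) < 2:
        raise ValueError("need matching areas and positions for >= 2 slices")
    volume = 0.0
    for i in range(len(z_cm) - 1):
        mean_area = 0.5 * (areas_cm2[i] + areas_cm2[i + 1])
        volume += mean_area * (z_cm[i + 1] - z_cm[i])
    return volume


# Hypothetical 3 cm segment: proximal end, one-third, nadir, two-thirds, distal end.
z_positions = [0.0, 1.0, 1.5, 2.0, 3.0]
areas = [2.25, 1.0, 0.5, 1.0, 2.25]  # made-up lumen areas in cm^2
volume = lumen_volume_cm3(areas, z_positions)
```

With these made-up areas the sketch returns 4.0 cm³. In practice the interpolated contours between the marked levels would contribute additional slices to the sum.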

Interrater reliability

To assess the reliability of this novel endpoint, we identified four clinicians with different levels of expertise in interpreting airway cross-sectional imaging and tested the ability of this approach to quantify airway stenosis severity. At the time of data collection, observer 1 (AR) was a pulmonologist, observers 2 (ES) and 3 (LB) were medicine residents, and observer 4 (KP) was a radiologist. AR gave each observer a thirty-minute introduction to the software with training on measurement and rendering of a test image. To ensure consistency of measurements, AR marked the nadir point of stenosis on each image as a reference. A screen recording was also available to all readers throughout the study. All readers were expected to start measuring within 24 hours of the training and to finish within two weeks.

To compare the reproducibility of our segmentation-based approach with more subjective quantification methods, we recruited three expert thoracic radiologists (AG, KS, and TM) to read the same CT images and give their opinion of stenosis severity on both a continuous scale from 0–100% and a categorical scale using the Cotton-Myer grading system (grade 1: 0–50%; grade 2: 51–70%; grade 3: 71–99%; grade 4: no detectable lumen). In contrast to the objective methods, no nadir point of stenosis was pre-identified in the subjective approach.
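The categorical scale can be expressed as a small mapping from percent obstruction to grade. This is a minimal sketch using the cut-offs quoted above; the function name is hypothetical, and the handling of non-integer values between the stated ranges (for example, 50.5%) is an assumption, since the grading system lists whole-number ranges only.

```python
def cotton_myer_grade(percent_obstruction: float) -> int:
    """Map percent luminal obstruction (0-100) to a Cotton-Myer grade (1-4).

    Assumption: values falling between the published integer ranges are
    assigned to the higher grade (e.g. 50.5% is treated as grade 2).
    """
    if not 0.0 <= percent_obstruction <= 100.0:
        raise ValueError("percent obstruction must lie between 0 and 100")
    if percent_obstruction <= 50.0:
        return 1  # grade 1: 0-50%
    if percent_obstruction <= 70.0:
        return 2  # grade 2: 51-70%
    if percent_obstruction < 100.0:
        return 3  # grade 3: 71-99%
    return 4      # grade 4: no detectable lumen


# Two readers who differ by a single percentage point can land in different grades.
grades = [cotton_myer_grade(p) for p in (50, 51, 70, 71)]
```

The last line shows why a categorical scale is sensitive to small differences in visual estimates near a cut-off: estimates of 50% and 51% fall into grades 1 and 2 respectively.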

Statistical analysis

Descriptive statistics are presented, including means, medians, interquartile ranges (IQR), standard deviations, and ranges for continuous parameters and percentages and frequencies for categorical parameters. The interrater reliability for continuous variables was calculated by the intraclass correlation coefficient (ICC) using a two-way mixed model with 95% confidence intervals (CI). Fleiss’s kappa was used to determine the reliability between groups of categorical variables. Guidelines for the interpretation and reporting of ICC and kappa have been previously described [18, 19]. No correction was made for missing data. All analyses were performed by an independent statistician using R software, version 4.2.0 (R Foundation for Statistical Computing, www.r-project.org).
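For readers unfamiliar with these statistics, the following minimal sketch shows how both can be computed from first principles. It is not the analysis code used in this study, which was run in R. The ICC form is an assumption: the text specifies a two-way mixed model but not whether the single-measure or average-measure, consistency or absolute-agreement form was used, so the sketch implements the single-measure consistency form, often written ICC(3,1). Confidence intervals are omitted and all function names are hypothetical.

```python
from typing import Sequence


def icc_3_1(ratings: Sequence[Sequence[float]]) -> float:
    """Single-measure consistency ICC from a two-way ANOVA decomposition.

    ratings[i][j] is the value given to subject i by rater j.
    Assumption: the ICC(3,1) form; the article does not state which form was used.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss's kappa; counts[i][j] = number of raters placing subject i in category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Overall proportion of assignments falling in each category.
    p_cat = [sum(row[j] for row in counts) / (n_subjects * n_raters)
             for j in range(n_categories)]
    # Observed agreement per subject, then averaged.
    p_subj = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1.0 - p_expected)
```

In the consistency form, raters who differ only by a constant offset still yield an ICC of 1, whereas an absolute-agreement form would penalize that offset; the choice therefore matters when comparing reported values across studies.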

Results

Thirty-eight patients met the inclusion criteria, contributing fifty CT scans obtained between 2009 and 2021. Twenty-two (58%) were labeled iatrogenic (post-intubation or post-tracheostomy), 10 (26%) idiopathic, and the remaining six were due to other suspected etiologies (Table 1). Twenty-seven (71%) patients were women and most were never-smokers (68%). The resolution of the scans ranged from 1 to 5 mm, with a median of 2 mm (IQR, 1.25 to 3). Our cohort had an even distribution of CT neck and chest imaging. The calculated airway luminal volume means and standard deviations for each rater are displayed in Table 2. The average ICC across all four readers was 0.88 (95% CI, 0.84 to 0.93), suggesting good to excellent agreement. The average ICC for the thoracic radiologists on the continuous grading scale for the subjective approach was 0.38 (95% CI, 0.20 to 0.55), suggesting poor to fair agreement. The average Fleiss kappa for the categorical Cotton-Myer grading system was 0.26, suggesting slight to fair agreement amongst raters.

Table 1. Baseline characteristics with median (IQR) for continuous variables and number of patients with relative frequencies (%) for categorical variables.

† Iatrogenic includes post-intubation and post-tracheostomy stenosis. § Idiopathic includes no formal etiology given within a six-month time frame of the subject CT.

https://doi.org/10.1371/journal.pone.0290393.t001

Table 2. Mean and standard deviation (SD) of the generated tracheal volumes for each rater over all CTs.

https://doi.org/10.1371/journal.pone.0290393.t002

Discussion

In this retrospective cohort study, we show that clinicians with varying expertise in airway cross-sectional imaging have good to excellent agreement when using a novel, objective segmentation-based volume approach to quantifying BCAO. Further, we demonstrate that when a group of expert thoracic radiologists evaluates the same images using traditional, more subjective approaches, they have poor overall agreement with both continuous and categorical measures.

Existing classification and grading systems [18–21] for BCAO are almost never used in clinical practice and only inconsistently in research. We believe the reasons include 1) heavy reliance on a subjective interpretation of airway luminal narrowing, 2) poor external validation limiting generalizability, 3) lack of reproducibility, 4) poor standardization across and within specialties, and 5) poor correlation with physiological markers of disease activity and patient-reported outcomes (PRO). For example, changes in peak expiratory flow, which has been previously shown to predict disease recurrence in patients with BCAO, were recently shown to have a poor correlation with stenosis severity using the Cotton-Myer classification system with an overall kappa of 0.37 [22, 23].

For several reasons, classification and grading systems for BCAO that rely heavily on subjective evaluation measures prove to be the most problematic. First, they may not correctly characterize complex lesions, such as lesions with a more significant vertical extent (≥ 1 cm in length), those that invade surrounding cartilaginous structures, and those with dynamic collapse from underlying malacia. Second, reproducibility amongst providers is challenging because luminal narrowing is often "eye-balled," making it difficult to judge when a transition between grades occurs (e.g., 49% versus 50% stenosis). Finally, most systems that use a visual grading of stenosis require direct visualization with inherent procedural and anesthesia risks.

We believe that an approach using volumetric assessment to quantify airway stenosis from readily available 2D CT imaging may address some of these challenges. By visualizing a stenotic segment in multiple dimensions, one can get a more comprehensive sense of the lesion’s vertical and structural extent. This can prove advantageous in decision-making regarding early referral for surgical resection, as this has been shown to be the most definitive treatment in patients with BCAO [24] who are suitable candidates. Additionally, objective data following an endoscopic therapeutic intervention allow for better identification of disease recurrence and the need for a repeat procedure. Finally, little expertise in measurement is required, as we have shown that readers with minimal training and expertise in airway cross-sectional imaging were able to reach strong agreement. This contrasts with our radiologists’ subjective grading, highlighting the challenges with current approaches in everyday practice.

This study has several notable strengths. Multiple etiologies of BCAO were included, supporting the generalizability of these findings. Our reported ICC suggests that this novel measurement is reliable regardless of the underlying type of CT performed (chest or neck). The patients in our cohort had a variety of stenotic lengths and severity of luminal narrowing, highlighting the ability of our readers to agree on lesions of different complexities. The ability to analyze these images using the free-of-charge tools in OsiriX with relatively brief training suggests that this approach could be widely adopted with minimal cost or effort. However, it is essential to consider the inherent trade-off of increased time required to perform such measurements compared to traditional subjective methods.

Limitations of this study include a modest sample size and testing limited to the confines of the trachea. The a priori identification of the point of nadir stenosis may have introduced bias, improving recognition of the stenotic area. However, this approach was chosen to minimize ambiguity in identifying structural abnormalities and prioritize accurate measurements. As the dataset was retrospective, direct correlation of tracheal volume with stenosis severity or quality of life measures was not possible. Future research could explore establishing thresholds or criteria for tracheal volume indicative of stenosis severity or its impact on quality of life. Additionally, the study was limited in assessing dynamic imaging or individuals with underlying malacia.

In conclusion, we report the reliability of a novel objective measure for quantifying airway stenosis based on a straightforward volume rendering approach in the open-source medical imaging viewer OsiriX. This measure holds promise as an objective research endpoint for assessing airway luminal narrowing and may serve as an accurate assessment of disease recurrence in studies testing new therapeutic interventions in BCAO.

References

  1. Ratwani AP, Davis A, Maldonado F. Current practices in the management of central airway obstruction. Curr Opin Pulm Med. 2021. pmid:34720097
  2. Holden VK, Channick CL. Management of benign central airway obstruction. AME Med J. 2018;3: 76–76.
  3. Farzanegan R, Farzanegan B, Zangi M, Golestani Eraghi M, Noorbakhsh S, Doozandeh Tabarestani N, et al. Incidence Rate of Post-Intubation Tracheal Stenosis in Patients Admitted to Five Intensive Care Units in Iran. Iran Red Crescent Med J. 2016;18: e37574. pmid:28144465
  4. Dorris ER, Russell J, Murphy M. Post-intubation subglottic stenosis: aetiology at the cellular and molecular level. Eur Respir Rev. 2021;30. pmid:33472959
  5. Beyoglu MA, Sahin MF, Turkkan S, Yazicioglu A, Yekeler E. Complex Post-intubation Tracheal Stenosis in Covid-19 Patients. Indian J Surg. 2022;84: 805–813. pmid:35818393
  6. Johnson RF, Saadeh C. Nationwide estimations of tracheal stenosis due to tracheostomies. Laryngoscope. 2019;129: 1623–1626. pmid:30569511
  7. Spataro E, Durakovic N, Kallogjeri D, Nussenbaum B. Complications and 30-day hospital readmission rates of patients undergoing tracheostomy: A prospective analysis. Laryngoscope. 2017;127: 2746–2753. pmid:28543108
  8. Bhatti NI, Mohyuddin A, Reaven N, Funk SE, Laeeq K, Pandian V, et al. Cost analysis of intubation-related tracheal injury using a national database. Otolaryngol Head Neck Surg. 2010;143: 31–36. pmid:20620616
  9. Bibas BJ, Cardoso PFG, Salati M, Minamoto H, Luiz Tamagno MF, Terra RM, et al. Health-related quality of life evaluation in patients with non-surgical benign tracheal stenosis. J Thorac Dis. 2018;10: 4782–4788. pmid:30233850
  10. Bhora FY, Ayub A, Forleiter CM, Huang C-Y, Alshehri K, Rehmani S, et al. Treatment of Benign Tracheal Stenosis Using Endoluminal Spray Cryotherapy. JAMA Otolaryngol Head Neck Surg. 2016;142: 1082–1087. pmid:27532803
  11. Nouraei SAR, Obholzer R, Ind PW, Salama AD, Pusey CD, Porter F, et al. Results of endoscopic surgery and intralesional steroid therapy for airway compromise due to tracheobronchial Wegener’s granulomatosis. Thorax. 2008;63: 49–52. pmid:17573443
  12. Lee HJ, Labaki W, Yu DH, Salwen B, Gilbert C, Schneider ALC, et al. Airway stent complications: the role of follow-up bronchoscopy as a surveillance method. J Thorac Dis. 2017;9: 4651–4659. pmid:29268534
  13. Yao F, Wang J, Yao J, Hang F, Lei X, Cao Y. Three-dimensional image reconstruction with free open-source OsiriX software in video-assisted thoracoscopic lobectomy and segmentectomy. Int J Surg. 2017;39: 16–22. pmid:28115296
  14. Bruneau M, Kamouni R, Schoovaerts F, Pouleau H-B, De Witte O. Simultaneous Image-Guided Skull Bone Tumor Resection and Reconstruction With a Preconstructed Prosthesis Based on an OsiriX Virtual Resection. Oper Neurosurg (Hagerstown). 2015;11: 484–490. pmid:29506160
  15. Spiriev T, Nakov V, Laleva L, Tzekov C. OsiriX software as a preoperative planning tool in cranial neurosurgery: A step-by-step guide for neurosurgical residents. Surg Neurol Int. 2017;8: 241. pmid:29119039
  16. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: Building an international community of software platform partners. J Biomed Inform. 2019;95: 103208. pmid:31078660
  17. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42: 377–381. pmid:18929686
  18. Myer CM 3rd, O’Connor DM, Cotton RT. Proposed grading system for subglottic stenosis based on endotracheal tube sizes. Ann Otol Rhinol Laryngol. 1994;103: 319–323. pmid:8154776
  19. Lano CF Jr, Duncavage JA, Reinisch L, Ossoff RH, Courey MS, Netterville JL. Laryngotracheal reconstruction in the adult: a ten year experience. Ann Otol Rhinol Laryngol. 1998;107: 92–97. pmid:9486901
  20. Freitag L, Ernst A, Unger M, Kovitz K, Marquette CH. A proposed classification system of central airway stenosis. Eur Respir J. 2007;30: 7–12. pmid:17392320
  21. Filauro M, Mazzola F, Missale F, Canevari FR, Peretti G. Endoscopic Preoperative Assessment, Classification of Stenosis, Decision-Making. Front Pediatr. 2019;7: 532. pmid:31970144
  22. Song SA, Santeerapharp A, Choksawad K, Franco RA Jr. Reliability of peak expiratory flow percentage compared to endoscopic grading in subglottic stenosis. Laryngoscope Investig Otolaryngol. 2020;5: 1133–1139. pmid:33364404
  23. Kimura K, Du L, Berry LD, Huang L-C, Chen S-C, Francis DO, et al. Modeling Recurrence in Idiopathic Subglottic Stenosis With Mobile Peak Expiratory Flow. Laryngoscope. 2021;131: E2841–E2848. pmid:34309022
  24. Hoffman MR, Patro A, Huang L-C, Chen S-C, Berry LD, Gelbard A, et al. Impact of Adjuvant Medical Therapies on Surgical Outcomes in Idiopathic Subglottic Stenosis. Laryngoscope. 2021. pmid:34117778