A general framework for wireless capsule endoscopy study synopsis

https://doi.org/10.1016/j.compmedimag.2014.05.011

Abstract

We present a general framework for the analysis of wireless capsule endoscopy (CE) studies. Currently available workstations impose a time-consuming and labor-intensive workflow on clinicians, who must inspect the full-length video. A computer-aided diagnosis (CAD) CE workstation has great potential to reduce diagnostic time and improve the accuracy of assessment. We propose a general framework based on hidden Markov models (HMMs) for study synopsis that forms the computational engine of our CAD workstation. Color, edge and texture features are first extracted and analyzed by a Support Vector Machine classifier, and the results are then encoded as the observations for the HMM, which incorporates temporal information into the assessment. Experiments were performed on 13 full-length CE studies, rather than on selected images as previously reported. The results (e.g. 0.933 accuracy with 0.933 recall for detection of polyps) show that our framework achieved promising performance for multi-class classification. We also report the patient-level CAD assessment of complete CE studies for multiple abnormalities, and the patient-level validation demonstrates the effectiveness and robustness of our methods.

Introduction

Wireless capsule endoscopy (CE) [10], [22] was invented to screen the gastrointestinal (GI) tract, especially the small bowel (previously not accessible non-invasively) using a simple outpatient test. It has significantly impacted the diagnostic approach for many diseases, such as bleeding, Crohn's and Celiac diseases, tumors, polyps, and other lesions [22].

Capsule endoscopy was introduced by Given Imaging Inc. in 2000, and over 1,000,000 Pillcam small bowel (SB) capsules alone have been swallowed in the 10 years since the device was first approved by the U.S. FDA. The CE device measures ϕ 11 mm × 26 mm and consists of an imaging sensor, associated optics, and communication electronics. An outpatient examination typically produces more than 50,000 images, which an expert reader must then examine manually and tediously by inspecting the full-length video. In addition to being time-consuming, this manual procedure cannot guarantee that every abnormality is detected, because abnormalities vary in size, position and characteristics and because clinicians differ in experience. Development of a computer-aided diagnosis tool for CE assessment is therefore desirable and necessary. Fig. 1 shows some examples of CE images. The image resolution is 576 × 576. Besides diagnostically relevant images (e.g. lesion (b) and polyp (c)), there are also a large number of images showing normal lumen (f), bile (a), air bubbles (d) and extraneous matter (e).

In this work, we propose a general framework to summarize CE videos into multiple classes. Fig. 2 shows the conceptual objective of CE study synopsis. A general hidden Markov model (HMM) is built on statistical classifiers that integrate multiple image appearance attributes (color, edge, and texture). The outputs of the underlying support vector machine classifier [5] are encoded as the binary observations of the HMM. The proposed method is a generative model rather than a discriminative one, so it can generate instances for clinical training and education. We have evaluated this composite framework by performing video synopsis on a database of complete CE videos, in which study images were summarized into the six most commonly seen classes: normal images, lesions, polyps, air bubbles, bile, and other extraneous matter. In contrast to prior art, where image-based performance measures are reported, we describe patient-level validation of this general model to verify its effectiveness and robustness.
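
As a rough illustration of how classifier outputs could become binary HMM observations, the sketch below thresholds one SVM decision value per feature type (color, edge, texture) and packs the resulting bits into a single discrete symbol. The threshold, the bit-packing scheme, the example scores, and the function names are assumptions made for illustration; the text here does not specify the exact encoding.

```python
from typing import List, Sequence


def encode_binary_observation(decision_values: Sequence[float],
                              threshold: float = 0.0) -> List[int]:
    """Threshold per-feature SVM decision values into a binary vector.

    Assumption: one decision value per feature type, positive meaning
    "the classifier fired" for that feature.
    """
    return [1 if value > threshold else 0 for value in decision_values]


def observation_symbol(bits: Sequence[int]) -> int:
    """Pack a binary vector into one discrete symbol index.

    The first bit becomes the most significant bit, so three bits map
    to a symbol in the range 0..7.
    """
    symbol = 0
    for bit in bits:
        symbol = (symbol << 1) | bit
    return symbol


# Hypothetical decision values for one frame: color, edge, texture.
frame_scores = [0.8, -0.3, 1.2]
bits = encode_binary_observation(frame_scores)
symbol = observation_symbol(bits)
print(bits, symbol)
```

With three feature types this yields an alphabet of eight symbols, which keeps a discrete HMM emission table small; a different encoding would change only these two helper functions.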

Our main contributions lie in: (a) the proposal of a general framework based on HMM integrating temporal information; (b) investigation of multi-class study synopsis; (c) development of a CAD workstation for automated CE assessment; and (d) validation on complete CE videos providing great potential for direct clinical application.

Related work

Prior related work mainly focused on statistical analysis for detection of various individual abnormalities, with bleeding [18], [13], [25] being the main focus. More recently, Yi et al. [25] introduced clinically viable software for automated GI tract bleeding detection and classification. The major functional modules included a graph-based segmentation algorithm, specific feature selection and validation, and cascade classification. The method focused on a single abnormality, bleeding, while

Methods

The information flow of our framework is shown in Fig. 3. The supervised framework consists of model training and validation stages. For model training, we perform feature selection and frame-based classification, and build an HMM for each class. This extensible framework can be applied to both multi-class image sequence analysis and study synopsis, which are described in detail in this section.
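
To make the idea of one HMM per class more concrete, the following minimal sketch scores a sequence of binary observations under each class model with the scaled forward algorithm and picks the class with the highest log-likelihood. It is an illustration under stated assumptions, not the authors' implementation: the two-state topology, all probabilities, the class names, and the names `DiscreteHMM` and `classify_sequence` are hypothetical, and parameter training (for example with Baum-Welch) is omitted.

```python
import math
from typing import Dict, List, Sequence


class DiscreteHMM:
    """Discrete-observation HMM with fixed, already-trained parameters."""

    def __init__(self, start: List[float], trans: List[List[float]],
                 emit: List[List[float]]) -> None:
        self.start = start  # start[i]: probability of starting in state i
        self.trans = trans  # trans[i][j]: probability of moving from i to j
        self.emit = emit    # emit[i][k]: probability of symbol k in state i

    def log_likelihood(self, obs: Sequence[int]) -> float:
        """Return log P(obs | model) using the scaled forward algorithm."""
        if not obs:
            return 0.0
        n = len(self.start)
        alpha = [self.start[i] * self.emit[i][obs[0]] for i in range(n)]
        log_prob = 0.0
        for t in range(len(obs)):
            if t > 0:
                alpha = [
                    sum(alpha[i] * self.trans[i][j] for i in range(n))
                    * self.emit[j][obs[t]]
                    for j in range(n)
                ]
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")  # sequence impossible under this model
            log_prob += math.log(scale)
            alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        return log_prob


def classify_sequence(models: Dict[str, DiscreteHMM],
                      obs: Sequence[int]) -> str:
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(obs))


# Hypothetical two-state models over binary symbols (0 or 1).
sticky = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "normal": DiscreteHMM([0.5, 0.5], sticky, [[0.9, 0.1], [0.8, 0.2]]),
    "polyp": DiscreteHMM([0.5, 0.5], sticky, [[0.2, 0.8], [0.1, 0.9]]),
}
print(classify_sequence(models, [1, 1, 0, 1, 1]))
print(classify_sequence(models, [0, 0, 0, 1, 0]))
```

Because the decision uses the likelihood of the whole sequence, a single frame whose observation disagrees with its neighbors does not by itself flip the class, which is one way temporal context can help over frame-by-frame classification.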

Experiments

The current Johns Hopkins University CE archive contains 75 full-length CE studies (approximately 50,000 images each) and over 120,000 annotated images collected under a Johns Hopkins Medical Institutions (JHMI) Institutional Review Board (IRB) protocol. The selected images in these studies were reviewed and annotated by our expert clinical collaborators. The annotation procedure was extremely time-consuming and also required validation for internal consistency and rater bias. The image

Discussion

A framework for CE study analysis via video synopsis has been proposed. It can be applied to image sequence classification and video synopsis (labeling/annotating).

In the experiments, sequence-based classification outperformed frame-based classification in terms of accuracy and recall for each class. For normal lumen, the precision of frame-based classification was higher than that of sequence-based classification. This may be caused by the large number of normal lumen images compared to other classes resulting

Conclusions

We describe the general framework for the computational engine of our CAD workstation, which is currently in development. It differs from previous work in both model building and application. Encoded color, edge and texture feature vectors are treated as the HMM observations, which is consistent with the way human readers assess images. Experiments performed on complete CE videos achieved promising results. The CE videos were summarized into six classes, which include two abnormalities, i.e. lesions and polyps. We

References (31)

  • L. Alexandre et al. Color and Position versus Texture Features for Endoscopic Polyp Detection
  • M. Bashar et al. Detecting Informative Frames from Wireless Capsule Endoscopic Video Using Color and Texture Features
  • S. Bejakovic et al. Analysis of Crohn's disease lesions in capsule endoscopy images
  • J. Boreczky et al. A hidden Markov model framework for video segmentation using audio and image features
  • N. Cristianini et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods (2000)
  • J. Cunha et al. Automated topographic segmentation and transit time estimation in endoscopic capsule exams. IEEE Trans Med Imaging (2008)
  • V. Hai et al. Adaptive Control of Video Display for Diagnostic Assistance by Analysis of Capsule Endoscopic Images
  • T.M. Htwe et al. Vision-based techniques for efficient wireless capsule endoscopy examination
  • D. Iakovidis et al. Efficient homography-based video visualization for wireless capsule endoscopy
  • G. Iddan et al. Wireless capsule endoscopy. Nature (2000)
  • R. Kumar et al. Learning disease severity for capsule endoscopy images
  • R. Kumar et al. Assessment of Crohn's disease lesions in wireless capsule endoscopy images. IEEE Trans Biomed Eng (2012)
  • P. Lau et al. Detection of bleeding patterns in WCE video using multiple features
  • H. Liu et al. Wireless capsule endoscopy video reduction based on camera motion estimation. J Digit Imaging (2013)
  • M. Mackiewicz et al. Wireless capsule endoscopy color video segmentation. IEEE Trans Med Imaging (2008)

    Qian Zhao was a visiting graduate student at the Visual Imaging and Surgical Robotics Laboratory at JHU during this work. This work was supported by Dr. Kumar's discretionary funds and the HK ITF project #6902928 (PI: Max Q.-H. Meng).
