Elsevier

Neurocomputing

Volume 452, 10 September 2021, Pages 576-591

Automatic fluid segmentation in retinal optical coherence tomography images using attention based deep learning

https://doi.org/10.1016/j.neucom.2020.07.143

Abstract

Optical coherence tomography (OCT) is one of the most commonly used ophthalmic diagnostic techniques. Macular Edema (ME) is the swelling of the macular region in the eye. Segmenting the fluid regions in the retinal layers is an important step in detecting lesions, but manual segmentation is time-consuming and subjective. In this paper, an improved U-Net segmentation method is proposed. An attention mechanism is introduced to automatically locate the fluid region, which avoids the excessive computation of multi-stage methods. In addition, dense skip connections that combine high-level and low-level features make the segmentation results more precise. The loss function is a joint loss comprising weighted binary cross entropy loss, dice loss, and regression loss, where the regression loss is used to avoid merging multiple fluid regions into one. The experimental results show that the proposed method adapts to OCT scans acquired by various imaging devices and is more effective than other state-of-the-art fluid segmentation methods.

Introduction

Optical coherence tomography (OCT) [1] is a powerful imaging modality that can be used to acquire structural and molecular information about biological tissues. By using low coherence interferometry, OCT can noninvasively reconstruct high-resolution cross-sectional images from the backscattered spectra of biological samples [2]. It has been widely used in various fields, especially in biomedical applications such as ophthalmic, cardiovascular, gastrointestinal and lung imaging. For example, images of the stomach obtained through traditional excisional biopsy are of low quality, whereas endoscopic OCT can clearly depict the internal microstructure of gastrointestinal tissue. Ophthalmology was the first medical application of OCT technology [1], [3], and OCT has revolutionized the clinical practice of ophthalmology.

Macular Edema (ME) is the infiltration of fluid into the macular region due to disruptions of the blood-retinal barrier [4], and it is associated with retinal diseases such as age-related macular degeneration (AMD) and diabetic macular edema (DME) [5]. Fig. 1 shows some retinal images with manually labeled macular edema masks. Studies have shown that OCT signals correlate strongly with retinal histology and are highly useful for diagnosing ME caused by different diseases. Total retinal thickness measured from OCT images is widely used for diagnosing ME, and many methods have been proposed for layer segmentation in OCT images [6], [7], [8]. However, it has been shown that the retinal fluid volume can provide a more accurate indication of vascular permeability [9].

For the fluid segmentation task, manual segmentation is often a time-consuming and subjective process. Several automatic segmentation methods have been proposed in recent years. The earliest methods are based on basic image processing [10], [11], [12]. Wilkins et al. [11] proposed an automated segmentation method based on intensity thresholding and size-based criteria. Roychowdhury et al. [10] further improved the approach in [11] by categorizing the candidate regions into large, broken large and small cysts. The candidates are then filtered with different rules depending on the assigned category. These image processing methods struggle to handle medical images with different characteristics. Several machine learning-based segmentation methods have also been proposed. These methods typically transform the segmentation problem into a classification or regression task and include unsupervised and semi-supervised approaches, such as random forest classification [13], kernel regression [14] and fuzzy level-set methods [15], [16]. Compared with image processing-based methods, machine learning-based methods can achieve more accurate segmentation results because they can extract more features to design suitable classifiers or regression models.

Machine learning has been applied to many fields, such as pattern recognition [17], [18], [19], [20], [21], computer vision [22], [15], [23], [24], [25] and other applications [26], [27], [28], [29], [30], [31]. In recent years, deep learning, as a development of traditional neural network techniques [32], [33], has been successfully applied to medical image processing [34], [35], [36], [37], including the fluid region segmentation problem [38]. In our previous work [39], a patch-based CNN binary classifier was trained to distinguish fluid from its surrounding region. Lee et al. [40] proposed an automatic CNN-based segmentation method. Gopinath et al. [41] used a CNN to segment cystoid macular edema, followed by a post-processing step that uses clustering to refine the previously identified cystoid regions. Roy et al. [6] proposed a fully convolutional architecture named ReLayNet, which is formed by a series of encoder blocks relaying the intermediate feature representations to their matched decoder blocks through concatenation layers. Venhuizen et al. [42] proposed a fully convolutional neural network (FCNN) approach in which every pixel in the volume is analyzed and given a probability of belonging to the fluid region. It is composed of a cascade of two FCNNs with complementary tasks: the first extracts the region of interest, and the second segments the fluid regions. Both architectures are based on the U-Net, proposed by Ronneberger et al. [43] specifically for biomedical image segmentation. A typical U-Net is shown in Fig. 2. Similarly, Tennakoon et al. [44] proposed a deep neural network inspired by the U-Net architecture, adding a batch normalization layer and an adversarial network to encode higher-order relationships. This approach applied a preprocessing step to the dataset and a median filter to reduce speckle noise. Girish et al. [45] recently presented a U-Net based network that automatically captures both micro- and macro-level features to characterize the fluid structures. To enhance the robustness of the network, Hu et al. [5] proposed a stochastic atrous spatial pyramid pooling (sASPP) method to automatically segment SRF and PED lesions. These deep learning-based methods greatly improve the accuracy of the fluid region segmentation task.

However, the above deep learning-based methods have some shortcomings. Some of them need to train several networks to detect and segment fluid, which increases the complexity of training [42], [44]. In addition, most methods do not account for the small, independent fluid regions in macular edema images, so that small, closely located regions are improperly segmented as a single region.

In this paper, we propose an automatic fluid region segmentation approach based on a modified U-Net architecture. It extends our previous conference work [46] by updating the segmentation network architecture, using weighted binary cross entropy loss instead of the traditional cross entropy loss, and greatly expanding the experiments to systematically evaluate the correctness of the model. Generally, U-Net is a symmetrical network architecture comprising an encoder and a decoder. The encoder gradually reduces the spatial dimensions of the image (or features), and the decoder gradually recovers them. There is usually a skip connection between the encoder and the decoder, which helps the decoder preserve the details of the target. The traditional U-Net is trained by minimizing the cross-entropy loss. In contrast, we designed a modified U-Net architecture for fluid region segmentation that integrates an improved U-Net with two decoder paths and attention gates [47]. The first path, named Prob-path, predicts the probability segmentation result; the second, called Dist-path, predicts the corresponding distance map. Unlike the work in [42], we introduce attention gates to learn where to look for the fluid regions instead of using two networks. In addition, similar to RefineNet [48], we use dense skip connections in Prob-path to fuse the information lost during downsampling and produce a high-resolution prediction map. In this way, coarse high-level semantic features and fine-grained low-level features can be better utilized. To deal with the problem that small nearby fluid regions are wrongly merged, a regression loss is introduced and used in Dist-path. The segmentation path is trained to output a fluid region probability mask by minimizing the weighted binary cross entropy and dice losses. Fig. 3 shows the framework of the proposed segmentation network.
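
As a rough illustration of how such a joint objective can be assembled, the sketch below combines a weighted binary cross entropy term, a soft Dice term, and a regression term on the distance map. The function names, the positive-class weight `pos_weight`, the term weights `w_bce`, `w_dice` and `w_reg`, the Dice smoothing constant, and the choice of mean squared error for the regression loss are all assumptions made for illustration; the text above does not state the exact formulation.

```python
import math


def weighted_bce(probs, targets, pos_weight, eps=1e-7):
    """Mean binary cross entropy with fluid (positive) pixels up-weighted."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice coefficient between prediction and mask."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def regression_loss(pred_dist, true_dist):
    """Mean squared error between predicted and reference distance maps (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(pred_dist, true_dist)) / len(pred_dist)


def joint_loss(probs, targets, pred_dist, true_dist,
               pos_weight=5.0, w_bce=1.0, w_dice=1.0, w_reg=1.0):
    """Weighted sum of the three terms; all weights here are hypothetical."""
    return (w_bce * weighted_bce(probs, targets, pos_weight)
            + w_dice * dice_loss(probs, targets)
            + w_reg * regression_loss(pred_dist, true_dist))


# Flattened toy example: four pixels, two of them fluid.
targets = [1.0, 0.0, 0.0, 1.0]
true_dist = [1.0, 0.0, 0.0, 1.0]
loss = joint_loss([0.9, 0.2, 0.1, 0.6], targets, [0.8, 0.1, 0.0, 0.7], true_dist)
```

With a positive-class weight above 1, a missed fluid pixel is penalized more heavily than a false alarm of the same confidence, which is the usual way such a weight counteracts the small fraction of fluid pixels in an image.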

The main contributions of the paper include:

  • To learn where to look for the fluid regions, we introduce attention gates into the network architecture instead of using a two-step segmentation. The attention gates help the network determine where the fluid regions are, which avoids designing a second segmentation network to locate the suspicious areas.

  • Dense skip connections are used to fuse the information lost during downsampling. In this way, coarse high-level semantic features and fine-grained low-level features can be merged, which helps to achieve more accurate segmentation results.

  • To alleviate the problem of inappropriately merging small nearby fluid regions, besides predicting the traditional probability segmentation output, we introduce the Dist-path to predict a distance map, and a regression loss is used to train this path. The proposed network is trained by minimizing a joint loss function.
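
The intuition behind the distance map can be shown on a toy mask. The sketch below assumes that the distance map assigns each fluid pixel its city-block distance to the nearest background pixel; this definition, the function name `distance_map`, and the 4-connected breadth-first search are illustrative assumptions, not details taken from the paper.

```python
from collections import deque


def distance_map(mask):
    """City-block distance from each fluid pixel (1) to the nearest background pixel (0).

    Background pixels get 0. If the mask contains no background pixel at all,
    the fluid entries are left as None.
    """
    rows, cols = len(mask), len(mask[0])
    dist = [[0 if mask[r][c] == 0 else None for c in range(cols)] for r in range(rows)]
    # Multi-source breadth-first search started from every background pixel.
    queue = deque((r, c) for r in range(rows) for c in range(cols) if mask[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


# Two 3x3 fluid regions separated by a one-pixel background gap (column 4).
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
dist = distance_map(mask)
middle_row = dist[2]  # [0, 1, 2, 1, 0, 1, 2, 1, 0]
```

Each region peaks at its centre and the map falls to 0 in the gap. If a prediction filled the gap and merged the two regions, the same gap pixel would sit at distance 2 under this definition, so a regression loss on the distance map penalizes the merge more visibly than a per-pixel probability loss on a single mislabeled pixel.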

The rest of the paper is organized as follows. Section 2 reviews related work on fluid region segmentation. Section 3 describes the proposed method. Experimental results are shown in Section 4. A discussion is presented in Section 5, and we conclude the paper in Section 6.

Section snippets

Related work

In this section, we review some related work, including U-Net and attention mechanisms for image analysis.
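
To make the attention mechanism concrete, the sketch below computes an additive attention coefficient for a single pixel, in the form commonly used for attention gates: skip-connection features `x` and a coarser gating signal `g` are linearly projected, summed, passed through a ReLU, and reduced by a sigmoid to a scalar between 0 and 1 that rescales `x`. All weights, shapes, and helper names are hypothetical, and the exact gate used in the proposed network may differ.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def attention_coefficient(x, g, W_x, W_g, b, psi, b_psi):
    """alpha = sigmoid(psi . relu(W_x x + W_g g + b) + b_psi) for one pixel."""
    projected = zip(matvec(W_x, x), matvec(W_g, g), b)
    hidden = [max(0.0, px + pg + bias) for px, pg, bias in projected]
    score = sum(p * h for p, h in zip(psi, hidden)) + b_psi
    return 1.0 / (1.0 + math.exp(-score))


def gate(x, alpha):
    """Scale the skip-connection features by the attention coefficient."""
    return [alpha * xi for xi in x]


# Toy values (hypothetical): two skip-feature channels, one gating channel.
x = [1.0, 2.0]
g = [0.5]
W_x = [[1.0, 0.0], [0.0, 1.0]]
W_g = [[1.0], [1.0]]
alpha = attention_coefficient(x, g, W_x, W_g, b=[0.0, 0.0], psi=[1.0, 1.0], b_psi=-3.0)
gated = gate(x, alpha)
```

Pixels whose coefficient is close to 0 contribute little to the decoder, which is how a gate of this kind can suppress background and focus on candidate fluid regions without a separate localization network.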

The proposed method

In fluid segmentation, the fluid region occupies only a small area of the image, so the task suffers from class imbalance. We used cropped patches, weighted binary cross entropy loss and dice loss to alleviate the problem. Fig. 1 shows OCT retinal images with macular edema. To segment the fluid region automatically, we designed an improved U-Net-like network. The architecture of the proposed segmentation network is shown in Fig. 3. In this section, we present

Dataset

The data set used in this paper comes from the challenge [55]. We used the images containing fluid from the raw data set. The experimental data contain 500 scans from four OCT devices (Cirrus, Nidek, Spectralis, Topcon), with 57, 159, 53 and 231 scans respectively. In addition, we applied data augmentation (including rotation, shift and crop) to these images. Three quarters of them are used to train the network, and the remaining quarter is used for testing. During the

Discussion

We proposed a fluid segmentation method using attention based deep learning. To evaluate its performance, we compared it with four advanced fluid segmentation methods [39], [40], [44], [42]. As shown in Tables 2 and 3, the proposed method achieves the best results, and the method with the second-best results is the two-step strategy method [42]. Our method achieved higher AJI and Dice in the quantitative analysis; the statistical analysis indicated that the improvements are

Conclusions

The segmentation of the fluid regions in retinal OCT images is of great significance, as it can assist doctors in quickly diagnosing macular edema and in making the corresponding diagnostic decisions quantitatively. We propose an automatic fluid segmentation method for retinal OCT images using attention based deep learning. The proposed network consists of one encoder path and two decoder paths, which output a fluid probability map and a distance map respectively. We introduce the attention structure

CRediT authorship contribution statement

Xiaoming Liu: Conceptualization, Methodology, Supervision, Writing - review & editing. Shaocheng Wang: Software, Investigation, Writing - original draft. Ying Zhang: Writing - review & editing, Investigation. Dong Liu: Software, Writing - original draft. Wei Hu: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work is partially supported by the National Natural Science Foundation of China (No. 61403287, No. 61472293, No. 61572381), and the Natural Science Foundation of Hubei Province (No. 2014CFB288).

Xiaoming Liu received the Ph.D. degree from Zhejiang University, China, in 2007. From 2014 to 2015, he was a Visiting Scholar at the University of North Carolina at Chapel Hill, NC, USA. Currently, he is a full Professor with the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests include medical image processing, pattern recognition, and machine learning.

References (56)

  • D.-S. Huang et al., A new partitioning neural network model for recursively finding arbitrary roots of higher order arbitrary polynomials, Applied Mathematics and Computation (2005)
  • X. Liu et al., A new automatic mass detection method for breast cancer with false positive reduction, Neurocomputing (2015)
  • G. Litjens et al., A survey on deep learning in medical image analysis, Medical Image Analysis (2017)
  • J. Schlemper et al., Attention gated networks: Learning to leverage salient regions in medical images, Medical Image Analysis (2019)
  • D. Huang et al., Optical coherence tomography, Science (1991)
  • G. Trichonas, P.K. Kaiser, Optical coherence tomography imaging of macular oedema, British Journal of Ophthalmology 98...
  • M.F. Marmor, Mechanisms of fluid accumulation in retinal edema, in: Macular Edema, Springer, 35–45,...
  • J. Hu, Y. Chen, Z. Yi, Automated segmentation of macular edema in OCT using deep neural networks, Medical Image...
  • A.G. Roy et al., ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks, Biomedical Optics Express (2017)
  • X. Liu, D. Liu, T. Fu, K. Zhang, J. Liu, L. Chen, Shortest Path with Backtracking Based Automatic Layer Segmentation in...
  • X. Liu, T. Fu, Z. Pan, D. Liu, W. Hu, J. Liu, K. Zhang, Automated Layer Segmentation of Retinal Optical Coherence...
  • S.M. Waldstein et al., Correlation of 3-dimensionally quantified intraretinal and subretinal fluid with visual acuity in neovascular age-related macular degeneration, JAMA Ophthalmology (2016)
  • S. Roychowdhury, D.D. Koozekanani, S. Radwan, K.K. Parhi, Automated localization of cysts in diabetic macular edema...
  • G.R. Wilkins et al., Automated segmentation of intraretinal cystoid fluid in optical coherence tomography, IEEE Transactions on Biomedical Engineering (2012)
  • A. Lang et al., Automatic segmentation of microcystic macular edema in OCT, Biomedical Optics Express (2015)
  • S.J. Chiu et al., Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema, Biomedical Optics Express (2015)
  • J. Wang et al., Automated volumetric segmentation of retinal fluid on optical coherence tomography, Biomedical Optics Express (2016)
  • X.-F. Wang et al., A novel density-based clustering framework by using level set method, IEEE Transactions on Knowledge and Data Engineering (2009)

    Shaocheng Wang is currently a graduate student in the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests are medical image processing and machine learning.

    Ying Zhang received the master's degree from Wuhan University, China, in 2008. She is currently an associate chief physician with Wuhan Aier Eye Hospital, Wuhan, China.

    Dong Liu was a graduate student in the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests are medical image processing and machine learning.

    Wei Hu received his Ph.D. degree from Zhejiang University, China, in 2008. He is currently an associate Professor with the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests include system on chip and computer architecture.
