Automatic fluid segmentation in retinal optical coherence tomography images using attention based deep learning
Introduction
Optical coherence tomography (OCT) [1] is a powerful imaging modality that can acquire structural and molecular information from biological tissues. Using low-coherence interferometry, OCT can noninvasively reconstruct high-resolution cross-sectional images from the backscattered spectra of biological samples [2]. It has been widely used in many fields, especially in biomedical applications such as ophthalmic, cardiovascular, gastrointestinal and lung imaging. Whereas traditional excisional biopsy yields low-quality images of the stomach, endoscopic OCT can clearly depict the internal microstructure of gastrointestinal tissue. Ophthalmology was the first medical application of OCT [1], [3], and the technology has revolutionized clinical ophthalmic practice.
Macular edema (ME) is the infiltration of fluid into the macular region due to disruptions of the blood-retinal barrier [4]; it is associated with retinal diseases such as age-related macular degeneration (AMD) and diabetic macular edema (DME) [5]. Fig. 1 shows some retinal images with manually labeled macular edema masks. Studies have shown that the OCT signal correlates strongly with retinal histology and is highly useful for diagnosing ME caused by different diseases. Total retinal thickness measured from OCT images is widely used for diagnosing ME, and many methods have been proposed for layer segmentation in OCT images [6], [7], [8]. However, it has been shown that the retinal fluid volume provides a more accurate indication of vascular permeability [9].
For the fluid segmentation task, manual segmentation is a time-consuming and subjective process. Several automatic segmentation methods have been proposed in recent years. The earliest methods are based on basic image processing [10], [11], [12]. Wilkins et al. [11] proposed an automated segmentation method based on intensity thresholding and size-based criteria. Roychowdhury et al. [10] improved the approach in [11] by categorizing candidate regions into large, broken-large and small cysts, and filtering the candidates with different rules depending on the assigned category. Such image processing methods struggle with medical images of differing characteristics. Several machine learning-based segmentation methods have also been proposed. These typically transform the segmentation problem into a classification or regression task, and include unsupervised or semi-supervised approaches such as random forest classification [13], kernel regression [14] and fuzzy level-set methods [15], [16]. Compared with image processing-based methods, machine learning-based methods can achieve more accurate segmentation because they can extract richer features with which to design suitable classifiers or regression models.
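As a concrete illustration of the early image-processing approach, the following sketch combines intensity thresholding with a size-based criterion, in the spirit of [11]; the threshold and minimum-size values are illustrative assumptions, not the published parameters.

```python
import numpy as np
from scipy import ndimage

def threshold_segment(img, thresh=0.5, min_size=5):
    """Candidate fluid extraction by intensity thresholding, followed by
    removal of connected components below a size criterion.
    `thresh` and `min_size` are illustrative, not the values from [11]."""
    candidates = img < thresh                      # fluid appears hypo-reflective (dark)
    labels, n = ndimage.label(candidates)          # connected components
    sizes = ndimage.sum(candidates, labels, range(1, n + 1))
    keep_labels = 1 + np.flatnonzero(np.asarray(sizes) >= min_size)
    return np.isin(labels, keep_labels)            # boolean mask of surviving regions
```

For example, on an image with one 9-pixel dark blob and one isolated dark pixel, only the blob survives the `min_size=5` filter.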
Machine learning has been applied to many fields, such as pattern recognition [17], [18], [19], [20], [21], computer vision [22], [15], [23], [24], [25] and other applications [26], [27], [28], [29], [30], [31]. In recent years, deep learning, a development of traditional neural network techniques [32], [33], has been successfully applied to medical image processing [34], [35], [36], [37], including the fluid region segmentation problem [38]. In our previous work [39], a patch-based CNN binary classifier was trained to distinguish fluid from its surrounding region. Lee et al. [40] proposed an automatic CNN-based segmentation method. Gopinath et al. [41] used a CNN to segment cystoid macular edemas, with a post-processing step that uses clustering to refine the identified cystoid regions. Roy et al. [6] proposed a fully convolutional architecture named ReLayNet, formed by a series of encoder blocks relaying intermediate feature representations to their matched decoder blocks through concatenation layers. Venhuizen et al. [42] proposed a fully convolutional neural network (FCNN) in which every pixel in the volume is assigned a probability of belonging to the fluid region. It is composed of a cascade of two FCNNs with complementary tasks: the first extracts the region of interest, while the second actually segments the fluid regions. Both architectures are based on the U-Net, proposed by Ronneberger et al. [43] specifically for biomedical image segmentation; a typical U-Net is shown in Fig. 2. Similarly, Tennakoon et al. [44] proposed a deep neural network inspired by the U-Net architecture, adding a batch normalization layer and an adversarial network to encode higher-order relationships; this approach applied a preprocessing step and a median filter to the dataset to reduce speckle noise. Recently, a U-Net based network was presented by Girish et al. [45] to automatically capture both micro- and macro-level features for characterizing fluid structures. To enhance network robustness, Hu et al. [5] proposed a stochastic atrous spatial pyramid pooling (sASPP) method to automatically segment SRF and PED lesions. These deep learning-based methods greatly improve the accuracy of the fluid region segmentation task.
However, the above deep learning-based methods have some shortcomings. Some require training several networks to detect and segment fluid, which increases training complexity [42], [44]. In addition, most methods do not account for the independent, small fluid regions in macular edema images, so small, closely located regions are improperly segmented as a single region.
In this paper, we propose an automatic fluid region segmentation approach based on a modified U-Net architecture. It extends our previous conference work [46] by updating the segmentation network architecture, using a weighted binary cross entropy loss instead of the traditional cross entropy loss, and greatly expanding the experiments to systematically evaluate the model. In general, U-Net is a symmetric network architecture comprising an encoder and a decoder: the encoder gradually reduces the spatial dimension of the image (or features), and the decoder gradually recovers it. Skip connections between the encoder and the decoder help the decoder preserve the details of the target, and the traditional U-Net is trained by minimizing the cross-entropy loss. Different from the traditional U-Net, the proposed architecture integrates an improved U-Net with two decoders and attention gates [47]. The first path, named the Prob-path, predicts the probability segmentation result; the second, called the Dist-path, predicts the corresponding distance map. Unlike the work in [42], we introduce attention gates to learn where to look for the fluid regions instead of using two networks. In addition, similar to RefineNet [48], we use dense skip connections in the Prob-path to fuse the information lost in downsampling and produce a high-resolution prediction map; in this way, coarse high-level semantic features and fine-grained low-level features are better utilized. To address the problem of small nearby fluid regions being wrongly merged, a regression loss is applied to the Dist-path. The segmentation path is trained to output a fluid region probability mask by minimizing a weighted binary cross entropy loss and a dice loss. Fig. 3 shows the framework of the proposed segmentation network.
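The additive attention gate of [47] can be summarized in a few lines: the skip features x are re-weighted by a soft mask computed from x and a coarser gating signal g. The NumPy sketch below is a minimal illustration with 1x1-convolution weights represented as matrices; the names `Wx`, `Wg` and `Wpsi` are ours, not the paper's notation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, Wpsi):
    """Additive attention gate (sketch after [47]): compute a spatial
    attention mask from skip features x and gating signal g, then use it
    to re-weight x. Feature maps are (channels, H, W); the W* matrices
    play the role of 1x1 convolutions."""
    q = np.maximum(np.tensordot(Wx, x, axes=1) + np.tensordot(Wg, g, axes=1), 0.0)  # ReLU
    alpha = sigmoid(np.tensordot(Wpsi, q, axes=1))   # (1, H, W) mask in (0, 1)
    return x * alpha                                  # attenuated skip features
```

Because the mask lies strictly in (0, 1), the gate can only attenuate skip features, letting the network suppress responses outside the suspected fluid regions.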
The main contributions of the paper include:
- •
To learn where to look for the fluid regions, we introduce attention gates into the network architecture instead of using a two-step segmentation. The attention gates help the network determine where the fluid regions are, avoiding the need for a separate segmentation network to locate suspicious areas.
- •
Dense skip connections are used to fuse the information lost in downsampling. In this way, coarse high-level semantic features and fine-grained low-level features are merged, helping to achieve a more accurate segmentation result.
- •
To alleviate the improper merging of small nearby fluid regions, besides predicting the traditional probability segmentation output, we introduce the Dist-path to predict a distance map, trained with a regression loss. The proposed network is trained by minimizing a joint loss function.
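The Dist-path regression target can be built from the ground-truth mask with a per-region distance transform, so that each fluid region has its own peak and touching regions remain separable. The construction below is a plausible sketch, not necessarily the paper's exact normalization.

```python
import numpy as np
from scipy import ndimage

def distance_map(mask):
    """Build a Dist-path regression target from a binary fluid mask:
    for each connected fluid region, the distance to the region boundary,
    normalized so each region peaks at 1 at its center.
    This normalization is an assumption for illustration."""
    labels, n = ndimage.label(mask)
    dist = np.zeros(mask.shape, dtype=float)
    for i in range(1, n + 1):
        region = labels == i
        d = ndimage.distance_transform_edt(region)   # distance to nearest background pixel
        dist[region] = d[region] / d.max()           # per-region normalization
    return dist
```

Because the map dips toward zero at every region boundary, regressing it penalizes predictions that fuse small neighboring regions into one blob.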
The rest of the paper is organized as follows. Section 2 reviews related work on fluid region segmentation. Section 3 describes the proposed method. Experimental results are shown in Section 4. Discussion is presented in Section 5, and we conclude the paper in Section 6.
Section snippets
Related work
In this section, we review some related work, including the U-Net architecture and attention mechanisms for image analysis.
The proposed method
In the fluid segmentation problem, the fluid region occupies only a small area of the image, which leads to a class imbalance problem. We use cropped patches, a weighted binary cross entropy loss and a dice loss to alleviate this problem. Fig. 1 shows the OCT retinal images with macular edema. To segment fluid regions automatically, we designed an improved U-Net-like network; the architecture of the proposed segmentation network is shown in Fig. 3. In this section, we present
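The two Prob-path loss terms named above can be sketched directly. The positive-class weight `w_pos` below is an illustrative assumption (the paper's value is not given in this excerpt); the Dice term follows the standard soft-Dice formulation.

```python
import numpy as np

def weighted_bce(p, y, w_pos=5.0, eps=1e-7):
    """Weighted binary cross entropy: up-weight the rare fluid pixels.
    w_pos is an illustrative assumption, not the paper's setting."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(w_pos * y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def dice_loss(p, y, eps=1e-7):
    """Soft Dice loss over the predicted probability map."""
    inter = np.sum(p * y)
    return 1.0 - (2.0 * inter + eps) / (np.sum(p) + np.sum(y) + eps)
```

Summing the two terms gives the Prob-path objective: the weighted BCE counters pixel-level imbalance, while the Dice term directly optimizes region overlap.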
Dataset
The dataset used in this paper comes from the challenge [55]. We used the images containing fluid from the raw dataset. The experimental data contain 500 scans from four OCT scan devices (Cirrus, Nidek, Spectralis, Topcon), with 57, 159, 53 and 231 scans respectively. In addition, we applied data augmentation (including rotation, shift and crop) to these images. Three quarters of the images are used to train the network, and the remaining quarter is used for testing. During the
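The augmentation operations listed above (rotation, shift, crop) can be sketched as a single random pass per image; the angle range, shift range and 64x64 patch size below are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy import ndimage

def augment(img, rng, patch=64):
    """One random augmentation pass over a 2-D scan: small rotation,
    small translation, then a random crop. Parameter ranges are
    illustrative, not the values used in the paper."""
    img = ndimage.rotate(img, angle=rng.uniform(-10, 10), reshape=False, order=1)
    img = ndimage.shift(img, shift=rng.integers(-5, 6, size=2), order=1)
    top = rng.integers(0, img.shape[0] - patch)
    left = rng.integers(0, img.shape[1] - patch)
    return img[top:top + patch, left:left + patch]
```

In practice the same random rotation, shift and crop must also be applied to the label mask so image and ground truth stay aligned.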
Discussion
We proposed a fluid segmentation method using attention-based deep learning. To evaluate its performance, we compared it with four advanced fluid segmentation methods [39], [40], [44], [42]. As shown in Tables 2 and 3, the proposed method achieves the best results, and the method with the second-best results is the two-step strategy method [42]. Our method achieved higher AJI and Dice scores in the quantitative analysis, and the statistical analysis indicated that the improvements are
Conclusions
The segmentation of fluid regions in retinal OCT images is of great significance, as it can assist doctors in quickly diagnosing macular edema and quantitatively informing the corresponding diagnostic measures. We propose an automatic fluid segmentation method for retinal OCT images using attention-based deep learning. The proposed network consists of one encoder path and two decoder paths, which output a fluid probability map and a distance map respectively. We introduce the attention structure
CRediT authorship contribution statement
Xiaoming Liu: Conceptualization, Methodology, Supervision, Writing - review & editing. Shaocheng Wang: Software, Investigation, Writing - original draft. Ying Zhang: Writing - review & editing, Investigation. Dong Liu: Software, Writing - original draft. Wei Hu: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work is partially supported by the National Natural Science Foundation of China (No. 61403287, No. 61472293, No. 61572381), and the Natural Science Foundation of Hubei Province (No. 2014CFB288).
Xiaoming Liu received the Ph.D degree from Zhejiang University, China, in 2007. From 2014 to 2015, he was a Visiting Scholar in the University of North Carolina at Chapel Hill, NC USA. Currently, he is a full Professor with the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests include medical image processing, pattern recognition, and machine learning.
References (56)
- et al., Evaluation of retinal nerve fiber layer, optic nerve head, and macular thickness measurements for glaucoma detection using optical coherence tomography, American Journal of Ophthalmology (2005)
- et al., An efficient local Chan-Vese model for image segmentation, Pattern Recognition (2010)
- et al., Locally linear discriminant embedding: An efficient method for face recognition, Pattern Recognition (2008)
- et al., Palmprint recognition using FastICA algorithm and radial basis probabilistic neural network, Neurocomputing (2006)
- et al., Human face recognition based on multi-features using neural networks committee, Pattern Recognition Letters (2004)
- et al., RFDCR: Automated brain lesion segmentation using cascaded random forests with dense conditional random fields, NeuroImage (2020)
- et al., Robust dimensionality reduction via feature space to feature space distance metric learning, Neural Networks (2019)
- et al., Determining the centers of radial basis probabilistic neural networks by recursive orthogonal least square algorithms, Applied Mathematics and Computation (2005)
- et al., Syntactic n-grams as machine learning features for natural language processing, Expert Systems with Applications (2014)
- et al., Dilation method for finding close roots of polynomials based on constrained learning neural networks, Physics Letters A (2003)
- A new partitioning neural network model for recursively finding arbitrary roots of higher order arbitrary polynomials, Applied Mathematics and Computation
- A new automatic mass detection method for breast cancer with false positive reduction, Neurocomputing
- A survey on deep learning in medical image analysis, Medical Image Analysis
- Attention gated networks: Learning to leverage salient regions in medical images, Medical Image Analysis
- Optical coherence tomography, Science
- ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks, Biomedical Optics Express
- Correlation of 3-dimensionally quantified intraretinal and subretinal fluid with visual acuity in neovascular age-related macular degeneration, JAMA Ophthalmology
- Automated segmentation of intraretinal cystoid fluid in optical coherence tomography, IEEE Transactions on Biomedical Engineering
- Automatic segmentation of microcystic macular edema in OCT, Biomedical Optics Express
- Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema, Biomedical Optics Express
- Automated volumetric segmentation of retinal fluid on optical coherence tomography, Biomedical Optics Express
- A novel density-based clustering framework by using level set method, IEEE Transactions on Knowledge and Data Engineering
Shaocheng Wang is currently a graduate student in the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests are medical image processing and machine learning.
Ying Zhang received her master's degree from Wuhan University, China, in 2008. She is currently an associate chief physician at Wuhan Aier Eye Hospital, Wuhan, China.
Dong Liu was a graduate student in the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests are medical image processing and machine learning.
Wei Hu received his Ph.D. degree from Zhejiang University, China, in 2008. He is currently an associate professor in the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His research interests include system on chip and computer architecture.