Abstract
Deep learning has recently gained considerable popularity owing to its high accuracy across a wide range of tasks. It is used in a vast number of applications, among which healthcare is an important category. In this paper, we discuss the role of deep learning in medical image segmentation, that is, the automated or semi-automated detection of edges within various medical image modalities in order to identify the region of interest. We also explore the deep learning networks that are widely preferred for medical image segmentation, along with the architecture and an overview of each network. The paper covers the most recent and widely preferred deep learning networks, namely the Convolutional Neural Network (CNN) and related networks such as AlexNet, ResNet, U-net and V-net. The challenges and limitations of the emerging deep learning (DL) networks are also studied.
1 Introduction
Deep learning is a process in which machines learn to process data and derive conclusions using neural networks composed of multiple layers arranged hierarchically. It is used in various applications such as speech and image recognition, bioinformatics, military systems and, most importantly, medical image analysis. It has the potential to transform the entire landscape of healthcare, and its application in healthcare is expected to grow in the years ahead. Deep learning is used alongside medical imaging for health checks and monitoring, and for the diagnosis and treatment of diseases and injuries. Medical image segmentation is another application of deep learning; it is used to identify organs or lesions in different modalities of medical images such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI) and ultrasound.
Initially, edge detection filters and mathematical methods were used, after which deep learning came into use, predominantly alongside transfer learning. Later, the 2.5-dimensional (2.5D) CNN was introduced, which offered a good balance between performance and computational cost. After this, the 3-dimensional (3D) CNN came into use and proved superior to the 2.5D CNN in terms of performance. Over time, various types of CNN architectures have evolved, as shown in Fig. 1.
2 Deep Learning Network Architectures and Related Work
2.1 Basic CNN (1989)
CNN is a well-known class of deep learning networks. It is widely preferred in image segmentation, image classification, object detection etc. A CNN, also known as a ConvNet, is shift-invariant. CNNs are regularized versions of the Multi-Layer Perceptron (MLP); MLP networks are usually fully connected and are therefore prone to overfitting the data. The basic architecture of a CNN comprises five layers, as shown in Fig. 2.
Input Layer:
This layer is made up of artificial neurons that pass the initial data into the network for further processing.
Convolutional Layer:
The convolutional layer consists of weights that must be trained for the application in which the network is used [2]. This layer is also responsible for feature extraction, which covers edges, objects, textures and scenes [3].
Pooling Layer:
The feature map dimensions obtained from the convolutional layer are reduced in the pooling layer.
Fully Connected Layer:
In this layer, each output of the previous layer is connected to every activation unit in the next layer.
Output Layer:
This layer is responsible for producing the final result of the segmentation, classification or relevant application.
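The roles of the convolutional and pooling layers described above can be illustrated with a minimal sketch. It is written in plain Python for portability and is not taken from any of the cited works; the function names `conv2d` and `max_pool2d`, the toy image and the hand-picked kernel are all illustrative assumptions (in a real CNN the kernel weights are learned during training).

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5x5 image with a vertical edge between columns 1 and 2 (assumed data).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked horizontal-difference kernel; a trained network would learn this.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)  # 4x4 feature map, non-zero only at the edge
pooled = max_pool2d(features)     # 2x2 map: dimensions reduced by pooling
```

In this sketch the feature map responds only where the intensity changes, and pooling halves each spatial dimension while retaining the strongest response in each window.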
The following authors have implemented variants of CNN for segmentation of tumors in the brain. R. Thillaikkarasi et al. [4] presented a novel kernel-based CNN combined with a modified Support Vector Machine (SVM) for efficient and automatic segmentation of brain tumors. In this work, spectrum mixing was included along with the kernel to increase flexibility during segmentation. Sidra Sajid et al. [5] introduced a patch-based hybrid CNN approach that detects tumors in the brain and considers both local and contextual information. This method addressed the overfitting problem by using the dropout regulariser together with batch normalization. It also provided a solution for data imbalance using a two-phase training procedure and resulted in a DSC (Dice Score Coefficient) of 86%. Farheen Ramzan et al. [6] proposed a 3D CNN with residual learning and dilated convolutions to learn the end-to-end mapping from MRI volumes to brain segments at the voxel level. A DSC of 87 ± 3% was achieved for different datasets. Kumar et al. [7] also implemented a 3D CNN for brain tumor segmentation. The 3D CNN was preferred over a 2D CNN because the latter suffered a loss of quality in the input image due to compressed 2D image processing. The proposed 3D CNN consisted of five max-pooling layers, two fully connected layers and a softmax layer. Mostefa Ben Naceur et al. [8] proposed a CNN inspired by the occipito-temporal pathway, which includes a function known as selective attention that operates on the receptive field sizes to identify the crucial objects in the scene. This method was used for segmentation of brain tumors and yielded a DSC of 90% for tumor, 83% for tumor core and 83% for enhanced tumor. Nai Qin Feng et al. [9] proposed a deep CNN framework in a cascaded structure with a CRF (Conditional Random Field) for post-processing, which resolves the conflict between segmentation accuracy, network depth and the number of pooling layers in conventional CNNs. This method was used to segment tumors in the brain and yielded a DSC of 86%. Zhaohan Xiong et al. [10] proposed a dual FCN (fully convolutional network) with 16 layers, called AtriaNet, for segmentation of the left atrium. This method yielded a DSC of 94%. Mamta Mittal et al. [11] proposed a combination of GCNN (Growing CNN) and SVM for segmentation of brain tumors. The GCNN encodes properties of the inputs to improve the next step and reduce the number of parameters. This method yielded a PSNR of 96%. W. V. Deng et al. [12] proposed a fusion of a Heterogeneous CNN (HCNN) and a CRF, in which the CRF was developed as a Recurrent Regression Neural Network (RRNN). This method divided the brain images into several slices and yielded a precision and recall of 96.5% and 97.8% respectively, the highest among the methods discussed here.
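Since most of the results in this survey are reported as a DSC, a short sketch of how the metric is typically computed for binary masks may be helpful. It assumes the standard definition DSC = 2|A ∩ B| / (|A| + |B|), where A is the predicted mask and B the ground truth; the function name `dice_coefficient`, the convention for two empty masks and the toy masks are illustrative assumptions rather than details from the cited works.

```python
def dice_coefficient(pred, target):
    """Dice score for two flattened binary masks (sequences of 0/1 of equal length)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy flattened masks (assumed data): 3 predicted pixels, 3 true pixels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dsc = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
```

A reported DSC of 86% corresponds to a value of 0.86 on this scale.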
Segmentation of Breast Tumors
Variants of CNN have also been used successfully for segmentation of breast masses. Ademola Enitan Ilesanmi et al. [13] proposed a DL-based segmentation technique exclusively for breast tumors using VEU-Net (Variant Enhanced Block). This method yielded a DSC varying from 74% to 91% for various datasets. Later, Mughad A. Al-Antari et al. [14] proposed a DL-based segmentation technique for mammograms using a full-resolution CNN, which yielded a DSC of 92% and an accuracy of 92.97%.
Segmentation of Thyroid Nodules
Researchers have recently applied DL to the segmentation of thyroid nodules as well. Jeremy M. Webb et al. [15] implemented a combination of recurrent FCNN and DeepLab V3 for segmenting thyroid nodules from ultrasound images. This method yielded an Intersection over Union (IoU) of 42% for cysts, 53% for nodules and 73% for thyroid nodules. Viksit Kumar et al. [16] proposed a segmentation technique based on a prong CNN for segmenting thyroid nodule, gland and cystic components. The term prong refers to the network shape that results from splitting the architecture to generate multiple outputs. This method yielded a detection rate of 82% and 44% for thyroid nodules and cystic components respectively. Ngoc-Quang Nguyen et al. [17] proposed a deep CNN for segmenting boundaries in medical images. This DL network identifies boundaries using multiscale effective decoders. This method yielded an accuracy of 95 ± 3% for segmenting boundaries in different datasets, thus proving superior to the other variants.
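Some of the works above report IoU (also called the Jaccard index) rather than DSC. As a hedged illustration, the sketch below computes IoU = |A ∩ B| / |A ∪ B| for binary masks and converts it to the equivalent Dice value using the identity DSC = 2·IoU / (1 + IoU), which holds for a single pair of masks. The function names and toy masks are assumptions made for this example.

```python
def iou(pred, target):
    """Intersection over Union (Jaccard index) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


def dice_from_iou(j):
    """Convert a per-mask IoU value to the corresponding Dice score."""
    return 2.0 * j / (1.0 + j)


# Toy flattened masks (assumed data): 2 shared pixels, 4 pixels in the union.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
jaccard = iou(pred, target)
dice = dice_from_iou(jaccard)
```

Because the Dice score is never lower than the IoU for the same masks, figures reported under the two metrics are not directly comparable without this conversion.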
Segmentation of Parenchyma
Research has also been carried out using CNN variants for segmentation in various other conditions, such as the 3D patch-based CNN for parenchymal segmentation from MRI images of the brain implemented by Al-Louzi et al. [18]. This network not only produced robust and accurate measurement of brain atrophy and segmentation of lesions in PML, but also proved clinically valuable as a step towards including standard forms of quantitative MRI measures in clinical therapies. The architecture of this CNN variant consisted of multiview feature pyramid networks and hierarchical residual blocks with embedded batch normalization and non-linear activation functions. Ying Chen et al. [19] introduced a dense deep CNN that incorporates popular optimization methods, namely dense blocks, batch normalization and dropout. This method was implemented to segment lung parenchyma and yielded an accuracy of 95%. J. Ramya et al. [20] introduced a technique for segmentation of the optic cup that combines a DNN with a hybrid particle swarm optimization technique and achieved superior performance with a DSC of 98%.
Segmentation of Prostate
The following authors have preferred CNN models for segmentation of prostate carcinoma. Davood Karimi et al. [21] proposed a variant of CNN that involves two strategies to segment the prostate. The first strategy is to apply adaptive sampling, and the second is to use the disagreement within a CNN ensemble to identify uncertain segmentations and estimate a segmentation uncertainty map. Ke Yan et al. [22] proposed a P-DNN (Propagation DNN) for prostate segmentation. This method incorporates an optimal combination of multi-level feature extraction in a single model and yielded a DSC of 89.9 ± 2%. Massimo Salvi et al. [23] proposed a CNN with RINGS (Rapid Identification of Glandular Structures) for effective detection and segmentation of glands in prostate histopathological images. This method yielded a DSC of 90%.
Segmentation of Cardiac Tissues
CNNs have proved successful in segmentation of cardiac images as well. Huaifei Hu et al. [24] proposed a combination of FCN and 3D ASM (Active Shape Model) to segment the right and left ventricles in cardiac MRI. The method yielded a Jaccard index (JI) of 89%. Hisham Abdeltawab et al. [25] proposed a segmentation technique for the left and right ventricles using a dual FCN, in which FC1 and FC2 were concatenated at the final output. This method achieved a DSC of 88% to 96% for different datasets.
Other Related ROI Segmentations
Furthermore, Tang et al. [26] implemented a multi-scale CNN for Selective Internal Radiation Therapy (SIRT) patients. The trained model was not efficient enough on SIRT data, which had low contrast due to reduced dosage as well as lesions differing greatly in density from their surroundings, and abnormal liver shape or positioning. Ryu et al. [27] introduced a CNN made up of an encoder and inference branches, which were used jointly for segmentation and classification. This network takes as its input data stream the combination of an input image and its corresponding Euclidean distance maps of the foreground and background. However, several drawbacks were reported: the method does not cover all kinds of machines available in clinical trials, so results cannot be obtained for the machines left out. This resulted in a low Jaccard score index (JSI) of 68%, and the method did not prove very efficient on hepatic lesions. Terapap Apiparakoon et al. [28] proposed a modified CNN model with FPN for bone lesion segmentation. A ladder FPN was introduced to the top-down pathway to semi-supervise the network training, and an additional layer was included to extract global features. This method yielded an F1 score of 84% and a precision of 85%. Kurnianingsih et al. [29] proposed a Mask R-CNN based technique for segmentation of cervical cells. This method used ResNet-10 as a backbone and yielded a precision of 92% and a recall of 91%. Lee et al. [30] also used a 3D CNN for the detection of plaque in major calcifications and obtained an F1 score of 92%. Nudrat Nida et al. [31] proposed a Region-Based CNN (RCNN) in combination with the fuzzy C-means clustering (FCM) technique for detecting and segmenting lesions in melanoma. In this method, the CNN resolved the insufficient sample problem and FCM extracted affected patches with variable boundaries that aid in disease recognition. This method achieved an F1 score of 95% and an accuracy of 94%.
Tariq Mahmood Khan et al. [32] proposed a CNN-based network for retinal vessel segmentation. This network was a Residual Connection based Encoder Decoder network. The architecture is capable of retaining and exploiting low-level semantic edge information for robust vessel segmentation. The method yielded an accuracy of 96 ± 1% for different datasets. Veena et al. [33] introduced an optic disc and cup segmentation technique for the diagnosis of glaucoma, which yielded an accuracy of 97% using a modified CNN consisting of 39 layers including nineteen convolutional layers, four max-pooling layers, eleven dropout layers and a single merger layer.
Each of the CNN variants discussed above has its respective advantages and disadvantages. The most common advantage of CNN is that, unlike its predecessors, it does not require intense human supervision for feature detection. Its drawbacks include the requirement of a large training dataset and its inability to encode the position and orientation of the object.
2.2 AlexNet (2012)
The architecture of AlexNet consists of eight layers, including five convolutional layers and three fully connected layers, as displayed in Fig. 3. However, it is not the layers that make AlexNet special. Instead, AlexNet introduced a series of features that became new approaches in CNN frameworks and made it stand out from the rest. AlexNet replaced the tanh function, which was standard at the time, with Rectified Linear Units (ReLU). AlexNet was designed to run on multiple GPUs, which enabled the training of bigger models in a reduced training time. In addition, data augmentation and dropout were deployed in AlexNet to mitigate the overfitting issue.
Lu et al. [35] proposed an improved AlexNet for detection and segmentation of abnormality in the brain from magnetic resonance images. The last few layers in this improved AlexNet were replaced with an extreme learning machine, which in turn was enhanced by a modified chaotic bat algorithm to attain improved generalization. Chen et al. [36] implemented a 3D framework based on the classic AlexNet to segment and reconstruct prostate tumor medical images with adaptive improvement. This method yielded an accuracy of 92%. In comparison with conventional segmentation and depth segmentation methods, the proposed network performed well with respect to training time, the number of parameters and the evaluation of network performance. AlexNet has mostly been preferred for feature extraction rather than segmentation, and it yielded significant feature extraction results [37]. The disadvantage of AlexNet is that, because the model is not particularly deep, it has difficulty capturing all attributes, which limits the accuracy of the resulting models.
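Two of the features mentioned above, ReLU and dropout, can be sketched in a few lines. The example below is only illustrative: it compares the gradients of ReLU and tanh for a large input (tanh saturates, ReLU does not) and applies "inverted" dropout, a common modern formulation that rescales the surviving activations during training. The source does not state which dropout variant is meant, so the inverted form, the function names, the drop probability and the random seed are all assumptions.

```python
import math
import random


def relu(x):
    """Rectified Linear Unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x == 0)."""
    return 1.0 if x > 0 else 0.0


def tanh_grad(x):
    """Derivative of tanh; approaches 0 as |x| grows (saturation)."""
    return 1.0 - math.tanh(x) ** 2


def dropout(activations, p, rng):
    """Inverted dropout: drop each unit with probability p, scale survivors by 1/(1-p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)                      # fixed seed, chosen arbitrarily
dropped = dropout([1.0] * 8, p=0.5, rng=rng)

grad_relu_at_5 = relu_grad(5.0)             # stays at 1.0
grad_tanh_at_5 = tanh_grad(5.0)             # close to 0: the gradient has saturated
```

The non-saturating gradient of ReLU is one commonly cited reason for its faster training compared with tanh.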
2.3 ResNet (2015)
ResNets are made up of residual blocks, as shown in Fig. 4. ResNet was originally introduced to rectify the vanishing gradient issue faced by earlier neural networks. ResNet uses a 34-layer basic network architecture influenced by VGG-19, into which shortcut connections are introduced, as in Fig. 5. These shortcut connections turn the network into a residual network. They are called skip connections and form the core of the residual block. Where the dimensions increase, the skip connections are padded with extra zeros to match. The skip connections help to rectify the vanishing gradient problem by providing an additional path through which the gradient can flow. Residual deep learning networks are widely preferred in classification applications but have been used in certain segmentation applications as well.
A ResNet-50-based Mask R-CNN was implemented by Jeevakala et al. [39] to segment the internal auditory canals and their nerves. The localization results yielded an accuracy of 79%. Song Guo et al. [40] proposed a combination of ResNet-101 and VGG-16 for segmentation of retinal vessels. This network was a multi-scale network and yielded an F1 score of 82 ± 2%. Similarly, ResNet-101 was selected as the foundation of the Mask R-CNN proposed by Zhao et al. [41], where an identity mapping block was used to rectify degradation issues and facilitate training of the deeper network.
In terms of performance measures, ResNet-based deep learning models produced better results in classification applications than in segmentation applications. Liu et al. [43] proposed a feature pyramid Mask R-CNN based on ResNet to segment cervical nuclei. In this method, pixel-level information was used beforehand to provide supervisory information for training the Mask R-CNN. The precision and recall obtained were 96%. One significant disadvantage of ResNet is that deeper networks usually require more training time.
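The skip connection can be summarised as y = F(x) + x, where F is the residual function learned by the block. The following sketch uses small dense layers instead of convolutions to keep the arithmetic visible, and zero-pads the shortcut when the output dimension is larger than the input. The helper names, the two-layer form of F and the toy weights are assumptions for illustration, not the configuration of any cited network.

```python
def linear(x, weights):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(vec):
    return [max(0.0, a) for a in vec]


def pad_shortcut(x, out_dim):
    """Identity shortcut, zero-padded when the block increases the dimension."""
    if out_dim < len(x):
        raise ValueError("this sketch only handles equal or increasing dimensions")
    return list(x) + [0.0] * (out_dim - len(x))


def residual_block(x, w1, w2):
    """y = relu(F(x) + shortcut(x)) with F(x) = W2 * relu(W1 * x)."""
    fx = linear(relu(linear(x, w1)), w2)
    shortcut = pad_shortcut(x, len(fx))
    return relu([f + s for f, s in zip(fx, shortcut)])


x = [1.0, 2.0]
w1_zero = [[0.0, 0.0], [0.0, 0.0]]               # 2 -> 2
w2_zero = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]   # 2 -> 3
# With all-zero weights F(x) = 0, so the block reduces to the padded identity.
y = residual_block(x, w1_zero, w2_zero)
```

The example shows why residual blocks are easy to optimise: a block that has learned nothing still passes its input (and, during training, its gradient) through unchanged.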
2.4 U-net (2015)
The U-net architecture evolved from the traditional CNN and was first introduced in 2015 for biomedical images. The network is symmetric, as represented in Fig. 6. Its two major parts are the contracting path on the left, which follows the general convolutional process, and the expansive path of 2D convolutional layers on the right. In the expansive path, pooling operations are replaced by upsampling operators with multiple feature channels to increase the resolution of the output. Different variants of U-net have been deployed by researchers around the world for various medical segmentation tasks, as discussed below.
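The symmetry of the contracting and expansive paths can be made concrete by tracking feature-map sizes through the network. The sketch below is a bookkeeping exercise only, under several assumptions that are not stated in the text: size-preserving ("same"-padded) convolutions, 2×2 pooling, 2× upsampling that halves the channel count, and skip connections that concatenate encoder features with the upsampled decoder features. The function name `unet_shapes` and the chosen input size are hypothetical.

```python
def unet_shapes(size, base_channels, depth):
    """Return (encoder, bottleneck, decoder) feature-map shapes for a U-net-like model.

    encoder entries: (spatial_size, channels)
    decoder entries: (spatial_size, channels_after_concat, channels_after_convs)
    """
    if size % (2 ** depth) != 0:
        raise ValueError("size must be divisible by 2**depth in this sketch")
    encoder = []
    s, c = size, base_channels
    for _ in range(depth):
        encoder.append((s, c))
        s //= 2   # 2x2 pooling halves the resolution
        c *= 2    # channel count doubles at the next level
    bottleneck = (s, c)
    decoder = []
    for skip_size, skip_channels in reversed(encoder):
        s *= 2    # upsampling doubles the resolution
        c //= 2   # up-convolution halves the channel count
        if s != skip_size:
            raise RuntimeError("encoder and decoder resolutions do not match")
        decoder.append((s, c + skip_channels, skip_channels))
        c = skip_channels
    return encoder, bottleneck, decoder


encoder, bottleneck, decoder = unet_shapes(size=256, base_channels=64, depth=4)
```

Each decoder level restores the resolution of the matching encoder level, which gives the architecture its characteristic U shape.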
Segmentation of Optic Regions
U-net has been used for segmentation of various regions of the eye. Pan et al. [44] introduced a modified U-net for retinal vessel segmentation in fundus images. As the traditional U-net was not deep enough, this network was proposed to combine the output of the convolutional layer with that of the deep CNN in the residual network under extreme depth conditions. Zhengqiang Jiang et al. [45] proposed a coronary vessel segmentation network based on U-net. This network used multiple resolutions and scales, because the traditional U-net applies only a single convolution operation to a single-scale image and hence does not provide accurate segmentation. This method yielded a DSC of 79%. Xioming Liu et al. [46] proposed a modified U-net model for segmentation of fluids in retinal optic CT images. This variant has an automated attention mechanism that locates the fluid region, avoiding the excessive computation of multi-stage methods. In addition, dense skip connections combine the high-level and low-level features, making the segmentation results more precise. This method achieved a DSC of 80 ± 2% for images from different devices. Sang Yoon Han et al. [47] proposed a segmentation technique inspired by the U-net for detection of the pupil centre. In this model, the complexity of the U-net is reduced by decreasing the number of channels and levels. This network achieved a detection rate of 87.3%. Bilel Daoud et al. [48] proposed a segmentation technique for nasopharyngeal carcinoma inspired by U-net. The proposed method consisted of two CNN-based systems, one using overlapping patches of fixed size and the other using patches of different sizes, and yielded a DSC of 85% to 91% for axial, coronal and sagittal sections. Khaled Alsaih et al. [49] proposed a method for retinal fluid segmentation using SegNet, which resembles the U-net architecture except that the encoder path is replaced by a VGG-16 network. This method yielded a DSC of 92%. Mangipudi et al. [50] proposed an improved U-net to segment the optic disc and cup in glaucomatous images. In contrast to the original U-net architecture, this network used only half the number of filters in each convolutional layer. The input size was also kept small to reduce the number of parameters used during training. As a result, the computational time required for training was reduced significantly, and the method yielded a DSC of 93% and 95% for cup and disc segmentation respectively. Bhargav J. Bhatkalkar et al. [51] proposed a segmentation technique for the optic disc in fundus images. This method combines DeepLab V3+ and U-net by incorporating an attention module between the encoder and decoder to improve accuracy. It achieved a DSC of 95 ± 2% for different datasets. Monsumi et al. [52] implemented an iris segmentation method using an interactive variant of U-net that includes squeeze and expand modules, with the aim of reducing training time and storage by reducing the number of parameters. This method resulted in a DSC of 98%. Shuang Yu et al. proposed a robust optic disc segmentation network based on U-net with ResNet-34 encoding layers. This method yielded an accuracy between 84% and 97% for different datasets.
Segmentation of Various Tumours
Researchers have also used U-net for segmentation of various tumours. A deeper 14-layer U-net model consisting of 26 blocks of VGG19 encoders with ImageNet pretraining was implemented by Lu et al. [53]. This method resulted in a DSC of 86% for segmenting the tumour mask, 76% for segmenting the contour of the tumour and 66% for segmenting the contour of the tumour after Gaussian smoothing. Yong Zhou Lu et al. [54] implemented a U-net based DL model with a VGG-16 encoder pretrained on ImageNet to segment tumours in PET images. This network yielded a DSC of 86% for the tumour mask and 76% for the tumour contour. Manhoor Ali et al. [55] introduced a model combining 3D CNN and U-net to segment brain tumours from MRI images. In this method, 3D asymmetric kernels were used for convolution and a flat stride was used for pooling to tackle anisotropic spacing. The method also replaced the ReLU activation function with leaky ReLU and produced a DSC of 75%, 90% and 84% for enhancing tumour, whole tumour and tumour core respectively. Mohamed A. Naser et al. [56] implemented a U-net model with one transposed convolutional layer instead of max-pooling in the decoding part for segmentation of brain tumours, which yielded a DSC of 92%. Tran et al. [57] proposed the combined use of U-net and Un-net for segmenting liver tumours. The Un-net was designed so that the skip connection path, pooling path and up-convolutional path are replaced in the node structure. In the redesigned structure, the features in the node of the output layer are joined with the next node as well as with the encoder node at the same level. This method yielded a DSC of 96.5% and 73% for liver and liver tumour segmentation respectively. Zhenxi Zhang et al. [58] proposed a U-net-like model for segmenting 3D MRI images of the brain. Later, Tao Lei et al. [59] proposed an enhanced U-net model named Def ED-Net (Deformable Encoder Decoder Network) for liver and liver tumour segmentation. This method avoids loss of spatial contextual information by employing deformable convolution with residual structures to generate feature maps. This model yielded a DSC of 96%, which is high in comparison with the other methods.
Segmentation of Blood Vessels
U-net has also proved efficient in segmentation of blood vessels. Manuel E. Gegundez-Arias et al. [60] proposed a simplified U-net with a combination of residual blocks and batch normalisation at the up-scaling and down-scaling phases. This model was used to segment blood vessels in retinal images and achieved an accuracy of 95 ± 1% on different datasets. Enda Boudegga et al. [61] proposed a DL network for segmenting blood vessels by extending the well-known U-net. In this method, the standard convolutional layers were replaced with LCM (Lightweight Convolutional Modules) in order to reduce computation. This method yielded a superior accuracy of 97%.
Segmentation of Cardiac Diseases
Various parts and diseases of the heart have been effectively detected and segmented using U-net. Gurpreet Sing et al. [62] proposed the use of the traditional U-net for automatic segmentation of cardiac CT images and achieved an overall accuracy of 73%. Can Xiao et al. [63] implemented an improved 3D U-net based on FCN for coronary artery segmentation. The upper part of the FCN was modified to enable propagation of information to higher-resolution layers. This method yielded a DSC of 82%. Lohendran Baskaran et al. [64] proposed a U-net based segmentation of cardiovascular structures from cardiac CT images and achieved a DSC of 82% as well. Lu et al. [65] proposed a ringed residual U-net for pancreatic segmentation. With the ring residual module, this method yielded strong results via deep convolution and can consolidate the characteristics of traditional deep learning networks. This network yielded a DSC of 88.32 ± 2.84%. Tao Liu et al. [66] proposed a U-net based RCNN to efficiently segment heart diseases from cardiac images and obtained a DSC of 86% to 95% for different sections of the heart.
Segmentation of Brain Tissues and Tumours
U-net has also been used successfully in brain tissue segmentation tasks. Sil C Van De Leemput et al. [67] proposed an FCNN-based U-net model for brain tissue segmentation. In addition to the traditional U-net, shortcuts were added over every two convolutional layers, as they speed up convergence and increase the overall performance; the model achieved a DSC of 87%. Nagaraj Yamanakumar et al. [68] proposed a brain tissue segmentation model known as M-net, which was inspired by U-net. M-net consists of two side paths and two main encoding and decoding paths, which aid feature learning. This method produced an accuracy of 94 ± 2%. Fan Zhang et al. [69] proposed a brain tissue segmentation technique using a 2D U-net with a novel augmented target loss function to increase accuracy at tissue boundaries. This method yielded a high accuracy of 95 ± 2%.
Other Related ROI Segmentations
Variants of U-net have been employed in several other medical oriented segmentation tasks effectively.
The combination of U-net and its U-net++ variant was proposed by Jonmohamadi et al. [70] to automatically segment multiple structures from knee arthroscopy. In U-net++, the plain skip connections are replaced with nested, dense skip connections to obtain a more efficient architecture. This modification was made to suppress the semantic gap between the feature maps of the encoder and decoder. The U-net++ model yielded a DSC of 0.79, 0.50, 0.51 and 0.48 for segmentation of the femur, tibia, anterior cruciate ligament and meniscus respectively. Yuli Sun Hariyani et al. [71] proposed a dual-attention-based U-net variant for nailfold capillary segmentation. This model, named DA-CapNet, improved the U-net architecture by including a dual-attention module that captured feature maps more efficiently, yielding an IoU of 64% and a precision of 77%. Chen et al. [72] introduced the U-net plus variant, which was used to segment the esophagus and esophageal cancer. In this variant, two blocks were introduced to optimise the extraction of complex and abstract features and to resolve irregular, vague boundaries. The DSC obtained using U-net plus was 79%. A dense U-net model was introduced by Li et al. [73] for segmentation of mammographic masses. This variant of U-net combines a densely connected CNN with attention gates: the encoder is a densely connected CNN, whereas the attention gates are placed at the decoder. This network produced an F1 score of 82.24%.
Sebastin Stenman et al. [74] introduced a U-net with a ResNet backbone, combined with ImageNet, to segment leukocytes, which yielded an IoU of 82%. Shyam Lal et al. [75] proposed a nuclei segmentation method for liver cancer detection using a modified U-net. This model, known as NucleiSegNet, included a residual block of convolutional layers aimed at obtaining high-level semantic features. This method yielded an F1 score of 83% and a JSI of 72%. Yesenia Gonzalez et al. [76] proposed a sigmoid colon segmentation network based on U-net. The proposed network combined 2D and 3D operations. The DSC obtained using this network was 82 ± 6%. Xieli Li et al. [77] proposed a dual U-net based network for segmentation of overlapping nuclei. This method uses a multi-task learning network in which boundary and region information help to improve the segmentation accuracy of glioma nuclei, especially overlapping ones; it yielded an F1 score of 82%. Junlong Chen et al. [78] proposed a variant of U-net for aortic dissection whose encoder block has an enhanced feature representation capability compared with the traditional encoder block. This method achieved an accuracy of 85%.
Chanbo Huang et al. [79] employed a modified U-net for segmentation of cell images. This variant combined the advantages of U-net and ResNet in one module and yielded an accuracy of 97% and an IoU of 84%. Bing Bing Zheng et al. [80] introduced the Multi-scale Discriminative Net (MSD-Net), inspired by the U-net model. This variant was used to segment lung infections through four stages of operation. The stages include feature-map scaling, a global average pooling layer to extract semantic consistency from the encoder and a pyramid convolutional block to capture multi-scale information. This method achieved a sensitivity of 82% to 86% for three different infections. Amine Amyar et al. [81] proposed a variant of U-net that uses convolutional layers with stride = 2 in place of pooling in order to maintain spatial information. In addition, the number of filters was increased from 64 to 1024. This method yielded a DSC of 88%.
Catherine P. Jeyapandian et al. [82] proposed a network for segmenting histologic structures in the kidney cortex. This network was inspired by the conventional U-net, with slightly tweaked parameters. The F1 scores obtained varied from 81% to 91% for various structures. Duo Wang et al. [83] implemented a 3D U-net for segmentation of pulmonary nodules and achieved an accuracy between 72% and 91% for different tasks. Van-Truong Pham et al. [84] proposed a DL network for segmentation of tympanic membranes from otoscopic images. This network, known as Ear U-net, was based on three design choices: first, EfficientNet was used as the encoder; second, attention gates were used for the skip connections; and third, residual blocks were used for the decoder. The DSC achieved was 92%.
Zhang et al. [85] segmented epicardial fat using dual U-nets and a morphological processing layer. The function of the morphological layer is to accurately identify the pericardium. The first U-net network focuses solely on the detection of the pericardium and the second U-net network was used to locate and segment the epicardial fat. This dual network-based design has yielded a DSC of 91.19%.
Qi-Zhang et al. [86] proposed an Epicardial Fat segmentation network using dual U-nets including a morphological processing layer. The first U-net was for refining and obtaining the inside region of the pericardium and the second layer acted as a backbone for segmentation. This method yielded a DSC of 91%. Francesco Marzola et al. [87] proposed a segmentation technique for transverse musculoskeletal ultrasound images. This DL network was an ensemble NN that combined the predictions of U-net, U-net++, FPN and AttentionNet. This method yielded a precision of 88% and recall of 92%. Lian Ding et al. [88] proposed a light weight U-net variant for segmentation of pediatric hand bones. This model contained a reduced number of up sampling and down sampling operators as well as kernels. This method yielded a DSC of 92.9%.
A lightweight U-net model was introduced by Ding et al. [89] for segmentation of pediatric hand bones. Multiple filters with different kernel sizes were deployed along with two down-sampling operators and two up-sampling operators. This network yielded a DSC of 93.1% in the segmentation of pediatric hand bones. Javier Civit-Masot et al. [90] proposed a TPU (Tensor Processing Unit) cloud-based U-net model for segmentation of eye fundus images and achieved a DSC of 94%. Tawsifur Rahman et al. [91] proposed a DL network for detection and segmentation of tuberculosis in chest X-ray images using two U-net models. The modified U-net includes a bi-directional convolutional long short-term memory that combines feature maps. This method yielded an accuracy and F1 score of 96%. Guodong Zeng et al. [92] proposed LP-Unet for segmentation of hip joints in MRI images. In this network, holistic decomposition convolution and dense up-sampling convolution were applied at the beginning of the 3D U-net. The main advantage of LP-Unet is that it reduces GPU memory consumption. This method obtained a DSC of 97 ± 2%. Al-Kofahi et al. [93] used a combination of the U-net and the MXNet library to quantify pixel-level predictions for a number of classes.
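The works above report results as DSC, F1 score, precision and recall. As a reminder of how these overlap measures relate, the following is a minimal sketch in plain Python, assuming binary masks flattened to lists of 0/1 values; the function names and the toy masks are illustrative and are not taken from any of the cited works. For a binary mask, the DSC computed this way coincides with the pixel-wise F1 score.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives
    over two equally sized flat binary masks (values 0 or 1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, fn


def dice(pred, truth):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if denom == 0 else 2 * tp / denom


def precision_recall(pred, truth):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example (hypothetical masks): 3 TP, 1 FP, 1 FN.
truth = [0, 1, 1, 1, 0, 0, 1, 0]
pred = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice(pred, truth))              # 0.75
print(precision_recall(pred, truth))  # (0.75, 0.75)
```

Because the measures weight false positives and false negatives differently, a DSC reported in one study is not directly comparable with an accuracy or a precision reported in another.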
U-net networks have proved efficient not only in medical image segmentation but have also produced significant results in image reconstruction and pixel regression. The disadvantage of U-net topologies is that learning may slow down in the middle layers of deeper models, putting the network at risk of ignoring the layers that represent abstract characteristics.
2.5 Volumetric Convolution Network (V-net, 2016)
Although the V-net was inspired by the U-net, the two architectures differ. The left portion of the V-net is the compression path and the right portion is the decompression path, which restores the signal to its original size. Each portion is divided into stages that operate at different resolutions. Each stage comprises one to three convolutional layers, pooling operations are replaced by convolutions, and a residual function is learned at each stage. The convolutional layers use volumetric kernels of 5 × 5 × 5 voxels, as displayed in Fig. 7. The PReLU non-linear activation function is applied in the left portion, and down-sampling is performed to increase the receptive field. The right portion, on the other hand, performs deconvolution operations to increase the size of its input. Some features are shared by both portions, such as the number of convolutional layers, and the last convolutional layer is responsible for producing an output of the same size as the input. Only a few implementations of V-net have been reported by researchers; these are discussed below, and more works can be expected in the future.
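To make the size bookkeeping of the compression and decompression paths concrete, the following is a minimal sketch in plain Python that traces the spatial size of a volume through the stages. It uses the standard output-size formulas for convolution and transposed convolution. The 5 × 5 × 5 kernel comes from the description above; the padding of 2, the 2 × 2 × 2 stride-2 kernels for down- and up-sampling, the four resolution changes and the 128 × 128 × 64 input are assumptions made for illustration only.

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_out(n, kernel, stride=1, padding=0):
    """Output length of a transposed convolution along one axis
    (no output padding, no dilation)."""
    return (n - 1) * stride - 2 * padding + kernel


def trace_vnet_shapes(size, num_down=4):
    """Return the per-stage volume sizes along the compression path
    and the decompression path (assumed layer settings, see above)."""
    down = [tuple(size)]
    for _ in range(num_down):
        # 5x5x5 convolution with padding 2 keeps the size unchanged.
        kept = tuple(conv_out(n, 5, 1, 2) for n in down[-1])
        # Assumed 2x2x2 stride-2 convolution halves each axis.
        down.append(tuple(conv_out(n, 2, 2, 0) for n in kept))
    up = [down[-1]]
    for _ in range(num_down):
        # Assumed 2x2x2 stride-2 transposed convolution doubles each axis.
        up.append(tuple(deconv_out(n, 2, 2, 0) for n in up[-1]))
    return down, up


down, up = trace_vnet_shapes((128, 128, 64))
for stage, shape in enumerate(down):
    print("compression stage", stage, shape)
for stage, shape in enumerate(up):
    print("decompression stage", stage, shape)
```

Under these assumptions the volume shrinks from 128 × 128 × 64 to 8 × 8 × 4 at the bottom of the network and is restored to 128 × 128 × 64 at the output, which matches the requirement that the output has the same size as the input.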
Gibson et al. [95] introduced a dense V-net for segmentation of 8 organs in the abdominal region, including the stomach, duodenum, left kidney, liver, spleen, gallbladder and pancreas. The dense V-net differs from the V-net in certain ways. The down-sampler consists of three dense feature stacks connected by down-sampling strided convolutions. Every skip connection is a convolution of the associated stack output, and the up-sampler comprises bilinear up-sampling. The memory efficiency of the feature stacks, together with spatial dropout, enables deep networks at high resolutions, which is an advantage when segmenting smaller structures. Caixia Dong et al. [96] proposed a V-net based 3D DL network known as Di-Vnet for segmenting coronary arteries. It operates in two stages: cardiac segmentation, followed by coronary artery segmentation (CAS). This method achieved a DSC of 90 ± 1% on different datasets. Zeng et al. [97] implemented the V-net architecture for fetal ultrasound image segmentation. The V-net was combined with an attention mechanism and a multi-scale loss function, the latter being used for deep supervision. This combination produced significant results and yielded a DSC of 97.93%.
3 Discussion and Conclusion
Medical image processing using deep learning is a vast, interesting and challenging research area that brings together medicine and computer science. This survey covers recent works involving the deep learning networks most widely used in medical image segmentation, as per the distribution in Fig. 8. Researchers around the world have been introducing and implementing variants of DL networks derived from the standard DL architectures, aiming to improve performance and to rectify the drawbacks of existing networks. Such research contributes to the advancement of healthcare and assists radiologists in precise diagnosis. This paper summarises the standard network architectures of CNN, Alexnet, Resnet, U-net and V-net, and the related works that implement their variants, along with a performance study and a comparison chart (Table 1). From the study, we have understood that U-net is the most preferred and most widely used network for segmentation of medical images due to its high performance measures. We conclude that, through collaborative research between computer vision techniques and DL techniques, the medical field can draw huge benefits.
References
Van Hiep Phung, E.J.: A high‐accuracy model average ensemble of convolutional neural networks for classification of cloud image patches on small datasets. Appl. Sci. 9, 4500 (2019)
Ke, Q., Boussaid, F.: Computer vision for human–machine interaction. Comput. Vis. Assist. Healthcare (2018)
Yang, B., Guo, H.: Design of cyber-physical-social systems with forensic-awareness based on deep learning. Adv. Comput. 120, 39–79 (2020)
Thillaikkarasi, R., Saravanan, S.: An enhancement of deep learning algorithm for brain tumor segmentation using kernel based CNN with M-SVM. J. Med. Syst. 43, 1–7 (2019)
Sajid, S., Hussain, S.: Brain tumor detection and segmentation in MR images using deep learning. Arab. J. Sci. Eng. 44, 9249–9261 (2019)
Ramzan, F., Khan, M.U.G., Iqbal, S., Saba, T., Rehman, A.: Volumetric segmentation of brain regions from MRI scans using 3D convolutional neural networks. IEEE Access 8, 103697–103709 (2020). https://doi.org/10.1109/ACCESS.2020.2998901
Anand Kumar, G., Sridevi, P.V.: 3D deep learning for automatic brain MR tumor segmentation with T-spline intensity inhomogeneity correction. Autom. Control Comput. Sci. 52(5), 439–450 (2018). https://doi.org/10.3103/S0146411618050048
Ben Naceur, M., Akil, M., Saouli, R., Kachouri, R.: Fully automatic brain tumour segmentation with deep learning-based selective attention using overlapping patches and multi-class weighted cross-entropy. Med. Image Anal. 63, 101692 (2020). https://doi.org/10.1016/j.media.2020.101692
Feng, N., Geng, X., Qin, L.: Study on MRI medical image segmentation technology based on CNN-CRF model. IEEE Access 8, 60505–60514 (2020). https://doi.org/10.1109/ACCESS.2020.2982197
Xiong, Z., Fedorov, V.V., Fu, X., Cheng, E., Macleod, R., Zhao, J.: Fully automatic left atrium segmentation from late gadolinium enhanced magnetic resonance imaging using a dual fully convolutional neural network. IEEE Trans. Med. Imaging 38(2), 515–524 (2019). https://doi.org/10.1109/TMI.2018.2866845
Mittal, M., Goyal, L.M., Kaur, S., Kaur, I., Amit Verma, D., Hemanth, J.: Deep learning based enhanced tumour segmentation approach for MR brain images. Appl. Soft Comput. 78, 346–354 (2019)
Deng, W., Shi, Q., Wang, M., Zheng, B., Ning, N.: Deep learning-based HCNN and CRF-RRNN model for brain tumor segmentation. IEEE Access 8, 26665–26675 (2020). https://doi.org/10.1109/ACCESS.2020.2966879
Ilesanmi, A.E., Chaumrattanakul, U., Makhanov, S.S.: A method for segmentation of tumours in breast ultrasound images using the variant enhanced deep learning. Biocybern. Biomed. Eng. 41, 802–818 (2021)
Al-antari, M.A., Al-masni, M.A., Choi, M.-T., Han, S.-M., Kim, T.-S.: A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification. Int. J. Med. Inform. 117, 44–54 (2018)
Webb, J.M., Meixner, D.D., Adusei, S.A., Polley, E.C., Fatemi, M., Alizad, A.: Automatic deep learning semantic segmentation of ultrasound thyroid cineclips using recurrent fully convolutional networks. IEEE Access 9, 5119–5127 (2021). https://doi.org/10.1109/ACCESS.2020.3045906
Kumar, V., et al.: Automated segmentation of thyroid nodule, gland, and cystic components from ultrasound images using deep learning. IEEE Access 8, 63482–63496 (2020). https://doi.org/10.1109/ACCESS.2020.2982390
Nguyen, N., Lee, S.: Robust boundary segmentation in medical images using a consecutive deep encoder-decoder network. IEEE Access 7, 33795–33808 (2019). https://doi.org/10.1109/ACCESS.2019.2904094
Al-Louzi, O.: Progressive multifocal leukoencephalopathy lesion and brain parenchymal segmentation from MRI using serial deep convolutional neural networks. NeuroImage Clin. 28, 102499 (2020)
Chen, Y., Wang, Y., Hu, F., Wang, D.: A lung dense deep convolution neural network for robust lung parenchyma segmentation. IEEE Access 8, 93527–93547 (2020). https://doi.org/10.1109/ACCESS.2020.2993953
Ramya, J., Rajakumar, M.P., Uma Maheswari, B.: HPWO-LS-based deep learning approach with S-ROA-optimized optic cup segmentation for fundus image classification. Neural Comput. Appl. 33(15), 9677–9690 (2021). https://doi.org/10.1007/s00521-021-05732-1
Karimi, D., et al.: Accurate and robust deep learning-based segmentation of the prostate clinical target volume in ultrasound images. Med. Image Anal. 57, 186–196 (2019). https://doi.org/10.1016/j.media.2019.07.005
Yan, K., Wang, X., Kim, J., Khadra, M., Fulham, M., Feng, D.: A propagation-DNN: deep combination learning of multi-level features for MR prostate segmentation. Comput. Methods Programs Biomed. 170, 11–21 (2019)
Salvi, M., et al.: A hybrid deep learning approach for gland segmentation in prostate histopathological images. Artif. Intell. Med. 115, 102076 (2021)
Hu, H., et al.: Automatic segmentation of left and right ventricles in cardiac MRI using 3D-ASM and deep learning. Signal Process. Image Commun. 96, 116303, 101902 (2021)
Abdeltawab, H., et al.: A deep learning-based approach for automatic segmentation and quantification of the left ventricle from cardiac cine MR images. Comput. Med. Imaging Graph. 81, 101717 (2021)
Tang, X., et al.: Whole liver segmentation based on deep learning and manual adjustment for clinical use in SIRT. Eur. J. Nucl. Med. Mol. Imaging 47(12), 2742–2752 (2020). https://doi.org/10.1007/s00259-020-04800-3
Ryu, H., Shin, S.Y., Lee, J.Y., Lee, K.M., Kang, H.-J., Yi, J.: Joint segmentation and classification of hepatic lesions in ultrasound images using deep learning. Eur. Radiol. 31(11), 8733–8742 (2021). https://doi.org/10.1007/s00330-021-07850-9
Apiparakoon, T., et al.: MaligNet: semisupervised learning for bone lesion instance segmentation using bone scintigraphy. IEEE Access 8, 27047–27066 (2020). https://doi.org/10.1109/ACCESS.2020.2971391
Allehaibi, K.H.S., et al.: Segmentation and classification of cervical cells using deep learning. IEEE Access 7, 116925–116941 (2019). https://doi.org/10.1109/ACCESS.2019.2936017
Lee, J.: Segmentation of coronary calcified plaque in intravascular OCT images using a two-step deep learning approach. IEEE Access 8, 225581–225593 (2020)
Nida, N., Irtaza, A., Javed, A., Yousaf, M.H., Mahmood, M.T.: Melanoma lesion detection and segmentation using deep region based convolutional neural network and fuzzy C-means clustering. Int. J. Med. Inform. 124, 37–48 (2019)
Khan, T.M., Alhussein, M., Aurangzeb, K., Arsalan, M., Naqvi, S.S., Nawaz, S.J.: Residual connection-based encoder decoder network (RCED-Net) for retinal vessel segmentation. IEEE Access 8, 131257–131272 (2020). https://doi.org/10.1109/ACCESS.2020.3008899
Veena, H.: A novel optic disc and optic cup segmentation technique to diagnose glaucoma using deep learning convolutional neural network over retinal fundus images. J. King Saud Univ. (2021)
Vaishnavi, J.: An efficient adaptive histogram based segmentation and extraction model for the classification of severities on diabetic retinopathy. Multimedia Tools Appl. 79, 30439–30452 (2020)
Lu, S., Wang, S.-H., Zhang, Y.-D.: Detection of abnormal brain in MRI via improved AlexNet and ELM optimized by chaotic bat algorithm. Neural Comput. Appl. 33(17), 10799–10811 (2020). https://doi.org/10.1007/s00521-020-05082-4
Chen, J.: Medical image segmentation and reconstruction of prostate tumor based on 3D AlexNet. Comput. Methods Programs Biomed. 200, 105878 (2021)
Mansour, R.F.: Deep-learning-based automatic computer-aided diagnosis system for diabetic retinopathy. Biomed. Eng. Lett. 8, 41–57 (2018)
He, K., Zhang, X.: Deep residual learning for image recognition. arXiv (2015)
Jeevakala, S., Sreelakshmi, C., Ram, K., Rangasami, R., Sivaprakasam, M.: Artificial intelligence in detection and segmentation of internal auditory canal and its nerves using deep learning techniques. Int. J. Comput. Assist. Radiol. Surg. 15(11), 1859–1867 (2020). https://doi.org/10.1007/s11548-020-02237-5
Guo, S., Wang, K., Kang, H., Zhang, Y., Gao, Y., Li, T.: BTS-DSN: deeply supervised neural network with short connections for retinal vessel segmentation. Int. J. Med. Inform. 126, 105–113 (2019)
Zhao, X.: EBioMedicine (2020)
Liu, Y.: Automatic segmentation of cervical nuclei based on deep learning and a conditional random field. IEEE Access 6, 53709–53721 (2018)
Ding, L.: A lightweight U-Net architecture multi-scale convolutional network for pediatric hand bone segmentation in X-ray image. IEEE Access 7, 68436–68445 (2019)
Pan, X.: A fundus retinal vessels segmentation scheme based on the improved deep learning U-Net model. IEEE Access 7, 122634–122643 (2019)
Jiang, Z., Ou, C., Qian, Y., Rehan, R., Yong, A.: Coronary vessel segmentation using multiresolution and multiscale deep learning. Inform. Med. Unlocked 24, 100602 (2021)
Xiong, Z., Fedorov, V.V., Fu, X., Cheng, E., Macleod, R., Zhao, J.: Fully automatic left atrium segmentation from late gadolinium enhanced magnetic resonance imaging using a dual fully convolutional neural network. IEEE Trans. Med. Imaging 38(2), 515–524 (2019). https://doi.org/10.1109/TMI.2018.2866845
Han, S.Y., Kwon, H.J., Kim, Y., Cho, N.I.: Noise-robust pupil center detection through CNN-based segmentation with shape-prior loss. IEEE Access 8, 64739–64749 (2020). https://doi.org/10.1109/ACCESS.2020.2985095
Daoud, B., Morooka, K., Kurazume, R., Leila, F., Mnejja, W., Daoud, J.: 3D segmentation of nasopharyngeal carcinoma from CT images using cascade deep learning. Comput. Med. Imaging Graph. 77, 101644 (2019)
Alsaih, K., Yusoff, M.Z., Faye, I., Tang, T.B., Meriaudeau, F.: Retinal fluid segmentation using ensembled 2-dimensionally and 2.5-dimensionally deep learning networks. IEEE Access 8, 152452–152464 (2020). https://doi.org/10.1109/ACCESS.2020.3017449
Mangipudi, P.S., Pandey, H.M., Choudhary, A.: Improved optic disc and cup segmentation in Glaucomatic images using deep learning architecture. Multimedia Tools Appl. 80(20), 30143–30163 (2021). https://doi.org/10.1007/s11042-020-10430-6
Bhatkalkar, B.J., Reddy, D.R., Prabhu, S., Bhandary, S.V.: Improving the performance of convolutional neural network for the segmentation of optic disc in fundus images using attention gates and conditional random fields. IEEE Access 8, 29299–29310 (2020). https://doi.org/10.1109/ACCESS.2020.2972318
Sardar, M., Banerjee, S., Mitra, S.: Iris segmentation using interactive deep learning. IEEE Access 8, 219322–219330 (2020). https://doi.org/10.1109/ACCESS.2020.3041519
Lu, Y.: Automatic tumor segmentation by means of deep convolutional U-Net with pre-trained encoder in PET images. IEEE Access 8, 113636–113648 (2020)
Lu, Y., Lin, J., Chen, S., He, H., Cai, Y.: Automatic tumor segmentation by means of deep convolutional U-Net with pre-trained encoder in PET images. IEEE Access 8, 113636–113648 (2020). https://doi.org/10.1109/ACCESS.2020.3003138
Ali, M., Gilani, S.O., Waris, A., Zafar, K., Jamil, M.: Brain tumour image segmentation using deep networks. IEEE Access 8, 153589–153598 (2020). https://doi.org/10.1109/ACCESS.2020.3018160
Naser, M.A., Jamal Deen, M.: Brain tumour segmentation and grading of lower-grade glioma using deep learning in MRI images. Comput. Biol. Med. 121, 103758 (2020)
Tran, S.-T.: A multiple layer U-Net, Un-Net, for liver and liver tumor segmentation in CT. IEEE Access 9, 3752–3764 (2020)
Zhang, Z., Li, J., Tian, C., Zhong, Z., Jiao, Z., Gao, X.: Quality-driven deep active learning method for 3D brain MRI segmentation. Neurocomputing 446, 106–117 (2021)
Lei, T., Wang, R., Zhang, Y., Wan, Y., Liu, C., Nandi, A.K.: DefED-Net: deformable encoder-decoder network for liver and liver tumor segmentation. IEEE Trans. Radiat. Plasma Med. Sci. (2021). https://doi.org/10.1109/TRPMS.2021.3059780
Gegundez-Arias, M.E., Marin-Santos, D., Perez-Borrero, I., Vasallo-Vazquez, M.J.: A new deep learning method for blood vessel segmentation in retinal images based on convolutional kernels and modified U-Net model. Comput. Methods Programs Biomed. 205, 106081 (2021)
Boudegga, H., Elloumi, Y., Akil, M., Bedoui, M.H., Kachouri, R., Abdallah, A.B.: Fast and efficient retinal blood vessel segmentation method based on deep learning network. Comput. Med. Imaging Graph. 90, 101902 (2021)
Gurpreet, S., et al.: Deep learning based automatic segmentation of cardiac computed tomography. J. Am. Coll. Cardiol. 73, 1643–1643 (2019)
Xiao, C., Li, Y., Jiang, Y.: Heart coronary artery segmentation and disease risk warning based on a deep learning algorithm. IEEE Access 8, 140108–140121 (2020). https://doi.org/10.1109/ACCESS.2020.3010800
Baskaran, L., et al.: Automatic segmentation of multiple cardiovascular structures from cardiac computed tomography angiography images using deep learning (2020). https://doi.org/10.1371/journal.pone.0232573
Lu, L., Jian, L., Luo, J., Xiao, B.: Pancreatic segmentation via ringed residual U-Net. IEEE Access 7, 172871–172878 (2019). https://doi.org/10.1109/ACCESS.2019.2956550
Liu, T., Tian, Y., Zhao, S., Huang, X., Wang, Q.: Residual convolutional neural network for cardiac image segmentation and heart disease diagnosis. IEEE Access 8, 82153–82161 (2020). https://doi.org/10.1109/ACCESS.2020.2991424
Van De Leemput, S.C., Meijs, M., Patel, A., Meijer, F.J.A., Van Ginneken, B., Manniesing, R.: Multiclass brain tissue segmentation in 4D CT using convolutional neural networks. IEEE Access 7, 51557–51569 (2019). https://doi.org/10.1109/ACCESS.2019.2910348
Yamanakkanavar, N., Lee, B.: Using a patch-wise M-Net convolutional neural network for tissue segmentation in brain MRI images. IEEE Access 8, 120946–120958 (2020). https://doi.org/10.1109/ACCESS.2020.3006317
Zhang, F., et al.: Deep learning based segmentation of brain tissue from diffusion MRI. Neuroimage 233, 117934 (2021)
Jonmohamadi, Y.: Automatic segmentation of multiple structures in knee arthroscopy using deep learning. IEEE Access 8, 51853–51861 (2020)
Hariyani, Y.S., Eom, H., Park, C.: DA-CapNet: dual attention deep learning based on U-Net for nailfold capillary segmentation. IEEE Access 8, 10543–10553 (2020). https://doi.org/10.1109/ACCESS.2020.2965651
Chen, S.: U-Net plus: deep semantic segmentation for esophagus and esophageal cancer in computed tomography images. IEEE Access 7, 82867–82877 (2019)
Li, S.: Attention dense-U-net for automatic breast mass segmentation in digital mammogram. IEEE Access 7, 59037–59047 (2019)
Stenman, S., et al.: Antibody supervised training of a deep learning based algorithm for leukocyte segmentation in papillary thyroid carcinoma. IEEE J. Biomed. Health Inform. 25(2), 422–428 (2021). https://doi.org/10.1109/JBHI.2020.2994970
Lal, S., Das, D., Alabhya, K., Kanfade, A., Kumar, A., Kini, J.: NucleiSegNet: robust deep learning architecture for the nuclei segmentation of liver cancer histopathology images. Comput. Biol. Med. 128, 104075 (2021)
Gonzalez, Y., et al.: Semi-automatic sigmoid colon segmentation in CT for radiation therapy treatment planning via an iterative 2.5-D deep learning approach. Med. Image Anal. 68, 101896 (2021)
Li, X., Wang, Y., Tang, Q., Fan, Z., Yu, J.: Dual U-Net for the segmentation of overlapping glioma nuclei. IEEE Access 7, 84040–84052 (2019). https://doi.org/10.1109/ACCESS.2019.2924744
Cheng, J., Tian, S., Yu, L., Ma, X., Xing, Y.: A deep learning algorithm using contrast-enhanced computed tomography (CT) images for segmentation and rapid automatic detection of aortic dissection. Biomed. Signal Process. Control 62, 102145 (2020)
Huang, C., Ding, H., Liu, C.: Segmentation of cell images based on improved deep learning approach. IEEE Access 8, 110189–110202 (2020). https://doi.org/10.1109/ACCESS.2020.3001571
Zheng, B., et al.: MSD-Net: multi-scale discriminative network for COVID-19 lung infection segmentation on CT. IEEE Access 8, 185786–185795 (2020). https://doi.org/10.1109/ACCESS.2020.3027738
Amyar, A., Modzelewski, R., Li, H., Ruan, S.: Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: classification and segmentation. Comput. Biol. Med. 126, 104037 (2020). https://doi.org/10.1016/j.compbiomed.2020.104037
Jayapandian, C.P., Chen, Y., Janowczyk, A.R., Palmer, M.B.: Development and evaluation of deep learning–based segmentation of histologic structures in the kidney cortex with multiple histologic stains. Kidney Int. 99(1), 86–101 (2021)
Wang, D., Zhang, T., Li, M., Bueno, R., Jayender, J.: 3D deep learning based classification of pulmonary ground glass opacity nodules with automatic segmentation. Comput. Med. Imaging Graph. 88, 101814 (2021)
Pham, V.-T., Tran, T.-T., Wang, P.-C., Chen, P.-Y., Lo, M.-T.: EAR-UNet: a deep learning-based approach for segmentation of tympanic membranes from otoscopic images. Artif. Intell. Med. 115, 102065 (2021)
Zhang, Q.: Automatic epicardial fat segmentation and quantification of CT scans using dual U-Nets with a morphological processing layer. IEEE Access 8, 128032–128041 (2020)
Zhang, Q., Zhou, J., Zhang, B., Jia, W., Wu, E.: Automatic epicardial fat segmentation and quantification of CT scans using dual U-nets with a morphological processing layer. IEEE Access 8, 128032–128041 (2020). https://doi.org/10.1109/ACCESS.2020.3008190
Marzola, F., van Alfen, N., Doorduin, J., Meiburger, K.M.: Deep learning segmentation of transverse musculoskeletal ultrasound images for neuromuscular disease assessment. Comput. Biol. Med. 135, 104623 (2021)
Ding, L., Zhao, K., Zhang, X., Wang, X., Zhang, J.: A lightweight U-Net architecture multi-scale convolutional network for pediatric hand bone segmentation in X-ray image. IEEE Access 7, 68436–68445 (2019). https://doi.org/10.1109/ACCESS.2019.2918205
Ding, Y.: A stacked multi-connection simple reducing net for brain tumor segmentation. IEEE Access 7, 104011–104024 (2019)
Civit-Masot, J., Luna-Perejón, F., Vicente-Díaz, S., Rodríguez Corral, J.M., Civit, A.: TPU cloud-based generalized U-Net for eye fundus image segmentation. IEEE Access 7, 142379–142387 (2019). https://doi.org/10.1109/ACCESS.2019.2944692
Rahman, T., et al.: Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization. IEEE Access 8, 191586–191601 (2020). https://doi.org/10.1109/ACCESS.2020.3031384
Zeng, G., et al.: MRI-based 3D models of the hip joint enables radiation-free computer-assisted planning of periacetabular osteotomy for treatment of hip dysplasia using deep learning for automatic segmentation. Eur. J. Radiol. Open 8, 100303 (2020). https://doi.org/10.1016/j.ejro.2020.100303
Al-Kofahi, Y.: A deep learning-based algorithm for 2-D cell segmentation in microscopy images. BMC Bioinform. 19, 1–11 (2018)
Milletari, F.: Hough-CNN: deep learning for segmentation of deep brain regions in MRI and ultra-sound. Comput. Vis. Image Underst. 164, 92–102 (2017)
Milletari, F., et al.: Hough-CNN: deep learning for segmentation of deep brain regions in MRI and ultra-sound. Comput. Vis. Image Underst. 164, 92–102 (2017)
Gibson, E.: Automatic multi-organ segmentation on abdominal CT with dense V-networks. IEEE Trans. Med. Imaging 37(8), 1822–1834 (2018)
Zeng, Y., Tsui, P.-H., Wu, W., Zhou, Z., Wu, S.: Fetal ultrasound image segmentation for automatic head circumference biometry using deeply supervised attention-gated V-Net. J. Digit. Imaging 34(1), 134–148 (2021). https://doi.org/10.1007/s10278-020-00410-5
© 2022 IFIP International Federation for Information Processing
About this paper
Cite this paper
Maria, H.H., Jossy, A.M., Malarvizhi, S. (2022). Perspective Review on Deep Learning Models to Medical Image Segmentation. In: Kalinathan, L., R., P., Kanmani, M., S., M. (eds) Computational Intelligence in Data Science. ICCIDS 2022. IFIP Advances in Information and Communication Technology, vol 654. Springer, Cham. https://doi.org/10.1007/978-3-031-16364-7_15
DOI: https://doi.org/10.1007/978-3-031-16364-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16363-0
Online ISBN: 978-3-031-16364-7
eBook Packages: Computer Science, Computer Science (R0)