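Skin lesion image segmentation method based on dense atrous spatial pyramid pooling and attention mechanism (Journal of Biomedical Engineering)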

Authors:

Yin Wen, Zhou Dongming, Fan Teng, Yu Zhuopu, Li Zhen

School of Information, Yunnan University (Kunming 650504)

Keywords:

Skin disease image segmentation; atrous convolution; attention mechanism; U-shaped network (U-Net)

DOI:

10.7507/1001-5515.202208015


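The skin is the largest organ of the human body, and many visceral diseases manifest directly on the skin, so accurate segmentation of skin lesion images is of great clinical significance. To address the complex colors, blurred boundaries, and uneven scale information of skin lesion regions, this paper proposes a skin lesion image segmentation method based on dense atrous spatial pyramid pooling (DenseASPP) and an attention mechanism. The method builds on the U-shaped network (U-Net). First, the encoder is redesigned: a large number of residual connections replace plain convolution stacking, so that key features are still effectively retained after the network is deepened. Second, channel attention and spatial attention are fused and combined with residual connections, so that the network adaptively learns the channel and spatial features of the image. Finally, DenseASPP is introduced and redesigned to enlarge the receptive field and obtain multi-scale feature information. The proposed algorithm achieved satisfactory results on the official public dataset of the International Skin Imaging Collaboration (ISIC2016): the mean intersection over union (mIOU), sensitivity (SE), precision (PC), accuracy (ACC), and Dice similarity coefficient (Dice) were 0.9018, 0.9459, 0.9487, 0.9681, and 0.9473, respectively. The experimental results show that the method improves the segmentation of skin lesion images and is expected to assist professional dermatologists in diagnosis.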

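Cite this article: Yin Wen, Zhou Dongming, Fan Teng, Yu Zhuopu, Li Zhen. Skin lesion image segmentation method based on dense atrous spatial pyramid pooling and attention mechanism. Journal of Biomedical Engineering, 2022, 39(6): 1108-1116. doi: 10.7507/1001-5515.202208015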

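Introduction

Skin disease is a common physical ailment. There are many types of skin disease, and many visceral diseases manifest directly on the skin. In recent years, the incidence of pigmentary skin diseases, represented by melanoma, has risen year by year. According to the American Cancer Society (ACS), new melanoma cases in the United States were projected to reach 99,780 in 2022, with an estimated 7,650 deaths^[1]. If melanoma is detected early, however, the 5-year survival rate can exceed 90%^[2]. Rapid diagnosis and treatment of melanoma is therefore of great importance for saving patients' lives. At present, the diagnosis of skin disease relies mostly on dermoscopy^[3]; skin lesion image segmentation can quickly separate normal regions from diseased regions and provide key evidence for dermoscopic examination. In early-stage patients, however, lesion regions are light in color and blurred at the edges, and are often hidden among hairs, which makes them hard to distinguish from normal skin and common benign nevi. Even professional medical staff miss or misdiagnose cases, so a method that automatically segments skin lesion regions is urgently needed.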

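Traditional segmentation methods were dominant for a long time. Most of these algorithms extract surface-level image information; the most representative are based on thresholding^[4-6], regions^[7-8], clustering^[9-11], and edge detection^[12-13]. For example, Glaister et al.^[14] proposed a skin lesion segmentation algorithm based on texture distinctiveness (TD). With the TD metric at its core, the algorithm learns the sparse texture distributions of the input image and then, based on the differences between texture distributions captured by the TD metric, sets a suitable threshold to divide the image into normal and diseased regions. Masood et al.^[15] proposed a dermoscopic image segmentation method that combines clustering and thresholding with the fuzzy C-means algorithm: the image is first preprocessed with smoothing filters, and fuzzy C-means then optimizes an objective function of the weighted similarity measure between each pixel and the C cluster centers, so that each pixel is accurately assigned to a class.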
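The fuzzy C-means step described above can be illustrated with a minimal NumPy sketch on 1-D pixel intensities. This is not the method of Masood et al.^[15]; the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the evenly spaced initial centers, and the fixed iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def fuzzy_c_means(pixels, c=2, m=2.0, n_iter=50, eps=1e-9):
    """Cluster 1-D intensities with fuzzy C-means (illustrative sketch)."""
    x = np.asarray(pixels, dtype=float).reshape(-1)
    # Assumption: deterministic start, centers evenly spaced over the range.
    centers = np.linspace(x.min(), x.max(), c)
    for _ in range(n_iter):
        # Distance of every pixel to every center, shape (N, c).
        dist = np.abs(x[:, None] - centers[None, :]) + eps
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the pixels weighted by u^m.
        um = u ** m
        centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
    labels = u.argmax(axis=1)  # harden memberships into one class per pixel
    return centers, u, labels

# Toy example: three dark (lesion-like) and three bright (skin-like) pixels.
centers, u, labels = fuzzy_c_means([0.10, 0.12, 0.09, 0.80, 0.85, 0.82])
```

Each row of `u` sums to 1, so a pixel near a blurred boundary can belong partly to both classes before the final `argmax` assigns it to one.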

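In recent years, with the rapid development of deep learning, image processing technology has improved greatly, image segmentation is increasingly used in medicine, and computer-aided diagnosis has been widely applied in clinical practice. Researchers have proposed various algorithms to address the difficulty that the edges of skin lesions are easily overlooked and segmented inaccurately. Long et al.^[16] proposed the fully convolutional network (FCN), which achieved pixel-level image segmentation in an end-to-end manner and pioneered semantic segmentation. Medical image segmentation became truly popular after Ronneberger et al.^[17] proposed the U-shaped network (U-Net) with an encoder-decoder structure. The encoding and decoding parts of this network are fully symmetric; to avoid the feature loss caused by down- and up-sampling, the encoder and decoder are linked by skip connections, which fuse high- and low-level semantic features. U-Net now has many variants, such as nested U-Net (U-Net++)^[18], residual U-Net (Res-UNet)^[19], recurrent residual U-Net (R2U-Net)^[20], attention U-Net (Atten-UNet)^[21], and the improved nested U-Net (UNet3+)^[22], and these variants are widely used in skin lesion image segmentation. For example, Oktay et al.^[21] added an attention mechanism to the segmentation network, which extends the expressive power of convolutional neural networks, adaptively learns feature weights, assigns larger weights to important features, and learns skin lesion features more quickly. Yuan et al.^[23-24] proposed a new loss function based on the Jaccard distance to segment skin lesion images automatically. Chen et al.^[25] applied the Transformer to medical image segmentation and proposed Transformer U-Net (Trans-UNet), which can precisely localize pixel information and overcomes the limitations of U-Net in explicit modeling. Valanarasu et al.^[26] applied the multilayer perceptron (MLP) to U-Net to form a new MLP-based segmentation network, UNeXt, which greatly reduces the number of network parameters and enables fast segmentation of skin lesion images. These algorithms nevertheless still have significant shortcomings: the U-Net family tends to lose spatial information during down- and up-sampling, which lowers segmentation accuracy; the Transformer family has many parameters and a heavy computational load, depends on powerful hardware, and captures local features poorly, especially on small datasets such as medical images.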
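To make the idea of a Jaccard-distance loss concrete, here is a minimal NumPy sketch of one common soft formulation, 1 minus the ratio of soft intersection to soft union. It is an assumption that this matches the spirit of the loss in Yuan et al.^[23-24]; the exact form used there may differ, and the name `soft_jaccard_loss` and the smoothing term `eps` are hypothetical.

```python
import numpy as np

def soft_jaccard_loss(pred, target, eps=1e-7):
    """Soft Jaccard distance between predicted probabilities and a binary mask."""
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    intersection = np.sum(pred * target)
    # Soft union: |P| + |T| - |P and T|.
    union = np.sum(pred) + np.sum(target) - intersection
    # eps (an assumption) avoids 0/0 when both masks are empty.
    return 1.0 - (intersection + eps) / (union + eps)

perfect = soft_jaccard_loss([1, 0, 1], [1, 0, 1])   # masks agree
disjoint = soft_jaccard_loss([0, 1], [1, 0])        # masks do not overlap
partial = soft_jaccard_loss([0.5, 0.5], [1, 0])     # uncertain prediction
```

Unlike per-pixel cross-entropy, this loss depends only on overlap, so it is less sensitive to the imbalance between small lesions and large background regions.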

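To address the above problems, this paper proposes a new skin lesion image segmentation method based on dense atrous spatial pyramid pooling (DenseASPP) and an attention mechanism^[27]. The main contributions of this paper are:

(1) Based on U-Net, a new encoder-decoder segmentation network is proposed and trained end to end. The encoder module of the original network is redesigned: two residual connections replace the original simple convolution (Conv) stacking, which effectively addresses the feature loss caused by down- and up-sampling.

(2) A dual, efficient attention mechanism combining spatial attention and channel attention is introduced, giving the network the ability to learn lesion features; the attention is connected in a residual manner to improve the network's encoding and decoding capability.

(3) The structure of the bottleneck layer is modified: the redesigned DenseASPP is applied at the bottleneck layer, enlarging the receptive field while dense skip connections capture feature information at different high- and low-level scales.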
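The combination of channel attention, spatial attention, and a residual connection in contribution (2) can be sketched in plain NumPy as follows. The layout follows the general CBAM pattern of Woo et al.^[31]; the paper's exact module is not specified in this section, so the ordering (channel then spatial), the shared two-layer MLP, the 7 × 7 kernel, and all function and parameter names here are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Returns one weight per channel, shape (C,)."""
    avg = x.mean(axis=(1, 2))  # global average pooling
    mx = x.max(axis=(1, 2))    # global max pooling
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # shared MLP with ReLU
    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k), k odd. Returns weights of shape (H, W)."""
    stacked = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):          # naive "same" convolution, for clarity only
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def attention_with_residual(x, w1, w2, kernel):
    y = x * channel_attention(x, w1, w2)[:, None, None]  # reweight channels
    y = y * spatial_attention(y, kernel)[None, :, :]     # reweight positions
    return x + y                                         # residual connection

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 5))
# All-zero weights make both attention maps equal sigmoid(0) = 0.5.
out = attention_with_residual(x, np.zeros((2, 4)), np.zeros((4, 2)), np.zeros((2, 7, 7)))
```

The residual term keeps the original features available even when the attention weights are small, which is one plausible reading of why the attention is "connected in a residual manner".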

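1 Algorithm description

Based on the U-Net model and combining ideas from residual networks, atrous Conv, dense networks, and attention mechanisms, this paper proposes a new skin lesion image segmentation model. The overall segmentation process and network structure are shown in Figure 1, where the numbers beside each module indicate the image size and the numbers above each module indicate the number of channels. In the preprocessing stage, images of different sizes are uniformly cropped, converted to grayscale, and flipped, so that every image is 256 × 256 with a single channel. After being fed into the network, an image passes through 5 down-sampling steps followed by 5 up-sampling steps. During down-sampling, the image size gradually decreases and the number of channels increases, which effectively extracts lesion features; during up-sampling, the image size gradually increases and the number of channels decreases, and the lesion features are gradually restored. A bottleneck layer connects the down-sampling and up-sampling paths; it contains the newly designed DenseASPP module, which uses atrous Conv with different dilation rates to obtain multi-scale information.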

Figure 1. Segmentation process and network structure
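The size bookkeeping and the effect of dilation rates described above can be checked with a few lines of Python. This sketch assumes that every down-sampling step halves the spatial size and that the atrous Conv layers use 3 × 3 kernels with stride 1; the dilation rates `[3, 6, 12, 18]` are hypothetical example values, not the rates used in the paper.

```python
def encoder_sizes(input_size=256, n_down=5):
    """Spatial size after each down-sampling step (assumes halving each time)."""
    sizes = [input_size]
    for _ in range(n_down):
        sizes.append(sizes[-1] // 2)
    return sizes

def effective_kernel(kernel_size, dilation):
    """Extent covered by one atrous kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1

def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 atrous convolutions applied in sequence."""
    rf = 1
    for d in dilations:
        rf += effective_kernel(kernel_size, d) - 1
    return rf

sizes = encoder_sizes(256, 5)                    # [256, 128, 64, 32, 16, 8]
single = effective_kernel(3, 6)                  # 13
rf = stacked_receptive_field(3, [3, 6, 12, 18])  # 79
```

Because dense connections feed every layer with the outputs of all earlier layers, a DenseASPP-style block contains paths through any subset of the dilation rates, so it mixes many receptive-field sizes up to the stacked maximum computed here.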


Model	mIOU	SE	PC	ACC	Dice
U-Net^[17]	0.8316	0.8784	0.9380	0.8784	0.9072
UNet++^[18]	0.8465	0.9099	0.9183	0.9497	0.9119
Atten-UNet^[21]	0.8971	0.9483	0.9411	0.9661	0.9447
R2U-Net^[20]	0.5201	0.8090	0.6892	0.7158	0.7443
UNet3+^[22]	0.8779	0.9490	0.9198	0.9582	0.9342
UNeXt^[26]	0.8519	0.9139	0.9271	0.9503	0.9207
Proposed method	0.9018	0.9459	0.9487	0.9681	0.9473
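For readers who want to reproduce the kind of figures reported in the table above, the following sketch computes the five metrics from a binary prediction and a binary ground-truth mask. The paper's own definitions are not given in this section, so the standard confusion-matrix definitions are assumed; in particular, reporting mIOU as the average of the per-image IoU is an assumption, and the function name is hypothetical.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Confusion-matrix metrics for one binary mask pair (lesion = 1)."""
    pred = np.asarray(pred).astype(bool).ravel()
    truth = np.asarray(truth).astype(bool).ravel()
    tp = float(np.sum(pred & truth))    # lesion pixels found
    fp = float(np.sum(pred & ~truth))   # skin labelled as lesion
    fn = float(np.sum(~pred & truth))   # lesion pixels missed
    tn = float(np.sum(~pred & ~truth))  # skin labelled as skin
    # Note: masks with no lesion pixels at all would divide by zero here.
    return {
        "IoU": tp / (tp + fp + fn),
        "SE": tp / (tp + fn),
        "PC": tp / (tp + fp),
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "Dice": 2 * tp / (2 * tp + fp + fn),
    }

# Toy 4-pixel example with one pixel of each kind: TP, FP, FN, TN.
scores = segmentation_metrics([1, 1, 0, 0], [1, 0, 1, 0])
```

For a single mask pair, Dice and IoU are linked by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically even though their averaged values over a dataset differ.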

Module combination
Encoder module	CBAM	Skip connection	DenseASPP	mIOU	SE	PC	ACC	Dice
−	−	−	−	0.8316	0.8784	0.9380	0.9458	0.9072
√	−	−	−	0.8398	0.9122	0.9076	0.9464	0.9099
√	√	−	−	0.8706	0.9235	0.9345	0.9585	0.9290
√	√	√	−	0.8858	0.9424	0.9337	0.9633	0.9380
√	√	√	√	0.9018	0.9459	0.9487	0.9681	0.9473

1.	Siegel R L, Miller K D, Fuchs H E, et al. Cancer statistics, 2022. CA: a Cancer Journal for Clinicians, 2022, 72(1): 7-33.
2.	Ge Z, Demyanov S, Chakravorty R, et al. Skin disease recognition using deep saliency features and multimodal learning of dermoscopy and clinical images//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer Cham, 2017: 250-258.
3.	Binder M, Schwarz M, Winkler A, et al. Epiluminescence microscopy: a useful tool for the diagnosis of pigmented skin lesions for formally trained dermatologists. Archives of Dermatology, 1995, 131(3): 286-291.
4.	Otsu N. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 1979, 9(1): 62-66.
5.	Pun T. A new method for grey-level picture thresholding using the entropy of the histogram. Signal Processing, 1980, 2(3): 223-237.
6.	Yen J C, Chang F J, Chang S. A new criterion for automatic multilevel thresholding. IEEE Transactions on Image Processing, 1995, 4(3): 370-378.
7.	Pham D L, Xu C, Prince J L. A survey of current methods in medical image segmentation. Annual Review of Biomedical Engineering, 2000, 2(3): 315-337.
8.	Tremeau A, Borel N. A region growing and merging algorithm to color segmentation. Pattern Recognition, 1997, 30(7): 1191-1203.
9.	Cheng Y. Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995, 17(8): 790-799.
10.	Fukunaga K, Hostetler L. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Transactions on Information Theory, 1975, 21(1): 32-40.
11.	Sheikh Y A, Khan E A, Kanade T. Mode-seeking by medoidshifts//2007 IEEE 11th International Conference on Computer Vision. IEEE, 2007: 1-8.
12.	Lakshmi S, Sankaranarayanan V. A study of edge detection techniques for segmentation computing approaches. IJCA Special Issue on “Computer Aided Soft Computing Techniques for Imaging and Biomedical Applications” (CASCT), 2010: 35-40.
13.	Khan J F, Bhuiyan S M A, Adhami R R. Image segmentation and shape analysis for road-sign detection. IEEE Transactions on Intelligent Transportation Systems, 2010, 12(1): 83-96.
14.	Glaister J, Wong A, Clausi D A. Segmentation of skin lesions from digital images using joint statistical texture distinctiveness. IEEE Transactions on Biomedical Engineering, 2014, 61(4): 1220-1230.
15.	Masood A, Al-Jumaily A A. Fuzzy C mean thresholding based level set for automated segmentation of skin lesions. Journal of Signal and Information Processing, 2013, 4(3): 66.
16.	Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
17.	Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer Cham, 2015: 234-241.
18.	Zhou Z, Siddiquee M M R, Tajbakhsh N, et al. UNet++: a nested U-Net architecture for medical image segmentation. Deep Learn Med Image Anal Multimodal Learn Clin Decis Support, Springer Cham, 2018, 11045: 3-11.
19.	Xiao X, Lian S, Luo Z, et al. Weighted res-Unet for high-quality retina vessel segmentation//2018 9th International Conference on Information Technology in Medicine and Education (ITME). IEEE, 2018: 327-331.
20.	Alom M Z, Hasan M, Yakopcic C, et al. Recurrent residual convolutional neural network based on U-net (R2U-net) for medical image segmentation. arXiv preprint, 2018, arXiv: 1802.06955.
21.	Oktay O, Schlemper J, Folgoc L L, et al. Attention U-net: learning where to look for the pancreas. arXiv preprint, 2018, arXiv: 1804.03999.
22.	Huang H, Lin L, Tong R, et al. Unet 3+: a full-scale connected Unet for medical image segmentation//2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020). IEEE, 2020: 1055-1059.
23.	Yuan Y, Chao M, Lo Y C. Automatic skin lesion segmentation using deep fully convolutional networks with jaccard distance. IEEE Transactions on Medical Imaging, 2017, 36(9): 1876-1886.
24.	Yuan Y, Lo Y C. Improving dermoscopic image segmentation with enhanced convolutional-deconvolutional networks. IEEE Journal of Biomedical and Health Informatics, 2017, 23(2): 519-526.
25.	Chen J, Lu Y, Yu Q, et al. Transunet: transformers make strong encoders for medical image segmentation. arXiv preprint, 2021, arXiv: 2102.04306.
26.	Valanarasu J M J, Patel V M. UNeXt: MLP-based rapid medical image segmentation network. arXiv preprint, 2022, arXiv: 2203.04967.
27.	Yang M, Yu K, Zhang C, et al. Denseaspp for semantic segmentation in street scenes//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 3684-3692.
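28.	Li Zuoyong, Lu Yan, Cao Xinrong, et al. Blood leukocyte segmentation based on dual path and atrous spatial pyramid pooling. Journal of Biomedical Engineering, 2022, 39(3): 471-479.
29.	Dong Ting, Wei Long, Ye Xiaodan, et al. Segmentation of ground-glass pulmonary nodules using a fully convolutional residual network based on an atrous spatial pyramid pooling structure and attention mechanism. Journal of Biomedical Engineering, 2022, 39(3): 441-451.
30.	Yang Guoliang, Zou Junfeng, Li Shicong, et al. Skin lesion segmentation based on U-shaped dense feature fusion. Chinese Journal of Medical Physics, 2022, 39(4): 442-447.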
31.	Woo S, Park J, Lee J Y, et al. Cbam: convolutional block attention module//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 3-19.
32.	Gutman D, Codella N C F, Celebi E, et al. Skin lesion analysis toward melanoma detection: a challenge at the international symposium on biomedical imaging (ISBI) 2016, hosted by the international skin imaging collaboration (ISIC). arXiv preprint, 2016, arXiv: 1605.01397.
