Medical Image Analysis

Volume 73, October 2021, 102134

Contrastive rendering with semi-supervised learning for ovary and follicle segmentation from 3D ultrasound

https://doi.org/10.1016/j.media.2021.102134

Highlights

  • We propose a contrastive rendering (C-Rend) framework to segment the ovary and follicles with detail-refined boundaries.

  • We introduce point-wise contrastive learning into C-Rend to maximize the divergence among different classes.

  • We incorporate the proposed C-Rend into a semi-supervised learning (SSL) framework, leveraging unlabeled data for better performance.

  • Our proposed method has the potential to assist clinicians in making fast infertility diagnoses and may facilitate many advanced applications.

Abstract

Segmentation of the ovary and follicles from 3D ultrasound (US) is a crucial component of the measurement tools used for female infertility diagnosis. Since manual segmentation is time-consuming and operator-dependent, an accurate and fast segmentation method is highly desirable. However, it is challenging for current deep-learning based methods to segment the ovary and follicles precisely due to ambiguous boundaries and insufficient annotations. In this paper, we propose a contrastive rendering (C-Rend) framework to segment the ovary and follicles with detail-refined boundaries. Furthermore, we incorporate the proposed C-Rend into a semi-supervised learning (SSL) framework, leveraging unlabeled data for better performance. Highlights of this paper include: (1) A rendering task is performed to estimate boundaries accurately via enriched feature representation learning. (2) Point-wise contrastive learning is proposed to enhance the similarity of intra-class points and contrastively decrease the similarity of inter-class points. (3) C-Rend plays a complementary role in the SSL framework for uncertainty-aware learning, providing reliable supervision information and achieving superior segmentation performance. Through extensive validation on a large in-house dataset with partial annotations, our method outperforms state-of-the-art methods in various evaluation metrics for both the ovary and follicles.

Introduction

Ultrasound (US) is widely used in female infertility diagnosis due to its advantages of real-time imaging, low cost and non-invasiveness. Female infertility is mainly caused by abnormal development of the ovary and follicles (Coelho Neto et al., 2018, Kiruthika and Ramya, 2014). Periodic and comprehensive ultrasound screenings are usually performed to monitor the growth of the ovary and follicles. During each scan, biometrics are measured for quantitative evaluation and diagnosis.

Specifically, the standard diagnosis pipeline is mainly based on 2D US. Sonographers first manually localize the standard planes (SPs) of the targets from scanned videos. Next, biometrics, including ovary and follicle size as well as follicle count, are measured and analyzed on these planes (Narra et al., 2018, Kelsey and Wallace, 2012). Although the pipeline is tractable, it has the following drawbacks. First, SP localization strongly depends on the sonographer's experience and skill. High inter- and intra-operator variability exists, leading to low diagnostic reproducibility, especially for novices. Second, both SP localization and manual measurement are time-consuming when tens of follicles exist (as shown in Fig. 1(b)). Sonographers have to localize multiple SPs for a complete measurement and examination, which makes follicle counting difficult and significantly decreases diagnostic efficiency. Third, the final diagnostic conclusions could be biased by planar metrics, which only partially represent the anatomical geometry.

In contrast, 3D US has the inherent advantages of lower experience dependency and higher efficiency, which also makes off-line analysis more reliable (Coelho Neto et al., 2018). With a broad volumetric view, the ovary and all follicles can be imaged in a single 3D scan (as shown in Fig. 1(a)). Notably, 3D US paves new paths for many crucial studies that cannot be approached by 2D US, such as measuring the volume of the ovary and follicles. However, facing poor image quality and expensive manual annotation, related clinical studies often resort to semi-automatic methods (Gooding et al., 2004, Gooding et al., 2008). Since these methods still involve cumbersome and subjective interactions, they are ill-suited to measuring objects with irregular shapes and can produce conflicting results (Narra et al., 2018). In this regard, automated analysis tools are in high demand. To achieve this, accurate segmentation must be obtained first.

However, segmenting the ovary and follicles from US volumes is a challenging task. The first difficulty comes from boundary ambiguity. As shown in Fig. 1(c) and (d), speckle noise blurs the boundaries between follicles. The boundary thickness of follicles is inconsistent due to irregular follicle shapes and complex contact between follicles. It is difficult to recognize the boundary between the ovary and background tissues, even for experienced experts. Second, it is very difficult to annotate sufficient 3D data (each volume containing hundreds of slices) for training high-performance supervised models. It would take an experienced expert at least 10 hours to annotate a volume like the one in Fig. 1(b), which contains tens of follicles. Last but not least, the explosive increase in volumetric data also requires a relatively large network, which challenges efficient segmentation in clinical applications.

In our review of related works, we first introduce 2D US based methods and then summarize 3D US studies. Finally, previous works that focus on addressing the boundary ambiguity issue are discussed.

Most early works perform segmentation based on conventional methods (e.g., thresholding, watershed, region growing, active contours). A previous work (Deng et al., 2011) used a modified watershed algorithm to automatically segment follicles. A similar idea can also be found in Krivanek and Sonka (1998). Potocnik and Zazula (2000) first employed a region growing method to extract follicle candidates and then recognized them based on empirically determined parameters. Potočnik and Zazula (2002) further proposed to estimate the parameters adaptively and employed dynamic images to take advantage of spatio-temporal information. Cigale et al. (2006) and Lenic et al. (2007) employed an SVM to train cellular neural networks, which achieved a shorter learning process and more robust segmentation performance. Li et al. (2019) presented a composite network using a recurrent neural network (RNN) to learn multi-scale and long-range spatial contexts in 2D US. The above methods demonstrate considerable potential for the ovary and follicle segmentation task. However, measurement based on automatic segmentation in 2D US is error-prone due to irregular shapes and occlusion of targets (Chen et al., 2009).

In contrast to 2D US, segmentation studies in 3D US are limited due to the challenges of low imaging quality, large volume size and annotation difficulty. A continuous wavelet transform algorithm (Cigale and Zazula, 2007) was applied to automatically segment the ovary in 3D US. Chen et al. (2009) used a probabilistic boosting tree (PBT) to detect follicular locations based on global-local context and then used these locations as prior knowledge to segment follicles with a Markov random field (MRF). Narayan et al. (2018) employed noise-robust phase asymmetric feature maps to detect and segment follicles in 3D US based on the max-flow algorithm. Narra et al. (2018) presented a variational segmentation framework to perform 2D radial slice segmentation, integrating a deep energy map learned from a U-Net as a soft shape prior. The segmented slice results were then utilized to generate a 3D mesh of the ovary or follicles for surface measurement. Although the aforementioned methods are effective, most of them are still not discriminative enough to deal well with gray-scale inhomogeneity and boundary ambiguity. In addition, few supervised 3D deep models have been studied due to the difficulty of 3D annotation. Cigale and Zazula (2007) only explored a 2D U-Net to segment 3D volumes slice by slice, which loses spatial information along the third dimension.

It is worth noting that boundary ambiguity has always been a great challenge for segmentation tasks (not just for the ovary and follicles) and has been studied in previous works. Tu and Bai (2009) presented a novel auto-context framework to iteratively reuse context information for segmentation refinement during training. Yang et al. (2018) employed a multi-directional recurrent neural network (RNN) to extract local semantic features to combat boundary ambiguity. Chen et al. (2016) and Zhu et al. (2019) employed edge-weighted mechanisms to pay more attention to object edges and tackle the ambiguity problem. Wang et al. (2019b) applied Atrous Spatial Pyramid Pooling (ASPP) (Chen et al., 2017) to hierarchically fuse context information and improve performance in capturing small objects. Although these methods effectively extract global/local semantic information for boundary refinement, they cannot perform point-level learning for fine-grained boundary identification.

Recently, the PointRend method was proposed to treat image segmentation as a rendering task (Kirillov et al., 2020). It proved to be a promising alternative for improving semantic segmentation. Inspired by it, we propose a Contrastive Rendering (C-Rend) framework, initially presented in our preliminary MICCAI work (Li et al., 2020), to address the boundary ambiguity issue. It aims to improve boundary estimation accuracy in US images while reducing computation cost.
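To make the rendering idea concrete, a minimal PyTorch-style sketch of ambiguous-point selection and per-point re-prediction is given below; the uncertainty criterion, tensor shapes and module names are illustrative assumptions rather than the exact implementation.

```python
# Illustrative sketch only: select low-confidence points from a coarse 3D
# prediction and re-predict them from concatenated coarse + fine features.
import torch
import torch.nn as nn
import torch.nn.functional as F

def select_ambiguous_points(coarse_logits, num_points):
    """Return normalized coordinates (B, N, 3) of the least confident voxels."""
    B, C, D, H, W = coarse_logits.shape
    top2 = coarse_logits.softmax(dim=1).topk(2, dim=1).values
    uncertainty = -(top2[:, 0] - top2[:, 1])          # small margin -> ambiguous
    idx = uncertainty.view(B, -1).topk(num_points, dim=1).indices
    z, y, x = idx // (H * W), (idx % (H * W)) // W, idx % W
    # grid_sample expects (x, y, z) order, normalized to [-1, 1]
    coords = torch.stack([x / (W - 1), y / (H - 1), z / (D - 1)], dim=-1)
    return coords * 2 - 1

def point_sample(feat, coords):
    """Sample per-point features: feat (B, C, D, H, W), coords (B, N, 3) -> (B, C, N)."""
    grid = coords.view(coords.size(0), 1, 1, -1, 3)
    out = F.grid_sample(feat, grid, align_corners=True)   # (B, C, 1, 1, N)
    return out.reshape(feat.size(0), feat.size(1), -1)

class RenderHead(nn.Module):
    """Small MLP that re-predicts the label of each selected point."""
    def __init__(self, coarse_ch, fine_ch, num_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(coarse_ch + fine_ch, 256, 1), nn.ReLU(inplace=True),
            nn.Conv1d(256, num_classes, 1))

    def forward(self, coarse_logits, fine_feat, coords):
        p_coarse = point_sample(coarse_logits, coords)     # coarse predictions at points
        p_fine = point_sample(fine_feat, coords)           # fine-grained features at points
        return self.mlp(torch.cat([p_coarse, p_fine], dim=1))  # (B, num_classes, N)
```

In such a scheme, the per-point predictions can be supervised with ground-truth labels at the same coordinates during training and can replace the coarse predictions at the ambiguous locations at inference.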

In this study, considering the difficulty of collecting annotated volumetric data, we further combine the C-Rend framework with a semi-supervised learning (SSL) strategy based on the Mean-Teacher (MT) model (Tarvainen and Valpola, 2017a). The conventional MT model employs an uncertainty estimation strategy to generate more reliable targets from unlabeled data during training. Although such a strategy helps the model pay more attention to high-confidence regions (Yu et al., 2019), ambiguous boundaries are likely to be ignored. Thus, we integrate our C-Rend module into the SSL framework as the student model, enhancing boundary refinement. The student model learns from the teacher by encouraging consistency between the predictions that the student and teacher models make on unlabeled data. We performed extensive experiments on a challenging in-house dataset. Results show that the proposed method achieved good agreement with expert annotations and outperformed strong competing methods. It has great potential to advance the quantitative analysis of the ovary and follicles in US volumes.
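A minimal sketch of such a Mean-Teacher scheme with an uncertainty-masked consistency loss is shown below, assuming a PyTorch setting; the entropy-based uncertainty, input noise, threshold and EMA rate are illustrative assumptions rather than the exact configuration used in this work.

```python
# Illustrative sketch only: EMA teacher update and an uncertainty-masked
# consistency loss on unlabeled volumes.
import torch
import torch.nn.functional as F

def ema_update(teacher, student, alpha=0.99):
    """Teacher weights follow an exponential moving average of the student's."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(alpha).add_(s, alpha=1 - alpha)

def consistency_loss(student_logits, teacher_logits, uncertainty, thresh=0.5):
    """Penalize student/teacher disagreement only where the teacher is confident."""
    mask = (uncertainty < thresh).float()
    diff = (student_logits.softmax(1) - teacher_logits.softmax(1)) ** 2
    return (diff * mask).sum() / (mask.sum() + 1e-6)

def unlabeled_step(student, teacher, volume, optimizer):
    """One training step on an unlabeled batch (labeled losses are omitted here)."""
    teacher_logits = teacher(volume + 0.01 * torch.randn_like(volume)).detach()
    probs = teacher_logits.softmax(1)
    # uncertainty could also come from Monte Carlo dropout; entropy is used here
    uncertainty = -(probs * probs.clamp_min(1e-6).log()).sum(1, keepdim=True)
    loss = consistency_loss(student(volume), teacher_logits, uncertainty)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)
    return loss.item()
```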

In summary, our contributions are three-fold:

  • We propose a C-Rend framework that formulates boundary estimation as a rendering task. C-Rend adaptively recognizes ambiguous points and re-predicts their labels via representation learning enriched with both coarse and fine-grained features.

  • We introduce point-wise contrastive learning into C-Rend to maximize the divergence among different classes. It encourages similarity among intra-class ambiguous points and contrastively decreases similarity among inter-class ones (see the sketch after this list).

  • We aggregate C-Rend into the SSL framework, leveraging reliable information from unlabeled data. Such aggregation reinforces model optimization for both high- and low-confidence regions. Experiments demonstrate that our C-Rend with SSL can yield satisfactory segmentation results.
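As referenced in the second contribution above, the following is a minimal sketch of a point-wise contrastive term computed on features of the selected ambiguous points; the InfoNCE-style pairing and the temperature are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative sketch only: supervised, point-wise contrastive loss that pulls
# together same-class points and pushes apart different-class points.
import torch
import torch.nn.functional as F

def point_contrastive_loss(point_feat, point_labels, temperature=0.1):
    """point_feat: (N, C) features of selected points; point_labels: (N,) classes."""
    feat = F.normalize(point_feat, dim=1)
    sim = feat @ feat.t() / temperature                       # pairwise cosine similarity
    same = point_labels.unsqueeze(0) == point_labels.unsqueeze(1)
    eye = torch.eye(len(feat), dtype=torch.bool, device=feat.device)
    pos_mask = (same & ~eye).float()                          # intra-class pairs
    # log-probability of each pair against all other points (self excluded)
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float('-inf')),
                                     dim=1, keepdim=True)
    n_pos = pos_mask.sum(1).clamp_min(1)
    return -(log_prob * pos_mask).sum(1).div(n_pos).mean()
```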

The rest of this paper is organized as follows. Section 2 presents the details of the C-Rend framework and the SSL strategy. Section 3 presents the experimental results of the proposed method on 3D US segmentation. Finally, Section 4 discusses the proposed method, and Section 5 concludes this study.

Section snippets

Methodology

Fig. 2 illustrates the overview of the proposed segmentation framework for 3D ovarian US. It mainly consists of five key components: (1) a segmentation architecture containing an asymmetric encoder-decoder structure as the backbone, (2) a point selection module to select ambiguous points that need to be re-predicted, (3) a rendering head to re-predict the labels of the selected points based on hybrid point-level features, (4) a contrastive learning head to further enhance the confidence on
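For illustration, the sketch below shows one way the losses from these components could be combined on a labeled batch, reusing the point_contrastive_loss helper sketched earlier; the loss weights and tensor shapes are assumptions, not the reported configuration.

```python
# Illustrative sketch only: combining the coarse segmentation, rendering and
# contrastive objectives on a labeled batch (the SSL consistency term is added
# separately on unlabeled batches).
import torch.nn.functional as F

def labeled_loss(coarse_logits, point_logits, point_feat, point_labels,
                 voxel_labels, w_point=1.0, w_contrast=0.1):
    """coarse_logits: (B, C, D, H, W); point_logits: (B, C, N);
    point_feat: (B*N, F); point_labels: (B, N); voxel_labels: (B, D, H, W)."""
    seg_loss = F.cross_entropy(coarse_logits, voxel_labels)       # backbone output
    point_loss = F.cross_entropy(point_logits, point_labels)      # rendering head
    contrast_loss = point_contrastive_loss(point_feat,            # contrastive head
                                           point_labels.flatten())
    return seg_loss + w_point * point_loss + w_contrast * contrast_loss
```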

Datasets and pre-processing

Experiments were conducted on a dataset consisting of 307 transvaginal ultrasound (TVUS) volumes collected from 217 patients at the Third Affiliated Hospital of Guangzhou Medical University, with approval from the local research ethics committee. Concretely, 156 TVUS volumes have manual annotations, and the rest were used as unlabeled data for our semi-supervised system. The labeled dataset was divided into training and evaluation groups at a ratio of 8:2. All the ovaries and

Discussion

Segmentation of the ovary and follicles is crucial to quantitative analysis for female infertility diagnosis in 3D US. This task, however, is challenging due to poor 3D imaging quality, ambiguous boundaries, large volume size and insufficient annotations. To address these problems and improve segmentation performance, we developed a novel contrastive rendering framework that enhances boundary prediction accuracy at lower computation cost. The proposed C-Rend framework trained a light-weight

Conclusion

In this study, a general and lightweight framework for 3D ovarian US segmentation is presented, which holds potential for different deep architectures and applications. The main highlight of this work is the exploitation of point-level coarse predictions and fine-grained features, coupled with contrastive learning, to calibrate ambiguous boundary predictions. Moreover, we combined the C-Rend method with a semi-supervised training strategy based on the Mean-Teacher approach to acquire better segmentation

CRediT authorship contribution statement

Xin Yang: Conceptualization, Methodology, Writing - review & editing, Project administration. Haoming Li: Writing - original draft, Methodology, Formal analysis, Validation. Yi Wang: Writing - review & editing. Xiaowen Liang: Investigation, Validation. Chaoyu Chen: Investigation, Validation. Xu Zhou: Writing - review & editing. Fengyi Zeng: Investigation, Validation. Jinghui Fang: Validation. Alejandro Frangi: Writing - review & editing. Zhiyi Chen: Data curation, Resources. Dong Ni:

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Key R&D Program of China (No. 2019YFC0118300) and the Shenzhen Peacock Plan (Nos. KQTD2016053112051497, KQJSCX20180328095606003).

References (35)

  • B. Cigale et al. Segmentation of 3D ovarian ultrasound volumes using continuous wavelet transform. 11th Mediterranean Conference on Medical and Biomedical Engineering and Computing (2007).

  • M.A. Coelho Neto et al. Counting ovarian antral follicles by ultrasound: a practical guide. Ultrasound Obstet. Gynecol. (2018).

  • M. Gooding et al. The effect of follicle volume measurement on clinical decisions. Proc. Med. Image Understand. Anal. (MIUA) (2004).

  • T.W. Kelsey et al. Ovarian volume correlates strongly with the number of nongrowing follicles in the human ovary. Obstet. Gynecol. Int. (2012).

  • A. Kendall et al. What uncertainties do we need in Bayesian deep learning for computer vision? Advances in Neural Information Processing Systems (2017).

  • A. Kirillov et al. PointRend: image segmentation as rendering. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020).

  • V. Kiruthika et al. Automatic segmentation of ovarian follicle using k-means clustering. 2014 Fifth International Conference on Signal and Image Processing (2014).
1 The two authors contributed equally to this work.
