Neurocomputing

Volume 500, 21 August 2022, Pages 799-808
CoCycleReg: Collaborative cycle-consistency method for multi-modal medical image registration

https://doi.org/10.1016/j.neucom.2022.05.113

Abstract

Multi-modal image registration is an essential step for many medical image analysis applications. Recent advances in multi-modal image registration rely on image-to-image translation to achieve good performance. However, the performance is still limited owing to the poor use of complementary regularization between image registration and translation, which can simultaneously enhance the accuracy of both parts. To this end, we propose CoCycleReg, a novel method that formulates image registration and translation in a collaborative cycle-consistency manner. Instead of dividing the pipeline into two discrete stages, we unify image registration and translation via cycle-consistency in an end-to-end training process, such that each part can benefit from the other. To ensure the reversibility of the deformation fields in the cycle, we further introduce a novel dual-head registration network, consisting of a single backbone to extract features and two heads to predict the respective deformation fields. Experiments on T1-T2 (MRI) and CT-MRI datasets validate that the proposed CoCycleReg surpasses other state-of-the-art conventional and deep learning approaches, comprehensively considering the speed, accuracy, and regularity of deformation fields. In the ablation analysis, a method that sets the cycle-consistency constraints of registration and image-to-image translation separately is compared, and the results demonstrate the effectiveness of collaborative cycle-consistency. The improvement of image-to-image translation is also verified in further analysis. The code is publicly available at https://github.com/DopamineLcy/cocycle-reg/.

Introduction

Medical images from different modalities, such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), provide complementary information that can significantly aid the early detection of tumors and other diseases and help improve diagnostic accuracy [1], [2]. However, multi-modal images usually suffer from inevitable misalignment due to patient motion and variations in anatomical structures. Rigid registration performs well on structures that are not susceptible to elastic changes (e.g., bone), but for soft tissues, many factors, including tissue abnormalities, respiratory movements, and muscle contractions, can cause elastic deformation; in this situation, deformable registration is more suitable and accurate. Deformable image registration has been a fundamental component of many medical image analysis applications, such as monitoring disease progression and quantifying the effectiveness of treatments [3], [4], [5], [6]. The goal of deformable image registration is to achieve high speed and high accuracy while guaranteeing that the deformation fields are realistic.

Previous works on multi-modal image registration mainly include conventional iterative optimization-based methods [7], [8], [9], [10], metric-based deep learning methods [11], [12] and image-to-image translation-based deep learning methods [13], [14], [15].

The conventional iterative optimization-based methods estimate the deformation fields by optimizing objective functions such as Mutual Information (MI) [7], [9], [10] and the Modality Independent Neighbourhood Descriptor (MIND) [8]. The most severe limitation of these methods is that the optimization process is computationally expensive and time-consuming. Beyond the computational cost, designing accurate metrics to evaluate the similarity of images from different modalities is itself challenging.
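As a concrete illustration of the kind of objective these methods optimize, the sketch below estimates MI from a joint intensity histogram. It is a minimal NumPy example, not the Parzen-window, multi-resolution machinery used by production toolkits such as elastix.

```python
# Minimal sketch of mutual information (MI) between two volumes,
# estimated from a joint histogram; real iterative registration
# toolkits use smoother Parzen-window estimates.
import numpy as np

def mutual_information(fixed, moving, bins=32):
    """Estimate MI between two intensity volumes of the same shape."""
    joint, _, _ = np.histogram2d(fixed.ravel(), moving.ravel(), bins=bins)
    pxy = joint / joint.sum()               # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)     # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)     # marginal p(y)
    nz = pxy > 0                            # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```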

With the advances of deep neural networks, researchers began to investigate deep learning-based methods for mono-modal image registration [16], [17]. These methods optimize a spatial transformer network (STN) [18] by comparing the warped image with the target one using similarity metrics such as Mean Squared Error (MSE) and Normalized Cross-Correlation (NCC). The mono-modal methods have since been extended to multi-modal image registration, and the extended methods can be broadly classified into metric-based deep learning methods [11], [12] and image-to-image translation-based deep learning methods [13], [14], [15]. The common idea of the metric-based methods is to find a metric that evaluates the similarity of images from different modalities and then solve the multi-modal registration problem with mono-modal registration machinery. To this end, MI, Structural Similarity (SSIM) [19], and MIND [8] have been utilized. However, statistical metrics like MI and SSIM introduce inaccuracy, while handcrafted metrics like MIND have an obvious upper limit, since it is arguably impossible to design one metric that suits all modalities. Instead of directly measuring the similarity of images from different modalities, image-to-image translation-based methods translate the moving images into the modality of the target images and use simple mono-modal metrics such as MSE to measure similarity [13], [14]. These methods discard complicated handcrafted metrics completely, and their performance benefits from the development of image-to-image translation.
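For reference, the two mono-modal similarity metrics named above can be written in a few lines. This is a minimal PyTorch sketch using a global NCC; VoxelMorph-style methods typically use a local, windowed NCC, omitted here for brevity.

```python
# Hedged sketch of the mono-modal similarity metrics (MSE and a global
# NCC); not the exact losses used by any particular method.
import torch

def mse_loss(warped, target):
    return torch.mean((warped - target) ** 2)

def ncc_loss(warped, target, eps=1e-5):
    """Negative global normalized cross-correlation (lower is better)."""
    w = warped - warped.mean()
    t = target - target.mean()
    ncc = (w * t).sum() / (w.norm() * t.norm() + eps)
    return -ncc
```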

In recent years, cycle-consistency has become prevalent since Zhu et al. proposed the cycle-consistency of image-to-image translation and achieved great success with CycleGAN [20]. For the image registration task, cycle-consistency has been used to improve the invertibility of registration [21]. More details on related cycle-consistency work can be found in Section 2.3. Beyond the fact, noted above, that translation-based registration methods benefit greatly from image translation, the translation-registration collaboration itself is a key point. However, to the best of our knowledge, the complementary regularization between image registration and translation is still underexplored. Inspired by cycle-consistency and the integrated translation and registration framework in [14], we propose CoCycleReg, which delves into the cycle-consistency of the integrated translation and registration framework to improve the performance of multi-modal image registration.
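To make the idea of coupling the two parts concrete, here is a hypothetical sketch of one direction of such a cycle. The callables G_xy, G_yx, reg_net, and warp are placeholders, and the losses shown are illustrative rather than the paper's exact formulation.

```python
# Hypothetical sketch of one translation-registration cycle; all names
# are placeholders, not the authors' actual modules or losses.
import torch
import torch.nn.functional as F

def cocycle_losses(x, y, G_xy, G_yx, reg_net, warp):
    fake_y = G_xy(x)                       # translate x into modality Y
    phi_x2y, phi_y2x = reg_net(fake_y, y)  # bi-directional fields (dual-head)
    warped = warp(fake_y, phi_x2y)         # align translated image to y
    reg_loss = F.mse_loss(warped, y)       # simple mono-modal similarity
    # Cycle: translate the warped image back to modality X and undo the
    # warp; the result should reproduce the original x.
    cycle = warp(G_yx(warped), phi_y2x)
    cyc_loss = F.l1_loss(cycle, x)
    return reg_loss, cyc_loss
```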

In the present work, our main contributions are:

  • we introduce a collaborative cycle-consistency framework for multi-modal image registration, in which the image registration and translation parts regularize each other during training. This regularization enhances the accuracy and regularity of image registration and the consistency of geometric shape during image-to-image translation;

  • we propose a dual-head deformation field generation network that produces bi-directional deformation fields with a single network (see the sketch after this list). Compared with directly inverting the deformation field of one direction to obtain the other, the proposed dual-head network generates bi-directional deformation fields with better invertibility. At the same time, training two separate networks for the two directions is avoided, which reduces the number of parameters and makes training easier;

  • the entire framework is end-to-end, and image-to-image translation is performed directly on 3D volumes, instead of translating 2D slices one by one and concatenating them, so that the supervisory information is visible to the generators. The end-to-end framework improves the performance of image-to-image translation and thus promotes the image registration process.
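The dual-head design mentioned in the second contribution can be sketched as follows; the backbone shown is a toy stand-in under stated assumptions, not the paper's actual architecture.

```python
# Minimal sketch of the dual-head idea: one shared backbone, two heads
# predicting the forward and backward fields. The paper's real network
# (e.g., its U-Net depth) is not reproduced here.
import torch
import torch.nn as nn

class DualHeadRegNet(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.backbone = nn.Sequential(      # shared feature extractor
            nn.Conv3d(2, ch, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(ch, ch, 3, padding=1), nn.LeakyReLU(0.2),
        )
        self.head_x2y = nn.Conv3d(ch, 3, 3, padding=1)  # forward field
        self.head_y2x = nn.Conv3d(ch, 3, 3, padding=1)  # backward field

    def forward(self, x, y):
        # x, y: (N, 1, D, H, W) volumes; one pass yields both fields.
        feat = self.backbone(torch.cat([x, y], dim=1))
        return self.head_x2y(feat), self.head_y2x(feat)
```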

We validate the effectiveness of our method on pairwise multi-modal registration of 3D CT and MRI scans. Specifically, we evaluate model performance on a well-aligned T1-T2 (MRI) dataset with manual deformations and on a clinical CT-MRI dataset. Experimental results demonstrate that our method outperforms other state-of-the-art approaches, comprehensively considering the speed, accuracy, and regularity of deformation fields, and also has a positive effect on the image-to-image translation process. Further ablation analysis validates the effectiveness of the proposed collaborative cycle-consistency manner.

Deep Learning-based Medical Image Registration

VoxelMorph [17] has been the most prevalent method of medical image registration because it provides a generic unsupervised learning pattern. In recent years, most proposed methods, for both mono-modal and multi-modal registration, follow this pattern, optimizing a spatial transformer network (STN) [18] by comparing the warped image with the target one using similarity metrics. In our study, the VoxelMorph pattern is integrated into the proposed collaborative cycle-consistency manner.
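The STN warping step at the heart of this pattern amounts to resampling the moving image along a displaced sampling grid. A minimal 3D PyTorch sketch, assuming the displacement field is expressed in normalized [-1, 1] coordinates:

```python
# Sketch of STN-style warping: add a dense displacement field to an
# identity grid and resample the moving volume with grid_sample.
import torch
import torch.nn.functional as F

def spatial_transform(moving, flow):
    """Warp a volume (N, C, D, H, W) by a displacement field
    (N, 3, D, H, W) given in normalized [-1, 1] coordinates."""
    n = moving.shape[0]
    # identity sampling grid, shape (N, D, H, W, 3)
    grid = F.affine_grid(
        torch.eye(3, 4).unsqueeze(0).repeat(n, 1, 1), moving.size(),
        align_corners=False)
    grid = grid + flow.permute(0, 2, 3, 4, 1)  # displace each voxel
    return F.grid_sample(moving, grid, align_corners=False)
```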

Image-to-image Translation

Our work is an …

Methods

We are given a set of multi-modal image pairs $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in X$, $y_i \in Y$, and $X$ and $Y$ denote two image modalities. For simplicity, we denote a pair of multi-modal images as $(x, y)$ instead of $(x_i, y_i)$. As our task is bi-directional multi-modal image registration, for a given input image pair $(x, y)$ our goal is to estimate the bi-directional deformation fields $(\phi_{x2y}, \phi_{y2x})$.
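Although the paper's full objective is truncated in this snippet, the bi-directional formulation implies that the two fields should approximately invert each other, which can be written as:

```latex
% Assumed field-level cycle constraint implied by the bi-directional
% setup; the paper's exact losses are not reproduced here.
\phi_{y2x} \circ \phi_{x2y} \approx \mathrm{Id},
\qquad
\phi_{x2y} \circ \phi_{y2x} \approx \mathrm{Id}
```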

The pipelines of the proposed method from $x$ to $y$ and from $y$ to $x$ are completely symmetrical; here we take the cycle process from $x$ to …

Experimental Design

We evaluate the present approach on two datasets with precise manual segmentations. During registration, the segmentation of the moving image was warped simultaneously with the image, and accuracy was measured by the degree of overlap between the fixed and warped segmentations. The Dice Similarity Coefficient (DSC) [29] and the 95th-percentile Hausdorff Distance (HD95) are computed between the masks of the fixed and warped images to measure registration accuracy. In addition, the number of voxels with a …
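For clarity, the two accuracy metrics can be computed as follows; a minimal NumPy/SciPy sketch that evaluates HD95 on full masks rather than extracted surfaces, a common approximation:

```python
# Minimal sketch of the two accuracy metrics: Dice similarity
# coefficient (DSC) and 95th-percentile Hausdorff distance (HD95),
# computed with SciPy's Euclidean distance transform.
import numpy as np
from scipy.ndimage import distance_transform_edt

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hd95(a, b):
    a, b = a.astype(bool), b.astype(bool)
    # distance from each mask's foreground voxels to the nearest
    # foreground voxel of the other mask (full-mask approximation)
    d_ab = distance_transform_edt(~b)[a]
    d_ba = distance_transform_edt(~a)[b]
    return np.percentile(np.concatenate([d_ab, d_ba]), 95)
```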

Discussion and Conclusions

In this paper, we have proposed CoCycleReg, a novel deep learning framework for multi-modal medical image registration that focuses on the deep relationship between image registration and image-to-image translation. CoCycleReg outperforms other state-of-the-art approaches, comprehensively considering the speed, accuracy, and regularity of deformation fields.

We performed a set of comparative experiments validating that CoCycleReg outperforms state-of-the-art methods of conventional iterative …

CRediT authorship contribution statement

Chenyu Lian: Conceptualization, Methodology, Writing - original draft. Xiaomeng Li: Investigation, Writing - review & editing. Lingke Kong: Investigation, Software. Jiacheng Wang: Visualization, Software. Wei Zhang: Investigation, Validation. Xiaoyang Huang: Validation, Formal analysis. Liansheng Wang: Supervision, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. 20720190012, 20720210121).

References (36)

  • K. Marstal, F. Berendsen, M. Staring, S. Klein, SimpleElastix: A user-friendly, multi-lingual library for medical image...
  • X. Cao et al., Deep learning based inter-modality image registration supervised by intra-modality similarity
  • C.K. Guo, Multi-modal image registration with unsupervised deep learning, Ph.D. thesis, Massachusetts Institute of...
  • D. Wei et al., Synthesis and inpainting-based MR-CT registration for image-guided thermal ablation of liver tumors
  • M. Arar, Y. Ginger, D. Danon, A.H. Bermano, D. Cohen-Or, Unsupervised multi-modal image registration via geometry...
  • Z. Xu et al., Adversarial uni- and multi-modal stream networks for multimodal image registration
  • G. Balakrishnan, A. Zhao, M.R. Sabuncu, J. Guttag, A.V. Dalca, An unsupervised learning model for deformable medical...
  • G. Balakrishnan et al., VoxelMorph: a learning framework for deformable medical image registration, IEEE Transactions on Medical Imaging (2019)
Chenyu Lian received the B.S. degree from Xiamen University in 2021 and is now a master's student in the Department of Computer Science, Xiamen University, Xiamen, China. His main research interests include medical image analysis and machine learning.

Dr. Xiaomeng Li is an Assistant Professor at the Department of Electronic and Computer Engineering at The Hong Kong University of Science and Technology. Before joining HKUST, she was a Postdoctoral Research Fellow at Stanford University. She obtained her Ph.D. degree from The Chinese University of Hong Kong. Her research lies in the interdisciplinary areas of artificial intelligence and medical image analysis, aiming at advancing healthcare with machine intelligence.

Lingke Kong received the M.S. degree from HuaQiao University in 2020 and is now an algorithm engineer in the Department of Scientific Research at Manteia Technologies Co., Ltd., Xiamen, China. His main research interests include medical image registration and machine learning.

Jiacheng Wang received the B.S. degree from Xiamen University in 2018 and is now a Ph.D. student in the Department of Computer Science, Xiamen University, Xiamen, China. His main research interests include medical image processing and machine learning.

Wei Zhang, a researcher at Manteia Technologies Co., Ltd., Xiamen, China, received the master's degree in engineering from Nanjing University of Science and Technology in 2018. His current research interests focus on deep learning-based applications in the field of radiation therapy, including medical image analysis and automated planning.

Xiaoyang Huang is currently an Assistant Professor in the Department of Computer Science, Xiamen University, Xiamen, China. His research interests include medical image processing.

Liansheng Wang received the Ph.D. degree in Computer Science from The Chinese University of Hong Kong in 2012. He is currently an Associate Professor in the Department of Computer Science, Xiamen University, Xiamen, China. His research interests include medical image processing and analysis, machine learning, and big medical data.
