Abstract
Because visible and infrared images capture complementary information about a scene, they are widely used in image fusion to generate fused images with more comprehensive content. Although existing fusion methods achieve good results, problems remain: in some cases the features of one modality's image are degraded by the image captured in the other modality, which leads to background contamination and missing information. To solve these problems, we design a visible and infrared image fusion network starting from the three key factors that determine structural similarity: luminance, contrast, and structure. Our network avoids these problems through detail enhancement, contrast preservation, and luminance balancing. Detail enhancement is achieved by the cross-stage feature extraction and multi-scale feature enhancement modules; the complementary information fusion module identifies and fuses complementary information from the two source images to preserve contrast; and the loss function performs luminance balancing. Comparison and generalization experiments on several public datasets show that our network effectively avoids background contamination and information loss and achieves outstanding results both quantitatively and qualitatively.
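As a point of reference for the three factors named above, the following is a minimal sketch (ours, not the paper's implementation) of the standard SSIM decomposition into luminance, contrast, and structure terms, computed here with global patch statistics rather than the usual sliding Gaussian window; the function name `ssim_components` and the example data are hypothetical:

```python
import numpy as np

def ssim_components(x: np.ndarray, y: np.ndarray, data_range: float = 255.0):
    """Luminance, contrast, and structure terms of SSIM for two
    same-sized grayscale patches (global statistics, no sliding window)."""
    c1 = (0.01 * data_range) ** 2          # standard SSIM stabilizers
    c2 = (0.03 * data_range) ** 2
    c3 = c2 / 2.0

    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = ((x - mu_x) * (y - mu_y)).mean()

    luminance = (2 * mu_x * mu_y + c1) / (mu_x**2 + mu_y**2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (sigma_x**2 + sigma_y**2 + c2)
    structure = (sigma_xy + c3) / (sigma_x * sigma_y + c3)
    return luminance, contrast, structure  # SSIM = luminance * contrast * structure

# Hypothetical usage: compare a fused patch against one source patch.
rng = np.random.default_rng(0)
src = rng.uniform(0, 255, (64, 64))
fused = np.clip(src + rng.normal(0, 10, src.shape), 0, 255)
l, c, s = ssim_components(src, fused)
print(f"luminance={l:.3f} contrast={c:.3f} structure={s:.3f} ssim={l*c*s:.3f}")
```

A fusion loss built on this decomposition can weight the three terms separately, which is one way to read the paper's division of labor among detail enhancement (structure), contrast preservation, and luminance balancing.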
Acknowledgements
This work was supported by the National Natural Science Foundation of China (51805078), the Fundamental Research Funds for the Central Universities (N2103011), the Central Guidance on Local Science and Technology Development Fund (2022JH6/100100023), and the 111 Project (B16009).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, S., Song, K., Man, Y. et al. DCBFusion: an infrared and visible image fusion method through detail enhancement, contrast reserve and brightness balance. Vis Comput (2023). https://doi.org/10.1007/s00371-023-03134-z