Abstract
Infrared imaging has recently played an important role in a wide range of applications, including video surveillance, robotics and night vision. However, high-resolution infrared cameras cost more to manufacture than visible cameras of similar quality, which could explain why thermal databases are less widely available than visible ones. In this paper, we emphasize the need to align features from the visible and thermal domains for object detection, so that a detector performs effectively in both domains without retraining or additional annotations. To address this, we incorporate feature distribution alignments into the Faster R-CNN architecture at different levels. The resulting adaptive detector covers different aspects of the domain shift, which improves overall performance. We assess the effectiveness of the proposed detector on the KAIST and FLIR ADAS datasets, where it obtains better results than the baseline detector and than other existing works. Our code is available at https://github.com/AmineMarnissi/UDAT.
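To make the notion of feature distribution alignment concrete, the following toy sketch may help. The abstract does not state which alignment objective the detector uses, so the sketch substitutes a deliberately simple one as an assumption: the squared distance between the mean feature vectors of the two domains (a linear-kernel maximum mean discrepancy). The function names and the two-dimensional "visible" and "thermal" features are hypothetical and are not taken from the authors' code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def feature_mean(features: Sequence[Vector]) -> List[float]:
    """Per-dimension mean of a batch of feature vectors."""
    if not features:
        raise ValueError("need at least one feature vector")
    dim = len(features[0])
    count = len(features)
    return [sum(f[d] for f in features) / count for d in range(dim)]


def linear_mmd(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Squared distance between the two domain means (linear-kernel MMD).

    Assumption: this stands in for whatever alignment loss the paper uses.
    """
    mu_s = feature_mean(source)
    mu_t = feature_mean(target)
    if len(mu_s) != len(mu_t):
        raise ValueError("source and target features must share a dimension")
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def shift_to_match_mean(
    source: Sequence[Vector], target: Sequence[Vector]
) -> List[List[float]]:
    """Translate target features so that their mean equals the source mean."""
    mu_s = feature_mean(source)
    mu_t = feature_mean(target)
    offset = [a - b for a, b in zip(mu_s, mu_t)]
    return [[x + o for x, o in zip(f, offset)] for f in target]


# Hypothetical features: two samples per domain, two dimensions each.
visible = [[1.0, 2.0], [3.0, 4.0]]
thermal = [[5.0, 1.0], [7.0, 3.0]]

gap_before = linear_mmd(visible, thermal)
gap_after = linear_mmd(visible, shift_to_match_mean(visible, thermal))
print(gap_before, gap_after)
```

In a trained detector the discrepancy would be reduced by updating the network weights rather than by translating features directly; the explicit shift here only shows what a smaller domain gap means numerically.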
Funding
This work was funded by the DGVR research fund of the Tunisian Ministry of Higher Education and Scientific Research, which is gratefully acknowledged.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Marnissi, M.A., Fradi, H., Sahbani, A. et al. Feature distribution alignments for object detection in the thermal domain. Vis Comput 39, 1081–1093 (2023). https://doi.org/10.1007/s00371-021-02386-x
DOI: https://doi.org/10.1007/s00371-021-02386-x