DOI: 10.1145/3474085.3475643
research-article

MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification

Published: 17 October 2021

ABSTRACT

The RGB-infrared cross-modality person re-identification (ReID) task aims to match images of the same identity across the visible and infrared modalities. Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space, but they ignore the single-modality feature space of each modality in the shallow layers. To address this, we present a novel multi-feature space joint optimization (MSO) network, which learns modality-sharable features in both the single-modality space and the common space. First, based on the observation that edge information is modality-invariant, we propose an edge features enhancement module to enhance the modality-sharable features in each single-modality space. Specifically, we design a perceptual edge features (PEF) loss after analyzing edge fusion strategies. To the best of our knowledge, this is the first work to propose explicit optimization in the single-modality feature space for the cross-modality ReID task. Moreover, to increase the difference between cross-modality distance and class distance, we introduce a novel cross-modality contrastive-center (CMCC) loss into the modality-joint constraints in the common feature space. The PEF loss and CMCC loss jointly optimize the model in an end-to-end manner, which markedly improves the network's performance. Extensive experiments demonstrate that the proposed model significantly outperforms state-of-the-art methods on both the SYSU-MM01 and RegDB datasets.
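
The observation that edge information is modality-invariant can be illustrated with a small sketch. The abstract does not say which edge detector the edge features enhancement module uses, so the 3×3 Sobel operator here is an assumption, and the function names `filter3x3` and `sobel_edge_map` are hypothetical. This is not the paper's PEF loss or network, only a toy demonstration that a gradient-magnitude edge map can stay the same when pixel intensities change drastically.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
# Assumption: the abstract does not name the edge operator; Sobel is
# used here purely for illustration.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def filter3x3(image, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel (edge-replicated padding)."""
    height, width = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((height, width), dtype=np.float64)
    # Accumulate the nine shifted, weighted copies of the padded image.
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out


def sobel_edge_map(image):
    """Return the Sobel gradient magnitude of a 2-D grayscale image."""
    image = np.asarray(image, dtype=np.float64)
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return np.hypot(gx, gy)


# Toy "visible" image: dark left half, bright right half.
visible = np.zeros((6, 6))
visible[:, 3:] = 1.0

# Toy stand-in for a second modality: inverted intensities
# (hypothetical; real infrared images differ in more complex ways).
inverted = 1.0 - visible

edges_visible = sobel_edge_map(visible)
edges_inverted = sobel_edge_map(inverted)
```

In this toy case the two edge maps are identical even though the pixel values are inverted, because the Sobel kernels sum to zero and the magnitude discards the gradient sign. Real visible and infrared images differ far less cleanly, so this shows only the intuition behind the observation.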

    • Published in

      MM '21: Proceedings of the 29th ACM International Conference on Multimedia
      October 2021
      5796 pages
      ISBN:9781450386517
      DOI:10.1145/3474085

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
