research-article

Co-Attentive Lifting for Infrared-Visible Person Re-Identification

Authors:
Xing Wei, Xi'an Jiaotong University, Xi'an, China
Diangang Li, Xi'an Jiaotong University, Xi'an, China
Xiaopeng Hong, Xi'an Jiaotong University & Pengcheng Laboratory, Xi'an, China
Wei Ke, Xi'an Jiaotong University, Xi'an, China
Yihong Gong, Xi'an Jiaotong University, Xi'an, China

MM '20: Proceedings of the 28th ACM International Conference on Multimedia, October 2020, Pages 1028–1037
https://doi.org/10.1145/3394171.3413933

Published: 12 October 2020

ABSTRACT

Infrared-visible cross-modality person re-identification (IV-ReID) has attracted much attention with the popularity of dual-mode video surveillance systems, which operate in RGB mode during the daytime and switch automatically to infrared mode at night. Despite its significant application value, IV-ReID remains a difficult problem, mainly because of two challenges. First, persons are hard to identify in infrared images, which lack color and texture cues. Second, there is a significant gap between the infrared and visible modalities: the appearance of the same person varies considerably between them. This paper proposes a novel attention-based approach that handles both difficulties in a unified framework: (1) an attention lifting mechanism that learns discriminative features in each modality, and (2) a co-attentive learning mechanism that bridges the gap between the two modalities. Our method makes only slight modifications to a given backbone network and adds little computational overhead while improving performance significantly. We conduct extensive experiments to demonstrate the superiority of the proposed method.

Index Terms

Co-Attentive Lifting for Infrared-Visible Person Re-Identification
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Appearance and texture representations
      2. Computer vision tasks
        Visual content-based indexing and retrieval
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Nearest-neighbor search

Published in
MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN: 9781450379885
DOI: 10.1145/3394171
General Chairs:
Chang Wen Chen, Chinese University of Hong Kong, Shenzhen, China
Rita Cucchiara, UNIMORE, Italy
Xian-Sheng Hua, Alibaba Group, China
Program Chairs:
Guo-Jun Qi, Futurewei Technologies, USA
Elisa Ricci, UNITN & Fondazione Bruno Kessler, Italy
Zhengyou Zhang, Tencent, China
Roger Zimmermann, National University of Singapore, Singapore
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cross modality search
infrared imagery
person re-identification
visual attention
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%