ResMatch: Residual Attention Learning for Feature Matching

Authors

  • Yuxin Deng, Wuhan University
  • Kaining Zhang, Wuhan University
  • Shihua Zhang, Wuhan University
  • Yansheng Li, Wuhan University
  • Jiayi Ma, Wuhan University

DOI:

https://doi.org/10.1609/aaai.v38i2.27915

Keywords:

CV: Motion & Tracking, CV: Image and Video Retrieval

Abstract

Attention-based graph neural networks have made great progress in feature matching. However, the literature lacks a comprehensive understanding of how the attention mechanism operates for feature matching. In this paper, we rethink cross- and self-attention from the viewpoint of traditional feature matching and filtering. To facilitate the learning of matching and filtering, we incorporate the similarity of descriptors into cross-attention and relative positions into self-attention. In this way, attention can concentrate on learning residual matching and filtering functions with reference to the basic functions that measure visual and spatial correlation. Moreover, we leverage descriptor similarity and relative positions to extract inter- and intra-neighbors. Then, sparse attention for each point can be performed only within its neighborhoods to achieve higher computational efficiency. Extensive experiments, including feature matching, pose estimation and visual localization, confirm the superiority of the proposed method. Our code is available at https://github.com/ACuOoOoO/ResMatch.
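To make the residual-attention idea concrete, the sketch below shows one way cross-attention scores can be biased by a precomputed descriptor-similarity matrix, so the learned part only models a residual over this basic visual-correlation cue. This is a minimal illustration under our own assumptions, not the authors' implementation (see the linked repository for that); the function name residual_cross_attention, the weighting factor alpha, and the projection weights are all hypothetical.

# Hypothetical sketch of residual-biased cross-attention (illustrative only).
import torch
import torch.nn.functional as F


def residual_cross_attention(desc_a, desc_b, feat_a, feat_b, w_q, w_k, w_v, alpha=1.0):
    """desc_*: (N, D) raw descriptors; feat_*: (N, C) current features.

    w_q, w_k, w_v: (C, C) projection weights. All names here are illustrative.
    """
    # Basic matching function: cosine similarity between raw descriptors.
    sim = F.normalize(desc_a, dim=-1) @ F.normalize(desc_b, dim=-1).t()  # (Na, Nb)

    # Learned attention scores between current features.
    q = feat_a @ w_q                                                     # (Na, C)
    k = feat_b @ w_k                                                     # (Nb, C)
    v = feat_b @ w_v                                                     # (Nb, C)
    scores = q @ k.t() / q.shape[-1] ** 0.5                              # (Na, Nb)

    # Residual formulation: add the similarity prior to the learned scores,
    # so attention refines rather than re-learns visual correlation.
    attn = torch.softmax(scores + alpha * sim, dim=-1)
    return attn @ v                                                      # (Na, C)


if __name__ == "__main__":
    Na, Nb, D, C = 128, 96, 256, 256
    desc_a, desc_b = torch.randn(Na, D), torch.randn(Nb, D)
    feat_a, feat_b = torch.randn(Na, C), torch.randn(Nb, C)
    w_q, w_k, w_v = (torch.randn(C, C) * 0.02 for _ in range(3))
    out = residual_cross_attention(desc_a, desc_b, feat_a, feat_b, w_q, w_k, w_v)
    print(out.shape)  # torch.Size([128, 256])

Self-attention can be biased analogously with a relative-position term in place of descriptor similarity, and the same two cues can rank neighbors so that attention is restricted to the top-k inter- and intra-neighborhoods for sparsity.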

Published

2024-03-24

How to Cite

Deng, Y., Zhang, K., Zhang, S., Li, Y., & Ma, J. (2024). ResMatch: Residual Attention Learning for Feature Matching. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1501-1509. https://doi.org/10.1609/aaai.v38i2.27915

Section

AAAI Technical Track on Computer Vision I