ResMatch: Residual Attention Learning for Feature Matching
DOI:
https://doi.org/10.1609/aaai.v38i2.27915
Keywords:
CV: Motion & Tracking, CV: Image and Video Retrieval
Abstract
Attention-based graph neural networks have made great progress in feature matching. However, the literature lacks a comprehensive understanding of how the attention mechanism operates for feature matching. In this paper, we rethink cross- and self-attention from the viewpoint of traditional feature matching and filtering. To facilitate the learning of matching and filtering, we incorporate the similarity of descriptors into cross-attention and relative positions into self-attention. In this way, the attention can concentrate on learning residual matching and filtering functions with reference to the basic functions of measuring visual and spatial correlation. Moreover, we leverage descriptor similarity and relative positions to extract inter- and intra-neighbors. Sparse attention for each point can then be performed only within its neighborhoods to achieve higher computational efficiency. Extensive experiments, including feature matching, pose estimation and visual localization, confirm the superiority of the proposed method. Our code is available at https://github.com/ACuOoOoO/ResMatch.
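The idea of letting cross-attention learn a residual on top of descriptor similarity can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked in the abstract for that); it assumes, for illustration only, that the descriptor similarity is a dot product of L2-normalised descriptors added as a bias to the scaled dot-product logits, and that inter-neighbors are the `top_k` most similar candidates per point. The function name `residual_cross_attention` and all parameter names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def residual_cross_attention(q, k, v, desc_a, desc_b, top_k=None):
    """Cross-attention whose logits are descriptor similarity plus a learned term.

    q: (N, d) queries from image A; k, v: (M, d) keys/values from image B.
    desc_a: (N, c), desc_b: (M, c) L2-normalised descriptors (assumption).
    top_k: if given, each query attends only to its top_k most similar
           candidates (an illustrative stand-in for inter-neighbors).
    """
    # Basic matching function: visual similarity between descriptors.
    sim = desc_a @ desc_b.T                               # (N, M)
    # The learned q.k term acts as a residual on top of the similarity.
    logits = q @ k.T / np.sqrt(q.shape[-1]) + sim
    if top_k is not None:
        # Indices of the top_k largest similarities in each row.
        idx = np.argpartition(-sim, top_k - 1, axis=1)[:, :top_k]
        mask = np.full_like(logits, -np.inf)
        np.put_along_axis(mask, idx, 0.0, axis=1)
        logits = logits + mask                            # drop non-neighbors
    weights = softmax(logits, axis=-1)
    return weights @ v, weights


# Usage with random data (shapes only; values carry no meaning).
rng = np.random.default_rng(0)
N, M, d, c = 4, 6, 8, 16
q, k, v = rng.normal(size=(N, d)), rng.normal(size=(M, d)), rng.normal(size=(M, d))
desc_a = rng.normal(size=(N, c))
desc_b = rng.normal(size=(M, c))
desc_a /= np.linalg.norm(desc_a, axis=1, keepdims=True)
desc_b /= np.linalg.norm(desc_b, axis=1, keepdims=True)
out, weights = residual_cross_attention(q, k, v, desc_a, desc_b, top_k=2)
```

When the learned term is zero, the weights in this sketch reduce to a softmax over descriptor similarity, which is the sense in which the attention only has to learn a correction to a basic matching function. For clarity the sketch masks a dense logit matrix; an efficient sparse version would instead gather only the neighbor keys and values. A self-attention analogue would replace `sim` with a bias computed from relative keypoint positions.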
Published
2024-03-24
How to Cite
Deng, Y., Zhang, K., Zhang, S., Li, Y., & Ma, J. (2024). ResMatch: Residual Attention Learning for Feature Matching. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1501-1509. https://doi.org/10.1609/aaai.v38i2.27915
Issue
Section
AAAI Technical Track on Computer Vision I