short-paper

Weakly Supervised Image Retrieval via Coarse-scale Feature Fusion and Multi-level Attention Blocks

Authors:
Xinyao Nie, Fudan University, Shanghai, China
Hong Lu, Fudan University, Shanghai, China
Zijian Wang, Fudan University, Shanghai, China
Jingyuan Liu, Fudan University, Shanghai, China
Zehua Guo, Fudan University, Shanghai, China

ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval, June 2019, Pages 48–52. https://doi.org/10.1145/3323873.3325017

Published: 05 June 2019

ABSTRACT

In this paper, we propose an end-to-end Attention-Block network for image retrieval (ABIR), which greatly increases retrieval accuracy without human annotations such as bounding boxes. Specifically, our network uses coarse-scale feature fusion, which generates attentive local features by combining information from different intermediate layers. Two attention blocks then extract detailed feature information. Extensive experiments show that our method outperforms the state of the art by a significant margin on four public datasets for image retrieval tasks.
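
The abstract does not give the layer choices or the exact form of the attention blocks, so the following is only a minimal sketch of the general idea it describes: attention-weighted pooling of two intermediate feature maps, fused into one descriptor that is then used for retrieval. The dot-product scoring, the function names (attention_pool, fused_descriptor, rank_gallery), and the assumption that every feature map shares the channel count of the global descriptor are all illustrative choices, not the ABIR architecture itself.

```python
import numpy as np


def attention_pool(local_feats, global_feat):
    """Attention-weighted pooling of one intermediate feature map.

    local_feats: (H, W, C) array, a hypothetical intermediate-layer output.
    global_feat: (C,) array, a hypothetical global image descriptor.
    Returns a (C,) pooled descriptor and the (H, W) attention map.
    """
    h, w, c = local_feats.shape
    flat = local_feats.reshape(h * w, c)
    # Assumed scoring rule: dot product between each location and the global vector.
    scores = flat @ global_feat
    scores = scores - scores.max()  # shift for a numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum()        # softmax over all spatial locations
    pooled = weights @ flat         # attention-weighted sum of local features
    return pooled, weights.reshape(h, w)


def fused_descriptor(feature_maps, global_feat):
    """Concatenate the attended descriptors of several layers and L2-normalise."""
    parts = [attention_pool(f, global_feat)[0] for f in feature_maps]
    desc = np.concatenate(parts)
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc


def rank_gallery(query_desc, gallery_descs):
    """Return gallery indices sorted by decreasing cosine similarity to the query."""
    sims = gallery_descs @ query_desc  # descriptors are unit length
    return np.argsort(-sims)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for two intermediate layers (two "attention blocks").
    maps = [rng.normal(size=(4, 4, 8)), rng.normal(size=(2, 2, 8))]
    g = rng.normal(size=8)
    query = fused_descriptor(maps, g)
    print(query.shape)
```

In a trained network the feature maps and the global vector would come from a CNN backbone and the attention parameters would be learned end to end; the random arrays here only exercise the shapes.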

Index Terms

Information systems > Information retrieval > Specialized information retrieval > Multimedia and multimodal retrieval > Image search

Published in
ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval
June 2019
427 pages
ISBN:9781450367653
DOI:10.1145/3323873
General Chairs: Abdulmotaleb El Saddik (University of Ottawa, Canada), Alberto Del Bimbo (University of Florence, Italy), Zhongfei Zhang (Binghamton University, State University of New York, USA)
Program Chairs: Alexander Hauptmann (Carnegie Mellon University, USA), K. Selcuk Candan (Arizona State University, USA), Marco Bertini (University of Florence, Italy), Lexing Xie (Australian National University, Australia), Xiao-Yong Wei (Sichuan University, China)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
attention block
feature fusion
image retrieval
weakly supervised
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 254 of 830 submissions, 31%