DOI: 10.1145/3323873.3325017
short-paper

Weakly Supervised Image Retrieval via Coarse-scale Feature Fusion and Multi-level Attention Blocks

Published: 05 June 2019

ABSTRACT

In this paper, we propose an end-to-end Attention-Block network for image retrieval (ABIR), which greatly improves retrieval accuracy without human annotations such as bounding boxes. Specifically, our network uses coarse-scale feature fusion, which generates attentive local features by combining information from different intermediate layers. Two attention blocks then extract detailed feature information. Extensive experiments show that our method outperforms the state of the art by a significant margin on four public datasets for image retrieval tasks.
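The abstract gives no architectural details, so the following is only a minimal sketch of the general idea of attention-weighted fusion across intermediate layers, not the ABIR network itself. It assumes dot-product compatibility scores between each local feature and a global descriptor, softmax weighting, and that every layer has already been projected to the same channel dimension as the global descriptor. The function names `attend` and `fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(local_feats, global_feat):
    """Pool local feature vectors with attention weights.

    Assumption: the score of each local vector is its dot product
    with the global descriptor (one common choice, not confirmed
    by the paper's abstract).
    """
    scores = [sum(l * g for l, g in zip(vec, global_feat)) for vec in local_feats]
    weights = softmax(scores)
    dim = len(global_feat)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, local_feats))
        for d in range(dim)
    ]
    return pooled, weights


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def fuse(levels, global_feat):
    """Concatenate the attended descriptor of each intermediate layer.

    `levels` is a list of layers; each layer is a list of local
    feature vectors with the same length as `global_feat`.
    """
    fused = []
    for local_feats in levels:
        pooled, _ = attend(local_feats, global_feat)
        fused.extend(pooled)
    return l2_normalize(fused)
```

Under these assumptions, two images could be compared for retrieval by taking the dot product of their fused, unit-length descriptors.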

    • Published in

      ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval
      June 2019
      427 pages
      ISBN:9781450367653
      DOI:10.1145/3323873

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 254 of 830 submissions, 31%
