ABSTRACT
In this paper, we study the problem of identifying logos of business brands in natural scenes in an open-set one-shot setting. This setup is significantly more challenging than the traditionally studied 'closed-set' and 'large-scale training samples per category' logo recognition settings. We propose a novel multi-view textual-visual encoding framework that encodes both the text appearing in logos and their graphical design to learn robust contrastive representations. These representations are learned jointly for multiple views of logos over a batch and therefore generalize well to unseen logos. We evaluate the proposed framework on three tasks: cropped logo verification, cropped logo identification, and end-to-end logo identification in natural scenes, and compare it against state-of-the-art methods. Further, the literature lacks a 'very-large-scale' collection of reference logo images that can facilitate the study of one-hundred-thousand-scale logo identification. To fill this gap, we introduce the Wikidata Reference Logo Dataset (WiRLD), containing logos for 100K business brands harvested from Wikidata. Our proposed framework achieves an area under the ROC curve of 91.3% on the QMUL-OpenLogo dataset for the verification task, and outperforms state-of-the-art methods by 9.1% and 2.6% on the one-shot logo identification task on the Toplogos-10 and FlickrLogos32 datasets, respectively. Further, we show that our method remains more stable than the baselines even when the number of candidate logos is on a 100K scale.
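As a rough illustration of the idea of learning contrastive representations for multiple views of logos over a batch, the following minimal sketch computes a generic symmetric InfoNCE-style loss between two batches of embeddings and performs one-shot identification by nearest-neighbor search against reference embeddings. It is not the paper's actual loss or architecture, which the abstract does not specify; the function names, the temperature value, and the use of cosine similarity are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def info_nce(view_a, view_b, temperature=0.1):
    """Symmetric InfoNCE-style loss (hypothetical stand-in for the paper's objective).

    Row i of view_a and row i of view_b are assumed to be two views of the
    same logo; every other row in the batch acts as a negative.
    """
    a = l2_normalize(view_a)
    b = l2_normalize(view_b)
    logits = a @ b.T / temperature  # shape (N, N), positives on the diagonal

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


def identify(queries, references):
    """One-shot identification: index of the most similar reference per query."""
    sims = l2_normalize(queries) @ l2_normalize(references).T
    return sims.argmax(axis=1)
```

In this sketch, each brand contributes a single reference embedding, so adding a previously unseen brand only requires embedding its reference logo, with no retraining of a classifier head.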