Self-supervised Bernoulli Autoencoders for Semi-supervised Hashing

Ñanculef, Ricardo; Mena, Francisco; Macaluso, Antonio; Lodi, Stefano; Sartori, Claudio

doi:10.1007/978-3-030-93420-0_25


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12702)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition


Abstract

Semantic hashing is a technique to represent high-dimensional data using similarity-preserving binary codes for efficient indexing and search. Recently, variational autoencoders with Bernoulli latent representations achieved remarkable success in learning such codes in supervised and unsupervised scenarios, outperforming traditional methods thanks to their ability to handle the binary constraints architecturally.
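To make the retrieval side of semantic hashing concrete, the following is a minimal sketch of nearest-neighbour search over binary codes by Hamming distance. The codes are toy values chosen by hand, not outputs of any model from the chapter, and the names `hamming_distance` and `hamming_search` are illustrative assumptions.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integer-packed codes differ."""
    return bin(a ^ b).count("1")


def hamming_search(query: int, database: List[int], k: int) -> List[Tuple[int, int]]:
    """Return the k (index, distance) pairs closest to the query code."""
    scored = [(i, hamming_distance(query, code)) for i, code in enumerate(database)]
    # list.sort is stable, so ties keep their database order.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 8-bit hash codes (hypothetical values, for illustration only).
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110000

top2 = hamming_search(query, database, k=2)
```

Because each comparison is an XOR followed by a bit count, ranking by Hamming distance is cheap, which is what makes binary codes attractive for indexing and search.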

In this paper, we propose a novel self-supervised method for training variational autoencoders, in which the model uses its own predictions of the label distribution to implement the pairwise objective function. We also investigate how robust hashing methods based on variational autoencoders are to a lack of supervision, focusing on two semi-supervised approaches currently in use. Our experiments on text and image retrieval tasks show that, as expected, both methods significantly increase the quality of the hash codes as the number of labelled observations increases, but their performance deteriorates when the number of labelled samples decreases. In this low-label scenario, the proposed self-supervised approach outperforms the classical approaches, and it yields similar performance in fully supervised settings.
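As a rough illustration of how predicted label distributions could stand in for ground-truth labels in a pairwise objective, consider the sketch below. It is not the chapter's loss: the soft similarity (inner product of predicted class probabilities, assuming single-label data), the margin-based form, the default `margin`, and all function names are assumptions made for this example only.

```python
from typing import Sequence


def soft_similarity(p_labels_a: Sequence[float], p_labels_b: Sequence[float]) -> float:
    """Probability that two items share a class, assuming independent
    draws from each item's predicted label distribution (assumption)."""
    return sum(pa * pb for pa, pb in zip(p_labels_a, p_labels_b))


def expected_hamming(bits_a: Sequence[float], bits_b: Sequence[float]) -> float:
    """Expected Hamming distance between two codes whose bits are
    independent Bernoulli variables with the given probabilities."""
    return sum(p * (1 - q) + q * (1 - p) for p, q in zip(bits_a, bits_b))


def pairwise_loss(
    bits_a: Sequence[float],
    bits_b: Sequence[float],
    p_labels_a: Sequence[float],
    p_labels_b: Sequence[float],
    margin: float = 2.0,  # hypothetical value
) -> float:
    """Hypothetical pairwise loss driven by predicted, not true, labels."""
    s = soft_similarity(p_labels_a, p_labels_b)
    d = expected_hamming(bits_a, bits_b)
    # Likely-similar pairs are pulled together; likely-dissimilar pairs
    # are pushed apart until they are at least `margin` bits away.
    return s * d + (1.0 - s) * max(0.0, margin - d)
```

Since the similarity target comes from the model's own label predictions, a loss of this shape can be evaluated on unlabelled pairs as well as labelled ones, which is the general idea the abstract describes.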


Notes

1. A high value in this function means that the objects are more similar.
2. Our code is made publicly available at https://github.com/amacaluso/SSB-VAE.
3. A draft version of this work is available on arXiv:2007.08799.


Author information

Authors and Affiliations

Department of Informatics, Federico Santa María Technical University, Valparaíso, Chile
Ricardo Ñanculef & Francisco Mena
Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
Antonio Macaluso, Stefano Lodi & Claudio Sartori


Corresponding author

Correspondence to Antonio Macaluso.

Editor information

Editors and Affiliations

Universidade do Porto, Porto, Portugal
João Manuel R. S. Tavares
Universidade Estadual Paulista, São Paulo, Brazil
João Paulo Papa
University of the Balearic Islands, Palma de Mallorca, Spain
Manuel González Hidalgo


About this paper

Cite this paper

Ñanculef, R., Mena, F., Macaluso, A., Lodi, S., Sartori, C. (2021). Self-supervised Bernoulli Autoencoders for Semi-supervised Hashing. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_25


DOI: https://doi.org/10.1007/978-3-030-93420-0_25
Published: 13 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93419-4
Online ISBN: 978-3-030-93420-0
eBook Packages: Computer Science, Computer Science (R0)
