Re-ranking Permutation-Based Candidate Sets with the n-Simplex Projection

Amato, Giuseppe; Chávez, Edgar; Connor, Richard; Falchi, Fabrizio; Gennaro, Claudio; Vadicamo, Lucia

doi:10.1007/978-3-030-02224-2_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11223)

Included in the following conference series:

International Conference on Similarity Search and Applications

Abstract

In the realm of metric search, permutation-based approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results for a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutation-based result set is refined by directly comparing each candidate object to the query. Therefore, one drawback of these approaches is that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called n-Simplex projection, that can be used on metric spaces meeting the n-point property. The n-Simplex projection provides upper and lower bounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds, and we show how to use them to improve the permutation-based results. Our approach is particularly advantageous in all cases in which the traditional refining step is too costly, e.g. very large datasets or very expensive metric functions.
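To make the idea of pivot-derived bounds concrete, the following is a minimal sketch for the Euclidean case only. It is not taken from the chapter: the construction (a Cholesky factorisation of the pivots' Gram matrix followed by forward substitution), the function names `build_base`, `project` and `bounds`, the random toy data, and the choice to re-rank by the midpoint of the two bounds are all assumptions made for illustration. It also assumes the pivots are in general position, so that the factorisation does not break down.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_base(pivots, dist):
    """Factor the pivots' Gram matrix (relative to pivots[0]) as L L^T.

    Row i of the lower-triangular L holds coordinates for pivots[i + 1];
    pivots[0] sits at the origin. Assumes non-degenerate pivots.
    """
    d0 = [dist(pivots[0], p) for p in pivots]
    m = len(pivots) - 1
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            gram = (d0[i + 1] ** 2 + d0[j + 1] ** 2
                    - dist(pivots[i + 1], pivots[j + 1]) ** 2) / 2
            s = gram - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L, d0

def project(delta, L, d0):
    """Map the n object-pivot distances `delta` to n apex coordinates."""
    m = len(L)
    # Inner products of the object with each non-origin pivot.
    g = [(delta[0] ** 2 + d0[i + 1] ** 2 - delta[i + 1] ** 2) / 2
         for i in range(m)]
    y = []
    for i in range(m):  # forward substitution solving L y = g
        y.append((g[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    # Last coordinate: altitude of the object above the pivots' span.
    altitude_sq = delta[0] ** 2 - sum(v * v for v in y)
    y.append(math.sqrt(max(altitude_sq, 0.0)))
    return y

def bounds(apex_a, apex_b):
    """Lower and upper bound on the original distance from two apexes."""
    base = sum((u - v) ** 2 for u, v in zip(apex_a[:-1], apex_b[:-1]))
    lower = math.sqrt(base + (apex_a[-1] - apex_b[-1]) ** 2)
    upper = math.sqrt(base + (apex_a[-1] + apex_b[-1]) ** 2)
    return lower, upper

# Toy data (hypothetical): 4 pivots and 5 candidates in 8 dimensions.
random.seed(0)
dim, n_pivots = 8, 4
rand_point = lambda: [random.random() for _ in range(dim)]
pivots = [rand_point() for _ in range(n_pivots)]
L, d0 = build_base(pivots, euclidean)

query = rand_point()
candidates = [rand_point() for _ in range(5)]
apex_q = project([euclidean(query, p) for p in pivots], L, d0)

scored = []
for c in candidates:
    apex_c = project([euclidean(c, p) for p in pivots], L, d0)
    lower, upper = bounds(apex_q, apex_c)
    # The true distance is computed here only to check the bounds.
    scored.append(((lower + upper) / 2, lower, upper, euclidean(query, c)))
scored.sort()  # re-rank candidates by the midpoint estimate
for est, lower, upper, true_d in scored:
    print(f"lower={lower:.3f} true={true_d:.3f} upper={upper:.3f}")
```

In this sketch only the object-pivot distances are needed at re-ranking time, which mirrors the motivation stated in the abstract: the same distances used to build the permutations can be reused, and the original objects need not be accessed.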

Notes

1. Throughout this paper, we use the terms “metric” and “distance” interchangeably to indicate a function satisfying the metric postulates [23].
2. In this work, we focus on metric search. The requirement that the function d satisfies the metric postulates is sufficient, but not necessary, to produce a permutation-based representation. For example, d may be a dissimilarity function.
3. A simplex is a generalisation of a triangle or a tetrahedron in arbitrary dimensions. We refer to [12] for further details.
4. See also the on-line Appendix at http://arxiv.org/abs/1707.08370.
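As a small illustration of note 2, the sketch below builds permutation-based representations from a dissimilarity that is not a metric (a squared difference) and uses them to pick candidates. The helper names, the toy one-dimensional data, and the use of the Spearman footrule to compare permutations are assumptions chosen for illustration, not details taken from the chapter.

```python
def permutation(obj, pivots, dist):
    """Pivot indices ordered by increasing distance from obj (ties by index)."""
    return sorted(range(len(pivots)), key=lambda i: (dist(obj, pivots[i]), i))

def ranks(perm):
    """Inverse permutation: ranks[i] is the position of pivot i in perm."""
    r = [0] * len(perm)
    for position, pivot_index in enumerate(perm):
        r[pivot_index] = position
    return r

def spearman_footrule(perm_a, perm_b):
    """Sum of absolute differences between the positions of each pivot."""
    return sum(abs(x - y) for x, y in zip(ranks(perm_a), ranks(perm_b)))

def squared_gap(a, b):
    # A dissimilarity that is not a metric: the triangle inequality can fail.
    return (a - b) ** 2

# Hypothetical toy data on the real line.
pivots = [0.0, 5.0, 10.0]
data = [1.0, 4.0, 9.0]
query = 1.5

query_perm = permutation(query, pivots, squared_gap)
data_perms = {o: permutation(o, pivots, squared_gap) for o in data}

# Candidate set: the two objects whose permutations are closest to the query's.
candidates = sorted(
    data, key=lambda o: spearman_footrule(query_perm, data_perms[o])
)[:2]
print(query_perm, candidates)
```

Only the ordering of the pivots matters here, which is why the function d does not have to satisfy the metric postulates for the representation to be well defined.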

References

1. Amato, G., Falchi, F., Gennaro, C., Rabitti, F.: YFCC100M-HNfc6: a large-scale deep features benchmark for similarity search. In: Amsaleg, L., Houle, M.E., Schubert, E. (eds.) SISAP 2016. LNCS, vol. 9939, pp. 196–209. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46759-7_15
2. Amato, G., Falchi, F., Gennaro, C., Vadicamo, L.: Deep permutations: deep convolutional neural networks and permutation-based indexing. In: Amsaleg, L., Houle, M.E., Schubert, E. (eds.) SISAP 2016. LNCS, vol. 9939, pp. 93–106. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46759-7_7
3. Amato, G., Falchi, F., Rabitti, F., Vadicamo, L.: Some theoretical and experimental observations on permutation spaces and similarity search. In: Traina, A.J.M., Traina, C., Cordeiro, R.L.F. (eds.) SISAP 2014. LNCS, vol. 8821, pp. 37–49. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11988-5_4
4. Amato, G., Gennaro, C., Savino, P.: MI-File: using inverted files for scalable approximate similarity search. Multimed. Tools Appl. 71(3), 1333–1362 (2014)
5. Amato, G., Savino, P.: Approximate similarity search in metric spaces using inverted files. In: Proceedings of InfoScale 2008, pp. 28:1–28:10. ICST (2008)
6. Babenko, A., Lempitsky, V.: The inverted multi-index. In: Proceedings of CVPR 2012, pp. 3069–3076. IEEE (2012)
7. Blumenthal, L.M.: Theory and Applications of Distance Geometry. Clarendon Press, Oxford (1953)
8. Chávez, E., Figueroa, K., Navarro, G.: Effective proximity retrieval by ordering permutations. IEEE Trans. Pattern Anal. Mach. Intell. 30(9), 1647–1658 (2008)
9. Connor, R., Cardillo, F.A., Vadicamo, L., Rabitti, F.: Hilbert exclusion: improved metric search through finite isometric embeddings. ACM Trans. Inf. Syst. 35(3), 17:1–17:27 (2016)
10. Connor, R., Vadicamo, L., Cardillo, F.A., Rabitti, F.: Supermetric search with the four-point property. In: Amsaleg, L., Houle, M.E., Schubert, E. (eds.) SISAP 2016. LNCS, vol. 9939, pp. 51–64. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46759-7_4
11. Connor, R., Vadicamo, L., Cardillo, F.A., Rabitti, F.: Supermetric search. Inf. Syst. (2018). https://doi.org/10.1016/j.is.2018.01.002
12. Connor, R., Vadicamo, L., Rabitti, F.: High-dimensional simplexes for supermetric search. In: Beecks, C., Borutta, F., Kröger, P., Seidl, T. (eds.) SISAP 2017. LNCS, vol. 10609, pp. 96–106. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68474-1_7
13. Esuli, A.: Use of permutation prefixes for efficient and scalable approximate similarity search. Inf. Process. Manag. 48(5), 889–902 (2012)
14. Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proceedings of SODA 2003, pp. 28–36. Society for Industrial and Applied Mathematics (2003)
15. Figueroa, K., Navarro, G., Chávez, E.: Metric spaces library (2007). www.sisap.org/library/manual.pdf
16. Micó, M.L., Oncina, J., Vidal, E.: A new version of the nearest-neighbour approximating and eliminating search algorithm (AESA) with linear preprocessing time and memory requirements. Pattern Recogn. Lett. 15(1), 9–17 (1994)
17. Novak, D., Zezula, P.: PPP-codes for large-scale similarity searching. In: Hameurlain, A., Küng, J., Wagner, R., Decker, H., Lhotska, L., Link, S. (eds.) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIV. LNCS, vol. 9510, pp. 61–87. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49214-7_2
18. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of EMNLP 2014, pp. 1532–1543 (2014)
19. Pestov, V.: Indexability, concentration, and VC theory. J. Discret. Algorithms 13, 2–18 (2012)
20. Schoenberg, I.J.: Metric spaces and completely monotone functions. Ann. Math. 39(4), 811–841 (1938)
21. Thomee, B., et al.: YFCC100M: the new data in multimedia research. Commun. ACM 59(2), 64–73 (2016)
22. Weber, R., Schek, H.-J., Blott, S.: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings of VLDB 1998, pp. 194–205 (1998)
23. Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach. Advances in Database Systems, vol. 32. Springer, Boston (2006). https://doi.org/10.1007/0-387-29151-2
24. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems 27, pp. 487–495. Curran Associates Inc. (2014)

Acknowledgements

The work was partially funded by Smart News, “Social sensing for breaking news”, CUP CIPE D58C15000270008, and by VISECH, ARCO-CNR, CUP B56J17001330004.

Author information

Authors and Affiliations

Institute of Information Science and Technologies (ISTI), CNR, Pisa, Italy
Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro & Lucia Vadicamo
Department of Computer Science, CICESE, Ensenada, Mexico
Edgar Chávez
Department of Computing Science, University of Stirling, Stirling, FK9 4LA, Scotland
Richard Connor

Corresponding author

Correspondence to Lucia Vadicamo.

Editor information

Editors and Affiliations

University of Geneva, Carouge, Switzerland
Stéphane Marchand-Maillet
Arizona State University, Tempe, AZ, USA
Yasin N. Silva
Center for Scientific Research and Higher Education, Ensenada, Mexico
Edgar Chávez

About this paper

Cite this paper

Amato, G., Chávez, E., Connor, R., Falchi, F., Gennaro, C., Vadicamo, L. (2018). Re-ranking Permutation-Based Candidate Sets with the n-Simplex Projection. In: Marchand-Maillet, S., Silva, Y., Chávez, E. (eds) Similarity Search and Applications. SISAP 2018. Lecture Notes in Computer Science, vol 11223. Springer, Cham. https://doi.org/10.1007/978-3-030-02224-2_1

DOI: https://doi.org/10.1007/978-3-030-02224-2_1
Published: 04 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02223-5
Online ISBN: 978-3-030-02224-2
eBook Packages: Computer Science, Computer Science (R0)
