When Similarity Measures Lie

  • Conference paper
Similarity Search and Applications (SISAP 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9371)

Abstract

Do similarity or distance measures ever go wrong? The inherent subjectivity in similarity discernment has long supported the view that all judgements of similarity are equally valid, and that any selected similarity measure may only be considered more effective in some chosen domain. This paper presents evidence that such a view is incorrect for structural similarity comparisons. Similarity and distance measures occasionally do go wrong, and produce judgements that can be considered errors. This claim is supported by a novel method for assessing the quality of similarity and distance functions, which is based on a relative scale of similarity with respect to chosen reference objects. The method may be applied in any domain, and is demonstrated for common measures of structural similarity in graphs. Finally, the paper identifies three distinct kinds of relative similarity judgement errors, and shows how the distribution of these errors is related to graph properties under common similarity measures.

K. A. Naudé—This research was supported by the National Research Foundation, South Africa.

Author information

Corresponding author

Correspondence to Kevin A. Naudé.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Naudé, K.A., Greyling, J.H., Vogts, D. (2015). When Similarity Measures Lie. In: Amato, G., Connor, R., Falchi, F., Gennaro, C. (eds) Similarity Search and Applications. SISAP 2015. Lecture Notes in Computer Science, vol 9371. Springer, Cham. https://doi.org/10.1007/978-3-319-25087-8_11

  • DOI: https://doi.org/10.1007/978-3-319-25087-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25086-1

  • Online ISBN: 978-3-319-25087-8

  • eBook Packages: Computer Science, Computer Science (R0)
