Totally Looks Like - How Humans Compare, Compared to Machines

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11361)

Abstract

Perceptual judgment of image similarity by humans relies on rich internal representations ranging from low-level features to high-level concepts, scene properties and even cultural associations. However, existing methods and datasets that attempt to explain perceived similarity, even those designed for this purpose, use stimuli which arguably do not cover the full breadth of factors that affect human similarity judgments. We introduce a new dataset dubbed Totally-Looks-Like (TLL) after a popular entertainment website, which contains images paired by humans as being visually similar. The dataset contains 6016 image pairs from the wild, shedding light on a rich and diverse set of criteria employed by human beings. We conduct experiments to try to reproduce the pairings via features extracted from state-of-the-art deep convolutional neural networks, as well as additional human experiments to verify the consistency of the collected data. Though we create conditions that artificially make the matching task progressively easier, we show that machine-extracted representations perform very poorly in terms of reproducing the matching selected by humans. We discuss and analyze these results, suggesting future directions for improvement of learned image representations.
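
The matching experiment described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the evaluation protocol, so cosine similarity, top-1 accuracy, the function names, and the toy feature vectors are all assumptions made for illustration. In practice the vectors would be features extracted from a pretrained convolutional network.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def top1_matching_accuracy(left_feats, right_feats):
    """Fraction of left images whose nearest right image is the human-chosen one.

    Assumption: left_feats[i] and right_feats[i] are the two images of the
    i-th human-selected pair.
    """
    if not left_feats:
        return 0.0
    hits = 0
    for i, left in enumerate(left_feats):
        sims = [cosine_similarity(left, right) for right in right_feats]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += int(best == i)
    return hits / len(left_feats)


# Hypothetical toy features standing in for CNN activations (not real data).
left = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
right = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.1, 0.1, 0.5]]

accuracy = top1_matching_accuracy(left, right)
print(f"top-1 matching accuracy: {accuracy:.3f}")
```

In this toy setup the third left image is matched to the wrong right image, so the human pairing is only partly recovered. Restricting `right_feats` to a smaller candidate set is one way the task could be made easier, again as an assumption about the setup rather than a description of the paper's procedure.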

This research was supported through grants to the senior author, for which all authors are grateful: Air Force Office of Scientific Research (FA9550-18-1-0054), the Canada Research Chairs Program (950-219525), the Natural Sciences and Engineering Research Council of Canada (RGPIN-2016-05352) and the NSERC Canadian Network on Field Robotics (NETGP417354-11).

Notes

  1. http://memebase.cheezburger.com/totallylookslike.

  2. https://github.com/ageitgey/face_recognition.

Author information

Corresponding author

Correspondence to Amir Rosenfeld.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Rosenfeld, A., Solbach, M.D., Tsotsos, J.K. (2019). Totally Looks Like - How Humans Compare, Compared to Machines. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_18

  • DOI: https://doi.org/10.1007/978-3-030-20887-5_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20886-8

  • Online ISBN: 978-3-030-20887-5

  • eBook Packages: Computer Science, Computer Science (R0)
