Contribution of Low, Mid and High-Level Image Features of Indoor Scenes in Predicting Human Similarity Judgements

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2022)

Abstract

Human judgements can still be considered the gold standard in the assessment of image similarity, but they are expensive and time-consuming to acquire. Even though most existing computational models make almost exclusive use of low-level information to evaluate the similarity between images, human similarity judgements are known to rely on both high-level semantic and low-level visual image information. The current study aims to evaluate the impact of different types of image features on predicting human similarity judgements. We investigated how low-level (colour differences), mid-level (spatial envelope) and high-level (distributional semantics) information predict within-category human judgements of 400 indoor scenes across 4 categories in a Four-Alternative Forced Choice task, in which participants had to select the most distinctive scene among four scenes presented on the screen. Linear regression analysis showed that low-level (t = 4.14, p < 0.001), mid-level (t = 3.22, p < 0.01) and high-level (t = 2.07, p < 0.04) scene information significantly predicted the probability of a scene being selected. Additionally, an SVM model that incorporates low-, mid- and high-level properties had 56% accuracy in predicting human similarity judgements. Our results point to: 1) the importance of including mid- and high-level image properties in computational models of similarity to better characterise the cognitive mechanisms underlying human judgements, and 2) the necessity of further research into how human similarity judgements are made, as there is sizeable variability in our data that is not accounted for by the metrics we investigated.
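To make the task concrete, the following is a minimal sketch of how a single feature-based metric could be turned into a prediction for one Four-Alternative Forced Choice trial. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how distances were computed, so the cosine distance, the mean-distance "distinctiveness" score, the function names and the toy feature vectors are all hypothetical.

```python
from math import sqrt


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distinctiveness(features):
    """Mean distance from each scene to the other scenes in the same trial.

    Assumption: a scene is more 'distinctive' the further it lies, on
    average, from the remaining scenes shown on the screen.
    """
    scores = []
    for i, f in enumerate(features):
        others = [g for j, g in enumerate(features) if j != i]
        scores.append(sum(cosine_distance(f, g) for g in others) / len(others))
    return scores


def predict_choice(features):
    """Index of the scene with the highest distinctiveness score."""
    scores = distinctiveness(features)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy, made-up feature vectors for one trial of four scenes (not real data).
trial = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [1.0, 0.1, 0.0],
    [0.0, 0.0, 1.0],  # orthogonal to the other three scenes
]
predicted = predict_choice(trial)
```

Here `predicted` is the index of the fourth scene, because its vector is orthogonal to the rest. In a fuller analysis, one such score per feature level (low, mid, high) could serve as a predictor in a regression or classifier, as the abstract describes.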

This research was supported by Fundação para a Ciência e Tecnologia with a PhD scholarship to AM [SFRH/BD/144453/2019] and Grant [PTDC/PSI-ESP/30958/2017] to MIC.



Corresponding author

Correspondence to Anastasiia Mikhailova.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Mikhailova, A., Santos-Victor, J., Coco, M.I. (2022). Contribution of Low, Mid and High-Level Image Features of Indoor Scenes in Predicting Human Similarity Judgements. In: Pinho, A.J., Georgieva, P., Teixeira, L.F., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2022. Lecture Notes in Computer Science, vol 13256. Springer, Cham. https://doi.org/10.1007/978-3-031-04881-4_40

  • DOI: https://doi.org/10.1007/978-3-031-04881-4_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04880-7

  • Online ISBN: 978-3-031-04881-4

  • eBook Packages: Computer Science (R0)
