Benefiting from users’ gaze: selection of image regions from eye tracking information for provided tags

Multimedia Tools and Applications

Abstract

Providing image annotations is a tedious task. It becomes even more cumbersome when individual objects within the images are to be annotated. Such region-based annotations can be used in various ways, for example in similarity search or as training data for automatic object detection. We investigate the principal idea of finding objects in images by analyzing the gaze paths of users who view the images with an interest in a specific object. We analyzed 799 gaze paths from 30 subjects who viewed image-tag pairs with the task of deciding whether the tag can be found in the image. We compared 13 different fixation measures for analyzing the gaze paths. The best performing fixation measure correctly assigns a tag to a region for 63 % of the image-tag pairs and significantly outperforms three baselines. We examine the characteristics of the image regions, such as position and size, for correct and incorrect assignments. We also investigate whether aggregating gaze paths from several subjects improves the precision of identifying the correct regions. In addition, we explore the possibility of discriminating different regions within the same image: here we correctly identify two regions in the same image, obtained from different primings, with an accuracy of 38 %.
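To make the approach concrete, below is a minimal illustrative sketch, in Python, of how one simple fixation measure (the total fixation duration falling inside a candidate region) could be used to assign a provided tag to an image region. This is not the authors' implementation; the paper compares 13 such measures, and the names Fixation, Region, fixation_duration_score, and assign_tag_to_region below are hypothetical.

from dataclasses import dataclass
from typing import List

@dataclass
class Fixation:
    x: float            # gaze position in image coordinates (pixels)
    y: float
    duration_ms: float  # fixation duration in milliseconds

@dataclass
class Region:
    name: str           # candidate region, e.g. a bounding box for a tagged object
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, f: Fixation) -> bool:
        return self.left <= f.x <= self.right and self.top <= f.y <= self.bottom

def fixation_duration_score(fixations: List[Fixation], region: Region) -> float:
    # One simple fixation measure: total fixation time spent inside the region.
    return sum(f.duration_ms for f in fixations if region.contains(f))

def assign_tag_to_region(fixations: List[Fixation], regions: List[Region]) -> Region:
    # Assign the primed tag to the region that maximizes the fixation measure.
    return max(regions, key=lambda r: fixation_duration_score(fixations, r))

# Toy usage: one gaze path over an image with two candidate regions.
gaze_path = [Fixation(120, 80, 310), Fixation(135, 95, 450), Fixation(400, 300, 200)]
regions = [Region("dog", 100, 60, 200, 160), Region("tree", 350, 250, 480, 380)]
print(assign_tag_to_region(gaze_path, regions).name)  # -> "dog"

One straightforward way to aggregate gaze paths from several subjects, as investigated in the paper, would be to pool their fixations before computing the measure.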





Acknowledgement

We thank the subjects who participated in our experiment. The research leading to this article was partially supported by the EU project SocialSensor (FP7-287975).

Author information

Corresponding author

Correspondence to Tina Walber.


About this article

Cite this article

Walber, T., Scherp, A. & Staab, S. Benefiting from users’ gaze: selection of image regions from eye tracking information for provided tags. Multimed Tools Appl 71, 363–390 (2014). https://doi.org/10.1007/s11042-013-1390-3
