Abstract
Understanding a target audience's emotional responses to a video advertisement is crucial for evaluating the advertisement's effectiveness. However, traditional methods for collecting such information are slow, expensive, and coarse-grained. We propose AttentiveVideo, a scalable intelligent mobile interface with corresponding inference algorithms that monitor and quantify the effects of mobile video advertising in real time. Without requiring additional sensors, AttentiveVideo employs a combination of implicit photoplethysmography (PPG) sensing and facial expression analysis (FEA) to detect the attention, engagement, and sentiment of viewers as they watch video advertisements on unmodified smartphones. In a 24-participant study, AttentiveVideo achieved good accuracy on a wide range of emotional measures (best average accuracy of 82.6% across nine measures). While feature fusion alone did not improve prediction accuracy with a single model, it significantly improved accuracy when combined with model fusion. We also found that the PPG sensing channel and the FEA technique have different strengths in data availability, detection latency, accuracy, and usage environment. These findings show the potential for both low-cost collection and deep understanding of emotional responses to mobile video advertisements.
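To make the sensing idea concrete, the sketch below illustrates how implicit PPG sensing on an unmodified smartphone typically works: the viewer's fingertip rests on the back camera lens, and heart rate is recovered from frame-to-frame changes in the light absorbed by the fingertip. This is a minimal illustration of the general technique under stated assumptions, not the paper's implementation; the function name, RGB frame format, and filter settings are all hypothetical.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def heart_rate_from_frames(frames, fps):
    """Estimate heart rate (BPM) from fingertip camera frames.

    frames: iterable of HxWx3 uint8 RGB arrays captured while the
    fingertip covers the back camera lens (assumed format).
    fps: camera frame rate in frames per second.
    """
    # 1. Spatially average the red channel of each frame; blood-volume
    #    pulses modulate how much light the fingertip absorbs.
    signal = np.array([frame[..., 0].mean() for frame in frames])

    # 2. Band-pass filter to a plausible heart-rate band (0.7-4 Hz,
    #    roughly 42-240 BPM) to suppress drift and high-frequency noise.
    nyquist = fps / 2.0
    b, a = butter(2, [0.7 / nyquist, 4.0 / nyquist], btype="band")
    filtered = filtfilt(b, a, signal)

    # 3. Detect systolic peaks, enforcing a minimum inter-beat interval.
    peaks, _ = find_peaks(filtered, distance=max(1, int(fps * 0.25)))
    if len(peaks) < 2:
        return None                     # not enough beats observed
    ibi = np.diff(peaks) / fps          # inter-beat intervals (seconds)
    return 60.0 / ibi.mean()            # mean heart rate in BPM
```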
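The abstract's fusion finding also has a natural reading in code: feature fusion concatenates PPG and FEA features for a single model, while model fusion combines the predictions of separate per-channel models. The sketch below, using scikit-learn with entirely hypothetical feature names, shapes, and placeholder data, shows one way the two strategies can be layered together; it is not the paper's actual classifier configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder per-segment features and labels; shapes are illustrative.
ppg = rng.random((200, 12))      # e.g., heart-rate and HRV statistics
fea = rng.random((200, 20))      # e.g., facial action unit intensities
y = rng.integers(0, 2, 200)      # e.g., engaged vs. not engaged

fused = np.hstack([ppg, fea])    # feature-level fusion of both channels

# One classifier per view: PPG-only, FEA-only, and fused features.
clfs = [
    LogisticRegression(max_iter=1000).fit(X, y)
    for X in (ppg, fea, fused)
]

def predict(ppg_new, fea_new):
    """Model fusion: average the per-view class probabilities."""
    views = (ppg_new, fea_new, np.hstack([ppg_new, fea_new]))
    proba = np.mean([c.predict_proba(v) for c, v in zip(clfs, views)],
                    axis=0)
    return proba.argmax(axis=1)

# Usage: classify five new ad segments from both sensing channels.
print(predict(rng.random((5, 12)), rng.random((5, 20))))
```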
Supplemental Material
Supplemental movie, appendix, image, and software files for "AttentiveVideo: A Multimodal Approach to Quantify Emotional Responses to Mobile Advertisements" are available for download.