Contextual Video Recommendation by Multimodal Relevance and User Feedback

Authors:
Tao Mei (Microsoft Research Asia), Bo Yang (University of Southern California), Xian-Sheng Hua (Microsoft Research Asia), Shipeng Li (Microsoft Research Asia)

ACM Transactions on Information Systems, Volume 29, Issue 2, Article No. 10, pp. 1–24. https://doi.org/10.1145/1961209.1961213

Published: 01 April 2011

Abstract

With Internet delivery of video content surging to an unprecedented level, video recommendation, which suggests relevant videos to targeted users according to their historical and current viewings or preferences, has become one of the most pervasive online video services. This article presents a novel contextual video recommendation system, called VideoReach, based on multimodal content relevance and user feedback. An online video usually consists of different modalities (i.e., visual and audio tracks, as well as associated texts such as query, keywords, and surrounding text). Therefore, the recommended videos should be relevant to the current viewing in terms of multimodal relevance. Moreover, different parts of a video hold different degrees of interest for a user, and different features and modalities contribute differently to the overall relevance. As a result, the recommended videos should also be relevant to current users in terms of user feedback (i.e., user click-through). We then design a unified framework for VideoReach that seamlessly integrates multimodal relevance and user feedback through relevance feedback and attention fusion. VideoReach represents one of the first attempts toward contextual recommendation driven by video content and user click-through, without assuming that a sufficient collection of user profiles is available. We conducted experiments over a large-scale real-world video dataset and report the effectiveness of VideoReach.
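The idea of combining per-modality relevance with click-through can be illustrated with a toy sketch. This is not the VideoReach algorithm: the abstract does not give its formulas, so the sketch swaps in a plain weighted linear fusion and a simple additive weight update in place of the article's attention fusion and relevance feedback. The modality names, the `Candidate` type, the functions `fuse`, `recommend`, and `update_weights`, the `rate` parameter, and all scores are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical modality names; the article's actual feature set may differ.
MODALITIES = ("textual", "visual", "aural")


@dataclass
class Candidate:
    video_id: str
    relevance: dict  # per-modality relevance to the current video, in [0, 1]


def fuse(relevance, weights):
    """Weighted linear fusion (a stand-in for attention fusion)."""
    total = sum(weights[m] for m in MODALITIES)
    return sum(weights[m] * relevance[m] for m in MODALITIES) / total


def recommend(candidates, weights, k=2):
    """Return the ids of the k candidates with the highest fused relevance."""
    ranked = sorted(candidates, key=lambda c: fuse(c.relevance, weights), reverse=True)
    return [c.video_id for c in ranked[:k]]


def update_weights(weights, clicked, skipped, rate=0.5):
    """Shift weight toward modalities that rate clicked videos above skipped ones."""
    def mean(group, m):
        return sum(c.relevance[m] for c in group) / len(group) if group else 0.0

    raw = {
        m: max(weights[m] + rate * (mean(clicked, m) - mean(skipped, m)), 1e-6)
        for m in MODALITIES
    }
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}  # renormalize to sum to 1


candidates = [
    Candidate("a", {"textual": 0.9, "visual": 0.2, "aural": 0.1}),
    Candidate("b", {"textual": 0.3, "visual": 0.8, "aural": 0.6}),
    Candidate("c", {"textual": 0.5, "visual": 0.5, "aural": 0.5}),
]
weights = {m: 1.0 / len(MODALITIES) for m in MODALITIES}

first_round = recommend(candidates, weights)
# Simulated click-through: the user clicks "a" and skips "b".
weights = update_weights(weights, clicked=[candidates[0]], skipped=[candidates[1]])
second_round = recommend(candidates, weights)
```

In this toy run, the click on a text-relevant video raises the textual weight, so that video moves up in the second round. The point is only that click-through can re-weight modalities without any stored user profile, which is the property the abstract emphasizes.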

Index Terms

Contextual Video Recommendation by Multimodal Relevance and User Feedback
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. World Wide Web
    1. Web applications
    2. Web services

Published in

ACM Transactions on Information Systems Volume 29, Issue 2
April 2011
193 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/1961209

Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 April 2011
- Accepted: 1 December 2010
- Revised: 1 August 2010
- Received: 1 January 2010
Author Tags
Video recommendation
image retrieval
relevance feedback
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 95
- Total Downloads: 1,472
- Downloads (Last 12 months): 67
- Downloads (Last 6 weeks): 10