ABSTRACT
It has long been observed that well-configured recommender system ensembles can be more effective than their constituent systems taken separately. Sophisticated approaches have been developed to automatically optimize ensemble configurations and maximize the resulting performance gains. However, most work in this area has targeted simplified scenarios in which algorithms are tested and compared in a single non-interactive run. In this paper we take a more realistic perspective that accounts for the cyclic nature of the recommendation task, where a large part of the system's input is collected from users' reactions to the recommendations they are delivered. This cyclic process gives ensembles the opportunity to observe and learn the effectiveness of the combined algorithms, and to improve the ensemble configuration progressively.
We explore the adaptation of a multi-armed bandit approach to this end: the combined systems are represented as arms, and the ensemble as a bandit that at each step selects an arm to produce the next round of recommendations. We report experiments showing the effectiveness of this approach compared to ensembles that lack the iterative perspective. Along the way, we identify illustrative pitfalls that can arise from common single-shot offline evaluation setups.
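The arm-selection loop described above can be sketched with Thompson sampling, a standard bandit strategy: each arm (base recommender) keeps a Beta posterior over its click-through rate, one arm is sampled per cycle to produce the recommendation, and the observed feedback updates that arm. This is a minimal illustration under assumed Bernoulli (click/no-click) rewards, not the paper's exact method; the class and function names are invented for the example.

```python
import random

class BanditEnsemble:
    """Thompson-sampling ensemble over base recommenders (illustrative sketch).

    Each arm is a callable recommender (user -> item). Rewards are assumed
    Bernoulli: click = 1, no click = 0.
    """

    def __init__(self, recommenders):
        self.recommenders = recommenders
        # Beta(1, 1) uniform priors on each arm's reward rate
        self.successes = [1] * len(recommenders)
        self.failures = [1] * len(recommenders)

    def select_arm(self):
        # Sample a plausible reward rate for each arm and pick the best draw
        samples = [random.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def recommend(self, user):
        arm = self.select_arm()
        return arm, self.recommenders[arm](user)

    def update(self, arm, reward):
        # Observed user feedback updates the selected arm's posterior
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1
```

Over repeated cycles the posterior of the better-performing recommender concentrates on a higher reward rate, so the ensemble increasingly routes recommendations through it while still occasionally exploring the weaker arms.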
Index Terms
- Multi-armed recommender system bandit ensembles