DOI: 10.1145/3604915.3608788 (ACM RecSys conference proceedings)
research-article
Open Access

A Lightweight Method for Modeling Confidence in Recommendations with Learned Beta Distributions

Published: 14 September 2023

ABSTRACT

Most recommender systems (RecSys) do not provide an indication of confidence in their decisions. Therefore, they do not distinguish between recommendations about which they are certain and those about which they are not. Existing confidence methods for RecSys are either inaccurate heuristics, conceptually complex, or computationally very expensive. Consequently, real-world RecSys applications rarely adopt these methods and thus provide no insight into the confidence of their behavior.

In this work, we propose learned beta distributions (LBD) as a simple and practical recommendation method with an explicit measure of confidence. Our main insight is that beta distributions predict user preferences as probability distributions that naturally model confidence on a closed interval, yet can be implemented with minimal model complexity. Our results show that LBD maintains accuracy competitive with existing methods while also having a significantly stronger correlation between its accuracy and confidence. Furthermore, LBD achieves higher performance when applied to a high-precision targeted recommendation task.
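
To illustrate why a beta distribution can carry both a preference estimate and a confidence signal, the following is a minimal sketch, not the LBD model itself. The abstract does not specify how the parameters are learned or how confidence is read off, so the class name `BetaPrediction`, the use of variance as an inverse confidence proxy, and the 1 to 5 rating scale are all assumptions made for illustration. It uses only the standard closed-form mean and variance of a Beta(alpha, beta) distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPrediction:
    """Hypothetical container for one user-item prediction.

    alpha and beta are the two positive shape parameters of a beta
    distribution over the unit interval [0, 1]. How a model would
    produce them is NOT described in the abstract; here they are
    simply given.
    """
    alpha: float
    beta: float

    def mean(self) -> float:
        # Standard mean of Beta(alpha, beta).
        return self.alpha / (self.alpha + self.beta)

    def variance(self) -> float:
        # Standard variance of Beta(alpha, beta); a smaller value means
        # the probability mass is more concentrated around the mean.
        total = self.alpha + self.beta
        return (self.alpha * self.beta) / (total ** 2 * (total + 1.0))

    def expected_rating(self, low: float = 1.0, high: float = 5.0) -> float:
        # Rescale the mean from [0, 1] to an assumed rating scale.
        return low + (high - low) * self.mean()


# Two predictions with the same expected preference but different spread.
uncertain = BetaPrediction(alpha=2.0, beta=2.0)
confident = BetaPrediction(alpha=20.0, beta=20.0)

for name, pred in (("uncertain", uncertain), ("confident", confident)):
    print(name, pred.expected_rating(), round(pred.variance(), 4))
```

Both predictions have the same expected rating, yet the second has a much smaller variance. A point-estimate model cannot express this difference, which is the kind of distinction the abstract refers to when it says beta distributions naturally model confidence on a closed interval.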

Our work thus shows that confidence in RecSys is possible without sacrificing simplicity or accuracy, and without introducing heavy computational complexity. Thereby, we hope it enables better insight into real-world RecSys and opens the door for novel future applications.
