DOI: 10.1145/2645710.2645746
Research article

Comparative recommender system evaluation: benchmarking recommendation frameworks

Published: 06 October 2014

ABSTRACT

Recommender systems research is often based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in design and implementation of an evaluation strategy. Additionally, algorithmic implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations.

In this work we compare common recommendation algorithms as implemented in three popular recommendation frameworks. To provide a fair comparison, we have complete control of the evaluation dimensions being benchmarked: dataset, data splitting, evaluation strategies, and metrics. We also include results using the internal evaluation mechanisms of these frameworks. Our analysis points to large differences in recommendation accuracy across frameworks and strategies, i.e. the same baselines may perform orders of magnitude better or worse across frameworks. Our results show the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.
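
To make the controlled evaluation dimensions concrete, the sketch below illustrates one way a data split and metrics can be held fixed outside of any framework, so that each framework only supplies rating predictions or ranked lists. This is a minimal illustration rather than code from the paper or from any of the benchmarked toolkits; the function names (split_ratings, rmse, precision_at_k), the seeded random split, and the relevance threshold of 4.0 are assumptions chosen for the example.

```python
import random
from collections import defaultdict


def split_ratings(ratings, test_fraction=0.2, seed=42):
    """Seeded random split of (user, item, rating) triples into train/test.

    Performing the split outside any framework ensures every framework
    is trained and evaluated on exactly the same data."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def rmse(test, predictions):
    """Root mean squared error over the test ratings a framework could score.

    `predictions` maps (user, item) -> predicted rating; unpredicted pairs
    are skipped, which is itself one of the choices that can differ
    between evaluation setups."""
    errors = [(r - predictions[(u, i)]) ** 2
              for u, i, r in test if (u, i) in predictions]
    return (sum(errors) / len(errors)) ** 0.5 if errors else float("nan")


def precision_at_k(test, ranked_lists, k=10, threshold=4.0):
    """Precision@k, treating test items rated >= threshold as relevant.

    `ranked_lists` maps user -> list of recommended items, best first."""
    relevant = defaultdict(set)
    for u, i, r in test:
        if r >= threshold:
            relevant[u].add(i)
    scores = [len(set(items[:k]) & relevant[u]) / k
              for u, items in ranked_lists.items() if relevant[u]]
    return sum(scores) / len(scores) if scores else float("nan")
```

Under a setup like this, the data split and the metric code are shared across all frameworks, so differences in reported accuracy can only come from the frameworks' algorithm implementations and prediction coverage, which is the kind of controlled comparison the abstract describes.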

Supplemental Material

p129-sidebyside.mp4 (mp4, 50.1 MB)


  • Published in

    RecSys '14: Proceedings of the 8th ACM Conference on Recommender systems
    October 2014, 458 pages
    ISBN: 9781450326681
    DOI: 10.1145/2645710

    Copyright © 2014 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    RecSys '14 paper acceptance rate: 35 of 234 submissions (15%). Overall acceptance rate: 254 of 1,295 submissions (20%).
