research-article

Comparative recommender system evaluation: benchmarking recommendation frameworks

Authors:
Alan Said (TU-Delft, Delft, Netherlands),
Alejandro Bellogín (Universidad Autónoma de Madrid, Madrid, Spain)

RecSys '14: Proceedings of the 8th ACM Conference on Recommender systems, October 2014, Pages 129–136, https://doi.org/10.1145/2645710.2645746

Published: 06 October 2014

ABSTRACT

Recommender systems research is often based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in design and implementation of an evaluation strategy. Additionally, algorithmic implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations.

In this work we compare common recommendation algorithms as implemented in three popular recommendation frameworks. To provide a fair comparison, we have complete control of the evaluation dimensions being benchmarked: dataset, data splitting, evaluation strategies, and metrics. We also include results using the internal evaluation mechanisms of these frameworks. Our analysis points to large differences in recommendation accuracy across frameworks and strategies, i.e. the same baselines may perform orders of magnitude better or worse across frameworks. Our results show the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.
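The idea of holding the evaluation dimensions fixed can be illustrated with a minimal sketch. It is not the authors' benchmarking code and does not use any of the three frameworks: the function names, the toy ratings, and the two mean-based baselines are all hypothetical stand-ins. The point is only that every candidate is scored on the same seeded split with the same metric, so differences in the scores come from the recommenders rather than from the evaluation setup.

```python
import math
import random


def split_ratings(ratings, test_fraction=0.2, seed=42):
    """Shuffle once with a fixed seed so every candidate sees the same split."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def rmse(predict, test):
    """Root-mean-square error of predict(user, item) over held-out ratings."""
    squared = [(predict(user, item) - rating) ** 2 for user, item, rating in test]
    return math.sqrt(sum(squared) / len(squared))


def global_mean(train):
    """Hypothetical baseline: predict the mean training rating everywhere."""
    mean = sum(rating for _, _, rating in train) / len(train)
    return lambda user, item: mean


def item_mean(train):
    """Hypothetical baseline: predict each item's mean, else the global mean."""
    totals = {}
    for _, item, rating in train:
        total, count = totals.get(item, (0.0, 0))
        totals[item] = (total + rating, count + 1)
    fallback = global_mean(train)

    def predict(user, item):
        if item in totals:
            total, count = totals[item]
            return total / count
        return fallback(user, item)

    return predict


def benchmark(ratings, builders, seed=42):
    """Score every builder on one shared split with one shared metric."""
    train, test = split_ratings(ratings, seed=seed)
    return {name: rmse(build(train), test) for name, build in builders.items()}


if __name__ == "__main__":
    # Toy (user, item, rating) triples, made up for illustration only.
    toy = [("u%d" % (k % 3), "i%d" % (k % 4), float(1 + k % 5)) for k in range(10)]
    print(benchmark(toy, {"global_mean": global_mean, "item_mean": item_mean}))
```

In a real comparison each builder would wrap a framework's training routine, and the split and metric code would stay outside all frameworks, which is the kind of control the abstract describes.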

Supplemental Material

p129-sidebyside.mp4 (mp4, 50.1 MB)

Published in
RecSys '14: Proceedings of the 8th ACM Conference on Recommender systems
October 2014
458 pages
ISBN: 9781450326681
DOI: 10.1145/2645710
General Chairs: Alfred Kobsa (University of California, Irvine), Michelle Zhou (IBM)
Program Chairs: Martin Ester (Simon Fraser University), Yehuda Koren (Google)
Copyright © 2014 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
benchmarking
evaluation
recommender systems
Acceptance Rates
RecSys '14 Paper Acceptance Rate: 35 of 234 submissions, 15%
Overall Acceptance Rate: 254 of 1,295 submissions, 20%