RecTree: An Efficient Collaborative Filtering Method

Chee, Sonny Han Seng; Han, Jiawei; Wang, Ke

doi:10.1007/3-540-44801-2_15

Conference paper
First Online: 01 January 2001

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2114)

Abstract

Many people rely on the recommendations of trusted friends to find restaurants or movies that match their tastes. But what if your friends have not sampled the item of interest? Collaborative filtering (CF) seeks to increase the effectiveness of this process by automating the derivation of a recommendation, often from a clique of advisors with whom we have no prior personal relationship. CF is a promising tool for dealing with the information overload that we face in the networked world.

Prior work in CF has focused on improving the accuracy of the predictions. However, scaling these methods to large databases remains challenging. In this study, we develop an efficient collaborative filtering method, called RecTree (which stands for RECommendation Tree), that addresses the scalability problem with a divide-and-conquer approach. The method first performs an efficient k-means-like clustering to group the data and create neighborhoods of similar users, and then performs subsequent clustering on the smaller, partitioned databases. Since the progressive partitioning reduces the search space dramatically, the search for an advisory clique is faster than scanning the entire database of users. In addition, each partition contains users that are more similar to each other than to users in other partitions. This characteristic allows RecTree to avoid the dilution of opinions from good advisors by a multitude of poor advisors, and thus to yield a higher overall accuracy.
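
The divide-and-conquer idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the farthest-point seeding, the filling of missing ratings with each user's mean, the leaf-size threshold max_size, the correlation-weighted prediction formula, and all function names below are assumptions made for illustration only.

```python
from math import sqrt

def user_mean(row):
    """Mean of the ratings a user actually gave (None = unrated)."""
    rated = [r for r in row if r is not None]
    return sum(rated) / len(rated)

def fill_missing(row):
    """Assumption: replace unrated items with the user's own mean rating."""
    m = user_mean(row)
    return [m if r is None else r for r in row]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def split_two(X, ids, iters=10):
    """2-means-style split of user ids into two groups (near, far)."""
    c0 = X[ids[0]]                                      # first user as seed
    c1 = X[max(ids, key=lambda u: sq_dist(X[u], c0))]   # farthest user as seed
    near, far = list(ids), []
    for _ in range(iters):
        near = [u for u in ids if sq_dist(X[u], c0) <= sq_dist(X[u], c1)]
        far = [u for u in ids if sq_dist(X[u], c0) > sq_dist(X[u], c1)]
        if not near or not far:
            break                                       # degenerate split
        c0 = centroid([X[u] for u in near])
        c1 = centroid([X[u] for u in far])
    return near, far

def build_leaves(X, ids, max_size):
    """Recursively partition users until each group has <= max_size members."""
    if len(ids) <= max_size:
        return [ids]
    near, far = split_two(X, ids)
    if not near or not far:
        return [ids]                                    # cannot split further
    return build_leaves(X, near, max_size) + build_leaves(X, far, max_size)

def pearson(a, b):
    """Pearson correlation over the items both users rated."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x, _ in pairs) * sum((y - my) ** 2 for _, y in pairs)
    return cov / sqrt(var) if var > 0 else 0.0

def predict(R, leaves, user, item):
    """Correlation-weighted prediction using only advisors in the user's leaf."""
    leaf = next(l for l in leaves if user in l)
    num = den = 0.0
    for v in leaf:
        if v == user or R[v][item] is None:
            continue
        w = pearson(R[user], R[v])
        num += w * (R[v][item] - user_mean(R[v]))
        den += abs(w)
    base = user_mean(R[user])
    return base if den == 0 else base + num / den

# Toy ratings (rows = users, columns = items); values are made up.
R = [
    [5, 4, 1, 1, None],
    [4, 5, 2, 1, 1],
    [5, 5, 1, 2, 1],
    [1, 1, 5, 4, 5],
    [2, 1, 4, 5, 5],
    [1, 2, 5, 5, 4],
]
X = [fill_missing(row) for row in R]
leaves = build_leaves(X, list(range(len(R))), max_size=3)
estimate = predict(R, leaves, user=0, item=4)
```

On this toy matrix the two taste groups end up in separate leaves, so the estimate for user 0 on item 4 draws only on users 1 and 2 and is not diluted by the three dissimilar users.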

Based on our experiments and performance study, RecTree outperforms the well-known collaborative filter CorrCF in both execution time and accuracy. In particular, RecTree’s execution time scales as O(n log₂ n) with the dataset size n, while CorrCF scales quadratically.
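
To get a feel for the gap between the two growth rates, here is a back-of-the-envelope comparison. It ignores constant factors, which the abstract does not report, and the dataset size n = 10^6 is a hypothetical value chosen only for illustration.

```latex
\[
n = 10^{6}: \qquad
n \log_2 n \approx 10^{6} \times 19.93 \approx 2.0 \times 10^{7},
\qquad
n^{2} = 10^{12},
\qquad
\frac{n^{2}}{n \log_2 n} \approx 5 \times 10^{4}.
\]
```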

Author information

Authors and Affiliations

Simon Fraser University, Burnaby, B.C., Canada
Sonny Han Seng Chee, Jiawei Han & Ke Wang

Editor information

Editors and Affiliations

Kyoto University, Kyoto, 606-8501, Japan
Yahiko Kambayashi
EC3, Siebensterngasse 21/3, 1070, Wien
Werner Winiwarter
Center for Spatial Information Science (CSIS), University of Tokyo, 4-6-1, Komaba Meguro-ku, Tokyo, 153-8904, Japan
Masatoshi Arikawa

About this paper

Cite this paper

Chee, S.H.S., Han, J., Wang, K. (2001). RecTree: An Efficient Collaborative Filtering Method. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_15

DOI: https://doi.org/10.1007/3-540-44801-2_15
Published: 28 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42553-3
Online ISBN: 978-3-540-44801-3
eBook Packages: Springer Book Archive
