Abstract
Based on a weighted projection of a bipartite user-object network, we introduce a personalized recommendation algorithm, called network-based inference (NBI), which is more accurate than the classical collaborative filtering algorithm. In NBI, the correlation resulting from a specific attribute may be counted repeatedly in the cumulative recommendations from different objects. By considering higher-order correlations, we design an improved algorithm that can, to some extent, eliminate these redundant correlations. We test our algorithm on two benchmark data sets, MovieLens and Netflix. Compared with NBI, the algorithmic accuracy, measured by the ranking score, is further improved by 23 per cent for MovieLens and 22 per cent for Netflix. The present algorithm can even outperform the Latent Dirichlet Allocation algorithm, which requires much longer computational time. Furthermore, most previous studies considered algorithmic accuracy only; in this paper, we argue that diversity and popularity, as two significant criteria of algorithmic performance, should also be taken into account. With more or less the same accuracy, an algorithm giving higher diversity and lower popularity is more favorable. Numerical results show that the present algorithm outperforms the standard one simultaneously in all five adopted metrics: lower ranking score and higher precision for accuracy, larger Hamming distance and lower intra-similarity for diversity, and smaller average degree for popularity.
GENERAL SCIENTIFIC SUMMARY
Introduction and background. The information explosion confronts us with an overload problem: it is hard to get what you want from millions of books and billions of web pages. The most promising method is to automatically provide personalized recommendations according to the users' past activities. Many information filtering tools have been proposed, such as matrix decomposition techniques, machine learning techniques and content-based analysis. However, physical processes are rarely taken into consideration. In addition, most studies focus overwhelmingly on accuracy as the only important factor without any concern for novelty or diversity of recommendations.
Main results. This paper designs a simple recommendation algorithm based on a two-step diffusion process, and shows that a counter-intuitive choice of parameter can eliminate redundant correlations and thus lead to both more accurate and more diverse recommendations. The advantages of the proposed algorithm have been demonstrated by extensive comparison with some representative algorithms in various performance metrics for accuracy, diversity and novelty.
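The two-step diffusion mentioned above can be illustrated with a minimal sketch. It assumes one common formulation of resource diffusion on a bipartite user-object network: each object collected by the target user starts with one unit of resource, objects split their resource equally among the users who collected them, and users then split what they received equally among their own objects. The function name `diffuse_two_step` and the toy data are hypothetical, and the sketch covers only the basic diffusion step, not the redundancy-eliminating improvement or its parameter.

```python
def diffuse_two_step(user_items, target_user):
    """Rank uncollected objects for target_user by two-step resource diffusion.

    user_items: dict mapping each user to the set of objects they collected.
    Returns a list of (object, score) pairs, highest score first.
    """
    # Object degree: the number of users who collected each object.
    item_degree = {}
    for items in user_items.values():
        for item in items:
            item_degree[item] = item_degree.get(item, 0) + 1

    # Initial resource: one unit on every object the target user collected.
    collected = user_items[target_user]

    # Step 1 (objects -> users): each object splits its resource equally
    # among the users who collected it.
    user_resource = {
        user: sum(1.0 / item_degree[i] for i in items & collected)
        for user, items in user_items.items()
    }

    # Step 2 (users -> objects): each user splits the resource received
    # equally among the objects they collected.
    scores = {item: 0.0 for item in item_degree}
    for user, items in user_items.items():
        if not items:
            continue
        share = user_resource[user] / len(items)
        for item in items:
            scores[item] += share

    # Recommend only objects the target user has not collected yet.
    candidates = [i for i in scores if i not in collected]
    candidates.sort(key=lambda i: (-scores[i], str(i)))
    return [(i, scores[i]) for i in candidates]


# Toy data (hypothetical): three users, four objects.
toy = {
    "alice": {"A", "B"},
    "bob": {"B", "C"},
    "carol": {"A", "C", "D"},
}
recommendations = diffuse_two_step(toy, "alice")
```

In this toy case object C ranks above object D for "alice", because C receives resource through two users while D receives it through one. The total resource is conserved by the two steps, which is a quick sanity check on any implementation of this kind.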
Wider implications. This paper presents evidence that some physical dynamics can be applied to solve the information filtering problem. The significance of diversity and novelty in recommendations is emphasized, and three usable measurements are presented. We highlight the view that an accurate recommendation is not necessarily a useful one: real value is found in the ability to suggest objects users would not readily discover for themselves. The viewpoint, algorithm and measurements presented in this paper can be applied in the design of next-generation recommendation systems.
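Two of the measurements named in the abstract, Hamming distance for diversity and average degree for popularity, are simple enough to sketch. The version below is only an illustration under stated assumptions: Hamming distance is taken as one minus the fraction of objects shared by two users' equal-length recommendation lists, and average degree as the mean number of collectors over the recommended objects. The function names and the toy values are hypothetical.

```python
def hamming_distance(list_a, list_b):
    """Diversity between two equal-length recommendation lists.

    Returns 0.0 when the lists contain the same objects and 1.0 when
    they share none.
    """
    if len(list_a) != len(list_b) or not list_a:
        raise ValueError("lists must be non-empty and of equal length")
    overlap = len(set(list_a) & set(list_b))
    return 1.0 - overlap / len(list_a)


def average_degree(recommended, item_degree):
    """Mean popularity (number of collectors) of the recommended objects."""
    if not recommended:
        raise ValueError("recommendation list must be non-empty")
    return sum(item_degree[i] for i in recommended) / len(recommended)


# Toy values (hypothetical): two users' top-3 lists and object degrees.
diversity = hamming_distance(["A", "B", "C"], ["B", "C", "D"])
popularity = average_degree(["A", "D"], {"A": 2, "D": 1})
```

Under these definitions, a larger Hamming distance means different users receive more distinct lists, and a smaller average degree means the lists lean toward less popular objects.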