Qualitative analysis of user-based and item-based prediction algorithms for recommendation agents

doi:10.1016/j.engappai.2005.06.010

Engineering Applications of Artificial Intelligence

Volume 18, Issue 7, October 2005, Pages 781-789

https://doi.org/10.1016/j.engappai.2005.06.010

Abstract

Recommendation agents employ prediction algorithms to provide users with items that match their interests. In this paper, several prediction algorithms are described and evaluated, some of which are novel in that they combine user-based and item-based similarity measures derived from either explicit or implicit ratings. Both statistical and decision-support accuracy metrics of the algorithms are compared across different levels of data sparsity and different operational thresholds. The first metric evaluates accuracy in terms of average absolute deviation, while the second evaluates how effectively predictions help users to select high-quality items. The experimental results indicate that item-based predictions derived from explicit ratings perform better on both metrics. Category-boosted predictions are slightly more accurate when combined with explicit ratings, while implicit ratings, in the context defined in this paper, perform much worse than explicit ratings.

Introduction

Recommendation systems (Resnick and Varian, 1997) have been a popular topic of research ever since the ubiquity of the web made it clear that people of hugely varying backgrounds would be able to access and query the same underlying data. The initial human–computer interaction challenge has been compounded by the observation that customized services require sophisticated data structures and well thought-out architectures to be able to scale up to thousands of users and beyond.

In recent years, recommendation agents have been extensively adopted by both research and e-commerce recommendation systems in order to provide an intelligent mechanism that filters out excess information and lets customers effortlessly find items that they will probably like, according to their logged history of prior transactions.

Recommendation agents need to employ efficient prediction algorithms so as to provide accurate recommendations to users. If a prediction is defined as a value that expresses the predicted likelihood that a user will “like” an item, then a recommendation is defined as the list of n items corresponding to the top-n predictions from the set of items available. Better prediction algorithms therefore yield better recommendations. This is why it is essential to explore and understand the broad characteristics and potential of prediction algorithms, and why this work concentrates on this research direction.
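The relationship between predictions and a top-n recommendation can be sketched in a few lines of Python. This is only an illustration: the function name `recommend_top_n`, the item identifiers and the scores are hypothetical and do not come from the paper.

```python
import heapq


def recommend_top_n(predictions, n):
    """Return the ids of the n items with the highest predicted scores.

    predictions: dict mapping item id -> predicted likelihood that the
    user will like the item (hypothetical values, for illustration only).
    """
    best = heapq.nlargest(n, predictions.items(), key=lambda pair: pair[1])
    return [item_id for item_id, _score in best]


# Hypothetical predictions for one user.
predictions = {"movie_a": 0.91, "movie_b": 0.35, "movie_c": 0.78, "movie_d": 0.60}
top_two = recommend_top_n(predictions, 2)
```

Under this definition the quality of the recommendation list depends entirely on the quality of the predicted scores, which is the point the paragraph makes.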

There are generally two methods to formulate recommendations, both depending on the type of items to be recommended as well as on the way that user models (Allen, 1990) are constructed. The two approaches are content-based (Balabanovic and Shoham, 1997; Kalles et al., 2003) and collaborative filtering (Herlocker et al., 2000; Hofmann, 2003), while additional hybrid techniques have been proposed as well (Balabanovic and Shoham, 1997).

Content-based recommendation algorithms: Content-based algorithms are principally used when documents are to be recommended, such as web pages, publications, jokes or news. The agent maintains information about user preferences, obtained either from the interests that users state during the registration process or from the ratings they give to documents. Recommendations are formed by taking into account the content of documents and by filtering in the ones that best match the user's preferences and logged profile.

Collaborative-filtering-based recommendation algorithms: Collaborative-filtering algorithms aim to identify users that have relevant interests and preferences by calculating similarities and dissimilarities between user profiles (Herlocker et al., 2004). The idea behind this method is that one's search for information may benefit from consulting the behavior of other users who share the same or relevant interests and whose opinion can be trusted.

The challenges for recommendation algorithms span three key dimensions, identified as sparsity, scalability and cold-start.

Sparsity: Even very active users rate just a few of the total number of items available in a database. As the majority of recommendation algorithms are based on similarity measures computed over the co-rated set of items, high levels of sparsity can be detrimental to recommendation agents. Huang et al. (2004) propose to deal with the sparsity problem by applying an associative retrieval framework and related spreading activation algorithms to explore transitive associations among consumers through their past transactions and feedback.

Scalability: Recommendation algorithms appear to be effective at filtering in items that are interesting to users. However, they require computations that are very expensive and grow non-linearly with the number of users and items in a database. Therefore, in order to bring recommendation algorithms successfully to the web and to provide recommendations with acceptable delay, sophisticated data structures and advanced, scalable architectures are required. Cosley et al. (2002) describe an open framework for practical testing of recommendation systems, in an attempt to provide a standard, public testbed for evaluating recommendation algorithms in real-world conditions.

Cold-start: An item cannot be recommended unless it has been rated by a substantial number of users. This problem applies to new and obscure items and is particularly detrimental to users with eclectic taste (Schein et al., 2002; Melville et al., 2002). Likewise, a new user has to rate a sufficient number of items before the recommendation algorithm is able to provide reliable and accurate recommendations.

The primary contributions of this work are:

• The utilization of explicit ratings in an “implicit” sense so as to enrich a user's model, without actually prompting users to express their preference for categories.
• The description of item-based and user-based similarity measures derived from either explicit or implicit ratings.
• The formation of a range of item-based and user-based prediction algorithms according to item-based and user-based similarity measures.
• The qualitative analysis and experimental evaluation of the presented prediction algorithms.

Section 2 describes a set of similarity measures to compare the relevance between users or items. Section 3 describes a set of existing and newly introduced prediction algorithms that integrate the similarity measures. Section 4 presents the evaluation metrics employed to compare the algorithms and discusses the results of the evaluation. Section 5 summarizes the contributions of this work and outlines directions for further research.

Section snippets

Similarity measures

In this section, a set of similarity measures is presented based on the Pearson correlation coefficient, a metric of relevance between two vectors (Pearson, 1900). When the values of these vectors are associated with a user's model, the similarity is called user-based similarity, whereas when they are associated with an item's model, it is called item-based similarity. The similarity measure can be effectively used to balance the ratings' significance in a prediction algorithm and ...
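As a rough illustration of a Pearson-based similarity, the sketch below correlates two rating vectors over the entries they have in common. It makes several assumptions that are not taken from the paper: ratings are stored as plain dictionaries, means are computed over the co-rated entries only, and an undefined correlation (fewer than two common entries, or zero variance) is reported as 0.0. The function name `pearson_similarity` is hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the keys rated in both dictionaries.

    For user-based similarity the dictionaries map item id -> rating of
    one user; for item-based similarity they map user id -> rating of
    one item. Returns 0.0 when the correlation is undefined (assumption).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[key] for key in common]
    ys = [ratings_b[key] for key in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return covariance / sqrt(var_x * var_y)


# Hypothetical users: alice and bob agree on the ordering of m1..m3,
# alice and carol disagree completely.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 3, "m3": 2, "m4": 5}
carol = {"m1": 1, "m2": 3, "m3": 5}
```

The same function covers both cases named in the text: passing two users' ratings gives a user-based similarity, and passing two items' ratings (keyed by user) gives an item-based one.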

Prediction algorithms

Prediction algorithms (Breese et al., 1998) try to guess the rating that a user is going to provide for an item. This user will be referred to as the active user $u_a$ and this item as the active item $i_a$. These algorithms take advantage of the logged history of ratings and of content associated with users and items in order to provide predictions.
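To make the idea concrete, the following sketch shows one widely used user-based formulation from the collaborative-filtering literature: the active user's mean rating is adjusted by the similarity-weighted deviations of other users' ratings for the active item. It is not necessarily one of the algorithms evaluated in this paper; the function name `predict_rating`, the data layout and the fallback to the active user's mean are assumptions made for illustration.

```python
def predict_rating(active_ratings, neighbours, active_item):
    """Predict the active user's rating for active_item.

    active_ratings: dict item id -> rating given by the active user
                    (assumed non-empty).
    neighbours:     list of (similarity, ratings dict) pairs, one per
                    other user; similarities are assumed precomputed.
    """
    mean_active = sum(active_ratings.values()) / len(active_ratings)
    weighted_deviation = 0.0
    total_weight = 0.0
    for similarity, ratings in neighbours:
        if active_item not in ratings:
            continue  # this neighbour says nothing about the active item
        mean_neighbour = sum(ratings.values()) / len(ratings)
        weighted_deviation += similarity * (ratings[active_item] - mean_neighbour)
        total_weight += abs(similarity)
    if total_weight == 0:
        return mean_active  # assumption: fall back to the user's own mean
    return mean_active + weighted_deviation / total_weight


# Hypothetical data: the active user has not rated "m3".
active = {"m1": 4, "m2": 2}
neighbours = [
    (1.0, {"m1": 5, "m3": 3}),
    (1.0, {"m2": 1, "m3": 5}),
]
prediction = predict_rating(active, neighbours, "m3")
```

Working with deviations from each user's mean rather than raw ratings compensates for users who rate systematically high or low.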

Data set

The experimental data comes from an in-house movie recommendation system named Movie Recommendation System (MRS). The MRS database currently consists of 2068 ratings provided by 114 users to 641 movies, each of which belongs to at least 1 of 21 categories. Therefore the lowest level of sparsity for the tests is defined as $\frac{114 \times 641 - 2068}{114 \times 641} \simeq 0.9717$. The prediction algorithms are tested over a pre-selected set of 300 ratings extracted randomly from the set of 2068 actual ratings. The interested user is ...
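The sparsity level quoted for the MRS data is the fraction of empty cells in the user-item rating matrix, and it can be checked with a short calculation. The helper name `sparsity` is hypothetical; the three counts are the ones given in the text.

```python
def sparsity(num_users, num_items, num_ratings):
    """Fraction of user-item pairs for which no rating exists."""
    total_cells = num_users * num_items
    return (total_cells - num_ratings) / total_cells


# Figures reported for the MRS database: 114 users, 641 movies, 2068 ratings.
mrs_sparsity = sparsity(114, 641, 2068)
print(round(mrs_sparsity, 4))  # prints 0.9717
```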

Conclusions and future work

The vast volume of information flowing on the web has given rise to the need for information filtering techniques. Recommendation agents are effectively used to filter out excess information and to provide personalized services to users by employing sophisticated, well thought-out prediction algorithms. This work described how explicit ratings can be utilized in order to implicitly obtain users' preferences for specific categories. A number of prediction algorithms have been designed and ...

References (17)

Allen, R.B., 1990. User models: theory, method and practice. International Journal of Man–Machine...
Balabanovic, M., et al., 1997. Combining content-based and collaborative recommendation. Communications of the ACM.
Breese, J.S., Heckerman, D., Kadie, C., 1998. Empirical analysis of predictive algorithms for collaborative filtering....
Cosley, D., Lawrence, S., Pennock, D. M., 2002. REFEREE: an open framework for practical testing of recommender systems...
Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J., 1999. An algorithmic framework for performing collaborative...
Herlocker, J.L., Konstan, J.A., Riedl, J., 2000. Explaining collaborative filtering recommendations. Proceedings of the...
Herlocker, J., et al., 2004. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS).
Hofmann, T., 2003. Collaborative filtering via Gaussian probabilistic latent semantic analysis. Proceedings of the 26th...

There are more references available in the full text version of this article.


^☆: This paper is part of the special issue of selected best papers of the 9th international workshop on cooperative information agents (CIA 2004) organised by Matthias Klusch, Rainer Unland, and Sascha Ossowski.
