=== Memory-based ===
The memory-based approach uses user rating data to compute the similarity between users or items. Typical examples of this approach are neighbourhood-based CF and item-based/user-based top-N recommendations. For example, in user-based approaches, the rating that user u gives to item i is calculated as an aggregation of some similar users' ratings of that item:

:r_{u,i} = \operatorname{aggr}_{u^\prime \in U} r_{u^\prime, i}

where U denotes the set of the top N users most similar to user u who rated item i. Some examples of the aggregation function include:

:r_{u,i} = \frac{1}{N}\sum\limits_{u^\prime \in U}r_{u^\prime, i}

:r_{u,i} = k\sum\limits_{u^\prime \in U}\operatorname{simil}(u,u^\prime)r_{u^\prime, i}

where k is a normalizing factor defined as k = 1/\sum_{u^\prime \in U}|\operatorname{simil}(u,u^\prime)|, and

:r_{u,i} = \bar{r_u} + k\sum\limits_{u^\prime \in U}\operatorname{simil}(u,u^\prime)(r_{u^\prime, i}-\bar{r_{u^\prime}})

where \bar{r_u} is the average rating of user u over all the items rated by u.

The neighborhood-based algorithm calculates the similarity between two users or items and produces a prediction for the user by taking the
weighted average of all the ratings. Similarity computation between items or users is an important part of this approach. Multiple measures, such as Pearson correlation and vector cosine-based similarity, are used for this.

The Pearson correlation similarity of two users x and y is defined as:

:\operatorname{simil}(x,y) = \frac{\sum\limits_{i \in I_{xy}}(r_{x,i}-\bar{r_x})(r_{y,i}-\bar{r_y})}{\sqrt{\sum\limits_{i \in I_{xy}}(r_{x,i}-\bar{r_x})^2}\sqrt{\sum\limits_{i \in I_{xy}}(r_{y,i}-\bar{r_y})^2}}

where I_{xy} is the set of items rated by both user x and user y.

The cosine-based approach defines the cosine similarity between two users
x and y as:

:\operatorname{simil}(x,y) = \cos(\vec x,\vec y) = \frac{\vec x \cdot \vec y}{\|\vec x\| \times \|\vec y\|} = \frac{\sum\limits_{i \in I_{xy}}r_{x,i}r_{y,i}}{\sqrt{\sum\limits_{i \in I_{x}}r_{x,i}^2}\sqrt{\sum\limits_{i \in I_{y}}r_{y,i}^2}}

The user-based top-N recommendation algorithm uses a similarity-based vector model to identify the
k most similar users to an active user. After the k most similar users are found, their corresponding user-item matrices are aggregated to identify the set of items to be recommended. A popular method for finding the similar users is locality-sensitive hashing, which implements the nearest-neighbor mechanism in linear time.

The advantages of this approach include: the explainability of the results, which is an important aspect of recommendation systems; easy creation and use; easy facilitation of new data; content-independence of the items being recommended; and good scaling with co-rated items.

There are also several disadvantages of this approach. Its performance decreases when data is sparse, which is common for web-related items. This hinders the scalability of this approach and creates problems with large datasets. Although it can efficiently handle new users because it relies on a data structure, adding new items is more complicated, because that representation usually relies on a specific vector space: adding a new item requires including it and re-inserting all the elements in the structure.
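The Pearson similarity and the mean-centered weighted aggregation described above can be sketched in a few lines of plain Python. The users, items, and ratings below are purely illustrative; user means are taken over all of a user's ratings, as in the prediction formula.

```python
# Minimal sketch of user-based neighborhood CF with hypothetical data.
# Ratings are stored as {user: {item: rating}}.
from math import sqrt

ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5, "d": 3},
    "carol": {"a": 1, "b": 5, "d": 4},
}

def mean(user):
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(x, y):
    # Pearson correlation over the items rated by both x and y.
    common = ratings[x].keys() & ratings[y].keys()
    if not common:
        return 0.0
    mx, my = mean(x), mean(y)
    num = sum((ratings[x][i] - mx) * (ratings[y][i] - my) for i in common)
    dx = sqrt(sum((ratings[x][i] - mx) ** 2 for i in common))
    dy = sqrt(sum((ratings[y][i] - my) ** 2 for i in common))
    return num / (dx * dy) if dx and dy else 0.0

def predict(u, item):
    # Mean-centered weighted aggregation over neighbors who rated the item,
    # normalized by k = 1 / sum(|simil|) as in the third formula above.
    sims = [(pearson(u, v), v) for v in ratings if v != u and item in ratings[v]]
    norm = sum(abs(s) for s, _ in sims)
    if norm == 0:
        return mean(u)
    return mean(u) + sum(s * (ratings[v][item] - mean(v)) for s, v in sims) / norm

print(round(predict("alice", "d"), 2))  # → 3.4
```

Note that carol's negative correlation with alice pulls the prediction below alice's own mean of 4.0, which is exactly the behavior the mean-centering term provides.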
=== Model-based ===
An alternative to memory-based methods is to learn models that predict users' ratings of unrated items. Model-based CF algorithms include Bayesian networks, clustering models, latent semantic models such as singular value decomposition, probabilistic latent semantic analysis, multiple multiplicative factor, latent Dirichlet allocation, and Markov decision process-based models.

In this approach, dimensionality reduction methods are mostly used to improve the robustness and accuracy of memory-based methods. Specifically, methods like singular value decomposition and principal component analysis, known as latent factor models, compress a user-item matrix into a low-dimensional representation in terms of latent factors. This transforms the large matrix, which contains many missing values, into a much smaller one. The compressed matrix can then be used to find the neighbors of a user or item as in the previous section. Compression has two advantages in large, sparse data: it is more accurate and it scales better.
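A latent factor model of this kind can be sketched as a small matrix factorization trained by gradient descent (a Funk-style approximation of SVD that tolerates missing values). The ratings, factor count, and hyperparameters below are illustrative assumptions, not a reference implementation.

```python
# Latent-factor sketch: factor a sparse user-item matrix into P (users) and
# Q (items) with k latent factors, trained only on observed ratings.
import random

random.seed(0)

# (user, item, rating) triples; missing pairs are simply absent.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 4), (2, 2, 2)]
n_users, n_items, k = 3, 3, 2

# Initialize small random factor matrices.
P = [[random.uniform(0, 0.1) for _ in range(k)] for _ in range(n_users)]
Q = [[random.uniform(0, 0.1) for _ in range(k)] for _ in range(n_items)]

def predict(u, i):
    # Predicted rating is the dot product of the latent factor vectors.
    return sum(P[u][f] * Q[i][f] for f in range(k))

lr, reg = 0.05, 0.02  # learning rate and L2 regularization strength
for epoch in range(200):
    for u, i, r in ratings:
        err = r - predict(u, i)
        for f in range(k):
            pu, qi = P[u][f], Q[i][f]
            P[u][f] += lr * (err * qi - reg * pu)  # gradient step on P
            Q[i][f] += lr * (err * pu - reg * qi)  # gradient step on Q

# The learned factors approximately reproduce observed ratings and also
# fill in unobserved cells, e.g. user 0's rating for unseen item 2:
print(round(predict(0, 2), 2))
```

The compressed representation (rows of P and Q) can also serve as the vectors for the neighborhood search of the previous section.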
=== Hybrid ===
A number of applications combine the memory-based and the model-based CF algorithms. These hybrids overcome the limitations of native CF approaches and improve prediction performance. Importantly, they mitigate CF problems such as sparsity and loss of information. However, they increase complexity and are more expensive to implement. Most commercial recommender systems are hybrid, for example, the Google News recommender system.
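One simple way to combine the two families is a weighted blend of their predictions. The component predictors, weight, and data below are hypothetical stand-ins for illustration; real systems tune (or learn) the blend weight offline.

```python
# Hypothetical weighted hybrid: blend a memory-based prediction with a
# model-based one. Both component functions here are fixed stand-ins.

def hybrid_predict(user, item, memory_cf, model_cf, alpha=0.6):
    """Blend two CF predictors; alpha weights the memory-based estimate."""
    return alpha * memory_cf(user, item) + (1 - alpha) * model_cf(user, item)

memory_cf = lambda u, i: 4.0  # e.g. a neighborhood-based estimate
model_cf = lambda u, i: 3.0   # e.g. a latent-factor estimate

print(round(hybrid_predict("u1", "i9", memory_cf, model_cf), 2))  # → 3.6
```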
=== Deep learning ===
In recent years, many neural and deep-learning techniques have been proposed for collaborative filtering. Some generalize traditional matrix factorization algorithms via a non-linear neural architecture, or leverage new model types such as variational autoencoders. Deep learning has been applied to many scenarios (context-aware, sequence-aware, social tagging, etc.). However, its effectiveness for collaborative recommendation has been questioned. A systematic analysis of publications applying deep learning or neural methods to the top-k recommendation problem, published in top conferences (SIGIR, KDD, WWW, RecSys), found that, on average, fewer than 40% of the articles could be reproduced, and as few as 14% in some conferences. Overall, the study identified 18 articles; only 7 of them could be reproduced, and 6 of those could be outperformed by older, simpler, properly tuned baselines. The article highlights potential problems in today's research scholarship and calls for improved scientific practices. Similar issues have been spotted by others, including in sequence-aware recommender systems.

== Context-aware collaborative filtering ==