0% found this document useful (0 votes)
77 views14 pages

SVD in Collaborative Filtering

The document discusses matrix factorization for collaborative filtering recommender systems. It explains that matrix factorization introduces latent factors to model relationships between users, items, and ratings. Singular value decomposition (SVD) is commonly used for matrix factorization and was the technique employed by the winning team in the Netflix Prize contest. SVD decomposes the user-item rating matrix into the product of three matrices, representing user factors, singular values, and item factors. This lower-dimensional representation can be used to predict missing ratings. Examples are provided to illustrate applying SVD to decompose a ratings matrix and make recommendations.

Uploaded by

Jerusalem Fetene
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
77 views14 pages

SVD in Collaborative Filtering

The document discusses matrix factorization for collaborative filtering recommender systems. It explains that matrix factorization introduces latent factors to model relationships between users, items, and ratings. Singular value decomposition (SVD) is commonly used for matrix factorization and was the technique employed by the winning team in the Netflix Prize contest. SVD decomposes the user-item rating matrix into the product of three matrices, representing user factors, singular values, and item factors. This lower-dimensional representation can be used to predict missing ratings. Examples are provided to illustrate applying SVD to decompose a ratings matrix and make recommendations.

Uploaded by

Jerusalem Fetene
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Outline

 CF – Model Based
 Matrix Factorization (MF) – SVD

 SVD Concepts and Methods

 Netflix Prize and Example of SVD


RS – Matrix Factorization CF...

CF using MF - concepts
• Matrix factorization a latent factor model. Latent variables (also
called features, aspects, or factors) are introduced to account
for the underlying reasons of a user purchasing or using a
product.
 When the connections between the latent variables and
observed variables (user, product, rating, etc.) are estimated
during the training
 Recommendations can be made to users by computing their
possible interactions with each product through the latent
variables.

2
RS – Matrix Factorization CF...

CF using matrix factorization


 Matrix factorization has gained popularity for CF in recent years due to
its superior performance both in terms of recommendation quality and
scalability.
 Part of its success is due to the Netflix Prize contest for movie
recommendation, which popularized a Singular Value Decomposition
(SVD) based matrix factorization algorithm.
 The prize winning method of the Netflix Prize Contest employed an
adapted version of SVD
RS – Matrix Factorization CF...
MF: Singular Value Decomposition (SVD)
Consider user-movie-rating data represented as a matrix R,
which has 17,000X500,000 = 8.5 billion cells or entries. Each
non-empty cell rij of R represents a known rating of user I on
movie j. Our SVD will decompose R into matrices, i.e. user
aspect U and movie aspect M matrices. That is, we want to use
UTM to approximate R, i.e.

where U = [u1, u2, …, uI] and M = [m1, m2, …, mJ]


RS – Matrix Factorization CF...
SVD method ...
• Let us use K = 90 latent aspects (K needs to be set
experimentally).
• Then, each movie will be described by only ninety aspect values
indicating how much that movie exemplifies each aspect.
• Correspondingly, each user is also described by ninety aspect
values indicating how much he/she prefers each aspect.
RS – Matrix Factorization CF...
SVD method...
• To combine these together into a rating, we multiply each user
preference by the corresponding movie aspect, and then sum
them up to give a rating to indicate how much that user likes
that movie:

• U = [u1, u2, …, uI] and M = [m1, m2, …, mJ]

𝑟ij ≈ 𝑢𝑖 𝑇 𝑚𝑗 = 𝑢ki × 𝑚kj


k=1

• Using SVD, we can perform the task


RS – Matrix Factorization CF...
SVD method...
• To minimize the error, the gradient descent approach
is used.
• For gradient descent, we take the partial derivative of
the square error with respect to each parameter, i.e.
with respect to each uki and mkj.
RS – Matrix Factorization CF...
Netflix Prize & Matrix Factorization
17,700 movies
The $1 Million Question
1 3 4
3 5 5
4 5 5
3
480,000
users 3
2 2 2
5
2 1 1
3 3
1
RS – Matrix Factorization CF...
Matrix Factorization of Ratings Data
n movies f
n movies
x

f
m users
m users

~
~
rui ~ pu qTi
~

• Based on the idea of Latent Factor Analysis


• Identify latent (unobserved) factors that “explain” observations in the
data
• In this case, observations are user ratings of movies
• The factors may represent combinations of features or characteristics of
movies and users that result in the ratings
RS – Matrix Factorization CF...
Matrix Factorization
Pk Dim1 Dim2

Alice 0.47 -0.30 QkT


Dim1 -0.44 -0.57 0.06 0.38 0.57
Bob -0.44 0.23
Dim2 0.58 -0.66 0.26 0.18 -0.36
Mary 0.70 -0.06

Sue 0.31 0.93

𝑟ui =p𝑘 Alice × 𝑞𝑘𝑇 EPL


Prediction:

Note: This can also do factorization via Singular Value Decomposition (SVD)
RS – Matrix Factorization CF...
Example 1 Suppose a rating matrix R is given. Apply SVD to decompose
of ratings matrix R into the product of P & Q. Each user & item is
described with F latent features. Predict the missing ratings.
• P : user factors
F factors
• Q : item factors

F factors
Missing
ratings

Prediction for user u about item i :

Predicted
ratings

11
RS – Matrix Factorization CF...
Example 2:
Given a rating matrix R, then apply SVD to decompose the ratings matrix R
into the product of U, Ʃ & V. Each user & item is described with latent
features
• U : user factors
• Ʃ : Singular values
• V : item factors

Missing
ratings

12
RS – Matrix Factorization CF...
MF: the 3rd column that produces low singular value in the 3rd row and
3rd column are removed
Remove
dimension with
low singular
values

Multiplying the reduced matrices will give reconstructed matrix that


can be used as predicted rating of missing rating of new users

Predicted
ratings

13
The Netflix Prize

 Started on Oct. 2006

 $1,000,000 Grand Prize

 Training dataset: 100 million ratings (1,2,3,4,5 stars) from 480K customers on 18
K movies.

 Qualifying set (2,817,131 ratings)

 Goal:
 Improve the Netflix existing algorithm by at least 10%
 Reduce RMSE From 0.9525 to RMSE<0.8572

14

You might also like