Module 5
User Profile: In the user profile, we create vectors that describe the user's preferences. To
build it, we use the utility matrix, which describes the relationship between users and items.
With this information, the best estimate we can make of which items a user likes is some
aggregation of the profiles of the items that user has rated.

Item Profile: In a content-based recommender, we must build a profile for each item that
represents its important characteristics. For example, if the item is a movie, then its actors,
director, release year, and genre are its most significant features. We can also add its IMDb
(Internet Movie Database) rating to the item profile.

Utility Matrix: The utility matrix records each user's preference for certain items. From the
data gathered from users, we need to find some relation between the items a user likes and
those the user dislikes; the utility matrix serves this purpose. In it, we assign a value to each
user-item pair, known as the degree of preference. We then lay out the users against the items
as a matrix to identify their preference relationships.
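
As a minimal sketch of how a user profile might be aggregated from item profiles and one row of the utility matrix: the movies, the three genre features, the ratings, and the function name `build_user_profile` are all hypothetical, and weighting each item by its mean-centered rating is only one possible choice of aggregation.

```python
# Hypothetical item profiles: each movie is a vector over the features
# [action, comedy, drama]; 1 means the movie has that feature.
item_profiles = {
    "Movie 1": [1, 0, 0],
    "Movie 2": [1, 1, 0],
    "Movie 3": [0, 0, 1],
}

# Hypothetical utility matrix row for one user: ratings on a 1-5 scale.
# Movies the user has not rated are simply absent (the blank cells).
user_ratings = {"Movie 1": 5, "Movie 3": 1}


def build_user_profile(ratings, profiles):
    """Aggregate the profiles of rated items, weighted by mean-centered rating."""
    mean_rating = sum(ratings.values()) / len(ratings)
    n_features = len(next(iter(profiles.values())))
    profile = [0.0] * n_features
    for item, rating in ratings.items():
        weight = rating - mean_rating  # positive if liked more than average
        for k, feature in enumerate(profiles[item]):
            profile[k] += weight * feature
    return [value / len(ratings) for value in profile]


user_profile = build_user_profile(user_ratings, item_profiles)
print(user_profile)  # [1.0, 0.0, -1.0]
```

Here the user ends up with a positive weight for the first feature and a negative weight for the third, reflecting the high rating for Movie 1 and the low rating for Movie 3.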
Some cells of the matrix are blank because we do not get complete input from every user, and
the goal of a recommendation system is not to fill in every cell but to recommend movies that
the user is likely to prefer. From this table, our recommender system would not suggest
Movie 3 to User 2: User 1 and User 2 gave approximately the same rating to Movie 1, and
User 1 gave Movie 3 a low rating, so it is likely that User 2 will not like it either.
Recommending Items to User Based on Content:
Method 1: We can use the cosine distance between the item's vector and the user's
vector to estimate how much the user will prefer the item. Consider an example: the
user's vector has positive numbers for actors who tend to appear in movies the user
likes and negative numbers for actors the user doesn't like. For a movie with many
actors the user likes and only a few the user doesn't like, the cosine of the angle
between the user's and the movie's vectors will be a large positive fraction. The
angle is therefore close to 0, so the cosine distance between the vectors is small,
which indicates that the user tends to like the movie. If the cosine distance is
large, we tend to leave the item out of the recommendations.
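
The comparison described in Method 1 can be sketched as follows. The actor weights and movie vectors are made-up illustrations, and cosine distance is taken here to be the angle between the vectors, as the text describes; some libraries instead define it as 1 minus the cosine.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Angle between the vectors in degrees (0 means same direction)."""
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))  # guard against rounding
    return math.degrees(math.acos(cos))


# Hypothetical user vector over four actors: positive = liked, negative = disliked.
user = [1.0, 0.5, 0.8, -0.6]
# Movie featuring three liked actors and one disliked actor.
movie_mostly_liked = [1, 1, 1, 1]
# Movie featuring only the disliked actor.
movie_disliked = [0, 0, 0, 1]

print(cosine_distance(user, movie_mostly_liked))  # below 90 degrees
print(cosine_distance(user, movie_disliked))      # above 90 degrees
```

The first movie gets the smaller angle, so under this sketch it would be the one to recommend.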
Collaborative Filtering
In Collaborative Filtering, we find similar users and recommend what those similar users
like. In this type of recommendation system, we don't use the features of the item to
recommend it; rather, we group the users into clusters of similar types and recommend items
to each user according to the preferences of that user's cluster.
There are basically four types of algorithms, or techniques, for building Collaborative
Filtering-based recommender systems:
Memory-Based
Model-Based
Hybrid
Deep Learning
In this scenario, we can see that User 1 and User 2 give nearly the same ratings to the
movies, so we can conclude that Movie 3 is likely to get an average rating from User 1,
while Movie 4 will be a good recommendation for User 2. We can also see that some users
have different tastes: User 1 and User 3 are opposite to each other. User 3 and User 4
share a common interest in movies, and on that basis we can say that Movie 4 is also
likely to be disliked by User 4. This is Collaborative Filtering: we recommend to users
the items that are liked by users with similar interests.
Cosine Similarity
We can also use the cosine similarity between users to find users with similar
interests. A larger cosine implies a smaller angle between two users, and hence more
similar interests. We can compute the cosine distance between two users' rows in the
utility matrix, assigning zero to all the unfilled cells to make the calculation easy.
If the cosine is small, the distance between the users is large; if the cosine is
large, the angle between the users is small, and we can recommend similar things to
them.
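
A minimal sketch of this user-to-user comparison, with blank cells filled with zero as described. The ratings matrix and the helper names `zero_fill` and `most_similar` are hypothetical.

```python
import math

# Hypothetical utility matrix (rows = users, columns = Movie 1..4).
# None marks a blank cell, i.e. a movie the user has not rated.
ratings = {
    "User 1": [5, 4, None, 1],
    "User 2": [4, 5, 2, None],
    "User 3": [1, None, 5, 4],
}


def zero_fill(row):
    """Replace blank cells with 0, as the text suggests."""
    return [0 if r is None else r for r in row]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for a user with no ratings
    return dot / (norm_u * norm_v)


def most_similar(user, matrix):
    """Return the other user whose zero-filled row has the largest cosine."""
    target = zero_fill(matrix[user])
    others = [name for name in matrix if name != user]
    return max(
        others,
        key=lambda name: cosine_similarity(target, zero_fill(matrix[name])),
    )


print(most_similar("User 1", ratings))  # User 2
```

Note that on a 1-5 scale a zero behaves like a rating below the lowest possible score, so zero-filling treats "not rated" as a strong dislike; it keeps the arithmetic simple at the cost of that distortion.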
Taking the previous example again and applying the rounding-off process, the data becomes
much more readable: we can see that User 1 and User 2 are more similar to each other, and
User 3 and User 4 are more alike.
Normalizing Rating
In the process of normalizing, we take a user's average rating and subtract it from each
of that user's ratings, so ratings above the user's average become positive and ratings
below it become negative, which makes it simpler to classify users into similar groups.
By normalizing the data, we can form clusters of users who give similar ratings to
similar items and then use these clusters to recommend items to the users.
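
The normalization step can be sketched as below, assuming each user's row is a list in which `None` marks an unrated item. The function name `normalize_ratings` and the two example rows are hypothetical.

```python
def normalize_ratings(row):
    """Subtract the user's mean rating from each rating; blanks stay None."""
    rated = [r for r in row if r is not None]
    if not rated:
        return list(row)  # nothing to normalize for a user with no ratings
    mean = sum(rated) / len(rated)
    return [None if r is None else r - mean for r in row]


# Hypothetical rows: a generous rater and a harsh rater with the same taste.
generous = normalize_ratings([5, 4, None, 3])
harsh = normalize_ratings([3, 2, None, 1])
print(generous)  # [1.0, 0.0, None, -1.0]
print(harsh)     # [1.0, 0.0, None, -1.0]
```

After normalization the two rows are identical, which is why mean-centering can make users with the same relative preferences easier to group together.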
Challenges to be Faced while using Collaborative Filtering
Like every algorithm, Collaborative Filtering has its pros and cons. Collaborative Filtering
algorithms are dynamic and can adapt to changes in user preferences over time. But one of the
main issues recommender systems face is scalability: as the user base grows, the computation
and data storage requirements increase manifold, which leads to slow and inaccurate
results.
Also, collaborative filtering algorithms fail to recommend a diverse range of products,
because they are based on historical data and therefore recommend items related to that history.
Matrix Factorization
Choosing the Objective Function
One intuitive objective function is the squared distance. To do this, minimize the sum of squared
errors over all pairs of observed entries:
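
One common way to write this objective is sketched below. The notation is an assumption, since the text above does not define it: $A$ is the utility matrix, $U_i$ is the embedding vector of user $i$, $V_j$ is the embedding vector of item $j$, and $\text{obs}$ is the set of user-item pairs with an observed entry.

```latex
\min_{U,\, V} \; \sum_{(i, j) \in \text{obs}} \left( A_{ij} - \langle U_i, V_j \rangle \right)^2
```

Each term is the squared difference between an observed entry and the value predicted by the dot product of the corresponding user and item embeddings; unobserved entries do not contribute to the sum.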