04-01-2021
Recommendation System
MR. U.A.NULI
Introduction:
How do we buy things?
How do we make decisions in our day-to-day lives?
We ask our friends or relatives for suggestions before making decisions.
When it comes to making decisions online about buying products, we read reviews about the
products from anonymous users, compare the products' specifications with other similar
products and then we make our decisions to buy or not.
In an online world, where information is growing at an exponential rate, looking for valid
information will be a challenge.
Recommender systems can be used to provide relevant and required information.
Mr. U.A.Nuli, Assistant Professor, Computer Science and Engineering Department
Applications of Recommender systems:
Recommendation System Definition
Recommendation engines, a branch of information retrieval and artificial intelligence,
are powerful tools and techniques for analysing huge volumes of data, especially
product information and user information, and then providing relevant suggestions
based on data-mining approaches.
In technical terms, a recommendation engine/system problem is to develop a
mathematical model or objective function which can predict how much a user will
like an item.
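If U = {users} and I = {items}, then the objective function F measures the usefulness of an item
i in I to a user u in U, and is given by:
$F : U \times I \to R$
where R is an ordered set of usefulness scores (for example, predicted ratings) from which the
recommended items are chosen.
For each user u, we want to choose the item i that maximizes the objective function:
$i_u^{*} = \arg\max_{i \in I} F(u, i)$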
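As a minimal sketch of this idea, the snippet below picks, for one user, the item with the highest score. The user names, item names, and scores in `UTILITY` are hypothetical and only stand in for whatever the objective function F would return.

```python
# Hypothetical utility scores F(u, i); names and values are illustrative only.
UTILITY = {
    ("u1", "item_a"): 3.5,
    ("u1", "item_b"): 4.8,
    ("u1", "item_c"): 2.0,
    ("u2", "item_a"): 4.1,
    ("u2", "item_b"): 1.5,
}


def best_item(user, utility):
    """Return the item that maximizes the utility score for the given user."""
    # Keep only the scores that belong to this user.
    candidates = {item: score for (u, item), score in utility.items() if u == user}
    # argmax over the candidate items.
    return max(candidates, key=candidates.get)


best_for_u1 = best_item("u1", UTILITY)
best_for_u2 = best_item("u2", UTILITY)
```

In a real system the scores would be predicted by a model rather than stored in a table.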
The main goal of recommender systems is to provide relevant suggestions to online users
to make better decisions from many alternatives available over the Web.
A better recommender system is directed more toward personalized recommendations
by taking into consideration the available digital footprint of the user, such as user-
demographic information, transaction details, interaction logs, and information about a
product, such as specifications, feedback from users, comparison with other products,
and so on, before making recommendations.
Need for recommender systems
Given the complexity and challenges in building recommendation engines, a considerable
amount of thought, skill, investment, and technology goes into building recommender
systems. Are they worth such an investment? Let us look at some facts:
• Two-thirds of the movies watched by Netflix customers are recommended movies.
• 38% of click-throughs on Google News come from recommended links.
• 35% of sales at Amazon arise from recommended products.
• ChoiceStream (an advertising company) claims that 28% of people would like to buy more
music if they found what they like.
Types of Recommender Systems
The following are the different types of recommender systems/engines:
Neighbourhood-based recommendation engines:
• User-based collaborative filtering.
• Item-based collaborative filtering
Personalized recommendation engines:
• Content-based recommendation engines.
• Context-aware recommendation engines.
Model-based recommendation engines:
• ML-based recommendation engines.
• Classification – SVM/KNN
• Matrix Factorization
• Singular value decomposition
• Alternating Least Squares
• Hybrid recommendation engines
Neighbourhood-based recommendation engines:
Collaborative Filtering
Neighbourhood-based recommender systems consider the preferences or likes of the
user community, that is, the users in the neighbourhood of an active user, before making
suggestions or recommendations to the active user.
The idea for neighbourhood-based recommenders is very simple:
given the ratings of a user, find all the users who had similar preferences to the active user
in the past, and then make predictions for all the unknown products that
the active user has not rated but that have been rated in their neighbourhood.
These methods are based on the following assumptions:
o People with similar preferences in the past have similar preferences in the future.
o People's preferences will remain stable and consistent in the future.
Neighbourhood-based recommendation engines: Types
• User-based collaborative filtering.
• Item-based collaborative filtering.
Collaborative filtering is the process of filtering for information or patterns using
techniques involving collaboration among multiple agents, viewpoints, data
sources, etc.
Building basic recommendation engine:
The steps to build our basic recommendation engine are as follows:
1. Loading and formatting data.
2. Calculating similarity between users.
3. Predicting the unknown ratings for users.
4. Recommending items to users based on user-similarity score.
User-based collaborative filtering
User-based collaborative filtering first finds the similarity between the active user (the
user needing the recommendation) and the other users.
It identifies the similar users based on Euclidean distance or a correlation coefficient.
It then recommends the products that the active user has not rated/purchased but that
similar (nearest) users have rated/purchased.
Example: Movie Recommendation
The rating table for movie recommendation is as follows (a dash means the user has not rated the movie):
               Sanjay   Ajit   Sunil   Amit
Airlift           4       -      2       -
BhulBhulaiya      3       3      3       4
Hera Pheri        5       5      5       4
Welcome           3       4      -       4
Parmanu           4       -      -       4
Now, assuming Sunil is the active user, we calculate the distance between Sunil and every other user
using the Euclidean distance formula. A movie for which no rating has been given
is assumed to have a rating of 0.
D(Sunil, Amit) = SQRT((2-0)^2 + (3-4)^2 + (5-4)^2 + (0-4)^2 + (0-4)^2)
= SQRT(4+1+1+16+16)
= SQRT(38) = 6.1644
D(Sunil, Ajit) = SQRT((2-0)^2 + (3-3)^2 + (5-5)^2 + (0-4)^2 + (0-0)^2)
= SQRT(4+0+0+16+0)
= SQRT(20) = 4.4721
D(Sunil, Sanjay) = SQRT((2-4)^2 + (3-3)^2 + (5-5)^2 + (0-3)^2 + (0-4)^2)
= SQRT(4+0+0+9+16)
= SQRT(29) = 5.3852
Ajit has the smallest distance, hence the nearest neighbour of Sunil is Ajit, and
the movie that Ajit has seen but Sunil has
not seen is “Welcome”. Hence, the
recommended movie for Sunil is “Welcome”.
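The steps of the example (load the ratings, compute the distance to every other user, take the nearest user, recommend what that user has rated and the active user has not) can be sketched as follows. This is an illustrative sketch, not a production implementation: the `RATINGS` values are one reading of the rating table in the example, with 0 standing for "not rated", and the function names are chosen here for illustration.

```python
import math

MOVIES = ["Airlift", "BhulBhulaiya", "Hera Pheri", "Welcome", "Parmanu"]

# One rating per movie, in the order of MOVIES; 0 stands for "not rated".
# These values are an assumed reading of the rating table in the example.
RATINGS = {
    "Sanjay": [4, 3, 5, 3, 4],
    "Ajit": [0, 3, 5, 4, 0],
    "Sunil": [2, 3, 5, 0, 0],
    "Amit": [0, 4, 4, 4, 4],
}


def euclidean(a, b):
    """Euclidean distance between two equally long rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(active, ratings):
    """Return the user whose rating vector is closest to the active user's."""
    distances = {
        user: euclidean(ratings[active], vector)
        for user, vector in ratings.items()
        if user != active
    }
    return min(distances, key=distances.get)


def recommend(active, ratings, movies):
    """Recommend movies the nearest neighbour has rated but the active user has not."""
    neighbour = nearest_neighbour(active, ratings)
    return [
        movie
        for movie, mine, theirs in zip(movies, ratings[active], ratings[neighbour])
        if mine == 0 and theirs > 0
    ]


neighbour_of_sunil = nearest_neighbour("Sunil", RATINGS)
recommended_for_sunil = recommend("Sunil", RATINGS, MOVIES)
```

Treating a missing rating as 0 keeps the arithmetic simple, but it also makes an unrated movie look like a strongly disliked one; comparing users only on the movies they have both rated is a common alternative.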
User-based collaborative filtering has a few downsides:
• The system performs poorly if the user ratings are very sparse,
which is very common in the real world, where users rate only a few
items from a large catalog.
• The computing cost of calculating the similarity values for all the users is
very high if the data is very large.
• If user profiles or user inputs change quickly, we have to re-compute
the similarity values, which comes with a high computational cost.
Item-based collaborative filtering
In item-based collaborative filtering recommender systems, unlike user-based collaborative
filtering, we use the similarity between items instead of the similarity between users.
The basic intuition for item-based recommender systems is that if a user liked item A in the
past, they might like item B, which is similar to item A.
How to measure similarity between two items?
One important metric used for finding similarity is cosine similarity.
What Is Cosine Similarity?
Cosine similarity is a metric used to measure how similar two items or documents
are, irrespective of their size.
It measures the cosine of the angle between two vectors projected in a
multi-dimensional space.
This allows us to measure the similarity of documents of any type. Because the vectors are
multi-dimensional, any number of variables (which are treated as dimensions) can be
used, which in turn supports large documents.
Mathematically, the cosine of the angle between two vectors is the
dot product of the two vectors divided by the product of the two vectors' magnitudes.
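Written out for two n-dimensional vectors, the definition above reads as follows. The symbols A, B, and the angle θ are chosen here for illustration and do not appear in the original slides.

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{k=1}^{n} A_k B_k}
         {\sqrt{\sum_{k=1}^{n} A_k^{2}} \; \sqrt{\sum_{k=1}^{n} B_k^{2}}}
```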
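Since we are finding the cosine of the angle between two vectors, the output always ranges from -1 to 1,
where -1 shows that the two items point in opposite directions (completely dissimilar), 0 shows that
they are unrelated, and 1 shows that they are completely similar. When all the values are
non-negative, as with ratings or word counts, the output ranges from 0 to 1. We will now see how
we can use the cosine similarity measure to determine how similar the movies are.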
Cosine Similarity
Assume two documents, Doc1 and Doc2:
Doc1 contains the word “mouse” 5 times and the word “cat” 12 times.
Doc2 contains the word “mouse” 12 times and the word “cat” 14 times.
How do we measure the similarity between Doc1 and Doc2?
https://nickgrattandatascience.wordpress.com/2017/12/31/euclidean-manhattan-and-cosine-distance-measures-in-c/
Finally, the cosine distance is the angle subtended
at the origin between the two documents. A value
of 0 degrees represents identical documents and 90
degrees dissimilar documents. Note that this
distance is based on the relative frequency of words
in a document. A document with, say, twice as
many occurrences of all words compared to
another document will be regarded as identical.
For Doc1 = (5, 12) and Doc2 = (12, 14), with the counts of “mouse” and “cat” as the two dimensions:
cos(Doc1, Doc2) = (5*12 + 12*14) / (sqrt(5^2 + 12^2) * sqrt(12^2 + 14^2))
= 228 / (13 * 18.439) = 0.951
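The document example can be checked with a short sketch. It assumes the word counts stated for Doc1 and Doc2 (“mouse” and “cat” as the two dimensions) and non-zero vectors; the function and variable names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Word counts in the order ("mouse", "cat"), as stated in the example.
doc1 = [5, 12]
doc2 = [12, 14]

similarity = cosine_similarity(doc1, doc2)
angle_degrees = math.degrees(math.acos(similarity))

# A document with twice as many occurrences of every word points in the
# same direction, so its similarity to the original is 1.
doubled_similarity = cosine_similarity(doc1, [2 * count for count in doc1])
```

The last line illustrates the point about relative frequency: scaling every count by the same factor does not change the angle.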
Example
Suppose we have movie ratings given by different users in a table format as shown below:
Step 1: Create the item-user table.
Here users are considered as attributes/parameters that represent the movies and can be used to find the similarity between movies.
Step 2: To calculate the similarity between the movies Pulp Fiction (P) and Forrest Gump (F), we
first find all the users who have rated both movies. In our case, Calvin (C), Robert (R) and
Bradley (B) have rated both movies. We now create two vectors:
v1 = 5 C + 3 R + 1 B
v2 = 2 C + 3 R + 3 B
Therefore, the cosine similarity between the movies Pulp Fiction and Forrest Gump is:
cos(v1,v2) = (5*2 + 3*3 + 1*3) / sqrt[(25+9+1) * (4+9+9)] = 0.792
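A minimal sketch of this step, restricted to the users who rated both movies, is given below. The ratings are the ones used in v1 and v2; the dictionary layout and the function name `item_cosine` are assumptions made for illustration.

```python
import math

# Ratings per movie, keyed by user, taken from the vectors v1 and v2.
pulp_fiction = {"Calvin": 5, "Robert": 3, "Bradley": 1}
forrest_gump = {"Calvin": 2, "Robert": 3, "Bradley": 3}


def item_cosine(item_a, item_b):
    """Cosine similarity of two items over the users who rated both."""
    common_users = sorted(set(item_a) & set(item_b))
    if not common_users:
        return 0.0  # no shared raters, so no evidence of similarity
    dot = sum(item_a[u] * item_b[u] for u in common_users)
    norm_a = math.sqrt(sum(item_a[u] ** 2 for u in common_users))
    norm_b = math.sqrt(sum(item_b[u] ** 2 for u in common_users))
    return dot / (norm_a * norm_b)


similarity_p_f = item_cosine(pulp_fiction, forrest_gump)
```

Running the same function over every pair of movies would fill a full item-item similarity matrix.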
Similarly, we can calculate the cosine similarity of all the movies and our final similarity matrix will be:
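Step 3: Now we can predict and fill in the ratings for a user for the items they have not rated
yet. To calculate the rating of user Amy for the movie Forrest Gump, we use the
calculated similarity matrix along with the movies the user has already rated. Each known rating
is weighted by the similarity between that movie and Forrest Gump, and the weighted sum is divided
by the sum of the similarities. Therefore, the rating would be: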
(4*0.792 + 5*0.8) / (0.792+ 0.8) = 4.5
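The weighted average above can be sketched as a small function. The ratings 4 and 5 and the similarities 0.792 and 0.8 come from the calculation; pairing 0.792 with Pulp Fiction follows from Step 2, while `"other_movie"` is a hypothetical placeholder because the second movie's title is not given here.

```python
def predict_rating(user_ratings, similarities):
    """Similarity-weighted average of the ratings the user has already given."""
    weighted_sum = sum(similarities[movie] * rating for movie, rating in user_ratings.items())
    total_similarity = sum(similarities[movie] for movie in user_ratings)
    return weighted_sum / total_similarity


# Amy's known ratings; "other_movie" is a placeholder name (assumption).
amy_ratings = {"Pulp Fiction": 4, "other_movie": 5}

# Similarity of each rated movie to Forrest Gump, as used in the calculation.
similarity_to_forrest_gump = {"Pulp Fiction": 0.792, "other_movie": 0.8}

predicted = predict_rating(amy_ratings, similarity_to_forrest_gump)
```

Because the weights are similarities, a movie that closely resembles the target has more influence on the predicted rating than one that does not.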
Hence, our final matrix would be:
Personalized recommendation engines:
Content-based recommender systems
In the previous section, we saw that the recommendations were generated by
considering only the rating or interaction information of the products by the users, that is
to say that suggesting new items for the active user is based on the ratings given to
those new items by similar users to the active user.
Let's take the case of a person who has given a 4-star rating to a movie. In a collaborative
filtering approach we only consider this rating information for generating recommendations.
In real life, a person rates a movie based on the features or content of the movie such as its genre,
actor, director, story, and screenplay. Also the person watches a movie based on their personal
choices.
When we are building a recommendation engine to target users at a personal level, the
recommendations should not be based on the tastes of other similar people but should be based on
the individual users' tastes and the contents of the products.
A recommender system that is targeted at a personalized level and that considers individual
preferences and the contents of the products for generating recommendations is called a
content-based recommender system.
Another motivation for building content-based recommendation engines is that they
help with the cold-start problem that new users face in the collaborative filtering approach.
When a new user arrives, we can suggest new items that match their tastes, based on the
preferences of that person.
Building content-based recommender systems involves three main steps, as follows:
1. Generating content information for products.
2. Generating a user profile and preferences with respect to the features of the products.
3. Generating recommendations and predicting a list of items that the user might like.
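The three steps above can be sketched in a few lines. Everything in this sketch is an assumption made for illustration: the movie names, the genre features, the rating-weighted profile, and the use of cosine similarity for ranking are one simple way to realise the steps, not the only one.

```python
import math

# Step 1: content information. Hypothetical genre flags: [action, comedy, drama].
ITEM_FEATURES = {
    "movie_a": [1, 0, 0],
    "movie_b": [1, 1, 0],
    "movie_c": [0, 0, 1],
    "movie_d": [0, 1, 0],
}


def build_profile(ratings, features):
    """Step 2: user profile as the rating-weighted sum of item feature vectors."""
    size = len(next(iter(features.values())))
    profile = [0.0] * size
    for item, rating in ratings.items():
        for k, value in enumerate(features[item]):
            profile[k] += rating * value
    return profile


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(ratings, features, top_n=2):
    """Step 3: rank the unrated items by similarity to the user profile."""
    profile = build_profile(ratings, features)
    scores = {
        item: cosine(profile, vector)
        for item, vector in features.items()
        if item not in ratings
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# A hypothetical user who liked an action movie and disliked a drama.
user_ratings = {"movie_a": 5, "movie_c": 1}
recommendations = recommend(user_ratings, ITEM_FEATURES)
```

Here the user's profile leans towards action, so the action-comedy title ranks above the pure comedy.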
Limits of Recommendation systems
For all their efficiencies, recommendation systems are not foolproof. Recommenders have
been known to suffer from the following limitations:
• Recommenders depend totally on data, and the firms that run them must constantly supply them
with large volumes of data. That is why smaller firms are at a disadvantage compared with bigger
firms such as Google and Amazon.
• Recommenders may find it difficult to identify user choice patterns exactly if user preferences
tend to vary quickly, as in fashion. Recommenders depend a lot on historical data, but that may not be
suitable for certain product niches.
• Recommenders face problems with unpredictable items. For example, there are certain movie types
that evoke extreme reactions such as love or hate. It is extremely difficult to provide
recommendations for such items.