Chapter#03 Supervised Learning and Its Algorithms - II
COURSE INSTRUCTORS:
DR. MUHAMMAD NASEEM
ENGR. FARHEEN QAZI
ENGR. SUNDUS ZEHRA
P(h): the probability of hypothesis h being true (regardless of the data). This is known as the prior
probability of h.
P(D): the probability of the data (regardless of the hypothesis). This is known as the prior
probability of D, also called the evidence.
P(h|D): the probability of hypothesis h given the data D. This is known as the posterior probability.
P(D|h): the probability of data D given that hypothesis h is true. This is known as the likelihood.
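These four quantities are linked by Bayes' theorem, which the Naive Bayes classifier applies directly:

P(h|D) = P(D|h) * P(h) / P(D)

The classifier picks the hypothesis h with the largest posterior P(h|D); since P(D) is the same for every h, it is enough to compare P(D|h) * P(h) across hypotheses, as the worked example below does.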
PLAY-TENNIS EXAMPLE 1
Training data (14 examples):

Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N

Class priors:
P(p) = 9/14    P(n) = 5/14

Conditional probabilities estimated from the data:

outlook:      P(sunny|p)    = 2/9    P(sunny|n)    = 3/5
              P(overcast|p) = 4/9    P(overcast|n) = 0
              P(rain|p)     = 3/9    P(rain|n)     = 2/5
temperature:  P(hot|p)      = 2/9    P(hot|n)      = 2/5
              P(mild|p)     = 4/9    P(mild|n)     = 2/5
              P(cool|p)     = 3/9    P(cool|n)     = 1/5
humidity:     P(high|p)     = 3/9    P(high|n)     = 4/5
              P(normal|p)   = 6/9    P(normal|n)   = 2/5
windy:        P(true|p)     = 3/9    P(true|n)     = 3/5
              P(false|p)    = 6/9    P(false|n)    = 2/5
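As a sanity check, here is a minimal Python sketch (not from the slides; the row tuples transcribe the table above, and the helper names are my own) that recomputes these priors and conditional probabilities by counting:

```python
from collections import Counter
from fractions import Fraction

# The 14 Play-Tennis training rows: (outlook, temperature, humidity, windy, class)
data = [
    ("sunny","hot","high","false","N"), ("sunny","hot","high","true","N"),
    ("overcast","hot","high","false","P"), ("rain","mild","high","false","P"),
    ("rain","cool","normal","false","P"), ("rain","cool","normal","true","N"),
    ("overcast","cool","normal","true","P"), ("sunny","mild","high","false","N"),
    ("sunny","cool","normal","false","P"), ("rain","mild","normal","false","P"),
    ("sunny","mild","normal","true","P"), ("overcast","mild","high","true","P"),
    ("overcast","hot","normal","false","P"), ("rain","mild","high","true","N"),
]

attrs = ["outlook", "temperature", "humidity", "windy"]
class_counts = Counter(row[-1] for row in data)              # {'P': 9, 'N': 5}
prior = {c: Fraction(n, len(data)) for c, n in class_counts.items()}

# cond[attr][(value, class)] = P(value | class), estimated by counting;
# pairs that never occur (e.g. ("overcast", "N")) are simply absent, i.e. 0
cond = {a: {} for a in attrs}
for i, a in enumerate(attrs):
    counts = Counter((row[i], row[-1]) for row in data)
    for (v, c), n in counts.items():
        cond[a][(v, c)] = Fraction(n, class_counts[c])

print(prior["P"], prior["N"])             # 9/14 5/14
print(cond["outlook"][("sunny", "P")])    # 2/9
```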
CONTD…
To classify the unseen sample X = (rain, hot, high, false), compute for each class the product of the conditional probabilities and the class prior:

P(X|p)*P(p) = P(rain|p)*P(hot|p)*P(high|p)*P(false|p)*P(p)
            = 3/9 * 2/9 * 3/9 * 6/9 * 9/14 = 0.010582
P(X|n)*P(n) = P(rain|n)*P(hot|n)*P(high|n)*P(false|n)*P(n)
            = 2/5 * 2/5 * 4/5 * 2/5 * 5/14 = 0.018286

Since P(X|n)*P(n) > P(X|p)*P(p), the sample is assigned to class N:
Outlook  Temperature  Humidity  Windy  Class
rain     hot          high      false  N (do not play)
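The same arithmetic can be checked with exact fractions in Python:

```python
from fractions import Fraction as F

score_p = F(3,9) * F(2,9) * F(3,9) * F(6,9) * F(9,14)   # P(X|p)*P(p)
score_n = F(2,5) * F(2,5) * F(4,5) * F(2,5) * F(5,14)   # P(X|n)*P(n)
print(float(score_p), float(score_n))       # 0.010582... 0.018285...
print("N" if score_n > score_p else "P")    # N -> do not play
```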
EXAMPLE 2
[Figure: training data is fed into a classification algorithm to produce a classifier]
ADVANTAGES
Naive Bayes is not only a simple approach but also a fast and accurate method for prediction.
It has a very low computation cost.
It can work efficiently on large datasets.
It performs well with discrete (categorical) variables compared to continuous ones.
It can be used for multi-class prediction problems, and it also performs well on text
analytics problems, as illustrated below.
When the independence assumption holds, a Naive Bayes classifier performs better than
other models such as logistic regression.
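To illustrate the text-analytics point, a minimal scikit-learn sketch; the toy corpus, labels, and test sentence are invented for demonstration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny invented corpus labeled "sports" or "weather"
docs = ["great tennis match today", "rain and strong wind expected",
        "the team won the final", "sunny skies and mild temperature"]
labels = ["sports", "weather", "sports", "weather"]

# Bag-of-words counts feed a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(docs, labels)
print(model.predict(["windy with rain tomorrow"]))  # -> ['weather']
```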
DISADVANTAGES
If a categorical value never occurs together with a particular class in the training data,
its estimated conditional probability is zero, which drives the entire posterior product to
zero. In this case, the model cannot make a sensible prediction. This is known as the Zero
Probability (Zero Frequency) Problem; it is commonly handled with Laplace (add-one) smoothing.
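A minimal sketch of Laplace smoothing, assuming the standard textbook formula (the function name and example numbers are my own):

```python
from fractions import Fraction

def laplace(count_value_and_class, count_class, n_values, alpha=1):
    """Smoothed estimate of P(value | class):
    (count + alpha) / (class count + alpha * number of distinct values)."""
    return Fraction(count_value_and_class + alpha,
                    count_class + alpha * n_values)

# P(overcast|n) is 0/5 by raw counting; with add-one smoothing over
# the 3 outlook values it becomes (0+1)/(5+3) = 1/8, no longer zero.
print(laplace(0, 5, 3))  # 1/8
```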
SUMMARY