Chapter#03 Supervised Learning and Its Algorithms - II
COURSE INSTRUCTORS:
DR. MUHAMMAD NASEEM
ENGR. FARHEEN QAZI
ENGR. SUNDUS ZEHRA
P(h): the probability of hypothesis h being true (regardless of the data). This is known as the prior
probability of h.
P(D): the probability of the data (regardless of the hypothesis). This is known as the prior
probability of D, also called the evidence.
P(h|D): the probability of hypothesis h given the data D. This is known as the posterior probability.
P(D|h): the probability of data D given that hypothesis h is true. This is known as the likelihood.
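These four quantities are linked by Bayes' theorem, which the Naive Bayes classifier applies directly:

P(h|D) = P(D|h) * P(h) / P(D)

The classifier picks the hypothesis h with the largest posterior P(h|D); since P(D) is the same for every h, it is enough to compare P(D|h) * P(h) across hypotheses, as the worked example below does.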
PLAY-TENNIS EXAMPLE 1
Training data (14 examples):

Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N

Class priors:
P(p) = 9/14    P(n) = 5/14

Conditional probabilities estimated from the data:

outlook:      P(sunny|p)    = 2/9    P(sunny|n)    = 3/5
              P(overcast|p) = 4/9    P(overcast|n) = 0
              P(rain|p)     = 3/9    P(rain|n)     = 2/5
temperature:  P(hot|p)      = 2/9    P(hot|n)      = 2/5
              P(mild|p)     = 4/9    P(mild|n)     = 2/5
              P(cool|p)     = 3/9    P(cool|n)     = 1/5
humidity:     P(high|p)     = 3/9    P(high|n)     = 4/5
              P(normal|p)   = 6/9    P(normal|n)   = 2/5
windy:        P(true|p)     = 3/9    P(true|n)     = 3/5
              P(false|p)    = 6/9    P(false|n)    = 2/5
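As a sanity check, here is a minimal Python sketch (not from the slides; the row tuples transcribe the table above, and the helper names are my own) that recomputes these priors and conditional probabilities by counting:

```python
from collections import Counter
from fractions import Fraction

# The 14 Play-Tennis training rows: (outlook, temperature, humidity, windy, class)
data = [
    ("sunny","hot","high","false","N"), ("sunny","hot","high","true","N"),
    ("overcast","hot","high","false","P"), ("rain","mild","high","false","P"),
    ("rain","cool","normal","false","P"), ("rain","cool","normal","true","N"),
    ("overcast","cool","normal","true","P"), ("sunny","mild","high","false","N"),
    ("sunny","cool","normal","false","P"), ("rain","mild","normal","false","P"),
    ("sunny","mild","normal","true","P"), ("overcast","mild","high","true","P"),
    ("overcast","hot","normal","false","P"), ("rain","mild","high","true","N"),
]

attrs = ["outlook", "temperature", "humidity", "windy"]
class_counts = Counter(row[-1] for row in data)              # {'P': 9, 'N': 5}
prior = {c: Fraction(n, len(data)) for c, n in class_counts.items()}

# cond[attr][(value, class)] = P(value | class), estimated by counting;
# pairs that never occur (e.g. ("overcast", "N")) are simply absent, i.e. 0
cond = {a: {} for a in attrs}
for i, a in enumerate(attrs):
    counts = Counter((row[i], row[-1]) for row in data)
    for (v, c), n in counts.items():
        cond[a][(v, c)] = Fraction(n, class_counts[c])

print(prior["P"], prior["N"])             # 9/14 5/14
print(cond["outlook"][("sunny", "P")])    # 2/9
```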
CONTD…
To classify the unseen sample X = (rain, hot, high, false), compute for each class the product of the conditional probabilities and the class prior:

P(X|p)*P(p) = P(rain|p)*P(hot|p)*P(high|p)*P(false|p)*P(p)
            = 3/9 * 2/9 * 3/9 * 6/9 * 9/14 = 0.010582
P(X|n)*P(n) = P(rain|n)*P(hot|n)*P(high|n)*P(false|n)*P(n)
            = 2/5 * 2/5 * 4/5 * 2/5 * 5/14 = 0.018286

Since P(X|n)*P(n) > P(X|p)*P(p), the sample is assigned to class N:
Outlook  Temperature  Humidity  Windy  Class
rain     hot          high      false  N (do not play)
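The same arithmetic can be checked with exact fractions in Python:

```python
from fractions import Fraction as F

score_p = F(3,9) * F(2,9) * F(3,9) * F(6,9) * F(9,14)   # P(X|p)*P(p)
score_n = F(2,5) * F(2,5) * F(4,5) * F(2,5) * F(5,14)   # P(X|n)*P(n)
print(float(score_p), float(score_n))       # 0.010582... 0.018285...
print("N" if score_n > score_p else "P")    # N -> do not play
```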
EXAMPLE 2
[Figure: training data is fed into a classification algorithm to produce a classifier]
ADVANTAGES
Naive Bayes is not only a simple approach but also a fast and accurate method for prediction.
It has a very low computation cost.
It can work efficiently on large datasets.
It performs well with discrete (categorical) variables compared to continuous ones.
It can be used for multi-class prediction problems, and it also performs well on text
analytics problems, as illustrated below.
When the independence assumption holds, a Naive Bayes classifier performs better than
other models such as logistic regression.
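To illustrate the text-analytics point, a minimal scikit-learn sketch; the toy corpus, labels, and test sentence are invented for demonstration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny invented corpus labeled "sports" or "weather"
docs = ["great tennis match today", "rain and strong wind expected",
        "the team won the final", "sunny skies and mild temperature"]
labels = ["sports", "weather", "sports", "weather"]

# Bag-of-words counts feed a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(docs, labels)
print(model.predict(["windy with rain tomorrow"]))  # -> ['weather']
```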
DISADVANTAGES
If a categorical value never occurs together with a particular class in the training data,
its estimated conditional probability is zero, which drives the entire posterior product to
zero. In this case, the model cannot make a sensible prediction. This is known as the Zero
Probability (Zero Frequency) Problem; it is commonly handled with Laplace (add-one) smoothing.
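A minimal sketch of Laplace smoothing, assuming the standard textbook formula (the function name and example numbers are my own):

```python
from fractions import Fraction

def laplace(count_value_and_class, count_class, n_values, alpha=1):
    """Smoothed estimate of P(value | class):
    (count + alpha) / (class count + alpha * number of distinct values)."""
    return Fraction(count_value_and_class + alpha,
                    count_class + alpha * n_values)

# P(overcast|n) is 0/5 by raw counting; with add-one smoothing over
# the 3 outlook values it becomes (0+1)/(5+3) = 1/8, no longer zero.
print(laplace(0, 5, 3))  # 1/8
```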
SUMMARY