Module - 1 - ECE3047 - Machine Learning - 1 (8748)

ECE3047 - Machine Learning Fundamentals
Prepared By
Dr. Rohith G
Assistant Professor (Senior)
School of Electronics Engineering (SENSE), VIT-Chennai
Under the Guidance and Materials mentored by
Dr. Sathiya Narayanan S
Assistant Professor (Senior)
School of Electronics Engineering (SENSE), VIT-Chennai
Prepared by Dr. Rohith G, AP(Senior),VIT Chennai 1
• Module 1: Introduction (to Machine Learning)
• Module 2: Data Preprocessing
• Module 3: Regression
• Module 4: Classification
• Module 5: Clustering
• Module 6: Optimization
• Module 7: Reinforcement Learning
Topics in Module-1
• Common Definitions
• Applications of Machine Learning
• Types of Learning
• Performance Measures
Learn + Predict + Improve = Machine learning
Evolution of ML
• Engineering of machines that mimic cognitive functions
• Ability to perform tasks without explicit instructions, relying on patterns
• Machine learning based on artificial neural networks
What is Machine Learning?
A canonical definition by Tom Mitchell in 1997: "An agent is said to learn from experience (E) with
respect to some class of tasks (T), and the performance measure (P), if the learner's performance at T,
as measured by P, improves with E". One has to be very careful about defining the set of tasks T and
the performance measure P. With experience E, the performance P has to improve.
What is Machine Learning?
Machine learning can be defined as a subset of Artificial Intelligence (AI) that
allows a system/computer to learn from some available data. The data can either be
labelled (with a number, tag or type) or unlabelled.
Advantages and Disadvantages of Machine Learning
Advantages
1. Easy to identify the patterns
2. No human intervention
3. Wide range of applications
4. Scope for continuous improvement
5. Handling multi-variety data
Disadvantages
1. Chances of error
2. Data acquiring and preprocessing
3. Time and resource dependent
4. Human expertise for result interpretation

Applications of Machine Learning

Real world Applications of Machine Learning

Types of Machine Learning based on Learning

What is Supervised Machine Learning?
• Learning an input and output map.
• It deals with labelled data.
• If the output happens to be a categorical one, then the supervised
learning paradigm is called `classification'.
• If the output is a continuous value, then the learning paradigm is
called `regression'.
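The two supervised paradigms above can be sketched on toy data. This is only an illustration: the data values, the nearest-neighbour rule, and the helper names `nearest_neighbour_classify` and `fit_line` are assumptions made for this example, not something the slides prescribe.

```python
# Toy labelled data: every input is paired with a known output (its label).
# All numbers and names here are made up for illustration.

def nearest_neighbour_classify(train, x):
    """Classification: return the categorical label of the closest training input."""
    _, closest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return closest_label

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Categorical output -> classification (hours studied -> pass/fail).
labelled = [(1.0, "fail"), (2.0, "fail"), (6.0, "pass"), (8.0, "pass")]
predicted_label = nearest_neighbour_classify(labelled, 7.0)

# Continuous output -> regression (hours studied -> exam score).
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]
slope, intercept = fit_line(hours, scores)
predicted_score = slope * 5.0 + intercept

print(predicted_label, predicted_score)
```

In both cases the model learns an input-to-output map from labelled pairs; only the type of the output differs.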
How does Supervised Machine Learning Work?

What is Unsupervised Machine Learning?
• Discovering patterns in the data.
• It deals with unlabelled data.
• The process of finding cohesive groups in the input data is called
`clustering'.
• The process of finding the frequent co-occurrence of items in the
data is called `association rule mining'.
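As a rough sketch of clustering, the snippet below groups unlabelled 1-D points with a plain k-means loop. The slides do not name a specific algorithm here, so k-means, the function name `k_means_1d`, the starting centroids, and the data are all assumptions chosen for illustration.

```python
def k_means_1d(points, centroids, iterations=10):
    """Group 1-D points around the given centroids; no labels are used."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]  # unlabelled, made-up data
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

The two cohesive groups are discovered from the data alone, which is the sense in which the learning is unsupervised.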
How does Unsupervised Machine Learning Work?

Difference between Supervised and Unsupervised

What is Reinforcement Machine Learning?
• Learning to control the behavior of a system.
• It is neither supervised nor unsupervised.
How does Reinforcement Machine Learning Work?
When to use Reinforcement Learning?
1. A model of the environment is known, but an analytic
solution is not available.
2. Only a simulation model of the environment is given
(the subject of simulation-based optimization).
3. The only way to collect information about the
environment is to interact with it.
#Example-1
• A reinforcement learning agent is able to
perceive and interpret its environment,
take actions and learn through trial and
error.
• It keeps trying (trial and error) until it can maximize its reward,
and continues to develop further.
#Example-2
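The trial-and-error loop described above can be sketched with tabular Q-learning on a toy corridor. Q-learning is one common reinforcement learning algorithm and is not named in the slides; the environment (five states, reward 1 at the last state), the hyperparameters, and the function name `train_q_learning` are all assumptions made for this illustration.

```python
import random

def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy corridor: start at state 0, reward 1 at the last state."""
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action]; action 0 = move left, action 1 = move right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial and error: explore sometimes (and on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The environment responds with a new state and a reward.
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train_q_learning()
# Greedy action per non-terminal state after training (1 = move right).
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

The agent is never told the correct action. It only interacts with the environment and gradually prefers the actions that lead to reward, which here means moving right towards the goal.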

How to build a Machine learning Model-7 Steps
Step 1. Understand the problem statement clearly

Step 2. Understand and identify data

Step 3. Collect and prepare data

Step 4. Determine the model's features and train it

Step 5. Evaluate the model's performance and establish benchmarks

Step 6. Put the model in operation and make sure it works well
Step 7. Iterate and adjust the model

Top 10 Machine learning Algorithms
• Naïve Bayes Classifier Algorithm
• K-Means Clustering Algorithm
• Support Vector Machine Algorithm
• Apriori Algorithm
• Linear Regression Algorithm
• Logistic Regression Algorithm
• Decision Trees Algorithm
• Random Forest
• K-Nearest Neighbor
• Artificial Neural Networks Algorithm

Performance Metrics
• Mean Absolute Error (MAE)
• Mean Square Error (MSE)
• Classification accuracy
• Precision
• Recall or Sensitivity
• Specificity
• F1 score
• Area Under Curve (AUC)
• ROC
[Figure: Data Preparation, Feature Engineering, Data Modelling, Performance Measure]
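For the two regression metrics at the top of this list, a minimal sketch is given below using their standard definitions (average absolute error and average squared error). The function names and the sample values are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """MSE: average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual = [3.0, 5.0, 2.0, 7.0]      # made-up ground-truth values
predicted = [2.0, 5.0, 4.0, 8.0]   # made-up model outputs

mae = mean_absolute_error(actual, predicted)  # errors are 1, 0, 2, 1
mse = mean_squared_error(actual, predicted)   # squared errors are 1, 0, 4, 1
print(mae, mse)
```

Because MSE squares each error, the single error of 2 contributes more to MSE than to MAE, so MSE penalizes large errors more heavily.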
Performance Metrics
• Confusion Matrix is a tool to determine the performance of a classifier.
• It contains information about actual and predicted classifications.
• TP - Positive class correctly identified as positive
• FN - Positive class incorrectly identified as negative
• FP - Negative class incorrectly identified as positive
• TN - Negative class correctly identified as negative
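The four cells defined above can be counted directly from actual and predicted labels. The sketch below assumes a binary problem with "spam" as the positive class; the function name `confusion_counts` and the six sample labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count TP, FN, FP, TN for a binary problem with the given positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # positive class correctly identified as positive
        elif a == positive:
            fn += 1  # positive class incorrectly identified as negative
        elif p == positive:
            fp += 1  # negative class incorrectly identified as positive
        else:
            tn += 1  # negative class correctly identified as negative
    return tp, fn, fp, tn

actual = ["spam", "spam", "spam", "ham", "ham", "ham"]
predicted = ["spam", "ham", "spam", "spam", "ham", "ham"]
tp, fn, fp, tn = confusion_counts(actual, predicted)
print(tp, fn, fp, tn)
```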

Performance Metrics-Confusion Matrix
#Example: Confusion matrix for assessing how well a classifier detects spam mails
• Sensitivity/recall - proportion of actual positive examples
labelled as positive by the classifier.
• Sensitivity = 45/(45+20) = 69.23% of spam
emails are correctly classified as spam.
• Specificity - proportion of actual negative examples labelled
as negative by the classifier.
• Specificity = 30/(30+5) = 85.71% of non-spam
emails are correctly classified as non-spam.
• Accuracy - proportion of the total number of
predictions that are correct.
• Accuracy = (45+30)/(45+20+5+30) = 75% of
examples are correctly classified by the classifier.
• Precision - correctness achieved in positive
prediction: the ratio of correctly classified
positive examples to the total number of predicted
positive examples.
• Precision = 45/(45+5) = 90% of
examples classified as spam are
actually spam.
• F1 score is the harmonic mean of recall
(sensitivity) and precision.
• F1-score = 2*(90*69.23)/(90+69.23) = 78.26%.
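The arithmetic in this example can be checked with a short script. The counts TP = 45, FN = 20, FP = 5 and TN = 30 are read off the formulas above; the function name `classification_metrics` is an assumption for this sketch.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the slide's metrics from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }

metrics = classification_metrics(tp=45, fn=20, fp=5, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Rounded to two decimals, the results match the percentages worked out by hand: 69.23%, 85.71%, 75%, 90% and 78.26%.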
Confusion Matrix for Imbalanced dataset

Solution to improve accuracy in the model

ROC
• Receiver operating characteristic (ROC) - Graphical plot that
illustrates the diagnostic ability of a binary classifier system as its
discrimination threshold is varied.
• ROC curve summarizes the performance by combining confusion
matrices at all threshold values.
[Figure: How threshold value can change the predicted class]
• An ROC curve plots TPR vs. FPR at different classification
thresholds.
• Lowering the classification threshold classifies more items as
positive, thus increasing both False Positives and True Positives.
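The effect of the threshold can be seen by computing one ROC point, (FPR, TPR), at two different thresholds. This is a sketch on made-up scores; the function name `roc_point`, the labels, and the threshold values are assumptions for illustration.

```python
def roc_point(labels, scores, threshold):
    """Return (FPR, TPR) when every score >= threshold is predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives

labels = [0, 0, 1, 0, 1, 1]               # 1 = positive class (made-up data)
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]   # classifier scores (made-up data)

high = roc_point(labels, scores, threshold=0.65)  # strict threshold
low = roc_point(labels, scores, threshold=0.35)   # lenient threshold
print(high, low)
```

With the lower threshold one more true positive is found, but one negative is also flagged, so both TPR and FPR rise. Sweeping the threshold over all values traces out the full ROC curve.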

AUC
• AUC stands for "Area under the ROC Curve."
• AUC measures the entire two-dimensional area underneath
the entire ROC curve. It ranges from 0 to 1: a model whose
predictions are 100% wrong has an AUC of 0.0; one
whose predictions are 100% correct has an AUC of 1.0.
• AUC of a classifier is equal to the probability that the
classifier will rank a randomly chosen positive example
higher than a randomly chosen negative example.
• AUC provides an aggregate measure of performance across
all possible classification thresholds.
• AUC represents the probability that a random positive
(green) example is positioned to the right of a random
negative (red) example.
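The ranking interpretation above suggests a direct way to compute AUC on a small dataset: compare every positive score with every negative score. The sketch below does exactly that; the function name `auc_by_ranking`, the convention of counting ties as one half, and the data are assumptions for illustration.

```python
def auc_by_ranking(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half (a common convention)
    return wins / (len(pos) * len(neg))

auc = auc_by_ranking([0, 0, 1, 0, 1, 1], [0.1, 0.3, 0.4, 0.6, 0.7, 0.9])
perfect = auc_by_ranking([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
worst = auc_by_ranking([1, 1, 0, 0], [0.1, 0.2, 0.8, 0.9])
print(auc, perfect, worst)
```

In the first dataset 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. The other two calls reproduce the extreme cases of 1.0 and 0.0.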

Performance Metrics
• Lowering the classification threshold classifies more items as
positive, thus increasing both False Positives and True Positives.
• AUC measures the entire two-dimensional area
underneath the entire ROC curve (think integral
calculus) from (0,0) to (1,1).
Summary on Module-1-Machine Learning
• Machine Learning is an application of Artificial Intelligence that provides systems the
ability to automatically learn, predict and improve from experience without being
explicitly programmed.
Summary on Module-1-Applications of Machine Learning based on Learning

Summary on Module-1-Types of Learning

Summary on Performance Metrics
