DS605: Fundamentals of Machine Learning
Lecture 10
Evaluation - II
[Evaluation Metrics]
Arpit Rana
12th August 2024
Experimental Evaluation of Learning Algorithms
Given a representation, data, and a bias, the learning algorithm returns a Final Hypothesis (h).

[Diagram: Hypothesis Space 𝓗 → Learner (𝚪: S → h) → Final Hypothesis or Model (h)]

How to Check the Performance of Learning Algorithms?
Evaluation Metrics
Common Measures
Experimental Evaluation of Learning Algorithms
Typical Experimental Evaluation Metrics
● Error
● Accuracy
● Precision / Recall
Measures for Regression Problems
● Mean Absolute Error (MAE)
● Squared Error (MSE)

Which one is better and why? Consider:
● Non-differentiability (MAE is not differentiable where the residual is zero)
● Robustness (sensitivity to outliers)
● Unit changes in MSE
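For reference, the standard definitions in LaTeX (notation assumed here: $y_i$ is the true value, $\hat{y}_i$ the prediction, $n$ the number of examples):

    \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
    \qquad
    \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2

Because MSE squares each residual, a single large outlier can dominate it, while MAE grows only linearly with the size of the error; squaring also changes the units of the error (e.g. metres become square metres), whereas MAE keeps the units of the target.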
Measures for Classification Problems
● Misclassification Rate (a.k.a. Error Rate): the fraction of examples whose predicted class differs from their true class.
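Written out (a standard formulation, assuming $n$ labelled examples with true labels $y_i$, predicted labels $\hat{y}_i$, and the confusion-matrix counts defined on the next slide):

    \text{Error Rate} = \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}\!\left[ \hat{y}_i \neq y_i \right] = \frac{FP + FN}{P + N}

Accuracy is its complement: $1 - \text{Error Rate} = \frac{TP + TN}{P + N}$.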
Measures for Classification Problems

Confusion Matrix

                                 True Class (Actual)
                                 Positive               Negative               Total
Hypothesized Class   Positive    True Positive (TP)     False Positive (FP)    P'
(Predicted)          Negative    False Negative (FN)    True Negative (TN)     N'
                     Total       P                      N                      P + N
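The common classification metrics can be read directly off this table; the standard definitions (not reproduced as text on the slide) are:

    \text{Accuracy} = \frac{TP + TN}{P + N}
    \qquad
    \text{Precision} = \frac{TP}{TP + FP} = \frac{TP}{P'}
    \qquad
    \text{Recall} = \frac{TP}{TP + FN} = \frac{TP}{P}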
Measures for Classification Problems

● F measure: the weighted harmonic mean of precision and recall, with weights ⍺ ∈ [0, 1] and 𝛽 ∈ [0, ∞).
● For ⍺ = ½ (equivalently, 𝛽 = 1), the F measure weights precision and recall equally and is known as the F1 measure.
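Written out (the standard formulation of the weighted harmonic mean, with $\beta^2 = (1-\alpha)/\alpha$):

    F = \frac{1}{\alpha \cdot \frac{1}{\text{Precision}} + (1-\alpha) \cdot \frac{1}{\text{Recall}}}
      = \frac{(1+\beta^2)\,\text{Precision}\cdot\text{Recall}}{\beta^2\,\text{Precision} + \text{Recall}}

    F_1 = \frac{2\,\text{Precision}\cdot\text{Recall}}{\text{Precision} + \text{Recall}}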
Measures for Classification Problems
What metric would you use to measure the performance of the following classifiers?
● A classifier to detect videos that are safe for kids.
● A classifier to detect shoplifters in surveillance images.
Precision/Recall Trade-off
● Images are ranked by their classifier score (here, the classifier decides whether an image is a 5 or not).
● Those above the chosen decision threshold are considered positive.
● The higher the threshold, the lower the recall, but (in general) the higher the precision.
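A minimal sketch of this trade-off with scikit-learn (the library, the toy data, and the 90%-precision target are assumptions for illustration, not from the lecture): sweep the decision threshold over the classifier's scores and observe that high precision comes at the cost of recall.

    # Sketch: precision and recall as a function of the decision threshold.
    import numpy as np
    from sklearn.metrics import precision_recall_curve

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=1000)                    # toy ground-truth labels (0/1)
    y_score = y_true * 1.0 + rng.normal(0, 1.2, size=1000)    # noisy scores correlated with the labels

    precisions, recalls, thresholds = precision_recall_curve(y_true, y_score)

    # Pick the first threshold that reaches at least 90% precision,
    # then check how much recall is left at that operating point.
    idx = np.argmax(precisions[:-1] >= 0.90)   # precisions has one extra trailing element
    print(f"threshold={thresholds[idx]:.2f}  "
          f"precision={precisions[idx]:.2f}  recall={recalls[idx]:.2f}")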
Precision/Recall Trade-off
How do you decide which threshold to use?
● A high-precision classifier is not very useful if its recall is too low!
● If someone says “let’s reach 99% precision,” you should ask, “at what recall?”
To take recall into consideration, we use other measures.
The ROC Curve
● The receiver operating characteristic (ROC) curve is another common tool used with
binary classifiers.
● It is very similar to the precision/recall curve,
○ but instead of plotting precision versus recall,
○ the ROC curve plots the true positive rate (TPR, another name for recall or sensitivity) against the false positive rate (FPR, which equals 1 − specificity).
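In terms of the confusion-matrix cells defined earlier:

    \text{TPR} = \text{Recall} = \frac{TP}{TP + FN}
    \qquad
    \text{FPR} = \frac{FP}{FP + TN} = 1 - \text{Specificity},
    \quad \text{where } \text{Specificity} = \frac{TN}{TN + FP}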
The ROC Curve
● Once again there is a trade-off: the higher the recall (TPR), the more false positives (FPR) the classifier produces.
● The dotted line (the diagonal) represents the ROC curve of a purely random classifier.
● A good classifier stays as far away from that line as possible (toward the top-left corner).
AUC: Area Under the (ROC) Curve
● One way to compare classifiers is to measure the area under the curve (AUC).
● A perfect classifier will have a ROC AUC equal to 1, whereas a purely random classifier will have a ROC AUC equal to 0.5.
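A minimal scikit-learn sketch (the toy data and names are assumptions for illustration): compute the ROC curve points and the ROC AUC for a score-based binary classifier.

    # Sketch: ROC curve and ROC AUC for a score-based binary classifier.
    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=1000)                    # toy ground-truth labels (0/1)
    y_score = y_true * 1.0 + rng.normal(0, 1.2, size=1000)    # noisy classifier scores

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    print(f"ROC AUC = {roc_auc_score(y_true, y_score):.3f}")  # 1.0 = perfect, 0.5 = random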
Note: As a rule of thumb, you should prefer the PR (precision-recall) curve whenever the
positive class is rare or when you care more about the false positives than the false negatives,
and the ROC curve otherwise.
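To see why, here is a sketch (assumed toy data, not from the lecture) in which the positive class is rare: the ROC AUC still looks fairly strong, while the precision-recall summary (average precision) reveals a much weaker classifier.

    # Sketch: with a rare positive class, ROC AUC can look optimistic
    # while the precision-recall summary (average precision) stays low.
    import numpy as np
    from sklearn.metrics import roc_auc_score, average_precision_score

    rng = np.random.default_rng(0)
    y_true = (rng.random(100_000) < 0.01).astype(int)           # ~1% positives
    y_score = y_true * 2.0 + rng.normal(0, 1.2, size=100_000)   # same noisy-score model as before

    print(f"ROC AUC           = {roc_auc_score(y_true, y_score):.3f}")
    print(f"Average precision = {average_precision_score(y_true, y_score):.3f}")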
Next lecture: Loss Functions
13th August 2024