TRAINING METHODS
Calvin Abonga
Training/Test phases
Training/Test data
• How do we know that we have collected an
adequately large and representative set of
examples for training/testing the system?
[Diagram: the collected examples partitioned into a training set and a test set]
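A minimal sketch (not from the original slides) of holding out part of the collected examples for testing, assuming scikit-learn; the feature matrix X and labels y below are synthetic placeholders.

```python
# Hedged sketch: hold out a test set with scikit-learn's train_test_split.
# X and y are synthetic stand-ins for real fish measurements and labels.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))       # 1000 hypothetical feature vectors
y = rng.integers(0, 2, size=1000)    # binary labels: 0 = salmon, 1 = sea bass

# Hold out 30% for testing; stratify so both classes keep the same
# proportions in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
print(X_train.shape, X_test.shape)   # (700, 2) (300, 2)
```

Whether the split is adequately large and representative still has to be checked, e.g., with cross-validation or learning curves.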
Pre-processing Step
Example
(1) Image enhancement
(2) Separate touching or occluding fish
(3) Find the boundary of each fish
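One way these three steps might look in code, assuming OpenCV; the file name "fish.png" is a hypothetical placeholder.

```python
# Hedged sketch of the three pre-processing steps with OpenCV;
# 'fish.png' is a placeholder file name, not from the slides.
import cv2

img = cv2.imread("fish.png", cv2.IMREAD_GRAYSCALE)

# (1) Image enhancement: smooth out sensor noise.
enhanced = cv2.GaussianBlur(img, (5, 5), 0)

# (2) Separate fish from background with a global (Otsu) threshold;
# truly touching or occluding fish would need a stronger method.
_, mask = cv2.threshold(enhanced, 0, 255,
                        cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# (3) Find the boundary of each fish as a contour.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
print(f"found {len(contours)} candidate fish boundaries")
```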
Sensors & Preprocessing
• Sensing:
• Use a sensor (camera or microphone) for data capture.
• PR performance depends on the bandwidth, resolution, sensitivity, and distortion of the sensor.
• Pre-processing:
• Removal of noise in data.
• Segmentation (i.e., isolation of patterns of interest from
background).
Feature Extraction
• Assume a fisherman told us that a sea bass is
generally longer than a salmon.
• We can use length as a feature and decide
between sea bass and salmon according to a
threshold on length.
• How should we choose the threshold?
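As a toy illustration, the length rule in code; the threshold value 18.0 is a hypothetical placeholder, and choosing it well is exactly the question above.

```python
# Minimal threshold rule on a single feature (length);
# l_star = 18.0 is a hypothetical value, not from the slides.
def classify_by_length(length, l_star=18.0):
    """Return 'sea bass' if the fish is longer than the threshold."""
    return "sea bass" if length > l_star else "salmon"

print(classify_by_length(22.0))  # sea bass
print(classify_by_length(15.0))  # salmon
```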
“Length” Histograms
[Histograms of fish length for the two classes, split at a decision threshold l*]
• Even though sea bass are longer than salmon on average, there are many fish for which this observation does not hold.
“Average Lightness” Histograms
• Consider a different feature such as “average
lightness”
[Histograms of average lightness for the two classes, split at a decision threshold x*]
• It seems easier to choose the threshold x*, but we still cannot make a perfect decision.
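One simple way to pick such a threshold is a brute-force search minimizing the empirical error on training data; a hedged sketch on synthetic Gaussian samples (the class means below are invented).

```python
# Hedged sketch: choose x* by minimizing empirical error on training
# data. The two Gaussian samples are synthetic, not real measurements.
import numpy as np

rng = np.random.default_rng(1)
salmon = rng.normal(3.0, 1.0, 500)     # hypothetical "average lightness"
sea_bass = rng.normal(6.0, 1.0, 500)

candidates = np.linspace(0.0, 10.0, 1001)
# Error at threshold t: salmon classified as sea bass (> t) plus
# sea bass classified as salmon (<= t), with equal class sizes.
errors = [np.mean(salmon > t) + np.mean(sea_bass <= t) for t in candidates]
x_star = candidates[int(np.argmin(errors))]
print(f"x* = {x_star:.2f}")  # close to the midpoint of the two means
```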
Multiple Features
• To improve recognition accuracy, we might have to
use more than one feature.
• Single features might not yield the best performance.
• Using combinations of features might yield better
performance.
[Scatter plot in the two-dimensional feature space: x1 = lightness, x2 = width]
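A hedged sketch of combining the two plotted features in a simple linear classifier, assuming scikit-learn; the training data is synthetic.

```python
# Sketch: linear classifier on two features (lightness, width).
# The (mean, spread) values below are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
salmon = rng.normal([3.0, 4.0], 0.8, size=(200, 2))    # (lightness, width)
sea_bass = rng.normal([6.0, 6.5], 0.8, size=(200, 2))
X = np.vstack([salmon, sea_bass])
y = np.array([0] * 200 + [1] * 200)  # 0 = salmon, 1 = sea bass

clf = LogisticRegression().fit(X, y)   # learns a linear decision boundary
print(clf.predict([[4.0, 5.0], [6.5, 7.0]]))  # e.g. [0 1]
```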
• How many features should we choose?
How Many Features?
• Does adding more features always improve
performance?
• It might be difficult and computationally expensive to
extract certain features.
• Correlated features might not improve performance (i.e., they add redundancy).
• “Curse” of dimensionality.
Curse of Dimensionality
• Adding too many features can, paradoxically, lead to a
worsening of performance.
• Divide each of the input features into a number of intervals, so
that the value of a feature can be specified approximately by
saying in which interval it lies.
• If each input feature is divided into M intervals, then the total number of cells is M^d (d: number of features).
• Since each cell must contain at least one training point, the amount of training data needed grows exponentially with d.
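The exponential growth of M^d is easy to check numerically:

```python
# Number of cells M**d for M = 10 intervals per feature.
M = 10
for d in (1, 2, 5, 10, 20):
    print(f"d = {d:2d}: at least {M**d:,} cells to populate")
# d = 20 already demands 10^20 cells, far beyond any realistic dataset.
```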
Missing Features
• Certain features might be missing (e.g., due to
occlusion).
• How should we train the classifier with missing features?
• How should the classifier make the best decision with missing features?
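One common workaround (among several) is to impute missing values before training; a hedged sketch assuming scikit-learn's SimpleImputer.

```python
# Sketch: fill missing feature values with per-feature means.
# The tiny matrix below is invented; NaN marks a missing measurement.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[20.0, 5.0],
              [np.nan, 4.5],    # length missing, e.g. due to occlusion
              [25.0, np.nan]])  # width missing

imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)  # NaNs replaced by the per-feature means
```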
“Quality” of Features
• How to choose a good set of features?
• Discriminative features
• Invariant features (e.g., invariant to geometric
transformations such as translation, rotation and scale)
• Are there ways to automatically learn which features are best?
Classification
• Partition the feature space into two regions by finding
the decision boundary that minimizes the error.
• How should we find the optimal decision boundary?
Complexity of Model
• We can get perfect classification performance on the
training data by choosing a more complex model.
• Complex models are tuned to the particular training samples, rather than to the characteristics of the true model.
[Figure: an overly complex decision boundary that perfectly separates the training samples (overfitting)]
How well can the model generalize to unknown samples?
Generalization
• Generalization is defined as the ability of a classifier to
produce correct results on novel patterns.
• How can we improve generalization performance?
• More training examples (i.e., better model estimates).
• Simpler models usually yield better performance.
[Figure: decision boundaries of a complex model vs. a simpler model]
Understanding model complexity:
function approximation
• Approximate a function from a set of samples
o The green curve is the true function
o Ten noisy sample points are shown by the blue circles
Understanding model complexity:
function approximation (cont’d)
Polynomial curve fitting: polynomials of various orders, shown as red curves, fitted to the set of 10 sample points.
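A minimal numerical version of this experiment, assuming NumPy; sin(2πx) stands in for the green curve.

```python
# Fit polynomials of increasing order to 10 noisy samples of a
# known function (sin here is a stand-in for the slide's green curve).
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, 10)  # noisy samples

for order in (1, 3, 9):
    coeffs = np.polyfit(x, t, order)
    rms = np.sqrt(np.mean((np.polyval(coeffs, x) - t) ** 2))
    print(f"order {order}: training RMS error = {rms:.4f}")
# The 9th-order fit drives training error to ~0 by passing through
# every point, yet oscillates wildly between them (overfitting).
```

Rerunning the same loop with 15 or 100 samples, as on the next slide, shows the 9th-order fit becoming much better behaved.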
Understanding model complexity:
function approximation (cont’d)
Polynomial curve fitting: 9th-order polynomials fitted to 15 and 100 sample points.
Improve Classification Performance
through Post-processing
• Consider the problem of character recognition
• Exploit context to improve performance.
How m_ch info_mation are y_u mi_sing?
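A toy sketch of one way such context could be exploited: match each gapped word against a lexicon. The tiny lexicon and the recover function are hypothetical, purely for illustration.

```python
# Toy post-processing: recover a character the recognizer missed by
# matching the gapped word against a (hypothetical) small lexicon.
import re

LEXICON = {"how", "much", "information", "are", "you", "missing"}

def recover(word_with_gap):
    """Replace '_' with whatever makes the word a lexicon entry."""
    pattern = re.compile("^" + word_with_gap.replace("_", ".") + "$")
    matches = [w for w in LEXICON if pattern.match(w)]
    return matches[0] if len(matches) == 1 else word_with_gap

print(recover("m_ch"), recover("info_mation"), recover("mi_sing"))
# much information missing
```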
Improve Classification Performance
through Ensembles of Classifiers
• Performance can be improved using a "pool" of classifiers.
• How should we build and combine different classifiers?
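A hedged sketch of one way to build such a pool, assuming scikit-learn's VotingClassifier; the dataset is synthetic.

```python
# Sketch: a "pool" of three different classifiers combined by
# majority vote; make_classification generates synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
pool = VotingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("knn", KNeighborsClassifier()),
                ("tree", DecisionTreeClassifier(random_state=0))],
    voting="hard")  # each classifier gets one vote
pool.fit(X, y)
print(f"training accuracy: {pool.score(X, y):.3f}")
```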
Cost of Misclassifications
• Fish classification: two possible classification
errors:
(1) Deciding the fish was a sea bass when it was a
salmon.
(2) Deciding the fish was a salmon when it was a sea
bass.
• Are both errors equally important?
Cost of Misclassifications
(cont’d)
• Suppose that:
• Customers who buy salmon will object vigorously if
they see sea bass in their cans.
• Customers who buy sea bass will not be unhappy if
they occasionally see some expensive salmon in their
cans.
• How does this knowledge affect our decision?
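It suggests shifting the decision toward the cheaper kind of error; a sketch of a minimum-expected-cost rule, with invented cost values that reflect the story above.

```python
# Sketch: decide the class that minimizes expected cost rather than
# the most probable class. The cost values are hypothetical.
COST = {  # COST[decision][true_class]
    "salmon":   {"salmon": 0.0, "sea_bass": 10.0},  # bass in a salmon can: costly
    "sea_bass": {"salmon": 1.0, "sea_bass": 0.0},   # salmon in a bass can: mild
}

def decide(p_salmon):
    """p_salmon: the classifier's posterior probability of 'salmon'."""
    expected = {
        d: COST[d]["salmon"] * p_salmon + COST[d]["sea_bass"] * (1 - p_salmon)
        for d in COST
    }
    return min(expected, key=expected.get)

print(decide(0.60))  # 'sea_bass': even at 60% salmon, the risk of bass
                     # in a salmon can outweighs the reverse error
```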
Computational Complexity
• How does an algorithm scale with the number of:
• features
• patterns
• categories
• Need to consider tradeoffs between computational
complexity and performance.
Would it be possible to build a
“general purpose” PR system?
• It would be very difficult to design a system that is
capable of performing a variety of classification
tasks.
• Different problems require different features.
• Different features might yield different solutions.
• Different tradeoffs exist for different problems.