Lecture 12 - Training Methods

The document discusses various training methods for classification systems, focusing on the importance of training and test data, feature extraction, and the complexities involved in model selection. It highlights the challenges of choosing the right features, the curse of dimensionality, and the significance of generalization in achieving accurate results. Additionally, it addresses the implications of misclassification costs and the computational complexity of algorithms in relation to performance.


TRAINING METHODS

Calvin Abonga
Training/Test phases
Training/Test data
• How do we know that we have collected an
adequately large and representative set of
examples for training/testing the system?

Training Set ?

Test Set ?

3
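One common convention for the training/test question above is a random hold-out split. This is a minimal sketch, not from the slides; the 70/30 ratio and the toy data are arbitrary choices:

```python
import random

def train_test_split(samples, test_fraction=0.3, seed=0):
    """Shuffle a dataset and split it into training and test subsets."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(100))                # 100 hypothetical fish measurements
train_set, test_set = train_test_split(data)
```

Whether such a split is "adequately large and representative" still has to be judged per problem; a single random split only guarantees the two sets are disjoint.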
Pre-processing Step

Example:

(1) Image enhancement
(2) Separate touching or occluding fish
(3) Find the boundary of each fish

4
Sensors & Preprocessing
• Sensing:
• Use a sensor (camera or microphone) for data capture.
• Pattern recognition (PR) performance depends on the bandwidth, resolution,
sensitivity, and distortion of the sensor.

• Pre-processing:
• Removal of noise in data.
• Segmentation (i.e., isolation of patterns of interest from
background).

5
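Segmentation can be sketched in its simplest form as global thresholding: pixels brighter than a cutoff are treated as foreground (fish), the rest as background. This toy example is an assumption for illustration, not the slides' method:

```python
def segment(image, threshold):
    """Binary segmentation: mark pixels above the threshold as foreground (1)."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]

# A tiny 2x4 "image" of grey levels; bright pixels belong to the fish.
image = [
    [10, 12, 200, 210],
    [11, 180, 220, 15],
]
mask = segment(image, threshold=100)
```

Real preprocessing pipelines use far more robust methods (adaptive thresholds, edge detection), but the idea of isolating patterns of interest from the background is the same.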
Feature Extraction
• Assume a fisherman told us that a sea bass is
generally longer than a salmon.

• We can use length as a feature and decide between sea bass and salmon
according to a threshold on length.

• How should we choose the threshold?

6
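The single-threshold decision rule above is trivial to state in code. The threshold value of 50 cm here is a hypothetical placeholder; choosing it well is exactly the question the slide raises:

```python
def classify_by_length(length, threshold):
    """Decide by a single length threshold l*: longer fish are labeled
    sea bass, shorter ones salmon."""
    return "sea bass" if length > threshold else "salmon"

l_star = 50  # hypothetical threshold in cm
labels = [classify_by_length(l, l_star) for l in (30, 48, 55, 70)]
```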
“Length” Histograms

threshold l*

• Even though sea bass are longer than salmon on average, there are many
examples of fish for which this observation does not hold.

7
“Average Lightness” Histograms
• Consider a different feature such as “average
lightness”

threshold x*

• It seems easier to choose the threshold x*, but we still cannot make a
perfect decision.
8
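One simple way to pick x* from labeled training histograms is to scan candidate thresholds and keep the one with the fewest training errors. This brute-force sketch (with made-up lightness values) is an illustration, not a rule given in the slides:

```python
def best_threshold(salmon, bass):
    """Scan candidate thresholds; classify values above the threshold as
    sea bass, and pick the threshold with the fewest training errors."""
    candidates = sorted(set(salmon + bass))
    best_t, best_err = None, len(salmon) + len(bass) + 1
    for t in candidates:
        errors = (sum(1 for v in salmon if v > t)      # salmon called bass
                  + sum(1 for v in bass if v <= t))    # bass called salmon
        if errors < best_err:
            best_t, best_err = t, errors
    return best_t, best_err

# Illustrative "average lightness" samples
salmon_lightness = [1.0, 1.2, 1.5, 2.0, 2.2]
bass_lightness = [1.9, 2.5, 2.8, 3.0, 3.3]
t_star, err = best_threshold(salmon_lightness, bass_lightness)
```

Note the minimum training error here is 1, not 0: the class histograms overlap, which is exactly why no threshold gives a perfect decision.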
Multiple Features
• To improve recognition accuracy, we might have to
use more than one feature.
• Single features might not yield the best performance.
• Using combinations of features might yield better
performance.

 x1  x1 : lightness
 x  x : width
 2 2
• How many features should we choose?

9
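With a two-dimensional feature vector (lightness, width), one of the simplest decision rules is nearest-mean classification: assign a fish to the class whose mean feature vector is closest. A sketch with invented sample values:

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def nearest_mean_classify(x, class_means):
    """Assign x to the class whose mean is closest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(class_means, key=lambda label: dist2(x, class_means[label]))

# (lightness, width) training samples -- illustrative numbers only
salmon = [(1.0, 4.0), (1.5, 5.0), (2.0, 4.5)]
bass = [(3.0, 6.5), (3.5, 7.0), (2.8, 6.0)]
class_means = {"salmon": mean(salmon), "sea bass": mean(bass)}
label = nearest_mean_classify((1.2, 4.2), class_means)
```

The implied decision boundary is the perpendicular bisector between the two class means, a first concrete answer to "how should we find the decision boundary?".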
How Many Features?

• Does adding more features always improve performance?
• It might be difficult and computationally expensive to
extract certain features.
• Correlated features might not improve performance
(i.e. redundancy).
• “Curse” of dimensionality.

10
Curse of Dimensionality
• Adding too many features can, paradoxically, lead to a
worsening of performance.
• Divide each of the input features into a number of intervals, so
that the value of a feature can be specified approximately by
saying in which interval it lies.

• If each input feature is divided into M intervals, then the total number
of cells is M^d (d: number of features).
• Since each cell must contain at least one training point, the amount of
training data required grows exponentially with d.

11
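The exponential blow-up M^d is easy to make concrete. With M = 10 intervals per feature, the cell count (and hence the minimum number of training samples under the argument above) explodes as d grows:

```python
# Number of cells when each of d features is split into M intervals.
# Each cell needs at least one training sample, so the sample
# requirement grows exponentially with the number of features d.
M = 10
cells = {d: M ** d for d in (1, 2, 3, 6)}
# d=1 needs 10 samples, d=2 needs 100, d=3 needs 1000, d=6 needs a million.
```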
Missing Features
• Certain features might be missing (e.g., due to
occlusion).
• How should we train the classifier with missing
features ?
• How should the classifier make the best decision
with missing features ?

12
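One widely used (if crude) answer to the training question is mean imputation: fill each missing value with the average of the observed values for that feature. This is a sketch of one option, not a recommendation from the slides:

```python
def impute_missing(samples):
    """Replace missing feature values (None) with the per-feature mean
    of the observed values."""
    d = len(samples[0])
    means = []
    for j in range(d):
        observed = [s[j] for s in samples if s[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if s[j] is None else s[j] for j in range(d)]
            for s in samples]

# Two features; None marks a value lost to occlusion.
data = [[1.0, 4.0], [None, 6.0], [3.0, None]]
completed = impute_missing(data)
```

At decision time the analogous trick is to marginalize over (or impute) the missing feature rather than discard the sample.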
“Quality” of Features
• How to choose a good set of features?
• Discriminative features

• Invariant features (e.g., invariant to geometric transformations such as
translation, rotation, and scale)
• Are there ways to automatically learn which features are best?

13
Classification
• Partition the feature space into two regions by finding
the decision boundary that minimizes the error.

• How should we find the optimal decision boundary?

14
Complexity of Model
• We can get perfect classification performance on the
training data by choosing a more complex model.
• Complex models are tuned to the particular training samples rather than
to the characteristics of the true underlying model.

overfitting

How well can the model generalize to unknown samples?


15
Generalization
• Generalization is defined as the ability of a classifier to
produce correct results on novel patterns.
• How can we improve generalization performance ?
• More training examples (i.e., better model estimates).
• Simpler models usually yield better performance.

complex model simpler model

16
Understanding model complexity:
function approximation
• Approximate a function from a set of samples
o Green curve is the true function
o Ten sample points are shown by the blue circles
(assuming noise)

17
Understanding model complexity:
function approximation (cont’d)

Polynomial curve fitting: polynomials of various orders, shown as red
curves, fitted to the set of 10 sample points.

18
Understanding model complexity:
function approximation (cont’d)

Polynomial curve fitting: 9th-order polynomials fitted to 15 and 100
sample points.

19
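The overfitting story of the curve-fitting slides can be reproduced numerically. A 9th-order polynomial through 10 noisy samples of sin(2πx) fits the training points exactly (via Lagrange interpolation, used here as a stand-in for least-squares fitting at full order), yet deviates badly from the true function between them. The noise values below are invented for the demonstration:

```python
import math

def lagrange_interpolant(xs, ys):
    """Exact interpolating polynomial through all sample points --
    the analogue of a 9th-order fit to 10 points."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

# Ten noisy samples of the "true" function sin(2*pi*x)
xs = [i / 9 for i in range(10)]
noise = [0.1, -0.08, 0.05, -0.1, 0.07, -0.05, 0.09, -0.07, 0.04, -0.09]
ys = [math.sin(2 * math.pi * x) + e for x, e in zip(xs, noise)]

p = lagrange_interpolant(xs, ys)
train_error = max(abs(p(x) - y) for x, y in zip(xs, ys))   # zero by construction
test_xs = [(i + 0.5) / 9 for i in range(9)]                # points between samples
test_error = max(abs(p(x) - math.sin(2 * math.pi * x)) for x in test_xs)
```

The training error is exactly zero while the error on unseen points is not: the high-order model has memorized the noise, which is the generalization failure the slides illustrate.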
Improve Classification Performance
through Post-processing
• Consider the problem of character recognition
• Exploit context to improve performance.

How m ch info
mation are y u mi
sing?

20
Improve Classification Performance
through Ensembles of Classifiers
• Performance can be
improved using a "pool" of
classifiers.

• How should we build and combine different classifiers?

21
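The simplest combination rule for a pool of classifiers is majority voting. The three threshold classifiers below, each using a different feature, are hypothetical stand-ins for the pool:

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Combine a pool of classifiers by letting each one vote."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Three simple threshold classifiers on different (hypothetical) features
pool = [
    lambda fish: "sea bass" if fish["length"] > 50 else "salmon",
    lambda fish: "sea bass" if fish["lightness"] > 2.0 else "salmon",
    lambda fish: "sea bass" if fish["width"] > 6.0 else "salmon",
]
fish = {"length": 55, "lightness": 1.8, "width": 6.5}
decision = majority_vote(pool, fish)   # two of three vote "sea bass"
```

Voting helps most when the individual classifiers make independent errors; how to build such diverse classifiers is the harder design question.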
Cost of misclassifications
• Fish classification: two possible classification
errors:

(1) Deciding the fish was a sea bass when it was a salmon.
(2) Deciding the fish was a salmon when it was a sea bass.

• Are both errors equally important ?

22
Cost of misclassifications (cont’d)
• Suppose that:
• Customers who buy salmon will object vigorously if
they see sea bass in their cans.
• Customers who buy sea bass will not be unhappy if
they occasionally see some expensive salmon in their
cans.

• How does this knowledge affect our decision?

23
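This asymmetry can be encoded in a loss matrix and used to pick the decision with the smaller expected loss. The cost values below are illustrative assumptions; only the asymmetry (bass in salmon cans costs more) reflects the scenario above:

```python
# loss[true_class][decided_class]; numbers are illustrative only.
loss = {
    "salmon":   {"salmon": 0.0,  "sea bass": 1.0},
    "sea bass": {"salmon": 10.0, "sea bass": 0.0},
}

def decide(p_salmon, loss):
    """Choose the class with the smaller expected loss, given the
    posterior probability p(salmon | x)."""
    p_bass = 1.0 - p_salmon
    risk_if_salmon = p_bass * loss["sea bass"]["salmon"]   # decide salmon, truth bass
    risk_if_bass = p_salmon * loss["salmon"]["sea bass"]   # decide bass, truth salmon
    return "salmon" if risk_if_salmon < risk_if_bass else "sea bass"

# Even a fish that is probably salmon gets labeled sea bass:
# the asymmetric costs shift the decision boundary.
decision = decide(p_salmon=0.8, loss=loss)
```

Only when the evidence for salmon is overwhelming (here, above about 10/11) does the expected-loss rule actually decide salmon.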
Computational Complexity
• How does an algorithm scale with the number of:
• features
• patterns
• categories
• Need to consider tradeoffs between computational
complexity and performance.

24
Would it be possible to build a
“general purpose” PR system?

• It would be very difficult to design a system capable of performing a
variety of classification tasks.
• Different problems require different features.
• Different features might yield different solutions.
• Different tradeoffs exist for different problems.

25
