
Classification and Regression Trees (CART) Algorithm
Prakash P
Classification and Regression Trees
• Classification and Regression Trees (CART) is a modern term for what are otherwise known as Decision Trees.
• Decision Trees have been around for a very long time and are important for predictive modelling in Machine Learning.
• As the name suggests, these trees are used for both classification and regression (prediction) problems.
• They also serve as the basis for more modern ensemble classifiers such as Random Forests.
Classification
• Generally, a classification problem can be described as follows:
Data: A set of records (instances), each described by:
  * k attributes: A1, A2, ..., Ak
  * A class: one of a discrete set of labels
Goal: To learn a classification model from the data that can be used to predict the classes of new (future, or test) instances.
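As a minimal sketch of this setup (not from the slides - the feature values, labels, and attribute meanings below are invented for illustration), here is a toy classification problem with k = 2 attributes and a discrete class, fitted with scikit-learn's CART-style tree:

# Minimal sketch of a classification problem: instances with k attributes
# and a discrete class label. The data is invented purely for illustration.
from sklearn.tree import DecisionTreeClassifier

# Instances described by k = 2 attributes (say, A1 = owns_house, A2 = age_group)
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 2], [0, 2]]
# Discrete class labels (1 = loan approved, 0 = not approved)
y = [1, 1, 0, 0, 1, 0]

# scikit-learn's DecisionTreeClassifier uses the Gini index by default,
# in line with the CART algorithm described in these slides.
clf = DecisionTreeClassifier(criterion="gini")
clf.fit(X, y)

# Predict the class of a new (test) instance
print(clf.predict([[1, 1]]))  # -> [1]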
Classification
However, we must note that there can be many other possible decision trees for a given problem - we want the shortest (simplest) one.
We also want it to be better in terms of accuracy (prediction error measured in terms of misclassification cost).
CART Algorithm for Classification
• The tree is constructed in a top-down approach as follows:

Step 1: Start at the root node with all training instances.

Step 2: Select an attribute on the basis of a splitting criterion (CART uses the Gini Index; related algorithms use other impurity metrics such as Gain Ratio).

Step 3: Partition the instances according to the selected attribute, and recurse on each partition.
Partitioning stops when (see the sketch below):

 There are no examples left
 All examples at a given node belong to the same class
 There are no remaining attributes for further partitioning - the majority class becomes the leaf
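The following sketch (an illustration, not the slides' exact procedure) shows how these steps and stopping conditions fit together. The impurity measure here is a simple placeholder - CART's actual measure, the Gini Index, is defined on a later slide:

# Sketch of top-down CART-style construction.
# Instances are (feature_dict, label) pairs.
from collections import Counter

def impurity(labels):
    # Placeholder impurity: fraction of labels outside the majority class.
    # (CART uses the Gini Index instead, defined on a later slide.)
    n = len(labels)
    return 1 - Counter(labels).most_common(1)[0][1] / n

def weighted_impurity(instances, attr):
    # Size-weighted impurity over the partitions induced by attr
    n = len(instances)
    score = 0.0
    for value in {f[attr] for f, _ in instances}:
        part = [label for f, label in instances if f[attr] == value]
        score += len(part) / n * impurity(part)
    return score

def build_tree(instances, attributes):
    # Stop: there are no examples left at this node
    if not instances:
        return None
    labels = [label for _, label in instances]
    # Stop: all examples belong to the same class -> pure leaf
    if len(set(labels)) == 1:
        return labels[0]
    # Stop: no remaining attributes -> majority class becomes the leaf
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Step 2: choose the attribute with the lowest weighted impurity
    best = min(attributes, key=lambda a: weighted_impurity(instances, a))
    # Step 3: partition on that attribute and recurse on each subset
    rest = [a for a in attributes if a != best]
    return (best, {
        v: build_tree([(f, l) for f, l in instances if f[best] == v], rest)
        for v in {f[best] for f, _ in instances}
    })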
What is Impurity?
• In our dataset, we can see that a loan is always approved when the applicant owns their own house. This attribute is very informative (and certain), and it is hence set as the root node of the alternative decision tree shown previously.
• Classifying a lot of future applicants will then be easy.
• Selecting the age attribute is not as informative - there is a degree of uncertainty (or impurity), since a person's age does not seem to affect the final class as much.
• Based on the above discussion:
• A subset of data is pure if all instances belong to the same class.
• Our objective is to reduce impurity or uncertainty in the data as much as possible.
• The metric (or heuristic) used in CART to measure impurity is the Gini Index, and we select the attributes with lower Gini Indices first.
Gini Index
• The Gini index (or Gini impurity) measures the probability that a randomly chosen element would be wrongly classified if it were labelled randomly according to the class distribution.
• But what is actually meant by 'impurity'? If all the elements belong to a single class, then the subset can be called pure.
• The Gini index varies between 0 and 1, where 0 denotes that all elements belong to a single class, and values approaching 1 denote that the elements are distributed randomly across many classes (for k classes the maximum is 1 - 1/k).
• A Gini index of 0.5 denotes elements equally distributed across two classes.
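Concretely, for a node where a fraction p_i of the instances belongs to class i, the Gini index is Gini = 1 - sum(p_i^2). A minimal Python sketch (not from the slides):

# Gini impurity: 1 - sum of squared class proportions
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["yes", "yes", "yes"]))       # 0.0  (pure: a single class)
print(gini(["yes", "no", "yes", "no"]))  # 0.5  (two classes, evenly split)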
Example of Gini Index
Let's start by calculating the Gini Index for 'Past Trend'.
Calculation of Gini Index for Open Interest
Calculation of Gini Index for Trading Volume
Example
From the above table, we observe that 'Past Trend' has the lowest Gini Index, and hence it will be chosen as the root node of the decision tree.
We will repeat the same procedure to determine the sub-nodes or branches of the decision tree.
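As a sketch of this attribute-selection step (the slides' table is not reproduced here, so the dataset below is invented; only the attribute and class names follow the slides' stock example): the Gini of a split is the size-weighted average of the Gini of each branch, and the attribute with the lowest value becomes the node.

# Weighted Gini of a split, reusing gini() from the sketch above.
# instances: list of (feature_dict, label) pairs.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_gini(instances, attr):
    n = len(instances)
    score = 0.0
    for value in {f[attr] for f, _ in instances}:
        part = [label for f, label in instances if f[attr] == value]
        score += len(part) / n * gini(part)
    return score

# Hypothetical stand-in for the slides' stock dataset
data = [
    ({"past_trend": "Positive", "open_interest": "Low",  "trading_volume": "High"}, "Up"),
    ({"past_trend": "Positive", "open_interest": "High", "trading_volume": "Low"},  "Up"),
    ({"past_trend": "Negative", "open_interest": "High", "trading_volume": "High"}, "Down"),
    ({"past_trend": "Negative", "open_interest": "Low",  "trading_volume": "Low"},  "Down"),
]
for attr in ("past_trend", "open_interest", "trading_volume"):
    print(attr, weighted_gini(data, attr))
# past_trend 0.0, open_interest 0.5, trading_volume 0.5
# -> 'past_trend' has the lowest weighted Gini, so it becomes the root.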
Gini Index for the ‘Positive’ branch
Calculation of Gini Index of Open Interest for Positive Past Trend
Calculation of Gini Index of Trading Volume for Positive Past Trend
Gini Index for the ‘Positive’ branch
We will split the node further using the ‘Trading Volume’ feature, as it has the minimum Gini index.
Unlike information gain, the Gini Index is not as computationally intensive, since it avoids the logarithm function used to calculate entropy in information gain; this is why the Gini Index is often preferred over information gain.
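To make the comparison concrete, here is a side-by-side sketch of the two measures: entropy needs one logarithm per class, while Gini needs only a square.

import math
from collections import Counter

def entropy(labels):
    # Entropy: -sum(p * log2(p)) - one logarithm per class
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    # Gini: 1 - sum(p^2) - only squaring, no logarithm
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

labels = ["Up", "Up", "Down", "Down"]
print(entropy(labels))  # 1.0
print(gini(labels))     # 0.5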
