
Machine Learning

Artificial intelligence:
1. "It is a branch of computer science by which we can create intelligent machines
that can behave like humans, think like humans, and make decisions."
2. According to the father of Artificial Intelligence, John McCarthy, it is "the science and
engineering of making intelligent machines, especially intelligent computer programs."
3. Artificial Intelligence is a way of making a computer, a computer-controlled
robot, or a piece of software think intelligently, in a manner similar to how
intelligent humans think.
4. The term combines two words, "artificial" and "intelligence", which
together mean "a human-made thinking power."
5. Artificial intelligence is a technology with which we can create intelligent systems
that can simulate human intelligence.
6. AI encompasses a range of abilities including learning, reasoning, perception,
problem solving, data analysis and language comprehension.
7. The ultimate goal of AI is to create machines that can emulate human capabilities and
carry out diverse tasks with enhanced efficiency and precision. The field of AI holds
the potential to revolutionize aspects of our daily lives.
8. Today, the term "AI" describes a wide range of technologies that power many of the
services and goods we use every day, from apps that recommend TV shows to
chatbots that provide customer support in real time.
9. Strong AI is essentially AI that is capable of human-level, general intelligence. In
other words, it’s just another way to say “artificial general intelligence.”
10. Weak AI, meanwhile, refers to the narrow use of widely available AI technology, like
machine learning or deep learning, to perform very specific tasks, such as playing
chess, recommending songs, or steering cars. Also known as Artificial Narrow
Intelligence (ANI), weak AI is essentially the kind of AI we use daily.

Uses of Artificial Intelligence:

1. Healthcare: AI is used for medical diagnosis, drug discovery, and
predictive analysis of diseases.
2. Finance: AI helps in credit scoring, fraud detection, and financial
forecasting.
3. Retail: AI is used for product recommendations, price optimization,
and supply chain management.
4. Manufacturing: AI helps in quality control, predictive maintenance,
and production optimization.
5. Transportation: AI is used for autonomous vehicles, traffic prediction,
and route optimization.
6. Customer service: AI-powered chatbots are used for customer
support, answering frequently asked questions, and handling simple
requests.
7. Security: AI is used for facial recognition, intrusion detection, and
cybersecurity threat analysis.
8. Marketing: AI is used for targeted advertising, customer
segmentation, and sentiment analysis.
9. Education: AI is used for personalized learning, adaptive testing, and
intelligent tutoring systems.

Machine Learning:
1. Machine Learning is the science (and art) of programming computers
so they can learn from data.
2. A subset of artificial intelligence known as machine learning focuses
primarily on the creation of algorithms that enable a computer to
independently learn from data and previous experiences.
3. Arthur Samuel first used the term "machine learning" in 1959.
4. Machine Learning is the field of study that gives computers the ability
to learn without being explicitly programmed. —Arthur Samuel, 1959
5. A computer program is said to learn from experience E with respect to
some task T and some performance measure P, if its performance on T,
as measured by P, improves with experience E. —Tom Mitchell, 1997
6. Examples
1. Handwriting recognition learning problem
Task T : Recognizing and classifying handwritten words within images
Performance P : Percent of words correctly classified
Training experience E : A dataset of handwritten words with given
classifications
2. A robot driving learning problem
Task T : Driving on highways using vision sensors
Performance P : Average distance traveled before an error
Training experience E : A sequence of images and steering commands
recorded while observing a human driver
7. In short, a machine can learn if its performance on a task improves as it
gains more data, as the sketch below illustrates.
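
Below is a minimal sketch of Mitchell's T/P/E framing using scikit-learn (the digits
dataset and logistic regression are illustrative choices, not from these notes): task T
is digit classification, performance P is test accuracy, and experience E is the number
of training examples.

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for n in (50, 200, 800):                      # growing experience E
    clf = LogisticRegression(max_iter=5000)
    clf.fit(X_train[:n], y_train[:n])         # learn task T from n examples
    print(n, clf.score(X_test, y_test))       # performance P usually rises with E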

How does Machine Learning work


A machine learning system builds prediction models, learns from previous
data, and predicts the output for new data whenever it receives it. The
more data it has, the better the model it can build, and the more accurate
the predicted output will be.

The operation of a machine learning algorithm can be depicted as a block
diagram: historical data feeds the learning algorithm, which builds a model
that predicts outputs for new data.

Features of Machine Learning:

o Machine learning uses data to detect various patterns in a given
dataset.

o It can learn from past data and improve automatically.

o It is a data-driven technology.

o Machine learning is similar to data mining, as both deal with huge
amounts of data.

Classification of Machine Learning


At a broad level, machine learning can be classified into three types:

1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised learning:
1. In supervised learning, sample labeled data are provided to the
machine learning system for training, and the system then predicts the
output based on the training data.

2. The system uses labeled data to build a model that understands the
datasets and learns about each one. After training and processing
are done, we test the model with sample data to see if it can
accurately predict the output.

3. The objective of supervised learning is to map input data to output
data. Supervised learning depends on supervision, much as a student
learns under the guidance of a teacher. Spam filtering is an example of
supervised learning.

Supervised learning can be grouped further into two categories of algorithms:

o Classification
o Regression

Here are some of the most important supervised learning algorithms:

• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks
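
As a concrete illustration (a minimal sketch of my own, using scikit-learn's built-in
iris dataset rather than anything from these notes), two of the algorithms above can
be trained on labeled data and evaluated in a few lines:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (KNeighborsClassifier(), DecisionTreeClassifier(random_state=0)):
    model.fit(X_train, y_train)    # learn from labeled examples
    print(type(model).__name__, model.score(X_test, y_test))  # test accuracy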

Applications of Supervised Learning:

o Image classification: Identify objects, faces, and other features in
images.

o Natural language processing: Extract information from text, such
as sentiment, entities, and relationships.

o Speech recognition: Convert spoken language into text.

o Recommendation systems: Make personalized recommendations to
users.

o Predictive analytics: Predict outcomes, such as sales, customer
churn, and stock prices.

o Medical diagnosis: Detect diseases and other medical conditions.

o Fraud detection: Identify fraudulent transactions.

2) Unsupervised Learning
1. Unsupervised learning is a learning method in which a machine learns
without any supervision.

2. The training is provided to the machine with a set of data that has
not been labeled, classified, or categorized, and the algorithm must
act on that data without any supervision. The goal of unsupervised
learning is to restructure the input data into new features or groups of
objects with similar patterns.

3. In unsupervised learning, we don't have a predetermined result. The
machine tries to find useful insights from a huge amount of data. It
can be further classified into two categories of algorithms:

o Clustering
o Association

Important unsupervised learning algorithms include:

• Clustering
  o K-Means
  o DBSCAN
  o Hierarchical Cluster Analysis (HCA)
• Anomaly detection and novelty detection
  o One-class SVM
  o Isolation Forest
• Visualization and dimensionality reduction
  o Principal Component Analysis (PCA)
  o Kernel PCA
  o Locally Linear Embedding (LLE)
  o t-distributed Stochastic Neighbor Embedding (t-SNE)
• Association rule learning
  o Apriori
  o Eclat
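
As a minimal clustering sketch (synthetic data of my own, for illustration only),
K-Means can group unlabeled points without ever seeing a target label:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# two synthetic blobs, with no labels attached
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)   # the two discovered group centers
print(kmeans.labels_[:5])        # cluster assignment for the first points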

Applications of Unsupervised Learning:

o Clustering: Group similar data points into clusters.

o Anomaly detection: Identify outliers or anomalies in data.

o Dimensionality reduction: Reduce the dimensionality of data while
preserving its essential information.

o Recommendation systems: Suggest products, movies, or content to
users based on their historical behavior or preferences.

o Topic modeling: Discover latent topics within a collection of
documents.

o Density estimation: Estimate the probability density function of data.

o Image and video compression: Reduce the amount of storage
required for multimedia content.

Semi-supervised learning:
In semi-supervised learning, an incomplete training signal is given: a training
set with some (often many) of the target outputs missing. A special case of
this principle is known as transduction, where the entire set of problem
instances is known at learning time, but part of the targets are missing.
Semi-supervised learning is an approach to machine learning that combines a
small amount of labeled data with a large amount of unlabeled data during
training. It falls between unsupervised learning and supervised learning.
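
A hedged sketch of this idea with scikit-learn's SelfTrainingClassifier (the iris
dataset and the 80% label-hiding ratio are illustrative assumptions): unlabeled
targets are marked with -1, and the model iteratively pseudo-labels them using a
confident base classifier.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = load_iris(return_X_y=True)
y_partial = y.copy()
rng = np.random.default_rng(42)
y_partial[rng.random(len(y)) < 0.8] = -1    # hide ~80% of the labels

model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)                     # trains on labeled + unlabeled data
print(model.score(X, y))                    # evaluated against the true labels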

Applications of Semi-Supervised Learning:


 Image Classification and Object Recognition: Improve the
accuracy of models by combining a small set of labeled images with a
larger set of unlabeled images.
 Natural Language Processing (NLP): Enhance the performance of
language models and classifiers by combining a small set of labeled
text data with a vast amount of unlabeled text.

 Speech Recognition: Improve the accuracy of speech recognition by


leveraging a limited amount of transcribed speech data and a more
extensive set of unlabeled audio.

 Recommendation Systems: Improve the accuracy of personalized


recommendations by supplementing a sparse set of user-item
interactions (labeled data) with a wealth of unlabeled user behavior
data.

 Healthcare and Medical Imaging: Enhance medical image analysis


by utilizing a small set of labeled medical images alongside a larger set
of unlabeled images.

3) Reinforcement Learning
1. Reinforcement learning is a feedback-based learning method in which
a learning agent gets a reward for each right action and a penalty
for each wrong action. The agent learns automatically from this
feedback and improves its performance. In reinforcement learning,
the agent interacts with the environment and explores it. The goal of
the agent is to collect the maximum reward, and it improves its
performance accordingly.
2. A robotic dog that automatically learns the movement of its limbs
is an example of reinforcement learning.
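
To make the reward/penalty idea concrete, here is a toy sketch (entirely
illustrative, not from these notes): tabular Q-learning on a five-state corridor
where moving right eventually reaches a reward, so the agent improves purely from
feedback.

import random

n_states, actions = 5, (-1, +1)             # move left / move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1       # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:                # rightmost state is the goal
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else -0.01    # reward vs. small penalty
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

print({s: max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states)})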
Limitations of Machine Learning:

1. The primary challenge of machine learning is the lack of data, or of
diversity in the dataset.

2. A machine cannot learn if there is no data available. Besides, a
dataset lacking diversity gives the machine a hard time.

3. A machine needs heterogeneous data to learn meaningful insights.

4. It is rare that an algorithm can extract information when there are no
or few variations.

5. It is recommended to have at least 20 observations per group to help
the machine learn; with fewer, evaluation and prediction tend to be
poor.
Applications of Reinforcement Machine Learning:
Here are some applications of reinforcement learning:

o Game Playing: RL can teach agents to play games, even complex
ones.

o Robotics: RL can teach robots to perform tasks autonomously.

o Autonomous Vehicles: RL can help self-driving cars navigate and
make decisions.

o Recommendation Systems: RL can enhance recommendation
algorithms by learning user preferences.

o Healthcare: RL can be used to optimize treatment plans and drug
discovery.

o Natural Language Processing (NLP): RL can be used in dialogue
systems and chatbots.

o Finance and Trading: RL can be used for algorithmic trading.

What is an ML pipeline?
o An ML pipeline expresses the workflow, providing a systematic way to
proceed with building a machine learning model.

o ML pipelines automate the machine learning process; following the
pipeline makes building ML models systematic and easy.

Diagrammatically, an ML pipeline is a chain of stages such as data collection,
preprocessing, model training, evaluation, and deployment.
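
A minimal sketch of such a pipeline with scikit-learn's Pipeline class (the scaler,
classifier, and iris dataset are illustrative choices, not from these notes): each
stage is chained so the whole workflow trains and predicts as one systematic unit.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ('scale', StandardScaler()),                  # preprocessing stage
    ('clf', LogisticRegression(max_iter=1000)),   # model stage
])
pipe.fit(X_train, y_train)        # runs every stage in order
print(pipe.score(X_test, y_test))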


Deep Learning:
1. Deep learning is the branch of machine learning that is based on
artificial neural network architectures.
2. An artificial neural network (ANN) uses layers of interconnected nodes
called neurons that work together to process and learn from the input
data.
3. In a fully connected deep neural network, there is an input layer and
one or more hidden layers connected one after the other.
4. Each neuron receives input from the previous layer's neurons or from
the input layer.
5. The output of one neuron becomes the input to other neurons in the
next layer of the network, and this process continues until the final
layer produces the output of the network.
6. The layers of the neural network transform the input data through a
series of nonlinear transformations, allowing the network to learn
complex representations of the input data.

Artificial neural networks

1. Artificial neural networks are built on the principles of the structure and
operation of human neurons.
2. They are also known as neural networks or neural nets.
3. An artificial neural network's input layer, which is the first layer,
receives input from external sources and passes it on to the hidden
layer, which is the second layer.
4. Each neuron in the hidden layer gets information from the neurons in
the previous layer, computes the weighted total, and then transfers it
to the neurons in the next layer.
5. These connections are weighted: the influence of each input from the
preceding layer is scaled up or down by giving it a distinct weight.
6. These weights are then adjusted during the training process to
enhance the performance of the model.
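
The weighted-total step in point 4 can be sketched in a few lines of NumPy (an
illustrative, untrained example; the layer sizes and ReLU activation are my
assumptions):

import numpy as np

x = np.array([0.5, -1.2, 3.0])                      # input layer (3 features)
W1 = np.random.default_rng(0).normal(size=(4, 3))   # weights of 4 hidden neurons
b1 = np.zeros(4)                                    # hidden-layer biases

z = W1 @ x + b1            # each neuron's weighted total of its inputs
h = np.maximum(z, 0)       # nonlinear activation (ReLU)
print(h)                   # becomes the input to the next layer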
Types of neural networks:
Deep learning models are able to automatically learn features from the data,
which makes them well-suited for tasks such as image recognition, speech
recognition, and natural language processing. The most widely used
architectures in deep learning are feedforward neural networks,
convolutional neural networks (CNNs), and recurrent neural networks (RNNs).

1. Feedforward neural networks (FNNs) are the simplest type of ANN, with
a linear flow of information through the network. FNNs have been
widely used for tasks such as image classification, speech recognition,
and natural language processing.

2. Convolutional Neural Networks (CNNs) are designed specifically for
image and video recognition tasks. CNNs are able to automatically
learn features from images, which makes them well-suited for tasks
such as image classification, object detection, and image segmentation.

3. Recurrent Neural Networks (RNNs) are a type of neural network able
to process sequential data, such as time series and natural language.
RNNs can maintain an internal state that captures information about
previous inputs, which makes them well-suited for tasks such as speech
recognition, natural language processing, and language translation.

AI vs ML:

1. AI: The term "Artificial Intelligence" was first used by John McCarthy in 1956,
   who also hosted the first AI conference.
   ML: The term "Machine Learning" was first used in 1959 by IBM computer scientist
   Arthur Samuel, a pioneer in artificial intelligence and computer games.
2. AI: AI stands for Artificial Intelligence, where intelligence is defined as the
   ability to acquire and apply knowledge.
   ML: ML stands for Machine Learning, which is defined as the acquisition of
   knowledge or skill.
3. AI: AI is the broader family consisting of ML and DL as its components.
   ML: Machine Learning is a subset of Artificial Intelligence.
4. AI: The aim is to increase the chance of success, not accuracy.
   ML: The aim is to increase accuracy, but it does not care about success.
5. AI: AI aims to develop an intelligent system capable of performing a variety of
   complex jobs.
   ML: Machine learning attempts to construct machines that can only accomplish the
   jobs for which they have been trained.
6. AI: It works as a computer program that does smart work.
   ML: Here, the machine takes data and learns from the data.
7. AI: The goal is to simulate natural intelligence to solve complex problems.
   ML: The goal is to learn from data on a certain task to maximize performance on
   that task.
8. AI: AI has a very broad variety of applications.
   ML: The scope of machine learning is constrained.
9. AI: AI is about decision-making.
   ML: ML allows systems to learn new things from data.
10. AI: It is about developing a system that mimics humans to solve problems.
    ML: It involves creating self-learning algorithms.
11. AI: Three broad categories of AI are Artificial Narrow Intelligence (ANI),
    Artificial General Intelligence (AGI), and Artificial Super Intelligence (ASI).
    ML: Three broad categories of ML are Supervised Learning, Unsupervised Learning,
    and Reinforcement Learning.
12. AI: AI can work with structured, semi-structured, and unstructured data.
    ML: ML can work with only structured and semi-structured data.
13. AI: AI's key uses include Siri, customer service via chatbots, expert systems,
    machine translation like Google Translate, intelligent humanoid robots such as
    Sophia, and so on.
    ML: The most common uses of machine learning are Facebook's automatic friend
    suggestions, Google's search algorithms, banking fraud analysis, stock price
    forecasts, online recommender systems, and so on.
14. AI: AI refers to the broad field of creating machines that can simulate human
    intelligence and perform tasks such as understanding natural language, recognizing
    images and sounds, making decisions, and solving complex problems.
    ML: ML is a subset of AI that involves training algorithms on data to make
    predictions, decisions, and recommendations.
15. AI: AI is a broad concept that includes various methods for creating intelligent
    machines, including rule-based systems, expert systems, and machine learning
    algorithms. AI systems can be programmed to follow specific rules, make logical
    inferences, or learn from data using ML.
    ML: ML focuses on teaching machines how to learn from data without being
    explicitly programmed, using algorithms such as neural networks, decision trees,
    and clustering.
16. AI: AI systems can be built using both structured and unstructured data, including
    text, images, video, and audio. AI algorithms can work with data in a variety of
    formats, and they can analyze and process data to extract meaningful insights.
    ML: In contrast, ML algorithms require large amounts of structured data to learn
    and improve their performance. The quality and quantity of the data used to train
    ML algorithms are critical factors in determining the accuracy and effectiveness
    of the system.
17. AI: AI is a broader concept that encompasses many different applications,
    including robotics, natural language processing, speech recognition, and
    autonomous vehicles. AI systems can be used to solve complex problems in various
    fields, such as healthcare, finance, and transportation.
    ML: ML, on the other hand, is primarily used for pattern recognition, predictive
    modeling, and decision-making in fields such as marketing, fraud detection, and
    credit scoring.
18. AI: AI systems can be designed to work autonomously or with minimal human
    intervention, depending on the complexity of the task. AI systems can make
    decisions and take actions based on the data and rules provided to them.
    ML: In contrast, ML algorithms require human involvement to set up, train, and
    optimize the system, relying on the expertise of data scientists, engineers, and
    other professionals to design and implement it.

Artificial Intelligence vs Machine Learning:

1. AI: Artificial intelligence is a technology which enables a machine to simulate
   human behavior.
   ML: Machine learning is a subset of AI which allows a machine to automatically
   learn from past data without being programmed explicitly.
2. AI: The goal of AI is to make a smart computer system, like humans, to solve
   complex problems.
   ML: The goal of ML is to allow machines to learn from data so that they can give
   accurate output.
3. AI: In AI, we make intelligent systems to perform any task like a human.
   ML: In ML, we teach machines with data to perform a particular task and give an
   accurate result.
4. AI: Machine learning and deep learning are the two main subsets of AI.
   ML: Deep learning is a main subset of machine learning.
5. AI: AI has a very wide range of scope.
   ML: Machine learning has a limited scope.
6. AI: AI is working to create an intelligent system which can perform various
   complex tasks.
   ML: Machine learning is working to create machines that can perform only those
   specific tasks for which they are trained.
7. AI: An AI system is concerned about maximizing the chances of success.
   ML: Machine learning is mainly concerned about accuracy and patterns.
8. AI: The main applications of AI are Siri, customer support using chatbots, expert
   systems, online game playing, intelligent humanoid robots, etc.
   ML: The main applications of machine learning are online recommender systems,
   Google search algorithms, Facebook auto friend tagging suggestions, etc.
9. AI: On the basis of capabilities, AI can be divided into three types: Weak AI,
   General AI, and Strong AI.
   ML: Machine learning can also be divided into mainly three types: Supervised
   learning, Unsupervised learning, and Reinforcement learning.
10. AI: It includes learning, reasoning, and self-correction.
    ML: It includes learning and self-correction when introduced to new data.
11. AI: AI completely deals with structured, semi-structured, and unstructured data.
    ML: Machine learning deals with structured and semi-structured data.
Artificial intelligence (AI) is computer software that mimics human
cognitive abilities in order to perform complex tasks that historically could
only be done by humans, such as decision making, data analysis, and
language translation.

AI is an umbrella term covering a variety of interrelated, but distinct,
subfields. Some of the most common fields you will encounter within the
broader field of artificial intelligence include:

o Machine learning (ML): Machine learning is a subset of AI in which
algorithms are trained on data sets to become machine learning
models capable of performing specific tasks.

o Deep learning: Deep learning is a subset of ML, in which artificial
neural networks (ANNs) that mimic the human brain are used to
perform more complex reasoning tasks without human intervention.

o Natural Language Processing (NLP): A subset of computer science,
AI, linguistics, and ML, natural language processing focuses on creating
software capable of interpreting human communication.

o Robotics: A subset of AI, computer science, and electrical
engineering, robotics is focused on creating robots capable of learning
and performing complex tasks in real-world environments.

Machine learning (ML) is a subfield of artificial intelligence focused on
training machine learning algorithms with data sets to produce machine
learning models capable of performing complex tasks, such as sorting
images, forecasting sales, or analyzing big data.

Today, machine learning is the primary way that most people interact with
AI. Some common ways that you've likely encountered machine learning
before include:

o Receiving video recommendations on an online video streaming
platform.

o Troubleshooting a problem online with a chatbot, which directs you to
appropriate resources based on your responses.

o Using virtual assistants that respond to your requests to schedule
meetings in your calendar, play a specific song, or call someone.

Artificial intelligence:

o AI allows a machine to simulate human intelligence to solve problems

o The goal is to develop an intelligent system that can perform complex
tasks

o We build systems that can solve complex tasks like a human

o AI has a wide scope of applications

o AI uses technologies in a system so that it mimics human decision-
making

o AI works with all types of data: structured, semi-structured, and
unstructured

o AI systems use logic and decision trees to learn, reason, and self-
correct

Machine learning:

o ML allows a machine to learn autonomously from past data

o The goal is to build machines that can learn from data to increase the
accuracy of the output

o We train machines with data to perform specific tasks and deliver
accurate results

o Machine learning has a limited scope of applications

o ML uses self-learning algorithms to produce predictive models

o ML can only use structured and semi-structured data

o ML systems rely on statistical models to learn and can self-correct
when provided with new data

ML vs Deep learning:

1. ML: A subset of AI.
   DL: A subset of machine learning.
2. ML: Can train on smaller data sets.
   DL: Requires large amounts of data.
3. ML: Requires more human intervention to correct and learn.
   DL: Learns on its own from the environment and past mistakes.
4. ML: Shorter training and lower accuracy.
   DL: Longer training and higher accuracy.
5. ML: Makes simple, linear correlations.
   DL: Makes non-linear, complex correlations.
6. ML: Can train on a CPU (central processing unit).
   DL: Needs a specialized GPU (graphics processing unit) to train.
7. ML: Applies statistical algorithms to learn the hidden patterns and relationships
   in the dataset.
   DL: Uses artificial neural network architectures to learn the hidden patterns and
   relationships in the dataset.
8. ML: Can work on a smaller amount of data.
   DL: Requires a larger volume of data compared to machine learning.
9. ML: Better for low-label tasks.
   DL: Better for complex tasks like image processing, natural language processing,
   etc.
10. ML: Takes less time to train the model.
    DL: Takes more time to train the model.
11. ML: A model is created from relevant features which are manually extracted from
    images to detect an object in the image.
    DL: Relevant features are automatically extracted from images; it is an
    end-to-end learning process.
12. ML: Less complex, and it is easy to interpret the results.
    DL: More complex; it works like a black box, and interpretations of the results
    are not easy.
13. ML: Can work on a CPU, or requires less computing power compared to deep
    learning.
    DL: Requires a high-performance computer with a GPU.

Artificial Intelligence vs Machine Learning vs Deep Learning:

1. AI stands for Artificial Intelligence, and is basically the study/process which
   enables machines to mimic human behaviour through particular algorithms.
   ML stands for Machine Learning, and is the study that uses statistical methods
   enabling machines to improve with experience.
   DL stands for Deep Learning, and is the study that makes use of neural networks
   (similar to the neurons present in the human brain) to imitate functionality just
   like a human brain.
2. AI is the broader family consisting of ML and DL as its components.
   ML is the subset of AI.
   DL is the subset of ML.
3. AI is a computer algorithm which exhibits intelligence through decision making.
   ML is an AI algorithm which allows a system to learn from data.
   DL is an ML algorithm that uses deep (more than one layer) neural networks to
   analyze data and provide output accordingly.
4. Search trees and much complex math are involved in AI.
   If you have a clear idea about the logic (math) involved and you can visualize
   complex functionalities like K-Means, Support Vector Machines, etc., then it
   defines the ML aspect.
   If you are clear about the math involved but don't have an idea about the
   features, so you break the complex functionalities into linear/lower-dimension
   features by adding more layers, then it defines the DL aspect.
5. The aim of AI is to basically increase the chances of success and not accuracy.
   The aim of ML is to increase accuracy, not caring much about the success ratio.
   DL attains the highest rank in terms of accuracy when it is trained with a large
   amount of data.
6. Three broad categories/types of AI are: Artificial Narrow Intelligence (ANI),
   Artificial General Intelligence (AGI), and Artificial Super Intelligence (ASI).
   Three broad categories/types of ML are: Supervised Learning, Unsupervised
   Learning, and Reinforcement Learning.
   DL can be considered as neural networks with a large number of parameters and
   layers lying in one of the four fundamental network architectures: Unsupervised
   Pre-trained Networks, Convolutional Neural Networks, Recurrent Neural Networks,
   and Recursive Neural Networks.
7. The efficiency of AI is basically the efficiency provided by ML and DL
   respectively.
   ML is less efficient than DL as it can't work with longer dimensions or higher
   amounts of data.
   DL is more powerful than ML as it can easily work with large sets of data.
8. Examples of AI applications include: Google's AI-powered predictions, ridesharing
   apps like Uber and Lyft, commercial flights using an AI autopilot, etc.
   Examples of ML applications include: virtual personal assistants (Siri, Alexa,
   Google Assistant, etc.), and email spam and malware filtering.
   Examples of DL applications include: sentiment-based news aggregation, image
   analysis and caption generation, etc.
9. AI refers to the broad field of computer science that focuses on creating
   intelligent machines that can perform tasks that would normally require human
   intelligence, such as reasoning, perception, and decision-making.
   ML is a subset of AI that focuses on developing algorithms that can learn from
   data and improve their performance over time without being explicitly programmed.
   DL is a subset of ML that focuses on developing deep neural networks that can
   automatically learn and extract features from data.
10. AI can be further broken down into various subfields such as robotics, natural
    language processing, computer vision, expert systems, and more.
    ML algorithms can be categorized as supervised, unsupervised, or reinforcement
    learning. In supervised learning, the algorithm is trained on labeled data, where
    the desired output is known; in unsupervised learning, the algorithm is trained
    on unlabeled data, where the desired output is unknown.
    DL algorithms are inspired by the structure and function of the human brain, and
    they are particularly well-suited to tasks such as image and speech recognition.
11. AI systems can be rule-based, knowledge-based, or data-driven.
    In reinforcement learning, the ML algorithm learns by trial and error, receiving
    feedback in the form of rewards or punishments.
    DL networks consist of multiple layers of interconnected neurons that process
    data in a hierarchical manner, allowing them to learn increasingly complex
    representations of the data.
AI vs. machine learning vs. deep learning

AI, machine learning, and deep learning are sometimes used interchangeably,
but they are distinct terms.
Artificial Intelligence (AI) is an umbrella term for computer software that
mimics human cognition in order to perform complex tasks and learn from
them.
Machine learning (ML) is a subfield of AI that uses algorithms trained on
data to produce adaptable models that can perform a variety of complex
tasks.
Deep learning is a subset of machine learning that uses several layers
within neural networks to do some of the most complex ML tasks without
any human intervention.
Types of machine learning systems:
1. Supervised/Unsupervised Learning
2. Batch and Online Learning
3. Instance-Based versus Model-Based Learning

Batch learning
1. In batch learning, the system is incapable of learning incrementally: it
must be trained using all the available data.
2. This will generally take a lot of time and computing resources, so it is
typically done offline.
3. First the system is trained, and then it is launched into production and
runs without learning anymore; it just applies what it has learned. This
is called offline learning.
4. If you want a batch learning system to know about new data (such as
a new type of spam), you need to train a new version of the system
from scratch on the full dataset (not just the new data, but also the old
data), then stop the old system and replace it with the new one.
5. Fortunately, the whole process of training, evaluating, and launching a
Machine Learning system can be automated fairly easily, so even a batch
learning system can adapt to change.
6. Simply update the data and train a new version of the system from
scratch as often as needed.
Online learning:
1. In online learning, you train the system incrementally by feeding it
data instances sequentially, either individually or in small groups
called mini-batches.
2. Each learning step is fast and cheap, so the system can learn about
new data on the fly, as it arrives.

3. Online learning is great for systems that receive data as a continuous
flow (e.g., stock prices) and need to adapt to change rapidly or
autonomously.
4. It is also a good option if you have limited computing resources: once an
online learning system has learned about new data instances, it does
not need them anymore, so you can discard them (unless you want to
be able to roll back to a previous state and "replay" the data). This can
save a huge amount of space.
5. Online learning algorithms can also be used to train systems on huge
datasets that cannot fit in one machine's main memory.
6. One important parameter of online learning systems is how fast they
should adapt to changing data: this is called the learning rate.
7. If you set a high learning rate, then your system will rapidly adapt to
new data, but it will also tend to quickly forget the old data.
8. A big challenge with online learning is that if bad data is fed to the
system, the system's performance will gradually decline.
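
A minimal online-learning sketch (a synthetic stream of my own; SGDClassifier and
partial_fit are real scikit-learn APIs): each mini-batch is consumed once, so past
batches can be discarded after use.

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(learning_rate='constant', eta0=0.01)  # fixed learning rate

for step in range(100):                        # data arriving as a stream
    X_batch = rng.normal(size=(32, 2))         # one mini-batch
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=[0, 1])  # one fast, cheap update

X_test = rng.normal(size=(200, 2))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)
print(clf.score(X_test, y_test))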

Instance-based learning:
1. In instance-based learning, the system learns the examples by heart,
then generalizes to new cases by comparing them to the learned
examples (or a subset of them), using a similarity measure.
2. If you were to create a spam filter this way, it would just flag all emails
that are identical to emails that have already been flagged by users;
not the worst solution, but certainly not the best.
3. Instead of just flagging emails that are identical to known spam emails,
your spam filter could be programmed to also flag emails that are very
similar to known spam emails.
4. This requires a measure of similarity between two emails.
5. A (very basic) similarity measure between two emails could be to count
the number of words they have in common.
6. The system would flag an email as spam if it has many words in
common with a known spam email.
7. For example, a new instance would be classified into the class to which
the majority of the most similar learned instances belong.
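
The word-count similarity measure in point 5 can be sketched directly (the sample
emails below are made up for illustration):

def similarity(email_a, email_b):
    # number of distinct words the two emails share
    return len(set(email_a.lower().split()) & set(email_b.lower().split()))

known = [("win a free prize now", "spam"),
         ("meeting agenda for tomorrow", "ham")]

new_email = "claim your free prize"
label = max(known, key=lambda pair: similarity(new_email, pair[0]))[1]
print(label)   # 'spam': the most similar known instance decides the class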

Model-based learning:
1. Another way to generalize from a set of examples is to build a model of
these examples, then use that model to make predictions. This is
called model-based learning.
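
A minimal model-based sketch (toy numbers of my own): instead of memorizing the
examples, fit a linear model to them and use it to predict new cases.

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # toy training examples
y = np.array([2.1, 3.9, 6.2, 8.1])           # roughly y = 2x

model = LinearRegression().fit(X, y)         # build a model of the examples
print(model.predict([[5.0]]))                # generalize to a new case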

Main Challenges of Machine Learning:

1. Inadequate Training Data
The major issue that arises when using machine learning algorithms is a lack
of quality as well as quantity of data. Although data plays a vital role in
machine learning, many data scientists report that inadequate, noisy, and
unclean data severely hamper machine learning algorithms. For example, a
simple task may require thousands of examples, while an advanced task such
as speech or image recognition may need millions. Further, data quality is
also important for the algorithms to work ideally, yet poor data quality is
common in machine learning applications. Data quality can be affected by
factors such as the following:

o Noisy data: It is responsible for inaccurate predictions that affect
decisions as well as accuracy in classification tasks.

o Incorrect data: It is also responsible for faulty programming and
faulty results in machine learning models; hence, incorrect data may
affect the accuracy of the results.

o Generalizing output data: Sometimes generalizing output data
becomes complex, which results in comparatively poor future actions.

2. Poor quality of data

As we have discussed above, data plays a significant role in machine
learning, and it must be of good quality as well. Noisy, incomplete,
inaccurate, and unclean data lead to lower accuracy in classification and
low-quality results. Hence, data quality can be considered a major common
problem in machine learning.

3. Non-representative training data

To make sure our training model generalizes well, we have to ensure that the
sample training data is representative of the new cases we need to
generalize to. The training data must cover all cases that have already
occurred as well as those that are occurring.

Further, if we use non-representative training data in the model, it
results in less accurate predictions. A machine learning model is said to be
ideal if it predicts well for generalized cases and provides accurate
decisions. If there is too little training data, there will be sampling noise
in the model; such a non-representative training set will not be accurate in
its predictions and will be biased toward one class or group.

Hence, we should use representative data in training to protect against bias
and make accurate predictions without any drift.

4. Overfitting and Underfitting

Overfitting:

Overfitting is one of the most common issues faced by machine learning
engineers and data scientists. Whenever a machine learning model is trained
on a huge amount of data, it starts capturing noise and inaccurate patterns
from the training data set, which negatively affects the performance of the
model. Consider a simple example with training data consisting of 1000
mangoes, 1000 apples, 1000 bananas, and 5000 papayas. There is then a
considerable probability of identifying an apple as a papaya, because of the
massive amount of biased data in the training set; hence the predictions are
negatively affected. A common cause of overfitting is the use of highly
flexible non-linear methods, which can build unrealistic data models;
overfitting can be mitigated by using simpler linear and parametric
algorithms.

Methods to reduce overfitting:

o Increase the amount of training data in the dataset.

o Reduce model complexity by selecting a model with fewer parameters.

o Use Ridge regularization and Lasso regularization.

o Use early stopping during the training phase.

o Reduce the noise.

o Reduce the number of attributes in the training data.

o Constrain the model.

Underfitting:

Underfitting is just the opposite of overfitting. Whenever a machine learning
model is trained on too little data, it provides incomplete and inaccurate
output, which destroys the accuracy of the machine learning model.

Underfitting occurs when our model is too simple to capture the underlying
structure of the data, just like an undersized pair of pants. This generally
happens when we have limited data in the data set and we try to build a
linear model with non-linear data. In such scenarios, the model fails to
capture the complexity of the data, its rules become too simple to apply to
the data set, and it starts making wrong predictions.

Methods to reduce underfitting:

o Increase model complexity.

o Remove noise from the data.

o Train on more and better features.

o Reduce the constraints.

o Increase the number of epochs to get better results.

5. Monitoring and maintenance

Since generalized output is mandatory for any machine learning model,
regular monitoring and maintenance are compulsory. Different results for
different actions require data changes; hence, editing code, as well as the
resources for monitoring it, also becomes necessary.

6. Getting bad recommendations

A machine learning model operates in a specific context, and when that
context shifts it can produce bad recommendations; this is known as concept
drift. Consider an example: at one point a customer is looking for some
gadgets, but the customer's requirements change over time, while the machine
learning model keeps showing the same recommendations even though the
customer's expectations have changed. This phenomenon is called data drift.
It generally occurs when new data is introduced or the interpretation of the
data changes. We can overcome it by regularly updating and monitoring the
data according to expectations.


7. Lack of skilled resources

Although machine learning and artificial intelligence are continuously
growing in the market, these industries are still younger than most. The
absence of skilled people is also an issue. Hence, we need people with
in-depth knowledge of mathematics, science, and technology to develop and
manage machine learning systems.

8. Customer Segmentation

Customer segmentation is also an important issue when developing a machine
learning algorithm: identifying which customers acted on the recommendations
shown by the model and which did not even check them. Hence, an algorithm is
necessary to recognize customer behavior and trigger relevant
recommendations for the user based on past experience.

9. Process Complexity of Machine Learning

The machine learning process is very complex, which is another major issue
faced by machine learning engineers and data scientists. Machine learning
and artificial intelligence are new technologies, still in an experimental
phase and changing continuously over time. The majority of work proceeds by
trial and error, so the probability of error is higher than expected.
Further, the process includes analyzing the data, removing data bias,
training on data, applying complex mathematical calculations, and so on,
making the procedure complicated and quite tedious.

10. Data Bias

Data bias is also a big challenge in machine learning. These errors exist
when certain elements of the dataset are weighted more heavily or given more
importance than others. Biased data leads to inaccurate results, skewed
outcomes, and other analytical errors. We can resolve this error by
determining where the data is actually biased in the dataset and then taking
the necessary steps to reduce the bias.

Methods to remove data bias:

o Research more for customer segmentation.

o Be aware of your general use cases and potential outliers.

o Combine inputs from multiple sources to ensure data diversity.

o Include bias testing in the development process.

o Analyze data regularly and keep tracking errors to resolve them easily.

o Review the collected and annotated data.

o Use multi-pass annotation such as sentiment analysis, content
moderation, and intent recognition.

11. Lack of Explainability

This basically means that the outputs cannot be easily comprehended, as the
model is programmed in specific ways to deliver output for certain
conditions. Hence, a lack of explainability is also found in machine
learning algorithms, which reduces the credibility of the algorithms.

12. Slow implementations and results

This issue is also very commonly seen in machine learning models. Machine
learning models can be highly efficient at producing accurate results, but
doing so is time-consuming. Slow programs, excessive requirements, and
overloaded data take more time than expected to provide accurate results.
This requires continuous maintenance and monitoring of the model to deliver
accurate results.

13. Irrelevant features

Although machine learning models are intended to give the best possible
outcome, if we feed garbage data as input, then the result will also be
garbage. Hence, we should use relevant features in our training sample. A
machine learning model is said to be good if its training data has a good
set of features and few to no irrelevant features.
Ensemble Learning and Random Forests

1. Suppose you ask a complex question to thousands of random
people, then aggregate their answers. In many cases you will
find that this aggregated answer is better than an expert's
answer. This is called the wisdom of the crowd.
2. Similarly, if you aggregate the predictions of a group of
predictors (such as classifiers or regressors), you will often get
better predictions than with the best individual predictor. A
group of predictors is called an ensemble; thus, this technique
is called Ensemble Learning, and an Ensemble Learning
algorithm is called an Ensemble method.
3. For example, you can train a group of Decision Tree classifiers,
each on a different random subset of the training set. To make
predictions, you just obtain the predictions of all individual
trees, then predict the class that gets the most votes (see the
last exercise in Chapter 6). Such an ensemble of Decision
Trees is called a Random Forest.
4. And despite its simplicity, this is one of the most powerful
Machine Learning algorithms available today.
5. In ensemble learning, different models, often of the same
type or different types, team up to enhance predictive
performance. It’s all about leveraging the collective wisdom of
the group to overcome individual limitations and make more
informed decisions in various machine learning tasks. Some
popular ensemble models include XGBoost, AdaBoost, LightGBM,
Random Forest, Bagging, and Voting.
6. Random Forest algorithm is a powerful tree learning technique
in Machine Learning.
7. It works by creating a number of Decision Trees during the
training phase.
8. Each tree is constructed using a random subset of the data
set to measure a random subset of features in each partition.
9. This randomness introduces variability among individual
trees, reducing the risk of overfitting and improving overall
prediction performance.
10. For prediction, the algorithm aggregates the results of all
trees, either by voting (for classification tasks) or by
averaging (for regression tasks).
11. Random forests are widely used for classification and
regression and are known for their ability to handle complex
data and reduce overfitting in different environments.

Voting Classifiers:
Suppose you have trained a few classifiers, each one achieving about 80%
accuracy. You may have a Logistic Regression classifier, an SVM classifier, a
Random Forest classifier, a K-Nearest Neighbors classifier, and perhaps a few
more.

A very simple way to create an even better classifier is to aggregate the
predictions of each classifier and predict the class that gets the most votes.
This majority-vote classifier is called a hard voting classifier.

Somewhat surprisingly, this voting classifier often achieves a higher
accuracy than the best classifier in the ensemble. In fact, even if each
classifier is a weak learner (meaning it does only slightly better than random
guessing), the ensemble can still be a strong learner (achieving high
accuracy), provided there are a sufficient number of weak learners and they
are sufficiently diverse.

Ensemble methods work best when the predictors are as independent from
one another as possible. One way to get diverse classifiers is to train them
using very different algorithms. This increases the chance that they will
make very different types of errors, improving the ensemble's accuracy.

The following code creates and trains a voting classifier in Scikit-Learn,
composed of three diverse classifiers (the training set is the moons dataset,
introduced in Chapter 5):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# build the moons dataset and split it (parameter values are illustrative)
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

log_clf = LogisticRegression()
rnd_clf = RandomForestClassifier()
svm_clf = SVC()

voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('rf', rnd_clf), ('svc', svm_clf)],
    voting='hard')
voting_clf.fit(X_train, y_train)

# compare each classifier's test accuracy with the ensemble's
for clf in (log_clf, rnd_clf, svm_clf, voting_clf):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(clf.__class__.__name__, accuracy_score(y_test, y_pred))

Typical output:
LogisticRegression 0.864
RandomForestClassifier 0.896
SVC 0.888
VotingClassifier 0.904
If all classifiers are able to estimate class probabilities (i.e., they have a
predict_proba() method), then you can tell Scikit-Learn to predict the class
with the highest class probability, averaged over all the individual
classifiers. This is called soft voting. It often achieves higher performance
than hard voting because it gives more weight to highly confident votes. All
you need to do is replace voting="hard" with voting="soft" and ensure that
all classifiers can estimate class probabilities.
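
A hedged sketch of that switch, reusing the imports and the train/test split from
the hard-voting example above (note that SVC needs probability=True so it can
supply predict_proba):

voting_clf_soft = VotingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('rf', RandomForestClassifier()),
                ('svc', SVC(probability=True))],  # enables predict_proba
    voting='soft')                                # average class probabilities
voting_clf_soft.fit(X_train, y_train)
print(voting_clf_soft.score(X_test, y_test))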

Bagging and Pasting:

Another approach is to use the same training algorithm for every predictor,
but to train them on different random subsets of the training set.

When sampling is performed with replacement, this method is called
bagging (short for bootstrap aggregating).

When sampling is performed without replacement, it is called pasting.

In other words, both bagging and pasting allow training instances to be
sampled several times across multiple predictors, but only bagging allows
training instances to be sampled several times for the same predictor.

Once all predictors are trained, the ensemble can make a prediction for a
new instance by simply aggregating the predictions of all predictors. The
aggregation function is typically the statistical mode (i.e., the most frequent
prediction, just like a hard voting classifier) for classification, or the average
for regression.

Each individual predictor has a higher bias than if it were trained on the
original training set, but aggregation reduces both bias and variance.

Generally, the net result is that the ensemble has a similar bias but a lower
variance than a single predictor trained on the original training set.

Scikit-Learn offers a simple API for both bagging and pasting with the
BaggingClassifier class (or BaggingRegressor for regression).

The BaggingClassifier automatically performs soft voting instead of hard
voting if the base classifier can estimate class probabilities (i.e., if it has a
predict_proba() method), which is the case with Decision Tree classifiers.
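
A short sketch of that API, reusing the moons train/test split from the voting
example above (500 trees, each trained on 100 instances sampled with replacement;
setting bootstrap=False would give pasting instead):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1)   # bootstrap=True -> bagging
bag_clf.fit(X_train, y_train)
y_pred = bag_clf.predict(X_test)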

Out-of-Bag Evaluation
With bagging, some instances may be sampled several times for any given
predictor, while others may not be sampled at all. By default a
BaggingClassifier samples m training instances with replacement
(bootstrap=True), where m is the size of the training set. This means that
only about 63% of the training instances are sampled on average for each
predictor. The remaining 37% of the training instances that are not sampled
are called out-of-bag (oob) instances. Note that they are not the same 37%
for all predictors.
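
A sketch of out-of-bag evaluation (same setup as the bagging example above): with
oob_score=True the ensemble is scored on the instances each predictor never saw,
giving a free validation estimate without a separate set.

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    bootstrap=True, oob_score=True, n_jobs=-1)
bag_clf.fit(X_train, y_train)
print(bag_clf.oob_score_)      # accuracy estimated on the oob instances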
Random Forests
A Random Forest is an ensemble of Decision Trees, generally trained via the
bagging method (or sometimes pasting), typically with max_samples set to
the size of the training set. Instead of building a BaggingClassifier and
passing it a DecisionTreeClassifier, you can instead use the
RandomForestClassifier class, which is more convenient and optimized for
Decision Trees.

from sklearn.ensemble import RandomForestClassifier

rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1)
rnd_clf.fit(X_train, y_train)
y_pred_rf = rnd_clf.predict(X_test)

The Random Forest algorithm introduces extra randomness when growing
trees; instead of searching for the very best feature when splitting a node
(see Chapter 6), it searches for the best feature among a random subset of
features. This results in greater tree diversity, which (once again) trades a
higher bias for a lower variance, generally yielding an overall better model.

Extra-Trees
When you are growing a tree in a Random Forest, at each node only a
random subset of the features is considered for splitting (as discussed
earlier). It is possible to make trees even more random by also using random
thresholds for each feature rather than searching for the best possible
thresholds (like regular Decision Trees do).

A forest of such extremely random trees is simply called an Extremely
Randomized Trees ensemble (or Extra-Trees for short). Once again, this
trades more bias for a lower variance. It also makes Extra-Trees much faster
to train than regular Random Forests, since finding the best possible
threshold for each feature at every node is one of the most time-consuming
tasks of growing a tree.

You can create an Extra-Trees classifier using Scikit-Learn's
ExtraTreesClassifier class. Its API is identical to the RandomForestClassifier
class. Similarly, the ExtraTreesRegressor class has the same API as the
RandomForestRegressor class.
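
Since the API is identical, the Random Forest snippet above converts to Extra-Trees
by swapping the class (a brief sketch, with the same illustrative hyperparameters):

from sklearn.ensemble import ExtraTreesClassifier

ext_clf = ExtraTreesClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1)
ext_clf.fit(X_train, y_train)       # same usage as RandomForestClassifier
y_pred_ext = ext_clf.predict(X_test)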
Key Features of Random Forest

Some of the key features of Random Forest are discussed below:

1. High Predictive Accuracy: Imagine Random Forest as a team of
decision-making wizards. Each wizard (decision tree) looks at a part of
the problem, and together they weave their insights into a powerful
prediction tapestry. This teamwork often results in a more accurate
model than what a single wizard could achieve.

2. Resistance to Overfitting: Random Forest is like a cool-headed
mentor guiding its apprentices (decision trees). Instead of letting each
apprentice memorize every detail of their training, it encourages a
more well-rounded understanding. This approach helps prevent getting
too caught up in the training data, which makes the model less prone
to overfitting.

3. Large Dataset Handling: Dealing with a mountain of data? Random
Forest tackles it like a seasoned explorer with a team of helpers
(decision trees). Each helper takes on a part of the dataset, ensuring
that the expedition is not only thorough but also surprisingly quick.

4. Variable Importance Assessment: Think of Random Forest as a
detective at a crime scene, figuring out which clues (features) matter
the most. It assesses the importance of each clue in solving the case,
helping you focus on the key elements that drive predictions.

5. Built-in Cross-Validation: Random Forest is like having a personal
coach that keeps you in check. As it trains each decision tree, it also
sets aside a secret group of cases (out-of-bag) for testing. This built-in
validation ensures your model doesn't just ace the training but also
performs well on new challenges.

6. Handling Missing Values: Life is full of uncertainties, just like
datasets with missing values. Random Forest is the friend who adapts
to the situation, making predictions using the information available. It
doesn't get flustered by missing pieces; instead, it focuses on what it
can confidently tell us.

7. Parallelization for Speed: Random Forest is your time-saving buddy.
Picture each decision tree as a worker tackling a piece of a puzzle
simultaneously. This parallel approach taps into the power of modern
tech, making the whole process faster and more efficient for handling
large-scale projects.
Random Forest vs. Other Machine Learning Algorithms

Some of the key differences are discussed below.

1. Ensemble Approach
   Random Forest: Utilizes an ensemble of decision trees, combining their outputs
   for predictions, fostering robustness and resilience against noise.
   Other ML algorithms: Typically rely on a single model (e.g., linear regression,
   support vector machine) without the ensemble approach, potentially leading to
   less accuracy.

2. Overfitting Resistance
   Random Forest: Resistant to overfitting due to the aggregation of diverse
   decision trees, preventing memorization of training data.
   Other ML algorithms: Some algorithms may be prone to overfitting, especially
   when dealing with complex datasets, as they may excessively adapt to training
   noise.

3. Handling of Missing Data
   Random Forest: Exhibits resilience in handling missing values by leveraging
   available features for predictions, contributing to practicality in real-world
   scenarios.
   Other ML algorithms: May require imputation or elimination of missing data,
   potentially impacting model training and performance.

4. Variable Importance
   Random Forest: Provides a built-in mechanism for assessing variable importance,
   aiding in feature selection and interpretation of influential factors.
   Other ML algorithms: Many algorithms lack an explicit feature-importance
   assessment, making it challenging to identify crucial variables for predictions.

5. Parallelization Potential
   Random Forest: Capitalizes on parallelization, enabling the simultaneous
   training of decision trees, resulting in faster computation for large datasets.
   Other ML algorithms: Some algorithms have limited parallelization capabilities,
   potentially leading to longer training times for extensive datasets.

Boosting
Boosting (originally called hypothesis boosting) refers to any ensemble
method that can combine several weak learners into a strong learner.
The general idea of most boosting methods is to train predictors
sequentially, each trying to correct its predecessor. There are many
boosting methods available, but by far the most popular are AdaBoost
(short for Adaptive Boosting) and Gradient Boosting.

AdaBoost
One way for a new predictor to correct its predecessor is to pay a bit
more attention to the training instances that the predecessor
underfitted. This results in new predictors focusing more and more on
the hard cases. This is the technique used by AdaBoost.

For example, to build an AdaBoost classifier, a first base classifier
(such as a Decision Tree) is trained and used to make predictions on
the training set. The relative weight of misclassified training instances
is then increased. A second classifier is trained using the updated
weights, and again it makes predictions on the training set, weights are
updated, and so on.

Let's take a closer look at the AdaBoost algorithm. Each instance weight w(i)
is initially set to 1/m, where m is the number of training instances. A first
predictor is trained and its weighted error rate r1 is computed on the
training set (see Equation 7-1).
Gradient Boosting
Another very popular boosting algorithm is Gradient Boosting. Just like
AdaBoost, Gradient Boosting works by sequentially adding predictors to an
ensemble, each one correcting its predecessor. However, instead of tweaking
the instance weights at every iteration like AdaBoost does, this method tries
to fit the new predictor to the residual errors made by the previous predictor.
Let's go through a simple regression example using Decision Trees as the
base predictors (of course, Gradient Boosting also works great with
regression tasks). This is called Gradient Tree Boosting, or Gradient Boosted
Regression Trees (GBRT).

A simpler way to train GBRT ensembles is to use Scikit-Learn's
GradientBoostingRegressor class. Much like the RandomForestRegressor
class, it has hyperparameters to control the growth of the Decision Trees (e.g.,
max_depth, min_samples_leaf, and so on), as well as hyperparameters to
control the ensemble training, such as the number of trees (n_estimators).
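
A sketch of that class in use (toy quadratic data and hyperparameter values are
illustrative assumptions of mine, not from the notes):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-0.5, 0.5, size=(100, 1))          # toy regression data
y = 3 * X[:, 0] ** 2 + 0.05 * rng.normal(size=100)

gbrt = GradientBoostingRegressor(max_depth=2, n_estimators=3, learning_rate=1.0)
gbrt.fit(X, y)            # trees are added sequentially, each fitting residuals
print(gbrt.predict([[0.1]]))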
