
Machine Learning Algorithms I

WE'LL COVER THE FOLLOWING

• Introduction
• 1. Linear Regression
• 2. Logistic Regression
• 3. Decision Trees
• 4. Naive Bayes
• 5. Support Vector Machine (SVM)

Introduction #
In this lesson, we are going to learn about the most popular machine learning
algorithms. Note that we are not going to do a technical deep dive, as that would
be out of our scope. The goal is to cover each algorithm in enough detail that you
can navigate it when needed. The key is to know about the different
possibilities so that you can go deeper on an as-needed basis.

In this algorithm tour we are going to learn about:

1. Linear Regression
2. Logistic Regression
3. Decision Trees
4. Naive Bayes
5. Support Vector Machines, SVM
6. K-Nearest Neighbors, KNN
7. K-Means
8. Random Forest
9. Dimensionality Reduction
10. Artificial Neural Networks, ANN

1. Linear Regression #
Linear Regression is probably the most popular machine learning algorithm.

Remember in high school when you had to plot data points on a graph with an
X-axis and a Y-axis and then find the line of best fit? That was a very simple
machine learning algorithm, linear regression. In more technical terms, linear
regression attempts to represent the relationship between one or more
independent variables (points on the X-axis) and a numeric outcome or
dependent variable (values on the Y-axis) by fitting the equation of a line to the
data:

Y = aX + b

where a is the slope of the line and b is the intercept.

Example of simple linear regression, which has one independent variable (x-axis) and a dependent
variable (y-axis)

For example, you might want to relate the weights (Y) of individuals to their
heights (X) using linear regression. This algorithm assumes a linear
relationship between the input and output variables: if height increases, we
expect weight to increase roughly proportionally.

The goal here is to derive optimal values for a and b in the equation above, so
that our estimated values, Y, can be as close as possible to their correct values.
Note that we know the actual values for Y during the training phase because
we are trying to learn our equation from the labelled examples given in the
training data set.

Once our machine learning model has learned the line of best fit via linear
regression, this line can then be used to predict values for new or unseen data
points.
Different techniques can be used to learn the linear regression model. The
most popular method is that of least squares:

Ordinary least squares: The method of least squares calculates the best-
fitting line such that the vertical distances from each data point to the line are
minimized. The distances in green (figure below) should be kept to a minimum
so that the data points in red can be as close as possible to the blue line (line of
best fit). If a point lies exactly on the fitted line, its vertical distance from
the line is 0.

To be more specific, in ordinary least squares, the overall distance is the sum
of the squares of the vertical distances (green lines) for all the data points. The
idea is to fit a model by minimizing this squared error or distance.

In linear regression, the observations (red) are assumed to be the result of random deviations
(green) from an underlying relationship (blue) between a dependent variable (y) and an
independent variable (x). While finding the line of best fit, the goal is to minimize the distances
shown in green, keeping the red points as close as possible to the blue line.

Note: While using libraries like Scikit-Learn, you won’t have to
implement any of these functions yourself. Scikit-Learn provides
out-of-the-box model fitting! We will see this in action once we reach our
Projects section.
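
To make this concrete, here is a minimal sketch of fitting a simple linear regression with Scikit-Learn. The height/weight numbers are made up purely for illustration; under the hood, LinearRegression uses least squares to learn the slope and intercept.

```python
# Minimal sketch: simple linear regression with scikit-learn.
# The height/weight values below are invented for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

heights = np.array([[150], [160], [170], [180], [190]])  # X: independent variable (cm)
weights = np.array([55, 62, 68, 77, 85])                 # Y: dependent variable (kg)

model = LinearRegression()   # fits a and b by ordinary least squares
model.fit(heights, weights)

print("slope a:", model.coef_[0])         # learned slope
print("intercept b:", model.intercept_)   # learned intercept
print("prediction for 175 cm:", model.predict([[175]])[0])
```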

When we are dealing with only one independent variable, like in the example
above, we call it Simple Linear Regression. When there is more than one
independent variable (e.g., we want to predict weight using more variables
than just the person’s height), this type of regression is termed Multiple
Linear Regression. As shown in the figure below, while finding the line of best
fit, we can also fit a polynomial, or curved line, instead of a straight line; this is
called Polynomial Regression.

Linear Regression (green) vs Polynomial Regression (red)
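
Here is a minimal sketch of polynomial regression, again with made-up data points and an arbitrarily chosen degree of 2. The curve is still fitted as a linear model, just over polynomial features of x.

```python
# Minimal sketch: polynomial regression as linear regression on expanded features.
# The data points and degree=2 are chosen arbitrarily for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([1.2, 3.9, 9.1, 15.8, 25.3, 36.2])   # roughly quadratic growth

poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X, y)

print(poly_model.predict([[7]]))   # predict for an unseen x value
```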

2. Logistic Regression #
Logistic regression has the same main idea as linear regression. The
difference is that this technique is used when the output or dependent
variable is binary, meaning the outcome can take only two possible values.
For example, let’s say that we want to predict, based on age, whether a person
will have a heart attack. In this case, our prediction is only a “yes” or a “no”:
only two possible values.

In logistic regression, the line of best fit is not a straight line anymore. The
prediction for the final output is transformed using a non-linear S-shaped
function called the logistic function, g(z) = 1 / (1 + e^(-z)). This logistic function
maps the intermediate outcome values into an outcome variable Y with values
ranging from 0 to 1. These 0-to-1 values can then be interpreted as the
probability of occurrence of Y.

In the heart attack example, we have two class labels to be predicted, “yes” as
1 and “no” as 0. Say we set a threshold or cut-off value of 0.5: all instances
with a probability higher than this threshold will be classified as instances
belonging to class 1, while all those with a probability below the threshold
will be assigned to class 0. In other words, the properties of the S-shaped
logistic function make logistic regression suitable for classification tasks.
Graph of a logistic regression curve showing probability of passing an exam versus hours studying
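
As a quick sketch, here is how this could look with Scikit-Learn on an invented age vs. heart-attack dataset. The model outputs probabilities via the logistic function, and we apply the 0.5 threshold from the text to turn them into class labels.

```python
# Minimal sketch: logistic regression on a made-up age vs. heart-attack dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression

ages = np.array([[25], [35], [45], [52], [60], [68], [75]])
had_attack = np.array([0, 0, 0, 1, 0, 1, 1])   # 1 = "yes", 0 = "no"

clf = LogisticRegression()
clf.fit(ages, had_attack)

probs = clf.predict_proba([[50], [70]])[:, 1]  # P(class 1), produced by the logistic function
labels = (probs >= 0.5).astype(int)            # apply the 0.5 cut-off
print(probs, labels)
```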

3. Decision Trees #
Decision Trees also belong to the category of supervised learning algorithms,
but they can be used for solving both regression and classification tasks.

In this algorithm, the training model learns to predict values of the target
variable by learning decision rules with a tree representation. A tree is made
up of nodes, each corresponding to a feature or attribute. At each node we ask a
question about the data based on the available features, e.g., is it raining or
not raining? The left and right branches represent the possible answers. The
final nodes, called leaf nodes, correspond to a class label or predicted value. The
importance of each feature is determined in a top-down approach: the
higher the node, the more important its attribute/feature. This is easier to
understand with a visual representation of an example problem.

Say we want to predict whether or not we should wait for a table at a
restaurant. Below is an example decision tree that decides whether or not to
wait in a given situation based on different attributes:
Image Credits: http://www.cs.bham.ac.uk/~mmk/Teaching/AI/

In this example, our attributes are:

Alternate: alternative restaurant nearby
Bar: bar area to wait
Fri/Sat: true on Fridays and Saturdays
Hungry: whether we are hungry
Patrons: how many people are in the restaurant (none, some, or full)
Raining: raining outside
Wait-Estimate: estimated waiting time (<10, 10-30, 30-60, >60)

Our data instances are then classified into “wait” or “leave” based on the
attributes listed above. From the visual representation of the decision tree, we
can see that “wait-estimate” is more important than “raining” because it is
present at a relatively higher node in the tree.
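
Here is a minimal sketch of training a decision tree on a tiny, made-up version of this problem. The integer encoding of the attributes (hungry, raining, patrons) and the labels are invented for illustration; export_text prints the learned decision rules.

```python
# Minimal sketch: a decision tree on a tiny, invented "wait or leave" dataset.
# Features are encoded as integers: [hungry, raining, patrons],
# with patrons 0 = none, 1 = some, 2 = full. Labels: 1 = wait, 0 = leave.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [
    [1, 0, 1],   # hungry, not raining, some patrons
    [1, 1, 2],   # hungry, raining, full
    [0, 0, 0],   # not hungry, not raining, none
    [1, 0, 2],   # hungry, not raining, full
    [0, 1, 1],   # not hungry, raining, some
    [0, 1, 0],   # not hungry, raining, none
]
y = [1, 0, 0, 1, 1, 0]

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Print the learned decision rules, node by node.
print(export_text(tree, feature_names=["hungry", "raining", "patrons"]))
```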

4. Naive Bayes #
Naive Bayes is a simple yet widely used machine learning algorithm based on
the Bayes Theorem Remember we talked about it in the Statistics section? It
is called naive because the classifier assumes that the input variables are
independent of each other, quite a strong and unrealistic assumption for real
data!. The Bayes theorem is given by the equation below:

P(c|x) = P(x|c) * P(c) / P(x)

where,
P(c|x) = probability of the event of class c, given the predictor variable x,
P(x|c) = probability of x given c,
P(c) = probability of the class,
P(x) = probability of the predictor.

To put it simply, the model is composed of two types of probabilities:

The probability of each class;
The conditional probability for each class given each value of x.

Say we have a training data set with weather conditions, x, and the
corresponding target variable “Played”, c. We can use this to obtain the
probability of “players will play if it is rainy”, P(c|x). Note that even though the
answer is a numerical value ranging from 0 to 1, this is an example of a
classification problem: we can use the probabilities to reach a “yes/no”
outcome.
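
As a quick illustration of how the theorem turns counts into a prediction, here is a tiny worked example with made-up counts (14 days of weather data, invented for this sketch):

```python
# Worked example of Bayes' theorem with made-up counts: out of 14 days,
# players played on 9 days; it rained on 5 days, 3 of which were play days.
p_play = 9 / 14            # P(c): prior probability of "play"
p_rain_given_play = 3 / 9  # P(x|c): probability of rain given that they played
p_rain = 5 / 14            # P(x): overall probability of rain

p_play_given_rain = p_rain_given_play * p_play / p_rain   # P(c|x)
print(round(p_play_given_rain, 2))  # 0.6 -> above 0.5, so predict "yes"
```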

Naive Bayes classifiers are actually a popular statistical technique for spam e-mail
filtering. They work by correlating the use of tokens (typically words) with
spam and non-spam e-mails, and then using Bayes’ theorem to calculate the
probability that an e-mail is spam or not.
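
Here is a minimal sketch of such a token-based filter using Scikit-Learn’s CountVectorizer and MultinomialNB. The four e-mails and their labels are invented purely for illustration.

```python
# Minimal sketch: token-based spam filtering with a multinomial Naive Bayes model.
# The tiny corpus below is made up for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = [
    "win money now claim your free prize",
    "meeting rescheduled to friday afternoon",
    "free offer click now to win",
    "lunch tomorrow with the project team",
]
labels = [1, 0, 1, 0]   # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()        # turns each e-mail into token counts
X = vectorizer.fit_transform(emails)

clf = MultinomialNB()
clf.fit(X, labels)

test = vectorizer.transform(["claim your free money now"])
print(clf.predict_proba(test))        # probabilities for [not spam, spam]
```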

5. Support Vector Machine (SVM) #
Support Vector Machine (SVM) is a supervised algorithm used mainly for
classification problems. In this algorithm, we plot each data item as a point in
n-dimensional space, where n is the number of input features. For example,
with two input variables, we would have a two-dimensional space. Based on
this representation, SVM finds an optimal boundary, called a hyperplane,
that best separates the possible outputs by their class label. In a two-
dimensional space, this hyperplane can be visualized as a line, although not
necessarily a straight line. The task of the SVM algorithm is to find the
coefficients that provide the best separation of classes by this hyperplane.

The distance between the hyperplane and the closest data point from either
class is called the margin. The optimal hyperplane is the one with the largest
margin, i.e., it separates the classes so that the distance to the closest data
point from each class is as large as possible.

In simple words, SVM tries to draw the boundary between the data points that
leaves the largest margin between the two classes. Say we are given a plot of
two classes, black and white dots, as shown in the figure below. The job of the
SVM classifier would then be to find the best line that separates the black dots
from the white dots:

H1 does not separate the two classes. H2 does, but only with a small margin. H3 separates them
with the maximal margin.
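
To make this concrete, here is a minimal sketch of a linear SVM on two made-up clusters of 2-D points, using Scikit-Learn’s SVC. The support vectors it reports are the points closest to the hyperplane, i.e., the ones that define the margin.

```python
# Minimal sketch: a linear SVM separating two invented clusters of 2-D points.
import numpy as np
from sklearn.svm import SVC

# Two clusters of points ("black" = 0, "white" = 1), made up for illustration.
X = np.array([[1, 2], [2, 3], [2, 1], [1, 1],
              [6, 5], [7, 7], [8, 6], [7, 5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)   # linear kernel: the hyperplane is a straight line
clf.fit(X, y)

print("support vectors:", clf.support_vectors_)    # points that define the margin
print("prediction for (4, 4):", clf.predict([[4, 4]]))
```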
