
Linear Regression

The Foundation of Predictive Modeling


Introduction

➢ Linear Regression is a supervised learning algorithm used to predict continuous outcomes by modeling the relationship between dependent and independent variables.

➢ Common applications: predicting house prices, stock trends, sales forecasting, etc.

➢ More broadly, in Linear Regression we are always looking for the best-fit line.


Types of Linear Regression:
➢ Simple Linear Regression

➢ Multiple Linear Regression


Assumptions of Linear Regression:
➢ Linearity: the relationship between the inputs and the output is linear.
➢ Independence: the residuals are independent of one another.
➢ Homoscedasticity: the residuals have constant variance.
➢ Normality: the residuals are approximately normally distributed.
➢ No (or little) multicollinearity among the independent variables.
Additional Considerations:
Journey of Regression from Stats to ML:
➢ Linear Regression models the relationship between the response variable (also known as the dependent, target, or outcome variable, denoted y) and the regression coefficients (denoted βi or wi).

➢ The relationship is assumed to be linear: the output y can be expressed as a linear combination of the input features x and the coefficients, y = w0 + w1·x1 + w2·x2 + … + wn·xn.

➢ Regression Coefficients:
✓ These coefficients (βi, wi) are the weights assigned to each input feature x.
✓ They determine how much each input feature contributes to the output y.
Graphical Representation:
➢ Scatter Plot of Data Points: we'll plot the given data points, where each point represents a pair of weight and height values.
➢ Best-Fit Line: we'll draw the line that best fits the data points according to the linear regression model. This line minimizes the sum of the squared errors between the actual data points and the line.
➢ Residuals (Errors): we'll show the vertical distances (residuals) between the data points and the regression line, highlighting the concept of minimizing these distances.
Model Parameters:
➢ Slope (w1): let it be 0.361 (approximately). This means that for every unit increase in weight, the height is expected to increase by about 0.361 units.
➢ Intercept (w0): let it be 9.11 (approximately). This is the height when the weight is zero. While it might not be meaningful in a practical context, it is essential for defining the regression line.
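As a quick sanity check, these two parameters fully define the prediction rule. A minimal sketch using the slide's approximate values (the input weight of 60 is just illustrative):

# Best-fit line from the slide: height = w0 + w1 * weight
w0, w1 = 9.11, 0.361  # intercept and slope (approximate values from the slide)

def predict_height(weight):
    """Predict height from weight using the fitted line."""
    return w0 + w1 * weight

print(predict_height(60))  # 9.11 + 0.361 * 60 = 30.77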
Geometric Intuition:
➢ The best-fit line attempts to capture the linear relationship between weight and height. The slope indicates the direction and steepness of this relationship.
➢ Minimizing the residuals ensures that the line is as close as possible to all the data points, providing the best possible predictions.
Finding a line to best fit the data points using Ordinary Least Squares (OLS) regression

The concept of residuals and squared errors in the context of linear regression
Key Points:
➢ The best-fit line is determined by minimizing the sum of the squared vertical distances between the actual data points and the predicted values on the line.
➢ The residuals show how far off the predictions are from the actual data points.
➢ Why do we use the square of the error? In linear regression, we often use the square of the errors, rather than the errors themselves, to measure how well the model fits the data. This is called the Sum of Squared Errors (SSE): SSE = Σ(yi − ŷi)².
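A minimal NumPy sketch of residuals and the SSE; the data arrays here are made up for illustration:

import numpy as np

# Illustrative data: actual targets and model predictions
y_actual = np.array([10.0, 12.0, 15.0, 18.0])
y_pred   = np.array([ 9.5, 12.5, 14.0, 18.5])

residuals = y_actual - y_pred   # vertical distances to the line
sse = np.sum(residuals ** 2)    # Sum of Squared Errors
print(residuals, sse)           # [ 0.5 -0.5  1.  -0.5]  1.75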
Why Not Use Absolute Errors?
➢ Squared errors penalize large mistakes more heavily and give a loss that is smooth and differentiable everywhere, which yields a simple closed-form solution and works well with gradient-based optimization.
➢ The absolute error is not differentiable at zero and has no closed-form solution for the best-fit line, making optimization less convenient.
R-squared: Coefficient of Determination
➢ R² = 1 − (SSres / SStot): the proportion of the variance in y that is explained by the model.
➢ Ordinary Least Squares (OLS): finds the best-fit line directly, via a closed-form solution.
➢ Gradient Descent: finds the best-fit line iteratively, by repeatedly reducing the error.
Math to Find the Slope and Intercept
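For simple linear regression, the standard OLS closed-form formulas are w1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and w0 = ȳ − w1·x̄. A minimal sketch, with illustrative weight/height data:

import numpy as np

x = np.array([50.0, 60.0, 70.0, 80.0])   # e.g., weights (illustrative)
y = np.array([27.0, 31.0, 34.5, 38.0])   # e.g., heights (illustrative)

x_mean, y_mean = x.mean(), y.mean()
w1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)  # slope
w0 = y_mean - w1 * x_mean                                             # intercept
print(w0, w1)   # 8.9, 0.365 -- close to the slide's 9.11 and 0.361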
➢ Identification of significant variables can be done during Exploratory Data Analysis (EDA) as well as during model building.
GRADIENT DESCENT APPROACH:
➢ Gradient Descent is a very generic optimization algorithm capable of finding optimal solutions to a wide range of problems. The general idea of Gradient Descent is to tweak parameters iteratively in order to minimize a cost function.
➢ Suppose you are lost in the mountains in a dense fog; you can only feel the slope of the ground below your feet. A good strategy to get to the bottom of the valley quickly is to go downhill in the direction of the steepest slope.
➢ This is exactly what Gradient Descent does: it measures the local gradient of the error function with respect to the parameter vector θ, and it goes in the direction of the descending gradient. Once the gradient is zero, you have reached a minimum!
➢ So you start by filling θ with random values (this is called random initialization), and then you improve it gradually, taking one baby step at a time, each step attempting to decrease the cost function (e.g., the MSE), until the algorithm converges to a minimum.
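A minimal sketch of batch Gradient Descent for a simple linear regression, following the recipe above (random initialization, then repeated steps against the MSE gradient); the data, learning rate, and iteration count are illustrative:

import numpy as np

rng = np.random.default_rng(0)
x = np.array([50.0, 60.0, 70.0, 80.0])
y = np.array([27.0, 31.0, 34.5, 38.0])
x_s = (x - x.mean()) / x.std()          # standardize x for stable convergence

w0, w1 = rng.normal(size=2)             # random initialization of theta
lr = 0.1                                # learning rate hyperparameter

for _ in range(1000):
    error = (w0 + w1 * x_s) - y         # prediction error on the full batch
    grad_w0 = 2 * error.mean()          # dMSE/dw0
    grad_w1 = 2 * (error * x_s).mean()  # dMSE/dw1
    w0 -= lr * grad_w0                  # step in the descending direction
    w1 -= lr * grad_w1

print(w0, w1)                           # parameters in standardized-x units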

What is Gradient Descent?

➢ An important parameter in Gradient Descent is the size of the steps, determined by the learning rate hyperparameter.

➢ If the learning rate is too small, the algorithm will have to go through many iterations to converge, which will take a long time.

➢ On the other hand, if the learning rate is too high, you might jump across the valley and end up on the other side, possibly even higher up than you were before. This might make the algorithm diverge, with larger and larger values, failing to find a good solution.
➢ The two main challenges with Gradient Descent (picture a cost curve with a local minimum on the left and a long plateau on the right): if the random initialization starts the algorithm on the left, it will converge to a local minimum, which is not as good as the global minimum.

➢ If it starts on the right, it will take a very long time to cross the plateau, and if you stop too early you will never reach the global minimum.

➢ Fortunately, the MSE cost function for a Linear Regression model happens to be a convex function, which means that if you pick any two points on the curve, the line segment joining them never crosses the curve.

➢ This implies that there are no local minima, just one global minimum. It is also a continuous function with a slope that never changes abruptly.
Derivative
➢ The derivative of the loss (or cost) with respect to a weight is called the slope.
➢ Its sign tells us in which direction we need to move to reach the point where the loss is minimum.
➢ The derivative of the loss with respect to an n-dimensional vector is called the gradient.
➢ An n-dimensional vector is called a tensor.
➢ In calculus, the derivative of a tensor is itself a tensor.
➢ In machine learning, data with n features is represented as a tensor.
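For a model with several weights, the gradient collects one partial derivative per weight; for MSE it is ∇MSE(w) = (2/n)·Xᵀ(Xw − y). A minimal NumPy sketch with illustrative shapes:

import numpy as np

n, d = 100, 3
rng = np.random.default_rng(1)
X = rng.normal(size=(n, d))            # n samples, d features (a 2-D tensor)
y = X @ np.array([1.0, -2.0, 0.5])     # illustrative targets
w = np.zeros(d)                        # current weight vector

grad = (2.0 / n) * X.T @ (X @ w - y)   # gradient of MSE w.r.t. w
print(grad.shape)                      # (3,) -- one partial derivative per weight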
Gradient Descent: Types
Stochastic Gradient Descent (SGD)
Batch Gradient Descent
Mini-Batch Gradient Descent
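These three variants differ only in how much data feeds each update: Batch GD uses the full dataset per step, SGD uses a single sample, and Mini-Batch GD uses a small random subset. A minimal mini-batch sketch (illustrative data; batch_size=1 would give SGD, batch_size=len(X) would give Batch GD):

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

w = np.zeros(3)
lr, batch_size = 0.05, 16

for epoch in range(50):
    idx = rng.permutation(len(X))             # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = (2.0 / len(batch)) * Xb.T @ (Xb @ w - yb)
        w -= lr * grad                        # one mini-batch update

print(w)   # should approach [1.0, -2.0, 0.5]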
Linear Regression and Optimization

➢ Linear regression aims to minimize the squared loss, which measures the discrepancy between the actual and predicted values.

➢ The squared loss function is fundamental in regression analysis for evaluating the performance of a model.
Overfitting, Underfitting, and Best Fit

➢ Threshold Accuracy: an accuracy threshold of 70-95% (or 0.7-0.95) is desired; this is the target range for acceptable model performance.
Regularization

Types of Regularization:
➢ Ridge Regression (L2): adds a penalty proportional to the sum of the squared coefficients, shrinking them toward zero.
➢ Lasso Regression (L1): adds a penalty proportional to the sum of the absolute coefficients, which can shrink some of them exactly to zero.
➢ Elastic Net: combines the L1 and L2 penalties.
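A minimal scikit-learn sketch of the L2 and L1 penalties; the synthetic data and the alpha values are illustrative, not tuned:

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))
y = X @ np.array([3.0, 0.0, 0.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X, y)  # L1: can set coefficients exactly to zero

print(ridge.coef_)
print(lasso.coef_)                  # note the exact zeros on irrelevant features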
Application and Interpretation:
➢ When evaluating a linear regression model, several error metrics help determine the model's performance. Each serves a slightly different purpose.
➢ The order of accuracy typically depends on the sensitivity of the metric to outliers and the emphasis on specific error magnitudes.
➢ Here is a brief overview of the key error metrics and when to use them:
✓ MAE (Mean Absolute Error): the average absolute difference between predictions and actuals; robust to outliers.
✓ MSE (Mean Squared Error): the average squared difference; penalizes large errors more heavily.
✓ RMSE (Root Mean Squared Error): the square root of the MSE; in the same units as the target.
✓ R² (Coefficient of Determination): the proportion of variance in the target explained by the model.
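A minimal sketch computing these metrics with scikit-learn; y_true and y_pred are illustrative:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.8, 5.4, 7.0, 10.5])

mae  = mean_absolute_error(y_true, y_pred)  # robust to outliers
mse  = mean_squared_error(y_true, y_pred)   # penalizes large errors
rmse = np.sqrt(mse)                         # same units as the target
r2   = r2_score(y_true, y_pred)             # proportion of variance explained
print(mae, mse, rmse, r2)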

Evaluation of a Regression Model:


Variance Inflation Factor (VIF): measures how much the variance of a coefficient is inflated by multicollinearity among the predictors; values above roughly 5-10 signal problematic collinearity.

Durbin-Watson Test: checks for autocorrelation in the residuals; values near 2 indicate little autocorrelation.
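A minimal statsmodels sketch of both diagnostics; the data frame and the OLS fit are illustrative:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "x3"])
y = 2 * X["x1"] - X["x2"] + rng.normal(size=100)

# VIF: one value per predictor (computed with a constant term included);
# values above roughly 5-10 suggest problematic multicollinearity
X_const = sm.add_constant(X)
vifs = {col: variance_inflation_factor(X_const.values, i)
        for i, col in enumerate(X_const.columns) if col != "const"}
print(vifs)

# Durbin-Watson on the residuals of an OLS fit; ~2 means little autocorrelation
model = sm.OLS(y, X_const).fit()
print(durbin_watson(model.resid))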
Train-Test Split
Cross-Validation
Combining Cross-Validation with a Holdout Set
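A minimal sketch combining the two ideas: first set aside a holdout test set, then cross-validate on the training portion only; the synthetic data is illustrative:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=200)

# Holdout set: kept aside until the very end
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# 5-fold cross-validation on the training portion only
cv_scores = cross_val_score(LinearRegression(), X_train, y_train, cv=5, scoring="r2")
print(cv_scores.mean())

# Final, one-time evaluation on the untouched holdout set
final = LinearRegression().fit(X_train, y_train)
print(final.score(X_test, y_test))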
Example:
➢ R-squared (R²) ≈ 0.964
➢ This R² value indicates that approximately 96.43% of the variance in salary can be explained by the linear relationship with experience in this model.
Case Study: USAHousing Price Prediction

➢ Introduction
The real estate market is influenced by various factors, including income levels, house age, number of rooms, number of bedrooms, and population density. Understanding how these factors affect house prices can provide valuable insights for buyers, sellers, and real estate professionals. In this project, we aim to develop a predictive model to estimate house prices based on various features in the USAHousing dataset.

➢ Dataset Description
The USAHousing dataset contains information on various attributes related to houses in different areas. The features included in the dataset are:
▪ Avg. Area Income: the average income of residents in the area.
▪ Avg. Area House Age: the average age of houses in the area.
▪ Avg. Area Number of Rooms: the average number of rooms in houses in the area.
▪ Avg. Area Number of Bedrooms: the average number of bedrooms in houses in the area.
▪ Area Population: the population of the area.
▪ Price: the price of the house.
▪ Address: the address of the house (considered a non-significant variable; it will be excluded from the model).
Objective
➢ The primary objective of this project is to build a robust predictive model that can accurately estimate the price of a house based on the following independent variables:
1. Avg. Area Income
2. Avg. Area House Age
3. Avg. Area Number of Rooms
4. Avg. Area Number of Bedrooms
5. Area Population
Methodology
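A minimal end-to-end sketch of the workflow this case study describes; the file name USA_Housing.csv and the exact column strings are assumptions based on the dataset description above:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Assumed file/column names -- adjust to match the actual dataset
df = pd.read_csv("USA_Housing.csv")
features = ["Avg. Area Income", "Avg. Area House Age",
            "Avg. Area Number of Rooms", "Avg. Area Number of Bedrooms",
            "Area Population"]
X = df[features]          # Address is excluded as a non-significant variable
y = df["Price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

print("RMSE:", np.sqrt(mean_squared_error(y_test, y_pred)))
print("R^2 :", r2_score(y_test, y_pred))
print(dict(zip(features, model.coef_)))   # per-feature contribution to price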
Conclusion

▪ Predicting house prices is a complex task that involves understanding various factors that influence the real estate market.
▪ By leveraging machine learning techniques, we aim to build a reliable model that can provide accurate price estimates and valuable insights into the housing market.
