Machine Learning Unit II
Unit-II
Linear regression
• Regression is essentially finding a relationship (or association) between the dependent variable (Y) and the independent variable(s) (X), i.e. finding the function ‘f ’ for the association Y = f(X).
• Linear regression is a statistical model that is used to
predict a continuous dependent variable from one or more
independent variables
• It is called "linear" because the model is based on the idea
that the relationship between the dependent and independent
variables is linear.
• In a linear regression model, the independent variables are
referred to as the predictors and the dependent variable is
the response/target.
Linear regression
• The goal is to find the "best" line that fits the data. The
"best" line is the one that minimizes the sum of the
squared differences between the observed responses in the
dataset and the responses predicted by the line.
• For example, if you were using linear regression to model
the relationship between the temperature outside and the
number of ice cream cones sold at an ice cream shop, you
could use the model to predict how many ice cream cones
you would sell on a hot day given the temperature
outside.
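As a rough sketch of this idea in Python (the temperature and sales figures below are invented purely for illustration), NumPy's polyfit with degree 1 returns the least-squares line, which can then be used for prediction:

import numpy as np

# Invented example data: outside temperature (°C) and cones sold that day.
temperature = np.array([18, 21, 24, 27, 30, 33, 36])
cones_sold = np.array([40, 55, 68, 80, 95, 110, 125])

# Degree-1 polyfit returns the slope and intercept that minimise the
# sum of squared differences between observed and predicted sales.
slope, intercept = np.polyfit(temperature, cones_sold, 1)

# Predict sales on a hot day, e.g. 35 °C.
print(f"cones_sold ≈ {intercept:.1f} + {slope:.2f} * temperature")
print("predicted sales at 35 °C:", round(intercept + slope * 35))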
Linear regression Cont…
• Linear regression is a supervised learning technique.
The most common regression algorithms are:
1. Simple linear regression
2. Polynomial regression
3. Logistic regression
• For example, the price of a property can be modelled as a function of its area:
Price_Property = f(Area_Property)
• Assuming a linear association, we can reformulate the model as
Price_Property = a + b × Area_Property
• where ‘a’ and ‘b’ are the intercept and slope of the straight line, respectively.
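A minimal sketch of fitting this model with scikit-learn (the area and price figures below are made up for illustration; any linear-regression routine would do):

import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: area in square feet, price in lakhs.
area = np.array([[500], [750], [1000], [1250], [1500]])   # X must be 2-D for scikit-learn
price = np.array([30, 42, 55, 68, 80])

model = LinearRegression().fit(area, price)
a, b = model.intercept_, model.coef_[0]                   # intercept and slope

print(f"Price_Property = {a:.2f} + {b:.4f} * Area_Property")
print("Predicted price for 1100 sq. ft:", model.predict([[1100]])[0])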
Slope of the simple linear regression model
• Slope of a straight line represents how much the line in a graph
changes in the vertical direction (Y-axis) over a change in the
horizontal direction (X-axis) as shown in Figure 8.2.
Slope = Change in Y / Change in X
• Rise is the change along the Y-axis (Y₂ − Y₁) and Run is the change along the X-axis (X₂ − X₁). So, slope is represented as:
Slope = Rise / Run = (Y₂ − Y₁) / (X₂ − X₁)
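The same rise-over-run calculation, written as a small Python helper (the two points are arbitrary examples):

def slope(x1, y1, x2, y2):
    # Slope = Rise / Run = (Y2 - Y1) / (X2 - X1)
    return (y2 - y1) / (x2 - x1)

# Example: the line through (2, 3) and (6, 11) rises 8 over a run of 4.
print(slope(2, 3, 6, 11))   # 2.0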
Loss functions
• Suppose the model is trained and produces predicted outputs; the loss is then the difference between the predicted values and the actual data values.
Type of loss in a linear model
• MAE (Mean Absolute Error): the average of the absolute differences between the predicted and actual values.
MAE = (1/n) Σ |yᵢ − ŷᵢ|
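A small NumPy illustration of MAE (the actual and predicted values are made up):

import numpy as np

y_actual = np.array([3.0, 5.0, 7.5, 10.0])
y_predicted = np.array([2.5, 5.5, 7.0, 11.0])

# MAE = (1/n) * sum of |actual - predicted|
mae = np.mean(np.abs(y_actual - y_predicted))
print("MAE:", mae)   # 0.625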
Loss functions
Type of loss in a linear model
• MSE (Mean Squared Error): the average of the squared differences between the predicted and actual values. The formula of the MSE loss is:
MSE = (1/n) Σ (yᵢ − ŷᵢ)²
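The same made-up values, used to illustrate MSE with NumPy:

import numpy as np

y_actual = np.array([3.0, 5.0, 7.5, 10.0])
y_predicted = np.array([2.5, 5.5, 7.0, 11.0])

# MSE = (1/n) * sum of (actual - predicted)^2
mse = np.mean((y_actual - y_predicted) ** 2)
print("MSE:", mse)   # 0.4375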
Loss functions
Type of loss in a linear model
• RMSE (Root Mean Squared Error): the square root of the L2 loss, i.e. the MSE. The formula of RMSE is:
RMSE = √MSE = √((1/n) Σ (yᵢ − ŷᵢ)²)
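RMSE on the same made-up values, computed as the square root of MSE:

import numpy as np

y_actual = np.array([3.0, 5.0, 7.5, 10.0])
y_predicted = np.array([2.5, 5.5, 7.0, 11.0])

# RMSE = sqrt(MSE)
rmse = np.sqrt(np.mean((y_actual - y_predicted) ** 2))
print("RMSE:", rmse)   # ≈ 0.661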
Loss functions
Type of loss in a linear model
• R-squared error: it indicates how well the model-predicted line fits the actual data values. The coefficient ranges from 0 to 1, and a value close to 1 indicates a well-fitted line. The formula is:
R² = 1 − (Σ (yᵢ − ŷᵢ)² / Σ (yᵢ − ȳ)²), where ȳ is the mean of the actual values.
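R-squared on the same made-up values; the result can be cross-checked with sklearn.metrics.r2_score:

import numpy as np

y_actual = np.array([3.0, 5.0, 7.5, 10.0])
y_predicted = np.array([2.5, 5.5, 7.0, 11.0])

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum((y_actual - y_predicted) ** 2)          # residual sum of squares
ss_tot = np.sum((y_actual - np.mean(y_actual)) ** 2)    # total sum of squares
r_squared = 1 - ss_res / ss_tot
print("R-squared:", r_squared)   # ≈ 0.937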
Slope Equation
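For the simple linear model Y = a + b·X, the least-squares slope and intercept have the standard closed form b = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)² and a = Ȳ − b·X̄. A minimal NumPy sketch on made-up data:

import numpy as np

# Made-up sample data.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 4.3, 6.2, 8.4, 10.1])

x_mean, y_mean = X.mean(), Y.mean()

# b = sum((Xi - X_mean) * (Yi - Y_mean)) / sum((Xi - X_mean)^2)
b = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)
# a = Y_mean - b * X_mean
a = y_mean - b * x_mean

print(f"slope b = {b:.3f}, intercept a = {a:.3f}")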
Two common approaches for improving the linear regression model are:
1. Shrinkage approach
2. Subset selection
• In the shrinkage approach, the estimated regression coefficients are shrunk so that each coefficient tends towards zero. This leads to low variance in the model (at the cost of a small increase in bias).
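As one concrete example of shrinkage, ridge regression adds an L2 penalty that pulls the coefficient estimates towards zero. The sketch below uses scikit-learn's Ridge on made-up data with two highly correlated features (the alpha value is an arbitrary choice):

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Made-up data: the second feature nearly duplicates the first.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=50)
y = 3 * X[:, 0] + rng.normal(size=50)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)     # larger alpha => stronger shrinkage

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)   # shrunk towards zero, lower variance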