Simple & Multiple Linear Regression (Foundational Math of AI, S24)
Yashil Sukurdeep
July 1, 2024
Contents
1 Linear Regression: Fundamentals
1 Linear Regression: Fundamentals
Linear regression is a statistical technique used to model and analyze the relationships between a dependent variable and one or more independent variables. By fitting a linear equation to the observed data, linear regression helps predict the value of the dependent variable based on the values of the independent variables. This method is widely used in fields such as economics, biology, engineering, and the social sciences due to its simplicity and interpretability. The primary objective of linear regression is to determine the best-fitting line, known as the line of best fit, which minimizes the differences between the observed values and the predicted values. This allows researchers and analysts to understand the strength and direction of the relationship between variables, make predictions, and uncover trends. Linear regression is fundamental to both statistical analysis and machine learning, serving as a foundational tool for more complex modeling techniques.
Figure 1: Scatter plot showing a positive linear trend between two variables.
It seems reasonable to suspect that there is a positive linear relationship between the dependent variable $y$ and the independent variable $x$, which we can model through the following linear model:

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad \text{for } i = 1, \dots, n \qquad (1)$$
where:
• $y_i$ is the dependent variable for the $i$-th observation,
• $x_i$ is the independent variable for the $i$-th observation,
• $\beta_0$ and $\beta_1$ are the model's intercept and slope, respectively,
• $\epsilon_i$ is the error term for the $i$-th observation.
We fit this model by finding the line of best fit

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \qquad (2)$$
whose y-intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$ are calculated such that they minimize the sum of the squared errors between the observed values $y_i$ and the predicted values $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$:

$$\hat{\beta}_0, \hat{\beta}_1 = \operatorname*{argmin}_{\beta_0, \beta_1} \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2 \qquad (3)$$
Figure 2: Line of best fit (in red) for the data from Figure 1.
Using some calculus and algebra, we can find explicit formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (4)$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \qquad (5)$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ denote the sample means of the independent and dependent variables.
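To make these formulas concrete, here is a minimal sketch in Python (using NumPy) that fits a simple linear regression with the closed-form expressions above; the synthetic data, the true coefficients, and the variable names are illustrative assumptions, not part of the notes.

import numpy as np

# Illustrative synthetic data: y has a positive linear trend in x, plus noise
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, size=n)
eps = rng.normal(0, 1.0, size=n)   # error terms epsilon_i
y = 2.0 + 0.5 * x + eps            # assumed true beta_0 = 2.0, beta_1 = 0.5

# Closed-form least-squares estimates, following equations (4) and (5)
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")

On data like this, the estimates should land close to the assumed true coefficients, with the gap shrinking as $n$ grows.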
So far, we have modeled the dependent variable using a single independent variable. In multiple linear regression, we instead model it using $d$ independent variables:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_d x_{id} + \epsilon_i \quad \text{for } i = 1, \dots, n \qquad (6)$$

In matrix form, this model can be written compactly as:

$$\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$$
where:
• $\mathbf{y}$ is an $n \times 1$ vector of the dependent variable,
• $X$ is an $n \times (d+1)$ matrix of the independent variables (including a column of ones for the intercept),
• $\boldsymbol{\beta}$ is a $(d+1) \times 1$ vector of the regression coefficients,
• $\boldsymbol{\epsilon}$ is an $n \times 1$ vector of the error terms.
These vectors and matrices can be expressed as:

$$
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1d} \\
1 & x_{21} & x_{22} & \cdots & x_{2d} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nd}
\end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_d \end{bmatrix}, \quad
\boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
$$
The estimated coefficients $\hat{\boldsymbol{\beta}}$ can be calculated using the Moore-Penrose pseudo-inverse of $X$, which (assuming $X^\top X$ is invertible) yields:

$$\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$$
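As a sketch of how this computation might look in practice (the synthetic data and variable names below are illustrative assumptions), one can build the design matrix and solve for the coefficients with NumPy:

import numpy as np

# Illustrative synthetic data with d = 3 independent variables
rng = np.random.default_rng(1)
n, d = 100, 3
X_raw = rng.normal(size=(n, d))
beta_true = np.array([1.0, 0.5, -2.0, 3.0])  # assumed [beta_0, ..., beta_3]
eps = rng.normal(0, 0.5, size=n)

# Design matrix: prepend a column of ones for the intercept
X = np.column_stack([np.ones(n), X_raw])
y = X @ beta_true + eps

# Estimated coefficients via the Moore-Penrose pseudo-inverse
beta_hat = np.linalg.pinv(X) @ y
# Equivalent, and numerically preferable on ill-conditioned problems:
# beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)  # should be close to beta_true

In practice one rarely forms $(X^\top X)^{-1}$ explicitly; solvers such as np.linalg.lstsq compute the same least-squares solution more stably.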
Of course, one always needs to scrutinize the results obtained after fitting a model to data. In linear regression, it is important to consider:
• Significance: Coefficients are often tested for statistical significance to
determine if they are different from zero. This is typically done using
t-tests and p-values.
• Multicollinearity: In multiple linear regression, it is also important to be aware of multicollinearity, which occurs when independent variables $x_j$ and $x_{j'}$ (with $j \neq j'$) are highly correlated with each other. This can make the estimates of the coefficients less reliable. If two predictors are highly correlated, consider removing one of them from the model; this can simplify the model and reduce multicollinearity (see the sketch after this list).
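As an illustration of the multicollinearity check mentioned above, here is a minimal sketch that flags highly correlated pairs of predictors using the correlation matrix; the data, the 0.9 threshold, and the variable names are all illustrative assumptions.

import numpy as np

# Illustrative predictors: x3 is nearly a copy of x1, creating multicollinearity
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + rng.normal(0, 0.05, size=n)  # almost identical to x1
X_raw = np.column_stack([x1, x2, x3])

# Pairwise correlations between the predictors (columns of X_raw)
corr = np.corrcoef(X_raw, rowvar=False)

# Flag pairs (j, j') whose correlation exceeds an assumed threshold of 0.9
threshold = 0.9
d = X_raw.shape[1]
for j in range(d):
    for jp in range(j + 1, d):
        if abs(corr[j, jp]) > threshold:
            print(f"Predictors {j} and {jp} are highly correlated: "
                  f"r = {corr[j, jp]:.3f}")

More formal diagnostics, such as variance inflation factors, build on the same idea.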
For each $i = 1, \dots, n$, let

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

be the predicted values obtained from the simple linear regression model (1), and let

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \cdots + \hat{\beta}_d x_{id}$$

be the predicted values from the multiple linear regression model (6). When evaluating a linear regression model, several metrics can be used to assess its performance:
• Residual Sum of Squares (RSS): This measures the total squared deviation of the predicted values from the actual values.

$$\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
• Total Sum of Squares (TSS): This measures the total variation of the dependent variable around its mean $\bar{y}$.

$$\mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
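To illustrate these metrics, here is a short sketch (continuing with made-up data; the variable names are my own) that fits a simple linear regression and computes RSS and TSS for it:

import numpy as np

# Illustrative data and a closed-form simple linear regression fit
rng = np.random.default_rng(3)
n = 100
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)

x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar
y_hat = beta0_hat + beta1_hat * x  # predicted values

# Evaluation metrics
rss = np.sum((y - y_hat) ** 2)  # Residual Sum of Squares
tss = np.sum((y - y_bar) ** 2)  # Total Sum of Squares

print(f"RSS = {rss:.3f}, TSS = {tss:.3f}")

A good fit has RSS much smaller than TSS, since the fitted line then explains most of the variation that the mean alone leaves unexplained.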