Simple & Multiple Linear Regression (Foundational Math of AI, S24)
Yashil Sukurdeep
July 1, 2024
Contents
1 Linear Regression: Fundamentals
1 Linear Regression: Fundamentals
Linear regression is a statistical technique used to model and analyze the relationships between a dependent variable and one or more independent variables. By fitting a linear equation to the observed data, linear regression helps predict the value of the dependent variable based on the values of the independent variables. This method is widely used in fields such as economics, biology, engineering, and the social sciences due to its simplicity and interpretability. The primary objective of linear regression is to determine the best-fitting line, known as the line of best fit, which minimizes the differences between the observed values and the predicted values. This allows researchers and analysts to understand the strength and direction of the relationship between variables, make predictions, and uncover trends. Linear regression is fundamental to both statistical analysis and machine learning, serving as a foundational tool for more complex modeling techniques.
Figure 1: Scatter plot showing a positive linear trend between two variables.
It seems reasonable to suspect that there is a positive linear relationship between the dependent variable $y$ and the independent variable $x$, which we can model through the following linear model:

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad \text{for } i = 1, \dots, n \qquad (1)$$
where:
• $y_i$ is the dependent variable for the $i$-th observation,
• $x_i$ is the independent variable for the $i$-th observation,
• $\beta_0$ and $\beta_1$ are the model's intercept and slope, respectively,
• $\epsilon_i$ is the error term for the $i$-th observation.
We fit this model by finding the line of best fit

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \qquad (2)$$
whose y-intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$ are calculated such that they minimize the sum of the squared errors between the observed values $y_i$ and the predicted values $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$:

$$\hat{\beta}_0, \hat{\beta}_1 = \operatorname*{argmin}_{\beta_0, \beta_1} \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2 \qquad (3)$$
Figure 2: Line of best fit (in red) for the data from Figure 1.
Using some calculus and algebra, we can find explicit formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (4)$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \qquad (5)$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ denote the sample means of the independent and dependent variables.
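To make these formulas concrete, here is a minimal sketch in Python (using NumPy) that fits a simple linear regression with the closed-form expressions above; the synthetic data, the true coefficients, and the variable names are illustrative assumptions, not part of the notes.

import numpy as np

# Illustrative synthetic data: y has a positive linear trend in x, plus noise
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, size=n)
eps = rng.normal(0, 1.0, size=n)   # error terms epsilon_i
y = 2.0 + 0.5 * x + eps            # assumed true beta_0 = 2.0, beta_1 = 0.5

# Closed-form least-squares estimates, following equations (4) and (5)
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")

On data like this, the estimates should land close to the assumed true coefficients, with the gap shrinking as $n$ grows.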
So far, we have modeled the dependent variable using a single independent variable. In multiple linear regression, we instead model it using $d$ independent variables:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_d x_{id} + \epsilon_i \quad \text{for } i = 1, \dots, n \qquad (6)$$

In matrix form, this model can be written compactly as:

$$\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$$
where:
• $\mathbf{y}$ is an $n \times 1$ vector of the dependent variable,
• $X$ is an $n \times (d+1)$ matrix of the independent variables (including a column of ones for the intercept),
• $\boldsymbol{\beta}$ is a $(d+1) \times 1$ vector of the regression coefficients,
• $\boldsymbol{\epsilon}$ is an $n \times 1$ vector of the error terms.
These vectors and matrices can be expressed as:

$$
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1d} \\
1 & x_{21} & x_{22} & \cdots & x_{2d} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nd}
\end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_d \end{bmatrix}, \quad
\boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
$$
The estimated coefficients $\hat{\boldsymbol{\beta}}$ can be calculated using the Moore-Penrose pseudo-inverse of $X$, which (assuming $X^\top X$ is invertible) yields:

$$\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$$
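As a sketch of how this computation might look in practice (the synthetic data and variable names below are illustrative assumptions), one can build the design matrix and solve for the coefficients with NumPy:

import numpy as np

# Illustrative synthetic data with d = 3 independent variables
rng = np.random.default_rng(1)
n, d = 100, 3
X_raw = rng.normal(size=(n, d))
beta_true = np.array([1.0, 0.5, -2.0, 3.0])  # assumed [beta_0, ..., beta_3]
eps = rng.normal(0, 0.5, size=n)

# Design matrix: prepend a column of ones for the intercept
X = np.column_stack([np.ones(n), X_raw])
y = X @ beta_true + eps

# Estimated coefficients via the Moore-Penrose pseudo-inverse
beta_hat = np.linalg.pinv(X) @ y
# Equivalent, and numerically preferable on ill-conditioned problems:
# beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)  # should be close to beta_true

In practice one rarely forms $(X^\top X)^{-1}$ explicitly; solvers such as np.linalg.lstsq compute the same least-squares solution more stably.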
Of course, one always needs to scrutinize the results obtained after fitting a model to data. In linear regression, it is important to consider:
• Significance: Coefficients are often tested for statistical significance to
determine if they are different from zero. This is typically done using
t-tests and p-values.
• Multicollinearity: In multiple linear regression, it is also important to be aware of multicollinearity, which occurs when independent variables $x_j$ and $x_{j'}$ (with $j \neq j'$) are highly correlated with each other. This can make the estimates of the coefficients less reliable. If two predictors are highly correlated, consider removing one of them from the model; this can simplify the model and reduce multicollinearity (see the sketch after this list).
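As an illustration of the multicollinearity check mentioned above, here is a minimal sketch that flags highly correlated pairs of predictors using the correlation matrix; the data, the 0.9 threshold, and the variable names are all illustrative assumptions.

import numpy as np

# Illustrative predictors: x3 is nearly a copy of x1, creating multicollinearity
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + rng.normal(0, 0.05, size=n)  # almost identical to x1
X_raw = np.column_stack([x1, x2, x3])

# Pairwise correlations between the predictors (columns of X_raw)
corr = np.corrcoef(X_raw, rowvar=False)

# Flag pairs (j, j') whose correlation exceeds an assumed threshold of 0.9
threshold = 0.9
d = X_raw.shape[1]
for j in range(d):
    for jp in range(j + 1, d):
        if abs(corr[j, jp]) > threshold:
            print(f"Predictors {j} and {jp} are highly correlated: "
                  f"r = {corr[j, jp]:.3f}")

More formal diagnostics, such as variance inflation factors, build on the same idea.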
For each $i = 1, \dots, n$, let

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

be the predicted values obtained from the simple linear regression model (1), and let

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \cdots + \hat{\beta}_d x_{id}$$

be the predicted values from the multiple linear regression model (6). When evaluating a linear regression model, several metrics can be used to assess its performance:
• Residual Sum of Squares (RSS): This measures the total squared deviation of the predicted values from the actual values.

$$\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
• Total Sum of Squares (TSS): This measures the total variation of the dependent variable around its mean $\bar{y}$.

$$\mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
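To illustrate these metrics, here is a short sketch (continuing with made-up data; the variable names are my own) that fits a simple linear regression and computes RSS and TSS for it:

import numpy as np

# Illustrative data and a closed-form simple linear regression fit
rng = np.random.default_rng(3)
n = 100
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)

x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar
y_hat = beta0_hat + beta1_hat * x  # predicted values

# Evaluation metrics
rss = np.sum((y - y_hat) ** 2)  # Residual Sum of Squares
tss = np.sum((y - y_bar) ** 2)  # Total Sum of Squares

print(f"RSS = {rss:.3f}, TSS = {tss:.3f}")

A good fit has RSS much smaller than TSS, since the fitted line then explains most of the variation that the mean alone leaves unexplained.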