STA405: Linear Modelling II
Dr. Idah
October 30, 2023
Course Outline
Analysis of general linear model: model building, model selection and
validation, variable selection: stepwise and best subset regression,
Modelling under prior and additional information, ridge regression,
Modelling of non-normal data, Treatment of outliers in regression
models, Robustness, graphical techniques, Generalised linear models,
measurements of association in two-way tables, log-linear and other
models for contingency tables; logit, probit, categorical data, score tests,
case studies. Pre-requisite: STA 302
References
1) Introductory Statistics for Business and Economics by Thomas H. Wonnacott and Ronald J. Wonnacott
2) Linear Models in Statistics by Alvin C. Rencher and Bruce G. Schaalje
3) Probability and Statistics for Engineers and Scientists by Ronald E. Walpole and Raymond H. Myers
4) Model Selection and Inference by Kenneth P. Burnham and David R. Anderson
General Linear Model
The general linear model (general multivariate regression model) is
a compact way of simultaneously writing several multiple linear
regression models
The various multiple regression models may be compactly written as
Y = Xβ + ε
where Y is a matrix of measurements on the dependent variable, X is a matrix of observations on the independent variables, β is a matrix of parameters to be estimated, and ε is a matrix of error terms.
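As a rough illustration, a minimal Python/NumPy sketch of data generated according to Y = Xβ + ε for a single response; the sample size, number of predictors and parameter values are assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed, for reproducibility
n, k = 20, 3                              # assumed sample size and number of predictors

# Design matrix: a column of ones (intercept) followed by k predictor columns
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([2.0, 0.5, -1.0, 3.0])    # assumed parameter values (beta_0, ..., beta_k)
eps = rng.normal(scale=0.5, size=n)       # random error terms
Y = X @ beta + eps                        # the model Y = X beta + epsilon
```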
Analysis of General Linear model
In most research problems where regression analysis is applied, more than one independent variable is needed in the regression model. Therefore, in order to predict an important response, a multiple regression model is needed.
When this model is linear in the coefficients, it is called a multiple
linear regression model.
For the case of k independent variables x1, x2, · · · , xk, the expected response is given by the multiple linear regression model
ŷ = β0 + β1x1 + · · · + βkxk
where each regression coefficient βi is estimated from the sample data using the method of least squares.
Similar least squares techniques can also be applied to estimate the coefficients when the linear model involves, say, powers and products of the independent variables.
For example, when k = 1, the polynomial regression model
ŷ = β0 + β1x + β2x^2 + · · · + βrx^r
is still a linear model, since the parameters enter linearly regardless of how the independent variable enters the model.
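As a concrete illustration, a minimal Python sketch of fitting a polynomial by least squares once the powers of x are placed in the design matrix; the data values and the degree r = 2 are assumed for the example.

```python
import numpy as np

# Assumed illustrative data; any (x_i, y_i) pairs would do
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 0.9, 2.1, 4.8, 9.5])

r = 2                                              # assumed polynomial degree
# Columns 1, x, x^2, ..., x^r: the model stays linear in the betas
X = np.vander(x, N=r + 1, increasing=True)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # least squares estimates
print(beta_hat)                                    # beta_0, beta_1, ..., beta_r
```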
An example of a nonlinear model is the exponential relationship given by
ŷ = αβ^x
Estimating the coefficients
We obtain the least squares estimates of the parameters β0, β1, · · · , βk by fitting the multiple linear regression model
ŷ = β0 + β1x1 + · · · + βkxk
to the data points
{(x1i, x2i, · · · , xki, yi); i = 1, 2, · · · , n and n > k}
where yi is the observed response to the values x1i, x2i, · · · , xki of the k independent variables x1, x2, · · · , xk.
Each of the observations (x1i, x2i, · · · , xki, yi) satisfies the equation
yi = β0 + β1x1i + β2x2i + · · · + βkxki + εi
where εi is the random error associated with the response yi.
In using the concept of least squares to arrive at the
estimates β̂0, β̂1, · · · , β̂k, we minimize the expression
\[
SSE = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \cdots - \beta_k x_{ki} \right)^2
\]
Differentiating SSE in turn with respect to β0, β1, · · · , βk and
equating to zero, we generate k + 1 normal equations
\[
\begin{aligned}
n\beta_0 + \beta_1\sum_{i=1}^{n} x_{1i} + \beta_2\sum_{i=1}^{n} x_{2i} + \cdots + \beta_k\sum_{i=1}^{n} x_{ki} &= \sum_{i=1}^{n} y_i \\
\beta_0\sum_{i=1}^{n} x_{1i} + \beta_1\sum_{i=1}^{n} x_{1i}^2 + \beta_2\sum_{i=1}^{n} x_{1i}x_{2i} + \cdots + \beta_k\sum_{i=1}^{n} x_{1i}x_{ki} &= \sum_{i=1}^{n} x_{1i}y_i \\
\vdots \\
\beta_0\sum_{i=1}^{n} x_{ki} + \beta_1\sum_{i=1}^{n} x_{ki}x_{1i} + \beta_2\sum_{i=1}^{n} x_{ki}x_{2i} + \cdots + \beta_k\sum_{i=1}^{n} x_{ki}^2 &= \sum_{i=1}^{n} x_{ki}y_i
\end{aligned}
\]
These equations can be solved for β0, β1, · · · , βk by any
appropriate method for solving systems of linear equations
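As an illustration, a minimal Python sketch that assembles the k + 1 normal equations from the sums above and solves the resulting linear system; the small data set with k = 2 is assumed for the example.

```python
import numpy as np

# Assumed illustrative data: n = 5 observations on k = 2 independent variables
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y  = np.array([3.1, 3.9, 7.2, 7.8, 10.9])
n  = len(y)

# Coefficient matrix and right-hand side of the k + 1 = 3 normal equations,
# written entry by entry exactly as in the display above
A = np.array([
    [n,        x1.sum(),      x2.sum()],
    [x1.sum(), (x1**2).sum(), (x1*x2).sum()],
    [x2.sum(), (x1*x2).sum(), (x2**2).sum()],
])
b = np.array([y.sum(), (x1*y).sum(), (x2*y).sum()])

beta_hat = np.linalg.solve(A, b)   # any method for linear systems would do
print(beta_hat)                    # estimates of beta_0, beta_1, beta_2
```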
The Linear Regression Model using Matrices
In fitting a multiple linear regression model, particularly when the
number of variables exceeds two, a knowledge of matrix theory
can facilitate the mathematical manipulations considerably.
Suppose that the experimenter has k independent variables x1, x2,
· · · , xk and n observations y1, y2, · · · , yn each of which can be
expressed by the equation
yi = β0 + β1x1i + β2x2i + · · · + βkxki + εi
This model essentially represents n equations describing how
the response values are generated in a scientific process.
Using matrix notation we can write the equations
y = Xβ + ε
where
\[
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad
X = \begin{bmatrix}
1 & x_{11} & x_{21} & \cdots & x_{k1} \\
1 & x_{12} & x_{22} & \cdots & x_{k2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n} & x_{2n} & \cdots & x_{kn}
\end{bmatrix}, \qquad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}
\]
The solution for the regression coefficients is
\[
\hat{\beta} = (X'X)^{-1}X'y
\]
where
\[
X'X = \begin{bmatrix}
n & \sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{2i} & \cdots & \sum_{i=1}^{n} x_{ki} \\
\sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{1i}^2 & \sum_{i=1}^{n} x_{1i}x_{2i} & \cdots & \sum_{i=1}^{n} x_{1i}x_{ki} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum_{i=1}^{n} x_{ki} & \sum_{i=1}^{n} x_{ki}x_{1i} & \sum_{i=1}^{n} x_{ki}x_{2i} & \cdots & \sum_{i=1}^{n} x_{ki}^2
\end{bmatrix}
\]
and
\[
X'y = \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_{1i}y_i \\ \vdots \\ \sum_{i=1}^{n} x_{ki}y_i \end{bmatrix}
\]
as long as matrix X′X is nonsingular
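A minimal Python sketch of the matrix formula, with an assumed toy design matrix; solving X'X β̂ = X'y directly is numerically preferable to forming the inverse, but the explicit inverse is shown to match the formula above.

```python
import numpy as np

# Assumed toy data: n = 4 observations on k = 2 independent variables
X = np.array([[1.0, 2.0, 5.0],
              [1.0, 3.0, 4.0],
              [1.0, 5.0, 9.0],
              [1.0, 7.0, 8.0]])        # first column of ones for the intercept
y = np.array([10.0, 12.0, 19.0, 21.0])

XtX = X.T @ X                          # X'X
Xty = X.T @ y                          # X'y
beta_hat = np.linalg.inv(XtX) @ Xty    # (X'X)^{-1} X'y, valid when X'X is nonsingular
# Equivalent and more stable: np.linalg.solve(XtX, Xty) or np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```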
Exercise
1) Given the data
x:   0    1    2    3    4    5    6    7    8    9
y:  9.1  7.3  3.2  4.6  4.8  2.9  5.7  7.1  8.8  10.2
fit a regression model of the form ŷ = β0 + β1x + β2x^2 and estimate ŷ when x = 2.
Solution
From the data
\[
\sum_{i=1}^{10} x_i = 45, \quad \sum_{i=1}^{10} x_i^2 = 285, \quad \sum_{i=1}^{10} x_i^3 = 2025, \quad \sum_{i=1}^{10} x_i^4 = 15333,
\]
\[
\sum_{i=1}^{10} y_i = 63.7, \quad \sum_{i=1}^{10} x_i y_i = 307.3, \quad \sum_{i=1}^{10} x_i^2 y_i = 2153.3, \quad n = 10
\]
Solving the normal equations
10β0 + 45β1 + 285β2 = 63.7
45β0 + 285β1 + 2025β2 = 307.3
285β0 + 2025β1 + 15333β2 = 2153.3
we obtain
β0 = 8.698, β1 = −2.341, β2 = 0.288
Therefore
ŷ = 8.698 − 2.341x + 0.288x^2
When x = 2,
ŷ = 8.698 − (2.341)(2) + (0.288)(2^2) ≈ 5.2
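The worked answer can be checked numerically; this Python sketch refits the quadratic to the exercise data by least squares and should give coefficients close to those above.

```python
import numpy as np

x = np.arange(10, dtype=float)     # x = 0, 1, ..., 9
y = np.array([9.1, 7.3, 3.2, 4.6, 4.8, 2.9, 5.7, 7.1, 8.8, 10.2])

X = np.column_stack([np.ones_like(x), x, x**2])     # columns 1, x, x^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                                     # approximately [8.698, -2.341, 0.288]

x_new = 2.0
print(beta_hat @ np.array([1.0, x_new, x_new**2]))  # fitted value at x = 2, about 5.2
```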
2) Consider the model Y = β0 + β1x1 + β2x2 + ε. Given that
\[
(X'X)^{-1} = \begin{bmatrix} 1.1799 & -7.3098 & 7.3006 \\ -7.3098 & 7.9799 & -1.2371 \\ 7.3006 & -1.2371 & 4.6576 \end{bmatrix}
\qquad \text{and} \qquad
X'y = \begin{bmatrix} 220 \\ 36768 \\ 9964 \end{bmatrix},
\]
estimate the regression coefficients in the model specified above and present the estimated model.
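The estimate here is simply the product of the two given arrays; a minimal Python sketch of the mechanics, assuming the entries are exactly as printed above.

```python
import numpy as np

XtX_inv = np.array([[ 1.1799, -7.3098,  7.3006],
                    [-7.3098,  7.9799, -1.2371],
                    [ 7.3006, -1.2371,  4.6576]])   # (X'X)^{-1} as given
Xty = np.array([220.0, 36768.0, 9964.0])            # X'y as given

beta_hat = XtX_inv @ Xty     # beta_hat = (X'X)^{-1} X'y
print(beta_hat)              # estimates of beta_0, beta_1, beta_2
```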
3) The personnel department of a certain industrial firm used 12 subjects in a study to determine the relationship between job performance rating (y) and the scores of four tests (x1, x2, x3, x4). The data are as follows:
y x1 x2 x3 x4
11.2 56.5 71.0 38.5 43.0
14.5 59.5 72.5 38.2 44.8
17.2 69.2 76.0 42.5 49.0
17.8 74.5 79.5 43.4 56.3
19.3 81.2 84.0 47.5 60.2
24.5 88.0 86.2 47.4 62.0
21.2 78.2 80.5 44.5 58.1
16.9 69.0 72.0 41.8 48.1
14.8 58.1 68.0 42.1 46.0
20.0 80.5 85.0 48.1 60.3
13.2 58.3 71.0 37.5 47.1
22.5 84.0 87.2 51.0 65.2
Estimate the regression coefficients and present the estimated regression model.
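A minimal Python sketch for this exercise: the tabulated data are stacked into a design matrix with an intercept column and the least squares coefficients are obtained as before.

```python
import numpy as np

# Data from the table: columns y, x1, x2, x3, x4 for the 12 subjects
data = np.array([
    [11.2, 56.5, 71.0, 38.5, 43.0],
    [14.5, 59.5, 72.5, 38.2, 44.8],
    [17.2, 69.2, 76.0, 42.5, 49.0],
    [17.8, 74.5, 79.5, 43.4, 56.3],
    [19.3, 81.2, 84.0, 47.5, 60.2],
    [24.5, 88.0, 86.2, 47.4, 62.0],
    [21.2, 78.2, 80.5, 44.5, 58.1],
    [16.9, 69.0, 72.0, 41.8, 48.1],
    [14.8, 58.1, 68.0, 42.1, 46.0],
    [20.0, 80.5, 85.0, 48.1, 60.3],
    [13.2, 58.3, 71.0, 37.5, 47.1],
    [22.5, 84.0, 87.2, 51.0, 65.2],
])
y = data[:, 0]
X = np.column_stack([np.ones(len(data)), data[:, 1:]])   # intercept column + x1..x4

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # same as (X'X)^{-1} X'y
print(beta_hat)                                    # estimates of beta_0, ..., beta_4
```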