EC501 Econometric Methods
2. Linear Regression: Statistical Properties
Marcus Chambers
Department of Economics
University of Essex
19 October 2023
Outline
Review
The linear regression model: assumptions
Statistical properties of OLS: small N
Statistical properties of OLS: large N
Reference: Verbeek, chapter 2.
Review
We motivated the ordinary least squares (OLS) estimator by
choosing a linear combination of the regressors that provides a
‘good’ approximation of the dependent variable.
Our measure of ‘good’ was in terms of the sum of squared
residuals, where the residual for observation i is
ei = yi − β̃1 − β̃2 xi2 − . . . − β̃K xiK , i = 1, . . . , N.
The OLS estimator is obtained as b = arg minβ̃ S(β̃) where
S(β̃) = ∑_{i=1}^{N} e_i² = e′e = (y − Xβ̃)′(y − Xβ̃).
The result is
b = (X ′ X)−1 X ′ y.
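As a quick numerical check (not part of the original slides), the formula can be computed directly in R and compared with lm(); the data below are simulated and all names are illustrative.

set.seed(123)
N <- 200
x2 <- rnorm(N); x3 <- runif(N)
y  <- 1 + 0.5 * x2 - 2 * x3 + rnorm(N)        # simulated dependent variable
X  <- cbind(1, x2, x3)                        # N x K regressor matrix (first column = intercept)
b  <- solve(t(X) %*% X, t(X) %*% y)           # b = (X'X)^{-1} X'y
cbind(matrix_formula = b, lm = coef(lm(y ~ x2 + x3)))   # the two sets of estimates coincide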
Properties of b?
But: what are the (statistical) properties of b?
To answer this question we need to move beyond thinking of
OLS in a purely algebraic sense.
Instead of describing the properties of a given sample we shall
think in terms of a statistical model relating y to x2 , . . . , xK .
We try to learn something about this relationship from our
observed sample.
The statistical properties (assumptions) of the model then
determine the statistical properties of b.
The linear regression model
The linear regression model takes the form
yi = β1 + β2 xi2 + . . . + βK xiK + ϵi or yi = xi′ β + ϵi ,
where ϵi is an error term or disturbance.
This is a population relationship between y and x and is
assumed to hold for any possible observation.
Our goal is to estimate the population parameters, β1 , . . . , βK ,
based on our sample, (yi , xi ; i = 1, . . . , N).
We regard yi and ϵi (and usually xi ) as random variables that
are part of a sample derived from the population.
Recall that we can write the model in matrix form as
y = Xβ + ϵ,    (1)
where y is N × 1, X is N × K, β is K × 1 and ϵ is N × 1.
Random sampling
The origins of the linear regression model lie in the sciences
where the xi variables are determined in a laboratory setting.
The xi variables are fixed in repeated samples so that the only
source of randomness is ϵi leading to different values for yi
across samples.
This can be hard to justify in Economics where it is more
common to regard both xi and ϵi as changing across samples.
This leads to different observed values of yi and xi each time a
new sample is drawn.
A random sample implies that each observation, (yi , xi ), is an
independent drawing from the population.
We will use this idea as a basis for a set of statistical
assumptions.
Assumptions
Our assumptions concern the linear model
yi = xi′ β + ϵi , i = 1, . . . , N.
The Gauss-Markov conditions are:
E{ϵi } = 0, i = 1, . . . , N; (A1)
{ϵ1 , . . . , ϵN } and {x1 , . . . , xN } are independent; (A2)
V{ϵi } = σ 2 , i = 1, . . . , N; (A3)
cov{ϵi , ϵj } = 0, i, j = 1, . . . , N, i ̸= j. (A4)
Note that we also need N > K and X ′ X to be invertible – here
we need X to have rank K i.e. the columns of X are linearly
independent (M4).
What do these conditions imply?
Assumptions A1, A3 and A4
Assumption (A1) suggests that the regression line holds on
average (more on this shortly).
Assumption (A3) states that all disturbances have the same
variance; this is known as homoskedasticity and rules out
heteroskedasticity (non-constant variances), which we shall
deal with later.
Assumption (A4) tells us that all pairs, ϵi and ϵj , are
uncorrelated (this is essentially just random sampling), thereby
ruling out autocorrelation.
In terms of the N × 1 vector ϵ, these assumptions imply (see
S12) that
E{ϵ} = 0 (N × 1) and V{ϵ} = σ 2 IN (N × N),
where IN is the N × N identity matrix.
Assumption A2
Under assumption (A2) the matrix X and vector ϵ are
independent.
This means that knowledge of X tells us nothing about the
distribution of ϵ (and vice versa).
It implies that
E{ϵ|X} = E{ϵ} = 0 and V{ϵ|X} = V{ϵ} = σ 2 IN .
Under (A1) and (A2) the linear regression model is a model for
the conditional mean of yi , because
E{yi |xi } = E{xi′ β + ϵi |xi } = xi′ β + E{ϵi |xi } = xi′ β
in view of E{ϵi |xi } = 0.
Assumptions (A1)–(A4) jointly determine the properties of b.
Small N
We shall begin by taking the sample size, N, to be a finite
number (but recall N > K).
First, note that the OLS vector b is a linear function of y:
b = (X ′ X)−1 X ′ y
= (X ′ X)−1 X ′ (Xβ + ϵ) (using y = Xβ + ϵ)
= (X ′ X)−1 X ′ Xβ + (X ′ X)−1 X ′ ϵ
= β + (X ′ X)−1 X ′ ϵ (because (X ′ X)−1 X ′ X = IK ).
It is, therefore, also a linear function of the unobservable
random vector ϵ.
E{b}
The expected value of b is
E{b} = E{β + (X ′ X)−1 X ′ ϵ} = β + E{(X ′ X)−1 X ′ ϵ}.
But
E{(X ′ X)−1 X ′ ϵ} = E{(X ′ X)−1 X ′ }E{ϵ} = 0,
since the expectation factorises by (A2) and E{ϵ} = 0 by (A1).
Hence E{b} = β and the OLS estimator b is said to be an
unbiased estimator of β.
In repeated sampling the OLS estimator will be equal to β ‘on
average.’
Note that unbiasedness does not require (A3) and (A4).
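A small simulation sketch (not from the slides; all values are illustrative) makes the ‘on average’ statement concrete: holding X fixed and redrawing ϵ many times, the average of b across replications is close to β.

set.seed(1)
N <- 100; beta <- c(2, -1)
X <- cbind(1, rnorm(N))                        # regressors held fixed across replications
b_reps <- replicate(5000, {
  eps <- rnorm(N, sd = 2)                      # disturbances satisfying (A1)-(A4)
  y   <- X %*% beta + eps
  drop(solve(t(X) %*% X, t(X) %*% y))          # OLS estimate for this replication
})
rowMeans(b_reps)                               # close to beta = c(2, -1)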
V{b|X}
The conditional covariance matrix of b is:
V{b|X} = E{(b − β)(b − β)′ |X}
= E{(X ′ X)−1 X ′ ϵϵ′ X(X ′ X)−1 |X}
= (X ′ X)−1 X ′ E{ϵϵ′ |X}X(X ′ X)−1
= σ 2 (X ′ X)−1 X ′ X(X ′ X)−1 as E{ϵϵ′ |X} = σ 2 IN
= σ 2 (X ′ X)−1 .
We will denote this as V{b} = σ 2 (X ′ X)−1 for convenience.
The unconditional variance matrix is actually
V{b} = σ²E{(X′X)⁻¹},
which is rather more complicated!
Gauss-Markov Theorem
Clearly OLS is a Linear Unbiased Estimator (LUE).
But how does OLS compare to other LUEs?
Gauss-Markov Theorem
Under Assumptions (A1)–(A4), the OLS estimator b of β is the
Best Linear Unbiased Estimator (BLUE) in the sense that it has
minimum variance within the class of LUEs.
What does this mean?
Take any other LUE, call it b̃; then
V{b̃|X} ≥ V{b|X}
in the sense that the matrix V{b̃|X} − V{b|X} is positive
semi-definite; see (M10).
Normality of ϵ
Sometimes it is appropriate to actually specify the distribution
of the random disturbance vector ϵ.
A common assumption, that incorporates (A1), (A3) and (A4),
is:
ϵ ∼ N(0, σ 2 IN ). (A5)
This is equivalent to
ϵi ∼ NID(0, σ 2 ), (A5′ )
where NID denotes ‘normally and independently distributed.’
This also implies that yi ∼ NID(xi′ β, σ²) (conditional on X), which
is not always appropriate.
Normality of b
Under (A2) and (A5) it follows that
b ∼ N(β, σ 2 (X ′ X)−1 )
because b is linear in ϵ.
Each element of b is also normally distributed:
bk ∼ N(βk , σ 2 ckk ), k = 1, . . . , K,
where ckk denotes the (k, k) (diagonal) element of (X ′ X)−1 .
These results motivate statistical tests based on b but, in
practice, we don’t know σ 2 .
We therefore estimate σ 2 using the data – how do we do this?
Estimation of σ 2 = V{ϵi }
We usually estimate variances by sample averages but ϵi is
unobserved.
Instead we can base an estimator on the residuals:
s² = (1/(N − K)) ∑_{i=1}^{N} e_i².
This estimator is unbiased (i.e. E{s2 } = σ 2 ).
Note the degrees of freedom adjustment – the denominator is
N − K rather than N − 1.
This is because we have estimated K parameters in order to
obtain the residuals (ei = yi − xi′ b).
The estimated variance matrix of b is then
V̂{b} = s2 (X ′ X)−1 .
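As an illustrative sketch (simulated data, names are arbitrary), s² and V̂{b} can be computed by hand and compared with the standard errors reported by lm():

set.seed(42)
N <- 200; K <- 2
X <- cbind(1, rnorm(N))
y <- X %*% c(1, 0.5) + rnorm(N)
b <- solve(t(X) %*% X, t(X) %*% y)
e <- y - X %*% b                      # OLS residuals
s2 <- sum(e^2) / (N - K)              # unbiased estimator of sigma^2 (note the N - K divisor)
Vb <- s2 * solve(t(X) %*% X)          # estimated variance matrix s^2 (X'X)^{-1}
sqrt(diag(Vb))                        # standard errors; agree with summary(lm(y ~ X[, 2]))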
Returning to the R output for a regression of individuals’ wages
on years of education from last week:
> fit1 <- lm(lwage~educ, data=wage1)
> summary(fit1)
Call:
lm(formula = lwage ~ educ, data = wage1)
Residuals:
Min 1Q Median 3Q Max
-2.21158 -0.36393 -0.07263 0.29712 1.52339
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.583773 0.097336 5.998 3.74e-09 ***
educ 0.082744 0.007567 10.935 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4801 on 524 degrees of freedom
Multiple R-squared: 0.1858,  Adjusted R-squared: 0.1843
F-statistic: 119.6 on 1 and 524 DF, p-value: < 2.2e-16
Here, s = 0.4801 (implying s² = 0.2305), while the standard
errors (√(s²ckk)) are 0.0973 and 0.0076 for b1 and b2 , respectively.
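These numbers can be recovered directly from the fitted object; a minimal sketch is below, assuming the wage1 data are available (for example from the wooldridge R package, which is an assumption rather than something stated on the slide).

# library(wooldridge); data(wage1); fit1 <- lm(lwage ~ educ, data = wage1)
s <- sqrt(sum(resid(fit1)^2) / df.residual(fit1))    # residual standard error, approx 0.4801
s^2                                                  # approx 0.2305
X <- model.matrix(fit1)
sqrt(diag(s^2 * solve(t(X) %*% X)))                  # std. errors, approx 0.0973 and 0.0076
# equivalently: sqrt(diag(vcov(fit1)))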
Large N
The Gauss-Markov assumptions ensure that exact finite
sample results hold for b (e.g. unbiasedness, normality).
If we wish to relax some of these assumptions then exact finite
sample results are typically not available.
For example, if (A2) doesn’t hold, then b will be biased.
We therefore use results for large N to find out the asymptotic
properties as N → ∞.
For large enough N we treat the asymptotic results as holding
approximately.
Convergence
Consider a sequence of numbers indexed by N e.g.
{xN = e⁻ᴺ} = 1/e, 1/e², 1/e³, . . . , 1/eᴺ, . . . .
We can define the limit of this sequence as N → ∞:
lim_{N→∞} xN = lim_{N→∞} e⁻ᴺ = 0.
The sequence {xN } is said to converge to zero.
But what happens if the elements of the sequence are random
variables?
Convergence of random variables
The sequence of random variables {xN } is said to converge in
probability to a constant c if
lim_{N→∞} P{|xN − c| > δ} = 0 for all δ > 0;
(see, for example, (2.69) on p.34 of Verbeek).
This is written
xN →p c or plim xN = c.
In words: for any positive number δ, no matter how small, the
probability that xN lies more than δ away from c converges to
zero as N gets larger and larger.
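A simulation sketch (illustrative only) shows the definition at work for the sample mean of standard normal draws, which converges in probability to 0: for a fixed δ = 0.1 the probability of being further than δ from 0 shrinks as N grows.

set.seed(7)
delta <- 0.1
for (N in c(10, 100, 1000, 10000)) {
  xbar <- replicate(2000, mean(rnorm(N)))            # sample means over repeated samples
  cat("N =", N, " P(|xbar - 0| > delta) approx", mean(abs(xbar) > delta), "\n")
}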
Slutsky’s Theorem
If plim b = β then b is a consistent estimator of β.
Consistency can be thought of as a large sample version of
unbiasedness and is a minimum requirement for an estimator.
A useful property of the plim operator is:
Slutsky’s Theorem
If g(·) is a continuous function and plim xN = c, then
plim g(xN ) = g(plim xN ) = g(c);
(see, for example, (2.71) on p.34 of Verbeek).
This is not a property shared by the expectations operator; in
general,
E{g(x)} ≠ g{E(x)}
for a random variable x.
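A one-line illustration (not from the slides): for a standard normal x and g(x) = x², E{g(x)} = 1 while g{E(x)} = 0, which a quick simulation confirms.

set.seed(99)
x <- rnorm(1e6)
mean(x^2)       # approx 1, an estimate of E{g(x)} with g(x) = x^2
mean(x)^2       # approx 0, an estimate of g{E(x)}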
Convergence to a constant
[Figure: sampling densities of an estimator for N = 10, 100 and 1000 (horizontal axis: estimator; vertical axis: density).]
Convergence to a constant c (here, c = 0) is illustrated above
by the variance of the distribution becoming smaller as N
increases.
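The figure can be reproduced along the following lines (a sketch only; the estimator used here is simply a sample mean, which may differ from the one behind the original figure):

set.seed(3)
for (N in c(10, 100, 1000)) {
  draws <- replicate(5000, mean(rnorm(N)))   # a simple consistent estimator of c = 0
  if (N == 10) {
    plot(density(draws), xlim = c(-1, 1), ylim = c(0, 15),
         main = "Convergence to a constant", xlab = "Estimator", ylab = "Density")
  } else {
    lines(density(draws))                    # densities concentrate around 0 as N grows
  }
}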
Large N assumptions
What can we say about b in large samples? Is it consistent?
It is convenient to make the following assumptions:
(1/N) X′X = (1/N) ∑_{i=1}^{N} xi xi′ →p Σxx (finite, nonsingular); (A6)
E{xi ϵi} = 0, i = 1, . . . , N. (A7)
In (A6) the matrix Σxx can be regarded as E(xi xi′ ).
Assumption (A7) states that xi and ϵi are uncorrelated.
What do these conditions imply for b?
Properties of b
We begin by writing
b = β + ((1/N) X′X)⁻¹ (1/N) X′ϵ
  = β + ((1/N) ∑_{i=1}^{N} xi xi′)⁻¹ (1/N) ∑_{i=1}^{N} xi ϵi.
Applying the plim operator and using Slutsky we find
plim(b − β) = (plim (1/N) ∑_{i=1}^{N} xi xi′)⁻¹ · (plim (1/N) ∑_{i=1}^{N} xi ϵi).
The first term converges to Σxx⁻¹ using (A6).
Large sample results
It is reasonable to assume that sample averages converge to
their population values (a law of large numbers) and so
plim (1/N) ∑_{i=1}^{N} xi ϵi = E{xi ϵi}.
But E{xi ϵi } = 0 under (A7) and so
plim(b − β) = Σxx⁻¹ E{xi ϵi} = 0.
Hence b is a consistent estimator of β.
It is also possible to show that, as N → ∞,
√N (b − β) → N(0, σ²Σxx⁻¹),
where → means ‘is asymptotically distributed as’.
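A simulation sketch (illustrative; not from the slides) of this result: even with markedly non-normal errors, the distribution of √N (b2 − β2) across replications looks normal.

set.seed(11)
N <- 500
z <- replicate(5000, {
  x   <- rnorm(N)
  eps <- rchisq(N, df = 1) - 1               # non-normal errors with mean zero
  y   <- 1 + 2 * x + eps
  sqrt(N) * (coef(lm(y ~ x))[2] - 2)         # sqrt(N) (b2 - beta2) for this replication
})
hist(z, breaks = 50, freq = FALSE)           # roughly bell-shaped and centred at zero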
Large sample approximation
For a large but finite sample size we can use this result to
approximate the distribution of b as
b ∼ᵃ N(β, σ²Σxx⁻¹/N),
where ∼ᵃ means ‘is approximately distributed as.’
Our best estimate of Σxx is X ′ X/N and we estimate σ 2 using s2 .
Hence we have the familiar result
b ∼ᵃ N(β, s²(X′X)⁻¹).
But note that this result is only an approximation, since it is
based on weaker assumptions than Gauss-Markov.
Summary
• Gauss-Markov assumptions
• statistical properties of OLS: small N and large N
• Next week:
• goodness-of-fit
• hypothesis testing (t and F tests)