
Econ 3044: Introduction to Econometrics

Chapter 2: Simple Linear Regression

Lemi D.

Addis Ababa University



April 25, 2021



Overview

1 Definition of the Simple Regression Model

2 The Concept of PRF and SRF

3 The Problem of Estimation

4 Residuals and Goodness of Fit

5 The Gauss-Markov Assumptions for Simple Regression

6 Expected Values and Variances of the OLS Estimators

7 The Gauss-Markov Theorem

8 Nonlinearities in Simple Regression

Definition of the Simple Regression Model

Much of applied econometric analysis begins with the following premise: y and x are two variables, representing some population, and we are interested in “explaining y in terms of x”.
In writing down a model that will “explain y in terms of x,” we must confront three issues:
- First, since there is never an exact relationship between two variables, how do we allow for other factors to affect y?
- Second, what is the functional relationship between y and x?
- And third, how can we be sure we are capturing a ceteris paribus relationship between y and x?


We can resolve these ambiguities by writing down an equation relating y to x. A simple equation is

y = β0 + β1 x + u (1)

Equation 1, which is assumed to hold in the population of interest, defines the simple linear regression model.
When related by Equation 1, the variables y and x have several different names used interchangeably, as shown in Table 1.
The terms “dependent variable” and “independent variable” are frequently used in econometrics.

Table 1: Terminology for simple regression

y: dependent variable, explained variable, response variable, predicted variable, regressand
x: independent variable, explanatory variable, control variable, predictor variable, regressor



The variable u, called the error term or disturbance in the relationship, represents factors other than x that affect y.
A simple regression analysis effectively treats all factors affecting y other than x as being unobserved.
Other justifications for the inclusion of the error term in the model include:
- Measurement error
- Wrong mathematical specification of the model
- Errors in aggregation
- The randomness of human behavior


Equation 1 also addresses the issue of the functional relationship between y and x.
If the other factors in u are held fixed, so that the change in u is zero, ∆u = 0, then x has a linear effect on y:

∆y = β1 ∆x if ∆u = 0

Thus, the change in y is simply β1 multiplied by the change in x.
This means that β1 is the slope parameter in the relationship between y and x, holding the other factors in u fixed.
The intercept parameter β0, sometimes called the constant term, also has its uses, although it is rarely central to an analysis.


Example (Soybean Yield and Fertilizer)

Suppose that soybean yield is determined by the model

yield = β0 + β1 fertilizer + u,

so that y = yield and x = fertilizer. The agricultural researcher is interested in the effect of fertilizer on yield, holding other factors fixed. This effect is given by β1. The error term u contains factors such as land quality, rainfall, and so on.


Example (A Simple Wage Equation)

A model relating a person’s wage to observed education and other unobserved factors is

wage = β0 + β1 educ + u.

If wage is measured in dollars per hour and educ is years of education, then β1 measures the change in hourly wage given another year of education, holding all other factors fixed. Some of those factors include labor force experience, innate ability, tenure with current employer, work ethic, and numerous other things.

The Concept of PRF and SRF

Consider the data in Table 2, which refer to a total population of 60 families in a hypothetical community, along with their weekly income (X) and weekly consumption expenditure (Y), both in dollars.
The 60 families are divided into 10 income groups (from $80 to $260), and the weekly expenditures of each family in the various groups are as shown in the table.
Despite the variability of weekly consumption expenditure within each income bracket, on average, weekly consumption expenditure increases as income increases.

Table 2: Weekly family income X and weekly consumption expenditure Y (both in dollars) for the 60 families, grouped by the 10 income levels.



In all we have 10 mean values for the 10 subpopulations of Y. We call these mean values conditional expected values, as they depend on the given values of the (conditioning) variable X.
Symbolically, we denote them as E(Y|X), which is read as the expected value of Y given the value of X.
The dark circled points in Figure 1 show the conditional mean values of Y against the various X values.
If we join these conditional mean values, we obtain what is known as the population regression line (PRL), or more generally, the population regression curve.

Figure 1: Conditional distribution of weekly consumption expenditure for various levels of weekly income; the dark circled points are the conditional means E(Y|X), and the line joining them is the population regression line.



Geometrically, then, a population regression curve is simply the locus of the conditional means of the dependent variable for the fixed values of the explanatory variable(s).
More simply, it is the curve connecting the means of the subpopulations of Y corresponding to the given values of the regressor X. It can be depicted as in Figure 2.
This figure shows that for each X (i.e., income level) there is a population of Y values (weekly consumption expenditures) that are spread around the (conditional) mean of those Y values.
For simplicity, we are assuming that these Y values are distributed symmetrically around their respective (conditional) mean values. And the regression line (or curve) passes through these (conditional) mean values.

Figure 2: For each income level X, the population of Y values is spread symmetrically around its conditional mean, and the population regression line passes through these conditional means.



From the preceding discussion and Figures 1 and 2, it is clear that each conditional mean E(Y|Xi) is a function of Xi, where Xi is a given value of X. Symbolically,

E(Y|Xi) = f(Xi) (2)

where f(Xi) denotes some function of the explanatory variable X.
Equation 2 is known as the population regression function (PRF).
In our example, E(Y|Xi) is a linear function of Xi:

E(Y|Xi) = β0 + β1 Xi
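To make the conditional-mean idea concrete, here is a small Python sketch (not part of the original slides): it simulates a subpopulation of Y at each income level and checks that the subpopulation means line up on a straight line. The PRF coefficients 17 and 0.6 and the noise spread are illustrative assumptions, not the Table 2 data.

```python
import numpy as np

# Simulate subpopulations of consumption Y at each income level X and
# verify that their means track the assumed PRF E(Y|X) = 17 + 0.6 X.
rng = np.random.default_rng(0)
income_levels = np.arange(80, 280, 20)  # X = 80, 100, ..., 260

for x in income_levels:
    # Y values for the subpopulation at this income level
    y = 17 + 0.6 * x + rng.normal(0, 6, size=1000)
    # the subpopulation mean is close to 17 + 0.6 X
    print(x, round(y.mean(), 1))
```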

Stochastic Specification of PRF

We see from Figure 1 that, given the income level Xi, an individual family’s consumption expenditure is clustered around the average consumption of all families at that Xi, that is, around its conditional expectation.
Therefore, we can express the deviation of an individual Yi around its expected value as follows:

ui = Yi − E(Y|Xi)

or

Yi = E(Y|Xi) + ui (3)

where the deviation ui is an unobservable random variable taking positive or negative values.


Technically, ui is known as the stochastic disturbance or stochastic error term.
If E(Y|Xi) is assumed to be linear in Xi, Equation 3 may be written as

Yi = β0 + β1 Xi + ui (4)

Now if we take the expected value of Equation 4 on both sides, we obtain

E(Yi|Xi) = β0 + β1 E(Xi|Xi) + E(ui|Xi)
         = β0 + β1 Xi + E(ui|Xi)
         = E(Y|Xi) + E(ui|Xi) (5)


Since E(Yi|Xi) is the same thing as E(Y|Xi), Equation 5 implies that

E(ui|Xi) = 0

Thus, the assumption that the regression line passes through the conditional means of Y implies that the conditional mean values of ui (conditional upon the given X’s) are zero.


However, for most practical situations what we have is a sample of Y values corresponding to some fixed X’s.
Therefore, our task now is to estimate the PRF on the basis of the sample information.
Suppose that the population of Table 2 was not known to us and the only information we had was a randomly selected sample of Y values for the fixed X’s, as in the figure below.
The question is: Can we estimate the PRF from the sample data?
We may not be able to estimate the PRF “accurately” because of sampling fluctuations.
In general, we would get N different estimates for N different samples. We represent these estimates by the sample regression function (SRF).


Figure: A random sample from the population of Table 2


The sample regression function may be written as

Ŷi = β̂0 + β̂1 Xi

where
Ŷi = estimator of E(Y|Xi)
β̂0 = estimator of β0
β̂1 = estimator of β1

Note that an estimator is simply a rule, formula, or method that tells how to estimate the population parameter from the information provided by the sample at hand.


To sum up, our primary objective in regression analysis is to estimate the PRF

Yi = β0 + β1 Xi + ui

on the basis of the SRF

Yi = β̂0 + β̂1 Xi + ûi

But because of sampling fluctuations our estimate of the PRF based on the SRF is at best an approximate one.
This approximation is shown diagrammatically in Figure 3.


Figure 3: Sample and population regression lines.


The Problem of Estimation

Let {(xi, yi) : i = 1, . . . , n} denote a random sample of size n from the population.
The linear regression model can be written as

yi = β0 + β1 xi + ui (6)

for each i.
One of the objectives of the whole exercise in regression is to obtain appropriate estimates for the parameters β0 and β1.


There are a number of methods for estimating the parameters of this model. The most popular ones are:
- the method of moments
- the method of ordinary least squares (OLS), and
- the method of maximum likelihood
Though they give different results in more general models, in the case of the simple linear regression all three give identical results.

The Method of Ordinary Least Squares

By and large, it is the method of OLS that is used extensively in regression analysis.
We estimate the PRF in Equation 6 from the SRF:

yi = β̂0 + β̂1 xi + ûi

First, we express the SRF as

ûi = yi − β̂0 − β̂1 xi


We adopt the least squares criterion, which states that the SRF can be fixed in such a way that

Σ ûi² = Σ (yi − β̂0 − β̂1 xi)²

is as small as possible, where the sums run over i = 1, . . . , n.
The sum of the squared residuals is a function of the estimators β̂0 and β̂1.

Figure: Least-squares criterion.



The process of differentiation yields the following equations for estimating β0 and β1:

Σ yi = nβ̂0 + β̂1 Σ xi

Σ xi yi = β̂0 Σ xi + β̂1 Σ xi²

where n is the sample size. These simultaneous equations are known as the normal equations.


Solving the normal equations simultaneously, we obtain

β̂1 = (n Σ xi yi − Σ xi Σ yi) / (n Σ xi² − (Σ xi)²)
   = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
   = Σ xi∗ yi∗ / Σ xi∗²

where x̄ and ȳ are the sample means of x and y and where we define xi∗ = (xi − x̄) and yi∗ = (yi − ȳ).


and the intercept term becomes

β̂0 = (Σ xi² Σ yi − Σ xi Σ xi yi) / (n Σ xi² − (Σ xi)²)
   = ȳ − β̂1 x̄.
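As a concrete illustration (not part of the original slides), here is a minimal Python sketch of these closed-form OLS formulas on simulated data; the true parameter values, error spread, and sample size are assumptions chosen for the example.

```python
import numpy as np

# Minimal sketch of the closed-form OLS formulas on simulated data.
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, size=n)
u = rng.normal(0, 2, size=n)
y = 1.5 + 0.8 * x + u                          # assumed beta0 = 1.5, beta1 = 0.8

x_dev = x - x.mean()                           # xi* = xi - xbar
beta1_hat = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev**2)
beta0_hat = y.mean() - beta1_hat * x.mean()    # ybar - beta1_hat * xbar
print(beta0_hat, beta1_hat)                    # close to 1.5 and 0.8
```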

Residuals and Goodness of Fit

Given β̂0 and β̂1, we can obtain the fitted value ŷi = β̂0 + β̂1 xi for each observation.
The OLS residual associated with observation i, ûi, is the difference between yi and its fitted value.
For each i we can write

yi = ŷi + ûi

Thus, we can view OLS as decomposing each yi into two parts, a fitted value and a residual.


Define the total sum of squares (SST), the explained sum of squares (SSE), and the residual sum of squares (SSR) as follows:

SST = Σ (yi − ȳ)²
SSE = Σ (ŷi − ȳ)²
SSR = Σ ûi²

with each sum running over i = 1, . . . , n.


The total variation in y, SST, can always be expressed as the sum of the explained variation SSE and the unexplained variation SSR.
Thus,

SST = SSE + SSR (7)

Assuming that the total sum of squares, SST, is not equal to zero, we can divide Equation 7 by SST to get 1 = SSE/SST + SSR/SST.
The R-squared of the regression, sometimes called the coefficient of determination, is defined as

R² ≡ SSE/SST = 1 − SSR/SST
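Continuing the simulated-data sketch from the estimation section (again an illustration, not part of the slides), the decomposition and R² can be checked numerically:

```python
# Continuing from the OLS sketch above: fitted values and residuals.
y_hat = beta0_hat + beta1_hat * x
u_hat = y - y_hat

sst = np.sum((y - y.mean())**2)       # total sum of squares
sse = np.sum((y_hat - y.mean())**2)   # explained sum of squares
ssr = np.sum(u_hat**2)                # residual sum of squares

print(np.isclose(sst, sse + ssr))     # True: SST = SSE + SSR
print(sse / sst, 1 - ssr / sst)       # two equivalent ways to compute R-squared
```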


R² is the ratio of the explained variation to the total variation; thus, it is interpreted as the fraction of the sample variation in y that is explained by x.
The value of R² is always between zero and one, because SSE can be no greater than SST.
When interpreting R², we usually multiply it by 100 to change it into a percent: 100 · R² is the percentage of the sample variation in y that is explained by x.
If the data points all lie on the same line, OLS provides a perfect fit to the data. In this case, R² = 1.
A value of R² that is nearly equal to zero indicates a poor fit of the OLS line: very little of the variation in the yi is captured by the variation in the ŷi.


Example (CEO Salary and Return on Equity)

For the population of chief executive officers, let y be annual salary (salary) in thousands of dollars. Let x be the average return on equity (roe) for the CEO’s firm for the previous three years. To study the relationship between this measure of firm performance and CEO compensation, we postulate the simple model

salary = β0 + β1 roe + u

Using the data in CEOSAL1, the OLS regression line relating salary to roe is

salary^ = 963.191 + 18.501 roe
n = 209, R² = 0.0132.


Example (CEO Salary and Return on Equity, continued)

How do we interpret the equation? First, if the return on equity is zero, roe = 0, then the predicted salary is the intercept, 963.191, which equals $963,191 since salary is measured in thousands. Next, we can write the predicted change in salary as a function of the change in roe:

∆salary^ = 18.501 (∆roe).

This means that if the return on equity increases by one percentage point, ∆roe = 1, then salary is predicted to change by about 18.5, or $18,500.
Using the R-squared (rounded to four decimal places) reported for this equation, we can see how much of the variation in salary is actually explained by the return on equity. The answer is: not much. The firm’s return on equity explains only about 1.3% of the variation in salaries for this sample of 209 CEOs. That means that 98.7% of the salary variation for these CEOs is left unexplained!
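One way to reproduce these estimates, sketched under the assumption that the third-party `wooldridge` and `statsmodels` packages are available and that the CEOSAL1 data carry the column names `salary` and `roe` (the slides do not specify any software):

```python
import statsmodels.formula.api as smf
import wooldridge  # assumed third-party helper that ships the Wooldridge datasets

# Load CEOSAL1; adjust the dataset name or columns if your copy differs.
ceosal1 = wooldridge.data('ceosal1')

res = smf.ols('salary ~ roe', data=ceosal1).fit()
print(res.params)     # intercept about 963.191, roe about 18.501
print(res.rsquared)   # about 0.0132
print(int(res.nobs))  # 209
```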

The Gauss-Markov Assumptions for Simple Regression

Assumption SLR.1 (Linear in Parameters)
In the population model, the dependent variable, y, is related to the independent variable, x, and the error (or disturbance), u, as

y = β0 + β1 x + u

where β0 and β1 are the population intercept and slope parameters, respectively.

Assumption SLR.2 (Random Sampling)
We have a random sample of size n, {(xi, yi) : i = 1, . . . , n}, following the population model in Assumption SLR.1.


Assumption SLR.3 (Sample Variation in the Explanatory Variable)
The sample outcomes on x, namely {xi, i = 1, . . . , n}, are not all the same value.

Assumption SLR.4 (Zero Conditional Mean)
The error u has an expected value of zero given any value of the explanatory variable. In other words,

E(u|x) = 0.

Assumption SLR.5 (Homoskedasticity)
The error u has the same variance given any value of the explanatory variable. In other words,

Var(u|x) = σ².
Figure: Homoskedasticity.
Figure: Heteroskedasticity.


Assumption SLR.4 is needed to estimate the ceteris paribus effect of x on y.
Before we discuss the key assumption about how x and u are related, we can always make one assumption about u.
As long as the intercept β0 is included in the equation, nothing is lost by assuming that the average value of u in the population is zero. Mathematically,

E(u) = 0. (8)

This assumption says nothing about the relationship between u and x, but simply makes a statement about the distribution of the unobserved factors in the population.


We now turn to the crucial assumption regarding how u and x are related.
Because u and x are random variables, we can define the conditional distribution of u given any value of x.
In particular, for any x, we can obtain the expected (or average) value of u for that slice of the population described by the value of x.
The crucial assumption is that the average value of u does not depend on the value of x. We can write this assumption as

E(u|x) = E(u). (9)

Equation (9) says that the average value of the unobservables is the
same across all slices of the population determined by the value of x
and that the common average is necessarily equal to the average of u
over the entire population.
When we combine (9) with assumption (8), we obtain the zero
conditional mean assumption, E(u|x) = 0.


Example (Returns to Education)

Let us see what equation (9) entails in the wage example. To simplify the discussion, assume that u is the same as innate ability. Then equation (9) requires that the average level of ability is the same, regardless of years of education. For example, if E(abil|8) denotes the average ability for the group of all people with eight years of education, and E(abil|16) denotes the average ability among people in the population with sixteen years of education, then equation (9) implies that these must be the same. In fact, the average ability level must be the same for all education levels. If, for example, we think that average ability increases with years of education, then equation (9) is false.


Example (Fertilizer and Yield)

In the fertilizer example, if fertilizer amounts are chosen independently of other features of the plots, then equation (9) will hold: the average land quality will not depend on the amount of fertilizer. However, if more fertilizer is put on the higher-quality plots of land, then the expected value of u changes with the level of fertilizer, and equation (9) fails.

Expected Values and Variances of the OLS Estimators

Expected Values of the OLS Estimators

Unbiasedness of OLS
Using Assumptions SLR.1 through SLR.4,

E(β̂0) = β0 and E(β̂1) = β1,

for any values of β0 and β1. In other words, β̂0 is unbiased for β0 and β̂1 is unbiased for β1.

To see why this is true, first note that

β̂1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
   = β1 + Σ (xi − x̄)ui / Σ (xi − x̄)²
   = β1 + (1/SSTx) Σ di ui,

where the sums run over i = 1, . . . , n, di = xi − x̄, and SSTx = Σ (xi − x̄)².

Therefore (keeping the conditioning on the x’s implicit), we have

E(β̂1) = β1 + (1/SSTx) Σ di E(ui)
       = β1 + (1/SSTx) Σ di · 0 = β1,

where we have used Assumptions SLR.2 and SLR.4.

The proof for β̂0 is now straightforward. Average Equation 6 across i to get ȳ = β0 + β1 x̄ + ū, and plug this into the formula for β̂0:

β̂0 = ȳ − β̂1 x̄ = β0 + β1 x̄ + ū − β̂1 x̄ = β0 + (β1 − β̂1) x̄ + ū.

Then, conditional on the values of the xi,

E(β̂0) = β0 + E[(β1 − β̂1) x̄] + E(ū) = β0 + E[β1 − β̂1] x̄,

because E(ū) = 0 by Assumptions SLR.2 and SLR.4.

But we showed that E(β̂1) = β1, which implies that E[β1 − β̂1] = 0. Thus, E(β̂0) = β0.
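A Monte Carlo sketch of this result (illustrative, with assumed parameter values, not from the slides): drawing many samples that satisfy SLR.1 through SLR.4 and averaging the OLS slopes recovers β1.

```python
import numpy as np

# Monte Carlo check of unbiasedness; all numbers below are assumptions.
rng = np.random.default_rng(1)
beta0, beta1, reps = 1.5, 0.8, 5000

slopes = np.empty(reps)
for r in range(reps):
    x_r = rng.uniform(0, 10, size=100)
    u_r = rng.normal(0, 2, size=100)   # E(u|x) = 0 holds by construction
    y_r = beta0 + beta1 * x_r + u_r
    dev = x_r - x_r.mean()
    slopes[r] = np.sum(dev * (y_r - y_r.mean())) / np.sum(dev**2)

print(slopes.mean())  # close to beta1 = 0.8: no systematic bias
```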

Variances of the OLS Estimators

Under Assumptions SLR.1 through SLR.5,

Var(β̂1) = σ² / Σ (xi − x̄)² = σ² / Σ xi∗²

and

Var(β̂0) = σ² (n⁻¹ Σ xi²) / Σ (xi − x̄)² = (Σ xi² / (n Σ xi∗²)) σ²

σ² can be estimated by the following formula:

σ̂² = Σ ûi² / (n − 2)

where n − 2 is the number of degrees of freedom (df).
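Continuing the earlier simulated-data sketch (again an illustration, reusing x, x_dev, u_hat, and n defined there), these formulas translate directly into code; np.mean(x**2) is n⁻¹ Σ xi².

```python
# Error-variance estimate with n - 2 degrees of freedom, then standard errors.
sigma2_hat = np.sum(u_hat**2) / (n - 2)
se_beta1 = np.sqrt(sigma2_hat / np.sum(x_dev**2))
se_beta0 = np.sqrt(sigma2_hat * np.mean(x**2) / np.sum(x_dev**2))
print(sigma2_hat, se_beta0, se_beta1)
```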

Properties of Least-Squares Estimators: The Gauss-Markov Theorem

Given the assumptions of the classical linear regression model, the least squares estimates possess some ideal or optimum properties.
These properties are contained in the well-known Gauss-Markov theorem.

Gauss-Markov Theorem
Given the assumptions of the classical linear regression model, the least-squares estimators, in the class of unbiased linear estimators, have minimum variance; that is, they are BLUE (Best Linear Unbiased Estimators).


An estimator, say the OLS estimator β̂1, is said to be a best linear unbiased estimator (BLUE) of β1 if the following hold:
1 It is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.
2 It is unbiased, that is, its average or expected value, E(β̂1), is equal to the true value, β1.
3 It has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.

Incorporating Nonlinearities in Simple Regression

So far, we have focused on linear relationships between the dependent and independent variables.
However, linear relationships are not nearly general enough for all economic applications.
In reading applied work in the social sciences, you will often encounter regression equations where the dependent variable appears in logarithmic form.
Consider the wage-education example, where we regress hourly wage on years of education,

wage = β0 + β1 educ + u. (10)

Suppose we obtained a slope estimate of 0.54, which means that each additional year of education is predicted to increase hourly wage by 54 cents.

Because of the linear nature of (10), 54 cents is the increase for either the first year of education or the twentieth year; this may not be reasonable.
Probably a better characterization of how wage changes with education is that each year of education increases wage by a constant percentage.
A model that gives (approximately) a constant percentage effect is

log(wage) = β0 + β1 educ + u, (11)

where log(·) denotes the natural logarithm.
In particular, if ∆u = 0, then

%∆wage ≈ (100 · β1)∆educ

Since the percentage change in wage is the same for each additional year of education, the change in wage for an extra year of education increases as education increases; in other words, (11) implies an increasing return to education.
By exponentiating (11), we can write wage = exp(β0 + β1 educ + u). This equation is graphed in the figure below.


Figure: wage = exp(β0 + β1 educ + u), with β1 > 0.


Example (A Log Wage Equation)

Using the data in WAGE1 and using log(wage) as the dependent variable, we obtain the following relationship:

log(wage)^ = 0.584 + 0.083 educ
n = 526, R² = 0.186.

The coefficient on educ has a percentage interpretation when it is multiplied by 100: predicted wage increases by 8.3% for every additional year of education. This is what economists mean when they refer to the “return to another year of education.”
The intercept is not very meaningful, because it gives the predicted log(wage) when educ = 0. The R-squared shows that educ explains about 18.6% of the variation in log(wage) (not wage).
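A sketch of this log-level regression under the same assumptions as the earlier CEOSAL1 sketch (the `wooldridge` package and the column names `wage` and `educ` are assumptions about the environment):

```python
import numpy as np
import statsmodels.formula.api as smf
import wooldridge  # assumed helper package, as in the earlier sketch

wage1 = wooldridge.data('wage1')        # assumes columns `wage` and `educ`
wage1['lwage'] = np.log(wage1['wage'])  # dependent variable in log form

res = smf.ols('lwage ~ educ', data=wage1).fit()
print(res.params)    # intercept about 0.584, educ about 0.083
print(res.rsquared)  # about 0.186
```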


Another important use of the natural log is in obtaining a constant elasticity model.
The elasticity of y with respect to x is approximately equal to ∆ log(y)/∆ log(x).
Thus, a constant elasticity model is approximated by

log(y) = β0 + β1 log(x) + u

Example (CEO Salary and Firm Sales)

We can estimate a constant elasticity model relating CEO salary to firm sales. The data set is the same one used in the CEO Salary-RoE example, except we now relate salary to sales. Let sales be annual firm sales, measured in millions of dollars. A constant elasticity model is

log(salary) = β0 + β1 log(sales) + u,

where β1 is the elasticity of salary with respect to sales. This model falls under the simple regression model by defining the dependent variable to be y = log(salary) and the independent variable to be x = log(sales).


Example (CEO Salary and Firm Sales (continued))

Estimating this equation by OLS gives

log(salary)^ = 4.822 + 0.257 log(sales)
n = 209, R² = 0.211.

The coefficient of log(sales) is the estimated elasticity of salary with respect to sales. It implies that a 1% increase in firm sales increases CEO salary by about 0.257%, which is the usual interpretation of an elasticity.
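The same hedged setup as the earlier sketches reproduces this log-log regression; the `wooldridge` package and the column names `salary` and `sales` are again assumptions about the environment:

```python
import numpy as np
import statsmodels.formula.api as smf
import wooldridge  # assumed helper package, as in the earlier sketches

ceosal1 = wooldridge.data('ceosal1')            # assumes columns `salary`, `sales`
ceosal1['lsalary'] = np.log(ceosal1['salary'])  # log-log model: the slope
ceosal1['lsales'] = np.log(ceosal1['sales'])    # on lsales is the elasticity

res = smf.ols('lsalary ~ lsales', data=ceosal1).fit()
print(res.params)  # lsales coefficient about 0.257
```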


In the log-level model, 100 · β1 is sometimes called the semi-elasticity of y with respect to x.
In the log-log model, β1 is the elasticity of y with respect to x.

************* End of Chapter Two *************