Econ 3044: Introduction to Econometrics
Chapter-2: Simple Linear Regression
Lemi D.
Addis Ababa University
[Link]@[Link]
April 25, 2021
Lemi D. (AAUSC) Ch 2: Simple Linear Regression April 25, 2021 1 / 63
Overview
1 Definition of the Simple Regression Model
2 The Concept of PRF and SRF
3 The problem of Estimation
4 Residuals and Goodness of Fit
5 The Gauss-Markov Assumptions for Simple Regression
6 Expected Values and Variances of the OLS Estimators
7 The Gauss-Markov Theorem
8 Nonlinearities in Simple Regression
Definition of the Simple Regression Model
Much of applied econometric analysis begins with the following
premise: y and x are two variables, representing some population, and
we are interested in “explaining y in terms of x”.
In writing down a model that will “explain y in terms of x,” we must
confront three issues.
- First, since there is never an exact relationship between two variables,
how do we allow for other factors to affect y?
- Second, what is the functional relationship between y and x?
- And third, how can we be sure we are capturing a ceteris paribus
relationship between y and x?
We can resolve these ambiguities by writing down an equation relating
y to x. A simple equation is
y = β0 + β1 x + u (1)
Equation 1, which is assumed to hold in the population of interest,
defines the simple linear regression model.
When related by Equation 1, the variables y and x have several
different names used interchangeably, as shown in Table 1.
The terms “dependent variable” and “independent variable” are
frequently used in econometrics.
Table 1: Terminology for the variables y and x in simple regression.
The variable u, called the error term or disturbance in the relationship,
represents factors other than x that affect y.
A simple regression analysis effectively treats all factors affecting y
other than x as being unobserved.
Other justifications for the inclusion of the error term in the model
include:
- Measurement error
- Wrong mathematical specification of the model
- Errors in aggregation
- The randomness of human behavior
Equation 1 also addresses the issue of the functional relationship
between y and x.
If the other factors in u are held fixed, so that the change in u is zero,
∆u = 0, then x has a linear effect on y:
∆y = β1 ∆x if ∆u = 0
Thus, the change in y is simply β1 multiplied by the change in x.
This means that β1 is the slope parameter in the relationship
between y and x, holding the other factors in u fixed.
The intercept parameter β0 , sometimes called the constant term,
also has its uses, although it is rarely central to an analysis.
Example (Soybean Yield and Fertilizer)
Suppose that soybean yield is determined by the model
yield = β0 + β1 fertilizer + u,
so that y = yield and x = fertilizer. The agricultural researcher is
interested in the effect of fertilizer on yield, holding other factors fixed.
This effect is given by β1 . The error term u contains factors such as land
quality, rainfall, and so on.
Example (A Simple Wage Equation)
A model relating a person’s wage to observed education and other
unobserved factors is
wage = β0 + β1 educ + u.
If wage is measured in dollars per hour and educ is years of education,
then β1 measures the change in hourly wage given another year of
education, holding all other factors fixed. Some of those factors include
labor force experience, innate ability, tenure with current employer, work
ethic, and numerous other things.
The Concept of PRF and SRF
Consider the data in Table 2, which refers to a total population of 60
families in a hypothetical community and their weekly income (X) and
weekly consumption expenditure (Y ), both in dollars.
The 60 families are divided into 10 income groups (from $80 to $260)
and the weekly expenditures of each family in the various groups are
as shown in the table.
Despite the variability of weekly consumption expenditure within each
income bracket, on the average, weekly consumption expenditure
increases as income increases.
Table 2: Weekly family income X and weekly consumption expenditure Y, both in dollars.
In all we have 10 mean values for the 10 subpopulations of Y . We call
these mean values conditional expected values, as they depend on
the given values of the (conditioning) variable X.
Symbolically, we denote them as E(Y |X), which is read as the
expected value of Y given the value of X.
The dark circled points in Figure 1 show the conditional mean values
of Y against the various X values.
If we join these conditional mean values, we obtain what is known as
the population regression line (PRL), or more generally, the
population regression curve.
Figure 1: Conditional mean values of Y plotted against the various X values.
Geometrically, then, a population regression curve is simply the locus
of the conditional means of the dependent variable for the fixed values
of the explanatory variable(s).
More simply, it is the curve connecting the means of the
subpopulations of Y corresponding to the given values of the regressor
X. It can be depicted as in Figure 2.
This figure shows that for each X (i.e., income level) there is a
population of Y values (weekly consumption expenditures) that are
spread around the (conditional) mean of those Y values.
For simplicity, we are assuming that these Y values are distributed
symmetrically around their respective (conditional) mean values. And
the regression line (or curve) passes through these (conditional) mean
values.
Figure 2: The population regression line passing through the conditional means of Y.
From the preceding discussion and Figures 1 and 2, it is clear that
each conditional mean E(Y |Xi ) is a function of Xi , where Xi is a
given value of X. Symbolically,
E(Y |Xi ) = f (Xi ) (2)
where f (Xi ) denotes some function of the explanatory variable X.
Equation 2 is known as the population regression function (PRF).
In our example, E(Y |Xi ) is a linear function of Xi :
E(Y |Xi ) = β0 + β1 Xi
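To make the idea concrete, the conditional means E(Y|Xi) are simply the averages of the Y values within each X group. A minimal Python sketch with small hypothetical income–expenditure pairs (these numbers are illustrative, not the Table 2 data):

```python
from collections import defaultdict

# Hypothetical (X, Y) pairs: weekly income and weekly consumption expenditure.
# The values are made up for illustration; they are not the Table 2 data.
data = [(80, 55), (80, 60), (80, 65),
        (100, 65), (100, 70), (100, 75),
        (120, 78), (120, 84), (120, 90)]

# Group the Y values by their conditioning X value.
groups = defaultdict(list)
for x, y in data:
    groups[x].append(y)

# The conditional mean E(Y|X) is the average of Y within each X group.
cond_means = {x: sum(ys) / len(ys) for x, ys in groups.items()}
for x in sorted(cond_means):
    print(x, cond_means[x])  # consumption rises with income, on average
```

Joining the points (80, 60), (100, 70), (120, 84) would trace out the population regression curve for this toy population.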
Stochastic Specification of PRF
We see from Figure 1 that, given the income level of Xi , an individual
family’s consumption expenditure is clustered around the average
consumption of all families at that Xi , that is, around its conditional
expectation.
Therefore, we can express the deviation of an individual Yi around its
expected value as follows:
ui = Yi − E(Y |Xi )
or
Yi = E(Y |Xi ) + ui (3)
where the deviation ui is an unobservable random variable taking
positive or negative values.
Technically, ui is known as the stochastic disturbance or stochastic
error term.
If E(Y |Xi ) is assumed to be linear in Xi , Equation 3 may be written
as
Yi = β0 + β1 Xi + ui (4)
Now if we take the expected value of Equation 4, on both sides, we
obtain
E(Yi |Xi ) = β0 + β1 E(Xi |Xi ) + E(ui |Xi )
= β0 + β1 Xi + E(ui |Xi )
= E(Y |Xi ) + E(ui |Xi ) (5)
Since E(Yi |Xi ) is the same thing as E(Y |Xi ), Equation 5 implies that
E(ui |Xi ) = 0
Thus, the assumption that the regression line passes through the
conditional means of Y implies that the conditional mean values of ui
(conditional upon the given X’s) are zero.
However, for most practical situations what we have is a sample of Y
values corresponding to some fixed X’s.
Therefore, our task now is to estimate the PRF on the basis of the
sample information.
Suppose that the population of Table 2 was not known to us and the
only information we had was a randomly selected sample of Y values
for the fixed X’s, as shown in the following figure.
The question is: Can we estimate the PRF from the sample data?
We may not be able to estimate the PRF “accurately” because of
sampling fluctuations.
In general, we would get N different estimates for N different
samples. We represent these estimates by the sample regression
function (SRF).
Figure: A random sample from the population of Table 2
The sample regression function may be written as
Ŷi = β̂0 + β̂1 Xi
where
Ŷi = estimator of E(Y |Xi )
β̂0 = estimator of β0
β̂1 = estimator of β1
Note that an estimator is simply a rule or formula or method that
tells how to estimate the population parameter from the information
provided by the sample at hand.
To sum up, our primary objective in regression analysis is to estimate
the PRF
Yi = β0 + β1 Xi + ui
on the basis of the SRF
Yi = β̂0 + β̂1 Xi + ûi
But because of sampling fluctuations our estimate of the PRF based
on the SRF is at best an approximate one.
This approximation is shown diagrammatically in Figure 3.
Figure: Sample and population regression lines.
The problem of Estimation
Let {(xi , yi ) : i = 1, . . . , n} denote a random sample of size n from
the population.
The linear regression model can be written as
yi = β0 + β1 xi + ui (6)
for each i.
One of the objectives of the whole exercise in regression is to obtain
appropriate estimates for the parameters β0 and β1 .
There are a number of methods for estimating the parameters of this
model. The most popular ones are:
- the method of moments
- the method of ordinary least squares (OLS), and
- the method of maximum likelihood
Although these methods can give different results in more general
models, in the case of the simple linear regression all three give
identical results.
The Method of Ordinary Least Squares
By and large, it is the method of OLS that is used extensively in
regression analysis.
We estimate the PRF in Equation 6 from the SRF:
yi = β̂0 + β̂1 xi + ûi
First, we express the SRF as
ûi = yi − β̂0 − β̂1 xi
We adopt the least squares criterion, which states that the SRF should
be chosen in such a way that
Σ ûi² = Σ (yi − β̂0 − β̂1 xi)²,   (sums running over i = 1, . . . , n)

is as small as possible.
The sum of the squared residuals is some function of the estimators β̂0
and β̂1 .
Figure: Least-squares criterion.
Differentiating the sum of squared residuals with respect to β̂0 and β̂1
and setting the derivatives to zero yields the following equations for
estimating β0 and β1:
Σ yi = n β̂0 + β̂1 Σ xi
Σ xi yi = β̂0 Σ xi + β̂1 Σ xi²
where n is the sample size. These simultaneous equations are known
as the normal equations.
Solving the normal equations simultaneously, we obtain
β̂1 = [n Σ xi yi − Σ xi Σ yi] / [n Σ xi² − (Σ xi)²]
   = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
   = Σ xi* yi* / Σ xi*²

where x̄ and ȳ are the sample means of x and y and where we define
xi* = (xi − x̄) and yi* = (yi − ȳ).
and the intercept term becomes

β̂0 = [Σ xi² Σ yi − Σ xi Σ xi yi] / [n Σ xi² − (Σ xi)²]
   = ȳ − β̂1 x̄.
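These formulas are easy to verify numerically. A minimal Python sketch on hypothetical data, computing β̂1 both in raw-sums form and in deviation form (the two must agree), then β̂0 from the sample means:

```python
# Hypothetical sample (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Deviation form: beta1_hat = sum(x*_i y*_i) / sum(x*_i ** 2)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)

# Raw-sums form: (n Σxy − Σx Σy) / (n Σx² − (Σx)²)
b1_raw = (n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)) \
         / (n * sum(xi ** 2 for xi in x) - sum(x) ** 2)

# Intercept from the means: beta0_hat = ybar − beta1_hat * xbar
b0 = ybar - b1 * xbar

print(b1, b1_raw, b0)
```

Both slope formulas give the same number up to floating-point error, which is a useful sanity check when coding OLS by hand.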
Residuals and Goodness of Fit
Given β̂0 and β̂1 , we can obtain the fitted value ŷi for each
observation.
The OLS residual associated with observation i, ûi , is the difference
between yi and its fitted value.
For each i write,
yi = ŷi + ûi
Thus, we can view OLS as decomposing each yi into two parts, a
fitted value and a residual.
Define the total sum of squares (SST), the explained sum of
squares (SSE), and the residual sum of squares (SSR) as follows:
SST = Σ (yi − ȳ)²
SSE = Σ (ŷi − ȳ)²
SSR = Σ ûi²

with all sums running over i = 1, . . . , n.
The total variation in y can always be expressed as the sum of the
explained variation SSE and the unexplained variation SSR.
Thus,
SST = SSE + SSR (7)
Assuming that the total sum of squares, SST, is not equal to zero, we
can divide Equation 7 by SST to get 1 = SSE/SST + SSR/SST .
The R-squared of the regression, sometimes called the coefficient of
determination, is defined as
R² ≡ SSE/SST = 1 − SSR/SST
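The decomposition SST = SSE + SSR, and hence the definition of R², can be verified on a small hypothetical sample:

```python
# Hypothetical sample (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)

# Fit by OLS
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

# Sums of squares
SST = sum((yi - ybar) ** 2 for yi in y)
SSE = sum((yh - ybar) ** 2 for yh in yhat)
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))

R2 = SSE / SST
print(SST, SSE, SSR, R2)  # SST equals SSE + SSR, and 0 <= R2 <= 1
```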
R² is the ratio of the explained variation to the total variation;
thus, it is interpreted as the fraction of the sample variation
in y that is explained by x.
The value of R² is always between zero and one, because SSE can be
no greater than SST.
When interpreting R², we usually multiply it by 100 to change it into
a percent: 100 · R² is the percentage of the sample variation in y that
is explained by x.
If the data points all lie on the same line, OLS provides a perfect fit to
the data. In this case, R² = 1.
A value of R² that is nearly equal to zero indicates a poor fit of the
OLS line: very little of the variation in the yi is captured by the
variation in the ŷi.
Example (CEO Salary and Return on Equity)
For the population of chief executive officers, let y be annual salary
(salary) in thousands of dollars. Let x be the average return on equity
(roe) for the CEO’s firm for the previous three years. To study the
relationship between this measure of firm performance and CEO
compensation, we postulate the simple model
salary = β0 + β1 roe + u
Using the data in CEOSAL1, the OLS regression line relating salary to roe
is
salary^ = 963.191 + 18.501 roe
n = 209, R² = 0.0132.
Example (CEO Salary and Return on Equity)
How do we interpret the equation? First, if the return on equity is zero,
roe = 0, then the predicted salary is the intercept, 963.191, which equals
$963,191 since salary is measured in thousands. Next, we can write the
predicted change in salary as a function of the change in roe:
∆salary^ = 18.501 (∆roe). This means that if the return on equity
increases by one percentage point, ∆roe = 1, then salary is predicted to
change by about 18.5, or $18,500.
Using the R-squared (rounded to four decimal places) reported for this
equation, we can see how much of the variation in salary is actually
explained by the return on equity. The answer is: not much. The firm’s
return on equity explains only about 1.3% of the variation in salaries for
this sample of 209 CEOs. That means that 98.7% of the salary variations
for these CEOs is left unexplained!
The Gauss-Markov Assumptions for Simple Regression
Assumption SLR.1 (Linear in Parameters)
In the population model, the dependent variable, y, is related to the
independent variable, x, and the error (or disturbance), u, as
y = β0 + β1 x + u
where β0 and β1 are the population intercept and slope parameters,
respectively.
Assumption SLR.2 (Random Sampling)
We have a random sample of size n, {(xi , yi ) : i = 1, . . . , n}, following the
population model in Assumption SLR.1.
Assumption SLR.3 (Sample Variation in the Explanatory Variable)
The sample outcomes on x, namely, {xi , i = 1, . . . , n} are not all the same
value.
Assumption SLR.4 (Zero Conditional Mean)
The error u has an expected value of zero given any value of the
explanatory variable. In other words,
E(u|x) = 0.
Assumption SLR.5 (Homoskedasticity)
The error u has the same variance given any value of the explanatory
variable. In other words,
Var(u|x) = σ².
Figure: Homoskedasticity.
Figure: Heteroskedasticity.
Assumption SLR.4 is needed to estimate the ceteris paribus effect of x
on y.
Before we discuss the key assumption about how x and u are related,
we can always make one assumption about u.
As long as the intercept β0 is included in the equation, nothing is lost
by assuming that the average value of u in the population is zero.
Mathematically,
E(u) = 0. (8)
This assumption says nothing about the relationship between u and x,
but simply makes a statement about the distribution of the
unobserved factors in the population.
We now turn to the crucial assumption regarding how u and x are
related.
Because u and x are random variables, we can define the conditional
distribution of u given any value of x.
In particular, for any x, we can obtain the expected (or average) value
of u for that slice of the population described by the value of x.
The crucial assumption is that the average value of u does not depend
on the value of x. We can write this assumption as
E(u|x) = E(u). (9)
Equation (9) says that the average value of the unobservables is the
same across all slices of the population determined by the value of x
and that the common average is necessarily equal to the average of u
over the entire population.
When we combine (9) with assumption (8), we obtain the zero
conditional mean assumption, E(u|x) = 0.
Example (Returns to Education)
Let us see what equation (9) entails in the wage example. To simplify the
discussion, assume that u is the same as innate ability. Then equation (9)
requires that the average level of ability is the same, regardless of years of
education. For example, if E(abil|8) denotes the average ability for the
group of all people with eight years of education, and E(abil|16) denotes
the average ability among people in the population with sixteen years of
education, then equation (9) implies that these must be the same. In fact,
the average ability level must be the same for all education levels. If, for
example, we think that average ability increases with years of education,
then equation (9) is false.
Example (Fertilizer and Yield)
In the fertilizer example, if fertilizer amounts are chosen independently of
other features of the plots, then equation (9) will hold: the average land
quality will not depend on the amount of fertilizer. However, if more
fertilizer is put on the higher-quality plots of land, then the expected value
of u changes with the level of fertilizer, and equation (9) fails.
Expected Values and Variances of the OLS Estimators
Expected Values of the OLS Estimators
Unbiasedness of OLS
Under Assumptions SLR.1 through SLR.4,
E(β̂0 ) = β0 and E(β̂1 ) = β1 ,
for any values of β0 and β1 . In other words, β̂0 is unbiased for β0 and β̂1 is
unbiased for β1 .
To see why this is true, first note that

β̂1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
   = β1 + Σ (xi − x̄) ui / Σ (xi − x̄)²
   = β1 + (1/SSTx) Σ di ui,

where di = xi − x̄ and SSTx = Σ (xi − x̄)² denotes the total sample
variation in the xi.
Therefore (keeping the conditioning on the x’s implicit), we have

E(β̂1) = β1 + (1/SSTx) Σ di E(ui)
      = β1 + (1/SSTx) Σ di · 0 = β1,
where we have used Assumptions SLR.2 and SLR.4.
The proof for β̂0 is now straightforward. Average (6) across i to get
ȳ = β0 + β1 x̄ + ū, and plug this into the formula for β̂0:

β̂0 = ȳ − β̂1 x̄ = β0 + β1 x̄ + ū − β̂1 x̄ = β0 + (β1 − β̂1) x̄ + ū.
Then, conditional on the values of the xi,

E(β̂0) = β0 + E[(β1 − β̂1) x̄] + E(ū) = β0 + E[(β1 − β̂1)] x̄,

because E(ū) = 0 by Assumptions SLR.2 and SLR.4.
But, we showed that E(β̂1) = β1, which implies that
E[(β1 − β̂1)] = 0. Thus, E(β̂0) = β0.
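Unbiasedness can also be illustrated by simulation: draw many samples from a population satisfying SLR.1–SLR.4, estimate β̂1 in each, and average the estimates. A sketch with assumed parameter values (β0 = 1, β1 = 2, standard normal errors):

```python
import random

# Monte Carlo sketch of unbiasedness of the OLS slope.
# All parameter values below are assumptions for illustration.
random.seed(42)
b0_true, b1_true = 1.0, 2.0
n, reps = 50, 2000

estimates = []
for _ in range(reps):
    x = [random.uniform(0, 10) for _ in range(n)]
    u = [random.gauss(0, 1) for _ in range(n)]  # E(u|x) = 0 holds here
    y = [b0_true + b1_true * xi + ui for xi, ui in zip(x, u)]
    xbar = sum(x) / n
    ybar = sum(y) / n
    b1_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
             / sum((xi - xbar) ** 2 for xi in x)
    estimates.append(b1_hat)

avg_b1 = sum(estimates) / reps
print(avg_b1)  # close to the true slope of 2.0
```

Each individual β̂1 differs from 2 because of sampling error, but the average across many samples is very close to the true value, which is exactly what E(β̂1) = β1 asserts.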
Variances of the OLS Estimators
Under Assumptions SLR.1 through SLR.5,
Var(β̂1) = σ² / Σ (xi − x̄)² = σ² / Σ xi*²

and

Var(β̂0) = σ² · (Σ xi² / n) / Σ (xi − x̄)² = [Σ xi² / (n Σ xi*²)] σ²

σ² can be estimated by the following formula:

σ̂² = Σ ûi² / (n − 2)
where n − 2 is the number of degrees of freedom (df).
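A short sketch of estimating σ² and the variance of β̂1 from the formulas above, on a small hypothetical sample:

```python
import math

# Hypothetical sample (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)

# OLS fit
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# Residuals and the estimator of sigma^2 with df = n - 2
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e ** 2 for e in resid) / (n - 2)

# Estimated variance and standard error of the slope
var_b1 = sigma2_hat / sum((xi - xbar) ** 2 for xi in x)
se_b1 = math.sqrt(var_b1)
print(sigma2_hat, var_b1, se_b1)
```

Note that dividing by n − 2 rather than n reflects the two degrees of freedom used up in estimating β̂0 and β̂1.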
The Gauss-Markov Theorem
Properties of Least-Squares Estimators: The Gauss-Markov Theorem
Given the assumptions of the classical linear regression model, the
least squares estimates possess some ideal or optimum properties.
These properties are contained in the well-known Gauss-Markov
theorem.
Gauss-Markov Theorem
Given the assumptions of the classical linear regression model, the
least-squares estimators, in the class of unbiased linear estimators, have
minimum variance, that is, they are BLUE (Best Linear Unbiased
Estimators).
An estimator, say the OLS estimator β̂1, is said to be a best linear
unbiased estimator (BLUE) of β1 if the following hold:
1 It is linear, that is, a linear function of a random variable, such as the
dependent variable Y in the regression model.
2 It is unbiased, that is, its average or expected value, E(β̂1 ), is equal
to the true value, β1 .
3 It has minimum variance in the class of all such linear unbiased
estimators; an unbiased estimator with the least variance is known as
an efficient estimator.
Incorporating Nonlinearities in Simple Regression
So far, we have focused on linear relationships between the dependent
and independent variables.
However, linear relationships are not nearly general enough for all
economic applications.
In reading applied work in the social sciences, you will often encounter
regression equations where the dependent variable appears in
logarithmic form.
Consider the wage-education example, where we regress hourly wage
on years of education,
wage = β0 + β1 educ + u. (10)
Suppose we obtained a slope estimate of 0.54, which means that each
additional year of education is predicted to increase hourly wage by 54
cents.
Because of the linear nature of (10), 54 cents is the increase for either
the first year of education or the twentieth year; this may not be
reasonable.
Probably a better characterization of how wage changes with
education is that each year of education increases wage by a constant
percentage.
A model that gives (approximately) a constant percentage effect is
log(wage) = β0 + β1 educ + u, (11)
where log(·) denotes the natural logarithm.
In particular, if ∆u = 0, then
%∆wage ≈ (100 · β1 )∆educ
Since the percentage change in wage is the same for each additional
year of education, the change in wage for an extra year of education
increases as education increases; in other words, (11) implies an
increasing return to education.
By exponentiating (11), we can write wage = exp(β0 + β1 educ + u).
This equation is graphed in Figure 7.
Figure: wage = exp(β0 + β1 educ + u), with β1 > 0.
Example (A Log Wage Equation)
Using the data in WAGE1 and using log(wage) as the dependent variable,
we obtain the following relationship:
log(wage)^ = 0.584 + 0.083 educ
n = 526, R² = 0.186.
The coefficient on educ has a percentage interpretation when it is
multiplied by 100: wage^ increases by 8.3% for every additional year of
education. This is what economists mean when they refer to the “return to
another year of education.”
The intercept is not very meaningful, because it gives the predicted
log(wage) when educ = 0. The R-squared shows that educ explains
about 18.6% of the variation in log(wage) (not wage).
Another important use of the natural log is in obtaining a constant
elasticity model.
The elasticity of y with respect to x is approximately equal to
∆ log(y)/∆ log(x).
Thus, a constant elasticity model is approximated by
log(y) = β0 + β1 log(x) + u
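A sketch of the constant-elasticity property: in log(y) = β0 + β1 log(x) + u with u held at zero, a 1% increase in x raises y by approximately β1 percent at every level of x. The parameter values here are assumptions for illustration:

```python
import math

b0, b1 = 1.0, 0.5  # assumed intercept and elasticity

def y_of(x):
    # The model with u = 0, solved for y: y = exp(b0) * x**b1
    return math.exp(b0 + b1 * math.log(x))

# The percentage response to a 1% increase in x is the same at any x
for x in (10.0, 100.0, 1000.0):
    pct = 100 * (y_of(1.01 * x) - y_of(x)) / y_of(x)
    print(x, pct)  # roughly 0.499 everywhere, close to b1 = 0.5 percent
```

This is the sense in which the log-log specification imposes a constant elasticity: the percentage response does not depend on where on the curve you start.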
Example (CEO Salary and Firm Sales)
We can estimate a constant elasticity model relating CEO salary to firm
sales. The data set is the same one used in the CEO Salary-RoE example,
except we now relate salary to sales. Let sales be annual firm sales,
measured in millions of dollars. A constant elasticity model is
log(salary) = β0 + β1 log(sales) + u,
where β1 is the elasticity of salary with respect to sales. This model falls
under the simple regression model by defining the dependent variable to be
y = log(salary) and the independent variable to be x = log(sales).
Example (CEO Salary and Firm Sales (continued ))
Estimating this equation by OLS gives
log(salary)^ = 4.822 + 0.257 log(sales)
n = 209, R² = 0.211.
The coefficient of log(sales) is the estimated elasticity of salary with
respect to sales. It implies that a 1% increase in firm sales increases CEO
salary by about 0.257%—the usual interpretation of an elasticity.
In the log-level model, 100 · β1 is sometimes called the semi-elasticity
of y with respect to x.
In the log-log model, β1 is the elasticity of y with respect to x.
************* End of Chapter Two *************