M Is Specification
RESET Test Example: House prices, hprice1.gdt
- Level-level model:
\[ \text{price} = \beta_0 + \beta_1\,\text{lotsize} + \beta_2\,\text{sqrft} + \beta_3\,\text{bdrms} + u \]
RESET Test Example: Level-level Model
Auxiliary regression for RESET specification test
OLS, using observations 1-88
Dependent variable: price
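The RESET auxiliary regression can be illustrated with a minimal sketch. It does not use hprice1.gdt: the data are simulated, and the helper names `ols` and `reset_f`, the sample size, and all coefficient values are assumptions made for illustration. The sketch uses the common form of RESET that adds the squared and cubed fitted values to the original regression and computes the F statistic for their joint significance.

```python
import random

def ols(X, y):
    """OLS of y on the columns of X (a list of rows); returns (coefficients, SSR)."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    ssr = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return beta, ssr

def reset_f(X, y):
    """RESET F statistic for the joint significance of yhat^2 and yhat^3."""
    n, k = len(X), len(X[0])
    beta, ssr_r = ols(X, y)                      # restricted (original) model
    yhat = [sum(b * x for b, x in zip(beta, row)) for row in X]
    X_aux = [row + [yh ** 2, yh ** 3] for row, yh in zip(X, yhat)]
    _, ssr_u = ols(X_aux, y)                     # auxiliary regression
    return ((ssr_r - ssr_u) / 2) / (ssr_u / (n - k - 2))

random.seed(0)
n = 2000                                         # simulated sample, not hprice1
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 0.5) for _ in range(n)]
X = [[1.0, a, b] for a, b in zip(x1, x2)]

y_linear = [1 + 0.5 * a - 0.5 * b + e for a, b, e in zip(x1, x2, u)]
y_quad = [yl + 0.5 * a ** 2 for yl, a in zip(y_linear, x1)]

f_ok = reset_f(X, y_linear)    # the linear model is correctly specified
f_bad = reset_f(X, y_quad)     # the linear model omits x1 squared
print(f"RESET F, correct specification:  {f_ok:.2f}")
print(f"RESET F, omitted quadratic term: {f_bad:.2f}")
```

A large F statistic relative to the F(2, n - k - 2) critical value signals functional form misspecification; the statistic itself does not say which alternative form is appropriate.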
Tests Against Nonnested Alternatives
- The other method is known as the Davidson-MacKinnon test. This test is based on including the fitted values ŷ from one model into the other model as an additional regressor and conducting a t-test.
- We will not examine these tests in detail.
- There are several drawbacks associated with nonnested tests.
- First, these tests may not choose a correct specification. Both models could be rejected or neither model could be rejected.
- If neither model could be rejected, we can use the adjusted R-square to choose between them.
- Second, rejecting one model does not automatically mean that the alternative is correct. The true model may have a completely different specification.
- Third, if the dependent variable is different, for example if one has y and the other has log(y) as dependent variables, these tests cannot be used. We need to employ more complex testing procedures which we will not discuss here.
Using Proxy Variables for Unobserved Explanatory Variables
- Can we use a proxy variable for an omitted unobserved explanatory variable?
- We know that if the unobserved variable is an important, relevant variable then OLS estimators are biased and inconsistent.
- The question can be rephrased as follows: Can we solve or at least mitigate the omitted variable bias using proxy variables?
- A proxy variable is something that is related to the unobserved variable that we would like to control for.
- Example: recall that in the wage equation we could not observe innate ability. Can we use intelligence quotient (IQ) as a proxy for ability?
- IQ does not have to be the same thing as ability, we know they are not. But what we need is for IQ to be correlated with ability.
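The Davidson-MacKinnon idea described in the nonnested-tests slide (add the fitted values from the rival model as an extra regressor and look at its t statistic) can be sketched as follows. The data are simulated, and the helper names `ols` and `dm_abs_t`, the two candidate models (levels versus logs of the regressors), and all parameter values are assumptions for illustration only. Because a single regressor is added, the sketch obtains |t| as the square root of the corresponding F statistic.

```python
import math
import random

def ols(X, y):
    """OLS of y on the columns of X (a list of rows); returns (coefficients, SSR)."""
    k = len(X[0])
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    for col in range(k):                       # Gauss-Jordan with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    ssr = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return beta, ssr

def dm_abs_t(X_null, X_alt, y):
    """|t| on the rival model's fitted values when added to the null model."""
    n, k = len(X_null), len(X_null[0])
    beta_alt, _ = ols(X_alt, y)
    fitted_alt = [sum(b * x for b, x in zip(beta_alt, row)) for row in X_alt]
    _, ssr_r = ols(X_null, y)
    _, ssr_u = ols([row + [fa] for row, fa in zip(X_null, fitted_alt)], y)
    f_stat = (ssr_r - ssr_u) / (ssr_u / (n - k - 1))   # one restriction: F = t^2
    return math.sqrt(max(f_stat, 0.0))

random.seed(4)
n = 1000
x1 = [random.uniform(1, 10) for _ in range(n)]
x2 = [random.uniform(1, 10) for _ in range(n)]
# Assumed truth: the model in logs of the regressors generates the data.
y = [1 + 2 * math.log(a) - math.log(b) + random.gauss(0, 0.3)
     for a, b in zip(x1, x2)]

X_lin = [[1.0, a, b] for a, b in zip(x1, x2)]
X_log = [[1.0, math.log(a), math.log(b)] for a, b in zip(x1, x2)]

t_lin = dm_abs_t(X_lin, X_log, y)   # tests the levels model against the log model
t_log = dm_abs_t(X_log, X_lin, y)   # tests the log model against the levels model
print(f"|t| when testing the levels model: {t_lin:.2f}")
print(f"|t| when testing the log model:    {t_log:.2f}")
```

In this simulated setting the levels model is rejected and the log model is not, but as the slide notes, real data can just as well reject both models or neither.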
- This says that once $x_3$ is controlled for, the expected value of $x_3^*$ does not depend on $x_1$ and $x_2$.
- For example, in the wage equation where IQ is the proxy variable for ability, this condition becomes
\[ E(\text{ability} \mid \text{educ}, \text{exper}, \text{IQ}) = E(\text{ability} \mid \text{IQ}) = \delta_0 + \delta_3\,\text{IQ} \]
- This implies that the average level of ability only changes with IQ, not with educ and exper. Is this a reasonable assumption?
- Let the composite error term be $e = u + \beta_3 \nu_3$:
\[ y = \alpha_0 + \beta_1 x_1 + \beta_2 x_2 + \alpha_3 x_3 + e \]
where $\alpha_0 = \beta_0 + \beta_3 \delta_0$ and $\alpha_3 = \beta_3 \delta_3$.
- If the assumptions for the proxy variables are all satisfied, then the composite error term $e$ will be uncorrelated with the explanatory variables included in the model. Thus, OLS estimators of $\alpha_0, \beta_1, \beta_2, \alpha_3$ will be consistent.
- The coefficient on IQ, $\alpha_3$, measures the impact of a one point change in IQ test score on wage.
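A small simulation can make the proxy-variable argument concrete. This is a sketch under assumed values, not an analysis of the wage data: the variable names mirror the slides, but the sample size and every parameter (for example $\beta_1 = 0.06$, $\beta_3 = 0.10$, $\delta_3 = 0.05$) are invented for illustration, and ability is constructed so that the proxy assumptions hold exactly.

```python
import random

def ols(X, y):
    """OLS coefficients of y on the columns of X (a list of rows)."""
    k = len(X[0])
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    for col in range(k):                       # Gauss-Jordan with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[r][k] / A[r][r] for r in range(k)]

random.seed(1)
n = 10000
beta1, beta3 = 0.06, 0.10          # assumed true effects of educ and ability
delta0, delta3 = -5.0, 0.05        # assumed proxy relation: ability on IQ

iq = [random.gauss(100, 15) for _ in range(n)]
# ability = delta0 + delta3*IQ + v3, with v3 independent of educ and IQ
ability = [delta0 + delta3 * q + random.gauss(0, 0.5) for q in iq]
# educ is related to ability only through IQ
educ = [12 + 0.08 * (q - 100) + random.gauss(0, 2) for q in iq]
lwage = [5 + beta1 * ed + beta3 * ab + random.gauss(0, 0.3)
         for ed, ab in zip(educ, ability)]

b_short = ols([[1.0, ed] for ed in educ], lwage)                 # ability omitted
b_proxy = ols([[1.0, ed, q] for ed, q in zip(educ, iq)], lwage)  # IQ as proxy

print(f"educ coefficient, ability omitted: {b_short[1]:.4f}")
print(f"educ coefficient, IQ included:     {b_proxy[1]:.4f} (true beta1 = {beta1})")
print(f"IQ coefficient:                    {b_proxy[2]:.4f} (beta3*delta3 = {beta3 * delta3})")
```

With the proxy included, the coefficient on educ should be close to the assumed $\beta_1$ and the coefficient on IQ close to $\alpha_3 = \beta_3 \delta_3$; without it, the educ coefficient absorbs part of the ability effect.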
Dependent variable: log(wage)
Using Proxy Variables: Wage2.gdt
- This data set contains information about monthly wages, education, experience, tenure, IQ scores, and several demographic characteristics for a sample of 935 working men in 1980.
- Adding IQ test scores we obtain the following results:
Model 1: OLS, using observations 1–935
Dependent variable: lwage
Using Lagged Dependent Variables as Proxy Variables
- In some applications (e.g., the wage example) we have at least a vague idea about which unobserved factor we want to control for.
- In other applications, we suspect that one or more of the independent variables is correlated with an omitted variable, but we have no idea how to obtain a proxy for that omitted variable.
- In such cases, we can include the value of the dependent variable $y$ from an earlier time period, $y_{-1}$.
- To do this we need the lagged value of the dependent variable. This provides a way of controlling for historical factors that cause current differences in the dependent variable.
- For example, some cities have had high crime rates in the past. Many of the unobserved factors contribute to both high current and past crime rates. Slowly moving components in the dependent variable (inertial effects) can be captured by the lagged value.
- Example: CRIME2.gdt, 1987 crime data for 46 cities, information in 1982 also available
- The model without the lagged crime rate:
\[ \widehat{\text{l\_crmrte87}} = \underset{(1.251)}{3.34} - \underset{(0.032)}{0.029}\,\text{unem87} + \underset{(0.173)}{0.203}\,\text{l\_lawexpc87}, \qquad n = 46,\ R^2 = 0.057 \]
- The model with lagged crime rate:
\[ \widehat{\text{l\_crmrte87}} = \underset{(0.821)}{0.076} + \underset{(0.02)}{0.009}\,\text{unem87} - \underset{(0.109)}{0.140}\,\text{l\_lawexpc87} + \underset{(0.132)}{1.194}\,\text{l\_crmrte82}, \qquad n = 46,\ R^2 = 0.680 \]
- In the first model, the crime rate decreases as unemployment increases. This is counterintuitive.
- After controlling for the crime rate in 1982 (five years earlier), the coefficient on unem87 is positive but insignificant.
- What is the elasticity of the current crime rate to the crime rate in the previous period?
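The significance statements for the CRIME2 example can be checked by hand from the reported coefficients and standard errors. The sketch below only redoes that arithmetic; the dictionary layout and the helper name `t_ratio` are illustrative choices, and the underscore in the variable names is assumed from the garbled output. Since both crime rates enter in logs, the coefficient on the lagged crime rate is itself the elasticity asked about.

```python
# Coefficients and standard errors as reported in the two CRIME2 regressions.
without_lag = {"unem87": (-0.029, 0.032), "l_lawexpc87": (0.203, 0.173)}
with_lag = {
    "unem87": (0.009, 0.02),
    "l_lawexpc87": (-0.140, 0.109),
    "l_crmrte82": (1.194, 0.132),
}

def t_ratio(coef, se, null=0.0):
    """t statistic for H0: coefficient equals `null`."""
    return (coef - null) / se

for label, model in [("without lag", without_lag), ("with lag", with_lag)]:
    for name, (coef, se) in model.items():
        print(f"{label:12s} {name:12s} t = {t_ratio(coef, se):6.2f}")

# Log-log specification: the coefficient on l_crmrte82 is the elasticity.
elasticity = with_lag["l_crmrte82"][0]
t_unit = t_ratio(*with_lag["l_crmrte82"], null=1.0)
print(f"elasticity = {elasticity}, t for H0: elasticity = 1 is {t_unit:.2f}")
```

The unem87 t ratios are small in absolute value in both models, which matches the slide's remark that the positive coefficient is insignificant, while the lagged crime rate has a t ratio of about 9.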
Measurement Errors in the Dependent Variable
- The model is:
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + \underbrace{u + e_0} \]
- If the measurement error, $e_0$, is uncorrelated with each $x_j$, then consistent estimation is possible. If the measurement error is independent of the explanatory variables, then OLS estimators are unbiased and consistent.
- If the error term $u$ and the measurement error $e_0$ are independent (this is usually assumed), then we have:
\[ \mathrm{Var}(u + e_0) = \mathrm{Var}(u) + \mathrm{Var}(e_0) = \sigma_u^2 + \sigma_0^2 > \sigma_u^2 \]
- This means that measurement error in the dependent variable results in a larger error variance than when no error occurs.
- As a result, OLS estimators will have larger variances and standard errors. In this case, we may try to collect more “quality” data.
Measurement Errors in the Dependent Variable: Example
- Consider the following savings model:
\[ sav^* = \beta_0 + \beta_1\,inc + \beta_2\,size + \beta_3\,educ + \beta_4\,age + u \]
$sav^*$: actual household savings, $sav$: reported (observed) household savings, inc: annual household income, size: number of individuals in the household, educ: education level of the household head, age: age of the household head.
- When does the measurement error ($sav - sav^*$) create a problem?
- We can assume that the measurement error is uncorrelated with income, size, education and age.
- On the other hand, we may think that families with higher incomes, or more education, report their savings more accurately.
- Since we cannot observe the measurement error, we may never be able to determine if the measurement error is correlated with income or education.
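The claim that measurement error in the dependent variable leaves OLS consistent but inflates the error variance can be illustrated with a short simulation. This is a sketch with a single regressor and assumed values ($\beta_0 = 1$, $\beta_1 = 2$, $\sigma_u^2 = 1$, $\sigma_0^2 = 4$); the helper name `simple_ols` is hypothetical, and the error $e_0$ is generated independently of $x$, which is the favourable case described above.

```python
import random

def simple_ols(x, y):
    """Slope, intercept and residual variance from regressing y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, ssr / (n - 2)

random.seed(2)
n = 20000
x = [random.gauss(0, 1) for _ in range(n)]
y_true = [1 + 2 * xi + random.gauss(0, 1) for xi in x]   # y*, Var(u) = 1
y_obs = [yt + random.gauss(0, 2) for yt in y_true]       # y = y* + e0, Var(e0) = 4

slope_true, _, s2_true = simple_ols(x, y_true)
slope_obs, _, s2_obs = simple_ols(x, y_obs)

print(f"slope using y*: {slope_true:.3f}, residual variance {s2_true:.3f}")
print(f"slope using y:  {slope_obs:.3f}, residual variance {s2_obs:.3f}")
```

Both slopes should be close to 2, while the residual variance with the mismeasured $y$ should be close to $\sigma_u^2 + \sigma_0^2 = 5$ rather than 1. If $e_0$ were generated to depend on $x$, the slope would no longer be consistent.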
(1) $e_1$ and $x_1$ are uncorrelated
- This assumption can be written as
\[ \mathrm{Cov}(x_1, e_1) = 0 \]
- Since $e_1 = x_1 - x_1^*$, it must be the case that $e_1$ and $x_1^*$ are correlated.
- Under this assumption, substituting $x_1^* = x_1 - e_1$ in the model we obtain:
\[ y = \beta_0 + \beta_1 x_1 + (u - \beta_1 e_1) \]
- Expected value and variance of the composite error term:
\[ E(u - \beta_1 e_1) = 0, \qquad \mathrm{Var}(u - \beta_1 e_1) = \sigma_u^2 + \beta_1^2 \sigma_{e_1}^2 \]
- OLS estimators are consistent because the error term and $x_1$ are uncorrelated. But the variance will be higher.
(2) $e_1$ and $x_1^*$ are uncorrelated (CEV Assumption)
- This is known as the “Classical Errors-in-Variables (CEV)”. In the econometrics literature, when we talk about measurement error in explanatory variable we usually mean CEV.
- The CEV assumption can be written as:
\[ \mathrm{Cov}(x_1^*, e_1) = 0 \]
- The observed value can be written as the sum of actual value and measurement error:
\[ x_1 = x_1^* + e_1 \]
- Obviously, if $x_1^*$ and $e_1$ are uncorrelated, then $x_1$ and $e_1$ must be correlated:
\[ \mathrm{Cov}(x_1, e_1) = E(x_1 e_1) = E(x_1^* e_1) + E(e_1^2) = 0 + \sigma_{e_1}^2 = \sigma_{e_1}^2 \]
- Under CEV assumption, the covariance between $x_1$ and $e_1$ is equal to the variance of the measurement error.
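The step from the CEV covariance result to the probability limit of the OLS slope can be written out as a short derivation. It is a sketch for the simple regression $y = \beta_0 + \beta_1 x_1^* + u$ with $x_1 = x_1^* + e_1$, and it additionally assumes that $u$ is uncorrelated with both $x_1^*$ and $e_1$, which the text above does not state explicitly.

```latex
\begin{align*}
\operatorname{plim}(\hat\beta_1)
  &= \frac{\mathrm{Cov}(x_1, y)}{\mathrm{Var}(x_1)}
   = \frac{\mathrm{Cov}(x_1^* + e_1,\ \beta_0 + \beta_1 x_1^* + u)}
          {\mathrm{Var}(x_1^* + e_1)} \\
  &= \frac{\beta_1 \sigma_{x_1^*}^2}{\sigma_{x_1^*}^2 + \sigma_{e_1}^2}
   = \beta_1 - \beta_1 \frac{\sigma_{e_1}^2}{\sigma_{x_1^*}^2 + \sigma_{e_1}^2}
\end{align*}
```

The last expression shows the same result in terms of the composite error: the inconsistency equals $\mathrm{Cov}(x_1, u - \beta_1 e_1)/\mathrm{Var}(x_1) = -\beta_1 \sigma_{e_1}^2 / (\sigma_{x_1^*}^2 + \sigma_{e_1}^2)$.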
\[ \operatorname{plim}(\hat\beta_1) = \beta_1 \underbrace{\left( \frac{\sigma_{x_1^*}^2}{\sigma_{x_1^*}^2 + \sigma_{e_1}^2} \right)}_{\le 1} \ne \beta_1 \]
- The term in parentheses never exceeds 1; it equals 1 if and only if $\sigma_{e_1}^2 = 0$.
- This means that, in the limit, $\hat\beta_1$ is always closer to 0 than the true value $\beta_1$ is. This is called attenuation bias.
- If $\beta_1 > 0$, then $\hat\beta_1$ will approach a value smaller than the true value in the limit (underestimation). If $\beta_1 < 0$, it will approach a value larger than the true value (overestimation).
- If the variance of $x_1^*$ is large compared to the variance of $e_1$, then the ratio $\mathrm{Var}(x_1^*)/\mathrm{Var}(x_1)$ will be close to 1. In this case the amount of inconsistency may not be large. But it is almost impossible to determine this.
- Things are more complicated when we add more explanatory variables.
- But we can say that measurement errors generally lead to inconsistency of all OLS estimators.
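Attenuation bias under the CEV assumption is easy to reproduce in a simulation. The sketch below uses assumed values ($\beta_1 = 2$, $\sigma_{x_1^*}^2 = 4$, $\sigma_{e_1}^2 = 1$, so the attenuation factor is $4/5 = 0.8$) and hypothetical helper names `cov` and `slope`; it also checks that the sample covariance between the observed regressor and the measurement error is close to $\sigma_{e_1}^2$.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

def slope(x, y):
    """OLS slope from a simple regression of y on x."""
    return cov(x, y) / cov(x, x)

random.seed(3)
n = 50000
beta1 = 2.0
sd_xstar, sd_e1 = 2.0, 1.0                                # assumed standard deviations

x_star = [random.gauss(0, sd_xstar) for _ in range(n)]    # true regressor x1*
e1 = [random.gauss(0, sd_e1) for _ in range(n)]           # CEV measurement error
x_obs = [xs + e for xs, e in zip(x_star, e1)]             # observed x1 = x1* + e1
y = [1 + beta1 * xs + random.gauss(0, 1) for xs in x_star]

b_true = slope(x_star, y)                                 # infeasible: uses x1*
b_obs = slope(x_obs, y)                                   # feasible: uses x1
shrink = sd_xstar ** 2 / (sd_xstar ** 2 + sd_e1 ** 2)     # attenuation factor
cov_x_e = cov(x_obs, e1)

print(f"slope using x1*: {b_true:.3f}")
print(f"slope using x1:  {b_obs:.3f} (beta1 * factor = {beta1 * shrink:.3f})")
print(f"Cov(x1, e1):     {cov_x_e:.3f} (Var(e1) = {sd_e1 ** 2:.1f})")
```

Increasing `sd_xstar` relative to `sd_e1` moves the attenuation factor toward 1, which is the case in which the inconsistency is small.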
Outliers: Example
- Of the 32 firms, 31 have annual sales less than $20 billion. One firm has annual sales of nearly $40 billion.
- This may be an outlier. Estimation results without outlier:
Outliers
- Certain functional forms may be less sensitive to outlying observations. The logarithmic transformation significantly narrows the range of the data, which can mitigate the problems created by outliers. For example, consider the following model:
\[ \log(rd) = \beta_0 + \beta_1 \log(sales) + \beta_2\,profmarg + u \]
rd: R&D expenditures, in millions of dollars
- n = 32, with outlier:
\[ \widehat{\log(rd)} = \underset{(0.468)}{-4.378} + \underset{(0.060)}{1.084}\,\log(sales) + \underset{(0.013)}{0.023}\,profmarg, \qquad n = 32,\ R^2 = 0.918 \]
- n = 31, without outlier:
\[ \widehat{\log(rd)} = \underset{(0.511)}{-4.404} + \underset{(0.067)}{1.088}\,\log(sales) + \underset{(0.013)}{0.0218}\,profmarg, \qquad n = 31,\ R^2 = 0.9037 \]
- Results are practically the same. Can we reject the null hypothesis of unit elasticity?
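The unit-elasticity question can be answered with a t statistic for $H_0: \beta_1 = 1$ computed from the reported estimates. The sketch below only redoes that arithmetic; the critical value of about 2.05 is a rounded two-sided 5% value for roughly 28 to 29 degrees of freedom and is hard-coded here as an assumption rather than computed.

```python
# (coefficient on log(sales), its standard error, sample size) as reported above
results = {
    "with outlier": (1.084, 0.060, 32),
    "without outlier": (1.088, 0.067, 31),
}

CRIT_5PCT = 2.05   # assumed, rounded two-sided 5% t critical value (df near 28-29)

t_stats = {}
for label, (coef, se, n) in results.items():
    df = n - 3                      # three estimated parameters, including the intercept
    t = (coef - 1.0) / se           # H0: elasticity of R&D with respect to sales is 1
    t_stats[label] = t
    verdict = "reject" if abs(t) > CRIT_5PCT else "do not reject"
    print(f"{label:16s} df = {df}, t = {t:.3f}: {verdict} H0 at 5%")
```

In both samples the t statistic is well below the critical value, so unit elasticity is not rejected at the 5% level with or without the outlier.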