Note On Panel Data

Panel data regression methods allow the modeling of longitudinal data that follows the same observational units, like firms or countries, over time. This combines elements of time series and cross-sectional data. There are two main types of panel data models - fixed effects models and random effects models. Fixed effects models account for time-invariant characteristics of observational units by allowing intercept terms to vary across units, while random effects models treat these effects as random variables. Estimation techniques include dummy variable regression or within transformation to control for fixed effects.

Uploaded by

HassanMubasher


Panel Data Regression Methods

1) What is Panel Data?

In panel data we have elements of both time series and cross-sectional data.
That is, the same cross-sectional unit (say a firm/state/country) is surveyed over time.
Example: data on GDP for 3 countries for a period of 5 years (see Table).
Year    India    USA    Pakistan
1991    ***      ***    ***
1992    ***      ***    ***
1993    ***      ***    ***
1994    ***      ***    ***
1995    ***      ***    ***

In the typical panel there are many cross-sectional units and only a few time periods, though the opposite case is also relevant.
Such a data set focuses on cross-sectional variation, or heterogeneity.
Panel data are also known as pooled data or longitudinal data. Strictly speaking, however, not all pooled data are panel data: a pooled data set is treated as a complete panel only when individual and/or time effects are included.

2) Types of Panel Data

(A) Balanced panel: Each cross-sectional unit has the same number of time series observations.

Year    Firm ID    Investment (Y)    Value of Firm (X1)    Capital Stock (X2)
1991    1          ***               ***                   ***
1992    1          ***               ***                   ***
1993    1          ***               ***                   ***
1994    1          ***               ***                   ***
1995    1          ***               ***                   ***
1991    2          ***               ***                   ***
1992    2          ***               ***                   ***
1993    2          ***               ***                   ***
1994    2          ***               ***                   ***
1995    2          ***               ***                   ***
1991    3          ***               ***                   ***
1992    3          ***               ***                   ***
1993    3          ***               ***                   ***
1994    3          ***               ***                   ***
1995    3          ***               ***                   ***

(B) Unbalanced panel: The number of observations/time periods differs among panel members.

Year    Firm ID    Investment (Y)    Value of Firm (X2)    Capital Stock (X3)
1991    1          ***               ***                   ***
1992    1          ***               ***                   ***
1993    1          ***               ***                   ***
1991    2          ***               ***                   ***
1992    2          ***               ***                   ***
1993    2          ***               ***                   ***
1994    2          ***               ***                   ***
1991    3          ***               ***                   ***
1992    3          ***               ***                   ***
1993    3          ***               ***                   ***
1994    3          ***               ***                   ***

3) Advantages of Panel Data

It takes into account heterogeneity across individual cross-sectional units and through time, by including individual-specific and time-specific variables.
It allows unobserved effects to be included as explanatory variables in the model. Excluding such effects could cause omitted variable bias.
It helps to examine issues that cannot be studied in either cross-sectional or time-series settings alone.
It gives more informative data, more variability, less collinearity among variables, more degrees of freedom and more efficiency.
It is better suited to studying the dynamics of change, as it uses repeated cross sections of observations.
It enables us to study more complicated behavioural models (e.g. technological change).
It allows the researcher far greater flexibility in modelling differences in behaviour across individual cross-sectional units (see the one-way fixed effects model).
Using a single cross section disregards much useful information in other time periods. Likewise, using time-series data alone disregards useful information on other cross-sectional units.

4) Methods of Estimation

Panel Data Regression
    Fixed Effects Methods (FEM)
        One-way FEM
        Two-way FEM
    Random Effects Methods (REM)
        One-way REM
        Two-way REM

5) Fixed Effects Model (FEM)

Consider the following simple (panel) regression model:
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (1)$$
where $i$ denotes the $i$th cross-sectional unit or observation (e.g. a firm or state), $t$ denotes the $t$th time period, and $u_{it}$ is the common error term.
Problem with this model: it ignores two important aspects that are relevant in the panel data context.
First, it does not consider the space/individual/group dimension, that is, factors which are specific to each cross-sectional unit (individuals/firms/states/countries) but remain unchanged over time.
Examples: geographical location of a state, ability of an individual, managerial style/philosophy of a company, fixed capital of a firm, historical factors, etc.
Second, it ignores the time dimension, that is, factors which are specific to the period in which they occur but are not carried across periods within a cross-sectional unit.
Examples: technological change, changes in government regulatory policies, economic boom or bust, global economic slowdown, weather, and so on.
Together, the group and time dimensions are called unobserved effects (or unobserved variables/factors).
Thus, model (1) is the usual OLS regression, which in the panel data context is called the pooled regression model.
Because the group and time dimensions are omitted (a specification error), the OLS results might not be fully reliable.
Hence, we need some way to take the group and time dimensions into account in model (1).
How to accomplish this?

5.1. The case of group dimension:

We let the intercept of (1) vary across cross-sectional units but do not allow it to change over time for a given cross-sectional unit.
If we do so, model (1) becomes
$$Y_{it} = \beta_{1i} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (2)$$
In (2) we have an intercept term ($\beta_{1i}$) specific to each individual cross-sectional unit $i$.
Technically, $\beta_{1i}$ is called the individual effect and is an unknown parameter to be estimated.
Note that in model (1) we have a common intercept term ($\beta_1$), which is assumed to be the same across all cross-sectional units.
By incorporating the group dimension we acknowledge that there is heterogeneity, i.e. significant differences, among cross-sectional units, and that this difference is captured through the individual-specific constant term $\beta_{1i}$.
Model (2) is known as the one-way Fixed Effects Model since only the individual effect is present.
The term "fixed effects" reflects the fact that (a) we treat $\beta_{1i}$ as a group-specific constant term in the regression model; and (b) each cross-sectional unit's intercept does not vary over time; that is, it is time invariant.

How to allow for Group/Individual effect in econometric model?

We can easily allow for this using the dummy variable technique.
With dummy variables, model (2) can be written as
$$Y_{it} = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \dots + \alpha_n D_{ni} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (3)$$
where $D_{2i}$ = 1 if the observation belongs to cross-sectional unit 2, zero otherwise;
$D_{3i}$ = 1 if the observation belongs to cross-sectional unit 3, zero otherwise;
$D_{ni}$ = 1 if the observation belongs to the $n$th cross-sectional unit, zero otherwise.
Note: If we have n cross-sectional units/groups, we use n-1 dummies to avoid falling into the dummy-variable trap (i.e. a situation of perfect collinearity). Hence there is no dummy for the first cross-sectional unit, which means $\alpha_1$ represents the intercept of the first (omitted) cross-sectional unit.
The other $\alpha$s are differential intercept coefficients: they indicate how much the intercept of each unit that is assigned a dummy differs from the intercept of the unit that is not assigned a dummy.
In short, the cross-sectional unit that is not assigned a dummy becomes the comparison unit.
Of course, we are free to choose any cross-sectional unit as the comparison unit. [In any case, the computer programme will automatically omit the dummy for the first cross-sectional unit (i.e. select the first i as the comparison unit) and compute the results.]
Since we use dummies to estimate the fixed/individual effects, this model is also known as the Least-Squares Dummy Variable (LSDV) model. Hence, the terms FEM and LSDV can be used interchangeably.
How to incorporate individual dummy variables in the data set?
Assume that we have 5 years of data for 4 companies on three variables: one dependent variable (investment) and two independent variables (value of firm and capital stock).
We can stack the 5 observations for each company one on top of the other, giving 20 observations in all for each variable in the model.
The individual dummies are then assigned in the following fashion.

Obs*   Year   Firm ID   I        F         C        D1   D2   D3   D4
1      1940   1         74.4     2132.2    186.6    1    0    0    0
2      1941   1         113      1834.1    220.9    1    0    0    0
3      1942   1         91.9     1588      287.8    1    0    0    0
4      1943   1         61.3     1749.4    319.9    1    0    0    0
5      1944   1         56.8     1687.2    321.3    1    0    0    0
6      1940   2         461.2    4643.9    207.2    0    1    0    0
7      1941   2         512      4551.2    255.2    0    1    0    0
8      1942   2         448      3244.1    303.7    0    1    0    0
9      1943   2         499.6    4053.7    264.1    0    1    0    0
10     1944   2         547.5    4379.3    201.6    0    1    0    0
11     1940   3         361.6    2202.9    254.2    0    0    1    0
12     1941   3         472.8    2380.5    261.4    0    0    1    0
13     1942   3         445.6    2168.6    298.7    0    0    1    0
14     1943   3         361.6    1985.1    301.8    0    0    1    0
15     1944   3         288.2    1813.9    279.1    0    0    1    0
16     1940   4         28.57    628.5     26.5     0    0    0    1
17     1941   4         48.51    537.1     36.2     0    0    0    1
18     1942   4         43.34    561.2     60.8     0    0    0    1
19     1943   4         37.02    617.2     84.4     0    0    0    1
20     1944   4         37.81    626.7     91.2     0    0    0    1

* The number of observations N is the number of years multiplied by the number of cross-sectional units (here 5 x 4 = 20).

For these data we get the following result:
$$\hat{Y}_{it} = -52.468 + 260.69\,D_{2i} + 285.51\,D_{3i} + 49.51\,D_{4i} + 0.065\,X_{2it} + 0.056\,X_{3it}$$
where $Y_{it}$ represents I, $X_{2it}$ represents F, and $X_{3it}$ represents C.
Note: other statistics are not reported here.
The intercepts of the four companies are as follows: -52.47 for Company 1; 208.22 (= 260.69 - 52.47) for Company 2; 233.04 (= 285.51 - 52.47) for Company 3; and -2.96 (= 49.51 - 52.47) for Company 4.
These differences in intercepts may be due to unique features of each company, such as differences in management style or managerial talent.
The problem with dummy variable analysis is that it produces many explanatory variables, especially when the number of cross-sectional units is large. In such cases the dummy variable method is not very practical.
Computer regression packages take care of the mechanics, but the loss in degrees of freedom remains the major casualty of this exercise.
Moreover, the packages rarely report (LIMDEP and GRETL do not) the estimated intercepts from the dummy variable analysis, for the practical reason that there are so many of them.
Some econometric packages that support fixed effects estimation report only one intercept (GRETL does this), which can cause confusion.
Typically, the intercept reported in this case is the average of the individual-specific intercepts: (-52.47 + 208.22 + 233.04 - 2.96)/4, or approximately 96.46, in the above example.
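The dummy-variable regression described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the procedure used by LIMDEP or GRETL: the function name `lsdv` and the variable names are hypothetical, the data are the 20 observations from the table above, and the first firm is taken as the comparison unit.

```python
import numpy as np

# Investment (I), value of firm (F) and capital stock (C) for 4 firms x 5 years,
# copied from the stacked data table in these notes.
I = np.array([74.4, 113, 91.9, 61.3, 56.8, 461.2, 512, 448, 499.6, 547.5,
              361.6, 472.8, 445.6, 361.6, 288.2, 28.57, 48.51, 43.34, 37.02, 37.81])
F = np.array([2132.2, 1834.1, 1588, 1749.4, 1687.2, 4643.9, 4551.2, 3244.1, 4053.7, 4379.3,
              2202.9, 2380.5, 2168.6, 1985.1, 1813.9, 628.5, 537.1, 561.2, 617.2, 626.7])
C = np.array([186.6, 220.9, 287.8, 319.9, 321.3, 207.2, 255.2, 303.7, 264.1, 201.6,
              254.2, 261.4, 298.7, 301.8, 279.1, 26.5, 36.2, 60.8, 84.4, 91.2])
firm = np.repeat(np.arange(1, 5), 5)  # firm IDs 1..4, five years each


def lsdv(y, X, groups):
    """OLS of y on a constant, n-1 group dummies and the regressors in X.

    Returns the n group intercepts and the slope coefficients.
    """
    ids = np.unique(groups)
    # One dummy per group except the first, which acts as the comparison unit.
    dummies = (groups[:, None] == ids[None, 1:]).astype(float)
    Z = np.column_stack([np.ones(len(y)), dummies, X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    n_dummies = dummies.shape[1]
    base = coef[0]
    # Intercept of group g = base intercept + differential intercept of g.
    intercepts = np.concatenate([[base], base + coef[1:1 + n_dummies]])
    slopes = coef[1 + n_dummies:]
    return intercepts, slopes


intercepts, slopes = lsdv(I, np.column_stack([F, C]), firm)
print("firm intercepts:", intercepts.round(2))
print("slopes on F and C:", slopes.round(4))
print("average intercept:", round(float(intercepts.mean()), 2))
```

The last line prints the average of the firm-specific intercepts, which is the single intercept that some packages report.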
How to decide on the statistical significance of individual/group effects (or individual dummies)? [OR] How can we choose between OLS regression results and FEM (with individual effects) results?
We can use the restricted F test for this purpose. We use an F test because our aim is to test the joint null hypothesis that $\alpha_2 = \alpha_3 = \dots = \alpha_n = 0$ (n-1 restrictions, since one group is chosen as the base group), where n is the number of cross-sectional units; that is, the null hypothesis is that all n-1 dummy coefficients are zero. In the vast majority of applications, the dummy variables will be jointly significant.
For the purpose of the F test, the OLS model is the restricted model, in the sense that it imposes a common intercept on all the individual cross-sectional units. The FEM (with individual effects) is the unrestricted model.
The null hypothesis is: the intercept terms are equal for all i [or] there is no significant difference across the i's.
If the null is not rejected, the efficient estimator is the one obtained from the model that assumes a constant intercept, i.e. OLS.
The F ratio used for the test is as follows:
$$F(n-1,\; nT-n-K) = \frac{(R^2_{UR} - R^2_{R})/(n-1)}{(1 - R^2_{UR})/(nT-n-K)}$$
where n = number of cross-sectional units (or groups)
T = number of time periods for each cross-sectional unit
K = number of explanatory variables (excluding the constant)
$R^2_{UR}$ = R-squared of the unrestricted regression model (FEM)
$R^2_{R}$ = R-squared of the restricted regression model (OLS)
If the computed F value is larger than the critical value from the F table (for n-1 numerator df and nT-n-K denominator df), we reject the null hypothesis.
Rejection of the null hypothesis favours the existence of individual effects in the data (and vice versa).
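The F ratio above is simple arithmetic once the two R-squared values are known. The sketch below only evaluates the formula; the function name and the R-squared inputs are hypothetical and chosen purely for illustration, they are not results from the investment example.

```python
def fixed_effects_f_test(r2_unrestricted, r2_restricted, n, T, K):
    """Restricted F statistic for H0: all n-1 group dummies are zero.

    n = number of cross-sectional units, T = periods per unit,
    K = number of explanatory variables excluding the constant.
    """
    df_num = n - 1
    df_den = n * T - n - K
    f_stat = ((r2_unrestricted - r2_restricted) / df_num) / (
        (1.0 - r2_unrestricted) / df_den
    )
    return f_stat, df_num, df_den


# Hypothetical R-squared values for a panel with 4 firms, 5 years, 2 regressors.
f_stat, df1, df2 = fixed_effects_f_test(0.90, 0.50, n=4, T=5, K=2)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The computed value is then compared with the critical value from the F table for the same two degrees of freedom.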

5.2. The case of time dimension:

This is another variant of FEM, in which only the time effect/dimension is present.
If we incorporate this effect, the panel regression model becomes
$$Y_{it} = \gamma_{1t} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (4)$$
In (4) we have an intercept term ($\gamma_{1t}$) specific to each time period $t$.
Technically, $\gamma_{1t}$ is called the time effect and is an unknown parameter to be estimated.
Model (4) is also known as a one-way FEM since only the time effect is present.
We can account for time effects by introducing time dummies (T-1 dummies, to avoid perfect collinearity) on the right-hand side of equation (4).
In that case we write (4) as
$$Y_{it} = \gamma_0 + \gamma_2\,\text{DYear2}_t + \gamma_3\,\text{DYear3}_t + \dots + \gamma_T\,\text{DYearT}_t + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (5)$$
where DYear2 is the dummy for year 2, DYear3 the dummy for year 3, and so on. DYear2 takes the value 1 for observations in year 2 and 0 otherwise, etc. The dummy for year 1 is omitted.
Assignment of time dummies: illustration for our example

Year   Firm ID   I        F         C        DYear1   DYear2   DYear3   DYear4   DYear5
1940   1         74.4     2132.2    186.6    1        0        0        0        0
1941   1         113      1834.1    220.9    0        1        0        0        0
1942   1         91.9     1588      287.8    0        0        1        0        0
1943   1         61.3     1749.4    319.9    0        0        0        1        0
1944   1         56.8     1687.2    321.3    0        0        0        0        1
1940   2         461.2    4643.9    207.2    1        0        0        0        0
1941   2         512      4551.2    255.2    0        1        0        0        0
1942   2         448      3244.1    303.7    0        0        1        0        0
1943   2         499.6    4053.7    264.1    0        0        0        1        0
1944   2         547.5    4379.3    201.6    0        0        0        0        1
1940   3         361.6    2202.9    254.2    1        0        0        0        0
1941   3         472.8    2380.5    261.4    0        1        0        0        0
1942   3         445.6    2168.6    298.7    0        0        1        0        0
1943   3         361.6    1985.1    301.8    0        0        0        1        0
1944   3         288.2    1813.9    279.1    0        0        0        0        1
1940   4         28.57    628.5     26.5     1        0        0        0        0
1941   4         48.51    537.1     36.2     0        1        0        0        0
1942   4         43.34    561.2     60.8     0        0        1        0        0
1943   4         37.02    617.2     84.4     0        0        0        1        0
1944   4         37.81    626.7     91.2     0        0        0        0        1

There is no dummy for the first time period in the regression (DYear1 is dropped), which means $\gamma_0$ represents the intercept for the first (omitted) time period.
The other $\gamma$s are differential intercept coefficients: they indicate how much the intercept of each period that is assigned a dummy differs from the intercept of the period that is not assigned a dummy.

The problem with time dummies is that they produce many explanatory variables, especially when the number of time periods is large. Again, the loss in degrees of freedom is the major casualty of this exercise.
How to decide on the statistical significance of time effects (or time dummies)? [OR] How can we choose between OLS regression results and FEM (with time effects) results?
We can use the restricted F test explained above, with T-1 restrictions in place of n-1.

5.3. The case of individual and time dimension:

The third variant of FEM incorporates both individual and time effects, and is called the two-way model.
If we incorporate these effects, the panel regression model becomes
$$Y_{it} = \beta_{1i} + \gamma_{1t} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (6)$$
How to incorporate both individual and time dummies in the data set? Follow the procedure explained above.
How to decide on the statistical significance of the model with both individual and time effects? [OR] How can we choose between FEM with individual effects and FEM with both individual and time effects?
We can use the restricted F test explained above.
The model with both individual and time effects is rarely used in practice, for two reasons.
First, the cost in terms of degrees of freedom is often not justified.
Second, in those instances in which a model of the time-wise evolution of the disturbances is desired, a more general model than the dummy variable formulation is usually used.

5.4. Regression on Time-demeaned data:

The major problems with one-way or two-way FEM with dummy variables are (a) they are impractical (is it so, considering the help rendered by software packages?) when the number of cross-sectional units or time periods is very large (for example, n = 1000 workers or T = 50 years); and (b) there is a cost in terms of the loss in degrees of freedom.
One way to overcome this is to run a pooled OLS regression on the time-demeaned (or entity-demeaned) variables. This method avoids the inclusion of dummies by transforming all the variables using group means. In other words, the philosophy underlying this method is to eliminate the influence of unobserved/individual effects prior to estimation.

To see what this method involves, consider the following model with a single explanatory variable. For each i,
$$y_{it} = \beta_1 x_{it} + a_i + u_{it}, \quad t = 1, 2, \dots, T \qquad (1)$$
where $a_i$ is the individual or unobserved effect.
Now, for each i, average equation (1) over time. We get
$$\bar{y}_i = \beta_1 \bar{x}_i + a_i + \bar{u}_i \qquad (2)$$
where $\bar{y}_i = T^{-1}\sum_{t=1}^{T} y_{it}$, and similarly for $\bar{x}_i$ and $\bar{u}_i$.
Because $a_i$ is fixed over time, it appears in both (1) and (2).
If we subtract (2) from (1) for each t, we get
$$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i, \quad t = 1, 2, \dots, T$$
or
$$\ddot{y}_{it} = \beta_1 \ddot{x}_{it} + \ddot{u}_{it}, \quad t = 1, 2, \dots, T \qquad (3)$$
where $\ddot{y}_{it} = y_{it} - \bar{y}_i$ is the time-demeaned data on y, and similarly for the other variables.
As a result of this transformation (called the fixed effects transformation), $a_i$ is eliminated. This suggests that we should estimate (3) by pooled OLS.
Note, however, that from (2) above we can recover the individual intercepts (the dummy coefficients) using the following formula:
$$\hat{a}_i = \bar{y}_i - \hat{\beta}_1 \bar{x}_i \qquad (4)$$
Note that OLS on (3) should not have a constant term (Why?). Even if we generate a constant, it is arbitrary.
Illustration: elimination of unobserved (individual) effects via the fixed effects transformation

                        Actual data                                        Time-demeaned data
Time            Firm    I       F          C         D1   D2   D3   D4    I         F          C
1               1       74.4    2132.2     186.6     1    0    0    0     -5.08     334.02     -80.7
2               1       113     1834.1     220.9     1    0    0    0     33.52     35.92      -46.4
3               1       91.9    1588       287.8     1    0    0    0     12.42     -210.2     20.5
4               1       61.3    1749.4     319.9     1    0    0    0     -18.18    -48.78     52.6
5               1       56.8    1687.2     321.3     1    0    0    0     -22.68    -111       54
Group sum       1       397     8990.9     1336.5    5
Group average   1       79.5    1798.18    267.3     1

1               2       461     4643.9     207.2     0    1    0    0     -32.46    469.46     -39.16
2               2       512     4551.2     255.2     0    1    0    0     18.34     376.76     8.84
3               2       448     3244.1     303.7     0    1    0    0     -45.66    -930.3     57.34
4               2       500     4053.7     264.1     0    1    0    0     5.94      -120.7     17.74
5               2       548     4379.3     201.6     0    1    0    0     53.84     204.86     -44.76
Group sum       2       2468    20872.2    1231.8         5
Group average   2       494     4174.44    246.36         1

1               3       362     2202.9     254.2     0    0    1    0     -24.36    92.7       -24.84
2               3       473     2380.5     261.4     0    0    1    0     86.84     270.3      -17.64
3               3       446     2168.6     298.7     0    0    1    0     59.64     58.4       19.66
4               3       362     1985.1     301.8     0    0    1    0     -24.36    -125.1     22.76
5               3       288     1813.9     279.1     0    0    1    0     -97.76    -296.3     0.06
Group sum       3       1930    10551      1395.2              5
Group average   3       386     2110.2     279.04              1

1               4       28.6    628.5      26.5      0    0    0    1     -10.48    34.36      -33.32
2               4       48.5    537.1      36.2      0    0    0    1     9.46      -57.04     -23.62
3               4       43.3    561.2      60.8      0    0    0    1     4.29      -32.94     0.98
4               4       37      617.2      84.4      0    0    0    1     -2.03     23.06      24.58
5               4       37.8    626.7      91.2      0    0    0    1     -1.24     32.56      31.38
Group sum       4       195     2970.7     299.1                    5
Group average   4       39.1    594.14     59.82                    1

Note: the I values of firms 2 to 4 and the group sums and averages of I are shown rounded; the time-demeaned values are computed from the unrounded data given earlier.

The regression on these time-demeaned variables produces the following result:
$$\ddot{I}_{it} = 0.0651\,\ddot{F}_{it} + 0.0558\,\ddot{C}_{it}$$

Now, using the coefficient estimates of F and C, we can compute the group intercepts (dummy coefficients) by applying (4) above. For example, the intercept for firm 1 is computed as
= 79.5 - [0.0651 (1798.18) + 0.0558 (267.3)]
= 79.5 - [117.062 + 14.915]
= -52.477
The fixed effects transformation is also called the within transformation. It can be carried out with a single command in the STATA econometrics software package.
Since the within transformation (or within effect model) does not use the dummies, a plain OLS run on the demeaned data counts nT-K degrees of freedom instead of nT-n-K. It therefore reports a smaller MSE (mean square error) and incorrect (too small) standard errors of the parameters compared with LSDV. Also, the R-squared of the within effect model (which is small compared with that of the LSDV model) is not correct because the intercept is suppressed.
A pooled OLS estimator that is based on the time-demeaned variables is called the fixed effects estimator or the within estimator. The latter name comes from the fact that OLS on (3) uses the time variation in y and x within each cross-sectional observation.
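The within transformation and the recovery of the firm intercepts through formula (4) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the STATA routine mentioned above: the helper name `demean_by_group` is hypothetical, and the data are the 20 unrounded observations from the investment example.

```python
import numpy as np

# Investment (I), value of firm (F) and capital stock (C): 4 firms x 5 years.
I = np.array([74.4, 113, 91.9, 61.3, 56.8, 461.2, 512, 448, 499.6, 547.5,
              361.6, 472.8, 445.6, 361.6, 288.2, 28.57, 48.51, 43.34, 37.02, 37.81])
F = np.array([2132.2, 1834.1, 1588, 1749.4, 1687.2, 4643.9, 4551.2, 3244.1, 4053.7, 4379.3,
              2202.9, 2380.5, 2168.6, 1985.1, 1813.9, 628.5, 537.1, 561.2, 617.2, 626.7])
C = np.array([186.6, 220.9, 287.8, 319.9, 321.3, 207.2, 255.2, 303.7, 264.1, 201.6,
              254.2, 261.4, 298.7, 301.8, 279.1, 26.5, 36.2, 60.8, 84.4, 91.2])
firm = np.repeat(np.arange(1, 5), 5)


def demean_by_group(values, groups):
    """Subtract each group's own time average from its observations."""
    out = np.asarray(values, dtype=float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = out[mask] - out[mask].mean(axis=0)
    return out


X = np.column_stack([F, C])
y_dm = demean_by_group(I, firm)
X_dm = demean_by_group(X, firm)

# Pooled OLS on the demeaned data, with no constant term.
beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)

# Formula (4): a_i = ybar_i - xbar_i' beta, one intercept per firm.
a_hat = np.array([I[firm == g].mean() - X[firm == g].mean(axis=0) @ beta
                  for g in np.unique(firm)])

print("within slopes on F and C:", beta.round(4))
print("recovered firm intercepts:", a_hat.round(2))
```

Note that the standard errors from such a hand-rolled regression would still need the degrees-of-freedom correction (nT-n-K) discussed above.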
The method of time-demeaned data can be extended to time-period dummies as well. In the presence of a time effect, the within effect model involves 3 steps: (i) compute the time average of all variables (i.e. take the average of each variable across units for each time period, instead of group-wise); (ii) compute the deviation of the time-wise observations from the time average; and (iii) use (ii) for running OLS.
Illustration for the above data

Time                 Firm    I       F         C
1                    1       74.4    2132.2    186.6
1                    2       461     4643.9    207.2
1                    3       362     2202.9    254.2
1                    4       28.6    628.5     26.5
Average for year 1           ***     ***       ***

The major defect of this procedure is that, in the process of eliminating the individual effects, any other explanatory variable that is constant over time for all i gets swept away.
Hence, we cannot include (i.e. find estimates of) variables such as gender or a city's distance from a river as explanatory variables in the model. In such a case, we should use the random effects model.
There is yet another way to estimate the fixed effects model, called the between group effect model. This model uses aggregate information, namely the group means of the variables (i.e. equation 2 above). In other words, the unit of analysis is not an individual observation but a group or subject. The number of observations falls from nT to n.
Illustration for the above data:

Firm               I       F          C
Group 1 average    79.5    1798.18    267.3
Group 2 average    494     4174.44    246.36
Group 3 average    386     2110.2     279.04
Group 4 average    39.1    594.14     59.82

This group-mean regression produces goodness-of-fit measures and parameter estimates that differ from those of the LSDV and within effect models.
However, the between estimator ignores important information on how the variables change over time.
In the presence of a time effect, the between effect model involves regressing the time means of the dependent variable (i.e. the average of the variable for each time period, instead of group-wise) on those of the independent variables.
Note that the between effect model is not valid in the case of the two-way fixed effects model (why?).

5.5. Dummy variable Regression Vs Regression on Time-demeaned data

Dummy variable regression is considered as a traditional way of applying fixed
effects panel regression.
If the researcher sees the unobserved/individual effect as a parameter to be estimated
for each i then dummy variable regression method is fine.
Anyway, the estimated intercepts from the dummy variable analysis are of only
occasional interest. Example: if we want to pick a particular firm or city to see
whether its dummy intercept coefficient is above or below average value in the
sample.
Also, an interesting feature of dummy variable regression is that it gives us exactly
the same estimates of the explanatory variables that we would obtain from the
regression on time-demeaned data.
But, one major benefit of the dummy variable regression is that the R-squared from
such regression is usually rather high. This occurs because we are including a
dummy variable for each cross-sectional unit, which explains much of the variation in
the data.
On the other hand R-squared of the regression on time-demeaned data would be low.
But this method gains more degrees of freedom compared to dummy variable
regression.
An examination of the results of the software packages GRETL and LIMDEP reveals that both use dummy variable regression to estimate the fixed effects panel model.

6. Random Effects Model (REM)

6.1. Why or when REM?
FEM allows the unobserved effect to be correlated with one or more of the explanatory variables (i.e. the Xs).
If instead we assume that the unobserved effect is uncorrelated with every explanatory variable, then the random effects model (REM) is more attractive or appropriate.
When the unobserved effect is thought to be uncorrelated with the explanatory variables, it can be left in the error term, and the resulting serial correlation over time can be handled by GLS estimation.

6.2. Illustration
Let us consider the following one-way (individual effect) FEM:
$$Y_{it} = \beta_{1i} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (1)$$
Suppose in this case the sampled cross-sectional units were drawn from a large population (say, the case of a longitudinal data set) [OR] the cross-sectional units included in our sample are a drawing from a much larger universe of such units.
In such cases we need to treat $\beta_{1i}$ as a random variable, like $u_{it}$, and not as an individual-specific constant that is fixed over time.
If we treat $\beta_{1i}$ as a random variable, it takes the following form:
$$\beta_{1i} = \beta_1 + \varepsilon_i, \quad i = 1, 2, 3, \dots, N \qquad (2)$$
where $\beta_1$ is the mean of the intercepts ($\beta_{1i}$) of all cross-sectional units and $\varepsilon_i$ is a random error term that characterizes the $i$th unit and is constant through time. What (2) essentially implies is that the intercept of an individual cross-sectional unit is the mean of the intercepts of all cross-sectional units plus or minus an error term (the error term represents the random deviation of the individual intercept from the mean of the intercepts of all cross-sectional units). Hence, the individual differences in the intercept value of each company are reflected in the error term $\varepsilon_i$.
Now, substituting (2) into (1), we obtain:
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + \varepsilon_i + u_{it}$$
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + w_{it} \qquad (3)$$
where
$$w_{it} = \varepsilon_i + u_{it} \qquad (4)$$
Note that the composite error term $w_{it}$ consists of two components: $\varepsilon_i$, which is the cross-section or individual-specific error component, and $u_{it}$, which is the combined time series and cross-section error component. $\varepsilon_i$ is assumed to be independent of $u_{it}$.
Since the composite error term consists of two or more (if we include time effects) error terms, REM is also called the error components model.
Notice carefully the difference between FEM and REM. In FEM each cross-sectional unit has its own (fixed) intercept value. In REM, on the other hand, the intercept $\beta_1$ represents the mean value of all the cross-sectional intercepts, and the error component $\varepsilon_i$ represents the random deviation of the individual intercept from this mean value.

Now, since $\varepsilon_i$ is in the composite error in each time period, the $w_{it}$ are serially correlated across time.
That is, correlation$(w_{it}, w_{is}) = \sigma_\varepsilon^2 / (\sigma_\varepsilon^2 + \sigma_u^2)$ for $t \neq s$. Here, $\sigma_\varepsilon^2$ = Variance($\varepsilon_i$) and $\sigma_u^2$ = Variance($u_{it}$).
This serial correlation problem can be solved by making use of the GLS method. As the usual pooled OLS standard errors ignore this correlation, they will not be correct.
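The correlation formula follows in two short steps. The derivation below is a sketch that assumes, in addition to the independence of $\varepsilon_i$ and $u_{it}$ stated in these notes, that $u_{it}$ has constant variance and is itself uncorrelated across time periods; both error components are taken to have mean zero.

```latex
\begin{aligned}
\operatorname{Var}(w_{it}) &= \operatorname{Var}(\varepsilon_i) + \operatorname{Var}(u_{it})
  = \sigma_\varepsilon^2 + \sigma_u^2, \\
\operatorname{Cov}(w_{it}, w_{is}) &= E\big[(\varepsilon_i + u_{it})(\varepsilon_i + u_{is})\big]
  = E[\varepsilon_i^2] = \sigma_\varepsilon^2, \qquad t \neq s, \\
\operatorname{Corr}(w_{it}, w_{is}) &= \frac{\operatorname{Cov}(w_{it}, w_{is})}{\operatorname{Var}(w_{it})}
  = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + \sigma_u^2}.
\end{aligned}
```

The correlation is the same for every pair of periods, however far apart, because the only source of dependence is the time-constant component $\varepsilon_i$.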
Deriving the GLS transformation that eliminates serial correlation in the errors requires sophisticated matrix algebra, but the transformation itself is simple.
Define
$$\lambda = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_\varepsilon^2}}$$
Here $\lambda$ lies between zero and one. The transformed equation then turns out to be
$$y_{it} - \lambda\bar{y}_i = \beta_1(1-\lambda) + \beta_2(x_{2it} - \lambda\bar{x}_{2i}) + \beta_3(x_{3it} - \lambda\bar{x}_{3i}) + (w_{it} - \lambda\bar{w}_i) \qquad (5)$$
where the overbar denotes time averages.

This is a very interesting equation, as it involves quasi-demeaned data on each variable.
The (random effects) transformation subtracts a fraction of the group/time average from each variable, where the fraction depends on $\sigma_u^2$, $\sigma_\varepsilon^2$ and the number of time periods, T.
In contrast, the fixed effects estimator subtracts the full time or group average from the corresponding variable.
Thus, the GLS estimator is simply the pooled OLS regression on the quasi-demeaned data [equation (5)].
The transformation in (5) allows explanatory variables that are constant over time to be included in the regression. This is because quasi-time-demeaning removes only a fraction of the group or time average, not the whole average.

This is one major advantage of the random effects model over the fixed effects model.
Equation (5) allows us to relate the RE estimator to both pooled OLS and fixed effects.
Pooled OLS is obtained when $\lambda = 0$, and FE is obtained when $\lambda = 1$.
In practice, the estimate $\hat{\lambda}$ is never exactly zero or one. But if $\hat{\lambda}$ is close to zero, the RE estimates will be close to the pooled OLS estimates. This is the case when the unobserved effect $\varepsilon_i$ is relatively unimportant (because it has a small variance relative to $\sigma_u^2$).
It is more common for $\sigma_\varepsilon^2$ to be large relative to $\sigma_u^2$, in which case $\hat{\lambda}$ will be closer to unity.
As T gets large, $\hat{\lambda}$ tends to one, and this makes the RE and FE estimates very similar.
Thus the value of the estimated transformation parameter indicates whether the estimates are likely to be closer to the pooled OLS or to the fixed effects estimates.
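How the transformation parameter moves between the pooled OLS case (zero) and the fixed effects case (one) can be seen numerically. The sketch below assumes the usual definition, one minus the square root of the ratio of the idiosyncratic variance to that variance plus T times the individual-effect variance; the function name and the variance values are hypothetical and serve only as an illustration.

```python
import math


def re_transformation_parameter(sigma2_u, sigma2_eps, T):
    """Fraction of the group mean removed by the random effects transformation.

    sigma2_u   = variance of the idiosyncratic error u_it
    sigma2_eps = variance of the individual effect
    T          = number of time periods per unit
    """
    return 1.0 - math.sqrt(sigma2_u / (sigma2_u + T * sigma2_eps))


# No individual effect: the parameter is zero, i.e. pooled OLS.
print(re_transformation_parameter(1.0, 0.0, T=5))

# Hypothetical variances: a more important individual effect or a longer panel
# pushes the parameter towards one, i.e. towards the fixed effects estimator.
for T in (2, 5, 20, 100):
    print(T, round(re_transformation_parameter(1.0, 3.0, T), 3))
```

In applied work the two variances are not known and have to be estimated first, so the parameter used in the regression is itself an estimate.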
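The REM with a time effect and the REM with both individual and time effects are straightforward extensions of the REM with an individual effect.
The REM with a time effect takes the following form:
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + \eta_t + u_{it}$$
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + v_{it}$$
where $v_{it} = \eta_t + u_{it}$ and $\eta_t$ is the time-specific error component.
The REM with both individual and time effects takes the following form:
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + \varepsilon_i + \eta_t + u_{it}$$
$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + W_{it}$$
where $W_{it} = \varepsilon_i + \eta_t + u_{it}$.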

6.3. Testing for Random effects model

[or Choosing between OLS and REM/FEM]:
Whether the data support the random effects model can be checked with the Lagrange multiplier (LM) test developed by Breusch and Pagan (the Breusch-Pagan LM test). The test procedure is as follows:
Set H0: the variance of the group effects is zero, i.e. $\sigma_\varepsilon^2 = 0$ (or correlation$(w_{it}, w_{is}) = 0$). The two statements are equivalent because correlation$(w_{it}, w_{is}) = \sigma_\varepsilon^2 / (\sigma_\varepsilon^2 + \sigma_u^2)$ for $t \neq s$, which becomes zero when $\sigma_\varepsilon^2 = 0$.
Under the null hypothesis, LM is distributed as chi-squared with one degree of freedom (two degrees of freedom if the model has both individual and time effects).
If the calculated LM statistic exceeds the 5 percent critical value of chi-squared with one degree of freedom, which is 3.841 (with two degrees of freedom, for the two-way random effects model, the critical value is 5.991), we reject H0.
Rejection of H0 implies that the classical regression model (OLS) with a single constant term is inappropriate for the data. The result of the test is then to reject H0 in favour of the random effects model.
For the one-way random time effects model, the null hypothesis is that the variance component for time is zero, i.e. $\sigma_\eta^2 = 0$.
For the two-way random effects model, the null hypothesis is that the variance components for groups and time are both zero.
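These notes do not write out the LM statistic itself. For a balanced panel with individual effects only, its standard textbook form is nT/(2(T-1)) times the square of [sum over i of (sum over t of e_it)^2, divided by the sum over i and t of e_it^2, minus 1], where e_it are the pooled OLS residuals. The sketch below evaluates that formula; the function name and the residual values are hypothetical and purely illustrative.

```python
import numpy as np


def breusch_pagan_lm(residuals):
    """Breusch-Pagan LM statistic for one-way (individual) random effects.

    `residuals` holds pooled OLS residuals of a balanced panel, arranged with
    one row per cross-sectional unit and one column per time period.
    """
    e = np.asarray(residuals, dtype=float)
    n, T = e.shape
    # Squared sums of each unit's residuals, relative to the total sum of squares.
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return n * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2


# Hypothetical residuals for 3 units over 4 periods: each unit's residuals share
# a sign, which is the pattern an omitted individual effect would produce.
resid = np.array([[1.2, 0.8, 1.1, 0.9],
                  [-0.7, -1.0, -0.6, -0.9],
                  [0.1, -0.2, 0.2, -0.1]])
lm = breusch_pagan_lm(resid)
print("LM =", round(float(lm), 3), "| reject H0 at 5%:", bool(lm > 3.841))
```

The comparison value 3.841 is the chi-squared critical value with one degree of freedom quoted above.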

6.4. Choosing between FEM and REM: The Hausman Test

An inevitable question in panel data models is: which of the two models FEM &
REM should be used?
The random effect model is recommended under the following circumstances:
(a) If the key explanatory variable is constant over time (e.g. education
qualification), we cannot use FE to estimate its effects on dependent variable.
(b) If we are willing to assume the unobserved effect is uncorrelated with
explanatory variables then we can only use random effects model.
(c) If the key policy variable is set experimentally, then random effects would
be appropriate for estimating the coefficients. For example, for estimating the
effect of class size on class performance random effects would be appropriate
if each year children are randomly assigned to classes of different sizes.
(d) If we treat (or believe) our sample as a random drawings from a large
population (the case of a wide longitudinal data set), the REM has some
intuitive appeal.
(e) When number of cross-sectional units (N) is big, REM is preferred
The random effect model is recommended under the following circumstances:
(a) If we are willing to assume the unobserved effect is correlated with
explanatory variables then we can only use fixed effects model. In other

words, as fixed effects allow arbitrary correlation between unobserved effects

and the explanatory variables, while random effects does not, FE is widely
thought to be a more convincing tool for estimating ceteris paribus effects. In
most cases the regressors are themselves outcomes of choice processes and
likely to be correlated with unobserved effects.2
(b) In some applications of panel data methods, we cannot treat our sample as
a random sample from a large population, especially when the unit of
observation is a large geographical unit (say, states or provinces). Then, it
often makes sense to think of each unobserved effect (ai) as a separate
intercept to estimate for each cross-sectional unit. In this case, we use fixed
effects. Hence, FE is almost always much more convincing than RE for policy
analysis using aggregate data.
(c) If T (the number of time periods) is large and N (the number of cross-sectional
units) is small, there is likely to be little difference in the values of
the parameters estimated by the FEM and REM. Hence the choice here is
based on computational convenience. On this score, FEM may be preferable.
A ready-made solution to this choice problem is the Hausman specification test. This
test compares the fixed and random effects estimates under the null hypothesis that the
individual (unobserved) effects are uncorrelated with the other regressors in the
model.
If the Hausman test rejects the null hypothesis, the fixed effects model is preferred over
the random effects model (rejection implies that the unobserved effects are correlated with
at least one explanatory variable). If the null hypothesis is not rejected, the random
effects model is preferred because it is more efficient.
The null hypothesis is rejected if the calculated Hausman test statistic is higher than the
critical value from the chi-squared table (the Hausman test statistic has an asymptotic
chi-squared distribution) with k degrees of freedom (where k is the number of
time-varying explanatory variables in the model, excluding the constant term).
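The decision rule above can be sketched in code. The statistic is commonly written as H = (b_FE - b_RE)' [Var(b_FE) - Var(b_RE)]^(-1) (b_FE - b_RE). The following is a minimal illustration in plain Python, assuming the coefficient vectors and covariance matrices have already been estimated elsewhere. All numbers and function names are hypothetical, and the short table holds the standard 5% chi-squared critical values for 1 to 5 degrees of freedom.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; the statistic is undefined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


# 5% critical values of the chi-squared distribution, keyed by degrees of freedom.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def hausman(b_fe, b_re, v_fe, v_re):
    """Return (H, k) for slope estimates b_* and their covariance matrices v_*."""
    k = len(b_fe)
    diff = [f - r for f, r in zip(b_fe, b_re)]
    v_diff = [[v_fe[i][j] - v_re[i][j] for j in range(k)] for i in range(k)]
    weighted = solve_linear(v_diff, diff)   # (V_FE - V_RE)^(-1) * diff
    return sum(d * w for d, w in zip(diff, weighted)), k


# Hypothetical estimates for two time-varying regressors (assumed, not real data).
b_fe = [0.50, 1.20]
b_re = [0.40, 1.00]
v_fe = [[0.010, 0.0], [0.0, 0.020]]
v_re = [[0.006, 0.0], [0.0, 0.010]]

stat, k = hausman(b_fe, b_re, v_fe, v_re)
reject_null = stat > CHI2_CRIT_5PCT[k]
print(round(stat, 3), k, reject_null)   # 6.5 2 True
```

Here H = 6.5 exceeds the 5% critical value of 5.991 for k = 2, so the null is rejected and FE would be chosen. In finite samples the difference of the two covariance matrices may fail to be positive definite, in which case the statistic computed this way is not reliable.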

[2] Consider an example. Suppose we have a random sample of a large number of individuals and we want
to model their wage or earnings function. Suppose earnings are a function of education, work
experience, etc. Now if we let ai stand for innate ability, family background, etc., then when we model
the earnings function including ai, it is very likely to be correlated with education, for innate ability and
family background are often crucial determinants of education. As Wooldridge contends, "In many
applications, the whole reason for using panel data is to allow the unobserved effect [i.e., ai] to be
correlated with the explanatory variables."

6.5. FEM and REM: A Comparison

| Aspect | Fixed Effect Model | Random Effect Model |
|---|---|---|
| Role of dummies | Dummies are considered part of the intercept. | Dummies act as an error term. |
| Intercepts | Intercepts vary across groups and/or times. | There is one common intercept for all groups taken together. |
| Error variances | Error variances are constant. | Error variances vary across groups and/or times. |
| Focus | Examines group differences in intercepts. | Estimates variance components for groups and error, assuming the same intercept and slopes. The difference among groups (or time periods) lies in the variance of the error term. |
| Estimation | Uses least squares dummy variable (LSDV), within effect and between effect estimation methods. In short, OLS regressions with dummies are, in fact, fixed effect models. | Uses the generalized least squares (GLS) method. |
| Hypothesis test | Fixed effects are tested by the (incremental) F test. | Random effects are examined by the Lagrange Multiplier (LM) test. |
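
The two tests named in the comparison can be sketched numerically. The block below is a minimal illustration under stated assumptions: a balanced panel, residual sums of squares already obtained from the pooled OLS and LSDV regressions, and the Breusch-Pagan form of the LM statistic. The function names and all input numbers are hypothetical.

```python
def incremental_f(rss_pooled, rss_lsdv, n_units, n_periods, k_slopes):
    """F statistic for H0: all unit intercepts are equal (pooled OLS vs LSDV).

    k_slopes counts the slope regressors, excluding the constant.
    Returns (F, numerator df, denominator df).
    """
    df_num = n_units - 1
    df_den = n_units * n_periods - n_units - k_slopes
    f_stat = ((rss_pooled - rss_lsdv) / df_num) / (rss_lsdv / df_den)
    return f_stat, df_num, df_den


def breusch_pagan_lm(residuals_by_unit):
    """LM statistic for H0: the variance of the unit effect is zero.

    residuals_by_unit holds pooled OLS residuals, one inner list per unit,
    each of the same length T. Under H0 the statistic is asymptotically
    chi-squared with 1 degree of freedom.
    """
    n_units = len(residuals_by_unit)
    n_periods = len(residuals_by_unit[0])
    sum_sq_unit_totals = sum(sum(e) ** 2 for e in residuals_by_unit)
    sum_sq_resid = sum(x ** 2 for e in residuals_by_unit for x in e)
    scale = n_units * n_periods / (2 * (n_periods - 1))
    return scale * (sum_sq_unit_totals / sum_sq_resid - 1) ** 2


# Hypothetical inputs: N = 5 units, T = 10 periods, 2 slope regressors.
f_stat, df_num, df_den = incremental_f(120.0, 80.0, 5, 10, 2)
print(f_stat, df_num, df_den)   # 5.375 4 43

# Hypothetical pooled OLS residuals for N = 2 units and T = 2 periods.
lm_stat = breusch_pagan_lm([[1.0, 1.0], [-1.0, -1.0]])
print(lm_stat)                  # 2.0
```

A large F relative to the F(N - 1, NT - N - k) critical value favours fixed effects over pooled OLS. A large LM relative to the chi-squared(1) critical value favours random effects over pooled OLS.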
