Econometrics
Advanced Panel Data Techniques
Advanced Panel Data Topics
- Fixed Effects estimation
- STATA stuff: xtreg
- Autocorrelation/Cluster correction

But first! Review of heteroskedasticity
Probably won't get to details:
- Random Effects estimation
- Hausman test
- Other kinds of panel data
Panel Data with two periods
Notation:

    yit = β0 + δ0d2t + β1xit1 + … + βkxitk + ai + uit

- d2t: dummy for t = 2 (intercept shift)
- Subscripts: person (firm, etc.) i … in period t; the third subscript is the variable #
- νit = ai + uit: composite error
- ai = time-constant component of the composite error
  - ai = "person effect" (etc.); has no "t" subscript
  - All unobserved influences which are fixed for a person over time (e.g., "ability")
- uit = "idiosyncratic error"
Fixed Effects Estimation
Two periods of data. The population model is

    yit = β0 + δ0d2t + β1xit1 + … + βkxitk + ai + uit

- ai is unknown, but we could estimate it…
- Estimate âi by including a dummy variable for each individual, i!
- For example, in a dataset with 46 cities each observed in two years (1982, 1987) we would have 45 dummies (each equal to one for only two observations)
- d2t is a dummy for the later year (e.g., 1987)
    crmrte   unem   d87   dcity1   dcity2   …   dcity45
     73.3    14.9    0       1        0     …      0
     63.7     7.7    1       1        0     …      0
    169.3     9.1    0       0        1     …      0
    164.5     2.4    1       0        1     …      0
     96.1    11.3    0       0        0     …      0
    120.0     3.9    1       0        0     …      0
    116.3     5.3    0       0        0     …      0
    169.5     4.6    1       0        0     …      0
      …        …     …       …        …     …      …
     70.8     6.9    0       0        0     …      1
     72.5     6.2    1       0        0     …      1
Fixed Effects Estimation
We are essentially estimating:

    yit = β0 + δ0d2t + β1xit1 + … + βkxitk + a1d(i=1) + a2d(i=2) + … + a45d(i=45) + uit

- But, for short, we just write

    yit = β0 + δ0d2t + β1xit1 + … + βkxitk + ai + uit

- The estimated âi are the coefficients on these dummy variables
- These are called "fixed effects"
- The dummies control for anything – including unobservable characteristics – about an individual which is fixed over time
More on fixed effects…
- In two-period data (only), including fixed effects is equivalent to differencing
- That is, either way you should get exactly the same slope estimates
- Can see this by differencing the equations for predicted values:

    Period 2: ŷi2 = b0 + d0·1 + b1xi21 + … + bkxi2k + âi
    Period 1: ŷi1 = b0 + d0·0 + b1xi11 + … + bkxi1k + âi
    Diff:     Δŷi = d0 + b1Δxi1 + … + bkΔxik

- The intercept in the differenced equation is the same as the coefficient on the year dummy
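The equivalence above can be checked numerically. The sketch below (an illustration in Python rather than Stata, with made-up data) runs the dummy-variable fixed effects regression and the first-differenced regression on the same two-period panel; the slope on x and the year-dummy coefficient come out identical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                   # individuals, two periods each
a = rng.normal(size=n)                  # unobserved "person effects"
x = rng.normal(size=(n, 2))             # x_{i1}, x_{i2}
# true model: beta1 = 2.0, year-dummy coefficient delta0 = 0.5
y = 1.0 + 0.5 * np.array([0, 1]) + 2.0 * x + a[:, None]

# (1) Dummy-variable (fixed effects) regression: y on d2 and one dummy per person
d2 = np.tile([0, 1], n)
dummies = np.repeat(np.eye(n), 2, axis=0)       # rows ordered (i=1,t=1),(i=1,t=2),...
X_fe = np.column_stack([d2, x.ravel(), dummies])
coef_fe, *_ = np.linalg.lstsq(X_fe, y.ravel(), rcond=None)

# (2) First-differenced regression: delta-y on a constant and delta-x
dy, dx = y[:, 1] - y[:, 0], x[:, 1] - x[:, 0]
coef_fd, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), dx]), dy, rcond=None)

print(coef_fe[1], coef_fd[1])   # same slope on x
print(coef_fe[0], coef_fd[0])   # year-dummy coeff = intercept in differences
```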
Fixed Effects in STATA: three ways
1. Create dummies for every individual and include them in your regression
   - E.g.: tab city, gen(citydummy)
2. The "areg" command does the same, without showing the dummy coefficients (the "fixed effects") [we don't usually care anyway!]
   - a = "absorb" the fixed effects
   - Syntax: areg y x yrdummies, absorb(city)
     (absorb() takes the variable identifying the cross-sectional units)
3. xtreg …, fe (below)
Fixed effects regression

. areg crmrte unem d87, absorb(area) robust

Linear regression, absorbing indicators         Number of obs   =          92
                                                F(2, 44)        =        4.50
                                                Prob > F        =      0.0166
                                                R-squared       =      0.8909
                                                Adj R-squared   =      0.7743
                                                Root MSE        =      14.178

------------------------------------------------------------------------------
             |               Robust
      crmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        unem |      2.218   .8155056     2.72   0.009     .5744559    3.861543
         d87 |    15.4022   5.178907     2.97   0.005     4.964803     25.8396
       _cons |   75.40837   8.916109     8.46   0.000     57.43913     93.3776
-------------+----------------------------------------------------------------
        area |   absorbed                                      (46 categories)
------------------------------------------------------------------------------
First difference regression
(c = "change" = Δ)

. reg ccrmrte cunem, robust

Linear regression                               Number of obs   =          46
                                                F(1, 44)        =        7.40
                                                Prob > F        =      0.0093
                                                R-squared       =      0.1267
                                                Root MSE        =      20.051

------------------------------------------------------------------------------
             |               Robust
     ccrmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cunem |   2.217999   .8155056     2.72   0.009     .5744559    3.861543
       _cons |    15.4022   5.178907     2.97   0.005     4.964803     25.8396
------------------------------------------------------------------------------

Same as the coefficients on unem and d87 in the fixed effects estimates!
Fixed effects vs. first differences
- First differences and fixed effects (f.e.) are equivalent only when there are exactly two periods of data
- When there are more than two periods of data, f.e. is equivalent to "demeaning" the data:

    yit = β0 + β1xit1 + … + ai + uit

- Individuals' means over t:

    ȳi = β0 + β1x̄i1 + … + ai + ūi

- Difference the two…
- Fixed effects model:

    (yit − ȳi) = β1(xit1 − x̄i1) + … + (uit − ūi)
Fixed Effects vs. First Differences

    (yit − ȳi) = β1(xit1 − x̄i1) + … + (uit − ūi)

- Textbook writes this as ÿit = β1ẍit1 + … + üit, i.e. where ÿit = (yit − ȳi), etc.
- Also known as the "within" estimator
- Idea: only using variation "within" individuals (or other cross-sectional units) over time, and not the variation "between" individuals
- The "between" estimator, in contrast, uses just the means, and none of the variation over time:

    ȳi = β0 + β1x̄i1 + … + ūi
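The within ("demeaning") estimator can be checked against the dummy-variable version numerically. This sketch (an illustration on made-up T = 3 data, not from the slides) shows the two give the same slope:

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 4, 3
a = rng.normal(size=n)                          # person effects
x = rng.normal(size=(n, T))
y = 2.0 * x + a[:, None] + 0.1 * rng.normal(size=(n, T))

# Within estimator: subtract each individual's mean over t, then regress
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
b_within = (xd @ yd) / (xd @ xd)

# Dummy-variable estimator: y on x plus one dummy per individual
X = np.column_stack([x.ravel(), np.repeat(np.eye(n), T, axis=0)])
b_dummy = np.linalg.lstsq(X, y.ravel(), rcond=None)[0][0]

print(b_within, b_dummy)   # identical
```

The equality is the Frisch-Waugh result: partialling out the individual dummies is exactly the demeaning transformation.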
With T>2… First Differences vs Fixed Effects
- F.E. estimation is more common than differences
  - Probably because it's easier to do (no "differencing" required), not necessarily because it's better
- Advantages:
  - Fixed effects is easily implemented for "unbalanced" panels (not all individuals are observed in all periods)
  - Also pooled cross-sections: don't need to see the same 'individuals' over time, just, e.g., the same cities
- Fixed effects estimation is more efficient than differencing if there is no autocorrelation in the uit's
  - Intuition: first differences (estimating in changes) removes more of the individual variation over time than fixed effects
Aside: fixed effects using the "xtreg" command
- In STATA, there is a powerful set of commands, beginning with "xt," which allow you to carry out many panel techniques (fixed effects, random effects – below)
- Step 0 in using these commands: tell STATA the names of the variables that contain…
  - The cross-sectional unit of observation ("X") – e.g., city, person, firm
  - The time-series (period) unit of observation ("T") – e.g., year
- Command is: xtset xsecvar timevar
xtset:

. xtset area year, delta(5)
       panel variable:  area (strongly balanced)
        time variable:  year, 82 to 87
                delta:  5 units

- "area" is the cross-section unit variable
- "year" is the year variable
- delta(5) is an option that tells STATA that a one-unit change in time is 5 years (in this case – 82 to 87)
- After that, can run xtreg…
. xtreg crmrte unem d87, fe

Fixed-effects (within) regression               Number of obs      =        92
Group variable: area                            Number of groups   =        46

R-sq:  within  = 0.1961                         Obs per group: min =         2
       between = 0.0036                                        avg =       2.0
       overall = 0.0067                                        max =         2

                                                F(2,44)            =      5.37
corr(u_i, Xb)  = -0.1477                        Prob > F           =    0.0082

------------------------------------------------------------------------------
      crmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        unem |      2.218   .8778658     2.53   0.015     .4487771    3.987222
         d87 |    15.4022   4.702117     3.28   0.002      5.92571     24.8787
       _cons |   75.40837   9.070542     8.31   0.000     57.12789    93.68884
-------------+----------------------------------------------------------------
     sigma_u |  28.529804
     sigma_e |  14.178068
         rho |  .80194672   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(45, 44) =     7.97             Prob > F = 0.0000
First Differences
- After you run the xtset command, you can also do first differences with a "D." in front of any variable
- Note it is reg, not xtreg: reg D.crmrte D.unem

. reg D.crmrte D.unem

      Source |       SS       df       MS              Number of obs =      46
-------------+------------------------------           F(1, 44)      =    6.38
       Model |  2566.43744     1  2566.43744           Prob > F      =  0.0152
    Residual |  17689.5497    44  402.035219           R-squared     =  0.1267
-------------+------------------------------           Adj R-squared =  0.1069
       Total |  20255.9871    45  450.133047           Root MSE      =  20.051

------------------------------------------------------------------------------
    D.crmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        unem |
         D1. |   2.217999   .8778658     2.53   0.015     .4487771    3.987222
       _cons |    15.4022   4.702117     3.28   0.002      5.92571     24.8787
------------------------------------------------------------------------------
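For readers working outside Stata, the same within-unit first-differencing can be sketched in pandas (an illustration with a few made-up rows in the style of the crime-rate data; column names are ours):

```python
import pandas as pd

df = pd.DataFrame({
    "area":   [1, 1, 2, 2],
    "year":   [82, 87, 82, 87],
    "crmrte": [73.3, 63.7, 169.3, 164.5],
    "unem":   [14.9, 7.7, 9.1, 2.4],
})

# Sort so that diff() runs in time order within each cross-sectional unit,
# then difference within unit -- the analogue of Stata's D. prefix
df = df.sort_values(["area", "year"]).reset_index(drop=True)
df["d_crmrte"] = df.groupby("area")["crmrte"].diff()
df["d_unem"] = df.groupby("area")["unem"].diff()

print(df.dropna())   # each unit's first period has no lag, so its diff is NaN
```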
Autocorrelation

    yit = β0 + β1xit1 + … + ai + uit,    νit = ai + uit  (composite error term)

- The model implies autocorrelation = errors correlated across periods
  - ai is perfectly correlated over time, for example
  - (the uit's may also have autocorrelation)
  - We continue to assume no error correlation between different individuals
- Consequences similar to heteroskedasticity:
  - OLS calculates biased standard errors
  - OLS is inefficient (not "BLUE")
Aside: se formula derived…
(bivariate case; N = # of people, T = # of time periods)

    β̂1 − E(β̂1) = Σit xit·νit / Σit xit²

    var(β̂1) = E[(β̂1 − E(β̂1))²]
             = [ x11²·E(ν11²) + … + xNT²·E(νNT²)
                 + 2·Σi xi1·xi2·Cov(νi1, νi2)
                 + …  (analogous terms for other time periods, if T>2) ]
               / (Σit xit²)²

OLS standard errors are calculated assuming:
- Homoskedasticity: E(ν11²) = … = E(νNT²) = σ²
- No autocorrelation: Cov (between different ν's) = 0
1. "cluster" correction
- To correct OLS se's for the possibility of correlation of errors over time, "cluster"
- Clustering on the "person" variable (i) adds the error interaction terms from above to the se calculation, with residuals ν̂ (ν̂i2 = residual for person i, period 2):

    [ Σit xit²·ν̂it² + 2·Σi xi1·xi2·ν̂i1·ν̂i2 + … ] / (Σit xit²)²

- This provides consistent se's if you have a large # of people to average over (N big)
  - The true se formula has cov(νi1, νi2); averaged over a large # of people, the distinction is not important
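The clustered variance above can be written compactly: sum xit·ν̂it within each person first, then square. This sketch (made-up numbers, bivariate case) checks that the compact form equals the slide's "squared terms plus interactions" form:

```python
import numpy as np

x = np.array([[0.5, 0.7], [-1.0, -0.3], [0.8, 1.1]])   # x_{it}, two periods
v = np.array([[0.2, 0.1], [-0.4, -0.1], [0.3, 0.2]])   # residuals v-hat_{it}

Sxx = (x ** 2).sum()
score_i = (x * v).sum(axis=1)                   # sum_t x_it * v-hat_it, per person
var_cluster = (score_i ** 2).sum() / Sxx ** 2

# Written out term by term as on the slide: squares plus 2x the interactions
var_terms = ((x * v) ** 2).sum() + 2 * (x[:, 0] * x[:, 1] * v[:, 0] * v[:, 1]).sum()

print(np.isclose(var_cluster, var_terms / Sxx ** 2))   # True
```

Because observations enter person by person, only within-person error correlation is accounted for, matching the assumption of no correlation between different individuals.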
"Cluster"
In STATA:

    reg crmrte lpolpc, cluster(area)

- cluster() takes the cross-sectional unit with autocorrelation across periods
- Usually makes standard errors larger, because cov(νi1, νi2) > 0
- Other intuition:
  - OLS treats every observation as independent information about the relationship
  - With autocorrelation, that's not true – observations are not independent
  - Estimates are effectively based on less information, so they should be less precise
With and without the "cluster" correction

. reg crmrte lpolpc d87, robust noheader
------------------------------------------------------------------------------
             |               Robust
      crmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lpolpc |   41.09728   9.527411     4.31   0.000     22.16652    60.02805
         d87 |   5.066153    5.78541     0.88   0.384    -6.429332    16.56164
       _cons |   66.44041   7.324693     9.07   0.000      51.8864    80.99442
------------------------------------------------------------------------------

. reg crmrte lpolpc d87, cluster(area) noheader
                             (Std. Err. adjusted for 46 clusters in area)
------------------------------------------------------------------------------
             |               Robust
      crmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lpolpc |   41.09728    12.5638     3.27   0.002     15.79249    66.40208
         d87 |   5.066153   3.027779     1.67   0.101    -1.032107    11.16441
       _cons |   66.44041    8.99638     7.39   0.000     48.32077    84.56005
------------------------------------------------------------------------------

- Coefficient estimates exactly the same – both are OLS
- Standard errors larger (here, on lpolpc)
2. Efficiency Correction = "Random Effects"
- "Random effects" is a data transformation to get rid of autocorrelation in the errors
  - Like WLS or feasible GLS, random effects transforms the data to produce errors w/o autocorrelation
  - The transformed data meet the Gauss-Markov assumptions, and so the estimates are efficient
- If the other Gauss-Markov assumptions hold, random effects will be unbiased and "BLUE"
- Important: random effects assumes ** ai is uncorrelated with the x's **
  - If not, random effects estimates are biased (o.v. bias!) and inconsistent
What is Random Effects?
(νit = ai + uit)

Define

    θ = 1 − [ σu² / (σu² + T·σa²) ]^(1/2)

- Interpretation: θ reflects the fraction of error variance due to factors fixed across periods
- If σu² = 0, then θ = 1
- "Quasi-demeaning" by this factor gets rid of error correlation across periods:

    yit − θȳi = β0(1 − θ) + β1(xit1 − θx̄i1) + … + βk(xitk − θx̄ik) + (νit − θν̄i)

- Slopes are theoretically the same (though OLS estimates may not be, if the errors are correlated with the X's)
- (Can show): the new composite error has no correlation across periods
What is Random Effects?

    yit − θȳi = β0(1 − θ) + β1(xit1 − θx̄i1) + … + βk(xitk − θx̄ik) + (νit − θν̄i)

- Also can be interpreted as a weighted average of OLS and fixed effects
  - If θ = 1 (error entirely fixed), then random effects = regular demeaning = fixed effects
  - If θ = 0 (error entirely unfixed – no autocorrelation) – pure OLS, ignoring the panel structure
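The two extremes can be checked numerically. This sketch (made-up data; x deliberately correlated with ai) applies the quasi-demeaning transformation and confirms that θ = 1 reproduces the within (fixed effects) slope while θ = 0 reproduces pooled OLS:

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 6, 3
a = rng.normal(size=n)                       # person effects
x = rng.normal(size=(n, T)) + a[:, None]     # x correlated with a
y = 2.0 * x + a[:, None] + 0.1 * rng.normal(size=(n, T))

def quasi_demeaned_slope(theta):
    """Regress (y - theta*ybar_i) on a constant and (x - theta*xbar_i)."""
    xs = (x - theta * x.mean(axis=1, keepdims=True)).ravel()
    ys = (y - theta * y.mean(axis=1, keepdims=True)).ravel()
    X = np.column_stack([np.ones(n * T), xs])
    return np.linalg.lstsq(X, ys, rcond=None)[0][1]

b_fe = quasi_demeaned_slope(1.0)    # = within (fixed effects) estimator
b_ols = quasi_demeaned_slope(0.0)   # = pooled OLS, ignoring panel structure
print(b_fe, b_ols)                  # differ here, since x is correlated with a
```

In practice θ is estimated from the variance components, so random effects lands somewhere between these two extremes.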
Example: panel of firms from 1987-89

              storage  display    value
variable name   type   format     label      variable label
------------------------------------------------------------------------------
year            int    %9.0g                 1987, 1988, or 1989   <- time variable
fcode           float  %9.0g                 firm code number      <- cross-section variable
employ          int    %9.0g                 # employees at plant
sales           float  %9.0g                 annual sales, $
avgsal          float  %9.0g                 average employee salary
scrap           float  %9.0g                 scrap rate (per 100 items)
rework          float  %9.0g                 rework rate (per 100 items)
tothrs          int    %9.0g                 total hours training
union           byte   %9.0g                 =1 if unionized
grant           byte   %9.0g                 = 1 if received grant
d89             byte   %9.0g                 = 1 if year = 1989
d88             byte   %9.0g                 = 1 if year = 1988
hrsemp          float  %9.0g                 tothrs/totrain

. xtset fcode year
More on xtreg in STATA
- After doing "xtset," "xtreg" works like "reg" but can do panel techniques
  - Random effects (default): xtreg y x1…, re
  - Fixed effects: xtreg y x1 x2…, fe
- xtreg can handle all the other stuff that we have used in "reg":
  - robust, cluster, weights, etc.
. xtset fcode year
. xtreg lscrap hrsemp lsales tothrs union d89 d88, re

Random-effects GLS regression                   Number of obs      =       135
Group variable: fcode                           Number of groups   =        47

R-sq:  within  = 0.2010                         Obs per group: min =         1
       between = 0.0873                                        avg =       2.9
       overall = 0.0977                                        max =         3

Random effects u_i ~ Gaussian                   Wald chi2(6)       =     25.50
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0003

------------------------------------------------------------------------------
      lscrap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hrsemp |   .0029383   .0042795     0.69   0.492    -.0054494     .011326
      lsales |  -.1935359   .1553334    -1.25   0.213    -.4979837    .1109119
      tothrs |  -.0060263   .0044724    -1.35   0.178    -.0147921    .0027395
       union |   .9214526   .4627995     1.99   0.046     .0143823    1.828523
         d89 |  -.3653513   .1191926    -3.07   0.002    -.5989645   -.1317381
         d88 |  -.0831037   .1130728    -0.73   0.462    -.3047223    .1385149
       _cons |   3.287955   2.326674     1.41   0.158    -1.272241    7.848152
-------------+----------------------------------------------------------------
Fixed vs. Random Effects
- Always remember: random effects is estimated assuming ai is uncorrelated with the X's
  - Unlike fixed effects, r.e. does not remove any omitted variables bias; it is just more "efficient," assuming there is no omitted variables bias
  - Put another way: random effects can only reduce standard errors, not bias
- The ai assumption is testable!
  - If it holds, fixed effects estimates should be statistically indistinguishable from r.e. estimates
Hausman test
- Hausman test intuition:
  - H0: cov(ai, xit) = 0; estimate with random effects, since it's the most efficient under this assumption
  - Then estimate with fixed effects, and if the coefficient estimates are significantly different, reject the null
- IMPORTANT: as always, failure to reject the null ≠ there is no bias in random effects
  - We never "accept the null" (we just lack sufficient evidence to reject it)
  - For example, both random effects and fixed effects could be biased by a similar amount; or the standard errors are just big
Hausman test
- More broadly, Hausman tests are specification tests comparing two estimators where…
  - One estimator is efficient (in this case, random effects) if the null hypothesis is true (cov[ai, xi] = 0)
  - One estimator is consistent (in this case, fixed effects) if the null hypothesis is false
- Related to the latter, an important caveat on this test and all Hausman tests:
  - We must assume (without being able to test!) that there is no omitted variables bias in the alternative (fixed effects) estimator
  - Reasoning: we need an unbiased estimate of the "true" slopes
  - If that's not true, the Hausman test tells us nothing
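The statistic behind the test is a quadratic form in the gap between the two estimators. This sketch computes it in numpy with made-up two-coefficient inputs (not the numbers from the Stata output; real covariance matrices would come from the two regressions):

```python
import numpy as np

b = np.array([0.0025, -0.1294])     # "consistent" (fixed effects) estimates
B = np.array([0.0029, -0.1935])     # "efficient" (random effects) estimates
# assumed coefficient covariance matrices for the two estimators
V_b = np.array([[2.0e-5, 0.0], [0.0, 4.3e-2]])
V_B = np.array([[1.8e-5, 0.0], [0.0, 2.4e-2]])

# Hausman statistic: H = (b-B)' [V_b - V_B]^(-1) (b-B),
# asymptotically chi-squared with len(b) degrees of freedom under H0
d = b - B
H = d @ np.linalg.inv(V_b - V_B) @ d
print(H)
```

If H is large relative to the chi-squared critical value, the two sets of estimates are "systematically" different and H0 (random effects is consistent) is rejected.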
Hausman test in STATA
How do you compare your fixed effects and random effects estimates? Steps:
1. Save your fixed effects and random effects estimates using the "estimates store" command (example below)
2. Feed them to the "hausman" command
   - The hausman command calculates standard errors on the differences in the whole list of coefficient estimates and tests whether they are jointly significantly different
Hausman test example: Does job training reduce "scrap" (error) rates: re or fe?

. xtset fcode year
. xtreg lscrap hrsemp lsales tothrs union d89 d88, re

Random-effects GLS regression                   Number of obs      =       135
Group variable: fcode                           Number of groups   =        47

R-sq:  within  = 0.2010                         Obs per group: min =         1
       between = 0.0873                                        avg =       2.9
       overall = 0.0977                                        max =         3

Random effects u_i ~ Gaussian                   Wald chi2(6)       =     25.50
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0003

------------------------------------------------------------------------------
      lscrap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hrsemp |   .0029383   .0042795     0.69   0.492    -.0054494     .011326
      lsales |  -.1935359   .1553334    -1.25   0.213    -.4979837    .1109119
      tothrs |  -.0060263   .0044724    -1.35   0.178    -.0147921    .0027395
       union |   .9214526   .4627995     1.99   0.046     .0143823    1.828523
         d89 |  -.3653513   .1191926    -3.07   0.002    -.5989645   -.1317381
         d88 |  -.0831037   .1130728    -0.73   0.462    -.3047223    .1385149
       _cons |   3.287955   2.326674     1.41   0.158    -1.272241    7.848152
-------------+----------------------------------------------------------------
Estimates store
- After any regression command, you can save the estimates for later. Purposes:
  - Look at the results again later (some regressions take a long time to estimate) – "estimates replay"
  - Feed them to another command (like hausman)
- Here, after estimating by random effects, type:

    . estimates store reff        -- stores estimates in "reff"

  (More generally: estimates store anyname)
Fixed effects:

. xtreg lscrap hrsemp lsales tothrs union d89 d88, fe

Fixed-effects (within) regression               Number of obs      =       135
Group variable: fcode                           Number of groups   =        47

R-sq:  within  = 0.2021                         Obs per group: min =         1
       between = 0.0016                                        avg =       2.9
       overall = 0.0224                                        max =         3

                                                F(5,83)            =      4.20
corr(u_i, Xb)  = -0.0285                        Prob > F           =    0.0019

------------------------------------------------------------------------------
      lscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hrsemp |   .0025011   .0044336     0.56   0.574    -.0063171    .0113194
      lsales |  -.1294226   .2084326    -0.62   0.536    -.5439866    .2851414
      tothrs |  -.0060857   .0046726    -1.30   0.196    -.0153793    .0032079
       union |  (dropped)
         d89 |  -.3692155   .1247804    -2.96   0.004    -.6173986   -.1210324
         d88 |  -.0841784   .1162988    -0.72   0.471    -.3154921    .1471353
       _cons |   2.627201    3.18534     0.82   0.412    -3.708313    8.962715
-------------+----------------------------------------------------------------

. estimates store feff
Hausman test STATA command
Syntax: hausman consistent_est efficient_est

. hausman feff reff

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     feff         reff         Difference          S.E.
-------------+----------------------------------------------------------------
      hrsemp |    .0025011     .0029383       -.0004372        .0011587
      lsales |   -.1294226    -.1935359        .0641133        .1389808
      tothrs |   -.0060857    -.0060263       -.0000594         .001353
         d89 |   -.3692155    -.3653513       -.0038642        .0369224
         d88 |   -.0841784    -.0831037       -.0010747        .0272022
------------------------------------------------------------------------------
           b = consistent under Ho and Ha; obtained from xtreg
           B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

              chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                      =     0.66
            Prob>chi2 =     0.9851

Large p-value: fail to reject H0
Fixed Effects or Random?
My view: Don't use random effects.
- Random effects is just an efficiency correction
  - The key assumption that fixed unobservables are uncorrelated with the x's is almost always implausible
  - My mantra: if you are worried about efficiency, you just don't have enough data
- Bottom line: just correct the standard errors using "cluster" and forget about efficiency
- Analog of my view on heteroskedasticity
Autocorrelation/heteroskedasticity: Problems and solutions

Problem                        Heteroskedasticity                  Autocorrelation
-----------------------------  ----------------------------------  ----------------------------------
OLS SE's biased                "robust" produces consistent SE's   "cluster" produces consistent SE's

OLS inefficient – could get    Weighted least squares (or          Random effects – not a good idea:
smaller SE's from same data    "feasible GLS") – not a good idea   requires the implausible assumption
                               except in cases when you know the   that there is no o.v. bias from
                               form of heteroskedasticity (prone   fixed unobservables
                               to manipulation)
Other Uses of Panel Methods
- It's possible to think of models where there is an unobserved fixed effect, even if we do not have true panel data
  - A common example: observe different members of the same family (but not necessarily over time)
  - Or individual plants of larger firms, etc.
- We think there is an unobserved family effect
  - Examples: difference siblings, twins, etc.
- Can estimate a "family fixed effect" model