Research Method
Lecture 7 (Ch14)
Pooled Cross Sections
and Simple Panel Data
Methods
1
An independently pooled
cross section
This type of data is obtained by sampling
randomly from a population at different
points in time (usually in different years)
You can pool the data from different years
and run regressions.
However, you usually include year
dummies.
2
Panel data
Panel data are also cross-section data
collected at different points in time.
However, panel data follow the same
individuals over time.
You can do a bit more with panel data
than with a pooled cross section.
You usually include year dummies as
well.
3
Pooling independent cross
sections across time.
As long as the data are collected independently,
pooling them over time causes little problem.
However, the distribution of the independent
variables may change over time. For example,
the distribution of education changes over time.
To account for such changes, you usually need
to include a dummy variable for each year (year
dummies), except for one year that serves as the
base year.
Often the coefficients on the year dummies are
of interest.
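As an illustration of the point above, here is a minimal Python sketch of building year dummies while leaving out a base year. The function name `year_dummies` and the sample years are hypothetical; the lecture itself works with dummies that already exist in the Stata data sets.

```python
def year_dummies(years, base_year):
    """Return {dummy_name: list of 0/1} for every year except base_year."""
    other_years = sorted(set(years) - {base_year})
    return {
        f"y{str(y)[-2:]}": [1 if obs_year == y else 0 for obs_year in years]
        for y in other_years
    }

# Hypothetical pooled sample: the survey year of each observation.
sample_years = [1972, 1974, 1974, 1976, 1972, 1976]
dummies = year_dummies(sample_years, base_year=1972)
print(dummies)
```

Leaving out the base year avoids perfect collinearity with the constant, and each dummy coefficient is then read relative to that year.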
4
Example 1
Suppose that you would like to see how the
fertility rate changes over time after
controlling for various characteristics.
The next slide shows the OLS estimates of the
determinants of fertility over time. (Data:
FERTIL1.dta)
The data are collected every other year.
The base year for the year dummies is 1972.
Note that the command on the next slide also
leaves out y78, so the reported year dummies
are measured against 1972 and 1978 together.
5
Dependent variable = # kids per woman

. reg kids educ age agesq black east northcen west farm othrural town smcity y74 y76 y80 y82 y84

      Source |       SS       df       MS              Number of obs =    1129
-------------+------------------------------           F( 16,  1112) =   10.33
       Model |  399.265559    16  24.9540975           Prob > F      =  0.0000
    Residual |  2686.24374  1112  2.41568682           R-squared     =  0.1294
-------------+------------------------------           Adj R-squared =  0.1169
       Total |   3085.5093  1128  2.73538059           Root MSE      =  1.5542

------------------------------------------------------------------------------
        kids |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |  -.1287556   .0183209    -7.03   0.000     -.164703   -.0928081
         age |    .535383   .1380659     3.88   0.000      .264484    .8062821
       agesq |  -.0058384    .001561    -3.74   0.000    -.0089013   -.0027756
       black |   1.077747   .1733806     6.22   0.000     .7375571    1.417937
        east |   .2180929   .1327211     1.64   0.101     -.042319    .4785049
    northcen |   .3616071   .1207846     2.99   0.003     .1246157    .5985984
        west |   .1989796   .1668093     1.19   0.233    -.1283168    .5262761
        farm |  -.0553556    .146947    -0.38   0.706    -.3436803    .2329692
    othrural |  -.1662171   .1751486    -0.95   0.343    -.5098761     .177442
        town |   .0825938    .124396     0.66   0.507    -.1614836    .3266712
      smcity |   .2092197   .1600797     1.31   0.191    -.1048727    .5233121
         y74 |    .301226   .1488953     2.02   0.043     .0090786    .5933735
         y76 |  -.0639849   .1556646    -0.41   0.681    -.3694143    .2414445
         y80 |   -.037886   .1598956    -0.24   0.813    -.3516171    .2758452
         y82 |  -.4892665   .1482989    -3.30   0.001    -.7802437   -.1982893
         y84 |  -.5112715   .1496524    -3.42   0.001    -.8049044   -.2176385
       _cons |  -7.844731   3.038574    -2.58   0.010    -13.80672   -1.882745
------------------------------------------------------------------------------
6
Holding the other factors fixed, a woman in 1982
has on average 0.49 fewer children than a woman
in the base period.
A similar result is found for 1984.
The year dummies show significant drops
in the fertility rate over time.
7
Example 2
CPS78_85.dta has wage data collected in 1978
and 1985.
We estimate an earnings equation which
includes education, experience, experience
squared, a union dummy, a female dummy and the
year dummy for 1985.
Suppose that you want to see whether the gender
gap has changed over time. Then you include the
interaction between female and the 1985 dummy;
that is, you estimate the following.
8
Log(wage) = β0 + β1(educ) + β2(exper) + β3(expersq) + β4(union)
            + β5(female) + β6(year85) + β7(year85)(female)
You can check whether the gender wage gap in 1985 is
different from the base year (1978) by testing whether β7
is equal to zero.
The gender gap in each period is given by:
- gender gap in the base year (1978) = β5
- gender gap in 1985 = β5 + β7
9
. reg lwage educ exper expersq union female y85 y85fem

      Source |       SS       df       MS              Number of obs =    1084
-------------+------------------------------           F(  7,  1076) =  113.20
       Model |  135.328704     7   19.332672           Prob > F      =  0.0000
    Residual |  183.762464  1076  .170782959           R-squared     =  0.4241
-------------+------------------------------           Adj R-squared =  0.4204
       Total |  319.091167  1083   .29463635           Root MSE      =  .41326

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0833217   .0050646    16.45   0.000     .0733841    .0932594
       exper |   .0294761   .0035717     8.25   0.000     .0224679    .0364844
     expersq |  -.0003975   .0000776    -5.12   0.000    -.0005498   -.0002451
       union |    .205237   .0302943     6.77   0.000     .1457945    .2646795
      female |  -.3195333   .0366427    -8.72   0.000    -.3914324   -.2476341
         y85 |   .3530916   .0333324    10.59   0.000     .2876877    .4184954
      y85fem |   .0884046   .0513498     1.72   0.085    -.0123524    .1891616
       _cons |   .3522088   .0763137     4.62   0.000     .2024683    .5019493
------------------------------------------------------------------------------
The coefficient for the interaction term (y85)(female) is
positive and significant at the 10% significance level. So
the gender gap appears to have narrowed over time.
gender gap in 1978 = -0.319
gender gap in 1985 = -0.319 + 0.088 = -0.231
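The arithmetic behind the two gaps can be checked with a short Python sketch. The two coefficients are copied from the regression output above; the conversion from log points to an exact percentage is a standard extra step that the slide does not show, and the helper name `log_points_to_percent` is an assumption of this sketch.

```python
import math

# Coefficients copied from the lwage regression output.
b_female = -0.3195333   # beta5: gender gap in log wage in 1978
b_y85fem = 0.0884046    # beta7: change in the gap between 1978 and 1985

gap_1978 = b_female
gap_1985 = b_female + b_y85fem

def log_points_to_percent(b):
    """Exact percentage difference implied by a log-wage coefficient."""
    return 100 * (math.exp(b) - 1)

print(round(gap_1978, 3), round(gap_1985, 3))
print(round(log_points_to_percent(gap_1978), 1),
      round(log_points_to_percent(gap_1985), 1))
```

Because the dependent variable is in logs, the coefficients are approximate proportional gaps; the exponential transformation gives the exact percentage.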
10
Policy analysis with pooled
cross sections:
The difference in difference
estimator
I explain a typical policy analysis with
pooled cross section data, called the
difference-in-difference estimation, using
an example.
11
Example: Effects of garbage
incinerator on housing prices
This example is based on studies of
housing prices in North Andover,
Massachusetts.
The rumor that a garbage incinerator would
be built in North Andover began after
1978. The construction of the incinerator
began in 1981.
You want to examine whether the incinerator
affected housing prices.
12
Our hypothesis is the following.
Hypothesis: The price of houses located near the
incinerator would fall relative to the price of more
distant houses.
For illustration, define a house to be near the
incinerator if it is within 3 miles.
So create the following dummy variable:
nearinc = 1 if the house is `near' the incinerator
        = 0 otherwise
13
The most naïve analysis would be to run the following
regression using only the 1981 data.
price = β0 + β1(nearinc) + u
where price is the real price (i.e., deflated using the CPI to
express it in constant 1978 dollars).
Using KIELMC.dta, the result is the following.
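. reg rprice nearinc if year==1981

      Source |       SS       df       MS              Number of obs =     142
-------------+------------------------------           F(  1,   140) =   27.73
       Model |  2.7059e+10     1  2.7059e+10           Prob > F      =  0.0000
    Residual |  1.3661e+11   140   975815048           R-squared     =  0.1653
-------------+------------------------------           Adj R-squared =  0.1594
       Total |  1.6367e+11   141  1.1608e+09           Root MSE      =   31238

------------------------------------------------------------------------------
      rprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |  -30688.27   5827.709    -5.27   0.000    -42209.97   -19166.58
       _cons |   101307.5   3093.027    32.75   0.000     95192.43    107422.6
------------------------------------------------------------------------------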
But can we say from this estimation that the incinerator has
negatively affected the housing price?
14
To see this, estimate the same equation using the
1978 data. Note that this is before the rumor of
the incinerator began.
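. reg rprice nearinc if year==1978

      Source |       SS       df       MS              Number of obs =     179
-------------+------------------------------           F(  1,   177) =   15.74
       Model |  1.3636e+10     1  1.3636e+10           Prob > F      =  0.0001
    Residual |  1.5332e+11   177   866239953           R-squared     =  0.0817
-------------+------------------------------           Adj R-squared =  0.0765
       Total |  1.6696e+11   178   937979126           Root MSE      =   29432

------------------------------------------------------------------------------
      rprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |  -18824.37   4744.594    -3.97   0.000    -28187.62   -9461.117
       _cons |   82517.23    2653.79    31.09   0.000     77280.09    87754.37
------------------------------------------------------------------------------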
Note that, even in 1978, houses near the place where the
incinerator was to be built are cheaper than houses farther from
the location.
So the negative coefficient simply means that the garbage
incinerator was built in a location where housing prices are low.
15
Now, compare the two regressions.

Year 1978 regression
. reg rprice nearinc if year==1978
(Number of obs = 179, R-squared = 0.0817)
------------------------------------------------------------------------------
      rprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |  -18824.37   4744.594    -3.97   0.000    -28187.62   -9461.117
       _cons |   82517.23    2653.79    31.09   0.000     77280.09    87754.37
------------------------------------------------------------------------------

Year 1981 regression
. reg rprice nearinc if year==1981
(Number of obs = 142, R-squared = 0.1653)
------------------------------------------------------------------------------
      rprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |  -30688.27   5827.709    -5.27   0.000    -42209.97   -19166.58
       _cons |   101307.5   3093.027    32.75   0.000     95192.43    107422.6
------------------------------------------------------------------------------

Compared to 1978, the price penalty for houses near the
incinerator is greater in 1981. Perhaps the increase in the
price penalty in 1981 is caused by the incinerator.
This is the basic idea of the difference-in-difference estimator.
16
The difference-in-difference estimator in
this example may be computed as follows.
I will show you a more general case later
on.
The difference-in-difference estimator:
δ̂1 = (coefficient for nearinc in 1981)
   - (coefficient for nearinc in 1978)
   = -30688.27 - (-18824.37) = -11863.90
So, the incinerator has decreased house prices by
$11,863.90 on average.
17
Note that, in this example, the coefficient for
(nearinc) in 1978 is equal to
(average price of houses near the incinerator)
- (average price of houses not near the incinerator)
This is because the regression includes only one dummy variable.
(Just recall Ex.1 of homework 2.)
Therefore the difference-in-difference estimator δ̂1 in this
example can be written as
$$\hat{\delta}_1 = \left(\overline{Price}_{1981,near} - \overline{Price}_{1981,far}\right) - \left(\overline{Price}_{1978,near} - \overline{Price}_{1978,far}\right)$$
This is the reason why the estimator is called the difference
in difference estimator.
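The same number can be recovered from the four group means, as in this small Python sketch. The intercepts and nearinc coefficients are copied from the two yearly regressions; the dictionary layout is only an illustrative choice.

```python
# Intercepts and nearinc coefficients copied from the two yearly regressions.
cons_1978, near_1978 = 82517.23, -18824.37
cons_1981, near_1981 = 101307.5, -30688.27

# With a single dummy regressor, the intercept is the mean price of houses
# far from the incinerator, and intercept + slope is the mean for near houses.
mean_price = {
    (1978, "far"): cons_1978,
    (1978, "near"): cons_1978 + near_1978,
    (1981, "far"): cons_1981,
    (1981, "near"): cons_1981 + near_1981,
}

# Difference (near - far) in 1981 minus the same difference in 1978.
did = ((mean_price[(1981, "near")] - mean_price[(1981, "far")])
       - (mean_price[(1978, "near")] - mean_price[(1978, "far")]))
print(round(did, 2))
```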

18
Difference in difference
estimator: More general
case.
The difference-in-difference estimator can
be obtained by running the following
single equation on the pooled sample.
price = β0 + β1(nearinc) + β2(year81) + δ1(year81)(nearinc) + u
The coefficient δ1 on the interaction term is the
difference-in-difference estimator.
19
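. reg rprice nearinc y81 y81nrinc

      Source |       SS       df       MS              Number of obs =     321
-------------+------------------------------           F(  3,   317) =   22.25
       Model |  6.1055e+10     3  2.0352e+10           Prob > F      =  0.0000
    Residual |  2.8994e+11   317   914632739           R-squared     =  0.1739
-------------+------------------------------           Adj R-squared =  0.1661
       Total |  3.5099e+11   320  1.0969e+09           Root MSE      =   30243

------------------------------------------------------------------------------
      rprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |  -18824.37   4875.322    -3.86   0.000    -28416.45   -9232.293
         y81 |   18790.29   4050.065     4.64   0.000     10821.88    26758.69
    y81nrinc |   -11863.9   7456.646    -1.59   0.113    -26534.67    2806.867
       _cons |   82517.23    2726.91    30.26   0.000      77152.1    87882.36
------------------------------------------------------------------------------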
The coefficient on y81nrinc is the difference-in-difference estimator.
This form is more general since, in addition to the policy dummy
(nearinc), you can include more variables that affect the housing
price, such as the number of bedrooms. When you include more
variables, δ̂1 can no longer be expressed in the simple
difference-in-difference format. However, the interpretation does not
change, and therefore it is still called the difference-in-difference
estimator.
20
Natural experiment (or
quasi-experiment)
The difference-in-difference estimator is
frequently used to evaluate the effect of
governmental policy.
Often a governmental policy affects one group of
people, while it does not affect another group.
This type of policy change is called a
natural experiment.
For example, the change in the spousal tax
deduction system in Japan, which took place in
1995, affected married couples but did not
affect single people.
21
The group of people who are affected by
the policy is called the treatment group.
Those who are not affected by the policy
are called the control group.
Suppose that you want to know how the
change in the spousal tax deduction has
affected the hours worked by women.
Suppose you have pooled data on
workers in 1994 and 1995.
The next slide shows the typical
procedure you follow to conduct the
difference-in-difference analysis.
22
Step 1: Create the treatment dummy such
that
Dtreat = 1 if the person is affected by the policy change
       = 0 otherwise.
Step 2: Run the following regression.
(Hours worked) = β0 + β1(Dtreat) + β2(year95) + δ1(year95)(Dtreat) + u
δ1 is the difference-in-difference estimator. It shows
the effect of the policy change on women's
hours worked.
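The two steps can be sketched in Python on a tiny pooled sample. Everything here is hypothetical: the eight observations are invented, and `ols` is a small helper written for this sketch (it solves the normal equations directly) rather than anything from the lecture or a library.

```python
def ols(X, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Hypothetical pooled data: (Dtreat, year95, hours worked).
rows = [(0, 0, 40), (0, 0, 42), (1, 0, 30), (1, 0, 32),
        (0, 1, 41), (0, 1, 43), (1, 1, 27), (1, 1, 29)]

# Step 1 is already done (Dtreat is given); Step 2 adds the interaction.
X = [[1, d, t, d * t] for d, t, _ in rows]
hours = [h for _, _, h in rows]
coef = ols(X, hours)
delta1 = coef[3]   # coefficient on (year95)(Dtreat)
print(coef)
```

In this toy sample the interaction coefficient equals the change in mean hours for the treated group minus the change for the control group, which is exactly the difference-in-difference logic of the slide.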
23
Two period panel data
analysis
Motivation:
Remember the effects of the employee training grant on the
scrap rate. You estimated the following model for the
1988 data.
$$\log(scrap) = \beta_0 + \beta_1\,grant + \beta_2\log(sales) + \beta_3\log(employment) + v$$
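. reg lscrap grant lsale lemploy if year==1988

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  3,    46) =    1.18
       Model |   6.8054029     3  2.26846763           Prob > F      =  0.3270
    Residual |  88.2852083    46  1.91924366           R-squared     =  0.0716
-------------+------------------------------           Adj R-squared =  0.0110
       Total |  95.0906112    49  1.94062472           Root MSE      =  1.3854

------------------------------------------------------------------------------
      lscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grant |  -.0517781   .4312869    -0.12   0.905    -.9199137    .8163574
      lsales |  -.4548425   .3733152    -1.22   0.229    -1.206287    .2966021
     lemploy |   .6394289   .3651366     1.75   0.087     -.095553    1.374411
       _cons |   4.986779   4.655588     1.07   0.290    -4.384433    14.35799
------------------------------------------------------------------------------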
You did not find evidence that receiving the grant
reduces the scrap rate.
24
The reason why we did not find a significant
effect is probably the endogeneity problem.
Companies with low-ability workers tend to
apply for the grant, which creates a positive bias
in the estimation. If you observed the average
ability of the workers, you could eliminate the bias
by including the ability variable. But since you
cannot observe ability, you have the following
situation.
$$\log(scrap) = \beta_0 + \beta_1\,grant + \beta_2\log(sales) + \beta_3\log(employment) + \underbrace{(\beta_4\,ability + u)}_{v}$$
where ability is in the error term v. v = (β4 ability + u)
is called the composite error term.
25
$$\log(scrap) = \beta_0 + \beta_1\,grant + \beta_2\log(sales) + \beta_3\log(employment) + \underbrace{(\beta_4\,ability + u)}_{v}$$
Because ability and grant are (negatively) correlated,
this causes a bias in the coefficient for (grant).
We predicted the direction of the bias in the
following way.
$$\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_4\,\tilde{\delta}_1$$
- β̂1 (-): true effect of grant
- β̂4 (-): effect of ability on the scrap rate
- δ̃1 (-): sign is determined by the correlation between ability and grant
- β̂4 δ̃1 (+): bias term
The true negative effect of grant is cancelled out by
the bias term. Thus, the bias makes it difficult to
find the effect.
26
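The decomposition can be checked numerically with a minimal Python sketch. The six firms below are invented for illustration (log scrap is generated as 3 - 0.3 grant - 0.5 ability, with low-ability firms receiving the grant), and the helper `sxy` and all variable names are assumptions of this sketch.

```python
def sxy(a, b):
    """Sum of cross products of deviations from the means."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b))

# Hypothetical firms: the low-ability firms are the ones with a grant.
grant = [1, 1, 1, 0, 0, 0]
ability = [1.0, 2.0, 1.5, 3.0, 4.0, 3.5]
lscrap = [3 - 0.3 * g - 0.5 * a for g, a in zip(grant, ability)]

s_gg, s_aa, s_ga = sxy(grant, grant), sxy(ability, ability), sxy(grant, ability)
s_gy, s_ay = sxy(grant, lscrap), sxy(ability, lscrap)

# Long regression: lscrap on grant and ability (two-regressor OLS formulas).
det = s_gg * s_aa - s_ga ** 2
beta1_hat = (s_gy * s_aa - s_ay * s_ga) / det
beta4_hat = (s_ay * s_gg - s_gy * s_ga) / det

# Short regression (ability omitted) and auxiliary regression of ability on grant.
beta1_tilde = s_gy / s_gg
delta1_tilde = s_ga / s_gg

print(beta1_hat, beta4_hat, delta1_tilde, beta1_tilde)
```

Here the grant effect is negative in the long regression, but the positive bias term (a negative ability effect times a negative grant-ability relation) pushes the short-regression coefficient above zero.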
Now you know that there is a bias. Is
there anything we can do to correct for the
bias?
When you have panel data, you can
eliminate the bias.
I will explain the method using this
example. I will generalize it later.
27
Eliminating bias using two
period panel data
Now, go back to the equation.
$$\log(scrap) = \beta_0 + \beta_1\,grant + \beta_2\log(sales) + \beta_3\log(employment) + \underbrace{(\beta_4\,ability + u)}_{v}$$
The grant is administered in 1988.
Suppose that you have panel data on
firms for two periods, 1987 and 1988.
Further assume that the average ability of
workers does not change over time. So
(ability) is interpreted as the innate ability
of workers, such as IQ.
28
When you have the two-period panel
data, the equation can be written as:
$$\log(scrap_{it}) = \beta_0 + \beta_1\,grant_{it} + \beta_2\log(sales_{it}) + \beta_3\log(employment_{it}) + \beta_5\,year88_{it} + \underbrace{(\beta_4\,ability_i + u_{it})}_{v_{it}}$$
i is the index for the ith firm. t is the index for the
period.
Since ability is constant over time, ability has
only the i index.
Now, I will use a shorthand notation for
β4(ability)i. Since (ability) is assumed constant over
time, write β4(ability)i = ai. Then the above equation can
be written as:
29
$$\log(scrap_{it}) = \beta_0 + \beta_1\,grant_{it} + \beta_2\log(sales_{it}) + \beta_3\log(employment_{it}) + \beta_5\,year88_{it} + \underbrace{(a_i + u_{it})}_{v_{it}}$$
- ai is called the fixed effect, or the unobserved effect. If
  you want to emphasize that it is an unobserved firm
  characteristic, you can call it the firm fixed effect as well.
- uit is called the idiosyncratic error.
- The bias in OLS occurs because the fixed effect is
  correlated with (grant).
- So if we can get rid of the fixed effect, we can eliminate
  the bias. This is the basic idea.
- In the next slide, I will show the procedure of what is
  called first-differenced estimation.
30
First, for each firm, take the first difference. That
is, compute the following.
$$\Delta\log(scrap_{it}) = \log(scrap_{it}) - \log(scrap_{it-1})$$
It follows that
$$\begin{aligned}
\Delta\log(scrap_{it}) &= \beta_0 + \beta_1\,grant_{it} + \beta_2\log(sales_{it}) + \beta_3\log(employment_{it}) + \beta_5\,year88_{it} + (a_i + u_{it}) \\
&\quad - \left[\beta_0 + \beta_1\,grant_{it-1} + \beta_2\log(sales_{it-1}) + \beta_3\log(employment_{it-1}) + \beta_5\,year88_{it-1} + (a_i + u_{it-1})\right] \\
&= \beta_1\Delta grant_{it} + \beta_2\Delta\log(sales_{it}) + \beta_3\Delta\log(employment_{it}) + \beta_5\Delta year88_{it} + \Delta u_{it}
\end{aligned}$$
The last line is the first-differenced equation.
31
So, by taking the first difference, you can
eliminate the fixed effect.
$$\Delta\log(scrap_{it}) = \beta_1\Delta grant_{it} + \beta_2\Delta\log(sales_{it}) + \beta_3\Delta\log(employment_{it}) + \beta_5\Delta year88_{it} + \Delta u_{it}$$
If ∆uit is not correlated with ∆(grant)it, estimating
the first-differenced model by OLS will produce
unbiased estimates. If we have controlled for
enough time-varying variables, it is reasonable to
assume that they are uncorrelated.
Note that this model does not have a constant: β0 is
differenced away. With only two periods, ∆(year88)it
equals 1 for every firm, so β5 plays the role of the
intercept in the differenced equation.
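To see why differencing helps, consider this minimal Python sketch of a two-period panel. The four firms, the value -0.3 for the grant effect, and the fixed effects are all invented; the toy model has no year effect or other regressors, so the differenced regression is run through the origin.

```python
# Hypothetical two-period panel: a_i is the unobserved firm effect, and the
# firms with the largest a_i (highest scrap) are the ones that get the grant.
TRUE_EFFECT = -0.3
a = [1.0, 0.8, 0.2, 0.1]
grant = {1987: [0, 0, 0, 0], 1988: [1, 1, 0, 0]}
lscrap = {t: [TRUE_EFFECT * g + ai for g, ai in zip(grant[t], a)]
          for t in grant}

def slope_with_constant(x, y):
    """Simple-regression slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

# Cross-section OLS on 1988 only: contaminated by a_i.
cross_section = slope_with_constant(grant[1988], lscrap[1988])

# First differences remove a_i; regress through the origin (no constant).
d_grant = [g1 - g0 for g0, g1 in zip(grant[1987], grant[1988])]
d_lscrap = [y1 - y0 for y0, y1 in zip(lscrap[1987], lscrap[1988])]
first_diff = (sum(dx * dy for dx, dy in zip(d_grant, d_lscrap))
              / sum(dx * dx for dx in d_grant))

print(cross_section, first_diff)
```

The cross-section slope has the wrong sign because grant firms start from a worse position, while the differenced slope recovers the effect built into the toy data.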
Now, estimate this model using JTRAIN.dta
32
. **************************
. * Declare panel          *
. **************************
. tsset fcode year
       panel variable:  fcode (strongly balanced)
        time variable:  year, 1987 to 1989
                delta:  1 unit

. ******************************
. * Generate first differenced *
. * variables                  *
. ******************************
. gen difflscrap=lscrap-L.lscrap
(363 missing values generated)
. gen diffgrant=grant-L.grant
(157 missing values generated)
. gen difflsales=lsales-L.lsales
(226 missing values generated)
. gen difflemploy=lemploy-L.lemploy
(181 missing values generated)
. gen diffd88=d88-L.d88
(157 missing values generated)

. **********************
. * Run the regression *
. **********************
. reg difflscrap diffgrant difflsales difflemploy diffd88 if year<=1988, nocons

      Source |       SS       df       MS              Number of obs =      47
-------------+------------------------------           F(  4,    43) =    1.82
       Model |  2.71885438     4  .679713595           Prob > F      =  0.1428
    Residual |  16.0749657    43  .373836411           R-squared     =  0.1447
-------------+------------------------------           Adj R-squared =  0.0651
       Total |    18.79382    47  .399868511           Root MSE      =  .61142

------------------------------------------------------------------------------
  difflscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   diffgrant |  -.3223172   .1879101    -1.72   0.093     -.701274    .0566396
  difflsales |  -.1733036    .365626    -0.47   0.638    -.9106586    .5640514
 difflemploy |   .0233784   .5064015     0.05   0.963    -.9978775    1.044634
     diffd88 |  -.0272418    .120639    -0.23   0.822    -.2705336    .2160501
------------------------------------------------------------------------------

When you use the `nocons' option, Stata omits the
constant term.
Now, the coefficient on grant is negative and significant at
the 10% level.
33
Note that, when you use this method in your
research, it is a good idea to tell your audience
what the potential fixed effect would be and
whether it is correlated with the explanatory
variables. In this example, unobserved ability is
potentially an important source of the fixed
effect.
Of course, one can never tell exactly what the
fixed effect is, since it is the aggregate of
all the unobserved effects. However, if you explain
what is contained in the fixed effect, your
audience can understand the potential direction
of the bias, and why you need to use the
first-differenced method.
34
General case
The first-differenced model in a more general
situation can be written as follows.
Yit = β0 + β1xit1 + β2xit2 + … + βkxitk + ai + uit
where ai is the fixed effect.
If ai is correlated with any of the explanatory variables,
the estimated coefficients will be biased. So take the
first difference to eliminate ai, then estimate the
following model by OLS.
∆Yit = β1∆xit1 + β2∆xit2 + … + βk∆xitk + ∆uit
35
Note that, when you take the first difference,
the constant term is also eliminated.
So you should use the `nocons' option in
Stata when you estimate the model.
When some variables are time invariant,
these variables are also eliminated. If the
treatment variable does not change
over time, you cannot use this method.
36
First differencing for more
than two periods.
You can use first differencing for more
than two periods.
You just have to difference adjacent
periods successively.
For example, suppose that you have 3
periods. Then for the dependent variable,
you compute ∆yi2 = yi2 - yi1 and ∆yi3 = yi3 - yi2.
Do the same for the x-variables. Then run the
regression on the differenced data.
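Successive differencing for any number of periods can be sketched in a few lines of Python. The function name `first_differences` and the three-period panel are hypothetical and serve only to illustrate the computation described above.

```python
def first_differences(panel):
    """panel: {unit: [value at t=1, t=2, ...]} -> {unit: adjacent differences}."""
    return {unit: [later - earlier for earlier, later in zip(vals, vals[1:])]
            for unit, vals in panel.items()}

# Hypothetical 3-period panel for two units.
y = {"A": [10, 12, 15], "B": [7, 7, 6]}
dy = first_differences(y)
print(dy)
```

Each unit with T periods contributes T - 1 differenced observations, which is why the first period drops out of the estimation sample.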
37
Exercise
The data set ezunem.dta contains city-level
unemployment claim statistics for the state of
Indiana. It also contains information
about whether the city has an enterprise zone or
not.
An enterprise zone is an area which
encourages businesses and investment through
reduced taxes and restrictions. Enterprise zones
are usually created in economically
depressed areas with the purpose of increasing
economic activity and reducing
unemployment.
38
Using the data ezunem.dta, you are asked to estimate the
effect of enterprise zones on city-level unemployment
claims. Use the log of unemployment claims as the
dependent variable.
Ex1. First estimate the following model using OLS.
log(unemployment claims)it = β0 + β1(Enterprise zone)it
                             + β2(d81)it + … + β9(d88)it + vit
Discuss whether the coefficient for enterprise zone is biased
or not. If you think it is biased, what is the direction of
the bias?
Ex2. Estimate the model using the first difference method.
Did it change the result? Was your prediction of the bias
correct?
39
OLS results

. reg luclms ez d81 d82 d83 d84 d85 d86 d87 d88

      Source |       SS       df       MS              Number of obs =     198
-------------+------------------------------           F(  9,   188) =   11.44
       Model |  35.5700512     9  3.95222791           Prob > F      =  0.0000
    Residual |  64.9262278   188  .345352276           R-squared     =  0.3539
-------------+------------------------------           Adj R-squared =  0.3230
       Total |  100.496279   197  .510133396           Root MSE      =  .58767

------------------------------------------------------------------------------
      luclms |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ez |  -.0387084   .1148501    -0.34   0.736    -.2652689     .187852
         d81 |  -.3216319   .1771882    -1.82   0.071    -.6711645    .0279007
         d82 |   .1354957   .1771882     0.76   0.445    -.2140369    .4850283
         d83 |  -.2192554   .1771882    -1.24   0.217     -.568788    .1302772
         d84 |  -.5970717   .1799355    -3.32   0.001    -.9520237   -.2421197
         d85 |  -.6216534   .1847186    -3.37   0.001     -.986041   -.2572658
         d86 |  -.6511313   .1847186    -3.52   0.001    -1.015519   -.2867437
         d87 |  -.9188151   .1847186    -4.97   0.000    -1.283203   -.5544275
         d88 |    -1.2575   .1847186    -6.81   0.000    -1.621887    -.893112
       _cons |   11.69439    .125291    93.34   0.000     11.44724    11.94155
------------------------------------------------------------------------------
40
First differencing

. reg lagluclms lagez lagd81 lagd82 lagd83 lagd84 lagd85 lagd86 lagd87 lagd88, nocons

      Source |       SS       df       MS              Number of obs =     176
-------------+------------------------------           F(  9,   167) =   41.31
       Model |  17.3537634     9  1.92819594           Prob > F      =  0.0000
    Residual |  7.79583815   167  .046681666           R-squared     =  0.6900
-------------+------------------------------           Adj R-squared =  0.6733
       Total |  25.1496016   176  .142895463           Root MSE      =  .21606

------------------------------------------------------------------------------
   lagluclms |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lagez |  -.1818775   .0781862    -2.33   0.021    -.3362382   -.0275169
      lagd81 |  -.3216319    .046064    -6.98   0.000    -.4125748   -.2306891
      lagd82 |   .1354957   .0651444     2.08   0.039     .0068831    .2641083
      lagd83 |  -.2192554   .0797852    -2.75   0.007    -.3767731   -.0617378
      lagd84 |  -.5580256   .0945636    -5.90   0.000    -.7447196   -.3713315
      lagd85 |  -.5565765    .108961    -5.11   0.000    -.7716951   -.3414579
      lagd86 |  -.5860544   .1182979    -4.95   0.000    -.8196066   -.3525023
      lagd87 |  -.8537383   .1269499    -6.72   0.000    -1.104372   -.6031047
      lagd88 |  -1.192423   .1350488    -8.83   0.000    -1.459046   -.9257998
------------------------------------------------------------------------------
41
The do file used to generate the results.
* Declare the panel structure (city = unit, year = time)
tsset city year
* Pooled OLS
reg luclms ez d81 d82 d83 d84 d85 d86 d87 d88
* First differences: despite the "lag" prefix, each variable is x - L.x
gen lagluclms =luclms -L.luclms
gen lagez =ez -L.ez
gen lagd81 =d81 -L.d81
gen lagd82 =d82 -L.d82
gen lagd83 =d83 -L.d83
gen lagd84 =d84 -L.d84
gen lagd85 =d85 -L.d85
gen lagd86 =d86 -L.d86
gen lagd87 =d87 -L.d87
gen lagd88 =d88 -L.d88
* First-differenced regression without a constant
reg lagluclms lagez lagd81 lagd82 lagd83 lagd84 lagd85 lagd86 lagd87 lagd88, nocons
42
The assumptions for the first
difference method.
Assumption FD1: Linearity
For each i, the model is written as
yit=β0+β1xit1+…+βkxitk+ai+uit
43
Assumption FD2:
We have a random sample from the cross
section
Assumption FD3:
There is no perfect collinearity. In addition,
each explanatory variable changes over
time at least for some i in the sample.
44
Assumption FD4: Strict exogeneity
E(uit|Xi,ai)=0 for each t.
Here Xi is the shorthand notation for `all
the explanatory variables for the ith individual
in all time periods'.
This means that uit is uncorrelated with the
current year's explanatory variables as
well as with other years' explanatory
variables.
45
The unbiasedness of first
difference method
Under FD1 through FD4, the estimated
parameters for the first difference method
are unbiased.
46
Assumption FD5: Homoskedasticity
Var(∆uit|Xi)=σ²
Assumption FD6: No serial correlation
within the ith individual.
Cov(∆uit,∆uis)=0 for t≠s
Note that FD2 assumes random sampling across
different individuals, but does not assume
randomness within each individual. So you
need an additional assumption to rule out
serial correlation.
47