Econometrics II
Memorial University of Newfoundland
Heteroskedasticity

8.1 The Nature of Heteroskedasticity

8.2 Using the Least Squares Estimator

8.3 The Generalized Least Squares Estimator

8.4 Detecting Heteroskedasticity
$$E(y) = \beta_1 + \beta_2 x \tag{8.1}$$
$$e_i = y_i - E(y_i) = y_i - \beta_1 - \beta_2 x_i \tag{8.2}$$
$$y_i = \beta_1 + \beta_2 x_i + e_i \tag{8.3}$$
Figure 8.1 Heteroskedastic Errors
$$E(e_i) = 0, \qquad \operatorname{var}(e_i) = \sigma^2, \qquad \operatorname{cov}(e_i, e_j) = 0$$
$$\operatorname{var}(y_i) = \operatorname{var}(e_i) = h(x_i) \tag{8.4}$$
Food expenditure example:
$$\hat{y}_i = 83.42 + 10.21\,x_i$$
$$\hat{e}_i = y_i - 83.42 - 10.21\,x_i$$
Figure 8.2 Least Squares Estimated Expenditure Function and Observed Data Points
The existence of heteroskedasticity implies:

The least squares estimator is still a linear and unbiased estimator, but
it is no longer best. There is another estimator with a smaller
variance.

The standard errors usually computed for the least squares estimator
are incorrect. Confidence intervals and hypothesis tests that use these
standard errors may be misleading.
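Under homoskedasticity:
$$y_i = \beta_1 + \beta_2 x_i + e_i, \qquad \operatorname{var}(e_i) = \sigma^2 \tag{8.5}$$
$$\operatorname{var}(b_2) = \frac{\sigma^2}{\sum_{i=1}^N (x_i - \bar{x})^2} \tag{8.6}$$
Under heteroskedasticity:
$$y_i = \beta_1 + \beta_2 x_i + e_i, \qquad \operatorname{var}(e_i) = \sigma_i^2 \tag{8.7}$$
$$\operatorname{var}(b_2) = \sum_{i=1}^N w_i^2 \sigma_i^2 = \frac{\sum_{i=1}^N \left[(x_i - \bar{x})^2 \sigma_i^2\right]}{\left[\sum_{i=1}^N (x_i - \bar{x})^2\right]^2} \tag{8.8}$$
$$\widehat{\operatorname{var}}(b_2) = \sum_{i=1}^N w_i^2 \hat{e}_i^2 = \frac{\sum_{i=1}^N \left[(x_i - \bar{x})^2 \hat{e}_i^2\right]}{\left[\sum_{i=1}^N (x_i - \bar{x})^2\right]^2} \tag{8.9}$$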
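A minimal sketch of how (8.6) and (8.9) differ in practice, written in plain Python. The data are hypothetical and the function names are illustrative. Software packages often apply a degrees-of-freedom correction to the White estimator, which this sketch omits.

```python
def ols_simple(x, y):
    """Least squares intercept b1 and slope b2 for y = b1 + b2*x + e."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    return b1, b2


def slope_variances(x, y):
    """Return (conventional, White) variance estimates for the slope b2."""
    n = len(x)
    b1, b2 = ols_simple(x, y)
    xbar = sum(x) / n
    dev2 = [(xi - xbar) ** 2 for xi in x]
    sxx = sum(dev2)
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    sigma2_hat = sum(e ** 2 for e in resid) / (n - 2)
    # Equation (8.6) with sigma^2 replaced by its estimate
    var_conventional = sigma2_hat / sxx
    # Equation (8.9): each squared residual is weighted by (x_i - xbar)^2
    var_white = sum(d * e ** 2 for d, e in zip(dev2, resid)) / sxx ** 2
    return var_conventional, var_white


# Hypothetical data in which the spread of y grows with x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.5, 7.2, 11.0, 10.5]
v_conv, v_white = slope_variances(x, y)
print("conventional se:", v_conv ** 0.5)
print("White se:       ", v_white ** 0.5)
```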
We can use a robust estimator: GRETL offers several options…check the defaults
$$\hat{y}_i = 83.42 + 10.21\,x_i$$
(27.46) (1.81) (White se)
(43.41) (2.09) (incorrect se)
White:
$$b_2 \pm t_c\,\text{se}(b_2) = 10.21 \pm 2.024 \times 1.81 = [6.55,\ 13.87]$$
Incorrect:
$$b_2 \pm t_c\,\text{se}(b_2) = 10.21 \pm 2.024 \times 2.09 = [5.97,\ 14.45]$$
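The interval arithmetic can be checked with a short Python sketch. The helper name `interval` is illustrative. The inputs are the rounded values shown on the slide, so the second interval may differ from the reported one in the last digit.

```python
def interval(estimate, se, t_crit):
    """Interval estimate: estimate +/- t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width


b2, t_c = 10.21, 2.024                            # slope estimate and critical value from the slide
white_lo, white_hi = interval(b2, 1.81, t_c)      # White standard error
conv_lo, conv_hi = interval(b2, 2.09, t_c)        # conventional (incorrect) standard error

print(f"White:     [{white_lo:.2f}, {white_hi:.2f}]")
print(f"Incorrect: [{conv_lo:.2f}, {conv_hi:.2f}]")
```

In this example the White interval is narrower, but that is not a general result: robust standard errors can be larger or smaller than the conventional ones.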
The existence of heteroskedasticity implies:

Why not use robust estimation all the time?

Well, that is a good idea for large samples. For small samples, however,
homoskedasticity plus normality guarantees that the t ratios follow a
t distribution.

Robust estimates do not guarantee that, so our inference could be
misleading.

If you have a small sample, check whether there is homoskedasticity
or not!
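$$y_i = \beta_1 + \beta_2 x_i + e_i \tag{8.10}$$
$$E(e_i) = 0, \qquad \operatorname{var}(e_i) = \sigma_i^2, \qquad \operatorname{cov}(e_i, e_j) = 0$$
$$\operatorname{var}(e_i) = \sigma_i^2 = \sigma^2 x_i \tag{8.11}$$
$$\frac{y_i}{\sqrt{x_i}} = \beta_1\left(\frac{1}{\sqrt{x_i}}\right) + \beta_2\left(\frac{x_i}{\sqrt{x_i}}\right) + \frac{e_i}{\sqrt{x_i}} \tag{8.12}$$
$$y_i^* = \frac{y_i}{\sqrt{x_i}}, \qquad x_{i1}^* = \frac{1}{\sqrt{x_i}}, \qquad x_{i2}^* = \frac{x_i}{\sqrt{x_i}} = \sqrt{x_i}, \qquad e_i^* = \frac{e_i}{\sqrt{x_i}} \tag{8.13}$$
$$y_i^* = \beta_1 x_{i1}^* + \beta_2 x_{i2}^* + e_i^* \tag{8.14}$$
$$\operatorname{var}(e_i^*) = \operatorname{var}\left(\frac{e_i}{\sqrt{x_i}}\right) = \frac{1}{x_i}\operatorname{var}(e_i) = \frac{1}{x_i}\,\sigma^2 x_i = \sigma^2 \tag{8.15}$$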
To obtain the best linear unbiased estimator for a model with
heteroskedasticity of the type specified in equation (8.11):
1. Calculate the transformed variables given in (8.13).
2. Use least squares to estimate the transformed model given in (8.14).
The generalized least squares estimator can be viewed as a weighted least
squares estimator. It minimizes the sum of squared transformed errors:
$$\sum_{i=1}^N e_i^{*2} = \sum_{i=1}^N \frac{e_i^2}{x_i} = \sum_{i=1}^N \left(x_i^{-1/2} e_i\right)^2$$
When $x_i$ is small, the data contain more information about the
regression function and the observations are weighted heavily.
When $x_i$ is large, the data contain less information and the
observations are weighted lightly.
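The two steps above can be sketched in plain Python for the case var(e_i) = σ²x_i. The function names are illustrative and the data are hypothetical. The transformed model (8.14) has no separate intercept, so the sketch solves the two normal equations directly.

```python
import math


def ols_two_regressors_no_const(y, x1, x2):
    """Least squares for y = b1*x1 + b2*x2 + e (no separate intercept)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def gls_var_proportional_to_x(x, y):
    """GLS estimates of (beta1, beta2) when var(e_i) = sigma^2 * x_i, x_i > 0."""
    root = [math.sqrt(xi) for xi in x]
    # Step 1: transformed variables, as in (8.13)
    y_star = [yi / r for yi, r in zip(y, root)]
    x1_star = [1.0 / r for r in root]
    x2_star = [xi / r for xi, r in zip(x, root)]
    # Step 2: least squares on the transformed model (8.14)
    return ols_two_regressors_no_const(y_star, x1_star, x2_star)


# Hypothetical data with noise that grows with x
x = [1.0, 4.0, 9.0, 16.0, 25.0]
y = [5.3, 13.1, 30.9, 47.0, 80.5]
beta1_gls, beta2_gls = gls_var_proportional_to_x(x, y)
print(beta1_gls, beta2_gls)
```

Dividing by the square root of x_i is what gives observations with small x_i more weight in the sum of squares.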
Food example again, where was the problem coming from?
regress food_exp income [aweight = 1/income]
$$\hat{y}_i = 78.68 + 10.45\,x_i \tag{8.16}$$
(se) (23.79) (1.39)
$$\hat{\beta}_2 \pm t_c\,\text{se}(\hat{\beta}_2) = 10.451 \pm 2.024 \times 1.386 = [7.65,\ 13.26]$$

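$$\operatorname{var}(e_i) = \sigma_i^2 = \sigma^2 x_i^{\gamma} \tag{8.17}$$
$$\ln(\sigma_i^2) = \ln(\sigma^2) + \gamma \ln(x_i)$$
$$\sigma_i^2 = \exp\left[\ln(\sigma^2) + \gamma \ln(x_i)\right] = \exp(\alpha_1 + \alpha_2 z_i) \tag{8.18}$$
$$\sigma_i^2 = \exp(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) \tag{8.19}$$
$$\ln(\sigma_i^2) = \alpha_1 + \alpha_2 z_i \tag{8.20}$$
$$y_i = E(y_i) + e_i = \beta_1 + \beta_2 x_i + e_i$$
$$\ln(\hat{e}_i^2) = \ln(\sigma_i^2) + v_i = \alpha_1 + \alpha_2 z_i + v_i \tag{8.21}$$
$$\ln(\hat{\sigma}_i^2) = .9378 + 2.329\,z_i$$
$$\hat{\sigma}_i^2 = \exp(\hat{\alpha}_1 + \hat{\alpha}_2 z_i)$$
$$\frac{y_i}{\sigma_i} = \beta_1\left(\frac{1}{\sigma_i}\right) + \beta_2\left(\frac{x_i}{\sigma_i}\right) + \frac{e_i}{\sigma_i} \tag{8.22}$$
$$\operatorname{var}\left(\frac{e_i}{\sigma_i}\right) = \frac{1}{\sigma_i^2}\operatorname{var}(e_i) = \frac{1}{\sigma_i^2}\,\sigma_i^2 = 1$$
$$y_i^* = \frac{y_i}{\hat{\sigma}_i}, \qquad x_{i1}^* = \frac{1}{\hat{\sigma}_i}, \qquad x_{i2}^* = \frac{x_i}{\hat{\sigma}_i} \tag{8.23}$$
$$y_i^* = \beta_1 x_{i1}^* + \beta_2 x_{i2}^* + e_i^* \tag{8.24}$$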
$$y_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} + e_i \tag{8.25}$$
$$\operatorname{var}(e_i) = \sigma_i^2 = \exp(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) \tag{8.26}$$
The steps for obtaining a feasible generalized least squares estimator
for $\beta_1, \beta_2, \ldots, \beta_K$ are:
1. Estimate (8.25) by least squares and compute the squares of the
least squares residuals $\hat{e}_i^2$.
2. Estimate $\alpha_1, \alpha_2, \ldots, \alpha_S$ by applying least squares to the equation
$\ln \hat{e}_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i$.
3. Compute variance estimates $\hat{\sigma}_i^2 = \exp(\hat{\alpha}_1 + \hat{\alpha}_2 z_{i2} + \cdots + \hat{\alpha}_S z_{iS})$.
4. Compute the transformed observations defined by (8.23),
including $x_{i3}^*, \ldots, x_{iK}^*$ if $K > 2$.
5. Apply least squares to (8.24), or to an extended version of (8.24)
if $K > 2$.
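The five steps can be sketched in plain Python for the simple case K = 2 and S = 2 with z_i = ln(x_i). The data are hypothetical and the function names are illustrative. Steps 4 and 5 are carried out as weighted least squares with weights 1/σ̂_i², which is equivalent to least squares on the transformed observations.

```python
import math


def wls_simple(x, y, w=None):
    """Weighted least squares of y on a constant and x (w=None gives OLS)."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b2 = sxy / sxx
    return ybar - b2 * xbar, b2


def fgls(x, y):
    """Feasible GLS assuming var(e_i) = exp(a1 + a2*ln(x_i)), with x_i > 0."""
    # Step 1: least squares residuals, squared
    b1, b2 = wls_simple(x, y)
    e2 = [(yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y)]
    # Step 2: regress ln(e^2) on a constant and z = ln(x)
    z = [math.log(xi) for xi in x]
    a1, a2 = wls_simple(z, [math.log(v) for v in e2])
    # Step 3: variance estimates
    sig2 = [math.exp(a1 + a2 * zi) for zi in z]
    # Steps 4-5: weight each observation by 1/sig2
    w = [1.0 / s for s in sig2]
    return wls_simple(x, y, w), w


# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.2, 4.7, 7.9, 8.1, 12.6, 10.8, 17.5, 13.9]
(g1, g2), w = fgls(x, y)
print(g1, g2)
```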
$$\hat{y}_i = 76.05 + 10.63\,x_i \tag{8.27}$$
(se) (9.71) (.97)
For our food expenditure example (GRETL):
#Estimating the skedasticity function and GLS
ols y const x
genr lnsighat = log(\$uhat*\$uhat)
genr z = log(x)
#Obtain prediction of variance:
ols lnsighat const z
genr predsighat = exp(\$yhat)
#generate weights;
genr w = 1/predsighat
wls w y const x
For our food expenditure example (STATA):
gen z = log(income)
regress food_exp income
predict ehat, residual
gen lnehat2 = log(ehat*ehat)
regress lnehat2 z
* -------------------------------------------
* Feasible GLS
* -------------------------------------------
predict sig2, xb
gen wt = exp(sig2)
regress food_exp income [aweight = 1/wt]
Using our wage data (cps2.dta):
$$\widehat{WAGE} = -9.914 + 1.234\,EDUC + .133\,EXPER + 1.524\,METRO \tag{8.28}$$
(se) (1.08) (.070) (.015) (.431)
$$WAGE_{Mi} = \beta_{M1} + \beta_2 EDUC_{Mi} + \beta_3 EXPER_{Mi} + e_{Mi}, \qquad i = 1, 2, \ldots, N_M \tag{8.29a}$$
$$WAGE_{Ri} = \beta_{R1} + \beta_2 EDUC_{Ri} + \beta_3 EXPER_{Ri} + e_{Ri}, \qquad i = 1, 2, \ldots, N_R \tag{8.29b}$$
$$b_{M1} = -9.914 + 1.524 = -8.39$$
???
$$\operatorname{var}(e_{Mi}) = \sigma_M^2, \qquad \operatorname{var}(e_{Ri}) = \sigma_R^2 \tag{8.30}$$
$$\hat{\sigma}_M^2 = 31.824, \qquad \hat{\sigma}_R^2 = 15.243$$
$$b_{M1} = -9.052, \qquad b_{M2} = 1.282, \qquad b_{M3} = .1346$$
$$b_{R1} = -6.166, \qquad b_{R2} = .956, \qquad b_{R3} = .1260$$
$$\frac{WAGE_{Mi}}{\sigma_M} = \beta_{M1}\left(\frac{1}{\sigma_M}\right) + \beta_2\left(\frac{EDUC_{Mi}}{\sigma_M}\right) + \beta_3\left(\frac{EXPER_{Mi}}{\sigma_M}\right) + \frac{e_{Mi}}{\sigma_M}, \qquad i = 1, 2, \ldots, N_M \tag{8.31a}$$
$$\frac{WAGE_{Ri}}{\sigma_R} = \beta_{R1}\left(\frac{1}{\sigma_R}\right) + \beta_2\left(\frac{EDUC_{Ri}}{\sigma_R}\right) + \beta_3\left(\frac{EXPER_{Ri}}{\sigma_R}\right) + \frac{e_{Ri}}{\sigma_R}, \qquad i = 1, 2, \ldots, N_R \tag{8.31b}$$
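Feasible generalized least squares:
1. Obtain estimates $\hat{\sigma}_M$ and $\hat{\sigma}_R$ by applying least squares separately to
the metropolitan and rural observations.
2. Let
$$\hat{\sigma}_i = \begin{cases} \hat{\sigma}_M & \text{when } METRO_i = 1 \\ \hat{\sigma}_R & \text{when } METRO_i = 0 \end{cases}$$
3. Apply least squares to the transformed model
$$\frac{WAGE_i}{\hat{\sigma}_i} = \beta_{R1}\left(\frac{1}{\hat{\sigma}_i}\right) + \beta_2\left(\frac{EDUC_i}{\hat{\sigma}_i}\right) + \beta_3\left(\frac{EXPER_i}{\hat{\sigma}_i}\right) + \delta\left(\frac{METRO_i}{\hat{\sigma}_i}\right) + \frac{e_i}{\hat{\sigma}_i} \tag{8.32}$$
where $\delta = \beta_{M1} - \beta_{R1}$.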
$$\widehat{WAGE} = -9.398 + 1.196\,EDUC + .132\,EXPER + 1.539\,METRO \tag{8.33}$$
(se) (1.02) (.069) (.015) (.346)
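. regress wage educ exper metro [aweight = 1/wt]
(sum of wgt is   3.7986e+01)

      Source |       SS       df       MS              Number of obs =    1000
-------------+------------------------------           F(  3,   996) =  123.75
       Model |   9797.0667     3   3265.6889           Prob > F      =  0.0000
    Residual |  26284.1488   996  26.3897076           R-squared     =  0.2715
-------------+------------------------------           Adj R-squared =  0.2693
       Total |  36081.2155   999  36.1173328           Root MSE      =  5.1371

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   1.195721    .068508    17.45   0.000     1.061284    1.330157
       exper |   .1322088   .0145485     9.09   0.000     .1036595     .160758
       metro |   1.538803   .3462856     4.44   0.000     .8592702    2.218336
       _cons |  -9.398362   1.019673    -9.22   0.000    -11.39931   -7.397408
------------------------------------------------------------------------------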
STATA
Commands:
* -------------------------------------------
* Rural subsample regression
* -------------------------------------------
regress wage educ exper if metro == 0
scalar rmse_r = e(rmse)
scalar df_r = e(df_r)
* -------------------------------------------
* Urban subsample regression
* -------------------------------------------
regress wage educ exper if metro == 1
scalar rmse_m = e(rmse)
scalar df_m = e(df_r)
* -------------------------------------------
* Groupwise heteroskedastic regression using FGLS
* -------------------------------------------
gen rural = 1 - metro
gen wt=(rmse_r^2*rural) + (rmse_m^2*metro)
regress wage educ exper metro [aweight = 1/wt]
GRETL
Commands:
#Wage Example
open "c:\Program Files\gretl\data\poe\cps2.gdt"
ols wage const educ exper metro
# Use only metro observations
smpl metro --dummy
ols wage const educ exper
scalar stdm = \$sigma
#Restore the full sample
smpl full
#Create a dummy variable for rural
genr rural = 1-metro
#Restrict sample to rural observations
smpl rural --dummy
ols wage const educ exper
scalar stdr = \$sigma
#Restore the full sample
smpl full








#Generate standard deviations for each metro and rural obs
genr wm = metro*stdm
genr wr = rural*stdr
#Make the weights (reciprocal)
#Remember, Gretl's wls needs weights proportional to the reciprocal of the
#error variance, so the standard deviations must be squared
genr w = 1/(wm + wr)^2
#Weighted least squares
wls w wage const educ exper metro
Remark: To implement the generalized least squares estimators
described in this Section for three alternative heteroskedastic
specifications, an assumption about the form of the
heteroskedasticity is required. Using least squares with White
standard errors avoids the need to make an assumption about the
form of heteroskedasticity, but does not realize the potential
efficiency gains from generalized least squares.
8.4.1 Residual Plots

Estimate the model using least squares and plot the least squares
residuals.

With more than one explanatory variable, plot the least squares
residuals against each explanatory variable, or against yˆ i , to see if
those residuals vary in a systematic way relative to the specified
variable.
8.4.2 The Goldfeld-Quandt Test
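$$F = \frac{\hat{\sigma}_M^2 / \sigma_M^2}{\hat{\sigma}_R^2 / \sigma_R^2} \sim F_{(N_M - K_M,\ N_R - K_R)} \tag{8.34}$$
$$H_0: \sigma_M^2 = \sigma_R^2 \quad \text{against} \quad H_1: \sigma_M^2 \neq \sigma_R^2 \tag{8.35}$$
$$F = \frac{\hat{\sigma}_M^2}{\hat{\sigma}_R^2} = \frac{31.824}{15.243} = 2.09$$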
STATA:
* -------------------------------------------
* Goldfeld Quandt test
* -------------------------------------------
scalar GQ = rmse_m^2/rmse_r^2
scalar crit = invFtail(df_m,df_r,.05)
scalar pvalue = Ftail(df_m,df_r,GQ)
scalar list GQ pvalue crit
GRETL:
#Goldfeld Quandt statistic
scalar fstatistic = stdm^2/stdr^2
More generally, the test can be based on a continuous variable.
Order the observations according to the suspected variable
(income in our food example), then split the sample in halves,
usually omitting some observations from the middle.
For the food expenditure data:
$$\hat{\sigma}_1^2 = 3574.8, \qquad \hat{\sigma}_2^2 = 12{,}921.9$$
$$F = \frac{\hat{\sigma}_2^2}{\hat{\sigma}_1^2} = \frac{12{,}921.9}{3574.8} = 3.61$$
You should now be able to obtain this test statistic
and check whether it exceeds the critical value.
Remember that you can probably use
the one-tail version of this test.
Why?
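A minimal Python sketch of the split-sample version of the test, for a model with one explanatory variable. The function names and the small data set are hypothetical. The `omit` argument, which drops observations from the middle, is an illustrative choice. The last two lines redo the ratios reported for the wage and food expenditure examples.

```python
def sigma2_hat(x, y):
    """Estimated error variance SSE/(N - 2) from a simple regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2)


def goldfeld_quandt(x, y, omit=0):
    """Order by x, drop `omit` middle observations, return sigma2_high / sigma2_low."""
    pairs = sorted(zip(x, y))              # order by the suspected variable
    half = (len(pairs) - omit) // 2
    low = pairs[:half]
    high = pairs[len(pairs) - half:]
    s2_low = sigma2_hat(*zip(*low))
    s2_high = sigma2_hat(*zip(*high))
    return s2_high / s2_low


# Hypothetical data: errors are ten times larger in the upper half
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 1.9, 3.1, 3.9, 6.0, 5.0, 8.0, 7.0]
gq = goldfeld_quandt(x, y)
print(gq)

# Ratios from the slides, using the reported variance estimates
print(round(31.824 / 15.243, 2))    # wage example
print(round(12921.9 / 3574.8, 2))   # food expenditure example
```

The statistic is then compared with a critical value from the F distribution with the two subsample degrees of freedom.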
8.4.3 Testing the Variance Function
For the mean:
$$y_i = E(y_i) + e_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} + e_i \tag{8.36}$$
For the variance, in general:
$$\operatorname{var}(y_i) = \sigma_i^2 = E(e_i^2) = h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) \tag{8.37}$$
For example:
$$h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = \exp(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS})$$
$$h(\alpha_1 + \alpha_2 z_i) = \exp\left[\ln(\sigma^2) + \gamma \ln(x_i)\right]$$
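$$h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} \tag{8.38}$$
When $\alpha_2 = \alpha_3 = \cdots = \alpha_S = 0$:
$$h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = h(\alpha_1)$$
$$H_0: \alpha_2 = \alpha_3 = \cdots = \alpha_S = 0 \tag{8.39}$$
$$H_1: \text{not all the } \alpha_s \text{ in } H_0 \text{ are zero}$$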
$$\operatorname{var}(y_i) = \sigma_i^2 = E(e_i^2) = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} \tag{8.40}$$
$$e_i^2 = E(e_i^2) + v_i = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i \tag{8.41}$$
$$\hat{e}_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i \tag{8.42}$$
$$\chi^2 = N \times R^2 \sim \chi^2_{(S-1)} \tag{8.43}$$
S is the number of parameters in the variance function (the intercept plus
S − 1 variables z), so the test has S − 1 degrees of freedom.

This is a large sample test.
It is a Lagrange Multiplier (LM) test; such tests are
based on an auxiliary regression.
In this case it is named after Breusch and Pagan.
Here (and in the textbook) we saw a test
statistic based on a linear function for the
variance of the errors, but the good thing about
this test is that this form can be used to test
for any form of heteroskedasticity.
8.4.3a The White Test
Since we may not know which variables explain heteroskedasticity…
$$E(y_i) = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3}$$
$$z_2 = x_2, \qquad z_3 = x_3, \qquad z_4 = x_2^2, \qquad z_5 = x_3^2$$
8.4.3b Testing the Food Expenditure Example
$$SST = 4{,}610{,}749{,}441, \qquad SSE = 3{,}759{,}556{,}169$$
$$R^2 = 1 - \frac{SSE}{SST} = .1846$$
Breusch-Pagan test:
$$\chi^2 = N \times R^2 = 40 \times .1846 = 7.38$$
White test:
$$\chi^2 = N \times R^2 = 40 \times .18888 = 7.555, \qquad p\text{-value} = .023$$
STATA:
whitetst
or
estat imtest, white
GRETL:
ols y const x
modtest --breusch-pagan
modtest --white
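The arithmetic behind both statistics can be reproduced with a short Python sketch. The helper `chi2_sf` is illustrative and covers only 1 and 2 degrees of freedom, where the chi-square tail probability has a simple closed form. The degrees of freedom used here (1 for Breusch-Pagan with income as the only z, 2 for White with income and income squared) are an assumption about how the example was set up.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


n = 40
sst, sse = 4_610_749_441, 3_759_556_169

# Breusch-Pagan: N * R^2 from the auxiliary regression of e-hat^2 on z
r2_bp = 1 - sse / sst
lm_bp = n * r2_bp
p_bp = chi2_sf(lm_bp, 1)

# White: N * R^2 using the R^2 reported on the slide
lm_white = n * 0.18888
p_white = chi2_sf(lm_white, 2)

print(f"Breusch-Pagan: R2 = {r2_bp:.4f}, LM = {lm_bp:.2f}, p = {p_bp:.4f}")
print(f"White:         LM = {lm_white:.3f}, p = {p_white:.3f}")
```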














Breusch-Pagan test
generalized least squares
Goldfeld-Quandt test
heteroskedastic partition
heteroskedasticity
heteroskedasticity-consistent standard errors
homoskedasticity
Lagrange multiplier test
mean function
residual plot
transformed model
variance function
weighted least squares
White test
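$$y_i = \beta_1 + \beta_2 x_i + e_i$$
$$E(e_i) = 0, \qquad \operatorname{var}(e_i) = \sigma_i^2, \qquad \operatorname{cov}(e_i, e_j) = 0 \quad (i \neq j)$$
$$b_2 = \beta_2 + \sum w_i e_i, \qquad w_i = \frac{x_i - \bar{x}}{\sum (x_i - \bar{x})^2} \tag{8A.1}$$
$$E(b_2) = E(\beta_2) + E\left(\sum w_i e_i\right) = \beta_2 + \sum w_i E(e_i) = \beta_2$$
$$\operatorname{var}(b_2) = \operatorname{var}\left(\sum w_i e_i\right) = \sum w_i^2 \operatorname{var}(e_i) + \sum_{i \neq j} w_i w_j \operatorname{cov}(e_i, e_j) = \sum w_i^2 \sigma_i^2 \tag{8A.2}$$
$$\operatorname{var}(b_2) = \frac{\sum \left[(x_i - \bar{x})^2 \sigma_i^2\right]}{\left[\sum (x_i - \bar{x})^2\right]^2}$$
When $\sigma_i^2 = \sigma^2$ for all $i$:
$$\operatorname{var}(b_2) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2} \tag{8A.3}$$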
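$$\hat{e}_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i \tag{8B.1}$$
$$F = \frac{(SST - SSE)/(S - 1)}{SSE/(N - S)} \tag{8B.2}$$
$$SST = \sum_{i=1}^N \left(\hat{e}_i^2 - \overline{\hat{e}^2}\right)^2 \quad \text{and} \quad SSE = \sum_{i=1}^N \hat{v}_i^2$$
$$\chi^2 = (S - 1) \times F = \frac{SST - SSE}{SSE/(N - S)} \sim \chi^2_{(S-1)} \tag{8B.3}$$
$$\widehat{\operatorname{var}}(e_i^2) = \widehat{\operatorname{var}}(v_i) = \frac{SSE}{N - S} \tag{8B.4}$$
$$\chi^2 = \frac{SST - SSE}{\widehat{\operatorname{var}}(e_i^2)} \tag{8B.5}$$
$$\operatorname{var}\left(\frac{e_i^2}{\sigma_e^2}\right) = 2 \quad \Rightarrow \quad \frac{1}{\sigma_e^4}\operatorname{var}(e_i^2) = 2 \quad \Rightarrow \quad \operatorname{var}(e_i^2) = 2\sigma_e^4$$
$$\chi^2 = \frac{SST - SSE}{2\hat{\sigma}_e^4} \tag{8B.6}$$
$$\widehat{\operatorname{var}}(e_i^2) = \frac{1}{N}\sum_{i=1}^N \left(\hat{e}_i^2 - \overline{\hat{e}^2}\right)^2 = \frac{SST}{N} \tag{8B.7}$$
$$\chi^2 = \frac{SST - SSE}{SST/N} = N \times \left(1 - \frac{SSE}{SST}\right) = N \times R^2 \tag{8B.8}$$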
```