Problems in Regression Analysis

Heteroscedasticity
Violation of the constancy of the variance of the errors.
Typically arises in cross-sectional data.

Serial Correlation
Violation of the assumption of uncorrelated error terms.
Typically arises in time-series data.

Spring 02
1
Heteroscedasticity
The OLS model assumes homoscedasticity, i.e., the
variance of the errors is constant. In some regressions,
especially in cross-sectional studies, this assumption
may be violated.
When heteroscedasticity is present, OLS estimation
puts more weight on the observations which have large
error variances than on those with small error
variances.
The OLS estimates are unbiased but inefficient: they
have larger than minimum variance.
Tests of Heteroscedasticity
Lagrange Multiplier Tests
Goldfeld-Quandt Test
White’s Test
Goldfeld-Quandt Test
Order the data by the magnitude of the independent variable, X, which is thought to be related to the error variance.
Omit the middle d observations (d might be 1/5 of the total sample size).
Fit two separate regressions: one for the low values, another for the high values.
Calculate the error sums of squares, ESS1 and ESS2.
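Calculate the test statistic, which has (N - d - 2k)/2 degrees of freedom in both the numerator and the denominator:
$$F_{\frac{N-d-2k}{2},\ \frac{N-d-2k}{2}} = \frac{ESS_2}{ESS_1}$$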
Problem
Salvatore: data on income and consumption (three consumption observations at each income level)

Income (Y)   Consumption
12           10.6   10.8   11.1
13           11.4   11.7   12.1
14           12.3   12.6   13.2
15           13.0   13.3   13.6
16           13.8   14.0   14.2
17           14.4   14.9   15.3
18           15.0   15.7   16.4
19           15.9   16.5   16.9
20           16.9   17.5   18.1
21           17.2   17.8   18.5
Problem
[Figure: plot of the income and consumption data; vertical axis from 10.0 to 19.0, horizontal axis from 10 to 22]
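Problem
Regression on the whole sample:
$$\hat{C} = 1.48^{*} + 0.788^{*}\,Y_d$$
Regressions on the first twelve and last twelve observations:
$$\hat{C}_1 = 0.85 + 0.837\,Y_d, \quad R_1^2 = 0.91, \quad ESS_1 = 1.069$$
$$\hat{C}_2 = 2.31 + 0.747\,Y_d, \quad R_2^2 = 0.71, \quad ESS_2 = 3.344$$
$$F_{10,10} = \frac{3.344}{1.069} = 3.13 > F_{5\%\ \text{crit}} = 2.97$$
Since the statistic exceeds the critical value, the hypothesis of homoscedasticity is rejected at the 5% level.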
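The Goldfeld-Quandt calculation for this problem can be reproduced numerically. The following is a minimal sketch in Python with NumPy, assuming the 30 income and consumption observations from the Salvatore table and d = 6 omitted observations; the helper names `ess` and `goldfeld_quandt` are illustrative and do not come from the slides.

```python
import numpy as np

# Income levels 12..21, each observed three times (30 observations).
income = np.repeat(np.arange(12.0, 22.0), 3)
consumption = np.array([
    10.6, 10.8, 11.1,   # income 12
    11.4, 11.7, 12.1,   # income 13
    12.3, 12.6, 13.2,   # income 14
    13.0, 13.3, 13.6,   # income 15
    13.8, 14.0, 14.2,   # income 16
    14.4, 14.9, 15.3,   # income 17
    15.0, 15.7, 16.4,   # income 18
    15.9, 16.5, 16.9,   # income 19
    16.9, 17.5, 18.1,   # income 20
    17.2, 17.8, 18.5,   # income 21
])

def ess(x, y):
    """Error sum of squares from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def goldfeld_quandt(x, y, d, k=2):
    """Return (F, degrees of freedom, ESS1, ESS2) after omitting the middle d observations."""
    order = np.argsort(x, kind="stable")   # order the data by X
    x, y = x[order], y[order]
    m = (len(x) - d) // 2                  # size of each sub-sample
    ess1 = ess(x[:m], y[:m])               # low values of X
    ess2 = ess(x[-m:], y[-m:])             # high values of X
    return ess2 / ess1, m - k, ess1, ess2

f_stat, df, ess1, ess2 = goldfeld_quandt(income, consumption, d=6)
print(f"ESS1 = {ess1:.3f}, ESS2 = {ess2:.3f}, F({df},{df}) = {f_stat:.2f}")
# prints: ESS1 = 1.069, ESS2 = 3.344, F(10,10) = 3.13
```

Because both sub-samples have the same degrees of freedom, the ratio of mean squared errors reduces to the plain ratio ESS2/ESS1 used in the slide.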
To Correct for Heteroscedasticity
To correct for heteroscedasticity of the form $Var(e_i) = C X_i^2$, where C is a nonzero constant, transform the variables by dividing through by the problematic variable.
In the two-variable case,
$$\frac{Y_i}{X_i} = \beta_1 \frac{1}{X_i} + \beta_2 + \frac{e_i}{X_i}$$
The transformed error term is now homoscedastic, since $Var(e_i / X_i) = C X_i^2 / X_i^2 = C$.
Problem
$$\frac{C}{Y_d} = \beta_1 \frac{1}{Y_d} + \beta_2 + u_i$$
$$\frac{\hat{C}}{Y_d} = 0.792 + 1.421\,\frac{1}{Y_d}$$
$$\hat{C} = 1.421 + 0.792\,Y_d$$
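The transformed regression can be sketched in a few lines. This is an illustrative Python/NumPy example, assuming the same 30 Salvatore observations and the error-variance form $Var(e_i) = C\,Y_d^2$; the variable names are chosen here for illustration and are not from the slides.

```python
import numpy as np

# Income levels 12..21, each observed three times (30 observations).
income = np.repeat(np.arange(12.0, 22.0), 3)
consumption = np.array([
    10.6, 10.8, 11.1, 11.4, 11.7, 12.1, 12.3, 12.6, 13.2, 13.0, 13.3, 13.6,
    13.8, 14.0, 14.2, 14.4, 14.9, 15.3, 15.0, 15.7, 16.4, 15.9, 16.5, 16.9,
    16.9, 17.5, 18.1, 17.2, 17.8, 18.5,
])

# Divide the whole equation by Yd:  C/Yd = beta1 * (1/Yd) + beta2 + u
lhs = consumption / income
X = np.column_stack([1.0 / income, np.ones_like(income)])

# OLS on the transformed variables. The coefficient on 1/Yd is the
# original intercept (beta1); the new constant is the original slope (beta2).
coef = np.linalg.lstsq(X, lhs, rcond=None)[0]
beta1, beta2 = coef

print(f"C-hat = {beta1:.3f} + {beta2:.3f} * Yd")
```

The printed coefficients should be close to the 1.421 and .792 reported in the slide. Note that the roles of intercept and slope swap in the transformed equation.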
Serial Correlation
This is the problem which arises in OLS estimation when the errors are not independent.
The error term in one period is correlated with error terms in previous periods.
If $e_i$ is correlated with $e_{i-1}$, then we say there is first-order serial correlation.
Serial correlation may be positive, $E(e_i e_{i-1}) > 0$, or negative, $E(e_i e_{i-1}) < 0$.
Serial Correlation
If serial correlation is present, the OLS estimates are still unbiased and consistent, but the standard errors are biased, leading to incorrect statistical tests and biased confidence intervals.
With positive serial correlation, the standard errors of $\hat{\beta}$ are biased downward, leading to higher t stats.
With negative serial correlation, the standard errors of $\hat{\beta}$ are biased upward, leading to lower t stats.
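Durbin-Watson Statistic
$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}, \qquad 0 \le d \le 4$$

Range of d        Conclusion
0 to dL           positive serial correlation (+SC)
dL to dU          inconclusive
dU to 4-dU        no serial correlation (centered on d = 2)
4-dU to 4-dL      inconclusive
4-dL to 4         negative serial correlation (-SC)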
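The statistic and its decision regions can be sketched in Python. This is a minimal illustration with NumPy: `durbin_watson` and `dw_decision` are hypothetical helper names, and the bounds dL and dU must be looked up in a Durbin-Watson table for the actual sample size and number of regressors (the values 1.2 and 1.4 below are placeholders only).

```python
import numpy as np

def durbin_watson(residuals):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

def dw_decision(d, d_lower, d_upper):
    """Map d onto the five regions between 0 and 4."""
    if d < d_lower:
        return "positive serial correlation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no serial correlation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative serial correlation"

# Residuals that flip sign every period push d toward 4;
# residuals that never change push d toward 0.
d_alternating = durbin_watson([1, -1, 1, -1])   # (4 + 4 + 4) / 4 = 3.0
d_constant = durbin_watson([1, 1, 1, 1])        # 0 / 4 = 0.0

# Placeholder bounds, NOT taken from a table.
print(dw_decision(d_alternating, d_lower=1.2, d_upper=1.4))
print(dw_decision(d_constant, d_lower=1.2, d_upper=1.4))
```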
Problem
Data 9-4 shows corporate profits and sales in
billions of dollars for the manufacturing sector
of the U.S. from 1974 to 1994.
Estimate the equation
$$Profits = \beta_1 + \beta_2\,Sales + e$$
Test for first-order serial correlation.
Problem
OLS estimate of Profit as a function of Sales:

Coefficients (a)
Model 1      Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)   34.014             24.041                           1.415   .173
SALES        2.654E-02          .011         .496                2.492   .022
a. Dependent Variable: PROFITS

$$\hat{\pi}_t = 34.01 + 0.027^{*}\,Sales$$
Problem
Test for serial correlation (SPSS output):

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .496 (a)   .246       .207                31.251                       1.080
a. Predictors: (Constant), SALES
b. Dependent Variable: PROFITS
Correcting for Serial Correlation
We assume:
$$e_t = \rho e_{t-1} + u_t, \qquad \rho = \frac{Cov(e_t, e_{t-1})}{\sigma_e^2}$$
where $u_t$ is distributed normally with a zero mean and constant variance.
Follow a Durbin procedure.
Correcting for Serial Correlation
$$Y_t = \beta_1 + \beta_2 X_{2t} + \dots + \beta_k X_{kt} + e_t$$
$$Y_{t-1} = \beta_1 + \beta_2 X_{2t-1} + \dots + \beta_k X_{kt-1} + e_{t-1}$$
$$\rho Y_{t-1} = \rho\beta_1 + \rho\beta_2 X_{2t-1} + \dots + \rho\beta_k X_{kt-1} + \rho e_{t-1}$$
$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2 (X_{2t} - \rho X_{2t-1}) + \dots + \beta_k (X_{kt} - \rho X_{kt-1}) + (e_t - \rho e_{t-1})$$
Correcting for Serial Correlation
Move the lagged dependent variable term to the right-hand side and estimate the equation using OLS. The estimated coefficient on the lagged dependent variable is the estimate of $\rho$.
$$Y_t = \beta_1(1-\rho) + \rho Y_{t-1} + \beta_2 (X_{2t} - \rho X_{2t-1}) + \dots + \beta_k (X_{kt} - \rho X_{kt-1}) + (e_t - \rho e_{t-1})$$
Correcting for Serial Correlation
Create new independent and dependent variables by the following process:
$$X_t^* = X_t - \rho X_{t-1}$$
$$Y_t^* = Y_t - \rho Y_{t-1}$$
Estimate the following equation:
$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2 (X_{2t} - \rho X_{2t-1}) + \dots + \beta_k (X_{kt} - \rho X_{kt-1}) + (e_t - \rho e_{t-1})$$
$$Y_t^* = (1-\rho)\beta_1 + \beta_2 X_2^* + \dots + \beta_k X_k^* + u_t$$
Correcting for Serial Correlation
$$Y_t^* = \beta_1(1-\rho) + \beta_2 X_2^* + \dots + \beta_k X_k^* + u_t$$
The slope coefficients of the transformed equation estimate the same parameters as in the original equation, but are corrected for serial correlation.
The constant of the regression on the transformed variables is
$$\beta_1^* = \beta_1 (1-\rho) \quad \text{or} \quad \beta_1 = \frac{\beta_1^*}{1-\rho}$$
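The two-step procedure described in these slides can be sketched as follows. This is an illustrative Python/NumPy implementation for a single regressor, run on simulated data; the function names, the simulated parameter values (intercept 2.0, slope 0.5, rho 0.5), and the random seed are assumptions made for the example and do not come from the slides.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def durbin_two_step(x, y):
    """Two-step Durbin procedure for one regressor and first-order serial correlation."""
    y_t, y_lag = y[1:], y[:-1]
    x_t, x_lag = x[1:], x[:-1]
    ones = np.ones_like(y_t)

    # Step 1: regress Y_t on a constant, Y_{t-1}, X_t and X_{t-1}.
    # The coefficient on Y_{t-1} is the estimate of rho.
    rho = ols(np.column_stack([ones, y_lag, x_t, x_lag]), y_t)[1]

    # Step 2: build the starred variables and regress Y* on a constant and X*.
    y_star = y_t - rho * y_lag
    x_star = x_t - rho * x_lag
    const_star, slope = ols(np.column_stack([ones, x_star]), y_star)

    # Recover the original intercept: beta1 = beta1* / (1 - rho).
    return rho, const_star / (1.0 - rho), slope

# Simulated data with AR(1) errors: e_t = rho * e_{t-1} + u_t (assumed values).
rng = np.random.default_rng(0)
n, beta1, beta2, true_rho = 5000, 2.0, 0.5, 0.5
x = rng.normal(size=n)
u = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = true_rho * e[t - 1] + u[t]
y = beta1 + beta2 * x + e

rho_hat, b1_hat, b2_hat = durbin_two_step(x, y)
print(f"rho = {rho_hat:.2f}, intercept = {b1_hat:.2f}, slope = {b2_hat:.2f}")
```

With a large simulated sample the estimates should land near the assumed values; with short series such as the 21 annual observations in the profits example, the estimate of rho is much less precise.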
Problem
Begin by regressing Profit ($\pi$) on Profit lagged one period, Sales, and Sales lagged one period.
$$\pi_t = \beta_1(1-\rho) + \rho \pi_{t-1} + \beta_2 S_t - \rho\beta_2 S_{t-1} + u_t$$
The estimated coefficient on the lagged dependent variable is the estimate of $\rho$.
Problem
$\hat{\rho} = .49$

Coefficients (a)
Model 1      Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)   -1.419             24.387                           -.058    .954
PROFITSL     .492               .209         .419                2.358    .031
SALES        .176               .052         3.106               3.355    .004
SALESL       -.161              .053         -2.840              -3.046   .008
a. Dependent Variable: PROFITS
Problem
Then generate the transformed (starred) variables.
Run regression on transformed variables
Coefficients (a)
Model 1      Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)   .167               24.855                           .007    .995
SALESS       4.234E-02          .020         .442                2.091   .051
a. Dependent Variable: PROFITSS
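Profit* = .167 + .042 Sales*
Dividing the constant by $(1 - \hat{\rho})$ recovers the original intercept, .167 / (1 - .49) = .327, while the slope is unchanged:
Profit = .327 + .042 Sales (with no serial correlation)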