178.280 Research Methods in Financial Economics


125.785 Research Methods in Finance
Seminar 2: The Simple OLS Model
Introduction
Tough guys don’t do math. Tough guys fry
chicken for a living.
-- Jaime Escalante (Stand and Deliver, 1988).
Outline

Administration
– Computer Labs
– Quiz 1 (31 July 33.30pm)
The Simple Regression Model
Readings: Chapters 2 and 4, Studenmund.
Eviews Demonstration
Recap: Simple OLS Model


We begin with the Ordinary Least Squares (OLS) regression model.
This generates a ‘straight line’ between 2 variables.
The line ‘approximates’ the relationship between the two variables.
The variables are:
– Dependent (Y)
– Independent or explanatory (X).
Why Use OLS?


OLS is relatively easy to use (unlike MLE).
The goal of minimising $\sum e_i^2$ is appropriate and theoretically appealing.
– It punishes large deviations from the regression line.
OLS estimates have some useful characteristics.
– Under certain conditions it generates the best linear unbiased estimators (BLUE) of the coefficients.
Aside



OLS uses a least squares method.
It minimises the sum of the squares of the residuals.
The estimate of Y consists of:
– Constant (beta-nought)
– Slope (beta-one)
$\min \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$
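As a minimal sketch of what "least squares" means in practice, the Python below computes the constant and slope of a simple regression from the standard closed-form formulas, then checks that nudging either coefficient raises the sum of squared residuals. The data values and the function names (`ols_fit`, `sum_sq_resid`) are made up for illustration and are not part of the course data.

```python
def ols_fit(x, y):
    """Return (b0, b1): the constant and slope that minimise sum(e_i^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope (beta-one)
    b0 = y_bar - b1 * x_bar     # constant (beta-nought)
    return b0, b1


def sum_sq_resid(x, y, b0, b1):
    """Sum of squared residuals for the line y_hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
best = sum_sq_resid(x, y, b0, b1)

# Any other line should do worse on the least squares criterion.
worse_slope = sum_sq_resid(x, y, b0, b1 + 0.1)
worse_const = sum_sq_resid(x, y, b0 + 0.1, b1)
print(b0, b1, best, worse_slope, worse_const)
```

The comparison at the end is the whole idea: among all straight lines, the OLS line has the smallest sum of squared residuals.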
The Classical Assumptions

Assumption 1
– The regression model is linear, is correctly specified, and has an additive error term.
Assumption 2
– The error term has a zero population mean.
Assumption 3
– All explanatory variables are uncorrelated with the error term.
Classical Assumptions II

Assumption 4
– Observations of the error term are uncorrelated with each other (no serial correlation).
– We can’t predict future errors on the basis of past errors.
Assumption 5
– The error term has a constant variance (homoscedastic).
Assumption 6
– No explanatory variable is a perfect linear function of any other explanatory variable.
Illustration: Violation of
Homoscedasticity
[Figure: residuals (E) plotted against TIME]
Analysing the OLS Model

The model has a deterministic part.
– That part of the value of Y we can successfully predict.
Stochastic part
– That part of Y we can’t explain or predict.
The stochastic part contains:
– Measurement errors
– Missing variables
– Genuine random effects
– Non-linearities in the relationship.
General Equation
$Y = \beta_0 + \beta_1 X + e$
– Y: Dependent Variable
– $\beta_0$: Constant
– X: Explanatory Variable
– $\beta_0 + \beta_1 X$: Deterministic portion
– e: Stochastic portion
Analysing a Model
– Estimate Value of Parameter (E.g. R2)
– Test value for Significance (E.g. F-test)
– Select preferred hypothesis
Evaluating the Model
Overall Model- R2
Coefficients- β
Evaluating the Model

This is very important:
– Models are evaluated at two levels.
– The high level is the overall model.
– The low level is the individual components of the model.
Overall Model
– We measure how good it is (goodness of fit) with the R2.
– Hypotheses are:
– EITHER [H0] the model does not have a good fit (R2=0), OR
– [H1] the model has a good fit (R2>0).
Hypothesis Tests



We test each measure.
Goodness of fit uses an F-test.
A high R2 implies a high F-statistic.
A high F-statistic implies a low p-value.
A p-value is the probability of seeing a test statistic at least this extreme if the null hypothesis is true. Rejecting the null on that basis risks a mistake (Type I error).
– The mistake is rejecting the null hypothesis when it is true.
– Conventional significance levels for p range from 1% to 10%.
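The link between R2 and the F-statistic can be sketched with the standard formula F = (R2/k) / ((1-R2)/(n-k-1)), where n is the number of observations and k is the number of explanatory variables (k=1 in the simple model). The sample size and R2 values below are hypothetical. Turning F into a p-value needs the F-distribution, which regression software normally reports, so that step is left out here.

```python
def f_statistic(r2, n, k=1):
    """F-statistic for H0: R2 = 0, given R2, n observations and k regressors."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


n = 12                            # hypothetical sample size
f_low = f_statistic(0.5, n)       # weaker fit
f_high = f_statistic(0.8, n)      # stronger fit: F rises with R2
print(f_low, f_high)
```

With the sample size held fixed, a higher R2 always gives a higher F, which is why the two are tested together.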
Evaluating the coefficients

The Simple OLS model generates two coefficient estimates:
– Constant (Y-intercept)
– Slope (beta)
These are the measures of value.
These are tested with a t-test.
The null hypothesis in standard regression output is ALWAYS
– βi = 0
This hypothesis implies that X has no effect on Y.
Use p-value for evaluation.
Causality


It can be easy to find relationships between economic variables.
A relationship does not prove that X causes Y.
– E.g. Storks and Babies
Modellers prefer to use economic theory to generate a model, then test the model.
Summary- R2
Measure: R2
Test: F-test
Null hypothesis: R2=0
Outcomes: Do not reject R2=0, or Reject R2=0
Summary- β
Measure: β
Test: t-test
Null hypothesis: β=0
Outcomes: Do not reject β=0, or Reject β=0
Recap- Regression Coefficients

The key point is that X → Y.
– Causality is in one direction.
The OLS model estimates 2 coefficients:
– Slope and
– Constant
The Regression Equation



Yi = β0 + β1Xi + ei
This is the general Cartesian equation for a straight line, plus an error term ei.
Note- the values of the coefficients are estimates.
– We cannot know their true value with certainty.
The Slope


One way to describe β is the slope.
A better way is
– ΔY / ΔX or
– dY/dX
Intuitively it is
– How much Y changes IF we change X by one unit, holding ALL other factors constant.
Slopes and Natural Logs
The ratio ΔY/ΔX approximates the derivative dY/dX.
Note: if $f(x) = \ln g(x)$, the derivative is:
$f'(x) = \frac{g'(x)}{g(x)}$
So the slope of:
ln(y) = f(x)
is $\frac{dy/y}{dx}$
(Numerator is now a growth-rate.)
ln(y) = f(ln(x))
is $\frac{dy/y}{dx/x} = \frac{dy}{dx} \cdot \frac{x}{y}$
(Expression is an elasticity.)
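The two log cases can be checked numerically. The sketch below builds one exact exponential-growth series and one exact constant-elasticity series (both hypothetical, with parameters chosen only for illustration) and confirms that the fitted slope recovers the growth rate in the first case and the elasticity in the second. The helper name `slope` is an assumption of this example.

```python
import math


def slope(x, y):
    """OLS slope of y on x for a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Case 1: ln(y) on x. y grows at 3% per period (hypothetical).
GROWTH = 0.03
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y_growth = [2.0 * math.exp(GROWTH * ti) for ti in t]
semi_log_slope = slope(t, [math.log(v) for v in y_growth])

# Case 2: ln(y) on ln(x). y = 3 * x^0.5, so the elasticity is 0.5 (hypothetical).
ELASTICITY = 0.5
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y_elastic = [3.0 * xi ** ELASTICITY for xi in x]
log_log_slope = slope([math.log(v) for v in x], [math.log(v) for v in y_elastic])

print(semi_log_slope, log_log_slope)
```

Because the series are constructed without noise, the slopes match the chosen growth rate and elasticity up to rounding; with real data they would only be estimates.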
Testing Coefficients



If β has a true value of 0, the “explanatory” variable has no explanatory power.
This is tested with the t-ratio.
H0: β=0 vs. H1: β≠0
We can test other hypotheses about β.
E.g. test the elasticity with respect to X.
The null is (usually) that β=1.
If we reject we may conclude
– Elastic β>1 vs. Inelastic β<1.
t-test for β


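The t-statistic for β has a Student’s t-distribution.
The t-statistic is a function of:
– Estimated β
– Hypothesised β
– Standard Error
Tests easily done with programs like Eviews.
$t = \frac{\hat{\beta} - \beta_{H_0}}{SE(\hat{\beta})}$, where $\beta_{H_0}$ is the hypothesised value of β.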
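A rough sketch of how the three ingredients combine, assuming the usual simple-regression standard error SE = sqrt(RSS/(n-2) / Σ(x - x̄)²). The data and the function name `slope_t_statistic` are hypothetical. The same function covers the default null (β=0) and an alternative such as β=1.

```python
import math


def slope_t_statistic(x, y, beta_null=0.0):
    """t = (estimated slope - hypothesised slope) / standard error of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)   # n - 2 degrees of freedom
    return (b1 - beta_null) / se_b1


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t_zero = slope_t_statistic(x, y)                 # H0: beta = 0
t_one = slope_t_statistic(x, y, beta_null=1.0)   # H0: beta = 1
print(t_zero, t_one)
```

The resulting t values would then be compared with a Student's t-distribution with n-2 degrees of freedom, a step the software normally does for us by reporting the p-value.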
ANOVA



ANOVA is an abbreviation for ANalysis Of VAriance.
In order to calculate ‘goodness of fit’ we need a comparison point.
This is the Total Sum of Squares.
[Figure: LP91 plotted against AGE]
Measuring TSS

TSS is a measure of “total scatter”.
Measured as the sum of squared differences between:
– Observed yi and mean y, i.e. $TSS = \sum (y_i - \bar{y})^2$
[Figure: LP91 plotted against AGE]
ANOVA II


The TSS is a variance measure (spread term).
It has two components:
– Explained Sum of Squares (ESS)
– Residual Sum of Squares (RSS)
ESS is the part of the TSS our model “explains”.
– R2 = ESS/TSS
– The associated test statistic has an F-distribution.
RSS is the part of TSS we can’t explain.
– OLS minimises this term.
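The decomposition can be verified on a small example. This sketch fits a simple regression to hypothetical data, computes TSS, ESS and RSS from their definitions, and checks that TSS = ESS + RSS and that R2 = ESS/TSS. Variable names are chosen for this example only.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and constant for the simple model.
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)                 # total scatter
ess = sum((fi - y_bar) ** 2 for fi in y_hat)             # explained by the model
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # left unexplained

r_squared = ess / tss
print(tss, ess, rss, r_squared)
```

Note that the identity TSS = ESS + RSS relies on the model including a constant, as the simple OLS model here does.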
Statistical Tests
Tests used    What is tested            Example
t-test        A ‘point’ estimate        E.g. β
χ2 test       A ‘variance’ estimate     E.g. σ2
F-test        A ratio of variances      E.g. R2
Eviews
The Eviews window has:
– Tool bar
– Top-pane (for manipulating variables)
– Lower Scroll-bar (output may appear there)
[Screenshot: Eviews window with the Tool Bar, Top Pane and Scroll Bar labelled; a new pane opens for Data]
Lab Introduction



The data file is trade.wf1
We will estimate an ‘Absorption Model’ for NZ.
This uses National Income Accounting procedures.
Y=C+I+G+X-M
– Hence
Y-C-I-G=X-M
– Add/Subtract TX from both sides
(Y-TX-C)-I+(TX-G)=(X-M)+TX-TX
Absorption Model


Note- Y-TX-C equals Savings (S)
So
– (S-I)+(TX-G)=(X-M)
– Net Private Savings + Net Public Savings = Current Account Balance
– Net National Savings (NNS) = CAB
Implication: Economies with low savings rates (deficits) will run trade deficits.
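A quick arithmetic check of the identity, using invented national accounts figures (they are not New Zealand data and are not taken from trade.wf1). Variable names follow the slide's symbols.

```python
# Hypothetical national accounts figures, for illustration only.
C = 60.0    # household consumption
I = 20.0    # investment
G = 25.0    # government spending
X = 30.0    # exports
M = 35.0    # imports
TX = 22.0   # taxes

Y = C + I + G + X - M            # national income identity
S = Y - TX - C                   # savings

net_private_savings = S - I
net_public_savings = TX - G
nns = net_private_savings + net_public_savings   # Net National Savings
cab = X - M                                      # Current Account Balance

print(Y, S, nns, cab)
```

In this made-up economy both the private and the public sector save less than they spend, and the shortfall shows up one-for-one as an external deficit.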
Definitions




C- Household consumption of final goods and services
I- Investment
G- Government Spending on goods and services
TX- Taxes
X- Exports
M- Imports (removed for double-counting purposes).
S- Household Income after taxes, less Consumption (Y-TX-C)
Return on an asset

We have data for an asset (Grange Wine)
– Price at 3 auctions
– Vintage of each bottle
This exercise requires the data be manipulated in 3 ways:
– The price has to be expressed as a growth rate.
– The Vintage of the bottle has to be expressed as ‘age’.
– Years with missing values need to be omitted.
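The three manipulations can be sketched outside Eviews as follows. Everything specific here is an assumption made for illustration: the prices and vintages are invented, the auction year is hypothetical, and the growth rate is taken to be the log change in price between two auctions, which may differ from the definition used in the lab.

```python
import math

AUCTION_YEAR = 1991  # hypothetical auction year, for illustration only

# Hypothetical records: (vintage, price at earlier auction, price at later auction).
bottles = [
    (1971, 100.0, 110.0),
    (1976, 80.0, None),    # missing price: this vintage must be dropped
    (1981, 50.0, 60.0),
]


def prepare(records, auction_year):
    """Drop incomplete records, then compute age and log price growth."""
    rows = []
    for vintage, p_old, p_new in records:
        if p_old is None or p_new is None:
            continue                       # omit years with missing values
        age = auction_year - vintage       # vintage expressed as 'age'
        growth = math.log(p_new / p_old)   # price expressed as a growth rate
        rows.append({"age": age, "growth": growth})
    return rows


rows = prepare(bottles, AUCTION_YEAR)
for row in rows:
    print(row)
```

The resulting `age` and `growth` columns are the kind of series that would then be used as X and Y in the regression.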
New Series


New variables can be created EITHER using:
– Quick → Generate Series…
OR
– by including the expression in the regression estimate.
Conclusion

Key Numbers
– R-square, F-statistics and p-value
– Coefficient estimates
– t-statistics and their p-values.