Econ 140
Binary Response
Lecture 22
Today’s plan
• Three models:
• Linear probability model
• Probit model
• Logit model
• L22.xls provides an example of a linear probability model
and a logit model
Discrete choice variable
• Defining variables:
Yi = 1 if individual:
– Takes BART
– Buys a car
– Joins a union
Yi = 0 if individual:
– Does not take BART
– Does not buy a car
– Does not join a union
• The discrete choice variable Yi is a function of individual
characteristics: Yi = a + bXi + ei
Graphical representation
X = years of labor market experience
Y = 1 [if person joins union]
= 0 [if person doesn’t join union]
[Figure: observed data (Y = 0 or 1) plotted against X, with the OLS regression line $\hat{Y}$]
Linear probability model
• The OLS regression line in the previous slide is called the
linear probability model
– predicting the probability that an individual will join a
union given their years of labor market experience
• Using the linear probability model, we estimate the equation:
$\hat{Y} = \hat{a} + \hat{b}X$
– using $\hat{a}$ and $\hat{b}$ we can predict the probability
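A minimal sketch of fitting a linear probability model by OLS and predicting a probability from it. The data below are made up for illustration (they are not the L22.xls data), and the names `ols_fit`, `exper`, and `union` are assumptions introduced here.

```python
# Linear probability model: OLS of a 0/1 outcome on a single regressor.
# The sample is synthetic and purely illustrative (NOT the L22.xls data).

def ols_fit(x, y):
    """Return (a_hat, b_hat) for the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

exper = [1, 3, 5, 8, 12, 15, 20, 25, 30, 35]   # years of experience (hypothetical)
union = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]         # 1 = joins a union (hypothetical)

a_hat, b_hat = ols_fit(exper, union)

# Predicted probability of union membership at 20 years of experience.
p_hat_20 = a_hat + b_hat * 20
print(f"a_hat = {a_hat:.4f}, b_hat = {b_hat:.4f}, P_hat(X=20) = {p_hat_20:.4f}")
```

Nothing in this calculation keeps the predicted value inside the 0 to 1 range; for large enough X the fitted line will cross 1.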
Linear probability model (2)
• Problems with the linear probability model
1) Predicted probabilities don’t necessarily lie within the 0
to 1 range
2) We get a very specific form of heteroskedasticity
• errors for this model are $e_i = Y_i - \hat{Y}_i$
• note: Yˆi values are along the continuous OLS line,
but Yi values jump between 0 and 1 - this creates
large variation in errors
3) Errors are non-normal
• We can use the linear probability model as a first guess
– can be used for start values in a maximum likelihood problem
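The "very specific form" of heteroskedasticity in problem 2 can be written out. This is a short derivation under the assumption that the linear probability model holds in the population, so that $P_i = \Pr(Y_i = 1 \mid X_i) = a + bX_i$; the symbol $P_i$ is introduced here and does not appear on the slides.

```latex
\begin{align*}
e_i &= Y_i - (a + bX_i) =
\begin{cases}
1 - P_i & \text{with probability } P_i \\
-P_i    & \text{with probability } 1 - P_i
\end{cases} \\
E(e_i \mid X_i) &= P_i(1 - P_i) - (1 - P_i)P_i = 0 \\
\operatorname{Var}(e_i \mid X_i) &= P_i(1 - P_i)^2 + (1 - P_i)P_i^2 = P_i(1 - P_i)
\end{align*}
```

Because $P_i$ depends on $X_i$, the error variance changes with $X_i$, and because $e_i$ takes only two values for a given $X_i$, the errors cannot be normal.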
McFadden’s Contribution
• Suggestion: curve that runs strictly between 0 and 1 and
tails off at the boundaries like so:
[Figure: S-shaped curve for Y that stays strictly between 0 and 1 and flattens toward each boundary]
McFadden’s Contribution
• Recall the probability density function (PDF) and cumulative distribution function (CDF) for a standard normal:
[Figure: standard normal PDF, and standard normal CDF rising from 0 to 1]
Probit model
• Using the standard normal distribution gives the probit model: the normal PDF defines the CDF that is used for Pr(Yi = 1)
• The density function for the normal is:
$f(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} Z^2\right)$
where Z = a + bX
• For the probit model, we want to find
$\Pr(Y_i = 1) = F(Z_i)$
$f(Z_i)$ = PDF, $F(Z_i)$ = CDF
$\Pr(Z \le z)$ = CDF
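A minimal sketch of the two normal functions and the probit probability, using only the Python standard library. The parameter values `a` and `b` are hypothetical and chosen only to show the calculation.

```python
import math

def normal_pdf(z):
    """Standard normal density f(z) = exp(-z^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal CDF F(z) = Pr(Z <= z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(a, b, x):
    """Pr(Y = 1) = F(a + b*x) under the probit model."""
    return normal_cdf(a + b * x)

# Hypothetical parameter values, not estimates from the lecture data.
a, b = -1.0, 0.05

for x in (0, 10, 20, 40):
    print(f"X = {x:2d}  Z = {a + b * x:+.2f}  Pr(Y=1) = {probit_probability(a, b, x):.4f}")
```

Unlike the linear probability model, every value returned by `probit_probability` lies strictly between 0 and 1.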
Probit model (2)
• The probit model imposes the distributional form of the
CDF in order to estimate a and b
• The values $\hat{a}$ and $\hat{b}$ have to be estimated as part of the maximum likelihood procedure
Logit model
• The logit model uses the logistic distribution
Density:
$g(Z) = \frac{e^{Z}}{(1 + e^{Z})^2}$
Cumulative:
$G(Z) = \frac{1}{1 + e^{-Z}} = \frac{e^{Z}}{1 + e^{Z}}$
[Figure: standard normal CDF F(Z) and logistic CDF G(Z) plotted against Z]
Maximum likelihood
• Alternative estimation method that assumes you know the form of the population distribution
• Using maximum likelihood, we will be specifying the
model as part of the distribution
Maximum likelihood (2)
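• For example: Bernoulli distribution with a parameter $\theta$, where:
$\Pr(Y = 1) = \theta$
$\Pr(Y = 0) = 1 - \theta$
• We have an outcome
1110000100
• The probability expression is:
$\theta^3 (1-\theta)^4 \, \theta \, (1-\theta)^2 = \theta^4 (1-\theta)^6$
$\hat{\theta} = 0.4$
• We pick a sample of $Y_1, \ldots, Y_n$
$\Pr(Y_i = 1) = \theta$
$\Pr(Y_i = 0) = 1 - \theta$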
Maximum likelihood (3)
• Probability of getting observed Yi is based on the form we've assumed:
$\theta^{Y_i} (1-\theta)^{1-Y_i}$
• If we multiply across the observed sample:
$\prod_{i=1}^{n} \theta^{Y_i} (1-\theta)^{1-Y_i}$
• Given we think that an outcome of one occurs r times:
$\hat{\theta}^{\,r} (1-\hat{\theta})^{\,n-r}$
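A minimal sketch of the likelihood for the outcome 1110000100 from the Bernoulli example, evaluated over a grid of candidate parameter values. The parameter is called `theta` here; that name and the grid search are assumptions made for illustration, not the lecture's procedure.

```python
# Likelihood of the observed Bernoulli outcome 1110000100.
outcome = [int(ch) for ch in "1110000100"]

def likelihood(theta, ys):
    """Product over the sample of theta^y * (1 - theta)^(1 - y)."""
    value = 1.0
    for y in ys:
        value *= theta ** y * (1.0 - theta) ** (1 - y)
    return value

n = len(outcome)   # sample size
r = sum(outcome)   # number of ones

# Crude grid search over theta = 0.01, 0.02, ..., 0.99.
grid = [k / 100 for k in range(1, 100)]
theta_best = max(grid, key=lambda t: likelihood(t, outcome))

print(f"n = {n}, r = {r}, grid maximiser = {theta_best}")
```

The product depends on the data only through r and n, which is why the likelihood collapses to the r-times / (n - r)-times form.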
Maximum likelihood (3)
• If we take logs, we get
$L(\hat{\theta}) = r \log \hat{\theta} + (n - r) \log(1 - \hat{\theta})$
– This is the log-likelihood
– We can differentiate this and obtain a solution for $\hat{\theta}$
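The differentiation step can be written out as follows, writing $\theta$ for the Bernoulli parameter (the symbol is an assumption where the slide text does not show one):

```latex
\begin{align*}
L(\theta) &= r \log \theta + (n - r) \log(1 - \theta) \\
\frac{dL}{d\theta} &= \frac{r}{\theta} - \frac{n - r}{1 - \theta} = 0 \\
r(1 - \hat{\theta}) &= (n - r)\,\hat{\theta} \\
\hat{\theta} &= \frac{r}{n}
\end{align*}
```

For the outcome 1110000100, r = 4 and n = 10, so $\hat{\theta} = 4/10 = 0.4$.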
Maximum likelihood (4)
• In a more complex example, the logit model gives
$\Pr(Y_i = 1) = G(Z_i)$
$Z_i = a + bX_i$
$\Pr(Y_i = 0) = 1 - G(Z_i)$
• Instead of looking for estimates of $\theta$ we are looking for estimates of a and b
• Think of $G(Z_i)$ as $\theta$:
– we get a log-likelihood
$L(a, b) = \sum_i [Y_i \log(G_i) + (1 - Y_i) \log(1 - G_i)]$
– solve for a and b
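A minimal sketch of the logit log-likelihood L(a, b) as a function that can be evaluated at any trial values of a and b. The data are synthetic (not L22.xls) and the function names are assumptions introduced here.

```python
import math

def logistic_cdf(z):
    """G(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, b, x, y):
    """Sum over i of Y_i*log(G_i) + (1 - Y_i)*log(1 - G_i), with G_i = G(a + b*X_i)."""
    total = 0.0
    for xi, yi in zip(x, y):
        g = logistic_cdf(a + b * xi)
        total += yi * math.log(g) + (1 - yi) * math.log(1.0 - g)
    return total

# Synthetic illustration data (hypothetical, NOT the L22.xls data).
x = [1, 3, 5, 8, 12, 15, 20, 25, 30, 35]
y = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]

# At a = b = 0 every G_i equals 0.5, so L(0, 0) = n * log(0.5).
ll_start = log_likelihood(0.0, 0.0, x, y)
print(f"L(0, 0) = {ll_start:.4f}")
```

Maximum likelihood then amounts to searching for the pair (a, b) that makes this sum as large (as close to zero) as possible.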
Example
• Data on union membership and years of labor market
experience (L22.xls)
• To build the maximum likelihood form, we can think of:
– intercept: a
– coefficient on experience : b
• There are three columns
– Predicted value Z
– Estimated probability
– Estimated likelihood as given by the model
• The Solver from the Tools menu calculates estimates of a
and b
Example (2)
• How the solver works:
• Defining a and b using start values
• Choose start values of a and b equal to zero
• Define our model: Z = a + bX
• Define the predicted probabilities: $G(Z) = \frac{1}{1 + e^{-Z}}$
• Define the log-likelihood and sum it
– Can use Solver to change the values of a and b so that the summed log-likelihood is maximized
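The same steps can be sketched outside a spreadsheet. The example below starts from a = b = 0 and maximizes the logit log-likelihood with Newton-Raphson updates; that optimizer is a stand-in chosen for brevity and is not a description of how Excel's Solver searches. The data are synthetic (not L22.xls) and `fit_logit` is a name introduced here.

```python
import math

def logistic_cdf(z):
    """Predicted probability G(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, y, a=0.0, b=0.0, max_iter=50, tol=1e-10):
    """Maximise the logit log-likelihood over (a, b) by Newton-Raphson."""
    for _ in range(max_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood w.r.t. a and b
        h00 = h01 = h11 = 0.0    # entries of the negative Hessian
        for xi, yi in zip(x, y):
            p = logistic_cdf(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return a, b

# Synthetic illustration data (hypothetical, NOT the L22.xls data).
x = [1, 3, 5, 8, 12, 15, 20, 25, 30, 35]
y = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]

a_hat, b_hat = fit_logit(x, y)   # start values a = b = 0
print(f"a_hat = {a_hat:.4f}, b_hat = {b_hat:.4f}")
```

At the solution the two first-order conditions hold: the residuals Y_i - G_i sum to zero, and so do the residuals weighted by X_i.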
Comparing parameters
• How do we compare parameters across these models?
• The linear probability form is: Y = a + bX
– where $\frac{\partial \Pr}{\partial X} = b$
• Recall the graphs associated with each model
– Consequently $\frac{\partial \Pr}{\partial X} = g(\hat{Z}_i)\, b$
– This is the same for the probit and logit forms
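A small numerical check of the slope formula: the derivative of the predicted probability with respect to X equals the density at Z times b, using the logistic density G(Z)[1 - G(Z)] for the logit and the normal density for the probit. The values of `a`, `b`, and `x` are hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    """Logistic density, g(z) = G(z) * (1 - G(z))."""
    g = logistic_cdf(z)
    return g * (1.0 - g)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def numeric_slope(cdf, a, b, x, h=1e-6):
    """Central-difference approximation to d Pr(Y=1) / dX at x."""
    return (cdf(a + b * (x + h)) - cdf(a + b * (x - h))) / (2.0 * h)

# Hypothetical parameter values, not estimates from the lecture data.
a, b, x = -1.0, 0.05, 20.0
z = a + b * x

logit_slope = logistic_pdf(z) * b    # g(Z) * b
probit_slope = normal_pdf(z) * b     # f(Z) * b

print(f"logit : formula {logit_slope:.6f}  numeric {numeric_slope(logistic_cdf, a, b, x):.6f}")
print(f"probit: formula {probit_slope:.6f}  numeric {numeric_slope(normal_cdf, a, b, x):.6f}")
```

Unlike the linear probability model, where the slope is b everywhere, these slopes change with X because the density is evaluated at Z = a + bX.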
L22.xls example
• Predicting the linear probability model:
$\hat{U} = 0.281 + 0.005\,EXPER$
• If we wanted to predict the probability given 20 years of experience, we'd have:
$\hat{U} = 0.281 + 0.005(20) = 0.381$
• For the logit form:
– use the logistic CDF for the predicted probability:
$G(Z) = \frac{e^{Z}}{1 + e^{Z}}$
– logit estimated equation is:
$\hat{Z} = -2.38 + 0.06\,EXPER$
L22.xls example (2)
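• At 20 years of experience:
$\hat{Z} = -2.38 + 0.06(20) = -1.18$
$e^{\hat{Z}} = e^{-1.18} = 0.307$
$G(\hat{Z}) = \frac{0.307}{1 + 0.307} = 0.235$
$g(\hat{Z}) = G(\hat{Z})\,[1 - G(\hat{Z})] = 0.235 \times 0.765 = 0.180$
• Thus the slope at 20 years of experience is:
0.180 x 0.06 = 0.011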
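The logit arithmetic at 20 years of experience can be reproduced in a few lines. The coefficients are the ones reported on the slides, with the negative sign on the intercept inferred from $e^{\hat{Z}} = 0.307$; the slope multiplies b by the logistic density G(Z)[1 - G(Z)]. Variable names are assumptions introduced here.

```python
import math

# Logit coefficients as reported for the L22.xls example
# (the negative intercept is inferred from exp(Z_hat) = 0.307).
LOGIT_A = -2.38
LOGIT_B = 0.06
exper = 20

z_hat = LOGIT_A + LOGIT_B * exper        # index Z = a + b*EXPER
exp_z = math.exp(z_hat)                  # e^Z
prob = exp_z / (1.0 + exp_z)             # G(Z): predicted probability of joining
density = prob * (1.0 - prob)            # g(Z): logistic density at Z
slope = density * LOGIT_B                # dPr/dEXPER at 20 years

print(f"Z_hat = {z_hat:.2f}")
print(f"exp(Z_hat) = {exp_z:.3f}")
print(f"G(Z_hat) = {prob:.3f}")
print(f"g(Z_hat) = {density:.3f}")
print(f"slope = {slope:.3f}")
```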