BINARY CHOICE MODELS: PROBIT ANALYSIS In - Kian


In the previous lecture, we dealt with the unboundedness problem of the linear probability model (LPM) using the logit model.

In this lecture, we will consider another alternative, the probit model.

Adapted from “Introduction to Econometrics” by Christopher Dougherty


[Figure: the cumulative standardized normal distribution (axis from 0.00 to 1.00) and the standardized normal density f(Z) (axis from 0 to 0.4), plotted against Z.]

$$f(Z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2}$$

$$Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

In the case of probit analysis, the sigmoid function is the cumulative standardized normal distribution. The maximum likelihood principle is again used to obtain estimates of the parameters.


Estimating the probability of success

$$Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

Suppose that the probit equation yields Z = +0.2171. The area under the standardized normal curve to the left of this value is 0.5859, so the predicted probability of success is 58.59%. (You can use a standard normal table or the Excel function NORMSDIST.)

[Figure: standardized normal curve marking Z = +0.2171, with Area = 0.5859.]
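For readers without a normal table or Excel at hand, the same lookup can be sketched in Python. This is only an illustration: the function name `standard_normal_cdf` is our own, and the cumulative distribution is obtained from the standard library's error function rather than from any probit package.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Cumulative standardized normal distribution F(Z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = 0.2171  # value of Z taken from the example in the text
p = standard_normal_cdf(z)
print(f"F({z}) = {p:.4f}")
```

The printed value should agree with the 0.5859 read from the table.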

Quantifying the Marginal Effect

We will do this theoretically for the general case where Z is a function of several explanatory variables.

$$p = F(Z), \qquad Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

Since p is a function of Z, and Z is a function of the X variables, the marginal effect of $X_i$ on p can be written as:

$$\frac{\partial p}{\partial X_i} = \frac{dp}{dZ} \cdot \frac{\partial Z}{\partial X_i}$$

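(1)

$$\frac{dp}{dZ} = f(Z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2}$$

The marginal effect of Z on p is given by the standardized normal density function.

(2)

$$Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

$$\frac{\partial Z}{\partial X_i} = \beta_i$$

The marginal effect of $X_i$ on Z is given by $\beta_i$.

(3)

$$\frac{\partial p}{\partial X_i} = \frac{dp}{dZ} \cdot \frac{\partial Z}{\partial X_i} = \left( \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2} \right) \beta_i$$

Hence we obtain an expression for the marginal effect of $X_i$ on p.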

As with logit analysis, the marginal effects vary with Z.

A common procedure is to evaluate them for the value of Z given by the sample means of the explanatory variables.
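As a sanity check on the chain-rule result, the analytical marginal effect f(Z)β_i can be compared with a finite-difference approximation of the change in p. The sketch below uses made-up coefficients and X values purely for illustration (they are not from the lecture), and the function names are our own.

```python
import math


def normal_pdf(z: float) -> float:
    """Standardized normal density f(Z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Cumulative standardized normal distribution F(Z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability(betas, x):
    """p = F(Z) with Z = beta_1 + beta_2*X_2 + ... + beta_k*X_k (betas[0] is the intercept)."""
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return normal_cdf(z)


def marginal_effects(betas, x):
    """Analytical marginal effects f(Z) * beta_i for each explanatory variable."""
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return [normal_pdf(z) * b for b in betas[1:]]


# Hypothetical values, chosen only to exercise the formula.
betas = [-1.0, 0.5, -0.25]
x = [2.0, 1.0]

analytic = marginal_effects(betas, x)

# Central finite differences: nudge one X at a time and see how p responds.
h = 1e-6
numeric = []
for i in range(len(x)):
    up = list(x)
    down = list(x)
    up[i] += h
    down[i] -= h
    numeric.append((probability(betas, up) - probability(betas, down)) / (2.0 * h))

for a, n in zip(analytic, numeric):
    print(f"analytic {a:.6f}   numeric {n:.6f}")
```

The two columns should agree closely, which is what expression (3) predicts.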


ILLUSTRATION

Why do some people graduate from high school while others drop out?

Here we use the same multivariate example as in the logit lecture (see Illustration 2 in the logit lecture slides), to facilitate comparison.


. probit GRAD ASVABC SM SF MALE

Iteration 0:   log likelihood = -118.67769
Iteration 1:   log likelihood = -98.195303
Iteration 2:   log likelihood = -96.666096
Iteration 3:   log likelihood = -96.624979
Iteration 4:   log likelihood = -96.624926

Probit estimates                                  Number of obs   =        540
                                                  LR chi2(4)      =      44.11
                                                  Prob > chi2     =     0.0000
Log likelihood = -96.624926                       Pseudo R2       =     0.1858

------------------------------------------------------------------------------
        GRAD |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |   .0648442   .0120378     5.39   0.000     .0412505    .0884379
          SM |  -.0081163   .0440399    -0.18   0.854     -.094433    .0782004
          SF |   .0056041   .0359557     0.16   0.876    -.0648677    .0760759
        MALE |   .0630588   .1988279     0.32   0.751    -.3266368    .4527544
       _cons |  -1.450787   .5470608    -2.65   0.008    -2.523006   -.3785673
------------------------------------------------------------------------------


. sum GRAD ASVABC SM SF MALE

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+------------------------------------------------------
        GRAD |     540    .9425926   .2328351          0          1
      ASVABC |     540    51.36271   9.567646   25.45931   66.07963
          SM |     540    11.57963   2.816456          0         20
          SF |     540    11.83704    3.53715          0         20
        MALE |     540          .5   .5004636          0          1

As with logit analysis, the coefficients have no direct interpretation.

However, we can use them to quantify the marginal effects of the explanatory variables on the probability of graduating from high school.

We will estimate the marginal effects, putting all the explanatory variables equal to their sample means.


Step 1: Calculate Z, when the X variables are equal to their sample means.

Variable     $X_i$     $\beta_i$    $\beta_i X_i$
ASVABC       51.36      0.065        3.328
SM           11.58     –0.008       –0.094
SF           11.84      0.006        0.066
MALE          0.50      0.063        0.032
Constant      1.00     –1.451       –1.451
Total                                1.881

$$Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k = 1.881$$

Step 2: Calculate $dp/dZ$.

$$\frac{dp}{dZ} = f(Z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2} = 0.068$$

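Step 3: Calculate $\partial p / \partial X_i$.

Note that:

$$\frac{\partial Z}{\partial X_i} = \beta_i$$

$$\frac{\partial p}{\partial X_i} = \frac{dp}{dZ} \cdot \frac{\partial Z}{\partial X_i}$$

Variable     $\beta_i$    $dp/dZ$    $\partial p / \partial X_i = \beta_i \cdot dp/dZ$
ASVABC        0.065       0.068       0.004
SM           –0.008       0.068      –0.001
SF            0.006       0.068       0.000
MALE          0.063       0.068       0.004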

We see that a one-point increase in ASVABC increases the probability of graduating from high school by about 0.004, i.e. 0.4 percentage points.

Mother's schooling (SM) has negligible effect and father's schooling (SF) has no discernible effect at all.

Males have a probability of graduating about 0.4 percentage points higher than females.
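The three steps can be reproduced with a short script. This is a sketch under stated assumptions: the coefficients and sample means are copied from the Stata output above at full precision, so intermediate figures differ slightly from the hand calculation, which rounds them first. The variable names are our own.

```python
import math

# Coefficients from the probit output and sample means from the sum output.
coef = {"ASVABC": 0.0648442, "SM": -0.0081163, "SF": 0.0056041, "MALE": 0.0630588}
constant = -1.450787
means = {"ASVABC": 51.36271, "SM": 11.57963, "SF": 11.83704, "MALE": 0.5}

# Step 1: Z evaluated at the sample means of the explanatory variables.
z_bar = constant + sum(coef[name] * means[name] for name in coef)

# Step 2: dp/dZ = f(Z), the standardized normal density at that Z.
f_z = math.exp(-0.5 * z_bar ** 2) / math.sqrt(2.0 * math.pi)

# Step 3: marginal effect of each X_i on p is f(Z) * beta_i.
marginal = {name: f_z * b for name, b in coef.items()}

print(f"Z at means = {z_bar:.3f}, f(Z) = {f_z:.3f}")
for name, effect in marginal.items():
    print(f"{name:7s} {effect: .3f}")
```

To three decimal places the marginal effects should match the table in Step 3, even though Z itself comes out a little above the hand-calculated 1.881.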


What is the probability of graduating when ASVABC is equal to (a) 30, (b) 50? Set the values of the other X variables equal to their sample means.

(a) ASVABC = 30

Variable     $X_i$     $\beta_i$    $\beta_i X_i$
ASVABC       30         0.065        1.95
SM           11.58     –0.008       –0.094
SF           11.84      0.006        0.066
MALE          0.50      0.063        0.032
Constant      1.00     –1.451       –1.451
Total                                0.503

$$Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k = 0.503$$

NORMSDIST(0.503) = 0.6925

When ASVABC = 30, the probability of graduating is 69.25%.


(b) ASVABC = 50

Variable     $X_i$     $\beta_i$    $\beta_i X_i$
ASVABC       50         0.065        3.25
SM           11.58     –0.008       –0.094
SF           11.84      0.006        0.066
MALE          0.50      0.063        0.032
Constant      1.00     –1.451       –1.451
Total                                1.803

$$Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k = 1.803$$

NORMSDIST(1.803) = 0.9643

When ASVABC = 50, the probability of graduating is 96.43%.
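Both predicted probabilities can be checked with a few lines of Python. As a sketch, it takes the unrounded coefficients and sample means from the Stata output, so Z and the probabilities come out marginally lower than the hand-calculated figures (roughly 0.69 and 0.96), which rely on the rounded ASVABC coefficient 0.065. The function name `graduation_probability` is our own.

```python
import math

# Coefficients from the probit output; the other X variables are held at their sample means.
COEF = {"ASVABC": 0.0648442, "SM": -0.0081163, "SF": 0.0056041, "MALE": 0.0630588}
CONSTANT = -1.450787
OTHER_MEANS = {"SM": 11.57963, "SF": 11.83704, "MALE": 0.5}


def normal_cdf(z: float) -> float:
    """Cumulative standardized normal distribution (what NORMSDIST returns in Excel)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def graduation_probability(asvabc: float):
    """Return (Z, p) for a given ASVABC score, other variables at their means."""
    z = CONSTANT + COEF["ASVABC"] * asvabc
    z += sum(COEF[name] * mean for name, mean in OTHER_MEANS.items())
    return z, normal_cdf(z)


z30, p30 = graduation_probability(30)
z50, p50 = graduation_probability(50)
print(f"ASVABC = 30: Z = {z30:.3f}, p = {p30:.4f}")
print(f"ASVABC = 50: Z = {z50:.3f}, p = {p50:.4f}")
```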


Logit versus Probit

             Logit     Probit    Linear
             f(Z)b     f(Z)b     b
ASVABC       0.004     0.004     0.007
SM          –0.001    –0.001    –0.002
SF           0.000     0.000     0.001
MALE         0.004     0.004    –0.007

The logit and probit results are displayed for comparison.

The coefficients in the regressions are very different because different mathematical functions are being fitted.

Nevertheless the estimates of the marginal effects are usually similar.


However, if the outcomes in the sample are divided between a large majority and a small minority, they can differ.

This is because the observations are then concentrated in a tail of the distribution.

Although the logit and probit functions share the same sigmoid outline, their tails are somewhat different.

This is the case here, but even so the estimates are identical to three decimal places.
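One way to see the difference in the tails is to compare the probability mass each distribution places far below its mean. The sketch below makes an assumption of its own: the two distributions are matched by standard deviation (the standard logistic has standard deviation π/√3), which is only one of several possible ways of putting them on a common scale.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative standardized normal distribution (probit)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf(z: float) -> float:
    """Cumulative standard logistic distribution (logit)."""
    return 1.0 / (1.0 + math.exp(-z))


# Standard deviation of the standard logistic distribution.
LOGISTIC_SD = math.pi / math.sqrt(3.0)

# Lower-tail probability at k standard deviations below the mean, for each distribution.
tails = {}
for k in (1, 2, 3):
    tails[k] = (normal_cdf(-k), logistic_cdf(-k * LOGISTIC_SD))
    print(f"{k} s.d. below mean: normal {tails[k][0]:.5f}, logistic {tails[k][1]:.5f}")
```

Under this matching, the logistic distribution leaves more probability in the far tail (two or three standard deviations out) than the normal does, which is where the two models have most room to disagree.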


So, logit or probit?

The logit model is easier to compute, and used to be more popular than the probit model.

The probit model is theoretically more appealing as it is based on the normal distribution. However, it uses more computer time.

Given modern computing power, the choice between the logit model and the probit model is a matter of taste.