#### Transcript Modeling Consumer Decision Making and Discrete Choice Behavior

```Discrete Choice Modeling
William Greene
New York University
Part 6
Modeling Latent Parameter Heterogeneity
Parameter Heterogeneity

• Fixed and Random Effects Models
  - Latent common time-invariant "effects"
  - Heterogeneity in the level parameter (the constant term) in the model
• General Parameter Heterogeneity in Models
  - Discrete: There is more than one type of individual in the population, and
    parameters differ across types. Produces a Latent Class Model.
  - Continuous: Parameters vary randomly across individuals. Produces a Random
    Parameters Model or a Mixed Model. (Synonyms)
Latent Class Models
There are J types of people, j = 1,…,J
• For each type,
  Prob(Outcome|type=j) = f(y,x|βj)
• Individual i is and remains a member of class j
• An individual will be drawn at random from the population. Prob(in class j) = πj
• From the modeler's point of view:
  Prob(Outcome) = Σj πj Prob(Outcome|type=j)
                = Σj πj f(y,x|βj)

Finite Mixture Model
• Prob(Outcome|type=j) = f(y,x|βj) depends on the parameter vector βj
• Parameters are randomly, discretely distributed among population members, with
  Prob(β = βj) = πj, j = 1,…,J
• Summing out the variation across parameters,
  Prob(Outcome) = Σj πj f(y,x|βj)
• Same model, slightly different interpretation

Estimation Problems

• Estimation of population features
  - Latent parameter vectors, βj, j = 1,…,J
  - Mixing probabilities, πj, j = 1,…,J
  - Probabilities, partial effects, predictions, etc.
• Model structure: The number of classes, J
• Classification: Prediction of class membership for individuals
Extended Latent Class Model
(1) There are J classes, unobservable to the analyst.
(2) Class specific model:
    \[ f(y_{it} \mid x_{it}, \text{class}=j) = g(y_{it}, x_{it}, \beta_j) \]
(3) Conditional class probabilities πj.
    Common multinomial logit form for the prior class probabilities, to constrain all
    probabilities to (0,1) and ensure that they sum to one:
    \[ P(\text{class}=j \mid \delta) = \pi_j = \frac{\exp(\delta_j)}{\sum_{q=1}^{J} \exp(\delta_q)}, \quad \delta_J = 0, \quad \sum_{j=1}^{J} \pi_j = 1 \]
    Note, δj = log(πj / πJ).
Log Likelihood for an LC Model
Conditional density for each observation is
\[ P(y_{i,t} \mid x_{i,t}, \text{class}=j) = f(y_{it} \mid x_{i,t}, \beta_j) \]
Joint conditional density for Ti observations is
\[ f(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i, \beta_j) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_j) \]
(Ti may be 1. This is not only a 'panel data' model.)
Maximize this for each class if the classes are known.
They aren't. Unconditional density for individual i is
\[ f(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i) = \sum_{j=1}^{J} \pi_j \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_j) \]
Log Likelihood
\[ \log L(\beta_1, \ldots, \beta_J, \delta_1, \ldots, \delta_J) = \sum_{i=1}^{N} \log \sum_{j=1}^{J} \pi_j \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_j) \]
Example: Mixture of Normals
J normal populations, each with a mean μj and standard deviation σj.
For each individual in each class at each period,
\[ f(y_{it} \mid \text{class}=j) = \frac{1}{\sigma_j \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y_{it}-\mu_j}{\sigma_j}\right)^2\right] = \frac{1}{\sigma_j}\,\phi\!\left(\frac{y_{it}-\mu_j}{\sigma_j}\right). \]
Panel data, T observations on each individual i,
\[ f(y_{i1}, \ldots, y_{iT} \mid \text{class}=j) = \left(\frac{1}{\sigma_j \sqrt{2\pi}}\right)^{T} \exp\left[-\frac{1}{2}\sum_{t=1}^{T}\left(\frac{y_{it}-\mu_j}{\sigma_j}\right)^2\right] \]
Log Likelihood
\[ \log L = \sum_{i=1}^{N} \log \sum_{j=1}^{J} \pi_j \left(\frac{1}{\sigma_j \sqrt{2\pi}}\right)^{T} \exp\left[-\frac{1}{2}\sum_{t=1}^{T}\left(\frac{y_{it}-\mu_j}{\sigma_j}\right)^2\right] \]
Unmixing a Mixed Sample
N[1,1] and N[5,1]
Sample  ; 1 - 1000 $
Calc    ; Ran(123457) $
Create  ; lc1 = rnn(1,1) ; lc2 = rnn(5,1) $
Create  ; class = rnu(0,1) $
Create  ; if(class < .3) ylc = lc1 ; (else) ylc = lc2 $
Kernel  ; rhs = ylc $
Regress ; lhs = ylc ; rhs = one ; lcm ; pts = 2 ; pds = 1 $
[Figure: Kernel density estimate for YLC. Horizontal axis: YLC (-4 to 10); vertical axis: Density (.000 to .224).]
Mixture of Normals
+---------------------------------------------+
| Latent Class / Panel LinearRg Model         |
| Dependent variable                      YLC |
| Number of observations                 1000 |
| Log likelihood function           -1960.443 |
| Info. Criterion: AIC =              3.93089 |
| LINEAR regression model                     |
| Model fit with 2 latent classes.            |
+---------------------------------------------+
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
|Model parameters for latent class 1                                  |
|Constant|  4.97029***  |    .04511814   | 110.162|  .0000 |          |
|Sigma   |  1.00214***  |    .03317650   |  30.206|  .0000 |          |
|Model parameters for latent class 2                                  |
|Constant|  1.05522***  |    .07347646   |  14.361|  .0000 |          |
|Sigma   |   .95746***  |    .05456724   |  17.546|  .0000 |          |
|Estimated prior probabilities for class membership                   |
|Class1Pr|   .70003***  |    .01659777   |  42.176|  .0000 |          |
|Class2Pr|   .29997***  |    .01659777   |  18.073|  .0000 |          |
+--------+------------------------------------------------------------+
| Note: ***, **, * = Significance at 1%, 5%, 10% level.               |
+---------------------------------------------------------------------+
Estimating Which Class
Prob[class = j] = πj   (prior class probability)
Joint conditional density for Ti observations is
\[ P(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i, \text{class}=j) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_j) \]
Joint density for data and class membership is the product
\[ P(y_{i1}, y_{i2}, \ldots, y_{i,T_i}, \text{class}=j \mid X_i) = \pi_j \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_j) \]
Posterior probability for class, given the data
\[ P(\text{class}=j \mid y_{i1}, \ldots, y_{i,T_i}, X_i) = \frac{P(y_i, \text{class}=j \mid X_i)}{P(y_{i1}, \ldots, y_{i,T_i} \mid X_i)} = \frac{P(y_i, \text{class}=j \mid X_i)}{\sum_{q=1}^{J} P(y_i, \text{class}=q \mid X_i)} \]
Use Bayes Theorem to compute the posterior (conditional) probability
\[ w(j \mid y_i, X_i) = P(\text{class}=j \mid y_i, X_i) = \frac{\pi_j \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_j)}{\sum_{q=1}^{J} \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)} = w_{ij} \]
Best guess = the class with the largest posterior probability.
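Posterior for Normal Mixture
\[ \hat{w}(j \mid \text{data}_i) = \hat{w}(j \mid i) = \frac{\hat{\pi}_j \prod_{t=1}^{T_i} \frac{1}{\hat{\sigma}_j}\,\phi\!\left(\frac{y_{it}-\hat{\mu}_j}{\hat{\sigma}_j}\right)}{\sum_{q=1}^{J} \hat{\pi}_q \prod_{t=1}^{T_i} \frac{1}{\hat{\sigma}_q}\,\phi\!\left(\frac{y_{it}-\hat{\mu}_q}{\hat{\sigma}_q}\right)} \]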
Estimated Posterior Probabilities
How Many Classes?
(1) J is not a 'parameter' - can't 'estimate' J with π and β.
(2) Can't 'test' down or 'up' to J by comparing log likelihoods. Degrees of freedom
    for J+1 vs. J classes is not well defined.
(3) Use Akaike IC; AIC = -2 logL + 2 #Parameters.
    AIC1 = 10827.88
    AIC2 =  9954.268   (smallest)
    AIC3 =  9958.756
More Difficult When the Populations are Close Together
The Technique Still Works
---------------------------------------------------------------------
Latent Class / Panel LinearRg Model
Dependent variable                  YLC
Sample is  1 pds and   1000 individuals
LINEAR regression model
Model fit with 2 latent classes.
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|    2.93611***        .15813    18.568     .0000
   Sigma|    1.00326***        .07370    13.613     .0000
        |Model parameters for latent class 2
Constant|     .90156***        .28767     3.134     .0017
   Sigma|     .86951***        .10808     8.045     .0000
        |Estimated prior probabilities for class membership
Class1Pr|     .73447***        .09076     8.092     .0000
Class2Pr|     .26553***        .09076     2.926     .0034
--------+------------------------------------------------------------
LC Regression for Banking Data
Bank Cost Data, 500 Banks, 5 Years:
Variables in the file are
Cit = total cost of transformation of financial and physical resources into
loans and investments = the sum of the five cost items described below;
Q1it = installment loans to individuals for personal and household expenses;
Q2it = real estate loans;
Q4it = federal funds sold and securities purchased under agreements to resell;
Q5it = other assets;
All variables are in logs in the regression models.
An LCM for US Banks
+---------------------------------------------+
| Latent Class / Panel LinearRg Model         |
| Number of observations                 2500 |
| Log likelihood function           -722.4603 |
| Number of parameters                     23 |
| Akaike IC= 1490.921 Bayes IC= 1624.874      |
| Sample is 5 pds and 500 individuals.        |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
|Model parameters for latent class 1                         |
|Constant |  2.12699463  |    .29651372   |   7.173|  .0000  |
|Q1       |   .12099446  |    .03964929   |   3.052|  .0023  |
|Q2       |   .36291987  |    .03752392   |   9.672|  .0000  |
|Q3       |   .10728655  |    .05245420   |   2.045|  .0408  |
|Q4       |   .12785217  |    .02482950   |   5.149|  .0000  |
|Q5       |   .39535779  |    .06081496   |   6.501|  .0000  |
|Sigma    |   .71931764  |    .02537027   |  28.353|  .0000  |
|Model parameters for latent class 2                         |
|Constant |  2.51877624  |    .06958519   |  36.197|  .0000  |
|Q1       |   .05918445  |    .00899501   |   6.580|  .0000  |
|Q2       |   .44083356  |    .00930001   |  47.401|  .0000  |
|Q3       |   .23897724  |    .01492919   |  16.007|  .0000  |
|Q4       |   .04896772  |    .00484760   |  10.101|  .0000  |
|Q5       |   .16105964  |    .01307985   |  12.314|  .0000  |
|Sigma    |   .18434496  |    .00520057   |  35.447|  .0000  |
|Model parameters for latent class 3                         |
|Constant |  3.83600468  |    .10233076   |  37.486|  .0000  |
|Q1       |   .08904293  |    .01502856   |   5.925|  .0000  |
|Q2       |   .33710302  |    .01266856   |  26.609|  .0000  |
|Q3       |  -.01256845  |    .01987228   |   -.632|  .5271  |
|Q4       |   .06333872  |    .00782013   |   8.099|  .0000  |
|Q5       |   .42847054  |    .02326421   |  18.418|  .0000  |
|Sigma    |   .23914408  |    .00872954   |  27.395|  .0000  |
|Estimated prior probabilities for class membership          |
|Class1Pr |   .24778109  |    .02112395   |  11.730|  .0000  |
|Class2Pr |   .45386105  |    .03497825   |  12.976|  .0000  |
|Class3Pr |   .29835786  |    .03472726   |   8.591|  .0000  |
+---------+--------------+----------------+--------+---------+
Heckman and Singer Model
• Random Effects Model
• Random Constants with Discrete Distribution
(1) There are J classes, unobservable to the analyst.
(2) Class specific model, with a class specific constant αj and common slopes β:
    \[ f(y_{it} \mid x_{it}, \text{class}=j) = g(y_{it}, x_{it}, \alpha_j, \beta) \]
(3) Conditional class probabilities πj.
    Common multinomial logit form for the prior class probabilities, to constrain all
    probabilities to (0,1) and ensure that they sum to one:
    \[ P(\text{class}=j \mid \delta) = \pi_j = \frac{\exp(\delta_j)}{\sum_{q=1}^{J} \exp(\delta_q)}, \quad \delta_J = 0, \quad \sum_{j=1}^{J} \pi_j = 1 \]
    Note, δj = log(πj / πJ).
3 Class Heckman-Singer Form
| Log likelihood function   -722.4603 | (Full LC model)
| Log likelihood function   -794.2760 | (Restricted – random constant)
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
|Model parameters for latent class 1                                  |
|Constant|  3.28396608  |    .09620151   |  34.136|  .0000 |          |
|Q1      |   .06662880  |    .00698098   |   9.544|  .0000 |8.58763095|
|Q2      |   .41250826  |    .00605969   |  68.074|  .0000 |10.0931831|
|Q3      |   .13886506  |    .00908522   |  15.285|  .0000 |9.71949206|
|Q4      |   .05974750  |    .00405876   |  14.721|  .0000 |7.78290462|
|Q5      |   .26368046  |    .00934816   |  28.207|  .0000 |7.13715510|
|Sigma   |   .75439763  |    .03173404   |  23.773|  .0000 |          |
|Model parameters for latent class 2 (Same slopes)                    |
|Constant|  3.00580474  |    .05459323   |  55.058|  .0000 |          |
|Sigma   |   .28646077  |    .01926618   |  14.869|  .0000 |          |
|Model parameters for latent class 3 (Same slopes)                    |
|Constant|  2.91327814  |    .05028419   |  57.936|  .0000 |          |
|Sigma   |   .18372096  |    .00917844   |  20.017|  .0000 |          |
|Estimated prior probabilities for class membership                   |
|Class1Pr|   .23571564  |    .02199255   |  10.718|  .0000 |          |
|Class2Pr|   .29609849  |    .07681471   |   3.855|  .0001 |          |
|Class3Pr|   .46818587  |    .08003086   |   5.850|  .0000 |          |
+--------+--------------+----------------+--------+--------+----------+
Heckman and Singer’s Model, J=1,…,5
LCM for Health Status
Self Assessed Health Status = 0,1,…,10
• Recoded: Healthy = HSAT > 6
• Prob = f(Age,Educ,Income,Married,Kids)
• 2, 3 classes
Too Many Classes
---------------------------------------------------------------------
Latent Class / Panel Probit Model
Dependent variable              HEALTHY
Estimation based on N =   6209, K =  20
Unbalanced panel has    887 individuals
PROBIT (normal) probability model
Model fit with 3 latent classes.
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|     .01265      .385900D+10      .000    1.0000
     AGE|     .16523      .138024D+09      .000    1.0000    44.3352
    EDUC|     .15327      .520918D+08      .000    1.0000    10.9409
  HHNINC|     .43195      .887276D+09      .000    1.0000     .34930
 MARRIED|     .06640      .153413D+09      .000    1.0000     .84539
  HHKIDS|     .17832      .152061D+09      .000    1.0000     .45482
        |Model parameters for latent class 2
Constant|     .32074           .29082     1.103     .2701
     AGE|    -.02690***        .00406    -6.622     .0000    44.3352
    EDUC|     .12215***        .01753     6.969     .0000    10.9409
  HHNINC|    -.03849           .17139     -.225     .8223     .34930
 MARRIED|     .20051***        .07749     2.588     .0097     .84539
  HHKIDS|     .05879           .06565      .895     .3705     .45482
        |Model parameters for latent class 3
Constant|     .00731           .26582      .027     .9781
     AGE|    -.03396***        .00446    -7.612     .0000    44.3352
    EDUC|     .02741*          .01466     1.869     .0616    10.9409
  HHNINC|     .73861***        .24133     3.061     .0022     .34930
 MARRIED|     .10671           .10520     1.014     .3104     .84539
  HHKIDS|     .16550**         .07838     2.111     .0347     .45482
        |Estimated prior probabilities for class membership
Class1Pr|     .12387***        .01676     7.390     .0000
Class2Pr|     .52530***        .02447    21.468     .0000
Class3Pr|     .35083***        .02268    15.466     .0000
--------+------------------------------------------------------------
Two Class Model
---------------------------------------------------------------------
Latent Class / Panel Probit Model
Dependent variable              HEALTHY
Unbalanced panel has    887 individuals
PROBIT (normal) probability model
Model fit with 2 latent classes.
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|     .61652**         .28620     2.154     .0312
     AGE|    -.02466***        .00401    -6.143     .0000    44.3352
    EDUC|     .11759***        .01852     6.351     .0000    10.9409
  HHNINC|     .10713           .20447      .524     .6003     .34930
 MARRIED|     .11705           .09574     1.223     .2215     .84539
  HHKIDS|     .04421           .07017      .630     .5287     .45482
        |Model parameters for latent class 2
Constant|     .18988           .31890      .595     .5516
     AGE|    -.03120***        .00464    -6.719     .0000    44.3352
    EDUC|     .02122           .01934     1.097     .2726    10.9409
  HHNINC|     .61039***        .19688     3.100     .0019     .34930
 MARRIED|     .06201           .10035      .618     .5367     .84539
  HHKIDS|     .19465**         .07936     2.453     .0142     .45482
        |Estimated prior probabilities for class membership
Class1Pr|     .56604***        .02487    22.763     .0000
Class2Pr|     .43396***        .02487    17.452     .0000
--------+------------------------------------------------------------
Partial Effects in LC Model
---------------------------------------------------------------------
Partial derivatives of expected val. with
respect to the vector of characteristics.
They are computed at the means of the Xs.
Conditional Mean at Sample Point    .6116
Scale Factor for Marginal Effects   .3832
B for latent class model is a wghted avrg.
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z] Elasticity
--------+------------------------------------------------------------
        |Two class latent class model
     AGE|    -.01054***        .00134    -7.860     .0000    -.76377
    EDUC|     .02904***        .00589     4.932     .0000     .51939
  HHNINC|     .12475**         .05598     2.228     .0259     .07124
 MARRIED|     .03570           .02991     1.194     .2326     .04934
  HHKIDS|     .04196**         .02075     2.022     .0432     .03120
--------+------------------------------------------------------------
        |Pooled Probit Model
     AGE|    -.00846***        .00081   -10.429     .0000    -.63399
    EDUC|     .03219***        .00336     9.594     .0000     .59568
  HHNINC|     .16699***        .04253     3.927     .0001     .09865
        |Marginal effect for dummy variable is P|1 - P|0.
 MARRIED|     .02414           .01877     1.286     .1986     .03451
        |Marginal effect for dummy variable is P|1 - P|0.
  HHKIDS|     .06754***        .01483     4.555     .0000     .05195
--------+------------------------------------------------------------
Conditional Means of Parameters
\[ \text{Est.}\,E[\beta_i \mid \text{All information for individual } i] = \sum_{j=1}^{J} \hat{w}_{ij}\,\hat{\beta}_j \]
using posterior (conditional) estimated class probabilities.
The EM Algorithm
Latent Class is a 'missing data' model.
d_{i,j} = 1 if individual i is a member of class j.
If d_{i,j} were observed, the complete data log likelihood would be
\[ \log L_c = \sum_{i=1}^{N} \log \sum_{j=1}^{J} d_{i,j} \left[ \prod_{t=1}^{T_i} f(y_{i,t} \mid \text{data}_{i,t}, \text{class}=j) \right] \]
(Only one of the J terms would be nonzero.)
Expectation - Maximization algorithm has two steps
(1) Expectation Step: Form the 'Expected log likelihood' given the data and a prior
    guess of the parameters.
(2) Maximization Step: Maximize the expected log likelihood to obtain a new guess for
    the model parameters.
(E.g., http://crow.ee.washington.edu/people/bulyko/papers/em.pdf)
Implementing EM for LC Models
Given initial guesses
\[ \pi^0 = (\pi_1^0, \pi_2^0, \ldots, \pi_J^0), \quad \beta^0 = (\beta_1^0, \beta_2^0, \ldots, \beta_J^0) \]
E.g., use 1/J for each πj and the MLE of β from a one class model. (Each starting value
must be perturbed slightly: if all πj are equal and all βj are the same, the model
already satisfies the FOC.)
(1) Compute ŵ(j|i) = posterior class probabilities, using β̂⁰, δ̂⁰.
    Reestimate each βj using a weighted log likelihood:
    \[ \text{Maximize wrt } \beta_j: \quad \sum_{i=1}^{N} \hat{w}_{ij} \sum_{t=1}^{T_i} \log f(y_{it} \mid x_{it}, \beta_j) \]
(2) Reestimate πj by reestimating δ:
    \[ \hat{\pi}_j = \frac{1}{N} \sum_{i=1}^{N} \hat{w}(j \mid i), \text{ using old } \hat{\pi} \text{ and new } \hat{\beta} \]
Now, return to step 1.
Iterate until convergence.
An Extended Latent Class Model
Class probabilities relate to observable variables (usually demographic factors such
as age and sex).
(1) There are J classes, unobservable to the analyst.
(2) Class specific model:
    \[ f(y_{it} \mid x_{it}, \text{class}=j) = g(y_{it}, x_{it}, \beta_j) \]
(3) Conditional class probabilities given some information, z_i.
    Common multinomial logit form for the prior class probabilities:
    \[ P(\text{class}=j \mid z_i, \delta) = \pi_{ij} = \frac{\exp(z_i'\delta_j)}{\sum_{q=1}^{J} \exp(z_i'\delta_q)}, \quad \delta_J = 0 \]
Health Satisfaction Model
---------------------------------------------------------------------
Latent Class / Panel Probit Model
Dependent variable              HEALTHY
Log likelihood function     -3465.98697
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|     .60050**         .29187     2.057     .0396
     AGE|    -.02002***        .00447    -4.477     .0000    44.3352
    EDUC|     .10597***        .01776     5.968     .0000    10.9409
  HHNINC|     .06355           .20751      .306     .7594     .34930
 MARRIED|     .07532           .10316      .730     .4653     .84539
  HHKIDS|     .02632           .07082      .372     .7102     .45482
        |Model parameters for latent class 2
Constant|     .10508           .32937      .319     .7497
     AGE|    -.02499***        .00514    -4.860     .0000    44.3352
    EDUC|     .00945           .01826      .518     .6046    10.9409
  HHNINC|     .59026***        .19137     3.084     .0020     .34930
 MARRIED|    -.00039           .09478     -.004     .9967     .84539
  HHKIDS|     .20652***        .07782     2.654     .0080     .45482
        |Estimated prior probabilities for class membership
   ONE_1|    1.43661***        .53679     2.676     .0074   (.56519)
AGEBAR_1|    -.01897*          .01140    -1.664     .0960
FEMALE_1|    -.78809***        .15995    -4.927     .0000
   ONE_2|     .000       ......(Fixed Parameter)......      (.43481)
AGEBAR_2|     .000       ......(Fixed Parameter)......
FEMALE_2|     .000       ......(Fixed Parameter)......
--------+------------------------------------------------------------
Random Parameters Models
Parameters Vary Randomly with a Continuous Distribution
• A general model structure: f(y_it | x_it, βi)
  βi = a set of random parameters = β + ui
  f(βi) = h(βi | Σ)
  Σ = a set of parameters in the distribution of βi
• Typical application: "repeated measures" = panel
• Typical application assumes normal distribution
• The "mixed" model
  \[ f(y_{it} \mid x_{it}, \Sigma) = \int_{\beta_i} f(y_{it} \mid x_{it}, \beta_i)\, h(\beta_i \mid \Sigma)\, d\beta_i \]
  forms the basis of a likelihood function for the observed data.
(NOTE: Random (heterogeneous) parameters are not to be confused with the Bayesian
notion of "random parameters.")
A Mixed Probit Model
Random parameters probit model
\[ f(y_{it} \mid x_{it}, \beta_i) = \Phi[(2y_{it}-1)\, x_{it}'\beta_i] \]
\[ \beta_i = \beta + u_i, \quad u_i \sim N[0, \Sigma], \quad \Sigma = \Gamma \Lambda^2 \Gamma' \]
Λ = diagonal matrix of standard deviations
Γ = lower triangular matrix, or I if uncorrelated
\[ \log L(\beta, \Gamma, \Lambda) = \sum_{i=1}^{N} \log \int_{\beta_i} \prod_{t=1}^{T_i} \Phi[(2y_{it}-1)\, x_{it}'\beta_i]\; N[\beta, \Gamma \Lambda^2 \Gamma']\, d\beta_i \]
Application – Healthy
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from the Journal of Applied Econometrics (JAE) Archive. This is an unbalanced panel with 7,293
individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary
choice. This is a large data set. There are altogether 27,326 observations. The number of observations per
individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=1079, 3=825, 4=926, 5=1051, 6=1000, 7=887; these sum to
7,293 individuals and 27,326 observations.) Note, the variable NUMOBS tells how many observations there are for
each person. This variable is repeated in each row of the data for the person.
Variables in the file are
DOCTOR = 1(Number of doctor visits > 0)
HSAT = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS = number of doctor visits in last three months
HOSPVIS = number of hospital visits in last calendar year
PUBLIC = insured in public health insurance = 1; otherwise = 0
HHNINC = household nominal monthly net income in German marks / 10000.
(4 observations with income=0 were dropped)
HHKIDS = children under age 16 in the household = 1; otherwise = 0
EDUC = years of schooling
AGE = age in years
MARRIED = marital status
Estimates of a Mixed Probit Model
Partial Effects are Also Simulated
Simulating Conditional Means for Individual Parameters
\[ \hat{E}(\beta_i \mid y_i, X_i) = \frac{\frac{1}{R} \sum_{r=1}^{R} (\hat{\beta} + \hat{L} w_{i,r}) \prod_{t=1}^{T_i} F[q_{it} (\hat{\beta} + \hat{L} w_{i,r})' x_{it}]}{\frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} F[q_{it} (\hat{\beta} + \hat{L} w_{i,r})' x_{it}]} \]
Posterior estimates of E[parameters(i) | Data(i)]
Summarizing Simulated Estimates
Correlated Parameters
---------------------------------------------------------------------
Random Coefficients Probit Model
Dependent variable              HEALTHY
PROBIT (normal) probability model
Simulation based on  25 random draws
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Means for random parameters
Constant|     .22395           .18073     1.239     .2153
     AGE|    -.03919***        .00257   -15.256     .0000    44.3352
    EDUC|     .15526***        .01173    13.236     .0000    10.9409
  HHNINC|     .28023**         .12572     2.229     .0258     .34930
 MARRIED|     .03971           .05918      .671     .5023     .84539
  HHKIDS|     .06313           .04713     1.340     .1804     .45482
---------------------------------------------------------------------
Partial derivatives of expected val. with
respect to the vector of characteristics.
They are computed at the means of the Xs.
Conditional Mean at Sample Point    .6351
Scale Factor for Marginal Effects   .3758
     AGE|    -.01473***        .00102   -14.420     .0000   -1.02820
    EDUC|     .05835***        .00444    13.149     .0000    1.00526
  HHNINC|     .10532**         .04722     2.231     .0257     .05793
 MARRIED|     .01492           .02228      .670     .5029     .01987
  HHKIDS|     .02373           .01754     1.353     .1761     .01699
--------+------------------------------------------------------------
Cholesky Matrix
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Means for random parameters
Constant|     .22395           .18073     1.239     .2153
     AGE|    -.03919***        .00257   -15.256     .0000    44.3352
    EDUC|     .15526***        .01173    13.236     .0000    10.9409
  HHNINC|     .28023**         .12572     2.229     .0258     .34930
 MARRIED|     .03971           .05918      .671     .5023     .84539
  HHKIDS|     .06313           .04713     1.340     .1804     .45482
        |Diagonal elements of Cholesky matrix
Constant|     .66612***        .21850     3.049     .0023
     AGE|     .01041***        .00183     5.687     .0000
    EDUC|     .07307***        .00592    12.346     .0000
  HHNINC|     .18897*          .10133     1.865     .0622
 MARRIED|     .47889***        .03140    15.252     .0000
  HHKIDS|     .44804***        .03126    14.334     .0000
        |Below diagonal elements of Cholesky matrix
lAGE_ONE|    -.00211           .00298     -.706     .4799
lEDU_ONE|     .07359***        .01403     5.246     .0000
lEDU_AGE|    -.01881**         .00778    -2.417     .0156
lHHN_ONE|    -.32031**         .15453    -2.073     .0382
lHHN_AGE|     .05302           .12989      .408     .6831
lHHN_EDU|     .44021***        .13082     3.365     .0008
lMAR_ONE|    -.19247**         .07503    -2.565     .0103
lMAR_AGE|    -.24710***        .06002    -4.117     .0000
lMAR_EDU|     .01475           .05933      .249     .8037
lMAR_HHN|     .07949*          .04724     1.683     .0924
lHHK_ONE|    -.07220           .05686    -1.270     .2041
lHHK_AGE|     .21508***        .04456     4.827     .0000
lHHK_EDU|     .31374***        .04369     7.181     .0000
lHHK_HHN|    -.11592***        .04023    -2.881     .0040
lHHK_MAR|    -.35853***        .04154    -8.631     .0000
--------+------------------------------------------------------------
Estimated Parameter Correlation Matrix
Modeling Parameter Heterogeneity
Conditional model, linear or nonlinear density:
\[ f(y_{i,t} \mid x_{i,t}, \beta_i, \theta) = g(y_{i,t}, x_{i,t}, \beta_i, \theta) \]
Individual heterogeneity in the means of the parameters (Hierarchical Model):
\[ \beta_i = \beta + \Delta z_i + u_i, \quad E[u_i \mid X_i, z_i] = 0 \]
Heterogeneity in the variances of the parameters:
\[ \mathrm{Var}[u_{i,k} \mid z_i] = \phi_{ik} = \sigma_k \exp(z_i'\delta_k) \]
\[ \mathrm{Var}[u_i \mid z_i] = \Phi_i = \mathrm{diag}(\phi_{ik}) \]
(Different variables in z_i may appear in means and variances.)
Free correlation:
\[ \mathrm{Var}[u_i \mid z_i] = \Sigma_i = \Gamma \Phi_i \Gamma', \]
Γ = a lower triangular matrix with 1s on the diagonal.
Hierarchical Probit Model
\[ \beta_{i,k} = \beta_k + \delta_{k,1}\,\text{Age}_i + \delta_{k,2}\,\text{Female}_i + \sigma_k w_{i,k} \]
---------------------------------------------------------------------
Random Coefficients Probit Model
--------+------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Means for random parameters
Constant|    2.80514***        .84261     3.329     .0009
     AGE|    -.06321***        .01397    -4.523     .0000    44.3352
    EDUC|    -.15340***        .05506    -2.786     .0053    10.9409
  HHNINC|    2.56154***        .67822     3.777     .0002     .34930
 MARRIED|     .61453**         .26650     2.306     .0211     .84539
  HHKIDS|    -.19855           .24303     -.817     .4140     .45482
        |Scale parameters for dists. of random parameters
Constant|     .12981***        .02448     5.303     .0000
     AGE|     .01424***        .00050    28.712     .0000
    EDUC|     .00368**         .00172     2.142     .0322
  HHNINC|     .52685***        .05165    10.201     .0000
 MARRIED|     .16399***        .02111     7.768     .0000
  HHKIDS|     .13928***        .02845     4.896     .0000
        |Heterogeneity in the means of random parameters
cONE_AGE|    -.02875           .02082    -1.381     .1673
cONE_FEM|    -.98200***        .35328    -2.780     .0054
cAGE_AGE|     .00022           .00029      .740     .4592
cAGE_FEM|     .01552***        .00510     3.043     .0023
cEDU_AGE|     .00575***        .00130     4.438     .0000
cEDU_FEM|    -.00877           .02172     -.404     .6864
cHHN_AGE|    -.04540***        .01485    -3.057     .0022
cHHN_FEM|    -.03645           .25041     -.146     .8843
cMAR_AGE|    -.01556**         .00610    -2.550     .0108
cMAR_FEM|     .20538*          .11232     1.828     .0675
cHHK_AGE|     .01053*          .00552     1.906     .0566
cHHK_FEM|    -.25666***        .08923    -2.876     .0040
--------+------------------------------------------------------------
Mixed Model Estimation
Programs differ on the models fitted, the algorithms, the paradigm, and
the extensions provided to the simplest RPM, βi = β + ui.
• WinBUGS: Mainly for Bayesian applications
  - MCMC
  - User specifies the model and constructs the Gibbs Sampler/Metropolis Hastings
• MLWin: http://www.cmm.bristol.ac.uk/MLwiN/index.shtml
  - Multilevel models
  - Regression and some loglinear models
• SAS: Proc Mixed
  - Classical
  - Uses primarily a kind of GLS/GMM (method of moments algorithm for loglinear models)
• Stata: Classical - GLLAMM
  - Several loglinear models
  - Mixing done by quadrature. (Very, very slow for 2 or more dimensions)
• LIMDEP/NLOGIT
  - Classical
  - Mixing done by Monte Carlo integration (maximum simulated likelihood)
  - Numerous linear, nonlinear, loglinear models
• Ken Train's Gauss Code
  - Monte Carlo integration
  - Used by many researchers
  - Mixed Multinomial Logit model only (but free!)
Hierarchical Model
Conditional model, linear or nonlinear density:
\[ f(y_{i,t} \mid x_{i,t}, \beta_i, \theta) = g(y_{i,t}, x_{i,t}, \beta_i, \theta) \]
Individual heterogeneity in the means of the parameters (Hierarchical Model):
\[ \beta_i = \beta + \Delta z_i + u_i, \quad E[u_i \mid X_i, z_i] = 0 \]
Heterogeneity in the variances of the parameters:
\[ \mathrm{Var}[u_{i,k} \mid z_i] = \phi_{ik} = \sigma_k \exp(z_i'\delta_k) \]
\[ \mathrm{Var}[u_i \mid z_i] = \Phi_i = \mathrm{diag}(\phi_{ik}) \]
(Different variables in z_i may appear in means and variances.)
Free correlation:
\[ \mathrm{Var}[u_i \mid z_i] = \Sigma_i = \Gamma \Phi_i \Gamma', \]
Γ = a lower triangular matrix with 1s on the diagonal.
Maximum Simulated Likelihood
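\[ \log L(\theta, \Omega) = \sum_{i=1}^{N} \log \int_{\beta_i} \prod_{t=1}^{T} f(y_{it} \mid x_{it}, \beta_i, \theta)\, h(\beta_i \mid z_i, \Omega)\, d\beta_i \]
\[ \Omega = \beta, \Delta, \sigma_1, \ldots, \sigma_K, \delta_1, \ldots, \delta_K, \Gamma \]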
Monte Carlo Integration
(1) Integral is of the form
    \[ K = \int_{\text{range of } v} g(v \mid \text{data}, \beta)\, f(v \mid \Omega)\, dv \]
    where f(v) is the density of random variable v, possibly conditioned on a set of
    parameters Ω, and g(v|data,β) is a function of data and parameters.
(2) By construction, K(Ω) = E[g(v|data,β)].
(3) Strategy:
    a. Sample R values from the population of v using a random number generator.
    b. Compute the average
       \[ \bar{K} = \frac{1}{R} \sum_{r=1}^{R} g(v_r \mid \text{data}, \beta) \]
    By the law of large numbers, plim K̄ = K.
Monte Carlo Integration
\[ \frac{1}{R} \sum_{r=1}^{R} f(u_{ir}) \xrightarrow{P} \int_{u_i} f(u_i)\, g(u_i)\, du_i = E_{u_i}[f(u_i)] \]
(Certain smoothness conditions must be met.)
Drawing u_ir by 'random sampling':
    u_ir = t(v_ir), v_ir ~ U[0,1]
    E.g., u_ir = σΦ⁻¹(v_ir) + μ for N[μ,σ²]
Requires many draws, typically hundreds or thousands.
Example: Monte Carlo Integral
\[ \int_{-\infty}^{\infty} \Phi(x_1 + .9v)\, \Phi(x_2 + .9v)\, \Phi(x_3 + .9v)\, \frac{\exp(-v^2/2)}{\sqrt{2\pi}}\, dv \]
where Φ is the standard normal CDF and x1 = .5, x2 = -.2, x3 = .3.
The weighting function for v is the standard normal.
Strategy: Draw R (say 1000) standard normal random draws, v_r. Compute the 1000 functions
Φ(x1 + .9v_r) Φ(x2 + .9v_r) Φ(x3 + .9v_r) and average them.
(Based on 100, 1000, 10000, I get .28746, .28437, .27242)
Generating a Random Draw
Most common approach is the "inverse probability transform".
Let u = a random draw from the standard uniform (0,1).
Let x = the desired population to draw from.
Assume the CDF of x is F(x).
The random draw is then x = F⁻¹(u).
Example: exponential, θ. f(x) = θ exp(-θx), F(x) = 1 - exp(-θx).
Equate u to F(x), x = -(1/θ) log(1-u).
Example: Normal(μ,σ). Inverse function does not exist in closed form. There are good
polynomial approximations to produce a draw v from N[0,1] from a U(0,1).
Then x = μ + σv.
This leaves the question of how to draw the U(0,1).
Drawing Uniform Random Numbers
Computer generated random numbers are not random; they are Markov chains that look random.
The Original Random Number Generator for 32 bit computers.
    SEED originates at some large odd number
    d3 = 2147483647.0
    d2 = 2147483655.0
    d1 = 16807.0
    SEED = Mod(d1*SEED,d3)   ! MOD(a,p) = a - INT(a/p) * p
    X = SEED/d2 is a random value between 0 and 1.
Problems:
(1) Short period. Based on 32 bits, so recycles after 2^31 - 1 values.
(2) Evidently not very close to random. (Recent tests have discredited this RNG.)
Quasi-Monte Carlo Integration Based on Halton Sequences
Coverage of the unit interval is the objective, not randomness of the set of draws.
Halton sequences --- Markov chain
p = a prime number,
r = the sequence of integers, decomposed as
\[ r = \sum_{i=0}^{I} b_i p^{i}, \qquad H(r \mid p) = \sum_{i=0}^{I} b_i p^{-i-1}, \qquad r = r_1, \ldots \ \text{(e.g., 10, 11, 12, ...)} \]
For example, using base p=5, the integer r=37 has b0 = 2, b1 = 2, and b2 = 1;
(37 = 1×5² + 2×5¹ + 2×5⁰). Then
H(37|5) = 2×5⁻¹ + 2×5⁻² + 1×5⁻³ = 0.488.
Halton Sequences vs. Random Draws
Requires far fewer draws – for one dimension, about
1/10. Accelerates estimation by a factor of 5 to 10.
Simulated Log Likelihood for a Mixed Probit Model
Random parameters probit model
\[ f(y_{it} \mid x_{it}, \beta_i) = \Phi[(2y_{it}-1)\, x_{it}'\beta_i] \]
\[ \beta_i = \beta + u_i, \quad u_i \sim N[0, \Gamma \Lambda^2 \Gamma'] \]
\[ \log L(\beta, \Gamma, \Lambda) = \sum_{i=1}^{N} \log \int_{\beta_i} \prod_{t=1}^{T_i} \Phi[(2y_{it}-1)\, x_{it}'\beta_i]\; N[\beta, \Gamma \Lambda^2 \Gamma']\, d\beta_i \]
\[ \log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} \Phi[(2y_{it}-1)\, x_{it}'(\beta + \Gamma \Lambda v_{ir})] \]
We now maximize this function with respect to (β, Γ, Λ).
```
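
The "Unmixing a Mixed Sample", "Estimating Which Class", and "Implementing EM for LC Models" slides can be tried outside NLOGIT. What follows is a minimal Python sketch, not the estimator NLOGIT uses: it assumes one observation per individual (Ti = 1), two normal classes, and hand-picked starting values. The function names (`normal_pdf`, `posteriors`, `em_normal_mixture`) and the seed are illustrative choices, so the simulated sample will not reproduce the slide's output digit for digit.

```python
import math
import random


def normal_pdf(y, mu, sigma):
    """Density of N[mu, sigma^2] evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posteriors(y, mu, sigma, pi):
    """E step: w[i][j] = pi_j f(y_i | j) / sum_q pi_q f(y_i | q)."""
    w = []
    for yi in y:
        joint = [p * normal_pdf(yi, m, s) for p, m, s in zip(pi, mu, sigma)]
        total = sum(joint)
        w.append([v / total for v in joint])
    return w


def em_normal_mixture(y, mu, sigma, pi, iterations=200):
    """EM for a J-class normal mixture with one observation per individual."""
    mu, sigma, pi = list(mu), list(sigma), list(pi)
    n = len(y)
    for _ in range(iterations):
        w = posteriors(y, mu, sigma, pi)
        for j in range(len(mu)):
            wj = sum(row[j] for row in w)
            # M step: weighted mean, weighted standard deviation, class share.
            mu[j] = sum(row[j] * yi for row, yi in zip(w, y)) / wj
            var = sum(row[j] * (yi - mu[j]) ** 2 for row, yi in zip(w, y)) / wj
            sigma[j] = math.sqrt(var)
            pi[j] = wj / n
    return mu, sigma, pi, posteriors(y, mu, sigma, pi)


# Simulated sample: 30% from N[1,1] and 70% from N[5,1], as on the slide.
rng = random.Random(123457)  # arbitrary seed; this is not NLOGIT's generator
y = [rng.gauss(1.0, 1.0) if rng.random() < 0.3 else rng.gauss(5.0, 1.0)
     for _ in range(1000)]

# Starting values are deliberately unequal across classes (see the FOC remark).
mu_hat, sigma_hat, pi_hat, w_hat = em_normal_mixture(
    y, mu=[0.0, 6.0], sigma=[1.0, 1.0], pi=[0.5, 0.5])

# Classification: the class with the largest posterior probability.
best_class = [max(range(2), key=lambda j: row[j]) for row in w_hat]
print(mu_hat, sigma_hat, pi_hat)
```

With equal starting values for both classes, every posterior would be 0.5 and the M step would return the same parameters for each class, which is why the slides insist on perturbing the initial guesses.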
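
The "Example: Monte Carlo Integral" slide describes averaging the product of three normal CDFs over standard normal draws. A rough Python sketch of that strategy follows. The helper names and the seed are assumptions made for illustration, and the trapezoid rule is added only as an independent check on the simulation; it is not part of the slides. Because the draws differ from the author's, the simulated values will not match the three numbers quoted on the slide.

```python
import math
import random


def std_normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def integrand(v, x=(0.5, -0.2, 0.3), scale=0.9):
    """Product of the three normal CDFs evaluated at x_k + .9 v."""
    result = 1.0
    for xk in x:
        result *= std_normal_cdf(xk + scale * v)
    return result


def monte_carlo(R, rng):
    """Average the integrand over R standard normal draws."""
    return sum(integrand(rng.gauss(0.0, 1.0)) for _ in range(R)) / R


def trapezoid(lo=-8.0, hi=8.0, steps=4000):
    """Deterministic check: trapezoid rule with the N[0,1] density as weight."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        weight = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)
        term = integrand(v) * weight
        total += term if 0 < k < steps else 0.5 * term
    return total * h


rng = random.Random(2024)  # arbitrary seed
estimates = {R: monte_carlo(R, rng) for R in (100, 1000, 10000)}
reference = trapezoid()
print(estimates, reference)
```

Comparing the three simulated values with the quadrature value gives a feel for how slowly plain random draws converge, which is the motivation for the Halton sequences discussed later in the slides.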
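
The "Generating a Random Draw" and "Drawing Uniform Random Numbers" slides combine naturally: the multiplicative generator supplies U(0,1) values, and the inverse probability transform turns them into exponential or normal draws. The Python sketch below uses integer arithmetic for the recursion SEED = Mod(d1*SEED, d3) and the slide's divisor d2. The starting seed of 1, the function names, and the use of `statistics.NormalDist().inv_cdf` in place of the polynomial approximation mentioned on the slide are assumptions made for illustration.

```python
import math
from statistics import NormalDist

D1 = 16807              # multiplier d1
D2 = 2147483655.0       # divisor d2 from the slide, slightly above the modulus
D3 = 2147483647         # modulus d3 = 2^31 - 1


def lehmer_seeds(seed, n):
    """Return n successive values of SEED = Mod(d1 * SEED, d3)."""
    seeds = []
    for _ in range(n):
        seed = (D1 * seed) % D3
        seeds.append(seed)
    return seeds


def exponential_draw(u, theta):
    """Inverse probability transform for F(x) = 1 - exp(-theta x)."""
    return -math.log(1.0 - u) / theta


def normal_draw(u, mu, sigma):
    """x = mu + sigma * v, where v is the N[0,1] quantile of u."""
    return mu + sigma * NormalDist().inv_cdf(u)


seeds = lehmer_seeds(1, 10000)            # starting seed chosen for illustration
uniforms = [s / D2 for s in seeds]        # X = SEED / d2, strictly inside (0, 1)
exp_draws = [exponential_draw(u, theta=2.0) for u in uniforms]
normal_draws = [normal_draw(u, mu=1.0, sigma=2.0) for u in uniforms]

print(sum(exp_draws) / len(exp_draws), sum(normal_draws) / len(normal_draws))
```

Dividing by d2 rather than d3 keeps every uniform value strictly below 1, so `log(1 - u)` and the normal quantile are always defined.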
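
The Halton construction on the "Quasi-Monte Carlo Integration Based on Halton Sequences" slide amounts to writing r in base p and reflecting its digits about the decimal point. A short Python sketch, with the function name `halton` chosen here for illustration:

```python
def halton(r, p):
    """Radical inverse of the integer r in prime base p.

    Writes r = sum_i b_i * p**i and returns sum_i b_i * p**(-i-1).
    """
    h = 0.0
    scale = 1.0 / p
    while r > 0:
        r, digit = divmod(r, p)   # peel off the lowest base-p digit b_i
        h += digit * scale
        scale /= p
    return h


# Worked example from the slide: r = 37 in base 5 has digits b0=2, b1=2, b2=1.
h_37_5 = halton(37, 5)

# A short run of draws, starting at r = 10 as the slide suggests.
draws_base_5 = [halton(r, 5) for r in range(10, 20)]
print(h_37_5, draws_base_5)
```

Evaluating 2×5⁻¹ + 2×5⁻² + 1×5⁻³ by hand gives 0.4 + 0.08 + 0.008 = 0.488, which is what `halton(37, 5)` returns up to floating-point rounding. For simulation, each value would typically be passed through an inverse CDF, exactly as a pseudo-random uniform draw would be.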