Part 23: Parameter Heterogeneity

Part 23: Parameter Heterogeneity [1/115]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business
Part 23: Parameter Heterogeneity [2/115]
Econometric Analysis of Panel Data
23. Individual Heterogeneity
and Random Parameter Variation
Part 23: Parameter Heterogeneity [3/115]
Heterogeneity

Observational: Observable differences across
individuals (e.g., choice makers)

Choice strategy:

Structural: Differences in model frameworks

Preferences: Differences in model ‘parameters’
How consumers make
decisions – the underlying behavior
Part 23: Parameter Heterogeneity [4/115]
Parameter Heterogeneity
(1) Regression model
    $y_{it} = x_{it}'\beta_i + \varepsilon_{it}$
(2) Conditional probability or other nonlinear model
    $f(y_{it} \mid x_{it}, \beta_i)$
(3) Heterogeneity - how are parameters distributed across
individuals?
(a) Discrete - the population contains a mixture of Q
types of individuals.
(b) Continuous. Parameters are part of the stochastic
structure of the population.
Part 23: Parameter Heterogeneity [5/115]
Distinguish Bayes and Classical

Both depart from the heterogeneous ‘model,’
$f(y_{it} \mid x_{it}) = g(y_{it}, x_{it}, \beta_i)$
What do we mean by ‘randomness’?
  - With respect to the information of the analyst (Bayesian)
  - With respect to some stochastic process governing ‘nature’ (Classical)
Bayesian: No difference between ‘fixed’ and ‘random’
Classical: Full specification of joint distributions for observed random variables; piecemeal definitions of ‘random’ parameters. Usually a form of ‘random effects’
Part 23: Parameter Heterogeneity [6/115]
Fixed Management and Technical Efficiency
in a Random Coefficients Model
Antonio Alvarez, University of Oviedo
Carlos Arias, University of Leon
William Greene, Stern School of Business, New York University
Part 23: Parameter Heterogeneity [7/115]
The Production Function Model
Definition: Maximal output, given the inputs
Inputs: Variable factors, Quasi-fixed (land)
Form: Log-quadratic - translog
Latent Management as an unobservable input
$\ln y_{it} = \alpha + \beta_x \ln x_{it} + \tfrac{1}{2}\beta_{xx}(\ln x_{it})^2 + \beta_m m_i + \tfrac{1}{2}\beta_{mm} m_i^2 + \beta_{xm} \ln x_{it}\, m_i + v_{it}$
Part 23: Parameter Heterogeneity [8/115]
Application to Spanish Dairy Farms
N = 247 farms, T = 6 years (1993-1998)
Input   Units                                                  Mean      Std. Dev.   Minimum     Maximum
Milk    Milk production (liters)                               131,108   92,539      14,110      727,281
Cows    # of milking cows                                      2.12      11.27       4.5         82.3
Labor   # man-equivalent units                                 1.67      0.55        1.0         4.0
Land    Hectares of land devoted to pasture and crops          12.99     6.17        2.0         45.1
Feed    Total amount of feedstuffs fed to dairy cows (tons)    57,941    47,981      3,924.14    376,732
Part 23: Parameter Heterogeneity [9/115]
Translog Production Model
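$\ln y_{it} = \ln y_{it}^* - u_{it}$
$= \alpha + \sum_{k=1}^{K} \beta_k \ln x_{itk} + \tfrac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{K} \beta_{kl} \ln x_{itk} \ln x_{itl} + \beta_m m_i^* + \tfrac{1}{2}\beta_{mm} m_i^{*2} + \sum_{k=1}^{K} \beta_{km} \ln x_{itk}\, m_i^* + v_{it} - u_{it}$
$m_i^*$ is an unobserved, time invariant effect.
$u_{it} = \ln y_{it}^* - \ln y_{it} = \left(\beta_m + \sum_{k=1}^{K} \beta_{km} \ln x_{itk}\right)(m_i^* - m_i) + \tfrac{1}{2}\beta_{mm}\left(m_i^{*2} - m_i^2\right) \ge 0.$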
Part 23: Parameter Heterogeneity [10/115]
Random Coefficients Model
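$\ln y_{it} = \left(\alpha + \beta_m m_i^* + \tfrac{1}{2}\beta_{mm} m_i^{*2}\right) + \sum_{k=1}^{K} \left(\beta_k + \beta_{km} m_i^*\right) \ln x_{itk} + \tfrac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{K} \beta_{kl} \ln x_{itk} \ln x_{itl} + v_{it} - u_{it}$
$= \alpha_i + \sum_{k=1}^{K} \beta_{ki} \ln x_{itk} + \tfrac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{K} \beta_{kl} \ln x_{itk} \ln x_{itl} + \varepsilon_{it}$
$m_i^* = \sum_{k=1}^{K} \gamma_k \ln x_k + w_i$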
[Chamberlain/Mundlak:]
(1) Same random effect appears in each random parameter
(2) Only the first order terms are random
Part 23: Parameter Heterogeneity [11/115]
Discrete vs. Continuous Variation


Classical context: Description of how parameters are distributed across individuals
Variation
  - Discrete: Finite number of different parameter vectors distributed across individuals
      - Mixture is unknown as well as the parameters: Implies randomness from the point of view of the analyst. (Bayesian?)
      - Might also be viewed as a discrete approximation to a continuous distribution
  - Continuous: There exists a stochastic process governing the distribution of parameters, drawn from a continuous pool of candidates.
Background common assumption: An overarching stochastic process that assigns parameters to individuals
Part 23: Parameter Heterogeneity [12/115]
Discrete Parameter Variation
The Latent Class Model
(1) Population is a (finite) mixture of Q types of individuals, q = 1,...,Q. Q 'classes' differentiated by ($\beta_q$)
    (a) Analyst does not know class memberships ('latent').
    (b) 'Mixing probabilities' (from the point of view of the analyst) are $\pi_1, \dots, \pi_Q$, with $\sum_{q=1}^{Q} \pi_q = 1$
(2) Conditional density is
    $P(y_{it} \mid \text{class} = q) = f(y_{it} \mid x_{it}, \beta_q)$
Part 23: Parameter Heterogeneity [13/115]
Latent Classes




A population contains a mixture of individuals of
different types (classes)
Common form of the data generating mechanism
within the classes
Observed outcome y is governed by the
common process $F(y \mid x, \beta_j)$
Classes are distinguished by the parameters, $\beta_j$.
Part 23: Parameter Heterogeneity [14/115]
Part 23: Parameter Heterogeneity [15/115]
Part 23: Parameter Heterogeneity [16/115]
Part 23: Parameter Heterogeneity [17/115]
How Finite Mixture Models Work
Part 23: Parameter Heterogeneity [18/115]
Find the ‘Best’ Fitting Mixture of Two Normal Densities
$\text{LogL} = \sum_{i=1}^{1000} \log\left[\sum_{j=1}^{2} \pi_j \frac{1}{\sigma_j}\,\phi\!\left(\frac{y_i - \mu_j}{\sigma_j}\right)\right]$

Maximum Likelihood Estimates
            Class 1                   Class 2
        Estimate    Std. Error    Estimate    Std. Error
μ       7.05737     .77151        3.25966     .09824
σ       3.79628     .25395        1.81941     .10858
π       .28547      .05953        .71453      .05953

$\hat{F}(y) = .28547\,\frac{1}{3.79628}\,\phi\!\left(\frac{y - 7.05737}{3.79628}\right) + .71453\,\frac{1}{1.81941}\,\phi\!\left(\frac{y - 3.25966}{1.81941}\right)$
Part 23: Parameter Heterogeneity [19/115]
Mixing probabilities .715 and .285
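
The fitted mixture can be evaluated directly from the reported estimates. The following is a minimal sketch in plain Python; the function names (`normal_pdf`, `mixture_density`, `mixture_loglik`) are illustrative and not from the slides, and the 1,000 observations used for estimation are not reproduced here.

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def mixture_density(y, pis, mus, sigmas):
    """sum_j pi_j * (1/sigma_j) * phi((y - mu_j) / sigma_j)."""
    return sum(p / s * normal_pdf((y - m) / s)
               for p, m, s in zip(pis, mus, sigmas))

def mixture_loglik(ys, pis, mus, sigmas):
    """Log likelihood of a sample under the finite mixture."""
    return sum(math.log(mixture_density(y, pis, mus, sigmas)) for y in ys)

# Estimates reported on the slide (class 1, class 2).
PIS = [0.28547, 0.71453]
MUS = [7.05737, 3.25966]
SIGMAS = [3.79628, 1.81941]

# Fitted density at a few illustrative points (the points are arbitrary).
for y in (0.0, 3.0, 7.0, 12.0):
    print(y, round(mixture_density(y, PIS, MUS, SIGMAS), 5))
```

Maximizing `mixture_loglik` over the three parameter pairs, subject to the mixing probabilities summing to one, is what produces estimates of the kind shown in the table.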
Part 23: Parameter Heterogeneity [20/115]
[Figure: Approximation vs. Actual Distribution]
Part 23: Parameter Heterogeneity [21/115]
Application Shoe Brand Choice

Simulated Data: Stated Choice, 400 respondents, 8 choice situations
3 choice/attributes + NONE
  - Fashion = High=1 / Low=0
  - Quality = High=1 / Low=0
  - Price = 25/50/75/100/125, coded 1,2,3,4,5, then divided by 25
Heterogeneity: Sex, Age (<25, 25-39, 40+) categorical
Underlying data generated by a 3 class latent class process (100, 200, 100 in classes)
Thanks to www.statisticalinnovations.com (Latent Gold)
Part 23: Parameter Heterogeneity [22/115]
A Random Utility Model
Random Utility Model for Discrete Choice
Among J alternatives at time t by person i.
$U_{itj} = \alpha_j + \beta' x_{itj} + \varepsilon_{itj}$
$\alpha_j$ = Choice specific constant
$x_{itj}$ = Attributes of choice presented to person (Information processing strategy. Not all attributes will be evaluated. E.g., lexicographic utility functions over certain attributes.)
$\beta$ = ‘Taste weights,’ ‘Part worths,’ marginal utilities
$\varepsilon_{itj}$ = Unobserved random component of utility
Mean = $E[\varepsilon_{itj}] = 0$; Variance = $\mathrm{Var}[\varepsilon_{itj}] = \sigma^2$
Part 23: Parameter Heterogeneity [23/115]
The Multinomial Logit Model
Independent type 1 extreme value (Gumbel):
  - $F(\varepsilon_{itj}) = \exp(-\exp(-\varepsilon_{itj}))$
  - Independence across utility functions
  - Identical variances, $\sigma^2 = \pi^2/6$
  - Same taste parameters for all individuals
$\text{Prob}[\text{choice } j \mid i,t] = \dfrac{\exp(\alpha_j + \beta' x_{itj})}{\sum_{m=1}^{J(i,t)} \exp(\alpha_m + \beta' x_{itm})}$
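
The choice probability is a softmax over the systematic utilities. Below is a minimal sketch in plain Python. The function name `mnl_probabilities` and the example choice set are hypothetical; the coefficient values are rounded from the estimated MNL in this deck, and treating BN as the constant of the NONE alternative is an assumption about how that model was specified.

```python
import math

def mnl_probabilities(alphas, beta, X):
    """Multinomial logit probabilities for one choice situation.

    alphas: J choice-specific constants
    beta:   K taste weights shared by all alternatives
    X:      J rows of K attributes
    """
    v = [a + sum(b * x for b, x in zip(beta, row))
         for a, row in zip(alphas, X)]
    vmax = max(v)                          # subtract max for numerical stability
    expv = [math.exp(u - vmax) for u in v]
    total = sum(expv)
    return [e / total for e in expv]

# Illustrative choice set: three brands plus NONE.
# Attributes are (fashion, quality, price), price on the rescaled 1..5 / 25 grid.
BETA = [1.4789, 1.0137, -11.8023]          # BF, BQ, BP (rounded)
ALPHAS = [0.0, 0.0, 0.0, 0.0368]           # assumed: BN enters only the NONE utility
X = [
    [1.0, 0.0, 0.08],
    [0.0, 1.0, 0.12],
    [1.0, 1.0, 0.20],
    [0.0, 0.0, 0.00],
]
probs = mnl_probabilities(ALPHAS, BETA, X)
print([round(p, 4) for p in probs])
```

Because every alternative shares the same `beta`, the ratio of any two probabilities depends only on those two alternatives' utilities, which is the independence-from-irrelevant-alternatives property the latent class and random parameter models relax.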
Part 23: Parameter Heterogeneity [24/115]
Estimated MNL
+---------------------------------------------+
| Discrete choice (multinomial logit) model   |
| Log likelihood function       -4158.503     |
| Akaike IC= 8325.006   Bayes IC= 8349.289    |
| R2=1-LogL/LogL*  Log-L fncn  R-sqrd  RsqAdj |
| Constants only   -4391.1804  .05299  .05259 |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 BF          1.47890473       .06776814     21.823    .0000
 BQ          1.01372755       .06444532     15.730    .0000
 BP        -11.8023376        .80406103    -14.678    .0000
 BN           .03679254       .07176387       .513    .6082
Part 23: Parameter Heterogeneity [25/115]
Latent Classes and Random Parameters
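Heterogeneity with respect to 'latent' consumer classes
$\Pr(\text{Choice}_i) = \sum_{q=1}^{Q} \Pr(\text{choice}_i \mid \text{class} = q)\,\Pr(\text{class} = q)$
$\Pr(\text{choice}_i \mid \text{class} = q) = \dfrac{\exp(x_{i,\text{choice}}'\beta_q)}{\sum_{j} \exp(x_{i,j}'\beta_q)}$
$\Pr(\text{class} = q \mid i) = F_{i,q}$, e.g., $F_{i,q} = \dfrac{\exp(z_i'\delta_q)}{\sum_{q=1}^{Q} \exp(z_i'\delta_q)}$
Simple discrete random parameter variation
$\Pr(\text{choice}_i \mid \beta_i) = \dfrac{\exp(x_{i,\text{choice}}'\beta_i)}{\sum_{j} \exp(x_{i,j}'\beta_i)}$
$\Pr(\beta_i = \beta_q) = F_{i,q} = \dfrac{\exp(z_i'\delta_q)}{\sum_{q=1}^{Q} \exp(z_i'\delta_q)}, \quad q = 1, \dots, Q$
$\Pr(\text{Choice}_i) = \sum_{q=1}^{Q} \Pr(\text{choice} \mid \beta_i = \beta_q)\,\Pr(\beta_q)$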
Part 23: Parameter Heterogeneity [26/115]
Estimated Latent Class Model
+---------------------------------------------+
| Latent Class Logit Model                    |
| Log likelihood function       -3649.132     |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Utility parameters in latent class -->> 1
 BF|1        3.02569837       .14335927     21.106    .0000
 BQ|1        -.08781664       .12271563      -.716    .4742
 BP|1       -9.69638056      1.40807055     -6.886    .0000
 BN|1        1.28998874       .14533927      8.876    .0000
 Utility parameters in latent class -->> 2
 BF|2        1.19721944       .10652336     11.239    .0000
 BQ|2        1.11574955       .09712630     11.488    .0000
 BP|2      -13.9345351       1.22424326    -11.382    .0000
 BN|2        -.43137842       .10789864     -3.998    .0001
 Utility parameters in latent class -->> 3
 BF|3        -.17167791       .10507720     -1.634    .1023
 BQ|3        2.71880759       .11598720     23.441    .0000
 BP|3       -8.96483046      1.31314897     -6.827    .0000
 BN|3         .18639318       .12553591      1.485    .1376
 This is THETA(1) in class probability model.
 Constant    -.90344530       .34993290     -2.582    .0098
 _MALE|1      .64182630       .34107555      1.882    .0599
 _AGE25|1    2.13320852       .31898707      6.687    .0000
 _AGE39|1     .72630019       .42693187      1.701    .0889
 This is THETA(2) in class probability model.
 Constant     .37636493       .33156623      1.135    .2563
 _MALE|2    -2.76536019       .68144724     -4.058    .0000
 _AGE25|2    -.11945858       .54363073      -.220    .8261
 _AGE39|2    1.97656718       .70318717      2.811    .0049
 This is THETA(3) in class probability model.
 Constant     .000000    ......(Fixed Parameter).......
 _MALE|3      .000000    ......(Fixed Parameter).......
 _AGE25|3     .000000    ......(Fixed Parameter).......
 _AGE39|3     .000000    ......(Fixed Parameter).......
Part 23: Parameter Heterogeneity [27/115]
Latent Class Elasticities
+-----------------------------------------------------------------+
| Elasticity              Averaged over observations.             |
| Effects on probabilities of all choices in the model:           |
|                                                   MNL     LCM   |
| Attribute is PRICE in choice B1                                 |
| *  Choice=B1       .000    .000    .000          -.889   -.801  |
|    Choice=B2       .000    .000    .000           .291    .273  |
|    Choice=B3       .000    .000    .000           .291    .248  |
|    Choice=NONE     .000    .000    .000           .291    .219  |
| Attribute is PRICE in choice B2                                 |
|    Choice=B1       .000    .000    .000           .313    .311  |
| *  Choice=B2       .000    .000    .000         -1.222  -1.248  |
|    Choice=B3       .000    .000    .000           .313    .284  |
|    Choice=NONE     .000    .000    .000           .313    .268  |
| Attribute is PRICE in choice B3                                 |
|    Choice=B1       .000    .000    .000           .366    .314  |
|    Choice=B2       .000    .000    .000           .366    .344  |
| *  Choice=B3       .000    .000    .000          -.755   -.674  |
|    Choice=NONE     .000    .000    .000           .366    .302  |
+-----------------------------------------------------------------+
Part 23: Parameter Heterogeneity [28/115]
Individual Specific Means
Part 23: Parameter Heterogeneity [29/115]
A Practical Distinction

Finite Mixture (Discrete Mixture):
  - Functional form strategy
  - Component densities have no meaning
  - Mixing probabilities have no meaning
  - There is no question of “class membership”
  - The number of classes is uninteresting – enough to get a good fit
Latent Class:
  - Mixture of subpopulations
  - Component densities are believed to be definable “groups” (Low Users and High Users in Bago d’Uva and Jones application)
  - The classification problem is interesting – who is in which class?
  - Posterior probabilities, P(class|y,x) have meaning
  - Question of the number of classes has content in the context of the analysis
Part 23: Parameter Heterogeneity [30/115]
The Latent Class Model
(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid x_{it}, \text{class} = q) = g(y_{it}, x_{it}, \beta_q)$
(3) Conditional class probabilities (possibly given some information, $z_i$): $P(\text{class}=q \mid z_i, \delta)$
Common multinomial logit form for prior class probabilities
$P(\text{class}=q \mid z_i, \delta) = \pi_{iq} = \dfrac{\exp(z_i'\delta_q)}{\sum_{q=1}^{Q} \exp(z_i'\delta_q)}, \quad \delta_Q = 0$
Note, if no $z_i$, $\delta_q = \log(\pi_{iq} / \pi_{iQ})$.
Part 23: Parameter Heterogeneity [31/115]
Estimating an LC Model
Conditional density for each observation is
$P(y_{it} \mid x_{it}, \text{class} = q) = f(y_{it} \mid x_{it}, \beta_q)$
Joint conditional density for $T_i$ observations is
$f(y_{i1}, y_{i2}, \dots, y_{i,T_i} \mid X_i, \beta_q) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$
($T_i$ may be 1. This is not only a 'panel data' model.)
Maximize this for each class if the classes are known.
They aren't. Unconditional density for individual i is
$f(y_{i1}, y_{i2}, \dots, y_{i,T_i} \mid X_i, z_i) = \sum_{q=1}^{Q} \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$
Log likelihood
$\text{LogL}(\beta_1, \dots, \beta_Q, \delta_1, \dots, \delta_Q) = \sum_{i=1}^{N} \log \sum_{q=1}^{Q} \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$
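
The log likelihood above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the estimator used for the results in this deck: `class_probabilities`, `lc_loglik`, and the `density` callback are hypothetical names, and the class-specific density is left generic so any model for $f(y_{it} \mid x_{it}, \beta_q)$ can be plugged in.

```python
import math

def class_probabilities(z, deltas):
    """Prior class probabilities pi_iq in multinomial logit form.

    deltas holds Q-1 coefficient vectors; the last class is normalized
    to delta_Q = 0, as on the slide.
    """
    scores = [sum(d * zk for d, zk in zip(delta, z)) for delta in deltas]
    scores.append(0.0)                     # delta_Q = 0
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [x / total for x in e]

def lc_loglik(data, betas, deltas, density):
    """Latent class log likelihood.

    data:    list of (ys, xs, z) tuples, one per individual
    betas:   Q class-specific parameter values
    density: function (y, x, beta) -> f(y | x, beta)
    """
    total = 0.0
    for ys, xs, z in data:
        pis = class_probabilities(z, deltas)
        mixed = 0.0
        for pi_q, beta_q in zip(pis, betas):
            joint = 1.0
            for y, x in zip(ys, xs):       # product over the T_i observations
                joint *= density(y, x, beta_q)
            mixed += pi_q * joint          # mix over classes
        total += math.log(mixed)
    return total
```

The product over t is taken inside the sum over classes, so an individual's class membership is held fixed across all of that individual's observations.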
Part 23: Parameter Heterogeneity [32/115]
Estimating Which Class
Prior class probability: $\text{Prob}[\text{class} = q \mid z_i] = \pi_{iq}$
Joint conditional density for $T_i$ observations is
$P(y_{i1}, y_{i2}, \dots, y_{i,T_i} \mid X_i, \text{class} = q) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$
Joint density for data and class membership is the product
$P(y_{i1}, y_{i2}, \dots, y_{i,T_i}, \text{class} = q \mid X_i, z_i) = \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$
Posterior probability for class, given the data
$P(\text{class} = q \mid y_{i1}, \dots, y_{i,T_i}, X_i, z_i) = \dfrac{P(y_i, \text{class} = q \mid X_i, z_i)}{P(y_{i1}, \dots, y_{i,T_i} \mid X_i, z_i)} = \dfrac{P(y_i, \text{class} = q \mid X_i, z_i)}{\sum_{q=1}^{Q} P(y_i, \text{class} = q \mid X_i, z_i)}$
Use Bayes Theorem to compute the posterior (conditional) probability
$w(q \mid y_i, X_i, z_i) = P(\text{class} = q \mid y_i, X_i, z_i) = \dfrac{\pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)}{\sum_{q=1}^{Q} \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)} = w_{iq}$
Best guess = the class with the largest posterior probability.
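
As a small worked illustration of the Bayes step, the sketch below turns prior class probabilities and class-conditional joint densities into posterior class probabilities. The function name and the numbers are made up for the example; they are not estimates from any application in this deck.

```python
def posterior_class_probs(priors, class_likelihoods):
    """w_iq = pi_iq * L_iq / sum_q pi_iq * L_iq for one individual.

    priors:            pi_iq for q = 1..Q
    class_likelihoods: prod_t f(y_it | x_it, beta_q) for q = 1..Q
    """
    joint = [p * l for p, l in zip(priors, class_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical individual: three classes, illustrative inputs only.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.02, 0.10, 0.01]
w = posterior_class_probs(priors, likelihoods)
best_class = max(range(len(w)), key=lambda q: w[q])   # zero-based index
print([round(x, 4) for x in w], best_class)
```

Here the second class has the lower prior but the data favor it strongly enough that it becomes the best guess.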
Part 23: Parameter Heterogeneity [33/115]
‘Estimating’ $\beta_i$
(1) Use $\hat{\beta}_q$ from the class with the largest estimated probability
(2) Probabilistic - in the same spirit as the 'posterior mean'
$\hat{\beta}_i = \sum_{q=1}^{Q} \text{Posterior Prob}[\text{class} = q \mid \text{data}_i]\,\hat{\beta}_q = \sum_{q=1}^{Q} \hat{w}_{iq}\,\hat{\beta}_q$
Note: This estimates $E[\beta_i \mid y_i, X_i, z_i]$, not $\beta_i$ itself.
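
The probabilistic estimator is a posterior-weighted average of the class parameter vectors. A minimal sketch follows; the function name is illustrative, the class vectors are the utility parameters (BF, BQ, BP, BN) from the estimated three-class shoe model, and the posterior weights are hypothetical values chosen only to show the arithmetic.

```python
def posterior_mean_beta(weights, betas):
    """Sum over classes of w_iq * beta_q, computed element by element."""
    k = len(betas[0])
    return [sum(w * b[j] for w, b in zip(weights, betas)) for j in range(k)]

# Class-specific utility parameters (BF, BQ, BP, BN) from the latent class output.
BETAS = [
    [3.02569837, -0.08781664, -9.69638056, 1.28998874],
    [1.19721944, 1.11574955, -13.9345351, -0.43137842],
    [-0.17167791, 2.71880759, -8.96483046, 0.18639318],
]

# Hypothetical posterior class probabilities for one respondent.
w_i = [0.7, 0.2, 0.1]
beta_i = posterior_mean_beta(w_i, BETAS)
print([round(b, 4) for b in beta_i])
```

With posterior weights concentrated on one class, the estimate collapses to that class's parameter vector, which is approach (1).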
Part 23: Parameter Heterogeneity [34/115]
How Many Classes?
(1) Q is not a 'parameter' - can't 'estimate' Q with $\delta$ and $\beta$
(2) Can't 'test' down or 'up' to Q by comparing log likelihoods. Degrees of freedom for Q+1 vs. Q classes is not well defined.
(3) Use Akaike IC; AIC = $-2 \log L + 2 \times \#\text{Parameters}$.
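
As a worked example of the AIC rule, the sketch below applies it to the two shoe brand choice models reported in this deck. The parameter counts are read off the output tables (4 coefficients for the MNL; 12 utility parameters plus 8 free class-probability parameters for the three-class model); the function name is illustrative.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 logL + 2 * (number of parameters)."""
    return -2.0 * log_likelihood + 2.0 * n_params

aic_mnl = aic(-4158.503, 4)      # multinomial logit
aic_lcm = aic(-3649.132, 20)     # 3-class latent class logit
print(round(aic_mnl, 3), round(aic_lcm, 3))
```

The MNL value matches the Akaike IC printed in the estimated MNL output, and the latent class model has the smaller AIC despite its 16 additional parameters.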
Part 23: Parameter Heterogeneity [35/115]
Modeling Obesity with a
Latent Class Model
Mark Harris
Department of Economics, Curtin University
Bruce Hollingsworth
Department of Economics, Lancaster University
Pushkar Maitra
Department of Economics, Monash University
William Greene
Stern School of Business, New York University
Part 23: Parameter Heterogeneity [36/115]
300 Million People Worldwide. International Obesity Task Force: www.iotf.org
Part 23: Parameter Heterogeneity [37/115]
Costs of Obesity




In the US more people are obese than smoke or use illegal drugs
Obesity is a major risk factor for noncommunicable diseases like heart problems and cancer
Obesity is also associated with:
  - lower wages and productivity, and absenteeism
  - low self-esteem
An economic problem. It is costly to society:
  - USA costs are around 4-8% of all annual health care expenditure - US $100 billion
  - Canada, 5%; France, 1.5-2.5%; and New Zealand 2.5%
Part 23: Parameter Heterogeneity [38/115]
Measuring Obesity


An individual’s weight given their height should lie within a certain range
  - Body Mass Index (BMI)
  - Weight (kg) / height (meters)^2
World Health Organization guidelines:
  - Underweight       BMI < 18.5
  - Normal            18.5 < BMI < 25
  - Overweight        25 < BMI < 30
  - Obese             BMI > 30
  - Morbidly Obese    BMI > 40
Part 23: Parameter Heterogeneity [39/115]
Two Latent Classes: Approximately Half of European Individuals
Part 23: Parameter Heterogeneity [40/115]
Modeling BMI Outcomes


Grossman-type health production function
Health Outcomes = f(inputs)
Existing literature assumes BMI is an ordinal, not cardinal,
representation of individuals.




Weight-related health status
Do not assume a one-to-one relationship between BMI levels and
(weight-related) health status levels
Translate BMI values into an ordinal scale using WHO
guidelines
Preserves underlying ordinal nature of the BMI index but
recognizes that individuals within a so-defined weight range
are of an (approximately) equivalent (weight-related) health
status level
Part 23: Parameter Heterogeneity [41/115]
Conversion to a Discrete Measure

Measurement issues: Tendency to
under-report BMI




women tend to under-estimate/report
weight;
men over-report height.
Using bands should alleviate this
Allows focus on discrete ‘at risk’ groups
Part 23: Parameter Heterogeneity [42/115]
A Censored Regression Model for BMI
Simple Regression Approach Based on Actual BMI:
$\text{BMI}^* = \beta'x + \varepsilon$, $\varepsilon \sim N[0, \sigma^2]$, $\sigma^2 = 1$
True BMI = weight proxy is unobserved
Interval Censored Regression Approach
WT = 0 if BMI* < 25          Normal
     1 if 25 < BMI* < 30     Overweight
     2 if BMI* > 30          Obese
  - Inadequate accommodation of heterogeneity
  - Inflexible reliance on WHO classification
  - Rigid measurement by the guidelines
Part 23: Parameter Heterogeneity [43/115]
Heterogeneity in the BMI Ranges

Boundaries are set by the WHO narrowly defined for all individuals

Strictly defined WHO definitions may consequently push individuals
into inappropriate categories

We allow flexibility at the margins of these intervals

Following Pudney and Shields (2000) therefore we consider
Generalised Ordered Choice models - boundary parameters are
now functions of observed personal characteristics
Part 23: Parameter Heterogeneity [44/115]
Generalized Ordered Probit Approach
A Latent Regression Model for True BMI
$\text{BMI}_i^* = \beta'x_i + \varepsilon_i$, $\varepsilon_i \sim N[0, \sigma^2]$, $\sigma^2 = 1$
Observation Mechanism for Weight Type
$WT_i$ = 0 if $\text{BMI}_i^* < 0$                       Normal
         1 if $0 < \text{BMI}_i^* < \mu_i(w_i)$          Overweight
         2 if $\mu_i(w_i) < \text{BMI}_i^*$              Obese
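
The three outcome probabilities implied by this observation mechanism can be sketched as follows. The functional form of the boundary is not given on the slide; the code assumes $\mu_i(w_i) = \exp(\gamma'w_i)$ purely to keep the threshold positive, and the names `gop_probabilities`, `gamma`, and the example values are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gop_probabilities(x, beta, w, gamma):
    """P(WT = 0), P(WT = 1), P(WT = 2) with an individual-specific threshold."""
    xb = sum(b * v for b, v in zip(beta, x))
    mu = math.exp(sum(g * v for g, v in zip(gamma, w)))   # assumed form of mu_i(w_i)
    p_normal = norm_cdf(-xb)                  # BMI* below 0
    p_overweight = norm_cdf(mu - xb) - p_normal
    p_obese = 1.0 - norm_cdf(mu - xb)
    return [p_normal, p_overweight, p_obese]

# Illustrative values only.
p = gop_probabilities(x=[1.0, 0.0], beta=[0.0, 0.5], w=[1.0], gamma=[0.2])
print([round(v, 4) for v in p])
```

Letting the threshold depend on `w` is what allows the boundary between "overweight" and "obese" to shift across individuals rather than being fixed for everyone.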
Part 23: Parameter Heterogeneity [45/115]
Latent Class Modeling

Several ‘types’ or ‘classes.’ Obesity may be due to genetic
reasons (the FTO gene) or lifestyle factors

Distinct sets of individuals may have differing reactions
to various policy tools and/or characteristics

The observer does not know from the data which class
an individual is in.

Suggests a latent class approach for health outcomes
(Deb and Trivedi, 2002, and Bago d’Uva, 2005)
Part 23: Parameter Heterogeneity [46/115]
Latent Class Application

Two class model (considering FTO gene):



More classes make class interpretations much more
difficult
Parametric models proliferate parameters
Endogenous class membership: Two classes
allow us to correlate the equations driving class
membership and observed weight outcomes via
unobservables.
Part 23: Parameter Heterogeneity [47/115]
Heterogeneous Class Probabilities
$\pi_j$ = Prob(class=j) = governor of a detached natural process. Homogeneous.
$\pi_{ij}$ = Prob(class=j|$z_i$, individual i)
Now possibly a behavioral aspect of the process, no longer “detached” or “natural”
Nagin and Land 1993, “Criminal Careers…
Part 23: Parameter Heterogeneity [48/115]
Endogeneity of Class Membership
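Class Membership: $C^* = \delta'z_i + u_i$, $C = 1[C^* > 0]$ (Probit)
BMI | Class = 0,1: $\text{BMI}^* = \beta_c'x_i + \varepsilon_{c,i}$, BMI group = OP[$\text{BMI}^*$, $\mu(\gamma_c'w_i)$]
Endogeneity: $\begin{pmatrix} u_i \\ \varepsilon_{c,i} \end{pmatrix} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho_c \\ \rho_c & 1 \end{pmatrix}\right]$
Bivariate Ordered Probit (one variable is binary).
Full information maximum likelihood.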
Part 23: Parameter Heterogeneity [49/115]
Model Components



x: determines observed weight levels within classes
  - For observed weight levels we use lifestyle factors such as marital status and exercise levels
z: determines latent classes
  - For latent class determination we use genetic proxies such as age, gender and ethnicity: the things we can’t change
w: determines position of boundary parameters within classes
  - For the boundary parameters we have: weight-training intensity and age (BMI inappropriate for the aged?), pregnancy (small numbers and length of term unknown)
Part 23: Parameter Heterogeneity [50/115]
Data




US National Health Interview Survey
(2005); conducted by the National
Center for Health Statistics
Information on self-reported height and
weight levels, BMI levels
Demographic information
Split sample (30,000+) by gender
Part 23: Parameter Heterogeneity [51/115]
Outcome Probabilities



Class 0 dominated by normal and overweight probabilities ‘normal weight’ class
Class 1 dominated by probabilities at top end of the scale ‘non-normal weight’
Unobservables for weight class membership, negatively correlated with those
determining weight levels:
Part 23: Parameter Heterogeneity [52/115]
[Figure: outcome probabilities (Normal, Overweight, Obese) for Class 1 and Class 0]
Part 23: Parameter Heterogeneity [53/115]
Classification (Latent Probit) Model
Part 23: Parameter Heterogeneity [54/115]
BMI Ordered Choice Model






Conditional on class membership, lifestyle factors
  - Marriage comfort factor only for normal class women
  - Both classes associated with income, education
  - Exercise effects similar in magnitude
  - Exercise intensity only important for ‘non-normal’ class
  - Home ownership only important for ‘non-normal’ class, and negative: result of differing socioeconomic status distributions across classes?
Part 23: Parameter Heterogeneity [55/115]
Effects of Aging on Weight Class
Part 23: Parameter Heterogeneity [56/115]
Effect of Education on Probabilities
Part 23: Parameter Heterogeneity [57/115]
Effect of Income on Probabilities
Part 23: Parameter Heterogeneity [58/115]
Inflated Responses in Self-Assessed Health
Mark Harris
Department of Economics, Curtin University
Bruce Hollingsworth
Department of Economics, Lancaster University
William Greene
Stern School of Business, New York University
Part 23: Parameter Heterogeneity [59/115]
Introduction



Health sector an important part of developed countries’
economies: E.g., Australia 9% of GDP
To see if these resources are being effectively utilized,
we need to fully understand the determinants of
individuals’ health levels
To this end much policy, and even more academic
research, is based on measures of self-assessed health
(SAH) from survey data
Part 23: Parameter Heterogeneity [60/115]
SAH vs. Objective Health Measures
Favorable SAH categories seem artificially high.
  - 60% of Australians are either overweight or obese (Dunstan et al., 2001)
  - 1 in 4 Australians has either diabetes or a condition of impaired glucose metabolism
  - Over 50% of the population has elevated cholesterol
  - Over 50% has at least 1 of the “deadly quartet” of health conditions (diabetes, obesity, high blood pressure, high cholesterol)
  - Nearly 4 out of 5 Australians have 1 or more long term health conditions (National Health Survey, Australian Bureau of Statistics 2006)
  - Australia ranked #1 in terms of obesity rates
Similar results appear for other countries
Part 23: Parameter Heterogeneity [61/115]
SAH vs. Objective Health
1. Are these SAH outcomes “over-inflated”?
2. And if so, why, and what kinds of people
are doing the over-inflating/misreporting?
Part 23: Parameter Heterogeneity [62/115]
HILDA Data
The Household, Income and Labour Dynamics in
Australia (HILDA) dataset:
1. a longitudinal survey of households in Australia
2. well tried and tested dataset
3. contains a host of information on SAH and other health
measures, as well as numerous demographic variables
Part 23: Parameter Heterogeneity [63/115]
Self Assessed Health

“In general, would you say your health is: Excellent,
Very good, Good, Fair or Poor?"

Responses 1,2,3,4,5 (we will be using 0,1,2,3,4)
Typically ¾ of responses are “good” or “very good”
health; in our data (HILDA) we get 72%
Similar numbers for most developed countries

Does this truly represent the health of the nation?


Part 23: Parameter Heterogeneity [64/115]
Part 23: Parameter Heterogeneity [65/115]
A Two Class Latent Class Model
True Reporter
Misreporter
Part 23: Parameter Heterogeneity [66/115]
Reporter Type Model
$r^* = x_r'\beta_r + \varepsilon_r$
r = 1 if $r^* > 0$      True reporter
    0 if $r^* \le 0$    Misreporter
r is unobserved
Part 23: Parameter Heterogeneity [67/115]
Y=4
Y=3
Y=2
Y=1
Y=0
Part 23: Parameter Heterogeneity [68/115]
Pr(true,y) = Pr(true) * Pr(y | true)
Part 23: Parameter Heterogeneity [69/115]


Mis-reporters choose either good or very good
The response is determined by a probit model
$m^* = x_m'\beta_m + \varepsilon_m$
Y=3
Y=2
Part 23: Parameter Heterogeneity [70/115]
Part 23: Parameter Heterogeneity [71/115]
Observed Mixture of Two Classes
Part 23: Parameter Heterogeneity [72/115]
$\Pr(y) = \Pr(\text{true}) \Pr(y \mid \text{true}) + \Pr(\text{misreporter}) \Pr(y \mid \text{misreporter})$
Part 23: Parameter Heterogeneity [73/115]
Part 23: Parameter Heterogeneity [74/115]
Who are the Misreporters?
Part 23: Parameter Heterogeneity [75/115]
Priors and Posteriors
M = Misreporter, T = True reporter
Priors: $\Pr(M) = \Phi(-x_r'\beta_r)$, $\Pr(T) = \Phi(x_r'\beta_r)$
Posteriors:
Noninflated outcomes 0, 1, 4
$\Pr(M \mid y = 0, 1, 4) = 0$, $\Pr(T \mid y = 0, 1, 4) = 1$
Inflated outcomes 2, 3
$\Pr(M \mid y = 2) = \dfrac{\Pr(y = 2 \mid M)\Pr(M)}{\Pr(y = 2 \mid M)\Pr(M) + \Pr(y = 2 \mid T)\Pr(T)}$
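
The posterior for an inflated outcome is a direct application of Bayes' theorem. The sketch below takes the component probabilities as inputs rather than computing them from the probit and ordered probit parts of the model; the function name and the numerical inputs are hypothetical.

```python
def posterior_misreporter(p_true, p_y_given_true, p_y_given_mis):
    """Pr(M | y) from the prior Pr(T) and the two class-conditional probabilities."""
    p_mis = 1.0 - p_true
    numerator = p_y_given_mis * p_mis
    denominator = numerator + p_y_given_true * p_true
    return numerator / denominator if denominator > 0.0 else 0.0

# Inflated outcome (y = 2): both reporter types can produce it.
post_inflated = posterior_misreporter(p_true=0.8, p_y_given_true=0.3, p_y_given_mis=0.6)

# Noninflated outcome (y = 0, 1 or 4): misreporters never choose it.
post_noninflated = posterior_misreporter(p_true=0.8, p_y_given_true=0.1, p_y_given_mis=0.0)

print(round(post_inflated, 4), post_noninflated)
```

In this illustration, observing an inflated response raises the probability of being a misreporter from the prior of 0.2 to one third, while a noninflated response rules it out.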
Part 23: Parameter Heterogeneity [76/115]
General Results
Part 23: Parameter Heterogeneity [77/115]
Part 23: Parameter Heterogeneity [78/115]
Latent Class Efficiency Studies



Battese and Coelli – growing in weather
“regimes” for Indonesian rice farmers
Kumbhakar and Orea – cost structures for U.S.
Banks
Greene (Health Economics, 2005) – revisits
WHO Year 2000 World Health Report
Part 23: Parameter Heterogeneity [79/115]
Studying Economic Efficiency
in Health Care

Hospital and Nursing Home



Cost efficiency
Role of quality (not studied today)
Agency for Healthcare Research and Quality
(AHRQ)
Part 23: Parameter Heterogeneity [80/115]
Stochastic Frontier Analysis


logC = f(output, input prices, environment) + v + u
ε = v + u
  - v = noise – the usual “disturbance”
  - u = inefficiency
Frontier efficiency analysis
  - Estimate parameters of model
  - Estimate u (to the extent we are able – we use E[u|ε])
  - Evaluate and compare observed firms in the sample
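
The slide does not state the distributions behind E[u|ε]. A common choice is the normal/half-normal model, under which the conditional mean has the closed form of Jondrow, Lovell, Materov and Schmidt. The sketch below assumes that specification for a cost frontier (ε = v + u); the function names and example values are illustrative.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_inefficiency(eps, sigma_u, sigma_v):
    """E[u | eps] for a cost frontier with v ~ N(0, sigma_v^2), u ~ |N(0, sigma_u^2)|.

    u | eps is N(mu_star, sigma_star^2) truncated at zero (assumed model).
    """
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)

# Illustrative residuals for three firms.
for eps in (-0.5, 0.0, 0.5):
    print(eps, round(expected_inefficiency(eps, sigma_u=1.0, sigma_v=1.0), 4))
```

Larger positive residuals (cost above the frontier) map to larger estimated inefficiency, but the estimate stays strictly positive even for negative residuals, which is why it is described as an estimate of u only "to the extent we are able".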
Part 23: Parameter Heterogeneity [81/115]
Nursing Home Costs




44 Swiss nursing homes, 13 years
Cost, Pk, Pl, output, two environmental
variables
Estimate cost function
Estimate inefficiency
Part 23: Parameter Heterogeneity [82/115]
Estimated Cost Efficiency
Part 23: Parameter Heterogeneity [83/115]
Inefficiency?



Not all agree with the presence (or
identifiability) of “inefficiency” in market
outcomes data.
Variation around the common production
structure may all be nonsystematic and not
controlled by management
Implication, no inefficiency: u = 0.
Part 23: Parameter Heterogeneity [84/115]
A Two Class Model

Class 1: With Inefficiency
  - logC = f(output, input prices, environment) + $\sigma_v v + \sigma_u u$
Class 2: Without Inefficiency
  - logC = f(output, input prices, environment) + $\sigma_v v$
  - $\sigma_u = 0$
Implement with a single zero restriction in a constrained (same cost function) two class model
Parameterization: $\lambda = \sigma_u / \sigma_v = 0$ in class 2.
Part 23: Parameter Heterogeneity [85/115]
LogL= 464 with a common frontier
model, 527 with two classes
Part 23: Parameter Heterogeneity [86/115]
Part 23: Parameter Heterogeneity [87/115]
Random Parameters (Mixed) Models
A General Model Structure
  $f(y_{it} \mid x_{it}, \beta_i) = g(y_{it} \mid x_{it}, \beta_i, \theta)$
  $\beta_i$ = a set of random parameters = $\beta + u_i$
  $f(\beta_i \mid z_i) = h(\beta_i, z_i, \Omega)$
  $\theta$ = a set of nonrandom parameters in the density of $y_{it}$
  $\Omega$ = a set of parameters in the distribution of $\beta_i$
Typical application "repeated measures" = panel
The "mixed" model
  $f(y_{it} \mid x_{it}, z_i, \theta, \Omega) = \int_{\beta_i} f(y_{it} \mid x_{it}, \beta_i, \theta)\, h(\beta_i, z_i, \Omega)\, d\beta_i$
forms the basis of a likelihood function for the observed data.
Part 23: Parameter Heterogeneity [88/115]
Mixed Model Estimation

WinBUGS:
  - MCMC
  - User specifies the model – constructs the Gibbs Sampler/Metropolis Hastings
SAS: Proc Mixed.
  - Classical
  - Uses primarily a kind of GLS/GMM (method of moments algorithm for loglinear models)
Stata: Classical
  - Mixing done by quadrature. (Very slow for 2 or more dimensions)
  - Several loglinear models - GLLAMM
LIMDEP/NLOGIT
  - Classical
  - Mixing done by Monte Carlo integration – maximum simulated likelihood
  - Numerous linear, nonlinear, loglinear models
Ken Train’s Gauss Code
  - Monte Carlo integration
  - Used by many researchers
  - Mixed Logit (mixed multinomial logit) model only (but free!)
Programs differ on the models fitted, the algorithms, the paradigm, and the
extensions provided to the simplest RPM, $\beta_i = \beta + w_i$.
Part 23: Parameter Heterogeneity [89/115]
Modeling Parameter Heterogeneity
Conditional Model, linear or nonlinear
density: $f(y_{it} \mid x_{it}, \beta_i, \theta) = g(y_{it}, x_{it}, \beta_i, \theta)$
Individual heterogeneity in the means of the parameters
$\beta_i = \beta + \Delta z_i + u_i$, $E[u_i \mid X_i, z_i] = 0$
Heterogeneity in the variances of the parameters
$\mathrm{Var}[u_{i,k} \mid z_i] = \phi_{ik} = \sigma_k \exp(z_i'\delta_k)$
$\mathrm{Var}[u_i \mid z_i] = \Phi_i = \mathrm{diag}(\phi_{ik})$
(Different variables in $z_i$ may appear in means and variances.)
Free correlation: $\mathrm{Var}[u_i \mid z_i] = \Sigma_i = \Gamma \Phi_i \Gamma'$, $\Gamma$ = a lower triangular
matrix with 1s on the diagonal.
Part 23: Parameter Heterogeneity [90/115]
A Mixed Probit Model
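Random parameters probit model
$f(y_{it} \mid x_{it}, \beta_i) = \Phi[(2y_{it} - 1)\, x_{it}'\beta_i]$
$\beta_i = \beta + u_i$
$u_i \sim N[0, \Sigma]$, $\Sigma = \Gamma \Lambda^2 \Gamma'$
$\Lambda$ = diagonal matrix of standard deviations
$\Gamma$ = lower triangular matrix, or I if uncorrelated
$\text{LogL}(\beta, \Gamma, \Lambda) = \sum_{i=1}^{N} \log \int_{\beta_i} \prod_{t=1}^{T_i} \Phi[(2y_{it} - 1)\, x_{it}'\beta_i]\; N[\beta, \Gamma \Lambda^2 \Gamma']\, d\beta_i$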
Part 23: Parameter Heterogeneity [91/115]
Maximum Simulated Likelihood
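$\log L(\theta, \Omega) = \sum_{i=1}^{N} \log \int_{\beta_i} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_i, \theta)\, h(\beta_i \mid z_i, \Omega)\, d\beta_i$
$\Omega = \beta, \Delta, \sigma_1, \dots, \sigma_K, \delta_1, \dots, \delta_K, \Gamma$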
Part 23: Parameter Heterogeneity [92/115]
Simulated Log Likelihood for a Mixed
Probit Model
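Random parameters probit model
$f(y_{it} \mid x_{it}, \beta_i) = \Phi[(2y_{it} - 1)\, x_{it}'\beta_i]$
$\beta_i = \beta + u_i$
$u_i \sim N[0, \Gamma \Lambda^2 \Gamma']$
$\text{LogL}(\beta, \Gamma, \Lambda) = \sum_{i=1}^{N} \log \int_{\beta_i} \prod_{t=1}^{T_i} \Phi[(2y_{it} - 1)\, x_{it}'\beta_i]\; N[\beta, \Gamma \Lambda^2 \Gamma']\, d\beta_i$
$\text{LogL}_S = \sum_{i=1}^{N} \log \dfrac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} \Phi[(2y_{it} - 1)\, x_{it}'(\beta + \Gamma \Lambda v_{ir})]$
We now maximize this function with respect to $(\beta, \Gamma, \Lambda)$.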
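
The simulated log likelihood can be written out in a few lines. This is a sketch in plain Python, assuming standard normal draws for $v_{ir}$ and a user-supplied lower-triangular matrix `chol` that plays the role of the product $\Gamma\Lambda$; the function names, the data layout, and the fixed seed are choices made for illustration, not a description of any package's implementation.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulated_loglik(data, beta, chol, R=200, seed=12345):
    """Simulated log likelihood for a random parameters probit.

    data: one list per individual of (y, x) pairs, y in {0, 1}, x a list of K values
    beta: K means of the random parameters
    chol: K x K lower-triangular matrix standing in for Gamma * Lambda
    R:    number of simulation draws per individual
    """
    rng = random.Random(seed)       # same seed -> same draws at every evaluation
    K = len(beta)
    total = 0.0
    for obs in data:
        sim = 0.0
        for _ in range(R):
            v = [rng.gauss(0.0, 1.0) for _ in range(K)]
            # beta_ir = beta + (Gamma Lambda) v_ir
            b_ir = [beta[k] + sum(chol[k][j] * v[j] for j in range(k + 1))
                    for k in range(K)]
            prod = 1.0
            for y, x in obs:        # product over the T_i observations
                index = sum(b * xk for b, xk in zip(b_ir, x))
                prod *= norm_cdf((2 * y - 1) * index)
            sim += prod
        total += math.log(sim / R)
    return total
```

Reusing the same draws at every function evaluation keeps the simulated objective smooth in the parameters, so a numerical optimizer can be applied to it. Setting `chol` to a matrix of zeros removes the heterogeneity and reproduces the ordinary pooled probit log likelihood.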
Part 23: Parameter Heterogeneity [93/115]
Application – Doctor Visits
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations per person ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987). Note, the variable NUMOBS below tells how many observations there are for each person. This variable is repeated in each row of the data for the person.
Variables in the file are
DOCTOR  = 1(Number of doctor visits > 0)
HSAT    = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS  = number of doctor visits in last three months
HOSPVIS = number of hospital visits in last calendar year
PUBLIC  = insured in public health insurance = 1; otherwise = 0
ADDON   = insured by add-on insurance = 1; otherwise = 0
HHNINC  = household nominal monthly net income in German marks / 10000.
          (4 observations with income=0 were dropped)
HHKIDS  = children under age 16 in the household = 1; otherwise = 0
EDUC    = years of schooling
AGE     = age in years
MARRIED = marital status
Part 23: Parameter Heterogeneity [94/115]
Estimates of a Mixed Probit Model
+---------------------------------------------+
| Random Coefficients Probit Model            |
| Dependent variable               DOCTOR     |
| Log likelihood function       -16483.96     |
| Restricted log likelihood     -17700.96     |
| Unbalanced panel has   7293 individuals.    |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Means for random parameters
 Constant    -.09594899       .04049528     -2.369    .0178
 AGE          .02102471       .00053836     39.053    .0000   43.5256898
 HHNINC      -.03119127       .03383027      -.922    .3565     .35208362
 EDUC        -.02996487       .00265133    -11.302    .0000   11.3206310
 MARRIED     -.03664476       .01399541     -2.618    .0088     .75861817
+---------+--------------+----------------+--------+---------+----------+
 Constant     .02642358       .05397131       .490    .6244
 AGE          .01538640       .00071823     21.423    .0000   43.5256898
 HHNINC      -.09775927       .04626475     -2.113    .0346     .35208362
 EDUC        -.02811308       .00350079     -8.031    .0000   11.3206310
 MARRIED     -.00930667       .01887548      -.493    .6220     .75861817
Part 23: Parameter Heterogeneity [95/115]
Random Parameters Probit
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
Diagonal elements of Cholesky matrix
Constant       .55259608        .05381892   10.268     .0000
AGE          .279052D-04        .00041019     .068     .9458
HHNINC         .03545309        .04094725     .866     .3866
EDUC           .00994387        .00093271   10.661     .0000
MARRIED        .01013553        .00643526    1.575     .1153
Below diagonal elements of Cholesky matrix
lAGE_ONE       .00668600        .00071466    9.355     .0000
lHHN_ONE      -.23713634        .04341767   -5.462     .0000
lHHN_AGE       .09364751        .03357731    2.789     .0053
lEDU_ONE       .01461359        .00355382    4.112     .0000
lEDU_AGE      -.00189900        .00167248   -1.135     .2562
lEDU_HHN       .00991594        .00154877    6.402     .0000
lMAR_ONE      -.04871097        .01854192   -2.627     .0086
lMAR_AGE      -.02059540        .01362752   -1.511     .1307
lMAR_HHN      -.12276339        .01546791   -7.937     .0000
lMAR_EDU       .09557751        .01233448    7.749     .0000
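The Cholesky elements reported on this slide are not themselves variances. A minimal sketch of how one might recover the implied covariance matrix of the random parameters follows, assuming the covariance equals the lower-triangular factor times its transpose and that the labels (for example lHHN_AGE) identify the row and column in the order Constant, AGE, HHNINC, EDUC, MARRIED. Both points are assumptions read off the labels, not something the slide states.

```python
import numpy as np

# Assumed ordering, read off the row labels: ONE (constant), AGE, HHNINC, EDUC, MARRIED.
names = ["Constant", "AGE", "HHNINC", "EDUC", "MARRIED"]

# Lower-triangular factor assembled from the reported estimates
# (.279052D-04 is Fortran notation for 2.79052e-05).
L = np.array([
    [ 0.55259608,  0.0,          0.0,         0.0,        0.0       ],
    [ 0.00668600,  0.279052e-04, 0.0,         0.0,        0.0       ],
    [-0.23713634,  0.09364751,   0.03545309,  0.0,        0.0       ],
    [ 0.01461359, -0.00189900,   0.00991594,  0.00994387, 0.0       ],
    [-0.04871097, -0.02059540,  -0.12276339,  0.09557751, 0.01013553],
])

cov = L @ L.T                    # implied covariance matrix of the random parameters
sd = np.sqrt(np.diag(cov))       # implied standard deviations
corr = cov / np.outer(sd, sd)    # implied correlation matrix

for name, s in zip(names, sd):
    print(f"{name:<9s} implied sd = {s:.6f}")
```

Under this reading, a small diagonal element (as for AGE) does not by itself mean a small variance, because the below-diagonal elements in the same row also contribute.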
Part 23: Parameter Heterogeneity [96/115]
Application Shoe Brand Choice

Simulated Data: Stated Choice, 400 respondents, 8 choice situations
3 choice/attributes + NONE
Fashion = High=1 / Low=0
Quality = High=1 / Low=0
Price = 25/50/75/100/125, coded 1,2,3,4,5, then divided by 25
Heterogeneity: Sex, Age (<25, 25-39, 40+) categorical
Underlying data generated by a 3 class latent class process (100, 200, 100 in classes)
Thanks to www.statisticalinnovations.com (Latent Gold and Jordan Louviere)
Part 23: Parameter Heterogeneity [97/115]
A Discrete (4 Brand) Choice Model with Heterogeneous
and Heteroscedastic Random Parameters
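$U_{i,1,t} = \beta_{F,i}\,Fashion_{i,1,t} + \beta_Q\,Quality_{i,1,t} + \beta_{P,i}\,Price_{i,1,t} + \varepsilon_{i,1,t}$
$U_{i,2,t} = \beta_{F,i}\,Fashion_{i,2,t} + \beta_Q\,Quality_{i,2,t} + \beta_{P,i}\,Price_{i,2,t} + \varepsilon_{i,2,t}$
$U_{i,3,t} = \beta_{F,i}\,Fashion_{i,3,t} + \beta_Q\,Quality_{i,3,t} + \beta_{P,i}\,Price_{i,3,t} + \varepsilon_{i,3,t}$
$U_{i,NONE,t} = \alpha_{NONE} + \varepsilon_{i,NONE,t}$
$\beta_{F,i} = \beta_F + \delta_F\,Sex_i + [\sigma_F \exp(\gamma_{F1}\,AgeL25_i + \gamma_{F2}\,Age2539_i)]\,w_{F,i}; \quad w_{F,i} \sim N[0,1]$
$\beta_{P,i} = \beta_P + \delta_P\,Sex_i + [\sigma_P \exp(\gamma_{P1}\,AgeL25_i + \gamma_{P2}\,Age2539_i)]\,w_{P,i}; \quad w_{P,i} \sim N[0,1]$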
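To make the heterogeneous, heteroscedastic parameter specification concrete, here is a minimal sketch that draws one random coefficient per respondent. The function name, the 0/1 coding of Sex, the random assignment of age groups, and all numeric parameter values are hypothetical choices for illustration; none of them are the estimates reported in these slides.

```python
import numpy as np

def draw_random_coefficient(beta, delta, sigma, gamma1, gamma2,
                            sex, age_l25, age_2539, rng):
    """Draw beta_i = beta + delta*Sex_i + [sigma*exp(g1*AgeL25_i + g2*Age2539_i)]*w_i."""
    w = rng.standard_normal(len(sex))                    # w_i ~ N[0,1]
    sd_i = sigma * np.exp(gamma1 * age_l25 + gamma2 * age_2539)
    return beta + delta * sex + sd_i * w, sd_i

rng = np.random.default_rng(12345)
n = 400                                                  # respondents in the application
sex = rng.integers(0, 2, n)                              # hypothetical 0/1 coding
age_group = rng.integers(0, 3, n)                        # 0: <25, 1: 25-39, 2: 40+
age_l25 = (age_group == 0).astype(float)
age_2539 = (age_group == 1).astype(float)

# Illustrative parameter values only (assumptions, not estimates from the slides).
beta_f, sd_f = draw_random_coefficient(1.5, 0.3, 1.0, -0.3, -0.5,
                                       sex, age_l25, age_2539, rng)

# The mean shifts with Sex; the standard deviation shifts with the age group.
for g, label in enumerate(["<25", "25-39", "40+"]):
    print(label, "sd of beta_F,i:", sd_f[age_group == g][:1])
```

In this form the 40+ group is the base category, so its standard deviation is simply sigma, and the two gamma coefficients scale it for the younger groups.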
Part 23: Parameter Heterogeneity [98/115]
Multinomial Logit Model Estimates
Part 23: Parameter Heterogeneity [99/115]
Mixed Logit Estimates
+---------------------------------------------+
| Random Parameters Logit Model               |
| Log likelihood function          -3911.945  |
| At start values -4158.5029 .05929 .05811    |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
Random parameters in utility functions
BF            1.46523951        .12626655   11.604     .0000
BQ            1.14369857        .16954024    6.746     .0000
Nonrandom parameters in utility functions
BP           -12.1098155        .91584476  -13.223     .0000
BN             .17706909        .07784730    2.275     .0229
Heterogeneity in mean, Parameter:Variable
BF:MAL         .28052695        .14266576    1.966     .0493
BQ:MAL        -.42310284        .20387789   -2.075     .0380
Derived standard deviations of parameter distributions
NsBF          1.16430284        .13731611    8.479     .0000
NsBQ          1.81872569        .18108194   10.044     .0000
Heteroscedasticity in random parameters
sBF|AG        -.32466344        .16986949   -1.911     .0560
sBF0|AG       -.51032609        .23975740   -2.129     .0333
sBQ|AG        -.37953350        .13798031   -2.751     .0059
sBQ0|AG       -.41636803        .17143046   -2.429     .0151
Part 23: Parameter Heterogeneity [100/115]
Estimated Elasticities
+--------------------------------------------------------------+
| Elasticity averaged over observations.                       |
| Effects on probabilities of all choices in the model:        |
| Attribute is PRICE in choice B1      RPL     MNL     LCM     |
| *  Choice=B1        .000    .000   -.818   -.889   -.801     |
|    Choice=B2        .000    .000    .240    .291    .273     |
|    Choice=B3        .000    .000    .244    .291    .248     |
|    Choice=NONE      .000    .000    .241    .291    .219     |
| Attribute is PRICE in choice B2                              |
|    Choice=B1        .000    .000    .291    .313    .311     |
| *  Choice=B2        .000    .000  -1.100  -1.222  -1.248     |
|    Choice=B3        .000    .000    .270    .313    .284     |
|    Choice=NONE      .000    .000    .276    .313    .268     |
| Attribute is PRICE in choice B3                              |
|    Choice=B1        .000    .000    .287    .366    .314     |
|    Choice=B2        .000    .000    .326    .366    .344     |
| *  Choice=B3        .000    .000   -.647   -.755   -.674     |
|    Choice=NONE      .000    .000    .311    .366    .302     |
+--------------------------------------------------------------+
Part 23: Parameter Heterogeneity [101/115]
Conditional Estimators
Counterpart to Bayesian posterior mean and variance
$\hat\Omega = \arg\max_{\Omega} \sum_{i=1}^{N} \log \frac{1}{R}\sum_{r=1}^{R}\prod_{t=1}^{T_i} P_{ijt}(\beta_{ir} \mid \Omega, data_{it})$
$\hat L_{ir} = \prod_{t=1}^{T_i} P_{ijt}(\hat\beta_{ir} \mid \hat\Omega, data_{it})$
$\hat E[\beta_{i,k} \mid data_i] = \frac{(1/R)\sum_{r=1}^{R} \hat\beta_{i,k,r}\,\hat L_{ir}}{(1/R)\sum_{r=1}^{R} \hat L_{ir}} = \frac{1}{R}\sum_{r=1}^{R} \hat w_{i,r}\,\hat\beta_{i,k,r}$
$\hat E[\beta_{i,k}^2 \mid data_i] = \frac{(1/R)\sum_{r=1}^{R} \hat\beta_{i,k,r}^2\,\hat L_{ir}}{(1/R)\sum_{r=1}^{R} \hat L_{ir}} = \frac{1}{R}\sum_{r=1}^{R} \hat w_{i,r}\,\hat\beta_{i,k,r}^2$
$\widehat{Var}[\beta_{i,k} \mid data_i] = \hat E[\beta_{i,k}^2 \mid data_i] - \left(\hat E[\beta_{i,k} \mid data_i]\right)^2$
$\hat E[\beta_{i,k} \mid data_i] \pm 2\sqrt{\widehat{Var}[\beta_{i,k} \mid data_i]}$ will encompass 95% of any
reasonable distribution
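The conditional mean and variance on this slide are likelihood-weighted averages over simulated draws. A minimal sketch for one individual follows. It assumes a binary logit for the choice probability and a normal population distribution for the draws, both chosen only to keep the example short; the function name and the simulated data are hypothetical.

```python
import numpy as np

def conditional_moments(y, X, beta_draws):
    """Simulated conditional mean and variance of beta_i given one person's data.

    y: (T,) array of 0/1 outcomes; X: (T, K) regressors;
    beta_draws: (R, K) draws from the estimated population distribution.
    A binary logit stands in for the choice probability (an assumption).
    """
    index = X @ beta_draws.T                          # (T, R) index for every draw
    p1 = 1.0 / (1.0 + np.exp(-index))
    probs = np.where(y[:, None] == 1, p1, 1.0 - p1)   # probability of the observed outcome
    lik = probs.prod(axis=0)                          # likelihood of the T outcomes per draw
    w = lik / lik.mean()                              # weights that average to one
    mean = (w[:, None] * beta_draws).mean(axis=0)
    second = (w[:, None] * beta_draws ** 2).mean(axis=0)
    return mean, second - mean ** 2, w

rng = np.random.default_rng(0)
R, K, T = 500, 2, 8
# Hypothetical population distribution: independent normals with these means and sds.
beta_draws = np.array([0.5, -1.0]) + rng.standard_normal((R, K)) * np.array([1.0, 0.5])
X = rng.standard_normal((T, K))
y = (rng.random(T) < 0.5).astype(int)

mean, var, w = conditional_moments(y, X, beta_draws)
half_width = 2.0 * np.sqrt(np.maximum(var, 0.0))
print("conditional mean:", mean)
print("mean - 2 sd:", mean - half_width)
print("mean + 2 sd:", mean + half_width)
```

With no observations for the person, every weight equals one and the conditional mean collapses to the population mean of the draws, which is a convenient sanity check.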
Part 23: Parameter Heterogeneity [102/115]
Individual E[βi|datai] Estimates
Part 23: Parameter Heterogeneity [103/115]
Disaggregated Parameters

The description of classical methods as only producing aggregate results
is obviously untrue.

As regards “targeting specific groups…” both of these sets of methods
produce estimates for the specific data in hand. Unless we want to trot
out the specific individuals in this sample to do the analysis and marketing,
any extension is problematic. This should be understood in both
paradigms.

NEITHER METHOD PRODUCES ESTIMATES OF INDIVIDUAL
PARAMETERS, CLAIMS TO THE CONTRARY NOTWITHSTANDING. BOTH
PRODUCE ESTIMATES OF THE MEAN OF THE CONDITIONAL
(POSTERIOR) DISTRIBUTION OF POSSIBLE PARAMETER DRAWS
CONDITIONED ON THE PRECISE SPECIFIC DATA FOR INDIVIDUAL I.
Part 23: Parameter Heterogeneity [104/115]
Appendix: EM Algorithm
Part 23: Parameter Heterogeneity [105/115]
The EM Algorithm
Latent Class is a 'missing data' model
$d_{i,j} = 1$ if individual i is a member of class j
If $d_{i,j}$ were observed, the complete data log likelihood would be
$\log L_c = \sum_{i=1}^{N} \log \left[ \sum_{j=1}^{J} d_{i,j} \prod_{t=1}^{T_i} f(y_{i,t} \mid data_{i,t}, class = j) \right]$
(Only one of the J terms would be nonzero.)
Expectation - Maximization algorithm has two steps
(1) Expectation Step: Form the 'Expected log likelihood'
given the data and a prior guess of the parameters.
(2) Maximize the expected log likelihood to obtain a new
guess for the model parameters.
(E.g., http://crow.ee.washington.edu/people/bulyko/papers/em.pdf)
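The two steps can be sketched for a deliberately simple latent class model. The example assumes a balanced panel of 0/1 outcomes in which each class has its own constant success probability and the class probabilities do not depend on covariates. The function name, the simulated data, and the starting values are illustrative assumptions, not the model estimated in these slides.

```python
import numpy as np

def em_latent_class(y, n_classes=2, iters=500, tol=1e-10, seed=0):
    """EM for a latent class Bernoulli model. y is an (N, T) array of 0/1 outcomes."""
    rng = np.random.default_rng(seed)
    n, t = y.shape
    s = y.sum(axis=1)                                    # successes per individual
    pi = np.full(n_classes, 1.0 / n_classes)             # start each class share at 1/Q
    # Start from the one-class MLE, perturbed so the classes are not identical.
    p = np.clip(y.mean() + rng.uniform(-0.1, 0.1, n_classes), 0.05, 0.95)
    path = []
    for _ in range(iters):
        # log f(y_i | class q) = s_i log p_q + (T - s_i) log(1 - p_q)
        logf = np.outer(s, np.log(p)) + np.outer(t - s, np.log1p(-p))
        joint = logf + np.log(pi)                        # (N, Q)
        m = joint.max(axis=1, keepdims=True)
        logmix = m[:, 0] + np.log(np.exp(joint - m).sum(axis=1))
        path.append(logmix.sum())                        # observed-data log likelihood
        if len(path) > 1 and path[-1] - path[-2] < tol:
            break
        w = np.exp(joint - logmix[:, None])              # E step: posterior class probabilities
        # M step: weighted MLE of each class probability, then new class shares.
        p = np.clip((w * s[:, None]).sum(axis=0) / (t * w.sum(axis=0)), 1e-8, 1 - 1e-8)
        pi = w.mean(axis=0)
    return pi, p, np.array(path)

# Simulated data: 40% of individuals in a class with p = 0.2, 60% with p = 0.7.
rng = np.random.default_rng(1)
N, T = 2000, 8
in_first = rng.random(N) < 0.4
p_true = np.where(in_first, 0.2, 0.7)
y = (rng.random((N, T)) < p_true[:, None]).astype(int)

pi_hat, p_hat, path = em_latent_class(y)
print("class shares:", pi_hat, "class probabilities:", p_hat)
```

The log likelihood recorded in `path` should never decrease from one iteration to the next, which is the usual diagnostic that the two steps are coded consistently. Class labels are arbitrary, so estimates should be compared after sorting.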
Part 23: Parameter Heterogeneity [106/115]
Implementing EM
Given initial guesses $\pi_{iq}^0 = (\pi_{i1}^0, \pi_{i2}^0, \ldots, \pi_{iQ}^0)$ and $\beta_q^0 = (\beta_1^0, \beta_2^0, \ldots, \beta_Q^0)$.
E.g., use 1/Q for each $\pi_{iq}$ and the MLE of β from a one class
model. (Each one has to be perturbed slightly: if all $\pi_{iq}$ are equal
and all $\beta_q$ are the same, the model already satisfies the FOC.)
(1) Compute $\hat w(q|i)$ = posterior class probabilities, using $\hat\beta^0$, $\hat\delta^0$.
Reestimate each $\beta_q$ using a weighted log likelihood:
Maximize wrt $\beta_q$: $\sum_{i=1}^{N} \hat w_{iq} \sum_{t=1}^{T_i} \log f(y_{it} \mid x_{it}, \beta_q)$
(2) Reestimate $\pi_{iq}$ by reestimating $\delta_q$.
If no $z_i$, new $\hat\pi_q = (1/N)\sum_{i=1}^{N} \hat w(q|i)$, using old $\hat\pi$ and new $\hat\beta$.
If $z_i$, maximize wrt $\delta_q$: $\sum_{i=1}^{N} \hat w(q|i) \log \frac{\exp(z_i\delta_q)}{\sum_{q=1}^{Q}\exp(z_i\delta_q)}$
Now, return to step 1.
Iterate until convergence.
Part 23: Parameter Heterogeneity [107/115]
Appendix: Monte Carlo Integration
Part 23: Parameter Heterogeneity [108/115]
Monte Carlo Integration
(1) Integral is of the form
$K = \int_{\text{range of } v} g(v \mid data, \beta)\, f(v \mid \Omega)\, dv$
where f(v) is the density of random variable v
possibly conditioned on a set of parameters Ω
and g(v|data,β) is a function of data and parameters.
(2) By construction, K(Ω) = E[g(v|data,β)]
(3) Strategy:
a. Sample R values from the population
of v using a random number generator.
b. Compute average $\bar K = (1/R)\sum_{r=1}^{R} g(v_r \mid data, \beta)$
By the law of large numbers, plim $\bar K$ = K.
Part 23: Parameter Heterogeneity [109/115]
Monte Carlo Integration
$\frac{1}{R}\sum_{r=1}^{R} f(u_{ir}) \xrightarrow{P} \int_{u_i} f(u_i)\, g(u_i)\, du_i = E_{u_i}[f(u_i)]$
(Certain smoothness conditions must be met.)
Drawing $u_{ir}$ by 'random sampling'
$u_{ir} = t(v_{ir}), \quad v_{ir} \sim U[0,1]$
E.g., $u_{ir} = \sigma\,\Phi^{-1}(v_{ir}) + \mu$ for $N[\mu, \sigma^2]$
Requires many draws, typically
hundreds or thousands
Part 23: Parameter Heterogeneity [110/115]
Example: Monte Carlo Integral
$\int_{-\infty}^{\infty} \Phi(x_1 + .9v)\,\Phi(x_2 + .9v)\,\Phi(x_3 + .9v)\, \frac{\exp(-v^2/2)}{\sqrt{2\pi}}\, dv$
where $\Phi$ is the standard normal CDF and $x_1 = .5$, $x_2 = -.2$, $x_3 = .3$. (Looks like a RE probit model.)
The weighting function for v is the standard normal.
Strategy: Draw R (say 1000) standard normal random
draws, $v_r$. Compute the 1000 functions
$\Phi(x_1 + .9v_r)\,\Phi(x_2 + .9v_r)\,\Phi(x_3 + .9v_r)$ and average them.
(Based on 100, 1000, 10000, I get .28746, .28437, .27242)
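The strategy can be sketched directly. The example below averages the product of three normal CDFs over standard normal draws and, as a cross-check that is not part of the slides, compares the result with a Gauss-Hermite quadrature of the same integral. The function names and the choice of random number generator are assumptions, so the simulated values will not reproduce the three figures quoted above.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def Phi(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(z, dtype=float) / math.sqrt(2.0)))

def g(v, x=(0.5, -0.2, 0.3), scale=0.9):
    """Integrand: product of Phi(x_j + .9 v) over the three x values."""
    v = np.asarray(v, dtype=float)
    out = np.ones_like(v)
    for xj in x:
        out = out * Phi(xj + scale * v)
    return out

def mc_integral(R, rng):
    """Average g over R standard normal draws."""
    return float(g(rng.standard_normal(R)).mean())

def quadrature_integral(nodes=40):
    """Gauss-Hermite rule for the weight exp(-v^2/2); the weights sum to sqrt(2*pi)."""
    v, w = hermegauss(nodes)
    return float((w * g(v)).sum() / math.sqrt(2.0 * math.pi))

rng = np.random.default_rng(2024)
for R in (100, 1000, 10000):
    print(R, round(mc_integral(R, rng), 5))
print("quadrature:", round(quadrature_integral(), 5))
```

The simulation error shrinks at the usual rate of one over the square root of R, which is why the estimates for small R can differ noticeably from one another.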
Part 23: Parameter Heterogeneity [111/115]
Generating a Random Draw
Most common approach is the "inverse probability transform"
Let u = a random draw from the standard uniform (0,1).
Let x = the desired population to draw from
Assume the CDF of x is F(x).
The random draw is then $x = F^{-1}(u)$.
Example: exponential, θ. $f(x) = \theta\exp(-\theta x)$, $F(x) = 1 - \exp(-\theta x)$
Equate u to F(x), $x = -(1/\theta)\log(1-u)$.
Example: Normal(μ,σ). Inverse function does not exist in
closed form. There are good polynomial approximations to produce a draw v from N[0,1] from a U(0,1).
Then $x = \mu + \sigma v$.
This leaves the question of how to draw the U(0,1).
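A minimal sketch of the inverse probability transform for the two examples follows. The function names are hypothetical, and the normal case relies on the inverse CDF in Python's standard library rather than any particular polynomial approximation.

```python
from statistics import NormalDist
import numpy as np

def draw_exponential(theta, size, rng):
    """x = -(1/theta) * log(1 - u), with u drawn from the standard uniform."""
    u = rng.random(size)
    return -np.log1p(-u) / theta

def draw_normal(mu, sigma, size, rng):
    """x = mu + sigma * v, where v is the standard normal inverse CDF evaluated at u."""
    # Keep u strictly inside (0, 1) so the inverse CDF is defined.
    u = np.clip(rng.random(size), 1e-12, 1.0 - 1e-12)
    v = np.vectorize(NormalDist().inv_cdf)(u)
    return mu + sigma * v

rng = np.random.default_rng(7)
x_exp = draw_exponential(theta=2.0, size=100_000, rng=rng)
x_norm = draw_normal(mu=1.0, sigma=2.0, size=100_000, rng=rng)
print("exponential sample mean (theory 1/theta = 0.5):", x_exp.mean())
print("normal sample mean and sd (theory 1.0, 2.0):", x_norm.mean(), x_norm.std())
```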
Part 23: Parameter Heterogeneity [112/115]
Drawing Uniform Random Numbers
Computer generated random numbers are not random; they
are Markov chains that look random.
The Original Random Number Generator for 32 bit computers.
SEED originates at some large odd number
d3 = 2147483647.0
d2 = 2147483655.0
d1=16807.0
SEED=Mod(d1*SEED,d3) ! MOD(a,p) = a - INT(a/p) * p
X=SEED/d2 is a random value between 0 and 1.
Problems:
(1) Short period. Based on 32 bits, so recycles after $2^{31} - 1$ values
(2) Evidently not very close to random. (Recent tests have
discredited this RNG)
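The recursion above is short enough to write out in full. This sketch follows the constants on the slide, using exact integer arithmetic for the modulus step instead of floating point; the function name is hypothetical.

```python
def lehmer_stream(seed, n):
    """Generate n uniforms with SEED = Mod(d1*SEED, d3), X = SEED/d2."""
    d1 = 16807                 # multiplier
    d3 = 2147483647            # modulus, 2**31 - 1
    d2 = 2147483655.0          # divisor slightly above d3, so X stays below 1
    draws = []
    for _ in range(n):
        seed = (d1 * seed) % d3
        draws.append(seed / d2)
    return draws, seed

draws, last_seed = lehmer_stream(seed=1, n=10000)
print(draws[:3])
print(last_seed)
```

Each draw is a deterministic function of the previous seed, so the same starting seed always reproduces the same stream.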
Part 23: Parameter Heterogeneity [113/115]
L’Ecuyer’s RNG
Define: norm = 2.328306549295728e-10,
m1 = 4294967087.0, m2 = 4294944443.0,
a12 = 1403580.0, a13n = 810728.0,
a21 = 527612.0, a23n = 1370589.0.
Initialize s10 = the seed,
s11 = 4231773.0, s12 = 1975.0,
s20 = 137228743.0, s21 = 98426597.0, s22 = 142859843.0.
Preliminaries for each draw (Resets at least some of 5 seeds)
p1 = a12*s11 - a13n*s10, k = int(p1/m1), p1 = p1 - k*m1
if p1 < 0, p1 = p1 + m1; then s10 = s11, s11 = s12, s12 = p1;
p2 = a21*s22 - a23n*s20, k = int(p2/m2), p2 = p2 - k*m2
if p2 < 0, p2 = p2 + m2; then s20 = s21, s21 = s22, s22 = p2;
Compute the random number
u = norm*(p1 - p2) if p1 > p2,
u = norm*(p1 - p2 + m1) otherwise.
Passes all known randomness tests.
Period = $2^{191}$
Pierre L'Ecuyer. Canada Research Chair in Stochastic Simulation and
Optimization. Département d'informatique et de recherche opérationnelle
University of Montreal.
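A compact sketch of the recursion follows, using the initial state values listed on the slide. It assumes a12 = 1403580, the constant in L'Ecuyer's published MRG32k3a generator, and it replaces the int-and-adjust step with Python's integer modulus, which gives the same nonnegative remainder. The class name is hypothetical.

```python
NORM = 2.328306549295728e-10
M1, M2 = 4294967087, 4294944443
A12, A13N = 1403580, 810728      # a12 as in the published MRG32k3a (assumption noted above)
A21, A23N = 527612, 1370589

class CombinedMRG:
    """Combined multiple recursive generator with two three-term components."""

    def __init__(self, seed):
        self.s1 = [int(seed), 4231773, 1975]            # s10, s11, s12
        self.s2 = [137228743, 98426597, 142859843]      # s20, s21, s22

    def random(self):
        s10, s11, s12 = self.s1
        p1 = (A12 * s11 - A13N * s10) % M1              # first component
        self.s1 = [s11, s12, p1]
        s20, s21, s22 = self.s2
        p2 = (A21 * s22 - A23N * s20) % M2              # second component
        self.s2 = [s21, s22, p2]
        if p1 > p2:
            return NORM * (p1 - p2)
        return NORM * (p1 - p2 + M1)

gen = CombinedMRG(seed=12345)
sample = [gen.random() for _ in range(10000)]
print(sample[:3], sum(sample) / len(sample))
```

Using Python integers avoids any concern about floating point rounding in the products, at the cost of departing from the double-precision form shown on the slide.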
Part 23: Parameter Heterogeneity [114/115]
Quasi-Monte Carlo Integration Based on
Halton Sequences
Coverage of the unit interval is the objective,
not randomness of the set of draws.
Halton sequences --- Markov chain
p = a prime number,
r = the sequence of integers, decomposed as $r = \sum_{i=0}^{I} b_i p^i$
$H(r|p) = \sum_{i=0}^{I} b_i p^{-i-1}$, r = r1, ... (e.g., 10,11,12,...)
For example, using base p=5, the integer r=37 has b0 =
2, b1 = 2, and b2 = 1; ($37 = 1 \times 5^2 + 2 \times 5^1 + 2 \times 5^0$). Then
$H(37|5) = 2 \times 5^{-1} + 2 \times 5^{-2} + 1 \times 5^{-3} = 0.488$.
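The digit reversal is easy to code. The sketch below computes one Halton value by peeling off base-p digits, then lists a short stretch of the base-2 sequence to show how it fills the unit interval; the function name is hypothetical. Working the base-5 example by hand, the digits of 37 are b0 = 2, b1 = 2, b2 = 1, so the value is 2/5 + 2/25 + 1/125 = 0.488.

```python
def halton(r, p):
    """Halton value H(r|p): reflect the base-p digits of r about the decimal point."""
    h, scale = 0.0, 1.0 / p
    while r > 0:
        r, b = divmod(r, p)      # b is the next base-p digit, lowest order first
        h += b * scale           # digit b_i contributes b_i * p**(-i-1)
        scale /= p
    return h

print(halton(37, 5))
print([halton(r, 2) for r in range(1, 8)])
```

Successive integers land in different parts of the interval, which is the coverage property the slide refers to.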
Part 23: Parameter Heterogeneity [115/115]
Halton Sequences vs.
Random Draws
Requires far fewer draws – for one dimension, about 1/10.
Accelerates estimation by a factor of 5 to 10.