True Random Effects in Stochastic Frontier Models
William Greene
New York University
North American Productivity Workshop
Ottawa, June 6, 2014
1/78
Agenda
Skew normality – Adelchi Azzalini
2/78
Stochastic frontier model
Panel Data: Time invariant inefficiency models
Panel Data: Time varying inefficiency models
Panel Data: True random effects models
Applications of true random effects
Spatial effects in a stochastic frontier model
Persistent and transient inefficiency in Swiss railroads
A panel data sample selection corrected stochastic frontier model
Skew Normality
3/78
The Stochastic Frontier Model
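$\ln y_i = \alpha + \beta' x_i + v_i - u_i$,
$v_i \sim N[0, \sigma_v^2]$,
$u_i = |U_i|$, $U_i \sim N[0, \sigma_u^2]$,
$\varepsilon_i = v_i - u_i = v_i - |U_i|$
Convenient parameterization (notation)
$\varepsilon_i = \sigma_v V_i - \sigma_u |U_i| = \sigma_v N[0,1]_i - \sigma_u |N[0,1]|$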
4/78
Log Likelihood
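$\lambda = \sigma_u / \sigma_v$, $\sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$
$\log L(\alpha, \beta, \sigma, \lambda) = \sum_{i=1}^{N} \left[ \log\dfrac{2}{\sigma} + \log\phi\left(\dfrac{y_i - \alpha - \beta' x_i}{\sigma}\right) + \log\Phi\left(\dfrac{-\lambda (y_i - \alpha - \beta' x_i)}{\sigma}\right) \right]$
$= \sum_{i=1}^{N} \log\left[ \dfrac{2}{\sigma}\, \phi\left(\dfrac{\varepsilon_i}{\sigma}\right) \Phi\left(\dfrac{-\lambda \varepsilon_i}{\sigma}\right) \right]$
The term in square brackets in the second line is the Skew Normal Density.
5/78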
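As a hedged illustration, the normal-half normal log likelihood can be coded directly. This is a minimal sketch assuming NumPy and SciPy; the function name `sf_loglik`, the argument layout, and the simulated data are illustrative assumptions, not part of the talk.

```python
import numpy as np
from scipy.stats import norm

def sf_loglik(alpha, beta, sigma, lam, y, X):
    """Normal-half normal stochastic frontier log likelihood (production form).

    sigma = sqrt(sigma_u^2 + sigma_v^2), lam = sigma_u / sigma_v.
    """
    eps = y - alpha - X @ beta                    # eps_i = y_i - alpha - beta'x_i
    terms = (np.log(2.0 / sigma)
             + norm.logpdf(eps / sigma)           # log phi(eps_i / sigma)
             + norm.logcdf(-lam * eps / sigma))   # log Phi(-lam * eps_i / sigma)
    return terms.sum()

# Illustrative simulated data only (hypothetical parameter values).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
b = np.array([0.5, 0.3])
y = 1.0 + X @ b + 0.10 * rng.normal(size=200) - 0.15 * np.abs(rng.normal(size=200))
ll = sf_loglik(1.0, b, np.hypot(0.10, 0.15), 1.5, y, X)
```

Setting `lam = 0` removes the skewness term, so the function collapses to the ordinary normal log likelihood, which is a convenient check on the coding.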
Birnbaum (1950) Wrote About Skew Normality
Effect of Linear Truncation on a Multinormal Population
6/78
Weinstein (1964) Found $f(\varepsilon)$
Query 2: The Sum of Values from a Normal and a Truncated Normal Distribution
See, also, Nelson (Technometrics, 1964), Roberts (JASA, 1966)
7/78
O’Hagan and Leonard (1976) Found Something Like $f(\varepsilon)$
Resembles $f(\varepsilon)$
Bayes Estimation Subject to Uncertainty About Parameter Constraints
8/78
ALS (1977) Discovered How to Make Great Use of $f(\varepsilon)$
See, also, Forsund and Hjalmarsson (1974), Battese and Corra (1976)
Poirier,… Timmer, … several others.
9/78
Azzalini (1985) Figured Out $f(\varepsilon)$
And Noticed the Connection to ALS
The standard skew normal distribution
$f(\varepsilon) = 2\phi(\varepsilon)\Phi(\lambda\varepsilon)$
10/78
© 2014
http://azzalini.stat.unipd.it/SN/abstracts.html#sn99
ALS
11/78
http://azzalini.stat.unipd.it/SN/
12/78
A Useful FAQ About the Skew Normal
How to generate pseudo random draws on $\varepsilon$
1. Draw U, V from independent N[0,1]
2. $\varepsilon = \sigma_v V + \sigma_u |U|$
13/78
Random Number Generator
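For a particular desired $\lambda$ and $\sigma^2 = \sigma_v^2 + \sigma_u^2$,
use $\sigma_u = \dfrac{\lambda\sigma}{\sqrt{1+\lambda^2}}$ and $\sigma_v = \dfrac{\sigma}{\sqrt{1+\lambda^2}}$.
Then $\varepsilon = \sigma_v N(0,1) + \sigma_u |N(0,1)|$
14/78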
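The two-step recipe can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `skew_normal_draws`, the seed, and the values of sigma and lambda are hypothetical. It follows the FAQ form with a plus sign on the |U| term; for the frontier composed error v - u the sign on that term is reversed.

```python
import numpy as np

def skew_normal_draws(sigma, lam, n, rng):
    """Draw n values of eps = sigma_v * V + sigma_u * |U| with U, V ~ N[0,1]."""
    sigma_u = sigma * lam / np.sqrt(1.0 + lam**2)
    sigma_v = sigma / np.sqrt(1.0 + lam**2)
    U = rng.standard_normal(n)
    V = rng.standard_normal(n)
    eps = sigma_v * V + sigma_u * np.abs(U)
    return eps, sigma_u, sigma_v

rng = np.random.default_rng(12345)
eps, sigma_u, sigma_v = skew_normal_draws(sigma=1.0, lam=2.0, n=200_000, rng=rng)
```

By construction `sigma_u / sigma_v` equals the requested lambda and `sigma_u**2 + sigma_v**2` equals the requested sigma squared, so the pair (sigma, lambda) fully determines the draws.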
How Many Applications of SF Are There?
15/78
W. D. Walls (2006) On Skewness in the Movies
16/78
Cites Azzalini.
$2\phi(z)\Phi(\lambda z)$
SNARCH Model for Financial Crises (2013)
“The skew-normal distribution developed by Sahu et al. (2003)…”
Does not know Azzalini.
17/78
A Skew Normal Mixed Logit Model (2010)
Mixed Logit Model
$\text{Prob}(\text{Choice}_i = j) = \dfrac{\exp(\beta_i' x_{ij})}{\sum_{j=1}^{J} \exp(\beta_i' x_{ij})}$
Random Parameters
ik wik
Asymmetric (Skewed) Parameter Distribution
wik vik | U ik |~ SN (0, , )
Greene (2010, knows Azzalini and ALS),
Bhat (2011, knows not Azzalini … or ALS)
18/78
Skew Normal Applications
Foundation: An Entire Field
Stochastic Frontier Model
Occasional Modeling Strategy
Culture: Skewed Distribution of Movie Revenues
Finance: Crisis and Contagion
Choice Modeling: The Mixed Logit Model
How can these people find each other?
Where else do applications appear?
19/78
Stochastic Frontier
20/78
The Cross Section Departure Point: 1977
Aigner et al. (ALS) Stochastic Frontier Model
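$y_i = \alpha + \beta' x_i + v_i - u_i$
$v_i \sim N[0, \sigma_v^2]$
$u_i = |U_i|$ and $U_i \sim N[0, \sigma_u^2]$
Jondrow et al. (JLMS) Inefficiency Estimator
$\hat{u}_i = E[u_i \mid \varepsilon_i] = \dfrac{\sigma\lambda}{1+\lambda^2}\left[\mu_i + \dfrac{\phi(\mu_i)}{\Phi(\mu_i)}\right]$
$\varepsilon_i = v_i - u_i$, $\lambda = \sigma_u/\sigma_v$, $\sigma = \sqrt{\sigma_v^2 + \sigma_u^2}$, $\mu_i = -\lambda\varepsilon_i/\sigma$
21/78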
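The JLMS conditional mean for the normal-half normal model is a one-line computation once the residuals are in hand. A minimal sketch assuming NumPy and SciPy; the function name `jlms` and the numerical inputs are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def jlms(eps, sigma_u, sigma_v):
    """E[u | eps] for the normal-half normal model with eps = v - u."""
    eps = np.asarray(eps, dtype=float)
    sigma = np.hypot(sigma_u, sigma_v)           # sqrt(sigma_u^2 + sigma_v^2)
    lam = sigma_u / sigma_v
    mu = -lam * eps / sigma
    return sigma * lam / (1.0 + lam**2) * (mu + norm.pdf(mu) / norm.cdf(mu))

# Hypothetical residuals and variance parameters, for illustration only.
u_hat = jlms([-0.3, -0.1, 0.0, 0.1], sigma_u=0.15, sigma_v=0.10)
```

The estimate is always positive and falls as the residual rises, so large negative residuals are attributed mostly to inefficiency.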
The Panel Data Models Appear: 1981
Pitt and Lee Random Effects Approach: 1981
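$y_{it} = \alpha + \beta' x_{it} + v_{it} - u_i$ ($u_i$ is time fixed)
$v_{it} \sim N[0, \sigma_v^2]$, $u_i = |U_i|$ and $U_i \sim N[0, \sigma_u^2]$
$\varepsilon_{it} = v_{it} - u_i$
Counterpart to Jondrow et al. (1982)
$\hat{u}_i = E[u_i \mid \varepsilon_{i1}, \ldots, \varepsilon_{iT}] = \mu_i + \sigma_* \dfrac{\phi(\mu_i/\sigma_*)}{\Phi(\mu_i/\sigma_*)}$
$\mu_i = -\dfrac{\lambda T \bar{\varepsilon}_i}{1 + \lambda T}$, $\lambda = \dfrac{\sigma_u^2}{\sigma_v^2}$, $\sigma_*^2 = \dfrac{\sigma_u^2}{1 + \lambda T}$
22/78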
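The Pitt and Lee counterpart conditions on all T residuals of a firm: the posterior of u_i is a normal truncated at zero, and its mean is the estimator. A minimal sketch assuming NumPy and SciPy; the function name `pitt_lee_u` is hypothetical, and the variance-ratio definition of lambda used in the code is stated in the comment.

```python
import numpy as np
from scipy.stats import norm

def pitt_lee_u(eps, sigma_u, sigma_v):
    """E[u_i | eps_i1, ..., eps_iT] for eps of shape (N, T), eps_it = v_it - u_i."""
    eps = np.asarray(eps, dtype=float)
    T = eps.shape[1]
    lam = sigma_u**2 / sigma_v**2                       # variance ratio (not sigma_u / sigma_v)
    mu = -lam * T * eps.mean(axis=1) / (1.0 + lam * T)  # location of the truncated normal
    s = sigma_u / np.sqrt(1.0 + lam * T)                # scale of the truncated normal
    return mu + s * norm.pdf(mu / s) / norm.cdf(mu / s)

# Hypothetical residuals for two firms over three periods.
u_hat = pitt_lee_u([[-0.2, -0.1, -0.3], [0.05, 0.0, -0.05]], 0.15, 0.10)
```

With T = 1 the expression reduces to the cross-section JLMS formula, and the scale shrinks as T grows, which is why longer panels pin down time fixed inefficiency more tightly.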
Reinterpreting the Within Estimator: 1984
Schmidt and Sickles Fixed Effects Approach: 1984
$y_{it} = \alpha_i + \beta' x_{it} + v_{it}$ ($\alpha_i$ is time fixed)
$v_{it} \sim N[0, \sigma_v^2]$, $\alpha_i$ semiparametrically specified:
fixed mean, constant variance.
Counterpart to Jondrow et al. (1982)
$\hat{u}_i = \max_i(\hat{\alpha}_i) - \hat{\alpha}_i$
(The cost of the semiparametric specification is the location of the inefficiency distribution. The authors also revisit Pitt and Lee to demonstrate.)
23/78
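The Schmidt and Sickles calculation can be sketched with the within estimator: estimate the slopes from deviations from firm means, recover the firm constants, and measure each firm against the largest one. A minimal sketch assuming NumPy; the function name `schmidt_sickles` and the noise-free toy data are illustrative assumptions.

```python
import numpy as np

def schmidt_sickles(y, X, firm):
    """Within estimator, then u_i = max_j(alpha_j) - alpha_i."""
    _, idx = np.unique(firm, return_inverse=True)
    counts = np.bincount(idx)
    ybar = np.bincount(idx, weights=y) / counts
    Xbar = np.column_stack([np.bincount(idx, weights=X[:, k]) / counts
                            for k in range(X.shape[1])])
    # Slopes from deviations from firm means (no constant needed).
    beta, *_ = np.linalg.lstsq(X - Xbar[idx], y - ybar[idx], rcond=None)
    alpha = ybar - Xbar @ beta                   # firm specific constants
    return beta, alpha, alpha.max() - alpha

firm = np.repeat(np.arange(3), 4)                # 3 firms, 4 periods each
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 1))
y = np.array([1.0, 2.0, 3.0])[firm] + 0.5 * X[:, 0]   # noise-free toy data
beta, alpha, u = schmidt_sickles(y, X, firm)
```

The best firm is assigned zero inefficiency by construction, which is the sense in which the location of the inefficiency distribution is lost.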
Misgivings About Time Fixed Inefficiency: 1990-
Cornwell, Schmidt and Sickles (1990)
$\alpha_{it} = \theta_{0i} + \theta_{1i} t + \theta_{2i} t^2$
Kumbhakar (1990)
$u_{it} = [1 + \exp(bt + ct^2)]^{-1} |U_i|$
Battese and Coelli (1992, 1995)
$u_{it} = \exp[-\eta(t - T)] |U_i|$, $u_{it} = \exp[g(t, T, z_{it})] |U_i|$
Cuesta (2000)
$u_{it} = \exp[-\eta_i(t - T)] |U_i|$, $u_{it} = \exp[g_i(t, T, z_{it})] |U_i|$
24/78
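In these specifications every period's inefficiency is a deterministic multiple of the same firm draw |U_i|. A minimal sketch of two of the multipliers, assuming NumPy; the function names and the parameter values are hypothetical.

```python
import numpy as np

def bc92_scale(t, T, eta):
    """Battese-Coelli (1992) multiplier exp[-eta (t - T)] applied to |U_i|."""
    return np.exp(-eta * (np.asarray(t, dtype=float) - T))

def kumbhakar90_scale(t, b, c):
    """Kumbhakar (1990) multiplier [1 + exp(b t + c t^2)]^(-1)."""
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + np.exp(b * t + c * t**2))

t = np.arange(1, 7)                      # six periods, illustrative
abs_U = 0.2                              # one firm's |U_i|, illustrative
u_decay = bc92_scale(t, T=6, eta=0.1) * abs_U
u_flat = bc92_scale(t, T=6, eta=0.0) * abs_U
```

With `eta = 0` the Battese and Coelli path is flat and the model is the Pitt and Lee model; with nonzero `eta` the path moves, but only along a single smooth curve shared by all firms.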
Are the systematically time varying models
more like time fixed or freely time varying?
A Pooled Model
$y_{it} = \alpha + \beta' x_{it} + v_{it} - u_{it}$
Battese and Coelli (1992)
$u_{it} = \exp[-\eta(t - T)] |U_i|$
Pitt and Lee (1981)
$y_{it} = \alpha + \beta' x_{it} + v_{it} - |U_i|$
Where is Battese and Coelli?
Closer to the pooled model or to Pitt and Lee?
Greene (2004): Much closer to the Pitt and Lee model
25/78
In these models with time varying inefficiency,
$y_{it} = \alpha + \beta' x_{it} + v_{it} - g_i(t, z_{it}) |U_i|$
$v_{it} \sim N[0, \sigma_v^2]$ and $U_i \sim N[0, \sigma_u^2]$,
where does unobserved time invariant
heterogeneity end up?
In the inefficiency! Even with the extensions.
26/78
Skepticism About Time Varying Inefficiency
Models: Greene (2004)
27/78
True Random Effects
28/78
True Random and Fixed Effects: 2004
True Random and Fixed Effects Approach: 2004
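$y_{it} = \alpha_i + \beta' x_{it} + v_{it} - u_{it}$ ($u_{it}$ is time varying; $\alpha_i$ is time fixed)
$v_{it} \sim N[0, \sigma_v^2]$, $u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$
$\alpha_i$ = Unobserved time invariant heterogeneity,
not unobserved time invariant inefficiency
Jondrow et al. (JLMS) Inefficiency Estimator
$E[u_{it} \mid \varepsilon_{it}] = \dfrac{\sigma\lambda}{1+\lambda^2}\left[\mu_{it} + \dfrac{\phi(\mu_{it})}{\Phi(\mu_{it})}\right]$
$\varepsilon_{it} = v_{it} - u_{it}$, $\lambda = \sigma_u/\sigma_v$, $\sigma = \sqrt{\sigma_v^2 + \sigma_u^2}$, $\mu_{it} = -\lambda\varepsilon_{it}/\sigma$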
29/78
Estimation of TFE and TRE Models: 2004
True Fixed Effects: MLE
$y_{it} = \alpha_i + \beta' x_{it} + v_{it} - u_{it}$
$v_{it} \sim N[0, \sigma_v^2]$, $u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$
$\alpha_i$ = Unobserved time invariant heterogeneity,
not unobserved time invariant inefficiency
Just add firm dummy variables to the SF model (!)
True Random Effects: Maximum Simulated Likelihood (RPM)
$y_{it} = (\alpha + w_i) + \beta' x_{it} + v_{it} - u_{it}$
$v_{it} \sim N[0, \sigma_v^2]$, $u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$, $w_i \sim N[0, \sigma_w^2]$
$\alpha_i$ = Unobserved time invariant heterogeneity,
not unobserved time invariant inefficiency
Random parameters stochastic frontier model
30/78
Log likelihood function for stochastic frontier model
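$\log L(\alpha, \beta, \sigma, \lambda) = \sum_{i=1}^{N} \log\left[ \dfrac{2}{\sigma}\, \phi\left(\dfrac{y_i - \alpha - \beta' x_i}{\sigma}\right) \Phi\left(\dfrac{-\lambda (y_i - \alpha - \beta' x_i)}{\sigma}\right) \right]$
31/78
Simulated log likelihood function for stochastic frontier model
with a time invariant random constant term. (TRE model)
$\log L_S(\alpha, \beta, \lambda, \sigma, \sigma_w) = \sum_{i=1}^{N} \log \dfrac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \dfrac{2}{\sigma}\, \phi\left(\dfrac{y_{it} - (\alpha + \sigma_w w_{ir}) - \beta' x_{it}}{\sigma}\right) \Phi\left(\dfrac{-\lambda (y_{it} - (\alpha + \sigma_w w_{ir}) - \beta' x_{it})}{\sigma}\right)$
$w_{ir}$ = draws from N[0,1].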
32/78
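The simulated log likelihood for the TRE model averages, over R draws of the random constant, the product over t of the normal-half normal densities. A minimal sketch assuming NumPy and SciPy; the function name `tre_sim_loglik`, the array layout, and the simulated data are illustrative assumptions. The product and average are computed in logs for numerical stability.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def tre_sim_loglik(alpha, beta, sigma, lam, sigma_w, y, X, w):
    """Simulated log likelihood for the TRE model.

    y: (N, T) outputs; X: (N, T, K) regressors;
    w: (N, R) draws from N[0,1], held fixed across evaluations.
    """
    resid = y - X @ beta                                   # (N, T)
    const = alpha + sigma_w * w                            # (N, R) random constants
    eps = resid[:, None, :] - const[:, :, None]            # (N, R, T)
    log_f = (np.log(2.0 / sigma) + norm.logpdf(eps / sigma)
             + norm.logcdf(-lam * eps / sigma))
    log_prod = log_f.sum(axis=2)                           # product over t, in logs
    return np.sum(logsumexp(log_prod, axis=1) - np.log(w.shape[1]))

# Illustrative simulated panel (hypothetical parameter values).
rng = np.random.default_rng(0)
N, T, K, R = 50, 6, 2, 100
X = rng.normal(size=(N, T, K))
beta = np.array([0.5, 0.3])
y = (1.0 + 0.2 * rng.normal(size=(N, 1)) + X @ beta
     + 0.10 * rng.normal(size=(N, T)) - 0.15 * np.abs(rng.normal(size=(N, T))))
w = rng.standard_normal((N, R))
ll = tre_sim_loglik(1.0, beta, 0.18, 1.5, 0.2, y, X, w)
```

Holding the draws `w` fixed across function evaluations keeps the simulated objective smooth in the parameters, which an optimizer needs. With `sigma_w = 0` the function reduces to the pooled log likelihood.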
The Most Famous Frontier Study Ever
33/78
The Famous WHO Model
$\log \text{COMP} = \alpha + \beta_1 \log \text{PerCapitaHealthExpenditure} + \beta_2 \log \text{YearsEduc} + \beta_3 \log^2 \text{YearsEduc} + \varepsilon$
$\varepsilon = v - u$
Schmidt/Sickles FEM
191 Countries.
140 of them observed 1993-1997.
34/78
The Notorious WHO Results
35/78
August 12, 2012
37
No, it doesn’t.
36/78
$x = (1, \log \text{Exp}, \log \text{Ed}, \log^2 \text{Ed})$
$z = (\log \text{PopDen}, \log \text{PerCapitaGDP}, \text{GovtEff}, \text{VoxPopuli}, \text{OECD}, \text{GINI})$
37/78
Greene, W., Distinguishing Between
Heterogeneity and Inefficiency:
Stochastic Frontier Analysis of the
World Health Organization’s Panel
Data on National Health Care
Systems, Health Economics, 13, 2004,
pp. 959-980.
38/78
Three Extensions of the
True Random Effects Model
39/78
Spatial Stochastic Frontier Models: Accounting for Unobserved
Local Determinants of Inefficiency: A.M.Schmidt, A.R.B.Morris,
S.M.Helfand, T.C.O.Fonseca, Journal of Productivity Analysis, 31,
2009, pp. 101-112
Simply redefines the random effect to be a ‘region effect.’ Just a
reinterpretation of the ‘group.’ No spatial decay with distance.
True REM does not “perform” as well as several other
specifications. (“Performance” has nothing to do with the frontier
model.)
40/78
Generalized True Random Effects Stochastic Frontier Model
$y_{it} = A_i - B_i + \beta' x_{it} + v_{it} - u_{it}$
Transient random components
$v_{it} - u_{it}$
Time varying normal - half normal SF
Permanent random components
$A_i - B_i$
Time fixed normal - half normal SF
41/78
A Stochastic Frontier Model with Short-Run and Long-Run Inefficiency:
Colombi, R., Kumbhakar, S., Martini, G.,
Vittadini, G.
University of Bergamo, WP, 2011,
JPA 2014, forthcoming.
42/78
Generalized True Random Effects Stochastic Frontier Model
$y_{it} = (\alpha + \sigma_w w_i - \sigma_h |e_i|) + \beta' x_{it} + v_{it} - u_{it}$
Time varying, transient random components
$v_{it} \sim N[0, \sigma_v^2]$, $u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$,
Time invariant random components
$w_i \sim N[0,1]$, $e_i \sim N[0,1]$
The random constant term in this model has a closed skew
normal distribution, instead of the usual normal distribution.
43/78
Colombi et al. Classical Maximum Likelihood Estimator
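$\log L = \sum_{i=1}^{N} \left[ \log \phi_T(y_i - X_i\beta - \alpha 1_T,\ \Sigma + AVA') + \log \Phi_q(R(y_i - X_i\beta - \alpha 1_T),\ \Lambda) \right] + nq \log 2$
$\phi_T(\ldots)$ = T-variate normal pdf.
$\Phi_q(\ldots, \Lambda)$ = $q = (T+1)$ variate normal integral.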
Very time consuming and complicated.
“From the sampling theory perspective, the application
of the model is computationally prohibitive when T is
large. This is because the likelihood function depends
on a (T+1)-dimensional integral of the normal
distribution.” [Tsionas and Kumbhakar (2012, p. 6)]
44/78
Tsionas, G. and Kumbhakar, S.
Firm Heterogeneity, Persistent and Transient Technical Inefficiency:
A Generalized True Random Effects Model
Journal of Applied Econometrics. Published online, November, 2012.
Extremely involved Bayesian MCMC procedure. Efficiency components
estimated by data augmentation.
45/78
Kumbhakar, Lien, Hardaker
Technical Efficiency in Competing Panel Data Models: A Study of
Norwegian Grain Farming, JPA, Published online, September, 2012.
Three steps based on GLS:
(1) RE/FGLS to estimate $(\alpha, \beta)$
(2) Decompose time varying residuals using MoM and SF.
(3) Decompose estimates of time invariant residuals.
46/78
Maximum Simulated Full Information log likelihood function for the
"generalized true random effects stochastic frontier model"
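$\log L_S(\alpha, \beta, \lambda, \sigma, \sigma_w, \sigma_h) = \sum_{i=1}^{N} \log \dfrac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \dfrac{2}{\sigma}\, \phi\left(\dfrac{y_{it} - (\alpha + \sigma_w w_{ir} - \sigma_h |U_{ir}|) - \beta' x_{it}}{\sigma}\right) \Phi\left(\dfrac{-\lambda (y_{it} - (\alpha + \sigma_w w_{ir} - \sigma_h |U_{ir}|) - \beta' x_{it})}{\sigma}\right)$
$w_{ir}$ = draws from N[0,1]
$|U_{ir}|$ = absolute values of draws from N[0,1]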
47/78
Estimating Efficiency in the CSN Model
Moment Generating Function for the Multivariate CSN Distribution
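$E[\exp(t'u_i) \mid y_i] = \dfrac{\Phi_{T+1}(R r_i + \Lambda t,\ \Lambda)}{\Phi_{T+1}(R r_i,\ \Lambda)} \exp\left(t'R r_i + \tfrac{1}{2} t'\Lambda t\right)$
$\Phi_{T+1}(\ldots, \Lambda)$ = Multivariate normal cdf. Parts defined in Colombi et al.
Computed using GHK simulator.
$u_i = (e_i, u_{i1}, \ldots, u_{iT})'$, $t = (1, 0, \ldots, 0)'$, $(0, 1, \ldots, 0)'$, ..., $(0, 0, \ldots, 1)'$
48/78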
WHO Results: 2014
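$x = (1, \log \text{Exp}, \log \text{Ed}, \log^2 \text{Ed})$
$z = (\log \text{PopDen}, \log \text{PerCapitaGDP}, \text{GovtEff}, \text{VoxPopuli}, \text{OECD}, \text{GINI})$
$\varepsilon_{it} = A_i - B_i + v_{it} - u_{it}$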
49/78
Computation of the GTRE Model is Actually Fast and Easy
247 Farms, 6 years.
100 Halton draws.
Computation time: 35 seconds, including computing efficiencies.
50/78
MSL Estimation
51/78
Why is the MSL method so computationally
efficient compared to classical FIML and
Bayesian MCMC for this model?
Conditioned on the permanent effects, the group
observations are independent.
The joint conditional distribution is simple and easy to
compute, in closed form.
The full likelihood is obtained by integrating over only
one dimension. (This was discovered by Butler and
Moffitt in 1982.)
Neither of the other methods takes advantage of this
result. Both integrate over T+1 dimensions.
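The one-dimensional structure can be written out as follows. This is a sketch under assumed notation: $\delta_i$ stands for the combined time invariant term, $g(\cdot)$ for its density, and $f(\cdot)$ for the normal-half normal density of a single observation; the draws $\delta_{ir}$ are taken from $g$.

```latex
L_i = \int_{-\infty}^{\infty} \left[ \prod_{t=1}^{T} f(y_{it} \mid x_{it}, \delta_i) \right] g(\delta_i)\, d\delta_i
\approx \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} f(y_{it} \mid x_{it}, \delta_{ir})
```

Because the product inside the integral is available in closed form, the only numerical work is the average over R draws of a scalar, whatever the value of T.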
52/78
53/78
Equivalent Log Likelihood – Identical Outcome
One Dimensional Integration over δi
T+1 Dimensional Integration over Rei.
54/78
Simulated [over (w,h)] Log Likelihood
$\log L_S = \sum_{i=1}^{N} \log \dfrac{1}{R} \sum_{r=1}^{R} G_i(\delta_{ir} \mid \alpha, \beta, \lambda, \sigma, \sigma_w, \sigma_h)$
Very Fast – with T=13, one minute or so
55/78
Also Simulated Log Likelihood
GHK simulator is used to approximate the T+1 variate normal
integrals.
Very Slow – Huge amount of unnecessary computation.
56/78
Does the simulation chatter degrade the
econometric efficiency of the MSL estimator?
Hajivassiliou, V., “Some practical issues in maximum simulated likelihood,” in Simulation-based Inference in Econometrics: Methods and Applications, Mariano, R., Weeks, M. and Schuermann, T. (eds.), Cambridge University Press, 2008
Speculated that Asy.Var[estimator] = V + (1/R)C
The contribution of the chatter would be of second or third order.
R is typically in the hundreds or thousands.
No other evidence on this subject.
57/78
An Experiment
Pooled Spanish Dairy Farms Data
Stochastic frontier using FIML.
Random constant term linear regression with constant term equal to $\alpha - \sigma_u |w|$, $w \sim N[0,1]$
This is equivalent to the stochastic frontier
model.
Maximum simulated likelihood
500 random draws for the simulation for the base case.
Uses Mersenne Twister for the RNG
50 repetitions of estimation based on 500 random
draws to suggest variation due to simulation chatter.
58/78
$\hat{\sigma}_v = 0.10371$
$\hat{\sigma}_u = 0.15573$
59/78
Simulation Noise in Standard Errors of Coefficients
Chatter: .00543, .00590, .00042, .00119
60/78
Is It Really Simulation?
Halton or Sobol sequences
Not random – far more stable than
random draws, by a factor of about 10.
There is no simulation chatter
View the same as numerical quadrature
There may be some approximation error.
How would we know?
61/78
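A Halton-based set of "draws" can be generated by pushing the sequence through the inverse normal cdf. This is a minimal sketch assuming SciPy's `scipy.stats.qmc` module; the function name `halton_normal_draws` and the choice to skip the first 10 points are illustrative assumptions, not settings from the talk.

```python
import numpy as np
from scipy.stats import norm, qmc

def halton_normal_draws(n_draws, dim=1, skip=10):
    """Map an unscrambled Halton sequence to N[0,1] values by the inverse cdf."""
    sampler = qmc.Halton(d=dim, scramble=False)
    sampler.fast_forward(skip)          # discard early points, including the initial 0
    u = sampler.random(n_draws)         # points in (0, 1), shape (n_draws, dim)
    return norm.ppf(u)

w = halton_normal_draws(500)
w_again = halton_normal_draws(500)      # identical to w: the sequence is deterministic
```

Because the sequence is deterministic, repeating the estimation reproduces the same values exactly, which is the sense in which the result behaves like quadrature on a fixed set of nodes rather than like a simulation.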
Sample Selection
62/78
TECHNICAL EFFICIENCY ANALYSIS CORRECTING FOR
BIASES FROM OBSERVED AND UNOBSERVED
VARIABLES: AN APPLICATION TO A NATURAL RESOURCE
MANAGEMENT PROJECT
Empirical Economics: Volume 43, Issue 1 (2012), Pages 55-72
Boris Bravo-Ureta
University of Connecticut
Daniel Solis
University of Miami
William Greene
New York University
63/78
The MARENA Program in Honduras
Several programs have been implemented to address
resource degradation while also seeking to improve
productivity, managerial performance and reduce
poverty (and in some cases make up for lack of public
support).
One such effort is the Programa Multifase de Manejo de
Recursos Naturales en Cuencas Prioritarias or MARENA
in Honduras focusing on small scale hillside farmers.
64/78
Expected Impact Evaluation
65/78
Methods
A matched group of beneficiaries and control
farmers is determined using Propensity Score
Matching techniques to mitigate biases that
would stem from selection on observed
variables.
In addition, we deal with possible self-selection
on unobservables arising from unobserved
variables using a selectivity correction model for
stochastic frontiers introduced by Greene (2010).
66/78
A Sample Selected SF Model
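$d_i = 1[\gamma' z_i + h_i > 0]$, $h_i \sim N[0, 1^2]$
$y_i = \alpha + \beta' x_i + \varepsilon_i$, $\varepsilon_i \sim N[0, \sigma_\varepsilon^2]$
$(y_i, x_i)$ observed only when $d_i = 1$.
$\varepsilon_i = v_i - u_i$
$u_i = \sigma_u |U_i|$ where $U_i \sim N[0, 1^2]$
$v_i = \sigma_v V_i$ where $V_i \sim N[0, 1^2]$.
$(h_i, v_i) \sim N_2[(0, 0), (1, \rho\sigma_v, \sigma_v^2)]$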
67/78
Simulated logL for the Standard SF Model
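$f(y_i \mid x_i, |U_i|) = \dfrac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}$
$f(y_i \mid x_i) = \int_{|U_i|} \dfrac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}\, p(|U_i|)\, d|U_i|$
$p(|U_i|) = \dfrac{2 \exp\left[-\tfrac{1}{2} |U_i|^2\right]}{\sqrt{2\pi}}$, $|U_i| \ge 0$. (Half normal)
$f(y_i \mid x_i) \approx \dfrac{1}{R} \sum_{r=1}^{R} \dfrac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}$
$\log L_S(\alpha, \beta, \sigma_u, \sigma_v) = \sum_{i=1}^{N} \log \dfrac{1}{R} \sum_{r=1}^{R} \dfrac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}$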
This is simply a linear regression with a random constant term, αi = α - σu |Ui |
68/78
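The simulated density can be checked against the closed form normal-half normal density. A minimal sketch assuming NumPy and SciPy; the function names, the seed, the number of draws, and the parameter values are illustrative assumptions. Here `eps` stands for y_i - alpha - beta'x_i.

```python
import numpy as np
from scipy.stats import norm

def sf_density_closed(eps, sigma_u, sigma_v):
    """Closed form normal-half normal density of eps = v - u."""
    sigma = np.hypot(sigma_u, sigma_v)
    lam = sigma_u / sigma_v
    return 2.0 / sigma * norm.pdf(eps / sigma) * norm.cdf(-lam * eps / sigma)

def sf_density_sim(eps, sigma_u, sigma_v, abs_u):
    """Average of the conditional normal density over the draws |U_ir|."""
    v = eps[:, None] + sigma_u * abs_u[None, :]    # implied v = eps + sigma_u |U|
    return norm.pdf(v, scale=sigma_v).mean(axis=1)

rng = np.random.default_rng(2014)
abs_u = np.abs(rng.standard_normal(200_000))       # |U_ir|, shared across i here
eps = np.array([-0.3, -0.1, 0.0, 0.1])
closed = sf_density_closed(eps, 0.15, 0.10)
simulated = sf_density_sim(eps, 0.15, 0.10, abs_u)
```

The two arrays agree up to simulation error, which shrinks as the number of draws grows. Summing the log of `simulated` over observations gives the simulated log likelihood.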
Likelihood For a Sample Selected SF Model
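$f(y_i \mid x_i, d_i, z_i, |U_i|) = d_i \left[ \dfrac{\exp\left(-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2\right)}{\sigma_v \sqrt{2\pi}}\ \Phi\left(\dfrac{\rho (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)/\sigma_v + \gamma' z_i}{\sqrt{1 - \rho^2}}\right) \right] + (1 - d_i)\, \Phi(-\gamma' z_i)$
$f(y_i \mid x_i, d_i, z_i) = \int_{|U_i|} f(y_i \mid x_i, d_i, z_i, |U_i|)\, f(|U_i|)\, d|U_i|$
69/78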
Simulated Log Likelihood for a Selectivity
Corrected Stochastic Frontier Model
The simulation is over the inefficiency term.
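$\log L_S(\alpha, \beta, \sigma_u, \sigma_v, \gamma, \rho) = \sum_{i=1}^{N} \log \dfrac{1}{R} \sum_{r=1}^{R} \left\{ d_i \left[ \dfrac{\exp\left(-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2\right)}{\sigma_v \sqrt{2\pi}}\ \Phi\left(\dfrac{\rho (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)/\sigma_v + \gamma' z_i}{\sqrt{1 - \rho^2}}\right) \right] + (1 - d_i)\, \Phi(-\gamma' z_i) \right\}$
70/78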
JLMS Estimator of ui
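$\hat{f}_{ir} = \dfrac{\exp\left(-\tfrac{1}{2} (y_i - \hat{\alpha} - \hat{\beta}' x_i + \hat{\sigma}_u |U_{ir}|)^2 / \hat{\sigma}_v^2\right)}{\hat{\sigma}_v \sqrt{2\pi}}\ \Phi\left(\dfrac{\hat{\rho} (y_i - \hat{\alpha} - \hat{\beta}' x_i + \hat{\sigma}_u |U_{ir}|)/\hat{\sigma}_v + \hat{a}_i}{\sqrt{1 - \hat{\rho}^2}}\right)$
$\hat{A}_i = \dfrac{1}{R} \sum_{r=1}^{R} (\hat{\sigma}_u |U_{ir}|)\, \hat{f}_{ir}$, $\hat{B}_i = \dfrac{1}{R} \sum_{r=1}^{R} \hat{f}_{ir}$
$\hat{u}_i = \dfrac{\hat{A}_i}{\hat{B}_i}$ = Estimator of $E[u_i \mid \varepsilon_i]$
$= \sum_{r=1}^{R} \hat{g}_{ir}\, \hat{\sigma}_u |U_{ir}|$ where $\hat{g}_{ir} = \dfrac{\hat{f}_{ir}}{\sum_{r=1}^{R} \hat{f}_{ir}}$, $\sum_{r=1}^{R} \hat{g}_{ir} = 1$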
71/78
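The simulation-based estimator is a weighted average of the draws, with weights proportional to the simulated likelihood contributions. A minimal sketch assuming NumPy and SciPy; the function name `selection_jlms`, the argument `a` (the estimated selection index for each observation), and all numerical inputs are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def selection_jlms(resid, a, sigma_u, sigma_v, rho, abs_u):
    """Simulation-based estimate of E[u_i | eps_i] in the selection corrected model.

    resid: (N,) values of y_i - alpha - beta'x_i; a: (N,) selection index;
    abs_u: (R,) absolute values of N[0,1] draws.
    """
    v = resid[:, None] + sigma_u * abs_u[None, :]              # implied v_i, shape (N, R)
    f = (norm.pdf(v, scale=sigma_v)
         * norm.cdf((rho * v / sigma_v + a[:, None]) / np.sqrt(1.0 - rho**2)))
    g = f / f.sum(axis=1, keepdims=True)                       # weights sum to 1 over r
    return (g * (sigma_u * abs_u)[None, :]).sum(axis=1)        # equals A_i / B_i

rng = np.random.default_rng(7)
abs_u = np.abs(rng.standard_normal(200_000))
resid = np.array([-0.2, 0.0, 0.1])
a = np.array([0.3, -0.1, 0.5])
u_hat = selection_jlms(resid, a, 0.15, 0.10, 0.4, abs_u)
```

When `rho = 0` the selection term is constant across draws and cancels from the weights, so the estimator approximates the ordinary JLMS value.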
Closed Form for the Selection Model
The selection model can be estimated without
simulation
“The stochastic frontier model with correction
for sample selection revisited.” Lai, Hung-pin.
Forthcoming, JPA
Based on closed skew normal distribution
Similar to Maddala’s 1982 result for the linear
selection model. See slide 42.
Not more computationally efficient.
Statistical properties identical.
Suggested possibility that simulation chatter is an element of
inefficiency in the maximum simulated likelihood estimator.
72/78
Closed Form vs. Simulation
Spanish Dairy Farms: Selection based on being farm #1-125. 6 periods
The theory works.
73/78
Variables Used
in the Analysis
Production
Participation
74/78
Findings from the First Wave
75/78
A Panel Data Model
Selection takes place only at the baseline.
There is no attrition.
$d_{i0} = 1[\gamma' z_{i0} + h_{i0} > 0]$ (Sample Selector)
$y_{it} = \alpha + w_i + \beta' x_{it} + v_{it} - u_{it}$, $t = 0, 1, \ldots$ (Stochastic Frontier)
Selection effect is exerted on $w_i$; $\text{Corr}(h_{i0}, w_i) = \rho$
$P(y_{it}, d_{i0}) = P(d_{i0})\, P(y_{it} \mid d_{i0})$
Conditioned on the selection ($h_{i0}$) observations are independent.
$P(y_{i0}, y_{i1}, \ldots, y_{iT} \mid d_{i0}) = \prod_{t=0}^{T} P(y_{it} \mid d_{i0})$
I.e., the selection is acting like a permanent random effect.
$P(y_{i0}, y_{i1}, \ldots, y_{iT}, d_{i0}) = P(d_{i0}) \prod_{t=0}^{T} P(y_{it} \mid d_{i0})$
76/78
Simulated Log Likelihood
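$\log L_{S,C}(\alpha, \beta, \sigma_u, \sigma_v, \rho) = \sum_{d_i = 1} \log \dfrac{1}{R} \sum_{r=1}^{R} \prod_{t=0}^{T} \left[ \dfrac{\exp\left(-\tfrac{1}{2} (y_{it} - \alpha - \beta' x_{it} + \sigma_u |U_{itr}|)^2 / \sigma_v^2\right)}{\sigma_v \sqrt{2\pi}}\ \Phi\left(\dfrac{\rho (y_{it} - \alpha - \beta' x_{it} + \sigma_u |U_{itr}|)/\sigma_v + a_{i0}}{\sqrt{1 - \rho^2}}\right) \right]$
77/78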
Main Empirical Conclusions from Waves 0 and 1
78/78
Benefit group is more efficient in both years
The gap is wider in the second year
Both means increase from year 0 to year 1
Both variances decline from year 0 to year 1
79/78
Summary
The skew normal distribution
Two useful models for panel data (and one
potentially useful model pending development)
Extension of TRE model that allows both transient and
persistent random variation and inefficiency
Sample selection corrected stochastic frontier
Spatial autocorrelation stochastic frontier model
Methods: Maximum simulated likelihood as an
alternative to received brute force methods
80/78
Simpler
Faster
Accurate
Simulation “chatter” is a red herring – use Halton sequences