Stochastic DEA:
Myths and misconceptions
Timo Kuosmanen (HSE & MTT)
Andrew Johnson (Texas A&M University)
Mika Kortelainen (University of Manchester)
XI EWEPA 2009, Pisa, Italy
What is stochastic DEA?
“DEA is truly a stochastic frontier estimation
method, and it is incorrect to classify it as a
deterministic method.”
Banker & Natarajan (2008) Operations Research, p.49
2
What is stochastic DEA?
• Term stochastic
(from Greek “Στοχος” for “aim” or “guess”)
generally refers to statistical random variation
3
Elements of random variation in DEA
• Random sampling of observations from the
production possibility set (sampling error)
• Random sampling of observations outside the
production possibility set (outliers)
• Random outcome of production process
(stochastic technology)
• Random measurement errors, omitted variables,
and other disturbances (stochastic noise)
4
Common myths and misconceptions
• Confusing stochastic noise with sampling variation,
outliers, or stochastic technology
• Statistical inference on sampling error is believed
to improve robustness to noise
• Robustness to outliers is seen as the same as
robustness to noise (or at least closely related)
5
Sampling error
[Figure, built up over slides 6-10: the true frontier in (input x, output y) space; a random sample of observations (DMUs, firms); the DEA frontier fitted to the sample]
10
Statistical foundation of DEA
– Banker (1993) Management Science
– Korostelev, Simar & Tsybakov (1995) Annals Stat.
– Kneip, Park & Simar (1998) Econometric Theory
– Simar & Wilson (2000) JPA
• Deterministic technology
• No outliers or noise
• Data randomly sampled from the PPS
• DEA frontier converges to the true frontier as the
sample size approaches infinity
• In a finite sample, DEA frontier is downward biased
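The downward bias and its shrinkage with sample size can be illustrated with a small simulation. This is a minimal sketch, not taken from the slides: the square-root frontier, the uniform efficiency distribution, and the names `true_frontier`, `dea_frontier`, and `sample` are all assumptions chosen for illustration. It uses one input and one output, so the variable-returns DEA frontier can be found by enumerating single observations and bracketing pairs.

```python
import math
import random


def true_frontier(x):
    # Hypothetical concave, increasing production function (an assumption).
    return math.sqrt(x)


def dea_frontier(xs, ys, x0):
    """VRS DEA frontier at input level x0, one input and one output.

    Solves max sum(l_i * y_i) s.t. sum(l_i * x_i) <= x0, sum(l_i) = 1, l_i >= 0.
    With two constraints an optimum uses at most two observations, so it is
    enough to check single points with x <= x0 and pairs that bracket x0.
    Returns -inf if no observation uses at most x0 of the input.
    """
    pts = list(zip(xs, ys))
    best = float("-inf")
    for xi, yi in pts:
        if xi > x0:
            continue
        best = max(best, yi)  # free disposability of the input
        for xj, yj in pts:
            if xj > x0:
                w = (x0 - xi) / (xj - xi)  # weight on the right-hand point
                best = max(best, (1 - w) * yi + w * yj)
    return best


def sample(n, rng):
    # Noise-free data: output is the frontier scaled by a random efficiency.
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    ys = [true_frontier(x) * rng.uniform(0.5, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
xs, ys = sample(400, rng)
x0 = 5.0
for n in (25, 100, 400):
    # Nested samples: the first n observations of the same draw.
    gap = true_frontier(x0) - dea_frontier(xs[:n], ys[:n], x0)
    print(n, round(gap, 4))  # gap is non-negative and shrinks as n grows
```

Because the samples are nested, the estimated frontier can only move up as observations are added, while it can never exceed the true frontier when the data contain no noise.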
11
Statistical foundation of DEA
• Statistical inference on sampling error is possible
by using
– Asymptotic sampling distribution (Banker 1993)
– Bootstrapping (Simar & Wilson 1998)
• Such inferences have nothing to do with
– outliers
– stochastic technology
– stochastic noise
12
Bootstrapping
• Purpose of the smooth consistent bootstrap (Simar &
Wilson 1998, 2000) is to mimic the original random
sampling to estimate the sampling bias
• The bias-corrected DEA frontier will always lie above the
original DEA frontier
• In noisy data, DEA tends to overestimate the frontier
• Assuming away noise, and “correcting” for the small
sample bias by bootstrapping, we will shift the frontier
upward
=> If noise is a problem, then bias correction will only
make it worse
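The argument above rests on the sign of the DEA error once noise enters. The sketch below does not implement the Simar & Wilson bootstrap; it only compares the average gap between the DEA frontier and the true frontier with and without noise. The frontier, the half-normal inefficiency, the noise levels, and the names `dea_frontier` and `mean_gap` are assumptions made for illustration.

```python
import math
import random


def dea_frontier(xs, ys, x0):
    """VRS DEA frontier at x0 (one input, one output); -inf if infeasible."""
    pts = list(zip(xs, ys))
    best = float("-inf")
    for xi, yi in pts:
        if xi > x0:
            continue
        best = max(best, yi)
        for xj, yj in pts:
            if xj > x0:
                w = (x0 - xi) / (xj - xi)
                best = max(best, (1 - w) * yi + w * yj)
    return best


def mean_gap(noise_sd, n=60, reps=40, x0=5.0, seed=1):
    """Average of (DEA frontier - true frontier) at x0 over simulated samples.

    Assumed data-generating process: y = sqrt(x) - u + v, with u half-normal
    inefficiency and v normal noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
        ys = [
            math.sqrt(x) - abs(rng.gauss(0.0, 0.3)) + rng.gauss(0.0, noise_sd)
            for x in xs
        ]
        total += dea_frontier(xs, ys, x0) - math.sqrt(x0)
    return total / reps


print("no noise:  ", round(mean_gap(0.0), 3))  # negative: small-sample bias
print("with noise:", round(mean_gap(0.5), 3))  # positive: frontier overestimated
```

Any correction that moves the estimated frontier upward reduces the first gap but enlarges the second, which is the point made on the slide.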
13
Simulated example
[Figure, built up over slides 14-16: simulated data points and the frontier (y axis ticks 3,000 to 6,000; x axis ticks 0,000 to 12,000); then the DEA frontier added; then the bias corrected frontier added]
16
Critique of Löthgren & Tambour (LT)
“LT bootstrap involves measuring the distance from
a different, random (as opposed to fixed) point
to the [frontier] on each replication of the
bootstrap Monte Carlo exercise. It seems entirely
unclear what this procedure estimates. Certainly, it
does not estimate anything of interest.”
…
“LT method assumes not only that [the frontier] is
unknown, but also (implicitly) that the point from
which one wishes to measure distance to the
frontier is unknown. This is absurd.”
Simar & Wilson (2000), JPA, pp. 67-68.
17
Outliers
[Figure, slides 18-19: outliers and the true frontier in (x, y) space; then the DEA frontier shown together with the true frontier]
19
Outliers
– Super-efficiency approach (Wilson 1995 JPA)
– Peeling the onion; context dependent DEA (Seiford & Zhu
1999 Management Science)
– Robust efficiency measures / efficiency depth (Kuosmanen
& Post 1999 DP, Cherchye, Kuosmanen & Post 2000 DP)
– Conditional order-m and order-α quantile frontiers (Aragon,
Daouia & Thomas-Agnan 2002 DP; Cazals, Florens & Simar
2002 J Econometrics; Daouia & Simar 2007 J
Econometrics; Daraio & Simar 2007 book)
• Deterministic technology
• Improve robustness to outliers by not enveloping
the most extreme observations
• Outliers are different from noise
– Noise affects all observations
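The idea of not enveloping the most extreme observations can be sketched with a Monte Carlo version of an order-m output frontier, in the spirit of the order-m approach cited above. This is a simplified illustration, not the estimator of any specific paper: the data, the single outlier, and the names `fdh_frontier` and `order_m_frontier` are hypothetical.

```python
import random


def fdh_frontier(xs, ys, x0):
    """Free disposal hull: best output among observations using at most x0."""
    return max(y for x, y in zip(xs, ys) if x <= x0)


def order_m_frontier(xs, ys, x0, m, draws=500, seed=0):
    """Monte Carlo order-m output frontier at x0.

    Averages, over `draws` repetitions, the best output among m observations
    drawn with replacement from those using at most x0 of the input.
    """
    rng = random.Random(seed)
    pool = [y for x, y in zip(xs, ys) if x <= x0]
    return sum(max(rng.choices(pool, k=m)) for _ in range(draws)) / draws


# Hypothetical data: 40 regular observations plus one gross outlier.
xs = [1.0 + 0.2 * i for i in range(40)]
ys = [0.8 * x ** 0.5 for x in xs]
xs.append(3.0)
ys.append(10.0)

x0 = 6.0
print("FDH:    ", fdh_frontier(xs, ys, x0))            # determined by the outlier
print("order-5:", order_m_frontier(xs, ys, x0, m=5))   # only partly affected
```

The partial frontier is pulled toward the outlier only in the draws that happen to include it. That helps against a few extreme points, but it offers no protection when every observation is perturbed.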
20
Stochastic technology
[Figure, slides 21-22: frontiers in (x, y) space at probability levels Pr.[f(x)≤f] = 0.05, 0.50, and 0.95]
22
Chance constrained DEA
– Land, Lovell & Thore (1993) Managerial & Decision Econ.
– Olesen & Petersen (1995) Management Science
– Cooper, Huang & Li (1996) Annals of OR
– Huang & Li (2001) JPA
• Stochastic technology, stochastic noise, both?
23
Chance constrained stochastic DEA
• Huang & Li (2001) JPA
• Assume inputs and outputs are multivariate normal
random variables, with known expected values and
covariance matrix
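Why known means and covariances matter can be seen from the standard deterministic equivalent of a linear chance constraint under normality: for z ~ N(mu, Sigma), Pr(a'z <= 0) >= 1 - alpha holds exactly when a'mu + z_(1-alpha) * sqrt(a'Sigma a) <= 0. The sketch below checks that condition. It is a textbook result rather than the specific model of Huang & Li, and the function name and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def chance_constraint_holds(a, mu, cov, alpha):
    """Check Pr(a'z <= 0) >= 1 - alpha for z ~ N(mu, cov).

    Uses the deterministic equivalent
    a'mu + z_quantile * sqrt(a' cov a) <= 0,
    which requires mu and cov to be known.
    """
    k = len(a)
    mean = sum(a[i] * mu[i] for i in range(k))
    var = sum(a[i] * cov[i][j] * a[j] for i in range(k) for j in range(k))
    z_quantile = NormalDist().inv_cdf(1.0 - alpha)
    return mean + z_quantile * math.sqrt(var) <= 0.0


# Hypothetical constraint z1 - z2 <= 0 with independent unit-variance terms.
a = [1.0, -1.0]
cov = [[1.0, 0.0], [0.0, 1.0]]
print(chance_constraint_holds(a, [1.0, 3.0], cov, alpha=0.05))  # False
print(chance_constraint_holds(a, [1.0, 4.0], cov, alpha=0.05))  # True
```

Every input to the check (mu, cov) is treated as known, so the result is only as reliable as those values.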
24
Chance constrained stochastic DEA
• How do we get “knowledge” about the expected
values of inputs and outputs?
– Cannot be estimated from cross-sectional data
– Panel data estimation would require that the true
inputs and outputs do not change over time
• How do we get “knowledge” about the variances
and covariances of the error terms?
• Uncertainty of the parameter estimates not taken
into account in the model
25
Stochastic noise
[Figure, built up over slides 26-28: the true frontier in (x, y) space]
28
Stochastic DEA models to deal with noise
• DEA+
– Gstach (1998) JPA
– Banker & Natarajan (2008) Operations Research
• “Stochastic DEA”
– Banker, Datar & Kemerer (1991) Management Science
• Stochastic FDH/DEA estimators
– Simar & Zelenyuk (2008) DP.
• Stochastic Nonparametric Envelopment of Data
(StoNED)
– Kuosmanen (2006) DP; Kuosmanen & Kortelainen (2007) DP.
29
Stochastic DEA models to deal with noise
• Estimation of a fully deterministic frontier based on
data perturbed by noise
– The shape of frontier can be estimated without
parametric assumptions
• Estimation of inefficiency (efficiency scores) is very
challenging in cross-sectional setting
– Observed output contains the noise term
– Only conditional expected value can be estimated
– Even the SFA efficiency estimator is not consistent!
30
Stochastic DEA models to deal with noise
• In cross-sectional setting, identifying inefficiency
and noise requires some strong assumption
– Assuming away noise completely is a strong
assumption, too
• Distributional assumptions do not influence the
efficiency rankings
– Ondrich & Ruggiero 2001, EJOR
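One way to see the mechanism behind the ranking result is that the conditional expected inefficiency is a monotone function of the composite residual, so changing distributional parameters rescales the scores without reordering them. The sketch below uses the normal/half-normal conditional mean formula (commonly attributed to Jondrow et al. 1982), which is an assumption brought in for illustration and does not appear in the slides. The residuals and parameter values are hypothetical.

```python
from statistics import NormalDist


def conditional_inefficiency(eps, sigma_u, sigma_v):
    """E[u | eps] for eps = v - u, v ~ N(0, sigma_v^2), u ~ |N(0, sigma_u^2)|.

    u given eps is normal with mean mu_star and s.d. sigma_star, truncated
    at zero; the function returns the mean of that truncated normal.
    """
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / s2
    sigma_star = sigma_u * sigma_v / s2 ** 0.5
    z = mu_star / sigma_star
    nd = NormalDist()
    return mu_star + sigma_star * nd.pdf(z) / nd.cdf(z)


def ranking(scores):
    # Indices of firms ordered from lowest to highest score.
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Hypothetical composite residuals (observed output minus estimated frontier).
residuals = [0.4, -0.1, -0.7, 0.05, -0.3]

by_residual = ranking([-e for e in residuals])  # largest residual first
by_params_a = ranking([conditional_inefficiency(e, 0.3, 0.2) for e in residuals])
by_params_b = ranking([conditional_inefficiency(e, 1.0, 0.1) for e in residuals])

print(by_residual, by_params_a, by_params_b)  # three identical orderings
```

The levels of the scores differ between the two parameter settings, but the ordering of firms is the same as the ordering of the residuals.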
31
Conclusions
• Stochastic noise should not be confused with
sampling error, outliers, or stochastic technology
• Correcting for small sample bias by bootstrapping
does not improve robustness to noise; it can even
make things worse
• Improving robustness to outliers is different from
improving robustness to stochastic noise, which
perturbs all observations
32