Small Area Prediction under Alternative Model Specifications

By
Wayne A. Fuller and Andreea L. Erciulescu
Department of Statistics, Iowa State University
Small Area Estimation 2014
Poznan, Poland, September 2014
Outline
I. Motivating example
II. Models: auxiliary information
III. Bootstrap for prediction MSE
IV. Simulation
Conservation Effects Assessment Project (CEAP): Natural Resources Conservation Service
Impacts of conservation practices
Sample of fields
Subsample: National Resources Inventory (NRI)
Hydrologic Units
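Unit Level Model
$(y_{ij}, \mathbf{x}_{ij}), \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n_i$
$y_{ij} = g(\mathbf{x}_{ij}, \boldsymbol{\beta}, b_i) + e_{ij}, \quad b_i \sim N(0, \sigma_b^2)$
$b_i$ = area effect
$E\{e_{ij} \mid (\mathbf{x}_{ij}, b_i)\} = 0$
Continuous and discrete $y_{ij}$
Auxiliary information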
Auxiliary Data
$\mu_{x1}$: Soils data (known)
$\tilde{\mu}_{x2}$: Cover, practices (NRI)
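Parameters
$\theta_i = \int_x g(\mathbf{x}, \boldsymbol{\beta}, b_i)\, dF_{x_i}(\mathbf{x})$
$\tilde{\theta}_i = E\{\theta_i \mid (\mathbf{y}_i, \mathbf{x}_i), \boldsymbol{\psi}\}$
$\boldsymbol{\psi} = (\sigma_b^2, \sigma_e^2, \boldsymbol{\beta}, \boldsymbol{\varphi}_i)$
$\boldsymbol{\varphi}_i$ = parameters of $F_{x_i}(\mathbf{x})$
$\hat{\theta}_i = E\{\theta_i \mid (\mathbf{y}_i, \mathbf{x}_i), \hat{\boldsymbol{\psi}}\}$
$\alpha_i = \mathrm{MSE}(\hat{\theta}_i) = E\{(\hat{\theta}_i - \theta_i)^2\}$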
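The area parameter above is an integral of g over the area's covariate distribution. As a rough illustration only, the sketch below approximates such an integral by Monte Carlo, assuming a logistic g with a scalar covariate and a normal covariate distribution. The function names, the number of draws, and the argument values are assumptions made for this example.

```python
import math
import random

def expit(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def area_mean(beta0, beta1, b_i, mu_xi, sd_x, n_draws=20000, seed=0):
    """Monte Carlo approximation to the integral of g(x, beta, b_i) dF(x).

    Assumes (for illustration) g is logistic in beta0 + beta1 * x + b_i and
    F is a normal distribution with mean mu_xi and standard deviation sd_x.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(mu_xi, sd_x)          # draw a covariate value from F
        total += expit(beta0 + beta1 * x + b_i)
    return total / n_draws

# Illustrative call with an area effect of zero.
theta = area_mean(beta0=-0.8, beta1=1.0, b_i=0.0, mu_xi=0.0, sd_x=0.6)
```

When the covariate distribution is degenerate (sd_x = 0), the approximation reduces to g evaluated at the area mean of x, which is a convenient sanity check.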
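Parametric Bootstrap
Original sample: $\hat{\boldsymbol{\psi}} = (\hat{\boldsymbol{\beta}}, \hat{\sigma}_b^2, \hat{\sigma}_e^2, \hat{\boldsymbol{\varphi}})$
Data generator: $DG(\hat{\boldsymbol{\psi}}, r_{1,k})$
$r_{1,k}$ = random seed
$\theta_{i,k}^{*}$ = true for sample $k$
$\hat{\theta}_{i,k}^{*}$ = predicted for sample $k$
$\alpha_{i,k}^{*} = (\hat{\theta}_{i,k}^{*} - \theta_{i,k}^{*})^2$
Level one: $\hat{\alpha}_i^{*} = \bar{\alpha}_i^{*} = B_1^{-1} \sum_{k=1}^{B_1} \alpha_{i,k}^{*}$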
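The level-one steps can be sketched in a few lines of Python. To stay short and runnable, this sketch swaps in a toy linear model, y_ij = b_i + e_ij with known mean zero, for the talk's model, and uses simple moment estimators in place of a full likelihood fit. The function names, the design (20 areas, 4 units each), and the starting parameter values are all assumptions made for illustration.

```python
import random
from statistics import fmean

def generate(psi, m, n, rng):
    """Toy data generator DG(psi, seed): area effects (the true values) and data."""
    var_b, var_e = psi
    b = [rng.gauss(0.0, var_b ** 0.5) for _ in range(m)]
    y = [[b_i + rng.gauss(0.0, var_e ** 0.5) for _ in range(n)] for b_i in b]
    return b, y

def fit(y):
    """Moment estimates of (var_b, var_e) for balanced data; a stand-in for REML."""
    m, n = len(y), len(y[0])
    means = [fmean(row) for row in y]
    within = sum((v - mu) ** 2 for row, mu in zip(y, means) for v in row)
    var_e = within / (m * (n - 1))
    var_b = max(0.0, fmean([mu ** 2 for mu in means]) - var_e / n)
    return var_b, var_e

def predict(y, psi):
    """Shrinkage predictor of each area effect under the toy model."""
    var_b, var_e = psi
    n = len(y[0])
    gamma = var_b / (var_b + var_e / n) if var_b > 0 else 0.0
    return [gamma * fmean(row) for row in y]

def level_one_mse(psi_hat, m, n, B1, base_seed=0):
    """Average of (predicted - true)^2 over B1 bootstrap samples, per area."""
    totals = [0.0] * m
    for k in range(B1):
        rng = random.Random(base_seed + k)       # plays the role of r_{1,k}
        truth, y_star = generate(psi_hat, m, n, rng)
        psi_star = fit(y_star)                   # re-estimate on the bootstrap sample
        pred = predict(y_star, psi_star)
        for i in range(m):
            totals[i] += (pred[i] - truth[i]) ** 2
    return [t / B1 for t in totals]

# Hypothetical fitted parameters (var_b, var_e) = (1.0, 1.0).
alpha_hat = level_one_mse(psi_hat=(1.0, 1.0), m=20, n=4, B1=200)
```

Because the bootstrap sample is generated from known parameters, the true value is available for every sample, which is what makes the squared prediction error computable.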
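Double Bootstrap Estimation
Generate $B_1$ level-one samples $DG(\hat{\boldsymbol{\psi}}, r_{1,k})$
For each, generate $B_2$ level-two samples $DG(\hat{\boldsymbol{\psi}}_k^{*}, r_{2,k,t})$
$\hat{\boldsymbol{\psi}}_k^{*}$ is the estimate for level-one sample $k$
$\widehat{\mathrm{Bias}}_{i,k} = B_2^{-1} \sum_{t=1}^{B_2} \alpha_{i,k,t}^{**} - \alpha_{i,k}^{*}$
$\hat{\alpha}_{i,k}^{**} = \alpha_{i,k}^{*} - \widehat{\mathrm{Bias}}_{i,k} = 2\alpha_{i,k}^{*} - \bar{\alpha}_{i,k}^{**}$
$\hat{\alpha}_i^{**} = 2\bar{\alpha}_i^{*} - B_1^{-1} B_2^{-1} \sum_{k=1}^{B_1} \sum_{t=1}^{B_2} \alpha_{i,k,t}^{**}$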
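As a small illustration of the bias correction, the sketch below takes squared prediction errors that have already been computed, applies the correction to each level-one sample, and averages the results. The function name, the list-of-lists layout, and the numbers are assumptions made for this example.

```python
def double_bootstrap_mse(alpha_star, alpha_2star):
    """Bias-corrected MSE estimate for one area.

    alpha_star[k]     : squared error from level-one sample k
    alpha_2star[k][t] : squared error from level-two sample t of level-one sample k
    """
    corrected = []
    for a1, level_two in zip(alpha_star, alpha_2star):
        bias_k = sum(level_two) / len(level_two) - a1   # estimated bias for sample k
        corrected.append(a1 - bias_k)                   # same as 2*a1 - mean(level_two)
    return sum(corrected) / len(corrected)

# Made-up squared errors with B1 = 3 and B2 = 2, for illustration only.
est = double_bootstrap_mse([1.0, 2.0, 3.0],
                           [[0.5, 1.5], [1.0, 2.0], [2.0, 3.0]])
```

Averaging the corrected values gives twice the level-one mean minus the overall level-two mean, so the two forms of the estimator agree.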
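Fast Double Bootstrap
Davidson and MacKinnon (2007)
$\alpha_{i,k,1}^{**} - \alpha_{i,k}^{*}$ is an estimate of bias
Generate one $\alpha_k^{**}$ for each $\alpha_k^{*}$
$\widehat{\mathrm{Bias}} = B_1^{-1} \sum_{k=1}^{B_1} (\alpha_{i,k,1}^{**} - \alpha_{i,k}^{*}) =: \bar{\alpha}_i^{**} - \bar{\alpha}_i^{*}$
$\hat{\alpha}_{i,C}^{**} = 2\bar{\alpha}_i^{*} - \bar{\alpha}_i^{**}$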
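With a single level-two sample per level-one sample, the estimator needs only two averages. A minimal sketch, assuming the squared errors are already available as two equal-length lists; the function name and the numbers are hypothetical.

```python
def fast_double_bootstrap_mse(alpha_star, alpha_2star_first):
    """Double-bootstrap MSE estimate for one area when B2 = 1.

    alpha_star[k]        : squared error from level-one sample k
    alpha_2star_first[k] : squared error from the single level-two sample for k
    """
    if len(alpha_star) != len(alpha_2star_first):
        raise ValueError("need exactly one level-two value per level-one sample")
    b1 = len(alpha_star)
    mean_one = sum(alpha_star) / b1
    mean_two = sum(alpha_2star_first) / b1
    bias = mean_two - mean_one       # average of the per-sample bias estimates
    return mean_one - bias           # equals 2 * mean_one - mean_two

# Made-up squared errors, for illustration only.
est = fast_double_bootstrap_mse([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
```

Each individual difference is a noisy bias estimate, and the averaging over the level-one samples is what stabilizes it.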
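Telescoping Double Bootstrap
Level-one sample uses $DG(\hat{\boldsymbol{\psi}}, r_{1,k})$
Level-two sample uses $DG(\hat{\boldsymbol{\psi}}_k^{*}, r_{1,k+1})$
$\hat{\alpha}_{i,T}^{**} = (B_1 - 1)^{-1} \sum_{k=1}^{B_1 - 1} (\alpha_{i,k}^{*} + \alpha_{i,k+1}^{*} - \alpha_{i,k}^{**})$
$\approx (B_1 - 1)^{-1} \sum_{k=1}^{B_1 - 1} (2\alpha_{i,k}^{*} - \alpha_{i,k}^{**})$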
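The distinguishing feature of the telescoping scheme is the seed schedule: the level-two sample for k reuses the seed of level-one sample k + 1. The sketch below illustrates this with a toy linear model, y_ij = b_i + e_ij with known mean zero, and moment estimators; both are stand-ins chosen to keep the example short, not the talk's model or fitting method. The squared errors are averaged over areas to give a single number, and all names and values are assumptions.

```python
import random
from statistics import fmean

M, N = 20, 4   # toy design: M areas with N units each (assumed values)

def generate(psi, seed):
    """Toy DG(psi, seed): area effects (the true values) and data."""
    var_b, var_e = psi
    rng = random.Random(seed)
    b = [rng.gauss(0.0, var_b ** 0.5) for _ in range(M)]
    y = [[b_i + rng.gauss(0.0, var_e ** 0.5) for _ in range(N)] for b_i in b]
    return b, y

def fit(y):
    """Moment estimates of (var_b, var_e); a stand-in for REML."""
    means = [fmean(row) for row in y]
    within = sum((v - mu) ** 2 for row, mu in zip(y, means) for v in row)
    var_e = within / (M * (N - 1))
    var_b = max(0.0, fmean([mu ** 2 for mu in means]) - var_e / N)
    return var_b, var_e

def squared_error(psi, seed):
    """Generate from psi, refit, predict; return (refit psi, mean squared error)."""
    b, y = generate(psi, seed)
    var_b, var_e = psi_star = fit(y)
    gamma = var_b / (var_b + var_e / N) if var_b > 0 else 0.0
    alpha = fmean([(gamma * fmean(row) - b_i) ** 2 for row, b_i in zip(y, b)])
    return psi_star, alpha

def telescoping_mse(psi_hat, B1, base_seed=0):
    seeds = [base_seed + k for k in range(B1)]               # r_{1,k}
    level_one = [squared_error(psi_hat, s) for s in seeds]   # (psi*_k, alpha*_k)
    terms = []
    for k in range(B1 - 1):
        psi_star_k, alpha_k = level_one[k]
        alpha_next = level_one[k + 1][1]
        # Level two: parameters from sample k, seed from level-one sample k + 1.
        alpha_2star_k = squared_error(psi_star_k, seeds[k + 1])[1]
        terms.append(alpha_k + alpha_next - alpha_2star_k)
    return fmean(terms)

# Hypothetical fitted parameters (var_b, var_e) = (1.0, 1.0).
est = telescoping_mse(psi_hat=(1.0, 1.0), B1=200)
```

Because level-one sample k + 1 and level-two sample k are built from the same random numbers, their squared errors tend to move together, which is the intuition for why their difference can be less variable than with independent seeds.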
CEAP Simulation Model
$y_{ij} = g(x_{ij}, b_i) + e_{ij}$, $(y_{ij}, x_{ij})$ observed
$g(x_{ij}, b_i) = \dfrac{\exp(\beta_0 + x_{ij}\beta_1 + b_i)}{1 + \exp(\beta_0 + x_{ij}\beta_1 + b_i)}$
$y_{ij} \sim \mathrm{Binomial}(1, g(x_{ij}, b_i))$
$x_{ij} \sim NI(\mu_{xi}, \sigma_\epsilon^2)$
$b_i \sim NI(0, \sigma_b^2)$, independent of $x_{tj}$ for all $i, t, j$
$\theta_i = \int_x g(x, b_i)\, dF_{x_i}(x)$
Alternative Specifications for x
Some external information
    Area means known
    Estimated random means
No external information
    Area means fixed
    Area means random
Simulation Parameters
36 areas; $n_i \in \{2, 10, 40\}$, 12 areas each
$y_{ij}$ Bernoulli, $E\{y_{ij} \mid x_{ij}\} = g(x_{ij}, b_i)$
$g(x_{ij}, b_i) = \dfrac{\exp(\beta_0 + x_{ij}\beta_1 + b_i)}{1 + \exp(\beta_0 + x_{ij}\beta_1 + b_i)}$
$x_{ij} = \mu_x + \delta_i + \epsilon_{ij}$
$(\mu_x, \beta_0, \beta_1) = (0.0, -0.8, 1.0)$
$(\sigma_b^2, \sigma_\delta^2, \sigma_\epsilon^2) = (0.25, 0.16, 0.36)$
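A data generator for this design might look like the sketch below. It assumes that 0.16 is the between-area variance of x and 0.36 the within-area variance, that the three sample sizes are assigned to blocks of 12 areas in order, and that all random components are normal. The function and field names are hypothetical.

```python
import math
import random

BETA0, BETA1, MU_X = -0.8, 1.0, 0.0
VAR_B, VAR_DELTA, VAR_EPS = 0.25, 0.16, 0.36     # assumed ordering of the variances
AREA_SIZES = [2] * 12 + [10] * 12 + [40] * 12    # 36 areas, 12 of each size

def expit(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def simulate(seed=0):
    """Draw one simulated data set: area effects, area x means, and unit data."""
    rng = random.Random(seed)
    areas = []
    for n_i in AREA_SIZES:
        b_i = rng.gauss(0.0, math.sqrt(VAR_B))                  # area effect
        mu_xi = MU_X + rng.gauss(0.0, math.sqrt(VAR_DELTA))     # area mean of x
        x = [mu_xi + rng.gauss(0.0, math.sqrt(VAR_EPS)) for _ in range(n_i)]
        # Bernoulli response with success probability g(x_ij, b_i)
        y = [1 if rng.random() < expit(BETA0 + BETA1 * x_ij + b_i) else 0
             for x_ij in x]
        areas.append({"b": b_i, "mu_x": mu_xi, "x": x, "y": y})
    return areas

areas = simulate(seed=0)
```

Keeping the area effect and the area mean of x alongside the data makes it possible to compute the true area parameter for each simulated sample.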
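Estimation and Prediction
REML for $(\boldsymbol{\beta}, \sigma_b^2)$ and $(\mu_x, \sigma_\delta^2, \sigma_\epsilon^2)$
$\hat{\theta}_i = E\{\theta_i \mid (\mathbf{y}_i, \mathbf{x}_i, \tilde{\mu}_{xi}), \hat{\boldsymbol{\psi}}\}$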
Prediction MSE / Prediction MSE with $\mu_{xi}$ Known
Size    Fixed $\mu_{xi}$, no $\tilde{\mu}_{xi}$    Random $\mu_{xi}$, no $\tilde{\mu}_{xi}$    Random $\mu_{xi}$, $\tilde{\mu}_{xi}$
2       1.76                                       1.43                                        1.16
10      1.20                                       1.12                                        1.05
40      1.09                                       1.04                                        1.02
Bootstrap Estimator of Prediction MSE (%), 400 MC Samples, $(B_1, B_2) = (100, 1)$
Random $\mu_{xi}$, Observed $\tilde{\mu}_{xi}$
Size    Measure     $\hat{\alpha}^{*}$    $\hat{\alpha}_T^{**}$    $\hat{\alpha}_C^{**}$
2       Rel Bias    -14.6                 -9.4                     -9.4
2       Rel Sd      38.9                  45.1                     44.7
10      Rel Bias    -13.2                 -6.8                     -6.8
10      Rel Sd      30.7                  36.5                     35.9
40      Rel Bias    -7.5                  -1.9                     -1.9
40      Rel Sd      20.1                  23.3                     25.1
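The slide does not define the two summary measures. One common convention, assumed here, expresses the Monte Carlo bias and standard deviation of the MSE estimates as percentages of the true prediction MSE. The function name and the numbers below are hypothetical.

```python
from statistics import fmean, stdev

def relative_bias_and_sd(estimates, true_mse):
    """Relative bias and relative standard deviation, in percent.

    estimates : MSE estimates, one per Monte Carlo sample
    true_mse  : the prediction MSE being estimated (assumed known from simulation)
    """
    rel_bias = 100.0 * (fmean(estimates) - true_mse) / true_mse
    rel_sd = 100.0 * stdev(estimates) / true_mse
    return rel_bias, rel_sd

# Made-up estimates from three Monte Carlo samples, for illustration only.
rel_bias, rel_sd = relative_bias_and_sd([0.5, 1.0, 1.5], true_mse=2.0)
```

Under this convention a negative relative bias means the bootstrap estimator understates the prediction MSE on average.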
Equal Efficiency Bootstrap Samples
Random $\mu_{xi}$, Observed $\tilde{\mu}_{xi}$; $n_i = 2$
Bootstrap               Level One    Total
Telescoping (100, 1)    100          200
Classic (100, 1)        150          300
Classic (44, 50)        44           2244
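The Total column is consistent with counting each level-one sample plus its level-two samples, that is, the number of level-one samples times (1 + B2). A short check of that arithmetic, with a hypothetical function name:

```python
def total_bootstrap_samples(level_one, b2):
    """Level-one samples plus b2 level-two samples for each of them."""
    return level_one * (1 + b2)

# (level-one count, B2) pairs read from the table rows.
totals = [total_bootstrap_samples(100, 1),   # telescoping
          total_bootstrap_samples(150, 1),   # classic with B2 = 1
          total_bootstrap_samples(44, 50)]   # classic with B2 = 50
```

On this count, the classic scheme with 50 level-two samples uses roughly eleven times as many generated samples as the telescoping scheme (2244 versus 200).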
Summary
Fast double bootstrap improves bootstrap efficiency
Double bootstrap reduces bias (about 50%)
Double bootstrap increases variance (15 to 30%)
Random x model has potential to reduce MSE
Future Work
Confidence Intervals
Triple Bootstrap
Regression with Bootstrap
Nonparametric Bootstrap
Predictions for CEAP
Thank You