Mapping Poverty and Inequality:
A Small Area Estimation Approach
Peter Lanjouw,
DECRG, The World Bank
Washington D.C., January 25, 2007
Introduction
Project within World Bank’s research
department in collaboration with Chris Elbers
(Free University, Amsterdam) and Jean
Lanjouw (UC Berkeley)
Numerous other collaborators.
Goal is to produce (and use) estimators of
welfare that are accurate and easily
calculated. Termed “poverty maps”.
What are Poverty Maps?
Not necessarily “Maps”; rather,
highly disaggregated databases of welfare
Poverty
Inequality
Calorie intake
Under-nutrition
Other indicators (health outcomes? life expectancy?)
disaggregation may, but need not, be spatial
Poverty of “statistically invisible” groups
Example: Yunnan Province (China)
County Level Poverty Incidence Estimates
<2%
2%-6%
6%-12%
12%-18%
18%-30%
>30%
Example: Yunnan Province (China)
Township-level poverty
Why is there demand for these?
Geographic targeting of antipoverty programs
Decentralization and Public Policy
(fiscal, evidence-based policy)
Political economy of local decision-making
Within-country determinants of
welfare outcomes.
What is the problem?
The main source of information on
distributional outcomes, household surveys, permits only
limited disaggregation.
Very large data sources (e.g.
census) typically collect very limited
information on welfare outcomes.
What are the options?
1. Collect larger samples
- expensive
- some kind of data compromise
2. Combine the limited information available in
data sources like the census, into some
proxy of welfare (e.g. “basic needs index”)
- ad-hoc
- often widely disputed (multiple maps)
- limited usually to a notion of poverty
- how to interpret? (poverty = low income?)
Options, continued
3. Impute a preferred measure of
welfare (e.g. comprehensive real
consumption) from household survey
into census, using statistical prediction
methods
Small Area Estimation
Approach developed by a DECRG team
Our Solution: Combine Census and Survey
Impute a measure of welfare from household survey
into census, using statistical prediction methods.
Produces readily interpretable estimates:
Works with exactly the same concept of welfare
as traditional survey-based analysis.
Statistical precision can be gauged
Encouraging results to date
But, non-negligible data requirements
Methodology
ELL (2002, 2003)
Estimate a model of, for example, per-capita
consumption, y_h, using sample survey data.
Restrict explanatory variables to those that can be
linked to households in survey and census.
Estimate expected level of poverty or inequality for
a target population using its census-based
characteristics and the estimates from the model
of y.
Three Basic Stages
Zero stage: establish comparability of data sources;
identify/merge common variables; understand
sampling strategy.
First stage: estimate model of consumption.
Second stage: take parameter estimates to census,
predict consumption, and estimate poverty and
inequality.
Methodology
Let W(m, y) be a welfare measure based on a vector of
household per-capita expenditures, y, and household
sizes, m.
We want to estimate W for a target population (say a
village, v) where y is unknown.
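As a concrete illustration of W(m, y), the measures reported later in this deck (FGT0, FGT2, GE0) can be sketched as functions of expenditures and household sizes. This is a minimal sketch: the function names, the sample values, and the poverty line are all hypothetical, and each household is assumed to be weighted by its size m.

```python
import numpy as np

def fgt(y, m, z, alpha=0):
    """Foster-Greer-Thorbecke index, weighting each household by its size m."""
    y, m = np.asarray(y, float), np.asarray(m, float)
    gap = np.clip((z - y) / z, 0.0, None)        # relative shortfall, 0 if not poor
    contrib = np.where(y < z, gap ** alpha, 0.0)  # only the poor contribute
    return float(np.sum(m * contrib) / np.sum(m))

def ge0(y, m):
    """Mean log deviation, GE(0), with household-size weights."""
    y, m = np.asarray(y, float), np.asarray(m, float)
    mean_y = np.sum(m * y) / np.sum(m)
    return float(np.sum(m * np.log(mean_y / y)) / np.sum(m))

y = [50.0, 80.0, 120.0, 200.0]   # hypothetical per-capita expenditures
m = [6, 4, 5, 5]                 # hypothetical household sizes
z = 100.0                        # hypothetical poverty line
print(fgt(y, m, z, 0), fgt(y, m, z, 2), ge0(y, m))
```

With alpha = 0 the index is the headcount (the share of people below the line); larger alpha puts more weight on the poorest.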
Typical Estimations to date have used a log-linear
model of consumption:
$$\ln y_{ch} = E[\ln y_{ch} \mid z_{ch}] + u_{ch} = z_{ch}^T \beta + \eta_c + \varepsilon_{ch},$$
where ηc is a cluster random effect allowing for a
locational influence on consumption. (Can be more
than one level.)
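A rough sketch of this first-stage regression on simulated survey data may help fix ideas. Everything here is an assumption for illustration: the data are synthetic, the parameter values are invented, and plain OLS is used rather than any weighted or GLS estimator. The residual is split into a cluster component (cluster means) and a household component (deviations from those means).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical survey: 40 clusters of 10 households, intercept plus 2 covariates.
n_clusters, n_per = 40, 10
cluster = np.repeat(np.arange(n_clusters), n_per)
Z = np.column_stack([np.ones(cluster.size), rng.normal(size=(cluster.size, 2))])
beta_true = np.array([4.0, 0.5, -0.3])
eta = rng.normal(scale=0.2, size=n_clusters)       # cluster random effect
eps = rng.normal(scale=0.4, size=cluster.size)     # household disturbance
ln_y = Z @ beta_true + eta[cluster] + eps

# OLS estimate of beta and the composite residual u_ch = eta_c + eps_ch.
beta_hat, *_ = np.linalg.lstsq(Z, ln_y, rcond=None)
u = ln_y - Z @ beta_hat

# Cluster means of u estimate eta_c; deviations from them estimate eps_ch.
eta_hat = np.bincount(cluster, weights=u) / np.bincount(cluster)
e = u - eta_hat[cluster]
print(beta_hat, eta_hat.var(), e.var())
```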
Estimation Details
Estimate separate regressions per stratum
Use cluster weights where significant
Allow for non-normality of disturbances
(parametric/non-parametric), and
Heteroskedasticity in individual-specific component
of disturbances.
Logistic model of the variance of εch conditional on zch, bounding the
prediction between zero and a maximum, A, set equal to
(1.05)*max{ech²}:
$$\ln\left[\frac{e_{ch}^2}{A - e_{ch}^2}\right] = z_{ch}^T \hat{\alpha} + r_{ch}.$$
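The bounding idea can be sketched as follows, assuming synthetic residuals and a single covariate. The transform divides the squared residual by A minus the squared residual, which keeps the ratio positive because A exceeds every squared residual. The simple inversion shown here omits any higher-order correction to the predicted variance, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical first-stage output: covariates Z and household residuals e.
n = 500
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
e = rng.normal(scale=np.exp(0.3 * Z[:, 1]), size=n)   # variance depends on Z

e2 = e ** 2
A = 1.05 * e2.max()                     # upper bound on the predicted variance
lhs = np.log(e2 / (A - e2))             # logistic transform of squared residuals
alpha_hat, *_ = np.linalg.lstsq(Z, lhs, rcond=None)

# Invert the transform: B / (1 + B) lies in (0, 1), so sigma2 lies in (0, A).
B = np.exp(Z @ alpha_hat)
sigma2_hat = A * B / (1.0 + B)
print(alpha_hat, sigma2_hat.min(), sigma2_hat.max())
```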
Simulation of Welfare Measures
We are interested in estimating W(m, Z, ζ , u) where
m, Z, and u are conformable arrays of household
size, observables and disturbances.
The expected value of W is:
$$\mu = E[W \mid m_v, Z_v, \zeta],$$
where ζ is the vector of model parameters.
We replace the unknown vector ζ with consistent
estimators from the first stage, and use simulation to
obtain our estimator, $\hat{\mu}$.
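The simulation step might look roughly like the sketch below, assuming the first-stage estimates are already in hand. The parameter values, the census records, and the poverty line are all invented for illustration, and normal disturbances are assumed for simplicity. Each replication draws parameters and disturbances, predicts consumption for every census household, and computes the headcount; the mean over replications is the point estimate and the standard deviation serves as its standard error.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical first-stage estimates (illustrative values only).
beta_hat = np.array([4.0, 0.5])
beta_cov = np.diag([0.01, 0.004])       # sampling covariance of beta_hat
sd_eta, sd_eps = 0.2, 0.4               # cluster and household std. deviations

# Hypothetical census records for one target population with 3 clusters.
n_hh = 300
cluster = rng.integers(0, 3, size=n_hh)
Z = np.column_stack([np.ones(n_hh), rng.normal(size=n_hh)])
m = rng.integers(1, 9, size=n_hh)       # household sizes
z_line = np.exp(4.0)                    # hypothetical poverty line

def simulate_headcount(R=200):
    draws = np.empty(R)
    for r in range(R):
        beta = rng.multivariate_normal(beta_hat, beta_cov)  # parameter draw
        eta = rng.normal(scale=sd_eta, size=3)[cluster]     # cluster draw
        eps = rng.normal(scale=sd_eps, size=n_hh)           # household draw
        y = np.exp(Z @ beta + eta + eps)                    # predicted consumption
        draws[r] = np.sum(m * (y < z_line)) / np.sum(m)     # size-weighted FGT0
    return draws.mean(), draws.std(ddof=1)

mu_hat, se = simulate_headcount()
print(mu_hat, se)
```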
Prediction Error
The error in the estimator can be decomposed as:
$$W - \hat{\mu} = (W - \mu) + (\mu - \hat{\mu}).$$
Idiosyncratic error – increases with smaller
populations.
Model error – not related to size of target population.
Other elements can include:
Computation error – part of model error, can be
negligible.
Sample error – when large dataset is also a sample vs.
census.
Albania Poverty Map
Validation
In Mexico, PROGRESA is a rural health,
education and nutrition project.
As part of the PROGRESA evaluation, a
census was taken of all households in
496 rural communities in 1998.
20,544 households were covered
This census included detailed information
on consumption expenditures.
Testing the Poverty Mapping Methodology
A random sample of 50 communities
was drawn from the full 496
communities.
From each of the 50 communities, 10
households were randomly sampled.
The data from these households serve
as a “pseudo” survey.
Testing the Poverty Mapping Methodology
20 “target populations” were constructed
from the full set of 496 localities, by grouping
together 24-25 localities at a time.
Each target population covers about 1000
households.
Central Question: How well does poverty
mapping methodology applied to the “pseudo
survey” predict poverty in the target
populations?
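The sampling design of this test can be sketched as below. Community sizes are invented (the real census covered 20,544 households), and grouping communities into target populations by random shuffling is an assumption; the slides do not say how the groups were formed.

```python
import numpy as np

rng = np.random.default_rng(3)

n_communities = 496
# Hypothetical community sizes, at least 20 households each.
sizes = rng.integers(20, 63, size=n_communities)
community = np.repeat(np.arange(n_communities), sizes)

# Pseudo survey: 50 communities, then 10 households within each.
sampled = rng.choice(n_communities, size=50, replace=False)
survey_idx = np.concatenate([
    rng.choice(np.flatnonzero(community == c), size=10, replace=False)
    for c in sampled
])

# Target populations: shuffle communities and split into 20 groups of 24-25.
groups = np.array_split(rng.permutation(n_communities), 20)
target_of = np.empty(n_communities, dtype=int)
for g, members in enumerate(groups):
    target_of[members] = g

print(survey_idx.size, sorted({len(g) for g in groups}))
```

The pseudo survey plays the role of the household survey, and the full census records for each target population provide the "true" welfare values against which predictions are compared.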
Target Population   True FGT0   Estimated FGT0   s.e.
1                   60.6        58.2             3.7
2                   56.8        60.0             3.6
3                   57.2        59.8             3.8
4                   63.6        61.8             3.8
5                   61.6        62.8             3.9
6                   64.0        59.2             3.9
7                   62.1        59.7             4.1
8                   64.7        70.9             3.8
9                   61.0        60.3             3.5
10                  67.5        65.0             4.0
11                  60.3        59.8             3.6
12                  56.8        62.7             4.0
13                  64.7        64.1             3.4
14                  60.4        56.9             4.6
15                  57.6        57.8             3.6
16                  59.5        58.6             3.6
17                  55.3        54.7             4.2
18                  58.9        61.5             4.3
19                  67.6        65.9             4.2
20                  61.3        65.2             3.7
Total               61.1        61.2             2.1
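One way to read these figures is to check how often the true headcount falls inside a 95% interval around the estimate. The sketch below uses the 20 target-population values reported above and assumes a normal approximation (estimate plus or minus 1.96 standard errors); the variable names are hypothetical.

```python
import numpy as np

true_fgt0 = np.array([60.6, 56.8, 57.2, 63.6, 61.6, 64.0, 62.1, 64.7, 61.0, 67.5,
                      60.3, 56.8, 64.7, 60.4, 57.6, 59.5, 55.3, 58.9, 67.6, 61.3])
est_fgt0 = np.array([58.2, 60.0, 59.8, 61.8, 62.8, 59.2, 59.7, 70.9, 60.3, 65.0,
                     59.8, 62.7, 64.1, 56.9, 57.8, 58.6, 54.7, 61.5, 65.9, 65.2])
se = np.array([3.7, 3.6, 3.8, 3.8, 3.9, 3.9, 4.1, 3.8, 3.5, 4.0,
               3.6, 4.0, 3.4, 4.6, 3.6, 3.6, 4.2, 4.3, 4.2, 3.7])

# 95% interval around each estimate under a normal approximation.
lower, upper = est_fgt0 - 1.96 * se, est_fgt0 + 1.96 * se
covered = (true_fgt0 >= lower) & (true_fgt0 <= upper)

# Largest gap between truth and estimate, measured in standard errors.
worst = np.max(np.abs(true_fgt0 - est_fgt0) / se)
print(int(covered.sum()), "of", covered.size, "covered; largest gap in s.e.:", round(float(worst), 2))
```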
Precision of estimates
Table 4: Relative Frequency of True Target Population Welfare Falling
Within 95% Confidence Interval Around Estimated Welfare
Survey    Mean   Headcount   FGT2   GE0
1         1.00   1.00        1.00   1.00
2         0.95   0.95        1.00   1.00
3         0.90   0.85        0.80   0.95
4         0.95   1.00        1.00   0.90
5         1.00   1.00        1.00   0.85
6         0.80   0.90        0.80   0.60
7         0.95   0.95        0.95   1.00
8         0.95   0.95        0.90   0.70
9         0.85   0.85        0.80   0.90
10        0.95   0.90        0.90   0.95
Overall   0.93   0.94        0.92   0.89
Caveats
Mexican localities do not vary much in terms
of poverty.
Haven’t been able to test the methodology in a
setting where localities vary dramatically from
each other.
Test is almost a laboratory set-up:
Survey and census refer explicitly to exactly the
same time period
Variables are fully comparable, by construction.
The fact that results are good in Mexico does
not mean they will be good everywhere.
Mexican experience shows that method
matters.
Countries Being Mapped
Latin America: Mexico, Guatemala, Honduras,
Nicaragua, Panama, Ecuador, Bolivia
Asia: China, Indonesia, Papua New Guinea, Laos,
Cambodia, Vietnam, Thailand, Philippines,
Bangladesh, Sri Lanka, India.
Africa: South Africa, Mozambique, Madagascar,
Kenya, Uganda, Zambia, Malawi, Morocco
Eastern Europe and FSU: Albania, Azerbaijan,
Bulgaria, Kazakhstan
What have we learned so far?
The exercise itself can have a huge
impact on the debate on poverty within
a country and the Bank’s relationship
with the client government (e.g. Kenya,
Morocco, among many others)
Methods borrowed from ‘poverty
mapping’ can improve the quality of the
debate and inform policy.
What have we learned so far?
There should be no presumption that small, poor, rural villages are
homogeneous in terms of income or consumption
What have we learned so far?
Geographic targeting of transfers can generate very large gains for
poverty reduction
Table 2: Cost of Reducing Poverty to the same level achieved by Uniform Transfers using
the “Optimal” Targeting Scheme
                                               Ecuador   Madagascar          Cambodia
                                               (Rural)   (Urban and Rural)   (Urban and Rural)
Uniform transfer                               100       100                 100
Optimal Targeting (1st Administrative Level)   76.0      60.7                54.5
Optimal Targeting (2nd Administrative Level)   66.7      46.4                41.4
Optimal Targeting (3rd Administrative Level)   58.4      37.6                30.8
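Reading the table as costs relative to a uniform transfer (set to 100), the implied savings from targeting at each administrative level can be computed directly. This is a small arithmetic sketch; the dictionary layout and names are hypothetical, and the assignment of values to countries and levels follows the reading of the table in which costs fall as targeting becomes finer.

```python
# Cost of reaching the same poverty level, relative to a uniform transfer (= 100).
cost = {
    "Ecuador (Rural)": [100, 76.0, 66.7, 58.4],
    "Madagascar (Urban and Rural)": [100, 60.7, 46.4, 37.6],
    "Cambodia (Urban and Rural)": [100, 54.5, 41.4, 30.8],
}
levels = ["1st", "2nd", "3rd"]   # administrative level used for targeting

# Savings in percent of the uniform-transfer budget.
savings = {
    country: {lvl: round(100 - c, 1) for lvl, c in zip(levels, costs[1:])}
    for country, costs in cost.items()
}
for country, s in savings.items():
    print(country, s)
```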
What have we learned so far?
Inequality is negatively correlated with many social outcomes we care about.
It can also influence the functioning of projects at the local level.
[Figure 1: Local Inequality and Probability of FISE Project Receipt. Kernel regression, bw = .02, k = 6; axis label: Grid points]
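For readers unfamiliar with kernel regression, a minimal Nadaraya-Watson sketch with bandwidth 0.02 is given below. The data are synthetic and the downward-sloping relationship is invented purely so the code has something to fit; it is not the FISE data or result. The Gaussian kernel is also an assumption, since the meaning of "k = 6" is not stated in the slides.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data: local inequality x and a 0/1 project-receipt indicator d.
x = rng.uniform(0.05, 0.15, size=2000)
p_true = 0.6 - 2.0 * (x - 0.05)         # hypothetical relationship, not a finding
d = (rng.uniform(size=x.size) < p_true).astype(float)

def kernel_regression(x, d, grid, bw=0.02):
    """Nadaraya-Watson estimate of E[d | x] at each grid point."""
    u = (grid[:, None] - x[None, :]) / bw
    w = np.exp(-0.5 * u ** 2)           # Gaussian kernel weights (an assumption)
    return (w * d).sum(axis=1) / w.sum(axis=1)

grid = np.linspace(0.05, 0.15, 50)      # evaluation grid
p_hat = kernel_regression(x, d, grid)
print(p_hat[0], p_hat[-1])
```

Each fitted value is a locally weighted average of the 0/1 outcome, so it can be read as an estimated probability of project receipt at that level of inequality.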
Research and Future Directions:
Methodology
Mapping other indicators (e.g. nutrition)
Survey to Survey
Updating Poverty maps
Research
Should we target using maps? How?
“Impact” Mapping (ex-ante policy impact analysis)
Small area analysis of poverty, inequality and:
• Growth
• Environment
• Crime
• Health
• Infrastructure
• City size
• Electoral Participation
Cambodia Stunting Map
Poverty and Stunting in Cambodia
Survey to Survey Application:
Introduction
India in the 1990s: Acceleration of
economic growth
Economic reforms and liberalization
What has happened to Poverty?
Has poverty reduction accelerated?
Survey to Survey, cont.
Poverty Trends
Analysis has traditionally been based on
National Sample Survey (NSS) data.
Two “thick” rounds in the 1990s: 1993/4
(50th) and 1999/0 (55th)
“Thick” rounds are designed to be
representative at the state and even sub-state level (NSS regions).
Survey to Survey, cont.
At first glance, analysis of NSS data
suggests impressive declines in poverty:
“Official” headcounts (published by
Planning Commission)
               1987/8   1993/4   1999/0
Rural India:   39.4     37.3     27.1
Urban India:   39.1     32.4     23.6
These trends suggest an acceleration of
poverty decline
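The apparent acceleration can be checked with simple arithmetic on the official headcounts quoted above, comparing the average annual decline in the two six-year intervals. The dictionary layout and names below are hypothetical.

```python
headcount = {
    "Rural India": {"1987/8": 39.4, "1993/4": 37.3, "1999/0": 27.1},
    "Urban India": {"1987/8": 39.1, "1993/4": 32.4, "1999/0": 23.6},
}
years_between = 6   # 1987/8 to 1993/4 and 1993/4 to 1999/0 are both six years

pace = {}
for area, h in headcount.items():
    early = (h["1987/8"] - h["1993/4"]) / years_between   # points per year
    late = (h["1993/4"] - h["1999/0"]) / years_between
    pace[area] = (early, late)
    print(f"{area}: {early:.2f} then {late:.2f} percentage points per year")
```

On these figures the annual pace of decline is faster in the second interval for both rural and urban India, which is the acceleration the slide refers to.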
Survey to Survey, cont.
Are these estimates credible?
Serious concern regarding comparability of
the 50th and 55th round NSS consumption
data
• 55th round applies different recall periods
across expenditure items, while 50th applies
only one (30 days).
– 7-day recall period for high-frequency items like food,
365-day recall period for low-frequency items
• If households try to reconcile their answers
to questions that refer to different recall
periods, this is likely to boost the expenditure
estimates based on 30-day recall data.
Survey to Survey: Methodology
Based on Elbers, Lanjouw, Lanjouw (2001, 2003)
We estimate in the 50th round:
$$\ln y_h = x_h \beta + \varepsilon_h$$
y_h: per capita expenditure for household h,
x_h: 30-day expenditure, or a set of observed household
characteristics,
ε_h: disturbance term
Model is estimated with weights.
We estimate separate regressions per state and again per region.
We allow for:
intra-cluster correlation in disturbances
heteroskedasticity in individual-specific component of
disturbances.
non-normal and non-parametric disturbances
We report estimated standard errors alongside point estimates.
Adjustment Methodologies
Deaton and Dreze (2002) identify one consumption
component in 55th round questionnaire which has
not been altered.
• 30-day “intermediate goods” consumption
• Comprises fuel and light, rent, non-institutional
medical, and miscellaneous goods and services.
Assess the relationship between 30-day comparable
consumption and probability of being poor in 50th
round, and then predict 55th round headcounts
Key Assumptions:
(1) reported 30-day expenditures are unaffected by
the changes in questionnaire elsewhere
(2) relationship between 30-day consumption and
full consumption is much the same in 55th as in
50th round.
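The logic of this adjustment can be sketched on synthetic data. Below, the probability of being poor given comparable 30-day spending is estimated in a simulated "50th round" using decile bins, then applied to the distribution of comparable spending in a simulated "55th round". The decile-bin estimator is a crude stand-in chosen for brevity, not necessarily the estimator used by Deaton and Dreze, and every number and name is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

def synthetic_round(n, shift):
    """Hypothetical round: log comparable 30-day spending and log total spending."""
    ln_c30 = rng.normal(loc=3.0 + shift, scale=0.5, size=n)
    ln_total = 1.5 + 0.8 * ln_c30 + rng.normal(scale=0.3, size=n)
    return ln_c30, ln_total

ln_c30_50, ln_total_50 = synthetic_round(5000, shift=0.0)
ln_c30_55, _ = synthetic_round(5000, shift=0.1)    # total treated as unusable
poverty_line = np.quantile(ln_total_50, 0.37)      # hypothetical line
poor_50 = (ln_total_50 < poverty_line).astype(float)

# P(poor | comparable spending) in the 50th round, estimated within decile bins.
edges = np.quantile(ln_c30_50, np.linspace(0, 1, 11))[1:-1]
bin_50 = np.digitize(ln_c30_50, edges)
p_poor = np.array([poor_50[bin_50 == b].mean() for b in range(10)])

# Apply the 50th-round relationship to 55th-round comparable spending.
bin_55 = np.digitize(ln_c30_55, edges)
headcount_55 = p_poor[bin_55].mean()
print(poor_50.mean(), headcount_55)
```

The prediction is only as good as the two key assumptions: if comparable spending is itself contaminated, or if its relationship to total consumption shifts between rounds, the predicted headcount will be biased.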
Adjustment Methodologies
Deaton and Dreze find a slightly less rapid
decline in rural poverty during the 90s.
Kijima and Lanjouw worry about some of
the regional patterns emerging from Deaton
and Dreze results.
Is it possible that “comparable” 30-day
expenditure has been contaminated by
questionnaire changes elsewhere?
Does the stability assumption underpinning
D&D fail to hold in places?
Adjustment Methodologies
Kijima and Lanjouw propose another
specification:
model replaces 30-day intermediate goods
consumption with a set of other household
variables that have clearly not been redefined
between 50th and 55th rounds: “multivariate
model”
Assumption: relationship between household’s
characteristics and log per capita expenditure is stable in
each region between 1993/4 and 1999/0
• How reasonable?
• Note: we side-step inflation adjustments
Selected Region-Level Results
Rural
Region                 50th   55th (single)   55th (multiple)
Andhra South           22%    32%             22% (3.1)
Bihar Southern         53%    44%             46% (2.5)
Gujarat Dry Areas      38%    23%             35% (4.9)
Karnataka Eastern      22%    6%              15% (3.3)
Karnataka Southern     40%    22%             30% (2.9)
MP Central             46%    20%             37% (4.7)
MP Western             65%    52%             66% (3.7)
Maharashtra North      53%    38%             47% (3.7)
Tamil Nadu Southern    42%    21%             37% (2.6)
UP Southern            51%    16%             45% (5.8)
Note: Standard errors (in parentheses) are not small
Partly sampling, and partly prediction error
Adjusted Rural Poverty Estimates
% point decline, 50th-55th Rounds, Rural India
UNADJUSTED                          10
Sundaram and Tendulkar (2003a)       9
Deaton and Dreze (2002)              7
Sundaram and Tendulkar (2003b)       5
Datt, Kozel, and Ravallion (2003)    4
Sen and Himanshu (2004)              3
Kijima and Lanjouw (2003)            3
Kijima and Lanjouw (2005)            2
Note: different poverty lines!
Rate of poverty decline depends on adjustment method
Concluding remarks
“Poverty Mapping” has been widely
implemented.
As “snap shots” become more common, demand
is shifting increasingly towards updates.
Poverty monitoring at the local level is a key
policy imperative in many countries.
A critical step is to validate the methods outlined
here.