Zone design methods for epidemiological studies
Samantha Cockings, David Martin
Department of Geography, University of Southampton, UK
Thanks to: Arne Poulstrup, Henrik Hansen (Medical Office of Health, Province of Vejle, Denmark)
Why use areas?
No choice - data only available for areas
• Confidentiality
• Cost
Through choice
• Believe some phenomena are area-level
• Rates/ratios
• Visualisation/mapping
• Decision-making/planning
Problems with using areas
Modifiable areal unit problem (MAUP)
• Scale
• Aggregation
• For a given set of data, different aggregations/zoning systems will often show apparently different spatial patterns in the data (Openshaw, 1984)
Ecological fallacy
• Relationships between variables which are observed at one level of aggregation may not hold at the individual, or any other, level of aggregation (Blalock, 1964)
Small numbers/instability of rates
Non-nesting units
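A minimal sketch of the aggregation effect, using made-up counts for six building blocks along a line (the figures and the names `blocks`, `zone_rates`, `zoning_a` and `zoning_b` are hypothetical, not data from the studies below). The same blocks are grouped into three contiguous zones in two different ways and the rate per zone is recomputed.

```python
# Hypothetical building blocks along a line: (cases, population).
blocks = [(2, 100), (8, 100), (2, 100), (8, 100), (2, 100), (8, 100)]

def zone_rates(zoning):
    """Return cases per 1,000 population for each zone of a zoning system.

    `zoning` is a list of zones; each zone is a list of block indices.
    """
    rates = []
    for zone in zoning:
        cases = sum(blocks[i][0] for i in zone)
        pop = sum(blocks[i][1] for i in zone)
        rates.append(1000 * cases / pop)
    return rates

# Two contiguous groupings of the same six blocks into three zones.
zoning_a = [[0, 1], [2, 3], [4, 5]]
zoning_b = [[0], [1, 2, 3, 4], [5]]

print(zone_rates(zoning_a))  # uniform rates across zones
print(zone_rates(zoning_b))  # an apparent low-to-high gradient
```

The second zoning also shows the small-numbers issue: the single-block zones carry the most extreme rates.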
Recent developments in (UK) automated zone design methods/tools
2001 UK Census of Population
• Automated design of Output Areas (OAs): Martin et al (2001) [1]; Martin (2002) [2]
• Based on Automated Zoning Procedure (AZP): Openshaw (1977) [3]; Openshaw & Rao (1995) [4]
Automated Zone Matching software (AZM): Martin (2002) [5]
[1] Environment & Planning A, 33, 1949-1962
[2] Population Trends, 108, 7-15
[3] Transactions of the IBG, NS, 2, 459-472
[4] Environment & Planning A, 27, 425-446
[5] IJGIS, 17, 181-196
Methods
Automated zone design … iterative recombination
Building blocks → Initial random aggregation → Iterative recombination (maximise objective function) → Aggregated zones
Martin, D (2002), Population Trends, 108, p.11
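The steps in the figure can be sketched roughly as follows. This is an illustrative toy, not the AZM or AZP implementation: it assumes building blocks on a regular grid with rook adjacency, a population-equality objective (negative sum of squared deviations from a target), and simple greedy acceptance of improving swaps. All function names and parameters are hypothetical.

```python
import random
from collections import deque

def grid_adjacency(rows, cols):
    """Rook adjacency for a rows x cols grid of building blocks."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            adj[r * cols + c] = [
                rr * cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols]
    return adj

def is_contiguous(members, adj):
    """True if the blocks form one non-empty connected region."""
    members = set(members)
    if not members:
        return False
    start = next(iter(members))
    seen, queue = {start}, deque([start])
    while queue:
        for nb in adj[queue.popleft()]:
            if nb in members and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == members

def initial_aggregation(adj, n_zones, rng):
    """Grow n_zones contiguous zones outward from random seed blocks."""
    seeds = rng.sample(sorted(adj), n_zones)
    zone_of = {seed: z for z, seed in enumerate(seeds)}
    frontier = list(zone_of)
    while frontier:
        block = frontier.pop(rng.randrange(len(frontier)))
        for nb in adj[block]:
            if nb not in zone_of:
                zone_of[nb] = zone_of[block]
                frontier.append(nb)
    return zone_of

def objective(zone_of, pop, target):
    """Negative sum of squared deviations of zone populations from target."""
    totals = {}
    for block, z in zone_of.items():
        totals[z] = totals.get(z, 0) + pop[block]
    return -sum((t - target) ** 2 for t in totals.values())

def azp(adj, pop, n_zones, target, rng, max_passes=50):
    """Iteratively move boundary blocks between zones while the objective improves."""
    zone_of = initial_aggregation(adj, n_zones, rng)
    best = objective(zone_of, pop, target)
    for _ in range(max_passes):
        improved = False
        blocks = sorted(adj)
        rng.shuffle(blocks)
        for block in blocks:
            donor = zone_of[block]
            rest = [b for b, z in zone_of.items() if z == donor and b != block]
            if not is_contiguous(rest, adj):
                continue  # the move would empty or split the donor zone
            for receiver in sorted({zone_of[nb] for nb in adj[block]} - {donor}):
                zone_of[block] = receiver
                score = objective(zone_of, pop, target)
                if score > best:
                    best, improved = score, True
                    break
                zone_of[block] = donor  # revert a non-improving move
        if not improved:
            break
    return zone_of, best

rng = random.Random(0)
adj = grid_adjacency(6, 6)
pop = {b: rng.randint(50, 150) for b in adj}  # made-up block populations
target = sum(pop.values()) / 4
zones, score = azp(adj, pop, n_zones=4, target=target, rng=rng)
```

Swapping in a different `objective` (for example, within-zone homogeneity of a risk factor) changes what the zones are optimised for without touching the recombination loop.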
How can automated zone design help in environment and health studies?
Explore sensitivity of results to MAUP
Design sets of ‘optimal’ purpose-specific zones
Stability of estimates
• Zones of homogeneous population size?
Exploring spatial patterning of disease
• Zones of homogeneous rates?
Analysing relationships between variables
• Zones of homogeneous risk/confounding factors?
Barriers/boundaries
• Zones constrained by geographical features or administrative boundaries
Empirical study 1: Pre-aggregated data
Morbidity and deprivation in SW England
County of Avon (1991 Census)
• 1,970 enumeration districts (EDs)
• 177 wards
Premature (0-64 years) limiting long-term illness (LLTI)
Townsend deprivation score
Standardisation to England & Wales
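As a rough illustration of standardising to a reference population, the sketch below computes an indirectly standardised ratio: observed cases in a zone divided by the cases expected if reference age-specific rates applied to the zone's population. The age bands, rates and counts are invented for the example, and the function name `smr` is hypothetical.

```python
def smr(observed_cases, zone_pop_by_age, reference_rate_by_age):
    """Observed / expected cases, with expected cases from reference rates."""
    expected = sum(zone_pop_by_age[age] * reference_rate_by_age[age]
                   for age in zone_pop_by_age)
    return observed_cases / expected

# Made-up reference (e.g. national) LLTI rates per person, by age band.
reference_rates = {"0-15": 0.02, "16-44": 0.05, "45-64": 0.20}
# Made-up zone population by age band and observed LLTI count.
zone_pop = {"0-15": 100, "16-44": 200, "45-64": 100}

ratio = smr(40, zone_pop, reference_rates)
print(round(ratio, 2))  # expected cases = 2 + 10 + 20 = 32, so 40 / 32
```

A ratio above 1 indicates more illness than expected given the zone's age structure.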
[Map: SMR for LLTI (0-64). Legend classes: 0-0.62, 0.62-0.83, 0.83-1.03, 1.03-1.36, 1.36 and above; Restricted. Scale bar in kilometres]
© Crown copyright/ED-LINE Consortium, ESRC/JISC supported
[Map: Townsend score. Legend classes: -6.87 to -3.37, -3.37 to -1.97, -1.97 to -0.43, -0.43 to 1.83, 1.83 and above; Restricted]
[Map: Population (0-64), EDs. Legend classes: 1-291, 292-364, 365-420, 421-500, 501-1321; Restricted]
[Map: Population (0-64), wards. Legend classes: 43-1754, 1758-2939, 2986-4065, 4142-7868, 8020-14333]
Aims
Explore sensitivity of association at different scales (population size)
Explore sensitivity of association for different aggregations at a given scale
Explore ‘robustness’ of ED and ward level zoning systems for this type of spatial analysis
AZM software
©David Martin
[Map: AZM output zones, target 3250; mean 0-64 population 3713]
[Map: Population (0-64), target 3250. Legend classes: 2931-3252, 3253-3603, 3604-3994, 3995-4517, 4518-5746]
[Chart: Correlation (Townsend score and LLTI SMR) against mean population size: the scale effect. EDs and wards marked]
[Chart: Standard deviation (population 0-64) against mean population size: the scale effect. Wards and EDs marked]
[Chart: Correlation (LLTI-Townsend) vs. mean population size at a given scale: the aggregation effect. y-axis: correlation, 0.75 to 0.89; x-axis: mean population (0-64), 0 to 5000]
Results
Observed association affected by choice of zoning system – MAUP/ecological fallacy
Automated zoning systems demonstrate greater stability of population size and higher correlations
Townsend-LLTI correlation generally increases with zone size (population) and number of iterations
ED and ward correlations lie at the low end of the variation at a given scale
Neighbourhood scale of ~3000 for UK?
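The tendency for correlation to rise with zone size can be illustrated with a deliberately contrived example. The data below are constructed (not taken from the Avon study) so that block-level noise cancels within each pair of neighbouring blocks; real data will not behave this cleanly. Equal block populations are assumed, and `pearson` and `aggregate` are hypothetical helper names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def aggregate(values, size):
    """Average consecutive runs of `size` blocks (equal populations assumed)."""
    return [sum(values[i:i + size]) / size
            for i in range(0, len(values), size)]

# Constructed data: a deprivation gradient, and illness equal to
# deprivation plus alternating +3 / -3 block-level noise.
deprivation = list(range(16))
illness = [d + (3 if d % 2 == 0 else -3) for d in deprivation]

for size in (1, 2, 4):
    r = pearson(aggregate(deprivation, size), aggregate(illness, size))
    print(f"blocks per zone = {size}: r = {r:.3f}")
```

Because aggregation averages away the block-level noise, the zone-level correlation is higher than the block-level one even though the underlying relationship is unchanged.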
Empirical study 2: Individual level data
Dioxins and cancer, Kolding, Denmark
Background
• c.50,000 residents
• Airborne carcinogenic dioxin
Data
• Geo-referenced addresses of residents 1986-2002
• Roads, rivers, lakes
• Buildings/urban areas
Possible zone design criteria
Population size: threshold/target
Physical boundaries
• Roads, rivers, lakes
Shape
Homogeneity
• Built environment - dwelling type, tenure
• Socio-economic - education, income, occupation
Methods: Thiessen polygons around addresses
Methods: Using constraining features – roads and rivers
Methods: Clipped Thiessen polygons
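A rasterised sketch of the idea behind these three steps, assuming a small integer grid and a single vertical "river" at a hypothetical position `river_x`. Each cell is allocated to its nearest address (a discrete Thiessen tessellation), and cells are then split by which side of the river they fall on, so no building block straddles the feature. A real workflow would use vector geometry in a GIS; the function and parameter names here are invented for illustration.

```python
def thiessen_blocks(addresses, width, height, river_x=None):
    """Allocate grid cells to the nearest address, optionally clipped by a river.

    Returns a dict mapping (address_index, side_of_river) to a list of cells.
    Without a river, every cell is on side 0.
    """
    blocks = {}
    for y in range(height):
        for x in range(width):
            # Nearest address by squared Euclidean distance.
            nearest = min(
                range(len(addresses)),
                key=lambda i: (addresses[i][0] - x) ** 2
                              + (addresses[i][1] - y) ** 2)
            side = 0 if river_x is None or x < river_x else 1
            blocks.setdefault((nearest, side), []).append((x, y))
    return blocks

addresses = [(1, 2), (6, 2)]  # made-up address coordinates
unclipped = thiessen_blocks(addresses, width=8, height=4)
clipped = thiessen_blocks(addresses, width=8, height=4, river_x=2)

print(len(unclipped), "blocks without the river")
print(len(clipped), "blocks once the river splits a polygon")
```

Here the river cuts through the polygon around the first address, so that polygon becomes two building blocks.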
Illustrative zoning system from AZM: target 300, threshold 250
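One way to read the two parameters, assumed here rather than taken from the AZM documentation, is that the threshold is a minimum permitted zone population and the target is the preferred size. Under that assumption, a candidate zoning could be checked as in this sketch (zone names and populations are made up; `check_zones` is a hypothetical helper).

```python
def check_zones(zone_pops, target=300, threshold=250):
    """Report zones below the threshold and the mean absolute deviation from target."""
    below = [zone for zone, pop in zone_pops.items() if pop < threshold]
    mean_abs_dev = (sum(abs(pop - target) for pop in zone_pops.values())
                    / len(zone_pops))
    return below, mean_abs_dev

zone_pops = {"A": 310, "B": 290, "C": 240}  # made-up zone populations
below, mean_abs_dev = check_zones(zone_pops)
print("below threshold:", below)
print("mean absolute deviation from target:", round(mean_abs_dev, 1))
```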
Next steps
Other design constraints
• Physical boundaries in zone design process
• Homogeneity (built environment, social environment)
Use zones to calculate rates of cancer
Sensitivity analysis
Conclusions
All zoning systems are imposed and should not be considered neutral or stable
Zone design methods offer:
• The ability to explore the sensitivity and robustness of existing and alternative zoning systems
• The ability to design purpose-specific zoning systems according to pre-defined criteria
Environment and health studies: What are we trying to model?
People: points?
Health outcome: points/areas?
Risk factors and confounding factors
Individual level (‘People’/‘Composition’): points?
• Predisposing: age, sex, ethnicity, genetics, birthweight
• Lifestyle: smoking, diet, exercise, alcohol
• Socio-economic: occupation, income, education
Area level (‘Place’/‘Context’): areas?
• Pollution: air, water, noise
• ‘Neighbourhood’: services, housing type/quality, ethnic groupings/population mixing, deprivation, crime, support networks
[Chart: Standard deviation (0-64) vs. mean population size for different aggregations at a given scale. y-axis: standard deviation, 0 to 800; x-axis: mean population size (0-64), 0 to 5000]