The complexities of publishing
gridded data for the UK
European Forum for Geostatistics
Krakow – October 2014
Ian Coady
Geography Policy and Research Manager
Office for National Statistics
Why are we different?
• No use of grids at the national or sub-national level
• No use of administrative data in the collection and
publication of statistics
• Historic reliance on administrative boundaries for the
publication of official statistics
• Large user demand for Census microdata on
administrative boundaries
But….
• It is (partially) understood that the publication of
Census microdata at the administrative level is
not sustainable due to disclosure issues
• INSPIRE has given us a common platform and
specification for the publication of statistics
• …and I am here to try to promote the
interoperability of data at the European level
Output Areas
• Created for the 2001 Census
• Built around consistent
numbers of population and
households
• Used postcodes as the building
blocks
• Included a high level of social
homogeneity
• Some cartographic constraints
were used
Geography Policy for National Statistics
Previous work on grids
• 1km grids provided between 2005 and 2006
• Provided by aggregating population estimates from the
postcode level to the grids
• Lack of administrative data sources meant data below
postcode level was no longer available
• ONS has provided data for England and Wales but (we
think) Scotland and Northern Ireland have provided
data separately
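As an illustration only, best-fitting from postcodes to a 1km grid can be sketched as follows. This is not the ONS production method: it assumes each postcode is represented by a single point with easting and northing in metres, and the function names and sample figures are hypothetical.

```python
from collections import defaultdict

CELL_SIZE_M = 1000  # 1km grid cells


def cell_origin(easting, northing, cell_size=CELL_SIZE_M):
    """Return the south-west corner of the grid cell containing a point."""
    return (int(easting // cell_size) * cell_size,
            int(northing // cell_size) * cell_size)


def aggregate_to_grid(postcodes, cell_size=CELL_SIZE_M):
    """Sum postcode-level population estimates into grid cells.

    `postcodes` is an iterable of (easting, northing, population) tuples.
    Each postcode is assigned whole to the cell holding its point (best fit).
    """
    totals = defaultdict(int)
    for easting, northing, population in postcodes:
        totals[cell_origin(easting, northing, cell_size)] += population
    return dict(totals)


# Hypothetical postcode points, not real estimates.
sample = [
    (530120.0, 180450.0, 42),
    (530870.0, 180990.0, 18),
    (531010.0, 180200.0, 25),
]
grid = aggregate_to_grid(sample)
```

Because each postcode is allocated whole to one cell, a postcode that straddles a cell boundary contributes all of its population to the cell containing its point.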
Best-Fitting from Postcode
Research Questions
1) How can the sex and age variables that do not currently exist on
the postcode level mid-year population estimates be published?
2) How do the methodologies for publishing gridded population data
differ across the devolved administrations of the UK, and how
could this approach be harmonised?
3) What is the statistical impact of publishing gridded data at the
postcode aggregate level rather than from the microdata?
4) What options exist for visualising gridded population data?
• Small Area Population Estimates team no
longer produce mid-year estimates at the
postcode level
• To align with the Geography Policy for
National Statistics all estimates are now
produced at the Output Area level
Best-Fitting from Output Areas
Exact-Fitting from Census
Microdata
Grid Methodologies
Level | Records | Frequency | Source | Additional Variables
Households | 23,406,162 | 10 years | Census | Yes
Postcodes | Unknown | Unknown | | No
Output Areas | 181,408 | Annual | Mid-Year Estimates | No
Other alternatives…..
Use soil sealing layer (SSL) to produce dasymetric reallocation of population:
• Small area population data are intersected with the SSL and used to re-weight
population counts into settled areas, which are then aggregated to grid cells
• May offer consistency with broader European practice, but inconsistent with other
ONS approaches.
• Disclosure control attitude unknown and would need to be tested.
• Offers alternative to postcode use that can no longer be carried out.
Use ONS built-up areas layer to produce dasymetric reallocation of population:
• Variant of the previous option, but consistent with ONS data sources because it uses the Built-Up Areas (BUA) layer instead of the SSL.
• Intersects OA boundaries with BUAs and allocates population to BUAs in
proportion to area. Then aggregate to grid cells.
• Uses only published ONS data, so no additional disclosure risk.
• Reproducible methodology but needs to be tested.
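The area-proportional reallocation described above can be sketched roughly as below. This is a hedged illustration, not a tested ONS methodology: it assumes the geometric intersection of OA boundaries, built-up areas and grid cells has already been done in a GIS, so the input is a list of built-up pieces with their areas. All identifiers and figures are hypothetical.

```python
from collections import defaultdict


def dasymetric_to_grid(oa_population, builtup_pieces):
    """Allocate OA populations to grid cells in proportion to built-up area.

    oa_population: dict mapping OA id -> population count.
    builtup_pieces: iterable of (oa_id, cell_id, area_m2), one entry for each
        piece of built-up land lying inside both that OA and that grid cell.
    OAs with no built-up area are left unallocated in this sketch.
    """
    # Total built-up area within each OA.
    area_by_oa = defaultdict(float)
    for oa_id, _cell_id, area in builtup_pieces:
        area_by_oa[oa_id] += area

    # Share each OA's population across cells by built-up area.
    cells = defaultdict(float)
    for oa_id, cell_id, area in builtup_pieces:
        if area_by_oa[oa_id] > 0:
            share = area / area_by_oa[oa_id]
            cells[cell_id] += oa_population[oa_id] * share
    return dict(cells)


# Hypothetical inputs for two OAs and two grid cells.
oa_population = {"OA1": 300, "OA2": 120}
builtup_pieces = [
    ("OA1", "cell_A", 20000.0),
    ("OA1", "cell_B", 10000.0),
    ("OA2", "cell_B", 5000.0),
]
cell_totals = dasymetric_to_grid(oa_population, builtup_pieces)
```

The results are fractional counts, so any published output would still need rounding and disclosure checks. The same structure would apply to the soil sealing variant if sealed area were substituted for built-up area.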
A country of countries
• Northern Ireland has
concerns about the
disclosure risk of
publishing on both the Irish
National Grid and the
GEOSTAT Grid
• Both Scottish and Northern
Irish Census microdata is
stored separately and
experience shows accessing
this data can be a long and
slow process
Sustainable variables
• Small Area population
estimates do not currently
include sex or age
• Increasing use of
administrative data could
allow this to be included
• Change to the SAPE processes
would be required
• Would depend on the agreed
level of publication but could
potentially go further than age
and sex
Conclusions
• Producing UK level grid outputs is possible
• Producing them from Census microdata is possible but only
through the Census, and the differences between this and the
small area population estimates would be noticeable
• Differences in the methodologies of the devolved
administrations are superficial but improved data sharing is
needed to do this on an ongoing basis
• Additional variables could be provided on an ongoing basis
but only through administrative data that has not yet been
integrated into the ONS Statistical Business Process Model
Any questions???