The 2011 Census: Estimating the population

The 2011 Census:
Estimating the Population
Alexa Courtney
Overview
• Background
– New topics on questionnaire
– Internet Data Collection
• Delivery of data
• Overview of 2011 processing
– Discuss “downstream” processes
2011 Census
• 27th March 2011
– Census Rehearsal: 11th October 2009
• Most complex UK Census
– More questions and topics than previous censuses
– Range of delivery and completion options
• Similar to 2001 – keep what worked
• Some new/modified methodologies
– Operational and statistical
• Wide range of outputs
Questionnaire
• New topics
– Citizenship
– Second address
– National identity
– Language
• Full census returns from short-term migrants
– In UK for 3 months or more
– Identified through intention to stay question
Questionnaire completion
• Internet completion being offered for first time
– Internet Access Code provided on front of paper questionnaire
• Offers opportunities to improve data quality and reduce respondent burden
– Automatic routing
– Validation rules
– Use of radio buttons
• No unnecessary changes from paper questionnaire to minimise modal bias
• Advantages and disadvantages
– Reduces amount of editing required
– Increases possibility of multiple responses
Data delivery
• Can be split into three groups
– Questionnaires returned within 6 weeks of Census day
• Majority of data
• Fully processed across UK
• Matched to CCS
– Questionnaires returned within 10 weeks of Census day
• Fully processed in England, Wales & Northern Ireland
– Questionnaires returned more than 10 weeks after Census day
• May be used in coverage adjustment
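
The three delivery groups above can be sketched as a small classifier. This is only an illustration: the function name `delivery_group`, the inclusive week boundaries, and the treatment of early returns as group 1 are assumptions, not details from the slides.

```python
from datetime import date

CENSUS_DAY = date(2011, 3, 27)  # Census day, as stated on the "2011 Census" slide


def delivery_group(return_date):
    """Return 1, 2 or 3 for the delivery group of a returned questionnaire.

    Assumption: "within 6 weeks" and "within 10 weeks" are inclusive
    (42 and 70 days after Census day).
    """
    days_after = (return_date - CENSUS_DAY).days
    if days_after <= 42:
        return 1  # majority of data, fully processed across UK, matched to CCS
    if days_after <= 70:
        return 2  # fully processed in England, Wales & Northern Ireland
    return 3      # may be used in coverage adjustment
```

For example, under these assumptions a questionnaire returned on 9 May 2011 (43 days after Census day) falls into the second group.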
Removing false persons
• Problem identified in 2001 Census
– Records created in error
• Pages crossed out
• Dust on scanner
• “Two of Five” rule
• Name (from individual questions) or Date of Birth
AND
• One of: Name (from individual questions), Date of Birth, Sex, Marital Status, or Name (from household members table)
• Important for data quality and matching
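
A minimal sketch of the “Two of Five” rule, reading it as: at least two of the five items are present, and at least one of them is Name (from individual questions) or Date of Birth. The field names and the `passes_two_of_five` function are hypothetical, chosen for illustration.

```python
# Hypothetical field names for the five items listed in the rule.
FIVE_ITEMS = (
    "name_individual",       # Name (from individual questions)
    "date_of_birth",
    "sex",
    "marital_status",
    "name_household_table",  # Name (from household members table)
)
KEY_ITEMS = ("name_individual", "date_of_birth")


def is_present(value):
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def passes_two_of_five(record):
    """True if the record looks like a genuine person under the rule."""
    present = {item for item in FIVE_ITEMS if is_present(record.get(item))}
    has_key_item = any(item in present for item in KEY_ITEMS)
    return has_key_item and len(present) >= 2
```

Under this reading, a record with only Sex and Marital Status filled in (for example, stray marks picked up by the scanner) would be removed, while a record with a name and a sex would be kept.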
Multiple response resolution
• Overcount
• Several types of multiple response
– Two questionnaires from same household
• Two paper questionnaires
• Paper and Internet
– Person on same questionnaire twice (or more!)
– Person on Household and Individual questionnaire
– Person on Household and Internet questionnaire
• Needs to be a quick process
Multiple response resolution
• Duplicate households identified when receipted
– Questionnaire tracking for England, Wales & Northern Ireland
– Matched questionnaire IDs and address in Scotland
• Resolved by matching people within household
– Key variables: Name (or soundex), Date of Birth, Sex
– If Age <30, name must match exactly
• Minimise risk of matching twins
– If no people match, two household records created
– If any people match, questionnaires merged
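
The person-matching step above might look roughly like the following. It is a sketch under stated assumptions: all three key variables must agree, names are split into hypothetical `forename` and `surname` fields, and the Soundex coding is the standard American variant. The slides do not specify any of these details.

```python
from datetime import date

CENSUS_DAY = date(2011, 3, 27)

_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name):
    """Standard American Soundex code, e.g. 'Smith' and 'Smyth' share a code."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # vowels reset the previous code
    return (code + "000")[:4]


def age_on_census_day(dob):
    before_birthday = (CENSUS_DAY.month, CENSUS_DAY.day) < (dob.month, dob.day)
    return CENSUS_DAY.year - dob.year - before_birthday


def same_person(a, b):
    """Match on Name (or soundex), Date of Birth and Sex."""
    if a["dob"] != b["dob"] or a["sex"] != b["sex"]:
        return False
    name_a = (a["forename"].strip().lower(), a["surname"].strip().lower())
    name_b = (b["forename"].strip().lower(), b["surname"].strip().lower())
    if name_a == name_b:
        return True
    if age_on_census_day(a["dob"]) < 30:
        return False  # under 30: exact name only, to reduce the risk of matching twins
    return all(soundex(x) == soundex(y) for x, y in zip(name_a, name_b))


def resolve_duplicate_households(first, second):
    """'merge' if any person matches across the two questionnaires, else 'separate'."""
    matched = any(same_person(p, q) for p in first for q in second)
    return "merge" if matched else "separate"
```

The under-30 check is what keeps twins such as "Amy" and "Ami" apart: they share a date of birth, a sex and a Soundex code, so only the exact-name requirement distinguishes them.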
Multiple response resolution
• Merging questionnaires
– “Most complete” response kept
– Missing variables copied from duplicate record(s)
– Priority given to individual questionnaires
• Process for within postcode multiples
– People completing neighbour’s questionnaire
– Similar principles for resolution
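
The merging of matched questionnaires described above can be sketched as follows. How "most complete" and "priority given to individual questionnaires" interact is not stated in the slides; this example assumes individual questionnaires are ranked first and completeness breaks ties. The `source` field and function names are hypothetical.

```python
def is_missing(value):
    return value is None or str(value).strip() == ""


def completeness(record):
    """Number of answered variables, ignoring the bookkeeping 'source' field."""
    return sum(1 for key, value in record.items()
               if key != "source" and not is_missing(value))


def merge_records(records):
    """Merge duplicate records for one person into a single record."""
    # Assumption: individual questionnaires rank first, then the most complete.
    ranked = sorted(
        records,
        key=lambda r: (r.get("source") != "individual", -completeness(r)),
    )
    merged = dict(ranked[0])
    for duplicate in ranked[1:]:
        for key, value in duplicate.items():
            # Copy a variable across only where the kept record lacks it.
            if is_missing(merged.get(key)) and not is_missing(value):
                merged[key] = value
    return merged
```

Existing answers on the kept record are never overwritten; the duplicate only fills gaps.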
Filter Rules
• Based on 2001 Rules
• Used to identify incorrect/unnecessary responses
• Deterministic – based on other responses
• Used to prepare data for main edit & imputation
• e.g. Person aged <16, economically inactive (student)
• e.g. Person employed, not looking for work
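
As an illustration of a deterministic filter rule, the sketch below reads the two examples above as "a person aged under 16 is set to economically inactive (student)" and "an employed person is set to not looking for work". That reading, along with the field names and category labels, is an assumption made for the example.

```python
def apply_filter_rules(record):
    """Return a copy of the record with deterministic filter rules applied."""
    cleaned = dict(record)

    # Rule 1 (assumed reading): under-16s are economically inactive students,
    # whatever was written in the economic activity question.
    age = cleaned.get("age")
    if age is not None and age < 16:
        cleaned["economic_activity"] = "inactive_student"

    # Rule 2 (assumed reading): an employed person is not looking for work.
    if cleaned.get("economic_activity") == "employed":
        cleaned["looking_for_work"] = "no"

    return cleaned
```

Because each rule depends only on other responses in the same record, the outcome is fully reproducible, which is what distinguishes this step from the probabilistic edit and imputation that follows.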
Edit and Imputation
• Will use CANCEIS system
• Resolves inconsistent data
– Probabilistic
– Programmed with all possible inconsistencies
• Impute missing data
– Based on complete records
– Searches for similar donor
• Ensures complete and consistent data
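
The donor search described above can be illustrated with a very simple nearest-neighbour imputation. This is not CANCEIS itself: the mismatch-count distance, the function names and the variable names are all assumptions for the sketch.

```python
def distance(recipient, donor, variables):
    """Count the variables on which the recipient's known answers differ from the donor's."""
    return sum(
        1 for v in variables
        if recipient.get(v) is not None and recipient[v] != donor[v]
    )


def impute_from_donor(recipient, candidates, variables):
    """Fill the recipient's missing variables from the most similar complete record."""
    complete = [c for c in candidates
                if all(c.get(v) is not None for v in variables)]
    if not complete:
        raise ValueError("no complete donor records available")
    donor = min(complete, key=lambda c: distance(recipient, c, variables))
    filled = dict(recipient)
    for v in variables:
        if filled.get(v) is None:
            filled[v] = donor[v]
    return filled
```

Only complete records are eligible as donors, and observed answers on the recipient are left untouched.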
Output flags
• Non-standard outputs possible for England & Wales
– Use information on Second Residences
• Population staying in UK 3-12 months identified
– Exclusion from standard outputs
– Production of specific outputs
– England, Wales and Northern Ireland only
• Considering including this population in coverage adjustment
• Mark records now to enable easy production of these outputs
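
Marking records with output flags could be as simple as the sketch below. The 3-month and 12-month thresholds and the country restriction come from the slides; the function name, the flag names and the use of a single "length of stay in months" value are assumptions.

```python
# Short-term population outputs are for England, Wales and Northern Ireland only.
SHORT_TERM_COUNTRIES = {"England", "Wales", "Northern Ireland"}


def output_flags(stay_months, country):
    """Flags used to include or exclude a record from each set of outputs."""
    usual_resident = stay_months >= 12
    short_term = 3 <= stay_months < 12 and country in SHORT_TERM_COUNTRIES
    return {
        "usual_resident": usual_resident,      # base for standard outputs
        "short_term_resident": short_term,     # base for specific outputs
    }
```

Setting the flags once, at this stage, means later output production only needs to filter on them.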
Coverage assessment process
[Diagram. Boxes: 2011 Census; Census Coverage Survey; Matching; Estimation; Adjustment; Quality Assurance]
Disclosure Control - Options
• Necessary to protect confidentiality of respondents
• Three options were short-listed:
– Pre-tabular:
• Record swapping (pre-tabular)
– Small number of records swapped across areas
– Adds uncertainty to “unique” records
• Over-imputation (pre-tabular)
– Some variables deleted and re-imputed
– Post-tabular:
• Invariant ABS Cell Perturbation (IACP)
– Small counts can be altered
– Two stage process to ensure “additivity”
Disclosure Control – Chosen Methodology
• Pre-tabular method recommended
– User preference for consistency between tables
– IACP method rejected
• Record swapping chosen instead of over-imputation
– No persons or data items removed
– Outputs at national level and high geographies unaffected
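
A toy version of record swapping shows why no persons or data items are removed and why national totals are unaffected. The pairing here is random across areas; the selection and pairing rules actually used for the 2011 Census are not given in the slides, so `swap_records`, `swap_rate` and the `area` field are illustrative assumptions.

```python
import random


def swap_records(records, swap_rate, seed=0):
    """Swap the area codes of a small share of records between different areas."""
    rng = random.Random(seed)
    swapped = [dict(r) for r in records]   # work on copies
    order = list(range(len(swapped)))
    rng.shuffle(order)
    pairs_wanted = int(len(swapped) * swap_rate / 2)
    used = set()
    pairs_done = 0
    for i in order:
        if pairs_done >= pairs_wanted:
            break
        if i in used:
            continue
        # Find an unused partner record located in a different area.
        partner = next(
            (j for j in order
             if j not in used and j != i
             and swapped[j]["area"] != swapped[i]["area"]),
            None,
        )
        if partner is None:
            continue
        swapped[i]["area"], swapped[partner]["area"] = (
            swapped[partner]["area"], swapped[i]["area"])
        used.update((i, partner))
        pairs_done += 1
    return swapped
```

Every record keeps all of its answers and only its location changes, so any table that ignores the area code is identical before and after swapping.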
Outputs
• Main base will be Usual Residents
– All people living in UK for 12 months or more
– Consistent across UK
• First outputs – September 2012
– Other standard outputs by Spring 2013
• ONS producing non-standard outputs
– e.g. Weekday population, Majority of time
– Consultation to decide exactly what
• Outputs on short-term migrant population
– All people living in UK for 3-12 months
– England, Wales and Northern Ireland
Questions?