A Framework for the Accuracy
Dimension of Data Quality
for Price Statistics
Ottawa Group
Ottawa, October 2007
Presenter: Geoff Neideck
Australian Bureau of Statistics
Aim of the Presentation
• Introduction
• Extending the ABS Quality Framework
• Comparisons with other Frameworks
• Why Do Errors Occur?
• Key Principles in Managing for Accuracy
• Quality Gates
Introduction
• Many NSOs have experienced errors in their price statistics
• Some major reviews of practices have been undertaken
• Some serious consequences
– Undermining reputation and raising questions of reliability
Extending the ABS Data Quality Framework
• Dimensions of quality:
– Relevance
– Timeliness
– Accuracy
– Coherence
– Interpretability
– Accessibility
Extending the ABS Data Quality Framework
[Diagram: ACCURACY divided into two branches]
• Design Error
– Sampling Error and Statistical Bias
– Index Bias (illustrated in the sketch below)
– Questionnaire Design Error
– Process Design Errors
• Non-design Error
– Data Source Errors
– Operations Errors
– Systems Errors
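To make the 'Index Bias' design error concrete: a minimal Python sketch, with invented prices and quantities, showing how a fixed-basket Laspeyres index overstates price change relative to a superlative Fisher index when buyers substitute towards relatively cheaper products (substitution bias):

```python
# Hypothetical two-period, two-product example illustrating index
# (substitution) bias: when buyers shift towards the relatively
# cheaper product, a fixed-basket Laspeyres index sits above the
# superlative Fisher index. All figures are invented.

p0 = [1.00, 1.00]   # base-period prices
q0 = [100, 100]     # base-period quantities
p1 = [2.00, 1.00]   # current prices: product 1 doubled
q1 = [50, 150]      # buyers substituted towards product 2

def value(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

laspeyres = value(p1, q0) / value(p0, q0)   # fixed base-period basket
paasche   = value(p1, q1) / value(p0, q1)   # fixed current-period basket
fisher    = (laspeyres * paasche) ** 0.5    # geometric mean of the two

print(f"Laspeyres: {laspeyres:.3f}")  # 1.500
print(f"Paasche:   {paasche:.3f}")    # 1.250
print(f"Fisher:    {fisher:.3f}")     # 1.369
# The gap between Laspeyres and Fisher here is one concrete
# form of the 'Index Bias' design error above.
```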
Comparisons with other frameworks
– ILO CPI Manual Ch. 11
Taxonomy of Errors
• Sampling error
– Selection error
– Estimation error
• Non-sampling error
– Observation error
• Over coverage
• Response error
• Processing error
– Non-observation error
• Under coverage
• Non-response
Comparisons with other frameworks
– IMF DQAF for CPI/PPIs
Dimensions:
0 Prerequisites of Quality
1 Assurances of Integrity
2 Methodological Soundness
3 Accuracy and Reliability
4 Serviceability
5 Accessibility
Elements of dimension 3:
3.1 Source data
3.2 Assessment of source data
3.3 Statistical techniques
3.4 Assessment and validation of intermediate data and statistical output
3.5 Revision studies
Indicators (examples under 3.2 and 3.4):
3.2.1 Source data are routinely assessed
3.4.1 Intermediate results are validated against other data sources (see the sketch below)
3.4.2 Statistical discrepancies in intermediate data are assessed and investigated
3.4.3 Statistical discrepancies and other potential problems in statistical output are investigated
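Checks of the kind described by indicator 3.4.1 can be automated. A minimal Python sketch, with invented series values and an arbitrary 1.5-percentage-point tolerance, flagging periods where an intermediate index movement diverges from an independent indicator:

```python
# Hypothetical check in the spirit of DQAF indicator 3.4.1: compare
# period-on-period movements of an intermediate index against an
# independent indicator series and flag large divergences.
# Series values and the tolerance are invented for illustration.

intermediate_index = [100.0, 100.8, 101.5, 103.9, 104.2]
independent_series = [100.0, 100.6, 101.4, 101.9, 104.0]
TOLERANCE_PP = 1.5  # max acceptable divergence, percentage points

def movements(series):
    """Period-on-period percentage changes."""
    return [100 * (b / a - 1) for a, b in zip(series, series[1:])]

for period, (own, other) in enumerate(
        zip(movements(intermediate_index), movements(independent_series)), 1):
    gap = abs(own - other)
    if gap > TOLERANCE_PP:
        print(f"period {period}: index moved {own:+.2f}%, "
              f"indicator {other:+.2f}% -> investigate (gap {gap:.2f}pp)")
```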
Why Do Errors Occur?
• Critical Risk Areas
– Culture
– Change
– Education
– Documentation
– Engagement with stakeholders
– Spreadsheets and black boxes
Culture
• Understanding the ‘why’
• Taking an ‘end-to-end’ view
• Taking a questioning approach
• Taking responsibility for quality
Change
• Indexes subject to much change
• Risks increase
– Routine change
– Regular but less frequent change
– New methods and procedures
• Transfer of knowledge
• Change outside existing systems
Education
[Chart: education needs by complexity of knowledge & decision-making, from low to high – Standard methods; Best practice (choice of options); Prices theory (introduction of new concepts and practices)]
Documentation
• High on staff list of things that would enable them to do a better job
• Necessary but not sufficient
• Filling in the gaps/making assumptions
Engagement with stakeholders
• Identifying key stakeholders
• Flow of information
• Knowledge transfer
Spreadsheets and black boxes
• Spreadsheets
– A great tool, but …
– Usually outside the existing system
– Cottage industries
– Protocols/documentation
• Black Boxes
– Data in, data out …
– … but what happened in between? (see the sketch below)
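One way to open up a black box (an illustrative sketch, not a practice described in the presentation) is to record every intermediate value so the path from data in to data out can be audited:

```python
# Hypothetical illustration: instead of "data in, data out", record
# every intermediate value so the calculation can be inspected later.
# The geometric-mean aggregation and the audit-trail format are
# illustrative choices, not a description of any NSO's system.

import math

def elementary_index(price_relatives, audit_trail):
    """Geometric mean of price relatives, logging each step."""
    audit_trail.append(("inputs", list(price_relatives)))
    logs = [math.log(r) for r in price_relatives]
    audit_trail.append(("log_relatives", logs))
    result = math.exp(sum(logs) / len(logs))
    audit_trail.append(("index", result))
    return result

trail = []
index = elementary_index([1.02, 0.99, 1.05], trail)
for step, value in trail:          # the "what happened in between"
    print(step, value)
```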
Key Principles for Managing Data Quality
• Developing and sustaining a quality culture
• A program of capability development
• Managing change
• Regular communication with stakeholders
• Appropriate documentation
• A program of reviews
Quality Gates
Index Compilation Process and Quality Gates
[Diagram: compilation flow with a quality gate (Q) between each stage]
Source data: Data Collection (Direct Collection; Other Sources) → Q
Intermediate data: Micro Editing (Products/Outlets) → Q → Macro Editing (EAs; Upper-level indexes) → Q
Output data: Output Indexes → Q → Data Release
Validation against independent sources throughout: Product Information; Industry Information; Independent Indicators
Q = Quality Gate
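A gate of the kind marked Q above can be automated. A minimal Python sketch, with invented checks and thresholds (0.5–2.0 bounds on price relatives at micro editing, a 2% movement bound at macro editing), where data passes to the next stage only if every check at the gate succeeds:

```python
# Hypothetical quality-gate sketch for the compilation flow above:
# each gate (Q) runs a list of checks and blocks the data from
# moving to the next stage unless every check passes.
# The specific checks and bounds are invented for illustration.

def check_price_relatives(batch):
    """Micro-editing gate: flag implausible product-level relatives."""
    return [r for r in batch["relatives"] if not 0.5 <= r <= 2.0]

def check_index_movement(batch):
    """Macro-editing gate: flag large upper-level index movements."""
    move = abs(batch["index"] / batch["prev_index"] - 1)
    return [f"index moved {move:.1%}"] if move > 0.02 else []

def quality_gate(name, batch, checks):
    """Run all checks; data passes the gate only if none flag."""
    failures = [f for check in checks for f in check(batch)]
    if failures:
        raise ValueError(f"gate '{name}' blocked: {failures}")
    return batch  # released to the next stage

batch = {"relatives": [1.01, 0.98, 1.03], "index": 104.1, "prev_index": 103.2}
quality_gate("micro editing", batch, [check_price_relatives])
quality_gate("macro editing", batch, [check_index_movement])
print("all gates passed; data released")
```

The design point matches the diagram: checks sit between stages, so an error caught at micro editing never reaches the macro-editing or release stages.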