United Nations Economic Commission for Europe Statistical Division Measuring and Communicating Data Quality UNECE Training Workshop on Dissemination of MDG Indicators and Statistical Information Astana, Kazakhstan.
United Nations Economic Commission for Europe
Statistical Division
Measuring and Communicating
Data Quality
UNECE Training Workshop on Dissemination of
MDG Indicators and Statistical Information
Astana, Kazakhstan 23 – 25 November 2009
Steven Vale, UNECE
Contents
What is quality?
How can we measure quality?
How should we report and
communicate quality?
Steven Vale - UNECE Statistical Division
Slide 2
Which is the Best Quality?
Slide 3
Definition of Quality
International Standard ISO 9000:2005 defines quality as:
“The degree to which a set of inherent characteristics fulfils requirements.”
Slide 4
What Does This Mean?
Whose requirements?
• The user of the goods or services
A set of inherent characteristics?
• Users judge quality against a set of criteria reflecting the different characteristics of the goods or services
So quality is all about providing goods and services that meet the needs of users (customers)
Slide 5
Quality Criteria
Slide 6
Quality Criteria for Statistics
Different statistical organisations use
different criteria
- but lists of criteria are quite similar
UNECE list:
Relevance
Comparability
Accuracy
Clarity
Timeliness
Accessibility
Punctuality
Slide 7
Relevance
Are the statistics that are produced
needed?
Are the statistics that are needed
produced?
Do the concepts, definitions and
classifications meet user needs?
Slide 8
Accuracy
The closeness of statistical
estimates to true values
In the past: Quality = Accuracy
Now accuracy is just one part of
quality
Slide 9
Timeliness
The length of time between data
being made available and the event
or phenomenon they describe
Punctuality
The time lag between the actual
delivery date and the promised
delivery date
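The two definitions above can be expressed as simple date differences. The following is a minimal sketch, not part of the original slides; the function names and all dates are hypothetical, chosen only for illustration.

```python
from datetime import date


def timeliness_days(reference_period_end: date, release_date: date) -> int:
    """Days between the end of the period described and the data release."""
    return (release_date - reference_period_end).days


def punctuality_days(promised_date: date, actual_date: date) -> int:
    """Days between promised and actual delivery (negative means early)."""
    return (actual_date - promised_date).days


# Hypothetical release: data for the quarter ending 30 September 2009,
# promised for 16 November 2009 but delivered on 20 November 2009.
reference_end = date(2009, 9, 30)
promised = date(2009, 11, 16)
actual = date(2009, 11, 20)

print(timeliness_days(reference_end, actual))  # 51
print(punctuality_days(promised, actual))      # 4
```

A release can be punctual but not timely (delivered on the promised date, long after the reference period), which is why the two criteria are listed separately.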
Slide 10
Comparability
The extent to which differences are real, or due to methodological or measurement differences
• Comparability over time
• Comparability through space (e.g. between countries / regions)
• Comparability between statistical domains (sometimes referred to as coherence)
Slide 11
Accessibility
The ways in which users can obtain or
benefit from statistical services
(pricing, format, location, language
etc.)
Clarity
The availability of additional material
(e.g. metadata, charts etc.) to allow
users to understand outputs better
Slide 12
Importance of Accessibility
Not just about making data available on the Internet or in a book
• Passive accessibility
Accessibility is about bringing data to users in an understandable way, opening a dialogue with those users, and ensuring that their information needs are met
• Active accessibility
Slide 13
Accessibility Should Include:
Communicating
Marketing
Interpreting
“Story-telling”
Informing
Educating
Slide 14
Accessibility and Visualization
Good visualizations make data accessible to
many more users
Bad visualizations are unhelpful / misleading
“Self-service” visualization needs to be
simple, with guidance to help users get
meaningful results
“Ready-made” visualizations can be more
complex, tailored to specific data sets
Slide 15
Accessibility and Visualization
Is it more cost-effective to:
• develop “ready-made” graphics, or
• offer users more “self-service” functionality?
Many users don’t have the time or knowledge to produce good visualizations
Advanced users have access to their own visualization and analysis tools
Slide 16
Importance of Clarity
Clarity is all about explaining data
Do current explanatory notes help?
• Often written by specialists for specialists
• Full of jargon
• Too long
• Too boring!
Simplified, plain-text versions needed
Slide 17
Other Considerations
Cost / efficiency
Integrity / trust
Reputation of the organization
Professionalism
• Adherence to international standards (e.g. UN Fundamental Principles of Official Statistics)
Slide 18
Quality is not just about outputs
Input
Process
Output
To have good outputs we need to have
good inputs and processes, so we need
to think about the quality of these as well
Slide 19
Quality of Inputs
Timeliness
Completeness – are there any
missing units or variables?
Comparability with other sources
Quality check survey?
Knowledge of the source is vital!
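The completeness question above (missing units or variables) can be turned into two simple rates. This is a minimal sketch under stated assumptions: the record layout, the `id` key, the variable names and the data are all hypothetical and not taken from the presentation.

```python
def completeness_rates(records, expected_units, variables):
    """Return (unit completeness, {variable: item completeness}).

    Unit completeness: share of expected units that supplied a record.
    Item completeness: share of received records with a non-missing value.
    Assumes expected_units and records are both non-empty.
    """
    received_ids = {r["id"] for r in records}
    unit_rate = len(received_ids & set(expected_units)) / len(expected_units)
    item_rates = {
        v: sum(1 for r in records if r.get(v) is not None) / len(records)
        for v in variables
    }
    return unit_rate, item_rates


# Hypothetical input source: four units expected, three received,
# with one missing value in each variable.
expected = ["A", "B", "C", "D"]
received = [
    {"id": "A", "turnover": 10, "employees": 2},
    {"id": "B", "turnover": None, "employees": 5},
    {"id": "C", "turnover": 7, "employees": None},
]

unit_rate, item_rates = completeness_rates(received, expected, ["turnover", "employees"])
print(f"Unit completeness: {unit_rate:.0%}")
for name, rate in item_rates.items():
    print(f"{name}: {rate:.0%} of received records have a value")
```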
Slide 20
Quality of Processing
Quality of matching / linking
Outlier detection and treatment
Quality of data editing
Quality of imputation
Keep raw data / metadata to refer
back to if necessary
Slide 21
Quality of Outputs
Are the users satisfied?
Are the outputs comparable with
data from other sources?
What is the impact on time series?
Are the outputs cost-effective?
Quality reports to measure and
communicate differences?
Slide 22
Measuring Quality
Quantitative methods
• E.g. confidence intervals
User surveys
Self evaluation
Benchmarking
Slide 23
Quantitative Measures
The tops of the bars
indicate estimated
values and the red
lines represent the
confidence intervals
surrounding them.
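As an illustration of how such intervals might be computed, the sketch below uses a normal approximation for the mean of a simple random sample. It is an assumption for illustration only: the slides do not say how the intervals in the chart were produced, and the sample values are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Normal-approximation confidence interval for a sample mean.

    z=1.96 corresponds to roughly 95% coverage for large samples;
    for small samples a t-quantile would give a wider interval.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical sample values, for illustration only.
sample = [12.1, 9.8, 11.4, 10.9, 13.2, 10.2, 11.7, 12.5]
estimate, lower, upper = mean_confidence_interval(sample)
print(f"Estimate {estimate:.2f}, interval {lower:.2f} to {upper:.2f}")
```

In a bar chart, `estimate` would be the top of the bar and `lower` and `upper` the ends of the interval line drawn around it.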
Slide 24
UNECE Database User Survey
Launched each autumn on database
web site
10 questions
150 responses
(target 100)
Slide 25
Exercise
Design a user survey with up to 10
questions for users of your web site
20 minutes
Slide 26
UNECE User Survey Questions
1. Type of user
2. Frequency of use
3. Location (country)
4. Type of data
5. Database relevance
6. Timeliness
Slide 27
Continued...
7. Clarity (metadata)
8. Overall data quality
9. User interface
10. Other comments and questions
Slide 28
Results: Type of user
[Pie chart; categories: Media, Individual, Other, Private business, National Statistical Office, National government, International organization / NGO, Student, Academic / research]
Results: Frequency of use
[Chart not reproduced]
Results: Location
[Chart not reproduced]
Results: Data quality
Excellent 18%
Good 63%
Average 17%
Poor 1%
Very poor 1%
Results: User interface
Excellent 15%
Good 60%
Average 23%
Poor 1%
Very poor 1%
Improving Our Services
Better timeliness of data
New “Country Overview” data cube to give
quick access to key indicators
More content in Russian
Improved user interface
More and better metadata
Statistical literacy
Slide 34
Self-evaluation
Relatively quick and cheap
Is it sufficiently objective?
Needs a standard framework to ensure comparability of quality assessments
• Eurostat DESAP check list:
http://epp.eurostat.ec.europa.eu/portal/page/portal/quality/documents/desap%20G0LEG-20031010-EN.pdf
Slide 35
Benchmarking
Comparing data
values or data
production processes
between two sources
Differences can be
studied to try to find
ways to improve
quality
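A comparison of data values between two sources can start with relative differences. The sketch below is illustrative only: the function names, the 5% threshold and the two series are hypothetical and not taken from the presentation.

```python
def relative_differences(source_a, source_b):
    """Relative difference (b - a) / a for each period present in both sources.

    Assumes the values in source_a are non-zero.
    """
    shared = source_a.keys() & source_b.keys()
    return {k: (source_b[k] - source_a[k]) / source_a[k] for k in sorted(shared)}


def flag_large_differences(diffs, threshold=0.05):
    """Periods whose absolute relative difference exceeds the threshold."""
    return [k for k, d in diffs.items() if abs(d) > threshold]


# Hypothetical values for the same indicator from two sources.
national = {"2006": 100.0, "2007": 104.0, "2008": 110.0}
international = {"2006": 100.0, "2007": 103.0, "2008": 121.0}

diffs = relative_differences(national, international)
print(flag_large_differences(diffs))  # ['2008']
```

The flagged periods are only a starting point: the differences still need to be studied to find out whether they come from definitions, methods or errors.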
Slide 36
Benchmarking Between
Countries
Fairly cheap and easy way to get ideas
on how to improve statistical processes
Mutual benefit - “win - win”
Helps to improve international
cooperation
May lead to joint development projects
Slide 37
Communicating Quality
Quality Reports
• Summary – “traffic light” indicator
Red – Serious quality issues, read the quality report before using
Orange – Caution, do not use for important decisions without reading the quality report
Green – Good quality
• Intermediate – short quality report (1000 words maximum)
• Detailed – full quality report
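One way a dissemination system might attach the “traffic light” summary to a data set is sketched below. This is an assumption for illustration: the class and function names are hypothetical, and only the three messages come from the slide. How a data set is assigned a colour is not specified in the presentation and is left out.

```python
from enum import Enum


class QualityLight(Enum):
    """Traffic-light levels with the user-facing advice from the slide."""
    RED = "Serious quality issues, read the quality report before using"
    ORANGE = ("Caution, do not use for important decisions "
              "without reading the quality report")
    GREEN = "Good quality"


def summary_line(dataset_name: str, light: QualityLight) -> str:
    """One-line quality summary to display next to a data set."""
    return f"{dataset_name}: {light.name.title()} - {light.value}"


# Hypothetical data set name, for illustration only.
print(summary_line("Example indicator table", QualityLight.ORANGE))
```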
Slide 38
Detailed Quality Reports
Should cover all
components of quality
Should be written for
the user
Should be easily
accessible
Should follow a
standard template
Slide 39
Exercise
What should be covered in a detailed quality report?
• List the topics that should be included
10 minutes
Slide 40
ESQR Contents (1)
Introduction to the statistical process and its
outputs
Relevance
Accuracy
Timeliness
Punctuality
Accessibility
Clarity
Slide 41
ESQR Contents (2)
Comparability
Trade-offs between quality components
Assessment of User Needs and
Perceptions
Performance, Cost and Respondent Burden
Confidentiality, Transparency and Security
Conclusion
Slide 42
Summary
Quality is all about meeting user needs
There are many different aspects to quality, some of which may be in conflict
• E.g. Timeliness versus Accuracy
There are various ways of measuring
quality; user views are important
Quality should be communicated to users
in a way they can understand
Slide 43
Which is the Best Quality?
It depends what the user needs!
Slide 44
Questions?
Slide 45