Statistical processes

Strengthening Statistics
Managing processes
Core business of the NSO
Part 2
Produced in Collaboration between World Bank Institute and
the Development Data Group (DECDG)
Copyright 2010, The World Bank Group. All Rights Reserved.
Statistical business register
• Statistical business registers (SBR) transform administrative units
into statistical units
• With respect to the target population, the SBR is imperfect
• Survey statisticians must compensate for imperfections as much as
possible
• The SBR must contain the statistical and administrative attributes
necessary for implementing the survey
Sample design
• There is a variety of sampling methods
• Sample size depends on the data to be published
• Tables to be published contain estimates
• The underlying unknown quantities are called parameters
• The sample must be designed to produce estimates lying as near as
possible to the values of the parameters
• Which design fits best for a particular survey depends on the
auxiliary information present in the frame
• The more information is available before sampling, the better the
sampling design can be tailored to the survey objectives
Sampling strategy
• The sampling design is a set of specifications which define the
target population, the sampling units, and the probabilities attached
to the possible samples
• An estimator is the mathematical function by means of which the
estimate for a particular parameter is computed
• The combination of a design and an estimator is called a strategy
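• Bias is a property of the strategy, not of a single estimate: it is the
difference between the average of all estimates the sample survey
might produce for a parameter and the true value of that parameter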
• There are different sources of bias, including non-response
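The terms on this slide can be illustrated with a minimal sketch. It assumes a hypothetical population of five businesses, simple random sampling without replacement as the design, and an expansion estimator (each sampled value weighted by the inverse of its inclusion probability) for the population total. The population values and all names are illustrative and not taken from the course material. Because the population is tiny, every possible sample can be enumerated, so the bias of the strategy is computed exactly rather than simulated.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of five businesses (turnover, arbitrary units).
population = [12, 7, 30, 5, 46]
N = len(population)
n = 2  # sample size

# Design (assumed): simple random sampling without replacement, so all
# samples of size n are equally likely and every unit has inclusion
# probability n / N.
inclusion_prob = Fraction(n, N)


def expansion_estimator(sample_values):
    """Estimate the population total: weight each value by 1 / inclusion_prob."""
    return sum(Fraction(y) / inclusion_prob for y in sample_values)


# Every sample the design can produce, and the estimate each one yields.
samples = list(combinations(population, n))
estimates = [expansion_estimator(s) for s in samples]

# Expectation of the estimator: the average over all equally likely samples.
expected_value = sum(estimates) / len(estimates)
true_total = sum(population)

# Bias of the strategy: expectation of the estimator minus the parameter.
bias = expected_value - true_total

print("possible samples:", len(samples))
print("expected estimate:", expected_value, "| true total:", true_total)
print("bias:", bias)
```

Individual estimates differ from the true total, yet the bias is zero because bias describes the whole set of possible estimates. Non-response would change this: if some units can never be observed, the average over the achievable samples no longer equals the true total.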
Sample size
• Two aspects play a role: cost and precision
• Often sample size is decided by the budget
• In business surveys the method of stratified sampling is widely used
• Size class usually does well as a stratifying variable
• Determination of an optimum allocation is often an iterative process
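One reason the allocation can become iterative is sketched below. The sketch assumes Neyman allocation (sample size proportional to stratum size times the standard deviation within the stratum), which is one common choice and not necessarily the method the course has in mind. When the formula asks for more units than a stratum contains, that stratum becomes a take-all stratum and the rest of the sample is reallocated. The strata, their sizes and their standard deviations are hypothetical.

```python
def neyman_allocation(total_n, strata):
    """Allocate total_n sample units over strata (unrounded).

    strata maps a stratum name to (N_h, S_h): the number of units in the
    frame and an assumed standard deviation of the target variable.
    Strata whose allocation would exceed N_h are enumerated completely,
    and the remaining sample is reallocated over the other strata.
    """
    allocation = {}
    remaining = dict(strata)
    remaining_n = float(total_n)
    while remaining:
        weight_sum = sum(N_h * S_h for N_h, S_h in remaining.values())
        trial = {h: remaining_n * N_h * S_h / weight_sum
                 for h, (N_h, S_h) in remaining.items()}
        # Strata where the formula asks for more units than exist.
        over = [h for h in trial if trial[h] > remaining[h][0]]
        if not over:
            allocation.update(trial)
            break
        for h in over:
            N_h = remaining.pop(h)[0]
            allocation[h] = float(N_h)  # take-all stratum
            remaining_n -= N_h
    return allocation


# Hypothetical size classes: (units in frame, assumed standard deviation).
strata = {
    "small": (1000, 10.0),
    "medium": (200, 50.0),
    "large": (20, 400.0),
}
allocation = neyman_allocation(100, strata)
for name, n_h in allocation.items():
    print(name, round(n_h, 1))
```

In this example the first pass assigns more than 20 units to the "large" class, so it is enumerated completely and the remaining 80 units are divided between the other two classes in a second pass. In practice the result would still have to be rounded and checked against the budget and the required precision per published table.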
Sampling error and total error
• An important activity after the survey has been held is determining
error
• Sampling error results from taking a sample rather than using
information from the whole population
• Non-sampling error relates e.g. to frame imperfections, imprecise
objectives, poor question design and non-response
• Variance expresses precision - how closely the estimates cluster
around the expectation of the estimator
• Mean square error expresses accuracy - how close the estimates lie
to the parameter
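The link between variance and mean square error can be written out as follows. The notation is an assumption made here for illustration: \hat{\theta} stands for the estimator and \theta for the parameter. The two measures differ only by the squared bias, so they coincide when the estimator is unbiased.

```latex
\begin{align}
\operatorname{Var}(\hat{\theta}) &= E\big[(\hat{\theta} - E[\hat{\theta}])^2\big] \\
\operatorname{Bias}(\hat{\theta}) &= E[\hat{\theta}] - \theta \\
\operatorname{MSE}(\hat{\theta}) &= E\big[(\hat{\theta} - \theta)^2\big]
  = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2
\end{align}
```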
Sample survey data collection methods
• Deciding about the best possible data collection method for a survey
is an important activity of survey design
• For business surveys self-enumeration is the only realistic option,
either on the basis of a paper questionnaire, or through web forms
• Business surveys are often repetitive
• Methods to organize repetitive business surveys are repeated
cross-sectional surveys, panel surveys and compromises between
these two extremes
• Which type of design is most suitable depends on the objectives of
the survey
Self-enumeration methods
• The three most common self-enumeration methods are:
• Postal survey
• Drop-off-mail-back and drop-off-pickup
• Electronic form
• Advantages of postal surveys include that respondents can fill out
the questionnaire when and how they want to
• Disadvantages are lower and later response, and that questions
cannot be too difficult
• Drop-off-mail-back and drop-off-pickup provide better response, but
are more costly
• Use of electronic forms has many advantages, but also limitations
Data collection in practice
• The procedures used for collection of data from businesses are of
enormous importance
• Most of the operations can be supported by modern automation
tools
• The list of sampled units and the questionnaire items provide the
ingredients for setting up a microdata file
• Respondents should be informed in advance about new surveys
and substantially changed questionnaires
• Respondents should be invited to contact the NSO in case of any
problems
Data entry
• There are five types of data entry:
• EDI
• Scanning
• OCR
• Heads-up data entry
• Heads-down data entry
• Each technique has advantages and limitations
• On PCs, database programs and dedicated programs can be used
• Spreadsheets are generally less suited for data entry
Data processing
• Processing data is more than aggregating
• One reason is that respondents make errors
• Another reason is non-response and incomplete data
• Yet other reasons include improving coherence, translating
bookkeeping concepts into statistical concepts, correcting problems
with the sampling frame, and dealing with non-response
Data editing methods
• Editing is correcting data errors
• Whatever technique is used, not all errors will be found
• The aim is to detect and correct serious errors
• Data editing takes place during or after data entry
• Types of editing include:
• Routing checks: have all questions been answered?
• Data validation: are answers permissible?
• Relational checks: is the ratio between variables within bounds,
do data add up?
• Automated editing is becoming increasingly important
• Selective editing or macro-editing is about detection and treatment
of outliers
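The three types of check listed above can be sketched as a small automated edit routine. This is a minimal illustration, not an NSO's actual edit system: the field names, permissible ranges and ratio bounds are all hypothetical, and the routine only flags failures, leaving the correction to a later step.

```python
def edit_record(record, required, valid_ranges, ratio_bounds, sums):
    """Return a list of edit failures for one questionnaire record."""
    failures = []

    # Routing check: have all required questions been answered?
    for field in required:
        if record.get(field) is None:
            failures.append(f"missing: {field}")

    # Data validation: is each answer within its permissible range?
    for field, (low, high) in valid_ranges.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            failures.append(f"out of range: {field}")

    # Relational check: is the ratio between two variables within bounds?
    # Skipped when either value is missing or the denominator is zero.
    for (num, den), (low, high) in ratio_bounds.items():
        a, b = record.get(num), record.get(den)
        if a is not None and b:
            if not (low <= a / b <= high):
                failures.append(f"ratio out of bounds: {num}/{den}")

    # Relational check: do the components add up to the reported total?
    for total, parts in sums.items():
        values = [record.get(p) for p in parts]
        if record.get(total) is not None and None not in values:
            if sum(values) != record[total]:
                failures.append(f"does not add up: {total}")

    return failures


# Hypothetical record and edit rules.
record = {
    "turnover": 500,
    "employees": 10,
    "domestic_sales": 300,
    "export_sales": 150,
    "wages": None,
}
failures = edit_record(
    record,
    required=["turnover", "employees", "wages"],
    valid_ranges={"employees": (0, 100000), "turnover": (0, 10**9)},
    ratio_bounds={("turnover", "employees"): (1, 40)},
    sums={"turnover": ["domestic_sales", "export_sales"]},
)
for failure in failures:
    print(failure)
```

Keeping the rules in data structures rather than in the code means subject matter statisticians can adjust bounds without changing the routine itself.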
Data integration
• Coherence of statistical data can be enhanced by integration
• Achieving coherence is a complex process
• A first issue is aligning concepts
• One solution is adjusting the names of concepts to make clear that
surveys observe different things
• Another solution is adjusting the definition of the concepts of one
survey to those of other surveys – this is not always possible
• A third solution is eliminating duplications – one survey skipping a
question and deriving data from other surveys
• Another solution is the NSO adopting a ‘one number policy’ –
publishing only one number about a phenomenon
Data analysis
• There are countless types of analysis an NSO may engage in
• Only a few examples are given, which are important for users:
• Seasonal adjustment
• Statistical disclosure control of tabular data
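As an illustration of disclosure control for tabular data, the sketch below flags table cells that would be unsafe to publish. It assumes two widely used sensitivity rules, a minimum number of contributors and a dominance rule, and the parameter values are purely illustrative; the course material does not specify which rules or thresholds to use. The industries and contributions are hypothetical, and contributions are assumed to be non-negative.

```python
def is_sensitive(contributions, min_count=3, top_n=1, max_share=0.75):
    """Flag a table cell as unsafe to publish.

    contributions: the individual respondents' (non-negative) values
    behind one cell. All thresholds are illustrative assumptions.
    """
    # Threshold rule: too few contributors behind the cell.
    if len(contributions) < min_count:
        return True
    total = sum(contributions)
    if total == 0:
        return False
    # Dominance rule: the top_n largest contributors account for more
    # than max_share of the cell total.
    largest = sorted(contributions, reverse=True)[:top_n]
    return sum(largest) / total > max_share


# Hypothetical turnover contributions per industry cell.
cells = {
    "retail": [40, 35, 30, 25],
    "mining": [900, 50, 30, 20],
    "shipbuilding": [120, 80],
}
suppressed = {name: is_sensitive(values) for name, values in cells.items()}
for name, values in cells.items():
    shown = "suppressed" if suppressed[name] else sum(values)
    print(name, shown)
```

Flagging sensitive cells is only the first step. If row or column totals are published, further cells usually have to be suppressed so that the hidden values cannot be recovered by subtraction.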
Data dissemination
• Websites have become the main public face of statistics
• For it to present a good image, the website must be up-to-date and
error free
• Achieving this requires a dedicated team
• Subject matter statisticians must take responsibility for their products
• Each data series on the website must have an owner
Administrative registers
• Use of administrative registers for statistical purposes is stimulated
by:
• The need to reduce the reporting burden
• Budget cuts, in combination with increased demand for statistics
• Direct data collection from businesses is only justified if other
sources fail
Pros and cons of administrative registers
• Advantages of administrative registers are:
• Avoidance of reporting burden
• Cost effectiveness
• Negligible non-response
• No sampling error
• Data reported may be more accurate
• Disadvantages of administrative registers are:
• Discrepancy between administrative and statistical concepts
• Risks with respect to stability
• Data reported may be less accurate
• Data may become available with considerable delay
• Legal constraints