Transcript PPT

Handbook Composite Estimators,
Data Related Issues (chapter 4)
Frank van de Pol, Jan van den Brakel, Pim Ouwehand, Floris
van Ruth, Piet Verbiest
To be presented 8-10-2014 at the UN workshop
before the CIRET conference in Hangzhou.
The views expressed in this presentation are those of
the authors and do not necessarily reflect the policies of
Statistics Netherlands.
2
Business cycle clock summarises and
shows economic trend as clock movement
– Eurostat gives series for EU countries, USA, …
– http://epp.eurostat.ec.europa.eu/cache/BCC2/group1/xdis_en.html
– OECD gives series for total of member countries
http://stats.oecd.org/mei/bcc/default.html
– Netherlands, Germany, Denmark, …
– What if we have “data problems”?
3 Ideal situation: all data in time, comparable,…
Problematic data?
A. Discontinuities in time series: back casting
B. Some indicators may be missing or too late
C. Mixed frequency data, delayed results
D. Incomplete data and indices, changing weights
E. Seasonal Adjustment interpretation: outlier or start of crisis?
Data Related Issues
with Composite
Estimators
4 Overview
Interruption of time series
caused by a change of method
• Importance of comparable figures in time series
• Answers are influenced by context
• Selective response affects outcomes
• Survey design changes
• “New” modes: telephone, web
• Combination of similar surveys for more detail
Statistics Netherlands (SN) with other governmental bodies
• Police monitor (BiZa), safety monitor (SN) & other
• Health survey (SN) & GGD surveys (communal health service)
5 A. Interrupted time series
6 A. Interrupted time series
Jan van den Brakel, Paul Smith, Simon Compton,
Survey Methodology 2008
7 A. Interrupted time series
8 A. Interrupted time series
Sometimes a discontinuity is found
9 A. Interrupted time series
Interrupted time series due to method
change: several situations
• Changed measurement target variables
• Parallel data collection with both designs
• Step parameter interrupted time series
• Change in classification, editing or imputation
• Recalculation with both methods
• New editing strategy (partially automatic)
• More efficient estimation (small area)
• Classification change: profession, education, industry
• UN classifications with thousands of categories
• http://unstats.un.org/unsd/class/family/default.asp
10 A. Interrupted time series
Discontinuities: how to cope?
• Suppose a correction factor has been calculated
• Which part of the series is better, old or new?
• Adapt history: “back casting” of the series
• Adapt future data: cannot continue very long
• Correction factor gets less valid as extrapolation is
extended further, both w.r.t. past and future
• It would be convenient if we could consider the new
figures as better figures
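A minimal sketch of back casting with a multiplicative correction factor, assuming the factor comes from a single period in which both methods were run in parallel. The function names and all figures are hypothetical; the slides do not prescribe how the factor is obtained, and an additive step or a model-based estimate would be equally possible.

```python
def ratio_from_parallel_run(old_value, new_value):
    """Correction factor: level under the new method relative to the old one."""
    return new_value / old_value


def back_cast(old_series, correction_factor):
    """Rescale observations made with the old method to the new level."""
    return [value * correction_factor for value in old_series]


# Hypothetical figures collected with the old method.
old_series = [100.0, 102.0, 104.0]

# Hypothetical parallel run in the last period: old method 104.0, new method 109.2.
factor = ratio_from_parallel_run(old_value=104.0, new_value=109.2)

# History expressed at the level of the new method, roughly 105.0, 107.1, 109.2.
adjusted_history = back_cast(old_series, factor)
print(adjusted_history)
```

The sketch treats the new figures as the better ones and adapts history. The further back the factor is applied, the weaker the assumption that it is constant.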
11 A. Interrupted time series
Some indicators may be missing or too late
The right data are not available
‐ “Holes” in micro data
‐ No register, but there is a survey
‐ No survey, but related registers
‐ Register information is too late, but proxy available
‐ Some regions provide information, others do not or are late
‐ Only social media information available
12 B. Indicators missing or too late
Some indicators may be missing or too late
‐ “Holes” in micro data: imputation
‐ No register: use a survey
‐ No survey, but related registers: use predictors
‐ Register information is too late, but proxy available: combine old register data with change information
‐ Some regions provide information, others do not or are late: use small area estimate or synthetic estimator
‐ Only social media information available: create indicator from Twitter, Facebook, Weibo
13 B. Indicators missing or too late
Mixed frequency data and delayed results
Composing several indicators
‐ Publication delay differs: ragged edges
‐ Publication frequency differs: ragged edges
Use a high-frequency (monthly) indicator to forecast the lagging
quarterly or yearly indicators (Foroni and Marcellino, 2013)
Methods:
– Interpolation: linear, Chow-Lin, Denton (EViews)
– ARIMA regression & imputation (US Leading Economic Index)
– State space models (STAMP, Ox, EViews)
– Mixed Frequency Vector Autoregression (VAR)
– Bridge models
– Mixed Data Sampling (MIDAS)
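As an illustration of the simplest of these approaches, the sketch below shows a bridge model in plain Python: the monthly indicator is averaged to quarters, a straight line is fitted to the quarters for which the target is already published, and the line is used to nowcast the missing quarter. All data and function names are hypothetical, and the sketch assumes that the three months of the latest quarter are already available.

```python
from statistics import mean


def quarterly_means(monthly):
    """Average complete blocks of three months into quarterly values."""
    usable = len(monthly) - len(monthly) % 3
    return [mean(monthly[i:i + 3]) for i in range(0, usable, 3)]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: twelve months of a fast indicator, but only three
# quarters of the slow target have been published so far.
monthly_indicator = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0, 4.0, 5.0, 6.0]
quarterly_target = [5.0, 7.0, 9.0]

# Bridge the frequencies: monthly indicator expressed per quarter.
indicator_q = quarterly_means(monthly_indicator)

# Fit on the quarters where both series exist, then nowcast the fourth quarter.
intercept, slope = fit_line(indicator_q[:3], quarterly_target)
nowcast_q4 = intercept + slope * indicator_q[3]
print(nowcast_q4)
```

With a ragged edge inside the quarter, the missing months would first have to be forecast or the regression set up per available month, which this sketch leaves out.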
14 C. Mixed frequency data, delayed results
Incomplete data and indices, changing weights
– Data collection skipping 2nd half of the period
‐ i.e. observation period does not match reporting period
– Large revisions from preliminary to final figures are
harmful and should be avoided
– Observation period one week? Fluctuations between
weeks will inflate fluctuation in monthly series
– Observation period first two weeks of a month? More
stable series will result.
15 D. Incomplete data and indices, weights
Incomplete data and indices, changing weights
– Index: starting point = 100, multiplied by a
series of growth rates
– For this simplification to be true, the population
should not change, but it does
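The construction described above can be sketched in a few lines of Python. The function name and the growth rates are hypothetical, and growth rates are assumed to be given as fractions (0.02 for 2 percent).

```python
def index_from_growth_rates(growth_rates, base=100.0):
    """Build an index that starts at `base` and compounds period-on-period growth."""
    index = [base]
    for rate in growth_rates:
        index.append(index[-1] * (1.0 + rate))
    return index


# Hypothetical growth rates for three consecutive periods.
index = index_from_growth_rates([0.02, -0.01, 0.03])

# Roughly 100, 102, 100.98 and 104.01.
print(index)
```

Each growth rate is only meaningful if it compares the same population in both periods; when units enter or leave, the compounded index drifts away from the level of the actual population.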
16 D. Incomplete data and indices, weights
Incomplete data and indices, changing weights
– Price index: weighted set of basic products
– Weights should reflect consumption pattern
– Weights can be fixed until a revision year or
updated every year, linking years with a chain of
growth rates
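A minimal sketch of yearly updated weights with chain linking, assuming a Laspeyres-type link (expenditure shares of the previous year applied to price relatives). The slides do not name a formula, so this choice, the function names, and the two-product basket are all hypothetical.

```python
def yearly_link(prices_prev, prices_curr, weights_prev):
    """Weighted average of price relatives; weights are shares that sum to 1."""
    return sum(weight * prices_curr[product] / prices_prev[product]
               for product, weight in weights_prev.items())


def chained_index(prices_by_year, weights_by_year, base=100.0):
    """Link consecutive years, each with the weights of the earlier year."""
    index = [base]
    for t in range(1, len(prices_by_year)):
        link = yearly_link(prices_by_year[t - 1], prices_by_year[t],
                           weights_by_year[t - 1])
        index.append(index[-1] * link)
    return index


# Hypothetical basket of two products over three years.
prices_by_year = [
    {"bread": 1.0, "fuel": 2.0},
    {"bread": 1.1, "fuel": 2.0},
    {"bread": 1.1, "fuel": 2.2},
]
# Hypothetical consumption shares, updated after the first year.
weights_by_year = [
    {"bread": 0.5, "fuel": 0.5},
    {"bread": 0.4, "fuel": 0.6},
]

# Links are 1.05 and 1.06, so the index is roughly 100, 105 and 111.3.
price_index = chained_index(prices_by_year, weights_by_year)
print(price_index)
```

With weights fixed at the first year's shares, the same prices would give 110 in the last year, so the choice between fixed and yearly updated weights visibly changes the outcome.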
17 D. Incomplete data and indices, weights
Seasonal Adjustment: outliers and
seasonal or calendar effects
– When should an outlier be viewed as indicative of a
discontinuity, a crisis or a boom?
– Treat it as an outlier until the contrary is proven
‐ Statistical proof (standard error)
‐ Context proof (news item, change in related series)
– Should an outlier affect seasonal pattern or not?
– Correct treatment of calendar effects (holidays)
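The "statistical proof" can be illustrated with a very simple screening rule: flag a new observation when it lies more than a chosen number of standard deviations from the mean of recent history. This is only a sketch under stated assumptions; the function name, the threshold of 3, and the figures are hypothetical, and seasonal adjustment software uses model-based outlier tests rather than this rule.

```python
from statistics import mean, stdev


def is_outlier(history, new_value, threshold=3.0):
    """True if new_value deviates more than `threshold` standard deviations
    from the mean of the historical observations."""
    centre = mean(history)
    spread = stdev(history)  # sample standard deviation
    return abs(new_value - centre) > threshold * spread


# Hypothetical seasonally adjusted figures for six recent periods.
history = [100.0, 101.0, 99.0, 100.0, 102.0, 98.0]

print(is_outlier(history, 90.0))   # far below recent levels
print(is_outlier(history, 103.0))  # within the usual range
```

A flagged value still needs the context proof mentioned above before it is treated as the start of a crisis or boom rather than as an outlier.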
18 E. Seasonal adjustment and outliers
Seasonal Adjustment: outliers and
seasonal or calendar effects
[Figure: Consumer confidence index, monthly, 2005 to 2010; original and seasonally adjusted series; vertical axis from -36 to 18]
19 E. Seasonal adjustment and outliers
Seasonal Adjustment: outliers and
seasonal or calendar effects
[Figure: Industrial Production Index, 2000 to 2009; series: original, seasonally adjusted (no outliers set), seasonally adjusted (with outliers set manually); vertical axis from 80 to 120]
20 E. Seasonal adjustment and outliers