Panel discussion questions for Session II
Panellists
• Dan Gilman (BLS)
– BLS are an associate member of the DDI Alliance
• Eric Rodriguez (INEGI)
• Achim Wackerow (DDI Alliance)
• Arofan Gregory (Invited Expert)
– Metadata Technology, Open Data Foundation
• Jeremy Iverson & Dan Smith (Invited Experts)
– Algenta Technologies (developed Colectica)
• Plus…all METIS participants!
Starting points
• Focus
– Primarily DDI-L rather than DDI-C
• Implementations
– Very early
• Statistics New Zealand (SNZ)
• Australian Bureau of Statistics (ABS)
– Project / team established with DDI in scope
• French National Institute for Statistics and Economic Studies
(INSEE)
– Considering
• Many, including Statistics Sweden, Statistics Norway, Statistics Canada, ONS
Q1: To what extent is DDI really implemented
within the statistical production process?
• Early days for NSIs in terms of DDI-L
• SNZ have used DDI-C for Archiving since 2006
– Data & metadata disseminated to statistical output areas &
researchers via microdata access facilities
• Positive experience in this regard led to decision to
pursue “all of lifecycle” rather than “end of lifecycle”
approach via DDI-L
– Replace & enhance existing metadata management
processes
• RFT (Request for Tender) process (completed 2011) included definition of business needs & strategy, testing of the market, etc.
continued (1.2)
• ABS
– REEM (Remote Execution Environment for Microdata) in production use on a restricted basis
• Microdata described using DDI-L for input to environment
– Aggregate tabulations can be returned using SDMX (see the sketch after this slide)
• Next phase of development (analytical capabilities beyond
simple tabulation) is underway
– Proof of Concept (PoC) stage of Metadata Registry /
Repository (MRR) development implemented
elements of DDI model (delivered June 2011)
– Extensive mapping between DDI-L and ABS
Questionnaire Development Tool
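The REEM boundary just described can be caricatured in a short sketch: microdata columns identified by DDI-L variable references go in, and aggregate observations shaped for an SDMX data message come out. The variable URNs, record layout and output keys here are all hypothetical; real REEM requests and SDMX data messages are far richer.

    # Caricature of the REEM boundary: column meanings are defined by
    # (hypothetical) DDI-L variable URNs; the output is one observation
    # per key combination, shaped as dimensions plus OBS_VALUE, ready
    # to serialise as an SDMX data message. Purely illustrative.
    from collections import Counter

    columns = {"urn:example:var:REGION": 0, "urn:example:var:SEX": 1}
    rows = [("North", "F"), ("North", "M"), ("South", "F")]

    def tabulate(by_vars):
        idx = [columns[v] for v in by_vars]
        counts = Counter(tuple(row[i] for i in idx) for row in rows)
        return [dict(zip(by_vars, key), OBS_VALUE=n) for key, n in counts.items()]

    for obs in tabulate(["urn:example:var:REGION", "urn:example:var:SEX"]):
        print(obs)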
continued (1.3)
• Question may also warrant reference to
implementation of DDI-L to support statistical
production by organisations other than NSIs
– Several are further along the path than NSIs, eg
• University of Michigan Survey Research Center
– Several applications for different phases of life cycle
• Canadian RDC Network
Q2: What are the DDI-based tools used in practice by the statistical offices (Colectica, others...)?
• SNZ plan to implement Colectica next year to
support documentation & data management
processes
– Work underway to extract content from existing systems as DDI-L (eg household survey platform) via custom development (see the sketch after this slide)
• Investigating the Stat/Transfer application, which now supports DDI-L
• Interested in DDI/Blaise interoperability
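The custom extraction mentioned above might look like the following minimal sketch, assuming a plain dict as a stand-in for the existing household survey platform. The element names follow the general DDI 3.x flavour but are illustrative and have not been validated against the published DDI-L schemas.

    import xml.etree.ElementTree as ET

    # Stand-in for an existing system's variable metadata
    legacy_store = {
        "AGE": {"label": "Age of respondent", "codelist": None},
        "SEX": {"label": "Sex of respondent", "codelist": "CL_SEX"},
    }

    def to_ddi_fragment(store):
        # Element names are DDI 3.x-flavoured but not schema-validated
        scheme = ET.Element("VariableScheme")
        for name, meta in store.items():
            var = ET.SubElement(scheme, "Variable")
            ET.SubElement(var, "VariableName").text = name
            ET.SubElement(var, "Label").text = meta["label"]
            if meta["codelist"]:
                rep = ET.SubElement(var, "CodeRepresentation")
                ET.SubElement(rep, "CodeListReference").text = meta["codelist"]
        return ET.tostring(scheme, encoding="unicode")

    print(to_ddi_fragment(legacy_store))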
continued (2.2)
• ABS
– REEM
• Developed by ABS in partnership with vendor
– MRR PoC
• Developed by ABS harnessing design expertise from
consultants
– Evaluating Colectica
– Customised utilities/applications
• eg extract data and metadata for REEM, in accordance with the DDI-L specification, from existing repositories
Q3: Are there applications or tools which
communicate with DDI (Blaise, others...)?
[Diagram (v0.2, requires further quality assurance): applications and tools that communicate with DDI]
continued (3.2)
• Plus
– MQDS (Michigan Questionnaire Documentation
System)
• extract comprehensive metadata from Blaise survey
instruments & render as DDI
– Assorted applications internal to agencies that
developed them
– OpenDDI (beta)
• global catalog of DDI-documented surveys
• Inclusion of support for DDI-L in Stat/Transfer is seen by other vendors as a signal of the standard's ongoing prominence
Q4a: Is there a repository of variables, questions?
• DDI Alliance has a major focus on re-use of metadata
– a cornerstone for the design of DDI-L
– practical work underway on business practices & processes
(eg via case studies) to support re-use
• Technical design of standard & availability of repositories &
applications are “necessary but not sufficient” to achieve re-use in
practice
• Case studies will influence further technical support
• Colectica Repository can be used for this role
– eg, SNZ's current reference metadata library to be replaced by Colectica
• Banks of variables, questions, etc. have been built by various agencies
Q4b: To what extent is a variable repository restrictive for the user? For example, if the user takes a variable name from the repository, are they obliged to use the code list associated with it in the repository?
• Very modular model to support reuse
• Strong support for relating different objects
– eg Variable X is the same as Variable Y in terms of concept,
universe and response categories but the codes used to
denote the response categories are different
• Applications working with DDI-L can harness this modularity (see the sketch after this list), eg
– User: I need a variable identical to that one, except I need different codes
– System: New Variable created which
• reuses the majority of "building blocks" from the Existing Variable
• has an explicitly defined & recorded relationship to the Existing Variable
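A minimal sketch of that user/system exchange, in plain Python rather than DDI-L XML and with all URNs hypothetical: the new variable reuses the existing variable's concept, universe and category references by identity, substitutes a different code list, and explicitly records the relationship back to the original.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class Variable:
        urn: str
        concept_ref: str          # reusable building block
        universe_ref: str         # reusable building block
        category_scheme_ref: str  # reusable building block
        code_list_ref: str        # the one part that differs
        based_on: Optional[str] = None  # explicit, recorded relationship

    existing = Variable(
        urn="urn:example:var:SEX:1",
        concept_ref="urn:example:concept:sex",
        universe_ref="urn:example:universe:persons",
        category_scheme_ref="urn:example:cats:sex",
        code_list_ref="urn:example:codes:sex:numeric",  # codes 1 / 2
    )

    # "Identical to that one, except with different codes"
    new = Variable(
        urn="urn:example:var:SEX:2",
        concept_ref=existing.concept_ref,
        universe_ref=existing.universe_ref,
        category_scheme_ref=existing.category_scheme_ref,
        code_list_ref="urn:example:codes:sex:alpha",  # codes M / F
        based_on=existing.urn,
    )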
Q5: What is the strategy to implement statistical standards with a view to covering the life cycle of statistical operations from an end-to-end perspective?
• SNZ
– implement improved metadata management in a
staged manner across the end-to-end lifecycle
• Full support for DDI across the whole statistical business process is not planned to be complete until at least 2020.
– have focused on the documentary needs of the
organisation
• ensure reference metadata available to identify the
studies SNZ undertake and to provide staff with key
contextual information.
continued (5.2)
• now expanding focus to include data-centric
needs
– including management of classifications and
variables with DDI.
• Strategy is to begin applying DDI to a wide range of information objects across the statistical business process, in order to better understand needs and the coverage of DDI.
continued (5.3)
• In the next 12 months plan to start using DDI to describe
Concepts, Variables, Classifications, Questions.
• Ten year roadmap for metadata indicates systems
implementations of classification, question and variable
libraries, building upon initial implementation, to expand
and enhance functionality and meet long term needs.
• It is intended that DDI-L will be the primary metadata standard within SNZ
– Supplemented by others where required
• (e.g. SDMX codelists for classifications)
– Externally use a mix of DDI and SDMX depending on which is
best suited to a particular use case.
continued (5.4)
• ABS similar to SNZ
– Aiming to achieve transformation by 2017
– Relatively greater early emphasis on “machine
actionable” aspects for metadata driven processes
– Planning to apply DDI and SDMX internally in
accordance with “industry standard” practice (once
that emerges)
• Emphasis on facilitating, in practice, international
collaboration & sharing regarding new methods and IT
components
• Currently expect GSIM (Generic Statistical Information
Model) to be operationalised via DDI & SDMX in future.
• ABS Transitional Model in meantime spans DDI & SDMX
Q5a: Is the strategy to implement only DDI while awaiting the results of the SDMX/DDI dialogue in progress?
• No
• SNZ is using SDMX as part of its dissemination platform and is now starting to implement DDI across the business process.
• During the next 12 months, will map the metadata captured throughout the statistical business process against dissemination metadata needs.
– Data dissemination uses an SDMX-based system (OECD.stat)
– By default will map DDI to SDMX for a particular use case (a minimal mapping sketch follows this slide).
– Within 18 months, expect metadata flows between DDI- and SDMX-based systems regardless of the outcome of the SDMX/DDI dialogue.
• ABS : Similar
– Recognise a range of community benefits if agencies flow
metadata between DDI & SDMX on a consistent basis
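The default DDI-to-SDMX mapping mentioned above is easiest to picture with code lists, which both standards support. The structures below are simplified stand-ins, not the actual DDI or SDMX information models; the agency parameter reflects the fact that SDMX structural artefacts are agency-owned.

    # Simplified stand-ins for DDI and SDMX code list structures
    ddi_code_scheme = {
        "id": "CL_SEX",
        "codes": [
            {"value": "1", "category": "Male"},
            {"value": "2", "category": "Female"},
        ],
    }

    def ddi_to_sdmx_codelist(scheme, agency="EXAMPLE"):
        # SDMX structural artefacts are identified by agency and id
        return {
            "agencyID": agency,
            "id": scheme["id"],
            "codes": {c["value"]: c["category"] for c in scheme["codes"]},
        }

    print(ddi_to_sdmx_codelist(ddi_code_scheme))

One design point the sketch glosses over: SDMX codelists are versioned, agency-owned artefacts, so a production mapping would also need to decide how DDI versioning maps onto SDMX versioning.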
Q5b: Is the strategy to implement DDI and SDMX from the very beginning, with the objective of covering the life cycle of statistical operations end to end?
• Yes
Q5c: Other strategy?
• 5b is the key strategy
Q6: What are the perspectives in terms of versioning for DDI: frequency of important changes?
• Good balance between
– responsiveness to identified additional requirements and bug
fixes, and
– stability
• Similarly to SDMX
– when there is a new release agencies do not need to upgrade
unless & until they have a business driver
– emphasis on testing, backwards compatibility
• Advantages
– Dual product line (C vs L) helps limit conflict of interest between
users with simple needs & with advanced needs
– DDI's 3.0, 3.1, 3.2 release cadence was more responsive than the gap between SDMX 2.0 and 2.1
– New release framework is even more responsive
Q7: Which institutes have to deliver metadata in DDI 2? DDI 3? Which institutes have to deliver metadata to data archives working with DDI 2? DDI 3?
• SNZ does not have any requirement to
disseminate data and metadata in DDI
– Aim to encourage other government agencies to
use DDI-L where possible to describe their data.
• DDI Usage Map
• SDMX/DDI combined usage map

                    SDMX only   DDI only   SDMX and DDI   Total
Bank                      173          0              1     174
NSI                        33         55              8      96
University                  0         43              2      45
Government Agency           2          9              0      11
Archive                     0         16              0      16
Other                       4          3              1       8
TOTAL                     212        126             12     350
Q8: Is there an implementation of metadata using Blaise and DDI?
• SNZ
– No use of Blaise and DDI together.
– Intention is that Blaise instruments should be able to be generated from DDI-based metadata repositories (a sketch follows this slide).
• Cf.
– Q3
– Colectica
– MQDS
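A rough sketch of how that intended generation could look: question metadata (a plain list standing in for a DDI-based repository) rendered as a skeletal Blaise datamodel. The emitted Blaise syntax is an approximation for illustration only, not production instrument code.

    # Render question metadata as a skeletal Blaise datamodel.
    # The emitted syntax approximates Blaise for illustration only.
    questions = [
        {"name": "Age", "text": "What is your age?", "blaise_type": "0..120"},
        {"name": "Sex", "text": "What is your sex?", "blaise_type": "(Male, Female)"},
    ]

    def to_blaise(model_name, items):
        lines = [f"DATAMODEL {model_name}", "FIELDS"]
        for q in items:
            lines.append(f'  {q["name"]} "{q["text"]}" : {q["blaise_type"]}')
        lines.append("ENDMODEL")
        return "\n".join(lines)

    print(to_blaise("ExampleSurvey", questions))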
Q9 (new): Are there any evaluations of DDI at statistical offices that show how well DDI addresses their statistical metadata needs?
• SNZ
– Didn’t see question in advance
– May have more information than ABS (eg to support RFT)
• ABS
– High-level information written for an ABS-specific audience (jargon!)
– Detailed examination in particular areas (eg questionnaires, microdata record relationships) was positive, sometimes requiring minor extensions
• Excellent idea to share general information on evaluations.
– Aim for well-balanced factual information
• not written specifically to advocate a business case
– Seek examples from other NSIs & other agencies