Statistics New Zealand’s End-to-End Metadata Life-Cycle: “Creating a New Business Model for a National Statistical Office of the 21st Century”. Gary Dunnet, Manager, Business Solutions.


Statistics New Zealand’s End-to-End
Metadata Life-Cycle
“Creating a New Business Model for a National Statistical Office of the 21st Century”
Gary Dunnet
Manager, Business Solutions
[email protected]
BmTS Scope
1. A number of standard, generic end-to-end processes for the collection, analysis and dissemination of statistical data and information
 Includes statistical methods
 Covers the business process life-cycle
 Enables statisticians to focus on data quality, with best-practice methods implemented, greater coordination and effective resource utilisation
2. A disciplined approach to data and metadata management, using a standard information life-cycle
3. An agreed enterprise-wide technical architecture
BmTS Success Criteria - Financial
• A reduction of 10–20% in the operating cost to produce a statistical output (for outputs currently operating on a separate subject-matter system) after moving to the new business model
• A reduction of 50% in the investment (of time and money) required to implement the end-to-end processes and systems for a new statistical output
Generic Business Process Model

From: Need → Design/Build → Collect → Process → Analyse → Disseminate
To: Need → Design/Build → Collect → Process → Analyse → Disseminate
1 Need
1.1 Determine information requirement
1.2 Determine and confirm need
1.3 Develop budget and plan
1.4 Obtain financial support

2 Design
2.1 Develop detailed project plan
2.2 Develop survey methodology
2.3 Questionnaire design and testing
2.4 Design operational requirements
2.5 Design computer system
2.6 Obtain ministerial approval

3 Build
3.1 Build collection vehicle
3.2 Build technology solution
3.3 Test technology solution
3.4 Implement solution

4 Collect
4.1 Identify postout population and data services
4.2 Manage respondents
4.3 Post out
4.4 Acquire data
4.5 Close off collection

5 Process
5.1 Capture data into electronic form
5.2 Perform macro editing
5.3 Run imputations/estimations
5.4 Produce clean dataset

6 Analyse
6.1 Examine source data
6.2 Produce Statistical Results
6.3 Validate Statistical Results
6.4 Interpret Statistical Results
6.5 Prepare Content for Dissemination
6.6 Conduct Quality Control

7 Disseminate
7.1 Receive and validate draft content
7.2 Manage and load dissemination repositories
7.3 Prepare pre-release for publishing
7.4 Manage first release
7.5 Handle customer enquiries
[Architecture diagram. Components: 1. Input Data Store; 2. Output Data Store; 3. Metadata Store; 4. Analytical Environment; 5. Information Portal; 6. Transformations; 7. Respondent Management; 8. Customer Management; 9. Reference Data Stores; 10. Workflow. Multi-modal collection channels: E-Form, CAI, Imaging, Admin. Data. Data states: raw data, clean data, aggregate data, ‘UR’ data, summary data. Output channels: Web, RADL, INFOS, CURFS. Also shown: Official Statistics System & Data Archive; Statistical Process Knowledge Base.]
Existing Metadata Issues
• metadata is not kept up to date
• metadata maintenance is considered a low priority
• metadata is not held in a consistent way
• relevant information is unavailable
• there is confusion about what metadata needs to be stored
• the existing metadata infrastructure is under-utilised
• there is a failure to meet the metadata needs of advanced data users
• it is difficult to find information unless you have some expertise or know it exists
• there is inconsistent use of classifications/terminology
• in some instances there is little information about data: where it came from, the processes it has been through, or even the question to which it relates
Target Metadata Principles
• metadata is centrally accessible
• metadata structure should be strongly linked to data
• metadata is shared between data sets
• content structure conforms to standards
• metadata is managed from end to end in the data life-cycle
• there is a registration process (workflow) associated with each metadata element
• capture metadata at source, automatically
• ensure the cost to producers is justified by the benefit to users
• metadata is considered active
• metadata is managed at as high a level as possible
• metadata is readily available and useable in the context of the client's information needs (internal or external)
• track the use of some types of metadata (e.g. classifications)
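Two of the principles above, capture at source and a per-element registration workflow, can be sketched in code. This is a minimal illustration only; the class and method names (`MetadataRegistry`, `capture`, `register`) are assumptions, not Statistics NZ APIs:

```python
# Minimal sketch: metadata captured automatically at source, with a
# simple registration workflow per element. All names are illustrative.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class MetadataElement:
    name: str
    value: object
    source: str                          # captured at source: record the producing system
    status: str = "draft"                # registration workflow state
    registered_at: Optional[datetime] = None

class MetadataRegistry:
    def __init__(self):
        self.elements: dict = {}

    def capture(self, name, value, source):
        """Capture at source; every element starts life as a 'draft'."""
        self.elements[name] = MetadataElement(name, value, source)

    def register(self, name):
        """Registration workflow: promote a draft element to 'registered'."""
        elem = self.elements[name]
        elem.status = "registered"
        elem.registered_at = datetime.now(timezone.utc)
        return elem

registry = MetadataRegistry()
registry.capture("response_rate", 0.82, source="collection-system")
elem = registry.register("response_rate")
```

In a real system the workflow would have more states (e.g. review, approval) and the capture calls would be emitted by the collection and processing systems themselves rather than written by hand.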
Metadata Logical Model
[Diagram. Layers/components: Search and Discovery; Metadata and Data Access; Passive metadata store/s; Business Logic; Classification Management; Question Library; Data; Data Definition; Frames/Reference Stores; Schema: SDMX.]
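A rough sketch of the logical model's key idea, a passive store of SDMX-style artefacts (data definitions, classifications, questions) that data sets link to rather than copy, might look as follows. The class and field names are assumptions for illustration, not the SDMX information model itself:

```python
# Illustrative sketch of the logical model: a passive metadata store
# holding shared artefacts, with links (ids) instead of embedded copies,
# and a Search and Discovery function over the top.
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:                  # a data definition
    id: str
    label: str

@dataclass(frozen=True)
class Classification:           # owned by Classification Management
    id: str
    categories: tuple

@dataclass(frozen=True)
class Question:                 # held in the Question Library
    id: str
    text: str
    concept_id: str             # a link, not a copy: metadata is shared

STORE = {
    "concepts": {"EMP": Concept("EMP", "Employment status")},
    "classifications": {
        "ILO-EMP": Classification(
            "ILO-EMP", ("employed", "unemployed", "not in labour force"))},
    "questions": {"Q1": Question("Q1", "Are you employed?", concept_id="EMP")},
}

def discover(term):
    """Search and Discovery: find questions whose concept label matches a term."""
    return [q for q in STORE["questions"].values()
            if term.lower() in STORE["concepts"][q.concept_id].label.lower()]
```

Because questions carry only a `concept_id`, updating a concept definition in one place updates it for every data set that links to it, which is the point of the passive, centrally accessible store.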
Metadata: End-to-End

 Need
– capture requirements, e.g. usage of data, quality requirements
– access existing data element concept definitions to clarify requirements

 Design
– capture constraints and basic dissemination plans, e.g. products
– capture design parameters that could be used to drive automated processes, e.g. stratification
– capture descriptive metadata about the collection (methodologies used)
– reuse or create required data definitions, questions, classifications

 Build
– capture operational metadata about the selection process, e.g. number in each stratum
– access design metadata to drive the selection process

 Collect
– capture metadata about the process
– access procedural metadata about rules used to drive processes
– capture metadata, e.g. quality metrics
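The capture/access pattern running through these phases can be sketched as a single store keyed by phase, where one phase captures metadata and a later phase reads it back to drive its processes. The function names here are illustrative assumptions:

```python
# Sketch of the end-to-end flow: each phase captures metadata and can
# access metadata captured by earlier phases. Illustrative only.
from collections import defaultdict

LIFECYCLE = defaultdict(list)

def capture(phase, kind, payload):
    """Record a metadata item against a life-cycle phase."""
    LIFECYCLE[phase].append({"kind": kind, "payload": payload})

def access(phase, kind):
    """Read back metadata of one kind captured during a phase."""
    return [m["payload"] for m in LIFECYCLE[phase] if m["kind"] == kind]

# Design captures a design parameter...
capture("Design", "design-parameter", {"stratification": "by region"})
# ...and Build later accesses it to drive the selection process.
params = access("Design", "design-parameter")
```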
Metadata: End-to-End (2)

 Process
– capture metadata about the operation of processes
– access procedural metadata, e.g. edit parameters
– create and/or reuse derivation definitions and imputation parameters

 Analyse
– capture metadata, e.g. quality measures
– access design parameters to drive estimation processes
– capture information about quality assurance and sign-off of products
– access definitional metadata to be used in the creation of products

 Disseminate
– capture operational metadata
– access procedural metadata about customers
– needed to support Search, Acquire, Analyse (incl. integrate), Report
– capture re-use requirements, including the importance of data (fitness for purpose)
– archive or destruction: detail on the length of the data life-cycle
Metadata: End-to-End - Worked Example
Question Text: “Are you employed?”

 Need
– Concept discussed with users
– Check international standards
– Assess existing collections & questions

 Design
– Design question text, answers & methodologies
– Align with output variables (e.g. ILO classifications)
– Data model, supported through meta-model
– Develop Business Process Model – process & data/metadata flows

 Build
– Concept Library – questions, answers & methods
– ‘Plug & Play’ methods, with parameters (metadata) the key
– System of linkages (no hard-coding)
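The ‘Plug & Play’ and no-hard-coding points above can be sketched as a method registry plus a linkage table: generic methods are registered by name, and metadata selects and parameterises them at run time. The method and collection names below (`mean_imputation`, `HLFS`) are hypothetical examples:

```python
# Minimal sketch of 'Plug & Play' methods driven by metadata: a linkage
# table names the method and its parameters, so no call is hard-coded.
# Method and collection names are illustrative assumptions.
METHODS = {}

def method(name):
    """Decorator registering a generic statistical method by name."""
    def deco(fn):
        METHODS[name] = fn
        return fn
    return deco

@method("mean_imputation")
def mean_imputation(values, **params):
    """Replace missing values (None) with the mean of observed values."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]

@method("ratio_estimation")
def ratio_estimation(values, benchmark=1.0, **params):
    """Scale values to an external benchmark."""
    return [v * benchmark for v in values]

# Linkage metadata: which method a collection uses, and with what parameters.
LINKAGES = {"HLFS": {"method": "mean_imputation", "params": {}}}

def run(collection, values):
    """Look up the linkage and dispatch: the metadata is the key."""
    link = LINKAGES[collection]
    return METHODS[link["method"]](values, **link["params"])
```

Swapping a collection onto a different method is then a metadata change (one entry in `LINKAGES`), not a code change.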
Metadata: End-to-End - Worked Example
Question Text: “Are you employed?”

 Collect
– Question, answers & methods rendered to questionnaire
– Deliver the question to respondents
– Confirm quality of concept

 Process
– Draw questions, answers & methods from meta-store
– Business logic drawn from ‘rules engine’

 Analyse
– Deliver question text, answers & methods to analyst
– Search & discover data, through metadata
– Access knowledge-base (metadata)

 Disseminate
– Deliver question text, answers & methods to user
– Archive question text, answers & methods
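The "business logic drawn from a ‘rules engine’" step in Process can be sketched as edit rules held in a store (in practice a spreadsheet or workflow engine, per the next slide) and applied generically to records. The rule ids, fields, and checks below are assumptions for illustration:

```python
# Hedged sketch of a rules engine for editing: rules live in data, not
# in code, and one generic function applies whatever rules are loaded.
# Rule contents are illustrative assumptions.
RULES = [
    {"id": "E1", "field": "age",
     "check": lambda v: v is not None and 0 <= v <= 120,
     "message": "age out of range"},
    {"id": "E2", "field": "employed",
     "check": lambda v: v in ("yes", "no"),
     "message": "invalid employment response"},
]

def apply_rules(record):
    """Return (rule id, message) for every edit rule the record fails."""
    return [(r["id"], r["message"]) for r in RULES
            if not r["check"](record.get(r["field"]))]
```

Because the rules are data, subject-matter staff can change edit parameters without touching the processing system, which is the point of drawing business logic from the engine.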
Metadata: Recent Practical Experiences

 Generic data model – federated cluster design
– Metadata the key
– Corporately agreed dimensions
– Data is integrateable, rather than integrated

 Blaise to Input Data Environment
– Exporting Blaise metadata

 ‘Rules Engine’
– Based around a spreadsheet
– Working with a workflow engine to improve (BPM-based)

 Audience Model
– Public, professional, technical – added system
Questions?