ESSnet on microdata linking and data warehousing
in statistical production
ESS-net DWH
Content

Background ESS-net

Challenges

Explaining the statistical data warehouse (S-DWH)

Elements of the S-DWH
- Business architecture
- GSBPM mapping
- Metadata
- Organisational aspects
ESSnet Partnership
ESS-net coordinator:

Statistics Netherlands (CBS)
Co-partners:

Estonia, Italy, Lithuania, Portugal, Sweden, UK
Starting date:

4 October 2010

SGA 1: first year, till 3 October 2011

SGA 2: last 2 years, till 3 October 2013
General Objectives ESSnet DWH
Provide assistance in:
the development and implementation of a maximally efficient
statistical process for business and trade statistics, independent
of any specific (technical) architecture
Results in daily statistical practice:
- increase the efficiency of data processing in statistical production systems
- maximize the reuse of already collected data

a 'data warehouse' approach to statistics
Start SGA2
Conclusions

Data Warehousing in statistics is ‘hot’

Metadata is considered important,
but is also often neglected!

S-DWH is very difficult to compare with
common commercial DWH

Visiting NSIs has proven very effective for gathering
information AND for sharing knowledge and expertise
- Great need for knowledge & expertise
The Challenges


Decrease of costs & administrative burden
versus
increase of efficiency & flexibility
Rapidly changing demand for information:
- growing need for more information on more topics
- decreasing lifecycle of policymakers, quicker delivery
Disclosure of all kinds of new data sources
Need for integrated production systems
- Make optimal use of all available data sources (existing & new)
The Statistical Data Warehouse
A central data hub to connect and integrate all available data
sources, supporting statistical production AND data collection
processes by providing:
- a detailed and correct overview/insight of all available data sources
- a framework for adequate data governance, including metadata
  management, confidentiality aspects and data authorisation
- flexible data storage and data exchange between processes
- access to registers and sampling frames (BR, etc.);
enabling the NSI to produce the necessary information (= statistics!)
and to (re)use available data to create new data / new outputs.
A central 'statistical data store' for managing all available
data of interest, regardless of its source.
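The idea of (re)using data from several sources could be sketched as follows. This is a minimal illustration only, not the ESSnet's method: the record layouts, the field name br_id (standing for a business register identifier) and the function link_on_backbone are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records; "br_id" stands for a business register identifier.
survey_sample = [
    {"br_id": "B001", "turnover": 120},
    {"br_id": "B002", "turnover": 75},
]
admin_data = [
    {"br_id": "B001", "employees": 10},
    {"br_id": "B003", "employees": 4},
]


def link_on_backbone(*sources):
    """Merge records from several sources by their shared br_id."""
    linked = defaultdict(dict)
    for source in sources:
        for record in source:
            # Later sources add their variables to the same unit.
            linked[record["br_id"]].update(record)
    return dict(linked)


linked = link_on_backbone(survey_sample, admin_data)
```

Here unit B001 ends up with both turnover (from the sample) and employees (from the admin source), so data collected once can feed more than one output.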
[Figure: S-DWH data flow in three stages: input data; storage and combination; outputs. Diagram labels: admin data sources, selected samples, datasets, staging area, working data, microdata, backbones (e.g. BR), backbone (BB) snapshots, rules for updating the BB, rules for generating samples etc., input reference frame, data extracts, aggregate statistics.]
Explaining the S-DWH
A system or set of integrated systems, designed to handle the
processing of statistical data in the production of statistics,
comprising:

technical facilities for storing and processing data,
receiving data in and producing outputs in a flexible way

rules for updating the sources for the DWH

definitions necessary to achieve those samples / sources

The S-DWH is a concept that provides an architectural
model of the statistical data flow,
from data collection to statistical output
The S-DWH Business Architecture
- Conceptualisation of how to build up an S-DWH
- A common model for the total statistical process and data flow
- Provides optimal organisation of all structured data, enabling re-use, creation of new data etc.
- 4 layers, covering all statistical activities:
‒ Sources
‒ Integration
‒ Interpretation & Analysis
‒ Data Access / Output
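The four layers can be sketched as an ordered structure through which a dataset moves. This is a minimal sketch, assuming a strictly linear progression from sources to output; the names Layer, Dataset, layers_from and promote are hypothetical and not part of the S-DWH specification.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    """The four S-DWH layers, in the order listed on the slide."""
    SOURCE = 1
    INTEGRATION = 2
    INTERPRETATION_ANALYSIS = 3
    DATA_ACCESS = 4


def layers_from(start):
    """Return the given layer and all layers that follow it."""
    members = list(Layer)
    return members[members.index(start):]


@dataclass
class Dataset:
    name: str
    layer: Layer = Layer.SOURCE

    def promote(self):
        """Move the dataset to the next layer (assumed linear flow)."""
        remaining = layers_from(self.layer)
        if len(remaining) < 2:
            raise ValueError(f"{self.name} is already in the last layer")
        self.layer = remaining[1]


survey = Dataset("hypothetical_survey")
survey.promote()  # SOURCE -> INTEGRATION
```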
The layered architecture of the S-DWH, with focus on the data sources used in each layer
Specific for S-DWH
Mapping the S-DWH on the GSBPM
Use the GSBPM as common language to identify and locate
the various phases on the 4 S-DWH layers
Managing the S-DWH
The S-DWH is a logically coherent central data store,
not necessarily one single physical unit.
Metadata is vital in the governance, satisfying 2 essential needs:

to guide statisticians in processing and controlling
the statistical data

to inform users by giving insight into the exact meaning
of the statistical data
The vertical metadata layer makes it possible to search all (meta)data in
the 4 layers and, if permitted, gives access to the data.
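Searching across layers with a permission check could look roughly like the sketch below. It is an illustration under stated assumptions: the catalogue entries, the role names and the search function are hypothetical, and a real S-DWH would rely on its own metadata model and authorisation rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    name: str
    layer: str
    description: str
    allowed_roles: frozenset  # hypothetical role-based authorisation


# Hypothetical metadata catalogue spanning several layers.
CATALOGUE = [
    CatalogueEntry("vat_declarations", "source",
                   "Raw VAT admin data", frozenset({"statistician"})),
    CatalogueEntry("turnover_index", "data access",
                   "Published turnover index",
                   frozenset({"statistician", "public"})),
]


def search(term, role):
    """Return (entry, accessible) pairs for entries matching the term.

    Every matching entry is listed, so its metadata can be found;
    the flag says whether this role may also access the data itself.
    """
    term = term.lower()
    hits = []
    for entry in CATALOGUE:
        if term in entry.name.lower() or term in entry.description.lower():
            hits.append((entry, role in entry.allowed_roles))
    return hits


public_hits = search("turnover", "public")
restricted_hits = search("vat", "public")
```

The design choice here is to separate finding metadata from accessing data: a user can discover that a dataset exists even when access to it is not permitted.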
Metadata layer
[Figure: the vertical Metadata Layer alongside the four S-DWH layers: Data Access Layer, Interpretation and Data Analysis Layer, Integration Layer, Source Layer]
Metadata - the DNA of the S-DWH
Framework:

General metadata definitions

Metadata for the S-DWH

Use of metadata models

Metadata standards & norms

Metadata quality & governance

Categories & subsets

Minimum requirements
S-DWH metadata requirements
Subsets
Standards & Norms
ISO 11179
Internal rules
Guidelines
Metadata model
S-DWH Gatekeeper
Centre of knowledge & expertise
Defining and implementing a business model:
- Organisational aspects
  - Experts from partners and other ESS members
  - Research on current topics
  - Seminar / workshop
- Financial aspects covered
- Roll-out to more fields of expertise
Organisational aspects
Implementation of an S-DWH has a huge organisational impact:
- It means: moving from single operations to integrated, generic processes
- It needs: a redesign of the statistical process
- It requires: new IT systems, tools, high investments
- It is: a new way of working

Only changing systems will not do the trick,
changing people is the key to success
ESSnet on data warehousing
Thank you!