Open GSBPM compliant data processing system in Statistics Estonia (VAIS)
2011 MSIS Conference
Maia Ennok
Head of Data Warehouse Service
Data Processing Systems Department
Statistics Estonia
23 May 2011
Strategy of Statistics Estonia
2008–2011
“From data collector to information service provider”
Objective: High-quality information service
Standardise the process of data processing:
Indicator: introduction of unified data processing software
Development and introduction of a universal data processing information system
25.05.2016
Architecture of the information system
[Diagram. Stages: Metadata system, Data collection, Processing, Statistical analysis, Dissemination. Components: iMETA, KUNDE, eSTAT, VVIS, ADAM, VAIS, SRS, Data Warehouse, PX-Web, eGeostat, Census-HUB. Data sources: Economic entities, Persons, Administrative registers, Statistical registers. Consumers: Users.]
Data processing system (VAIS)
VAIS is a collection of tools and technologies aimed at automating data processing (Phase 5 in GSBPM).
In essence, the task of checking, cleaning and transforming statistical activity data amounts to taking the raw data from one or more sources and transforming it into the input database structures of the analytical system (the observation registry).
Framework for …
Integrate data
Classify & code
Review, validate and edit
Impute
Derive new variables & statistical units
Calculate weights
Calculate aggregates
Finalize data files
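The sub-processes listed above can be read as an ordered pipeline in which each step takes a dataset and returns a new one. The following is a minimal Python sketch of that idea only; the step functions, the field names (`unit_id`, `turnover`, `employees`) and the list-of-dicts record format are assumptions made for illustration and are not taken from VAIS.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def integrate(records: List[Record]) -> List[Record]:
    """Hypothetical 'Integrate data' step: merge records sharing a unit_id."""
    merged: Dict[object, Record] = {}
    for rec in records:
        merged.setdefault(rec["unit_id"], {}).update(rec)
    return list(merged.values())


def impute(records: List[Record]) -> List[Record]:
    """Hypothetical 'Impute' step: replace missing turnover with the observed mean."""
    observed = [r["turnover"] for r in records if r.get("turnover") is not None]
    mean = sum(observed) / len(observed) if observed else 0.0
    return [
        {**r, "turnover": r["turnover"] if r.get("turnover") is not None else mean}
        for r in records
    ]


def derive(records: List[Record]) -> List[Record]:
    """Hypothetical 'Derive new variables' step: turnover per employee."""
    return [
        {**r, "turnover_per_employee": r["turnover"] / r["employees"]}
        for r in records
    ]


# The order mirrors the order of the sub-processes in the list above.
PIPELINE: List[Step] = [integrate, impute, derive]


def run_pipeline(records: List[Record], steps: List[Step] = PIPELINE) -> List[Record]:
    """Apply each step in turn, feeding the output of one into the next."""
    for step in steps:
        records = step(records)
    return records
```

Keeping every step behind the same signature is what lets steps be added, removed or reordered without touching the others.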
Metadata driven template based tool
The template-driven approach provides a universal solution for the three main goals of the VAIS project:
Create an easy-to-use statistical data processing tool that requires minimal programming skills for creating transformation packages.
Create a metadata-driven, process-oriented and automated statistical data processing tool.
Create an extendable data transformation tool.
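To illustrate what "metadata driven" and "template based" can mean in practice, the sketch below renders an executable statement from a generic template plus a metadata record, so that a new rule needs only new metadata and no new code. It uses Python's standard `string.Template` purely as a stand-in for a template engine; the template text, table names and rule fields are hypothetical and do not come from VAIS.

```python
from string import Template

# Hypothetical generic template for one validation step.
# Every $placeholder is filled from a metadata record at render time.
VALIDATION_TEMPLATE = Template(
    "INSERT INTO $error_table (unit_id, rule_id) "
    "SELECT unit_id, '$rule_id' FROM $source_table WHERE NOT ($condition)"
)


def render_validation(rule: dict) -> str:
    """Render the template with one rule's metadata.

    Raises KeyError if the metadata record lacks a required field,
    so incomplete metadata is caught before anything is executed.
    """
    return VALIDATION_TEMPLATE.substitute(rule)


# Example metadata record (illustrative values only).
example_rule = {
    "rule_id": "R001",
    "source_table": "raw_turnover",
    "error_table": "validation_errors",
    "condition": "turnover >= 0",
}

print(render_validation(example_rule))
```

The person defining a rule edits only the metadata record; the template is written once and shared by all statistical activities that use the same kind of step.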
Design Phase
[Diagram. Metadata Repository: Data Sources, Validation Rules, Imputation Method, Aggregation Def and Target Dataset for Statistical Activity N. Common XDTL Packages: LOAD DATA, INTEGRATE DATA, VALIDATE, IMPUTE, AGGREGATE. Data processing package (XDTL) for Statistical Activity N.]
Data processing with VAIS
Automating and speeding up data transformation
Raw data, transformation metadata and source data audit trails
Metadata driven template based tool
Balancing automation and manual intervention
VAIS architecture
Balancing automation and manual intervention
[Diagram labels: RAW data; Automated data processing; OK?; Manual data processing; Data Warehouse; Metadata (validation and transformation rules)]
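One possible reading of this diagram is that records passing all automated checks continue to the Data Warehouse, while records failing a check are set aside for manual processing. The Python sketch below shows that routing under this assumption; the `Rule` and `Outcome` types and the example checks are hypothetical and are not part of VAIS.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


@dataclass
class Rule:
    """A validation rule as it might be read from metadata (hypothetical shape)."""
    rule_id: str
    check: Callable[[Record], bool]


@dataclass
class Outcome:
    warehouse: List[Record] = field(default_factory=list)
    # Each manual task pairs a record with the ids of the rules it failed.
    manual_tasks: List[Tuple[Record, List[str]]] = field(default_factory=list)


def process(records: List[Record], rules: List[Rule]) -> Outcome:
    """Route each record: all rules pass -> warehouse, otherwise -> manual task."""
    outcome = Outcome()
    for rec in records:
        failed = [rule.rule_id for rule in rules if not rule.check(rec)]
        if failed:
            outcome.manual_tasks.append((rec, failed))
        else:
            outcome.warehouse.append(rec)
    return outcome
```

Recording which rules failed, rather than a bare pass/fail flag, gives the person handling the task the information needed to decide how to resolve it.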
VAIS applications and roles
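[Role-to-application matrix; the positions of the x marks are not recoverable from this transcript]
Applications: VAIS Designer, VAIS Operator, VAIS Administrator, URMA
Roles: Designer, Data Warehouse programmer, Chief operator, Operator, Administrator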
URMA
User rights management application
Allows existing user accounts to be used for authorization
Allows roles to be created and users to be linked to roles
Allows rights to be set per domain statistical work
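The three capabilities above amount to a role-based permission model scoped by statistical work. The following Python sketch shows one way such a model could be organised; the class name, method names and example values are all hypothetical and say nothing about how URMA is actually implemented.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


class RightsRegistry:
    """Hypothetical in-memory model: users -> roles -> rights per statistical work."""

    def __init__(self) -> None:
        self._user_roles: DefaultDict[str, Set[str]] = defaultdict(set)
        self._role_rights: DefaultDict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def link(self, user: str, role: str) -> None:
        """Link an existing user to a role."""
        self._user_roles[user].add(role)

    def grant(self, role: str, statistical_work: str, action: str) -> None:
        """Give a role the right to perform an action on one statistical work."""
        self._role_rights[role].add((statistical_work, action))

    def is_allowed(self, user: str, statistical_work: str, action: str) -> bool:
        """A user is allowed if any of their roles holds the requested right."""
        return any(
            (statistical_work, action) in self._role_rights[role]
            for role in self._user_roles.get(user, ())
        )
```

A short usage example with made-up names:

```python
registry = RightsRegistry()
registry.link("user_a", "operator")
registry.grant("operator", "census_2011", "solve_tasks")
print(registry.is_allowed("user_a", "census_2011", "solve_tasks"))
```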
VAIS Designer
Application for data processing design
User interfaces for designing each processing procedure
Procedures are grouped into packages
Package setup follows ETL principles
Packages are designed for each version of a statistical work
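The relationship between procedures, packages and statistical work versions can be pictured as a small data model. The Python sketch below is an illustration only: the `Procedure` and `Package` types, the three stage names and the ordering rule are assumptions and do not describe the actual VAIS Designer.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed ETL stage names, in execution order.
STAGES = ("extract", "transform", "load")


@dataclass(frozen=True)
class Procedure:
    name: str
    stage: str  # one of STAGES


@dataclass
class Package:
    """A package belongs to one version of one statistical work."""
    statistical_work: str
    version: str
    procedures: List[Procedure] = field(default_factory=list)

    def add(self, procedure: Procedure) -> None:
        """Add a procedure, rejecting stages outside the ETL model."""
        if procedure.stage not in STAGES:
            raise ValueError(f"unknown stage: {procedure.stage}")
        self.procedures.append(procedure)

    def execution_order(self) -> List[str]:
        """Procedure names ordered by ETL stage.

        sorted() is stable, so procedures within the same stage
        keep the order in which they were added.
        """
        ordered = sorted(self.procedures, key=lambda p: STAGES.index(p.stage))
        return [p.name for p in ordered]
```

Tying a package to a version means a change in the design of a statistical work yields a new package rather than silently altering how earlier data were processed.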
VAIS Operator
Allows the user to intervene manually in data processing.
Allows tasks created by data validation to be solved.
The data processing report gives an overview of the data in process.
Gives users the information they need to decide how to solve tasks.
Technical platform
VAIS is built on open-source, freely available technological components.
XDTL (eXtensible Data Transformation Language, an XML-based descriptive language designed for specifying data transformations, see http://xdtl.org) and its run-time engine (XDTL RT).
MMX Metadata Repository, part of the Metadata Framework (a MOF-compliant metadata management environment designed with a wide variety of metadata-driven applications in mind, see http://mmframework.org).
The Apache Foundation's Velocity (http://velocity.apache.org) is used as the template engine; it combines excellent template rendering functionality with a very easy-to-use template language.
The user applications are programmed in Java, based on the Wicket MVC framework (http://wicket.apache.org).
The Quartz scheduling framework (http://www.quartz-scheduler.org) is used for execution scheduling.
Implementation
VAIS development: 05.2010 to 10.2011
Data processing of the Population and Housing Census 2011 (31.12.2011)
Reuse of administrative data (2012)
Development of the data collection system for administrative data (ADAM) and of eSTAT, so that questionnaires in eSTAT can be prefilled with administrative data (annual bookkeeping report) (31.08.2011). VAIS is used for converting administrative data into the statistical data format (for data collection in 2012, i.e. for reference year 2011).
Data processing of other statistical activities (first pilots 2013)
Data processing of the next register-based Population and Housing Census (pilot 2014)
Questions?
Thank you!