Seminar on Statistical Data Collection (Geneva, 25-27 September 2013) The Business Statistical Portal: a new way of organising and managing data collection process for business.


Seminar on Statistical Data Collection
(Geneva, 25-27 September 2013)
The Business Statistical Portal:
a new way of
organising and managing data collection process
for business surveys in ISTAT
ISTAT : Renato N. Fazio, Manuela Murgia, A. Nunnari
Stefania Macchia
Istat is changing the way it collects data, in order to:
• Reduce respondent burden
• Improve efficiency
• Maintain the level of data quality
... within a climate of general budget cuts
The STAT2015 project
envisages the gradual standardisation and industrialisation of the entire
cycle of ISTAT statistical processes, according to a model based on a
metadata-driven, service-oriented architecture.
In this context:
• a study has been carried out to define a comprehensive set of
requirements by which to select, design or further develop
the IT tools supporting data collection;
• a large project is ongoing: the ISTAT Business Statistical Portal, a
single entry point for Web-based data collection from enterprises.
Generalisation
A system is generalised if it dynamically manages changes in the
environment through parameterisation.
It must:
• be able to implement any kind of mode-specific questionnaire (CAPI,
CATI, CAWI, etc.),
• be able to adapt surveys to the specific features of different targets
(households/individuals, enterprises and institutions),
• work properly on the widest possible range of software and hardware.
It should also comply with some quality constraints:
usability, flexibility, modularity, reduction of response burden,
standardisation of data and metadata representation or compliance with
recognised standards, process integration (as opposed to stovepipe
organisation), independence from proprietary systems and maximisation of
“degrees of freedom” for statisticians using the IT tools (no need of IT
Requirements have been defined according to 4 areas:
• Survey unit management
• Data collection management
• Communication facilities
• Electronic questionnaire
Survey unit management area
Tools for managing the information required to contact the
respondents and to logically represent them as users in the system:
name, address, phone, email, username and password, etc.
The IT tool has to be able to:
• collect and manage master data (i.e. deleting, inserting and updating
unit records). It should allow lists of survey units and of all bodies
involved in the data collection phase (chambers of commerce, regional
statistical offices, local supervisors, etc.) to be loaded into the survey database. It should also
provide survey managers with functions to update or modify the uploaded
lists at any point of a survey wave.
• create and manage authentication procedures through user IDs and
passwords. It should provide tools for the management of user credentials in
order to guarantee proper user identification.
• integrate survey data with internal administrative or base registers
storing master data. In order to reduce response burden, it is necessary
to ensure data integration between the collection system and any base
or administrative registers managed inside the NSI and used as an input
or output source of the data collection process.
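The survey unit management functions listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `SurveyUnitRegistry`, its methods and the field names are hypothetical and do not describe the ISTAT system, and PBKDF2 from the standard library is used purely as an example of credential handling.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class SurveyUnit:
    """Master data and credentials of one survey unit (hypothetical fields)."""
    unit_id: str
    name: str
    email: str
    salt: bytes = b""
    password_hash: bytes = b""


def hash_password(password: str, salt: bytes) -> bytes:
    # Salted PBKDF2 hash; the iteration count is an arbitrary example value.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class SurveyUnitRegistry:
    def __init__(self):
        self._units = {}

    def load(self, rows):
        # Load a list of survey units (insert, or overwrite on the same id).
        for row in rows:
            self._units[row["unit_id"]] = SurveyUnit(**row)

    def get(self, unit_id):
        return self._units[unit_id]

    def update(self, unit_id, **changes):
        # Update master data at any point of a survey wave.
        unit = self._units[unit_id]
        for key, value in changes.items():
            if key not in ("name", "email"):
                raise KeyError(f"unknown master-data field: {key}")
            setattr(unit, key, value)

    def delete(self, unit_id):
        del self._units[unit_id]

    def sync_from_register(self, register):
        # Overwrite master data with values held in a base register.
        for unit_id, master in register.items():
            if unit_id in self._units:
                self.update(unit_id, **master)

    def set_password(self, unit_id, password):
        unit = self._units[unit_id]
        unit.salt = os.urandom(16)
        unit.password_hash = hash_password(password, unit.salt)

    def authenticate(self, unit_id, password):
        unit = self._units.get(unit_id)
        if unit is None or not unit.password_hash:
            return False
        candidate = hash_password(password, unit.salt)
        return hmac.compare_digest(unit.password_hash, candidate)


registry = SurveyUnitRegistry()
registry.load([
    {"unit_id": "U001", "name": "Alfa Srl", "email": "info@alfa.example"},
    {"unit_id": "U002", "name": "Beta SpA", "email": "info@beta.example"},
])
registry.sync_from_register({"U001": {"name": "Alfa S.r.l."}})
registry.set_password("U001", "correct horse")
registry.delete("U002")
print(registry.authenticate("U001", "correct horse"))  # True
print(registry.authenticate("U001", "wrong"))          # False
```

Keeping register synchronisation as a separate method reflects the requirement that master data come from registers rather than from the respondent.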
Data collection management area
Real-time administrative tools for conducting and monitoring the
data collection process: management of user grants, first validation
tools, questionnaire tracking systems, reporting tools
The IT tool has to be able to:
• track and manage the questionnaire completion and processing status
(not answered, draft, completed, completed and validated, completed
but not yet validated, etc.);
• access and manage relevant process data such as
registrations, accesses to the system, etc.;
• manage sub-sets of users and their assigned roles (administrator,
supervisor, etc.);
• error-check each completed questionnaire through first-level
• produce custom reports to keep track of critical aspects of the
collection process: contact results (interviews, refusals,
appointments, registrations, etc.), trends in key survey variables,
interviewers’ productivity, etc.;
• manage mixed-mode surveys.
This means that the microdata, metadata and status information
related to different, mode-specific questionnaires should be modelled
and possibly stored in a way that ensures data interoperability
and integrated management.
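As a rough illustration of status tracking in a mixed-mode setting, the Python sketch below stores questionnaires from different modes in one common record type and only allows certain status changes. The status names follow the list above; the transition rules, the `Questionnaire` class and its fields are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    NOT_ANSWERED = "not answered"
    DRAFT = "draft"
    COMPLETED = "completed but not yet validated"
    VALIDATED = "completed and validated"


# Assumed transition rules: which status may follow which.
ALLOWED = {
    Status.NOT_ANSWERED: {Status.DRAFT},
    Status.DRAFT: {Status.DRAFT, Status.COMPLETED},
    Status.COMPLETED: {Status.DRAFT, Status.VALIDATED},
    Status.VALIDATED: set(),
}


@dataclass
class Questionnaire:
    """One record layout for every collection mode (CAWI, CATI, CAPI)."""
    unit_id: str
    mode: str
    status: Status = Status.NOT_ANSWERED
    answers: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def move_to(self, new_status):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"illegal transition: {self.status.value} -> {new_status.value}"
            )
        # Keep a trail of status changes as process data.
        self.history.append((self.status, new_status))
        self.status = new_status


web = Questionnaire("U001", "CAWI")
phone = Questionnaire("U002", "CATI")

web.move_to(Status.DRAFT)
web.answers["turnover"] = 1200
web.move_to(Status.COMPLETED)
web.move_to(Status.VALIDATED)
phone.move_to(Status.DRAFT)

# Both modes share the same structure, so they can be monitored together.
for q in (web, phone):
    print(q.unit_id, q.mode, q.status.value)
```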
Communication facilities management area
Tools for sharing information with survey respondents: helpdesk
system, content management system, automatic reminders
management, etc.
These instruments include:
• a centralised help desk system that allows integrated
management of any kind of respondent request – technical
questions as well as information needs – coming from different
communication channels such as toll-free numbers, email,
SMS, etc.;
• real-time access to the survey’s paradata to improve help desk
management and to enable real-time troubleshooting;
• links to external sources of statistical information where respondents, if
they wish, can learn more about the survey;
• emails and telephone numbers of survey and IT managers;
• instructions on how to fill in the questionnaire that respondents can
download, print or read on screen;
• FAQ (Frequently Asked Questions);
• management of invitations for respondents to cooperate with the survey;
• management of paper questionnaires to be sent by fax-server, e-mail,
• CATI scheduler;
• CAPI agenda system.
Electronic questionnaire area
Tools for the collection of survey microdata.
See last year's UNECE Seminar: “Improve the quality on data collection: minimum requirements for
a generalised software independently from the mode”
completeness of functions,
generalisation of functions,
independence from proprietary systems,
cross-browser compatibility,
platform compatibility,
modularity of functions,
logical and semantic abstraction,
integration with XML data representation model in order to reach compliance
with recognised standards for data and metadata description.
ISTAT Business Statistical Portal
A. The Project
B. Data collection
C. An integrated system: the business portal back-office
D. Compliance to the requirements
E. Business Portal: year 2013
A. The Project – a platform dedicated to the acquisition of statistical
The project started in October 2010 after an agreement between Istat, the Italian Union of
Chambers of Commerce and the Minister of Public Administration.
Its objectives, in accordance with the strategy proposed by the Regulation of the
European Parliament and of the Council on the European statistical programme 2013-2017 and the new Code of Digital Administration (CAD), are:
to simplify the way data are collected from the business sector;
to reduce the costs the enterprises incur to comply with their statistical obligations;
to reduce the statistical burden on enterprises;
to optimize the processes of delivery of public services for enterprises;
to streamline the statistical data collection;
to increase the overall information potential of business statistics;
to return information to businesses;
to act as a driving force for the overcoming of the vertical production
pattern of business statistics in favour of an integrated horizontal model.
“Legacy stovepipe” architecture of
ISTAT business statistics.
Horizontally integrated architecture of
ISTAT’s new Business Statistical Portal
Stovepipe model: for each "stovepipe" (specific field of statistics) every stage of the
statistical process (from survey design to collection, processing and dissemination of
data) takes place in an autonomous and independent line of production, set apart
from the others.
Enterprise-centred model: the sharing of a vast array of processes, tools and methods
through a centralized platform, together with the integration of data and metadata via
a common model.
The enterprise-centred model features some unique aspects:
• single entry point;
• data collection adjusted to the organizational layout of the enterprise;
• provision of services that facilitate the response through an up-to-date report on the
status of fulfilment of statistical obligations;
• feedback of custom statistical information to respondents.
B. Data collection
Since the stovepipe model can no longer be sustained, the data collection system
has to be re-organized and re-planned in order to support the new “enterprise centred” model.
After the analysis of the existing data collection systems, a new generalized system
has been developed.
The entire project has been customer oriented. It considered three types of users:
• Statisticians
• Enterprises
• IT people
Statisticians
They can set up a new survey on their own, from design to testing of the
production system.
Granted access to the Portal’s back-office, they can use integrated tools for:
• validation of checks according to the compatibility plan;
• data entry;
• on-line edits;
• trend reports grouped by business domain, interviewers or rules laid down in the
compatibility plan;
• monitoring of logged users.
Enterprises
• They are able to verify their involvement in the survey, the deadline, the state of
completion, and any reminders or warnings received;
• They are provided with a standard graphical interface to perform data collection
IT people
They are facilitated in performing maintenance and evolutionary development of the
Enterprise: information from Registers
Enterprise: involvement in the survey, deadline, state of completion,
reminders or warnings received
B. Data collection – GX : a survey design tool
It is a generalized design system for Web-based surveys.
It finds its natural place between the "Design" and the "Process" phases of the
GSBPM model.
• 3.1 Build data collection instrument
• 4.1 Select sample
• 4.2 Set up collection
• 4.3 Run collection
• 4.4 Finalize collection
• 5.2 Classify and code
• 5.3 Review, validate and edit
• 5.7 Calculate aggregates.
It is a generalized system: the informative contents are decoupled from the system that
uses them, in other words, from their instantiations.
Therefore XML has been chosen to create the following survey contents:
• Survey Metadata
• Survey Variables
• Checking rules plan
• Skipping rules
Reasons for choosing XML:
1) it is a markup format that is both human- and machine-readable,
2) it allows for easier interpretation of the information represented (metadata, data,
checks, rules of administration), ensuring interoperability;
3) it makes it easier to comply with standards such as SDMX, DDI, etc.
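To make the decoupling concrete, here is a minimal Python sketch in which survey variables, a checking rule and a skipping rule live in an XML document, and generic code interprets them. The XML vocabulary (`survey`, `variable`, `check`, `skip` and their attributes) is invented for this example and is not the GX schema, which is not shown in these slides.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical survey definition; element and attribute names are assumptions.
SURVEY_XML = """
<survey id="demo" title="Demo business survey">
  <variables>
    <variable name="employees" type="integer" label="Number of employees"/>
    <variable name="turnover" type="decimal" label="Annual turnover (EUR)"/>
    <variable name="exports" type="decimal" label="Of which exports (EUR)"/>
  </variables>
  <checks>
    <check left="exports" op="le" right="turnover"
           message="Exports cannot exceed turnover"/>
  </checks>
  <skips>
    <skip when="turnover" equals="0" hide="exports"/>
  </skips>
</survey>
"""

OPS = {"le": operator.le, "ge": operator.ge, "eq": operator.eq}


def parse_survey(xml_text):
    root = ET.fromstring(xml_text)
    return {
        "variables": [dict(v.attrib) for v in root.iter("variable")],
        "checks": [dict(c.attrib) for c in root.iter("check")],
        "skips": [dict(s.attrib) for s in root.iter("skip")],
    }


def failed_checks(survey, answers):
    """Return the messages of all checking rules that the answers violate."""
    messages = []
    for check in survey["checks"]:
        left = answers.get(check["left"])
        right = answers.get(check["right"])
        if left is None or right is None:
            continue  # rule not applicable until both values are given
        if not OPS[check["op"]](left, right):
            messages.append(check["message"])
    return messages


def visible_variables(survey, answers):
    """Apply the skipping rules and return the variables still to be asked."""
    hidden = set()
    for skip in survey["skips"]:
        value = answers.get(skip["when"])
        if value is not None and float(value) == float(skip["equals"]):
            hidden.add(skip["hide"])
    return [v["name"] for v in survey["variables"] if v["name"] not in hidden]


survey = parse_survey(SURVEY_XML)
print(failed_checks(survey, {"turnover": 100.0, "exports": 150.0}))
print(visible_variables(survey, {"turnover": 0}))
```

Changing a rule then means editing the XML only; the interpreting code stays the same.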
More info on generalized software system can be found at:
One of XML's strengths is its great flexibility when used
together with XSLT, which can transform the same XML file into various formats.
XSLT could generate an HTML file or, even better, a PHP file, a JavaScript or jQuery
file, or SQL or PL/SQL code.
This technology has been chosen for the data collection module of the Business Portal.
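The idea of deriving several outputs from one XML source can be illustrated without an XSLT processor. Python's standard library does not include one, so the sketch below substitutes two plain rendering functions for the stylesheets; it is a stand-in for the technique, not XSLT itself. The element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical survey definition used as the single source.
SURVEY_XML = """<survey id="demo">
  <variable name="employees" type="integer" label="Number of employees"/>
  <variable name="turnover" type="decimal" label="Annual turnover (EUR)"/>
</survey>"""


def to_html(root):
    """Render the variables as an HTML form (plays the role of one stylesheet)."""
    rows = []
    for var in root.iter("variable"):
        name = escape(var.get("name"))
        label = escape(var.get("label"))
        rows.append(f'<label>{label} <input name="{name}" type="number"></label>')
    return "<form>\n" + "\n".join(rows) + "\n</form>"


def to_codebook(root):
    """Render the same variables as a plain-text codebook (a second stylesheet)."""
    return "\n".join(
        f'{var.get("name")}: {var.get("label")} [{var.get("type")}]'
        for var in root.iter("variable")
    )


root = ET.fromstring(SURVEY_XML)
html_form = to_html(root)
codebook = to_codebook(root)
print(html_form)
print(codebook)
```

With a real XSLT engine, each function would be replaced by a stylesheet applied to the same XML file.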
In the Business Portal project, XML is the starting point to create the DB structure, the
server-side applications (PHP) and the client-side application (JavaScript/jQuery).
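A small sketch of how a DB structure might be derived from the XML description. Python and an in-memory SQLite database stand in here for the Portal's actual stack, and the XML vocabulary, the type mapping and the `unit_id` key column are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical survey definition; not the GX schema.
SURVEY_XML = """<survey id="demo_survey">
  <variable name="employees" type="integer"/>
  <variable name="turnover" type="decimal"/>
</survey>"""

# Assumed mapping from survey variable types to SQLite column types.
SQL_TYPES = {"integer": "INTEGER", "decimal": "REAL", "text": "TEXT"}


def build_ddl(xml_text):
    """Generate a CREATE TABLE statement with one column per survey variable."""
    root = ET.fromstring(xml_text)
    table = root.get("id")
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    columns = ['"unit_id" TEXT PRIMARY KEY']
    for var in root.iter("variable"):
        name = var.get("name")
        if not name.isidentifier():
            raise ValueError(f"unsafe column name: {name!r}")
        columns.append(f'"{name}" {SQL_TYPES[var.get("type")]}')
    return f'CREATE TABLE "{table}" ({", ".join(columns)})'


ddl = build_ddl(SURVEY_XML)
print(ddl)

conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute('INSERT INTO "demo_survey" VALUES (?, ?, ?)', ("U001", 12, 3400.5))
row = conn.execute('SELECT * FROM "demo_survey"').fetchone()
print(row)
```

Names are validated before being placed in the statement because identifiers cannot be passed as bound parameters.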
The architecture: MVC, security and performance (diagram labels: DB owner, DB user)
A user interface is under development. It follows the WYSIWYG principle so that
non-IT people too can “build” their own survey from scratch.
(English translation of the above tabs)
= Survey tree
C. Business portal back-office
Monitoring and help desk functions
• Management of interviewer staff
• Management of respondent lists
• Questionnaire management:
  • questionnaire status,
  • coherence errors,
  • inconsistency errors,
  • etc.
• Monitoring:
  • reports on contact results,
  • reports on questionnaire status:
    • being filled in,
    • not filled in,
    • final questionnaire,
    • draft questionnaire,
    • etc.
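A monitoring report of this kind is essentially a count of questionnaires per status. The Python sketch below shows one way to compute it; the record layout, the status labels and the `errors` field are hypothetical.

```python
from collections import Counter

# Hypothetical back-office records: one per questionnaire.
questionnaires = [
    {"unit_id": "U001", "status": "final", "errors": 0},
    {"unit_id": "U002", "status": "draft", "errors": 2},
    {"unit_id": "U003", "status": "draft", "errors": 0},
    {"unit_id": "U004", "status": "not filled in", "errors": 0},
]


def status_report(records):
    """Return {status: (count, percentage of all questionnaires)}."""
    counts = Counter(record["status"] for record in records)
    total = sum(counts.values())
    return {
        status: (n, round(100 * n / total, 1))
        for status, n in counts.items()
    }


def units_with_errors(records):
    """List the units whose questionnaire has at least one flagged error."""
    return [record["unit_id"] for record in records if record["errors"] > 0]


report = status_report(questionnaires)
for status, (count, share) in sorted(report.items()):
    print(f"{status}: {count} ({share}%)")
print("To follow up:", units_with_errors(questionnaires))
```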
Future developments for the analysis of collected data: dynamic draws
D. Compliance to the requirements
Functional requirements
Survey unit management
• Management of authentication procedures through smart card or user IDs and passwords;
• Management of lists of survey units and of all bodies involved;
• Management of business master data (business name, address, fiscal code,
type of company, NACE description, number of employees, etc.) extracted
directly from ISTAT’s business register (ASIA). Real-time update is provided
upon request for some variables.
Data collection management
• Management of contact results;
• Back-office monitoring tools (standard and customized reports in order to
monitor data collection and relevant process data);
• Content management functions to update the front-end content.
Communication facilities
• Centralized help desk;
• Tools for exchanging information with survey respondents.
Electronic questionnaire area
• Questionnaire authoring tool based on the XML standard;
Non-functional requirements
• Interoperability - Sharing of data and metadata on the basis of a common data
• Centralised governance - Integrated management of the data collection processes
according to businesses’ needs (such as single sign-on, business-managed profiling
• Rationalisation - Inter-process sharing and re-use of data that are already available
in the statistical system or among the various public administrations
• Standardisation - Use of a standard metadata representation (XML)
• Data reciprocation - Feedback to business: macro and micro benchmarks with data
supplied by the ISTAT data warehouse (I.Stat)
E. Business Portal: year 2013
1. June 2013: Test on a small number of enterprises with the help of
business representatives.
Objective: to evaluate the different management procedures and
functions in a real-world setting.
69 companies involved in 3 surveys:
• a survey that provides statistics on the production of manufactured goods
• a short-term survey on turnover and orders in industry (FATT)
• a structural business survey on information and communication
technologies in enterprises (ICT)
2. November 2013: First release of GX (Generalized Italian (data)
collection system Xml)
3. December 2013: MPS survey (multi-purpose survey among
business groups) as part of the IX Industry Census. The electronic
questionnaire will be developed with GX.