OVERVIEW OF ARCHIVING OF
MICRODATA
SILAS M. MULWA
Kenya National Bureau of Statistics
United Nations Regional Seminar on Census Data Archiving for Africa
Addis Ababa, 20-23rd September 2011
OUTLINE
• Definition of Micro-data
• Why disseminate Micro-data?
• Acquisition and preparation of data
• Tools for Archiving of Micro-data
• Dissemination of micro-data
• Risks of disseminating microdata
Definition of Micro-data
Micro-data are defined as files of records
pertaining to individual respondent units
(collected mainly through household interviews).
Micro-data files for dissemination purposes may
differ from those used within the KNBS:
• all direct and indirect identifiers may have been
removed through various anonymization
processes
• Micro-data contain records pertaining to
individual respondents, firms or institutions,
whereas macro-data contain records aggregated
at national or county levels.
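The distinction above can be illustrated with a minimal Python sketch. All records, field names and the list of direct identifiers are invented for illustration and are not taken from any KNBS file. The sketch shows only the removal of direct identifiers; indirect identifiers need further treatment that is not shown here.

```python
from collections import Counter

# Hypothetical micro-data: one record per respondent (all values invented).
microdata = [
    {"name": "Respondent A", "national_id": "0001", "county": "Nairobi", "age": 34, "sex": "F"},
    {"name": "Respondent B", "national_id": "0002", "county": "Nairobi", "age": 41, "sex": "M"},
    {"name": "Respondent C", "national_id": "0003", "county": "Kisumu", "age": 29, "sex": "F"},
]

# Assumed set of direct identifiers; a real file needs a case-by-case review.
DIRECT_IDENTIFIERS = {"name", "national_id"}


def drop_direct_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the direct-identifier fields."""
    return [{k: v for k, v in r.items() if k not in identifiers} for r in records]


def aggregate_by(records, key):
    """Collapse micro-data into macro-data: a count of records per value of `key`."""
    return dict(Counter(r[key] for r in records))


public_file = drop_direct_identifiers(microdata)  # still one row per respondent
county_totals = aggregate_by(microdata, "county")  # one row per county

print(public_file[0])
print(county_totals)
```

The first result is still micro-data (individual records, minus names and ID numbers); the second is macro-data, because individual records can no longer be recovered from it.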
Why disseminate Micro-data?
• To broaden data use and reuse
• To add value to data by bringing subject
matter knowledge to data analysis
• To get feedback from data users,
which can be used to improve data
quality and future data collection.
Why disseminate Micro-data? Cont’d
• To foster diversity and deepen the
quality of data analysis, thereby
extracting more information from the
data.
• To reduce duplication in data collection
• To leverage funding for statistics
• To comply with a contractual or legal
obligation, etc.
Acquisition and preparation
• Questionnaire design and sampling
design
• Data collection
• Data are coded after collection
• Data capture
• Draft data are archived using archiving tools
• Data are edited and derived variables
created
• The edited set is archived, followed by
anonymization
Tools for Archiving of Micro-data
Data archiving has various components.
These include:
• Data documentation
• Cataloguing
• Data dissemination
• Anonymization
• Preservation
A combination of tools is required to construct all
the components.
Archiving tools cont’d
KNBS has used both standard and non-standard
tools for data archiving.
• IHSN Microdata Management Toolkit: a
standard kit for documenting datasets in
compliance with the Data Documentation Initiative
(DDI) and Dublin Core standards.
• KNBS has not used the toolkit to document
census micro-data, but 7 survey datasets have
been documented and archived using this tool.
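To give a feel for what standards-based documentation looks like, here is a minimal Python sketch that writes a few Dublin Core elements for a dataset as XML. It is not the output format of the IHSN toolkit: the `metadata` wrapper element, the helper name `dublin_core_record` and the survey title are assumptions made for illustration. Only the Dublin Core namespace and element names come from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding one Dublin Core child per field.

    The 'metadata' wrapper element is an arbitrary choice for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical dataset description (the title is invented).
record = dublin_core_record({
    "title": "Example Household Survey (hypothetical)",
    "creator": "Kenya National Bureau of Statistics",
    "date": "2011",
    "type": "Dataset",
    "language": "en",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A DDI document is considerably richer, since it also describes files and individual variables, but the principle is the same: the description travels with the data in a machine-readable form.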
Archiving tools cont’d
• NADA
A DDI-compliant tool for archiving
and dissemination. KNBS has not used the
tool but is building capacity on NADA.
• Redatam – IMIS (Integrated Multi-sectoral
Information System)
KNBS has used IMIS to store the 1989 and 1999
census micro-data. It is web-based and is
mainly used to query information.
Archiving tools cont’d
CSPro and ICADE:
• Not for archiving but for capturing and
storing census micro-data. They offer
very little documentation apart from
providing ASCII file structures.
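An ASCII file structure of this kind typically lists each variable with its position in a fixed-width record. A minimal Python sketch of reading such a record is given below. The layout, variable names and sample line are hypothetical and do not reproduce any actual CSPro or ICADE dictionary.

```python
# Hypothetical layout: (variable name, start position counted from 1, length).
LAYOUT = [
    ("county", 1, 2),
    ("household", 3, 4),
    ("person", 7, 2),
    ("sex", 9, 1),
    ("age", 10, 3),
]


def parse_record(line, layout=LAYOUT):
    """Split one fixed-width ASCII record into named integer fields.

    Blank fields are returned as None rather than raising an error.
    """
    record = {}
    for name, start, length in layout:
        raw = line[start - 1 : start - 1 + length]
        record[name] = int(raw) if raw.strip() else None
    return record


# Invented record: county 47, household 12, person 3, sex code 1, age 34.
sample_line = "470012031034"
parsed = parse_record(sample_line)
print(parsed)
```

Without the layout the line is an opaque string of digits, which is why a file structure alone counts as very little documentation: it says where each variable sits but nothing about what the codes mean.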
DISSEMINATION OF MICRO-DATA
Due to technological advancement in ICT and
increased demand for data, KNBS is changing
how information is accessed by and disseminated
to users.
• Traditionally, KNBS has used publications,
seminars and workshops to release and
disseminate survey and census data.
• Census micro-data can be accessed and
queried using IMIS.

DISSEMINATION OF MICRO-DATA cont’d
• Offline dissemination
CDs and DVDs are used to distribute
micro-data after the anonymization process.
Only 5% of census data can be given to
users, on special request.
• Mobile dissemination: it was used to
disseminate the 2009 population census results.
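The 5% figure under offline dissemination can be illustrated with a minimal Python sketch of drawing a reproducible sample. The sketch assumes that the 5% refers to a simple random sample of household records. The presentation does not describe the actual sampling design, and the function name, seed and household list are invented.

```python
import random


def draw_sample(records, fraction=0.05, seed=2009):
    """Draw a simple random sample without replacement.

    A fixed seed makes the extract reproducible, so every user who
    requests the file receives the same set of records.
    """
    rng = random.Random(seed)
    size = round(len(records) * fraction)
    return rng.sample(records, size)


# Hypothetical frame of 1,000 household records.
households = [{"household_id": i} for i in range(1, 1001)]
sample = draw_sample(households)
print(len(sample))
```

Releasing a sample rather than the full file also lowers disclosure risk, because a user cannot be sure that a given individual is in the extract.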
Risks of disseminating
microdata
Disclosure:
• This is the risk of re-identification of particular
individuals and is one of the biggest challenges.
• It can lead to violation of laws, loss of public
trust and compromised data quality
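One common way to screen a file for re-identification risk is to count how many records share each combination of indirect identifiers and to flag combinations that are rare. The Python sketch below shows that idea under stated assumptions: the records, the choice of indirect identifiers and the threshold `k=3` are invented, and the presentation does not say which method KNBS applies.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=3):
    """Return records whose combination of quasi-identifier values
    is shared by fewer than k records in the file."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < k]


# Hypothetical anonymized file (direct identifiers already removed).
records = [
    {"county": "Nairobi", "sex": "F", "age_group": "30-34"},
    {"county": "Nairobi", "sex": "F", "age_group": "30-34"},
    {"county": "Nairobi", "sex": "F", "age_group": "30-34"},
    {"county": "Kisumu", "sex": "M", "age_group": "85+"},
]

flagged = risky_records(records, ["county", "sex", "age_group"], k=3)
print(flagged)
```

A flagged record is unique or nearly unique on attributes that an outsider might already know, so it would normally be treated further (for example by coarsening the age group) before release.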
Controversy of results:
Users may obtain results different from those
published by KNBS, resulting in controversy and
criticism
END OF PRESENTATION