Archiving microdata Standards and good practices United Nations Statistics Commission New York, February 26, 2009 Olivier Dupriez World Bank, Development Data Group and International Household Survey Network [email protected].


Archiving microdata
Standards and good practices
United Nations Statistics Commission
New York, February 26, 2009
Olivier Dupriez
World Bank, Development Data Group
and
International Household Survey Network
[email protected]
The value of data
• Surveys and censuses
– High cost! High value?
• Data have value beyond the purpose for
which they were originally collected
(“repurposing” of data)
– Large under-exploited potential
• Condition: proper archiving
– Documentation, dissemination, preservation
Data archiving – Two models
By a specialized data center (“trusted repository”)
(US, Canada, Europe)
• Often academic
• High level of expertise
• Infrastructure
• Standards and best practices for documentation
• Formal dissemination and preservation policies and procedures
• Support to users
By the data producer
(Most developing countries)
• Not seen as a key role
• Lack of expertise
• Inappropriate infrastructure
• Ad hoc practices
• No compliance with international standards
• Unclear policies and procedures
Sharing good practices
Objective: transfer data archiving good practices
and standards to data producers
International Household Survey Network (IHSN)
– A network of international agencies (coordinated by
World Bank/PARIS21)
– Develops tools, guidelines, training materials
– Advocates compliance with good practices
and international standards
www.ihsn.org
Microdata documentation
Good documentation is needed to:
– Properly analyze the data
– Increase credibility of derived indicators and analysis
– Allow replication of data collection or analysis
– Build institutional memory
DDI + Dublin Core metadata standards (XML)
A checklist of everything you need to know
– Study description
– File description
– Variable description
– Related materials
www.ddialliance.org
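As a rough illustration of the four-part checklist above, the sketch below assembles a small DDI-style XML document with Python's standard library. The element names (codeBook, stdyDscr, fileDscr, dataDscr, var, otherMat) follow the DDI Codebook vocabulary, but the structure is heavily simplified and is not validated against the official schema. The survey title, file name, variables, and the build_codebook helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, file_name, variables, related_material):
    """Build a simplified DDI-style codebook (illustrative, not schema-valid)."""
    root = ET.Element("codeBook")

    # Study description: what the survey is.
    stdy = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(stdy, "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title

    # File description: which data file is being documented.
    file_dscr = ET.SubElement(root, "fileDscr")
    file_txt = ET.SubElement(file_dscr, "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # Variable description: name, label, and literal question text.
    data_dscr = ET.SubElement(root, "dataDscr")
    for name, label, question in variables:
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label
        qstn = ET.SubElement(var, "qstn")
        ET.SubElement(qstn, "qstLit").text = question

    # Related materials: questionnaires, reports, manuals.
    other = ET.SubElement(root, "otherMat")
    ET.SubElement(other, "labl").text = related_material
    return root


# Hypothetical example values, for illustration only.
codebook = build_codebook(
    title="Household Budget Survey (hypothetical)",
    file_name="households.dta",
    variables=[
        ("hhsize", "Household size",
         "How many people usually live in this household?"),
        ("region", "Region of residence",
         "In which region is this household located?"),
    ],
    related_material="Household questionnaire (PDF)",
)
print(ET.tostring(codebook, encoding="unicode"))
```

In practice a dedicated editor such as the IHSN DDI Metadata Editor produces this kind of file; the point here is only that the checklist maps onto a predictable XML structure.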
IHSN DDI Metadata Editor
Documenting the study: sampling, data
collection, scope and coverage, etc.
IHSN DDI Metadata Editor
Documenting files and variables: formulation
of question, interviewer’s instructions,
computation of variables, etc.
IHSN DDI Metadata Editor
Metadata in XML format can be “transformed” into HTML, PDF, or other formats
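To make the idea of “transforming” XML metadata concrete, here is a minimal sketch that turns variable descriptions into an HTML table. Production tools typically use XSLT stylesheets for this step; plain Python is swapped in here only to keep the example self-contained. The sample XML and the variables_to_html function are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical, simplified DDI-style fragment.
SAMPLE_XML = """<codeBook>
  <dataDscr>
    <var name="hhsize"><labl>Household size</labl></var>
    <var name="region"><labl>Region of residence</labl></var>
  </dataDscr>
</codeBook>"""


def variables_to_html(xml_text):
    """Render each <var> element as one row of an HTML table."""
    root = ET.fromstring(xml_text)
    rows = []
    for var in root.iter("var"):
        # Escape values so labels containing < or & cannot break the page.
        name = html.escape(var.get("name", ""))
        label = html.escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")
    header = "<tr><th>Variable</th><th>Label</th></tr>"
    return "<table>\n" + "\n".join([header] + rows) + "\n</table>"


print(variables_to_html(SAMPLE_XML))
```

Because the metadata is kept separate from its presentation, the same XML file could feed other renderers (for example a PDF generator) without being edited.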
Microdata cataloguing
XML/DDI metadata is web-ready, “browsable and searchable”
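A small sketch of why structured metadata is “searchable”: once every survey is described in the same XML vocabulary, a keyword search across variable labels takes only a few lines. The two-study catalogue and the search_catalog function below are hypothetical and assume simplified DDI-style files.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: study identifier -> simplified DDI-style metadata.
CATALOG = {
    "study_a": """<codeBook><dataDscr>
        <var name="hhsize"><labl>Household size</labl></var>
        <var name="educ"><labl>Highest level of education</labl></var>
    </dataDscr></codeBook>""",
    "study_b": """<codeBook><dataDscr>
        <var name="h2o_src"><labl>Main source of drinking water</labl></var>
        <var name="members"><labl>Number of household members</labl></var>
    </dataDscr></codeBook>""",
}


def search_catalog(catalog, keyword):
    """Return (study, variable name, label) for labels containing keyword."""
    keyword = keyword.lower()
    hits = []
    for study_id, xml_text in catalog.items():
        root = ET.fromstring(xml_text)
        for var in root.iter("var"):
            label = var.findtext("labl", default="")
            if keyword in label.lower():
                hits.append((study_id, var.get("name"), label))
    return hits


for hit in search_catalog(CATALOG, "household"):
    print(hit)
```

A real catalogue would index the metadata rather than re-parse every file per query, but the principle is the same: the search runs on the documentation, not on the microdata themselves.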
Microdata dissemination
• Growing demand for microdata
• Potential to add much value to existing data
• But requires:
– Enabling legislation
– Formal policy/procedures (IHSN guidelines)
– Technical capacity to prepare data for dissemination
• Documenting, cataloguing
• Anonymizing (IHSN tools being tested)
Data and metadata preservation
Situation in many countries: documents in hard copy only,
outdated storage media, multiple versions of datasets, much
information lost (or never generated).
Goal: Data and documentation remain readable, meaningful,
understandable, accessible. This requires managing hardware, software
and storage media (not only backups; also “migration”)
On-going: IHSN-ICPSR guidelines (Open Archival Information System OAIS; ISO 14721)
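One routine that supports this goal, though the slides do not name it, is a checksum manifest: recording a hash of every archived file so that silent corruption, or an undocumented new version of a dataset, can be detected after a backup or a migration to new storage. The sketch below is an assumption about how that could look, not part of the IHSN-ICPSR guidelines; all function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's relative path (POSIX style) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files that were added, removed, or altered between manifests."""
    names = old_manifest.keys() | new_manifest.keys()
    return sorted(
        name for name in names
        if old_manifest.get(name) != new_manifest.get(name)
    )
```

A typical use would be to call build_manifest on the archive folder before a migration, call it again on the migrated copy, and review anything returned by changed_files.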
Conclusions and recommendations
– NSOs do not need to have all features of
advanced data centers, but data archiving is part
of their mandate
– Documentation and preservation are a MUST,
even if you don’t disseminate
– Good practices and standards are relatively
easy to implement
– Good documentation of past surveys helps
improve the quality of future surveys