Archiving microdata
Standards and good practices
United Nations Statistics Commission
New York, February 26, 2009
Olivier Dupriez
World Bank, Development Data Group and International Household Survey Network
The value of data
• Surveys and censuses
  – High cost! High value?
• Data have value beyond the purpose for which they were originally collected (“repurposing” of data)
  – Large under-exploited potential
• Condition: proper archiving
  – Documentation, dissemination, preservation

Data archiving – Two models
By a specialized data center (“trusted repository”) (US, Canada, Europe)
• Often academic
• High level of expertise
• Infrastructure
• Standards and best practices for documentation
• Formal dissemination and preservation policies and procedures
• Support to users
By the data producer (most developing countries)
• Not seen as a key role
• Lack of expertise
• Inappropriate infrastructure
• Ad hoc practices
• No compliance with international standards
• Unclear policies and procedures

Sharing good practices
Objective: transfer data archiving good practices and standards to data producers
International Household Survey Network (IHSN)
– A network of international agencies (coordinated by World Bank / PARIS21)
– Develops tools, guidelines, training materials
– Advocates compliance with good practices and international standards
www.ihsn.org

Microdata documentation
Good documentation is needed to:
– Properly analyze the data
– Increase credibility of derived indicators and analysis
– Allow replication of data collection or analysis
– Build institutional memory
DDI + Dublin Core metadata standards (XML)
A checklist of everything you need to know
– Study description
– File description
– Variable description
– Related materials
www.ddialliance.org

IHSN DDI Metadata Editor
Documenting the study: sampling, data collection, scope and coverage, etc.
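To make the study / file / variable checklist more concrete, here is a minimal sketch of how such a record might be assembled as XML. The element names (`codeBook`, `stdyDscr`, `fileDscr`, `dataDscr`, `var`, `labl`, `qstn`, `qstnLit`) follow the DDI Codebook vocabulary as commonly used, but the structure is heavily simplified and is not claimed to validate against the DDI schema. The survey title, file name, and variables are hypothetical, and this is not how the IHSN Metadata Editor itself is implemented.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, data_file, variables):
    """Return a simplified, DDI-style codebook element (not schema-valid).

    variables: list of dicts with the keys "name", "label" and "question".
    """
    root = ET.Element("codeBook")

    # Study description: here reduced to the study title only.
    study = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title

    # File description: the name of the data file being documented.
    file_dscr = ET.SubElement(root, "fileDscr")
    file_txt = ET.SubElement(file_dscr, "fileTxt")
    ET.SubElement(file_txt, "fileName").text = data_file

    # Variable description: label and literal question text per variable.
    data_dscr = ET.SubElement(root, "dataDscr")
    for v in variables:
        var = ET.SubElement(data_dscr, "var", name=v["name"])
        ET.SubElement(var, "labl").text = v["label"]
        question = ET.SubElement(var, "qstn")
        ET.SubElement(question, "qstnLit").text = v["question"]

    return root


# Hypothetical example values, for illustration only.
codebook = build_codebook(
    title="Example Household Survey 2008",
    data_file="household.dat",
    variables=[
        {"name": "hhsize", "label": "Household size",
         "question": "How many people usually live in this household?"},
    ],
)
xml_text = ET.tostring(codebook, encoding="unicode")
print(xml_text)
```

A real DDI document carries far more detail (sampling, data collection, scope and coverage, interviewer instructions, related materials); the point here is only that each item on the checklist maps to a named place in the XML.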
IHSN DDI Metadata Editor
Documenting files and variables: formulation of question, interviewer’s instructions, computation of variables, etc.

IHSN DDI Metadata Editor
Metadata in XML format … can be “transformed” into HTML, PDF, and other formats

Microdata cataloguing
XML/DDI metadata is web-ready, “browsable and searchable”

Microdata dissemination
• Growing demand for microdata
• Potential to add much value to existing data
• But requires:
  – Enabling legislation
  – Formal policy/procedures (IHSN guidelines)
  – Technical capacity to prepare data for dissemination
    • Documenting, cataloguing
    • Anonymizing (IHSN tools being tested)

Data and metadata preservation
Situation in many countries: documents in hard copy only, outdated storage media, multiple versions of datasets, much information lost (or never generated).
Goal: data and documentation remain readable, meaningful, understandable, accessible
– Manage hardware, software and storage media (not only backups; also “migration”)
Ongoing: IHSN-ICPSR guidelines (Open Archival Information System, OAIS; ISO 14721)

Conclusions and recommendations
– NSOs do not need to have all features of advanced data centers, but data archiving is part of their mandate
– Documentation and preservation are a MUST, even if you don’t disseminate
– Good practices and standards are relatively easy to implement
– Good documentation of past surveys helps improve the quality of future surveys
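The earlier points that XML metadata can be “transformed” into HTML and is “browsable and searchable” can be illustrated with a short sketch. It assumes a heavily simplified DDI-style document (the sample XML, variable names, and labels below are hypothetical) and uses only the Python standard library; production catalogues typically rely on XSLT stylesheets or dedicated cataloguing software instead.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical, simplified DDI-style metadata (not schema-valid).
SAMPLE_XML = """
<codeBook>
  <stdyDscr><citation><titlStmt>
    <titl>Example Household Survey 2008</titl>
  </titlStmt></citation></stdyDscr>
  <dataDscr>
    <var name="hhsize"><labl>Household size</labl></var>
    <var name="water"><labl>Main source of drinking water</labl></var>
  </dataDscr>
</codeBook>
"""


def variables_to_html(xml_text):
    """Render the study title and variable list as a small HTML fragment."""
    root = ET.fromstring(xml_text)
    title = root.findtext("stdyDscr/citation/titlStmt/titl",
                          default="Untitled study")
    rows = []
    for var in root.iter("var"):
        name = html.escape(var.get("name", ""))
        label = html.escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")
    return (
        f"<h1>{html.escape(title)}</h1>\n"
        "<table>\n<tr><th>Variable</th><th>Label</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>"
    )


def search_variables(xml_text, keyword):
    """Return names of variables whose label contains the keyword."""
    root = ET.fromstring(xml_text)
    keyword = keyword.lower()
    return [
        var.get("name")
        for var in root.iter("var")
        if keyword in var.findtext("labl", default="").lower()
    ]


page = variables_to_html(SAMPLE_XML)
matches = search_variables(SAMPLE_XML, "water")
print(page)
print(matches)
```

Because the metadata is structured, the same file serves both purposes: one pass renders a human-readable page, another answers a keyword query, with no re-keying of the documentation.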