International digital data management and sharing initiatives


International digital data
management and sharing initiatives
in the social sciences
Peter Elias
University of Warwick, England
Presentation to the First African Digital Curation Conference
Pretoria, 12 – 13 February 2008
Why do we need to share data?
• To address research on issues of global importance:
(e.g. poverty; migration; economic growth and
development; spread of infectious diseases; environmental
change; response to natural disasters; threats to security)
• To facilitate international research collaboration
• To make best use of limited resources
• To exchange skills and knowledge about the use of
research data
• To engage in comparative research
Macro vs. micro data
• Macro data are aggregates of micro data
(where the unit of observation is the
individual, household or organisation).
• Macro data are helpful in guiding the
development of research and policy (e.g.
trend analysis).
• Considerable progress has been made in
providing access to international macro data.
Macro data access and mapping
• UK Economic and Social Data Service
International
• Web based tools for data presentation and
mapping:
– www.gapminder.org
– www.worldmapper.org
World map – land areas
Territory size is shown proportional to surface areas of territories.
Source: www.worldmapper.org
Infant mortality 2002
Territory size shows the proportion of infant deaths worldwide that occurred there in
2002. Infant deaths are deaths of babies during their first year of life.
Source: www.worldmapper.org
Carbon emissions 2000
Tonnes of carbon dioxide emitted per person living in that territory in 2000
Source: www.worldmapper.org
Macro vs. micro data
• Micro data contain more useful variation than macro
data.
• Micro data are more amenable to quality
investigation and control.
• Micro data are more flexible as research data – they
permit the analyst to reconstruct aggregates in
different ways or to reclassify the information they
contain.
• Micro data can be classified according to source:
– censuses; surveys; administrative systems; transactions
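
The flexibility point above can be illustrated with a minimal sketch. The records, field names, and size bands below are hypothetical and invented purely for illustration; they do not come from any of the data sources named in this presentation. The sketch shows the same household-level micro data being aggregated into macro figures in two different ways, once by region and once after reclassifying households by size.

```python
from collections import defaultdict

# Hypothetical micro data: one record per household (illustrative values only).
MICRO = [
    {"region": "North", "household_size": 2, "income": 100},
    {"region": "North", "household_size": 5, "income": 300},
    {"region": "South", "household_size": 1, "income": 200},
    {"region": "South", "household_size": 4, "income": 400},
    {"region": "South", "household_size": 6, "income": 600},
]


def aggregate(records, key_fn, value_field):
    """Return {group: mean of value_field} for the groups defined by key_fn."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        group = key_fn(rec)
        totals[group] += rec[value_field]
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def size_band(rec):
    """Reclassify household size into two hypothetical bands."""
    return "small (1-3)" if rec["household_size"] <= 3 else "large (4+)"


# Macro aggregate 1: mean income by region.
by_region = aggregate(MICRO, lambda r: r["region"], "income")

# Macro aggregate 2: the same micro data, reclassified by household size.
by_size = aggregate(MICRO, size_band, "income")

print(by_region)
print(by_size)
```

Had only the regional averages been published, the breakdown by household size could not be recovered from them; it requires the underlying micro records.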
What mechanisms for sharing micro data
already exist?
Sharing survey data:
• International Household Survey Network
Census data sharing:
• Integrated Public Use Microdata Series — International (IPUMS
International)
Developing common surveys:
• Demographic and Health Surveys
• European Social Survey
• World Values Survey
• International Social Survey Programme
• Longitudinal surveys of ageing (SHARE, ELSA, HRS)
Sharing archives:
• Council for European Social Survey Data Archives
• Inter-university Consortium for Political and Social Research
What else needs to be done to
improve data sharing?
• Need to improve knowledge about data and
metadata
• Need to promote cross-disciplinary research
• Need to resolve problems of access to data
deemed ‘sensitive’ or subject to ethical
safeguards
• Need to gear data development to
international research needs
The NSF/ESRC initiative
In 2002 the US National Science Foundation and the
UK Economic and Social Research Council agreed
to seek ways to improve international
collaboration in the social, behavioural and
economic sciences
After preparatory work by the Social Science
Research Council, six countries agreed to fund a
major conference in Beijing
(www.internationaldataforum.org)
The conference agreed to prepare a plan for a new
international body, the International Data Forum (IDF)
Plans to establish the International
Data Forum
• Scientific committee has met twice
(representatives from 9 countries and the
ISSC)
• Proposal for IDF ready by May 2008
• If accepted, launch of IDF at World Social
Science Forum in May 2009
What would the IDF do?
• Data discovery
Improving knowledge about existing data opportunities (a problem that is worsening
because of the data deluge)
• Data collection
Working with research funders and research communities to identify gaps in data collection
for research across national boundaries
• Data management
Promoting efforts to build standards-based data and metadata management processes into
the data lifecycle; introducing interoperability and making it cumulative
• Data dissemination
Identifying obstacles to data sharing, seeking to remove these and making data available
to researchers
• Data re-purposing
Promoting innovation in ways of re-using data