International Workshop on
Introduction to the DDI and
the IHSN Microdata Management Toolkit
UNITED NATIONS
DEPARTMENT OF ECONOMIC
AND SOCIAL AFFAIRS
STATISTICS DIVISION
NATIONAL BUREAU OF
STATISTICS OF CHINA
Beijing, 17-19 June 2013
DDI元数据标准及IHSN国际住户调查网
络微观数据管理工具国际培训班
联合国
经济和社会事务部
统计司
中华人民共和国
国家统计局
北京,2013年6月17日-19日
3
Workshop objectives - Context
Generic Statistical Business Process Model (GSBPM)
Phases: Specify the needs, Design, Build, Collect, Process, Analyze, Disseminate, Archive, Evaluate
Overarching processes: Metadata Management, Quality Management
The GSBPM describes statistical processes (e.g., implementation of a
survey) in 9 phases, each divided into subprocesses.
It is a convenient tool for the assessment and planning of statistical processes.
4
培训班目标 – 背景
通用统计业务流程模型 (GSBPM)
阶段:指明需求、设计、建立、收集、处理、分析、传播、存档、评估
贯穿各阶段的流程:元数据管理、质量管理
描述统计流程(例如,实施一项调查)的9个阶段,每个阶段有各自的子流程。
一个用来评估与规划统计流程的便利工具。
5
Workshop objectives
The workshop will introduce standards and tools for:
• Metadata management
– The DDI standard
– IHSN Metadata Editor
• Dissemination
– Policy, technical and ethical issues
– NADA software
• Archiving
– Preservation of digital information
[Figure: GSBPM phase diagram, as on the previous slide]
6
培训班目标
培训班将介绍以下方面的标准和工具:
• 元数据管理
– DDI标准
– IHSN元数据编辑软件
• 传播
– 政策,技术和道德问题
– NADA软件
• 存档
– 数字信息保存
[图:GSBPM阶段示意图,同上一张幻灯片]
7
Metadata management
Part 1
Documenting your surveys and censuses using
the DDI Metadata Standard
and the IHSN Metadata Editor (Nesstar Publisher)
8
元数据管理
第1部分
使用DDI元数据标准
以及IHSN元数据编辑软件(Nesstar发布软件)
记录您的调查和普查
9
Why do data producers need metadata?
• To increase the credibility and transparency of
their statistical outputs
• To preserve institutional memory
• To allow replication of data collection and
analysis
• To allow re-use or re-purposing of the metadata
10
为何数据生产者需要元数据?
• 为了增加其统计输出的公信力和透明度
• 为了保持机构记忆
• 为了允许复制数据收集和分析
• 为了允许重复使用或重新利用元数据
11
Why do data users need metadata?
• To fully understand the (micro)data and make good
use of them
– To minimize the risk of misuse/misinterpretation,
users need to fully understand the data. Why, by
whom, when, and how data were collected and
processed are important information.
• To discover data in on-line catalogs
– Users learn that your data are available by
searching or browsing detailed metadata
catalogs.
12
为何数据使用者需要元数据?
• 为了充分认识(微观)数据并很好的利用他们
– 为了尽量减少误用/曲解的风险,使用者需要充
分了解数据。数据收集和处理的重要信息包括:
目的,收集者/处理者,时间和方式。
• 为了便于搜寻在线目录中的数据
– 使用者通过搜索或浏览详细的元数据目录,将
会知道是否可以获得您的数据。
13
Standards and tools
• The Data Documentation Initiative (DDI)
metadata standard helps structure, preserve and
share survey or census metadata
• The IHSN Microdata Management Toolkit, a.k.a.
Nesstar Publisher, provides a free and user-friendly
solution to document and catalog
surveys/censuses in compliance with the DDI
standard and international best practices
14
标准和工具
• 数据记录倡议(DDI)元数据标准有助于结构化,
保存和分享调查或普查的元数据
• IHSN 国际住户调查网络微观数据管理工具包,
又名Nesstar发布软件,为记录并编目符合DDI
元数据标准和国际最佳实践的调查/普查,提供
了一个免费且用户友好的解决方案。
15
What is the DDI?
• A checklist of what you need to know about a study and its
dataset
– A structured and comprehensive list of hundreds of
elements that may be used to document a survey dataset
• An XML metadata standard
• Developed by academic data centers / the DDI Alliance.
• Designed to encompass the kinds of data generated by
surveys, censuses, administrative records.
• For microdata, not indicators.
• Two versions:
– Version 2.n (DDI codebook), used by the IHSN Toolkit
– Version 3.n (DDI life cycle)
16
什么是DDI元数据标准?
• 一张列有您所需要知道的,有关一个研究及其数据集信息的核对表
– 一张结构化的综合列表,包含数百个元素,可用来记录一项调查的数据集。
• 一个XML格式的元数据标准
• 由学术数据中心/DDI联盟开发。
• 旨在涵盖由调查,普查,行政记录产生的这类数据。
• 用于微观数据,而非指标。
• 两个版本:
– 版本2.n (DDI码本), 用于IHSN国际住户调查网络工具包
– 版本3.n (DDI生命周期)
17
What is XML ?
• XML stands for eXtensible Markup Language. It is
used to structure information to be shared on the
Web or exchanged between software systems.
• XML is a file format, readable by any text editor (e.g.,
Notepad).
• XML tags text for meaning. HTML tags text for
appearance. The “tags” are conceptually the same as
“fields” in a database.
• In an XML file, the information is wrapped between
an opening tag and a closing tag. The tag name
indicates its content.
18
什么是XML?
• XML代表可扩展标记语言,用于结构化在网络
上共享或在软件系统之间交换的信息。
• XML是一种文件格式,在任何文本编辑器(例
如:Notepad)上可读。
• XML语言的标签文本具有内容含义。HTML语言
的标签文本用于文字外观。在XML语言下的数
据库中,“标签”和“字段”在概念上是相同
的。
• 在一个XML文件中,信息被包裹在开始标签和
结束标签之间。标签名称表示其内容。
19
DDI and XML - An example
“The National Statistics Office (NSO) of Popstan conducted the Multiple Indicator
Cluster Survey (MICS) with the financial support of UNICEF. 5,000 households,
representing the overall population of the country, were randomly selected to
participate in the survey, following a two-stage stratified sampling methodology.
4,900 of these households provided information.”
In XML/DDI this would look like this:
<titl>Multiple Indicator Cluster Survey 2005</titl>
<altTitl>MICS 2005</altTitl>
<AuthEnty>National Statistics Office (NSO)</AuthEnty>
<fundAg abbr="UNICEF">United Nations Children's Fund</fundAg>
<nation>Popstan</nation>
<geogCover>National</geogCover>
<sampProc>5,000 households, stratified two stages</sampProc>
<respRate>98 percent</respRate>
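Because the tag names carry the meaning, any XML-aware program can read these fields back. The sketch below is a minimal illustration using Python's standard library; it wraps the slide's elements in a hypothetical <study> root (an assumption made only so that the fragment parses as one document; a real DDI file has a much deeper structure).

```python
import xml.etree.ElementTree as ET

# The slide's elements, wrapped in a hypothetical <study> root so the
# fragment is a single well-formed XML document.
DDI_FRAGMENT = """
<study>
  <titl>Multiple Indicator Cluster Survey 2005</titl>
  <altTitl>MICS 2005</altTitl>
  <AuthEnty>National Statistics Office (NSO)</AuthEnty>
  <fundAg abbr="UNICEF">United Nations Children's Fund</fundAg>
  <nation>Popstan</nation>
  <geogCover>National</geogCover>
  <sampProc>5,000 households, stratified two stages</sampProc>
  <respRate>98 percent</respRate>
</study>
"""


def read_fields(xml_text):
    """Return {tag name: text} for every child of the root element."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = read_fields(DDI_FRAGMENT)
# Attributes (here the funder's abbreviation) are read with .get().
funder_abbr = ET.fromstring(DDI_FRAGMENT).find("fundAg").get("abbr")
print(fields["titl"], "| funded by", funder_abbr)
```

The same few lines work for any element name, which is what makes the tags behave like "fields" in a database.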
20
DDI和XML - 例子
“Popstan国国家统计局(NSO)在联合国儿童基金会(UNICEF)的资金支持下,
开展了多指标类集调查(MICS)。调查采用二阶段分层抽样法,从参与这项
调查的全国总人口中,随机抽取了5000户家庭作为代表全体的样本。其中4900
户家庭提供了信息。”
在XML/DDI中,以上内容呈现如下:
<titl>多指标类集调查 2005</titl>
<altTitl>MICS 2005</altTitl>
<AuthEnty>国家统计局 (NSO)</AuthEnty>
<fundAg abbr="UNICEF">联合国儿童基金会</fundAg>
<nation>Popstan国</nation>
<geogCover>全国</geogCover>
<sampProc>5000户家庭, 二阶段分层抽样</sampProc>
<respRate>百分之98</respRate>
21
Advantage of XML
• Can be transformed into many kinds of outputs:
– Databases, HTML, PDF, on-line catalogs, others
• Plain text files. Not specific to any operating
system or application
• Easy to generate using specialized tools such as
the IHSN Metadata Editor
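As a rough illustration of the first advantage, the sketch below turns one small XML source into two different outputs, JSON and an HTML table, using only Python's standard library. The <study> wrapper and the function names are hypothetical, chosen for this example; production catalogs typically rely on dedicated transformation tools rather than hand-written code like this.

```python
import html
import json
import xml.etree.ElementTree as ET

# Hypothetical two-field source document, kept tiny for readability.
SOURCE = "<study><titl>MICS 2005</titl><nation>Popstan</nation></study>"


def to_pairs(xml_text):
    """Read (tag, text) pairs from the children of the root element."""
    root = ET.fromstring(xml_text)
    return [(el.tag, (el.text or "").strip()) for el in root]


def to_json(pairs):
    """Output 1: a JSON object, e.g. for loading into a database."""
    return json.dumps(dict(pairs), ensure_ascii=False)


def to_html(pairs):
    """Output 2: an HTML table, e.g. for an on-line catalog page."""
    rows = "".join(
        f"<tr><th>{html.escape(tag)}</th><td>{html.escape(text)}</td></tr>"
        for tag, text in pairs
    )
    return f"<table>{rows}</table>"


pairs = to_pairs(SOURCE)
json_output = to_json(pairs)
html_output = to_html(pairs)
print(json_output)
print(html_output)
```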
22
XML的优势
• 可以转化为多种输出:
– 数据库、HTML、PDF、在线目录,及其他
• 纯文本文件,不是某个操作系统或应用程序的
特定文件
• 使用特定工具生成非常便捷,例如IHSN国际住
户调查网络元数据编辑软件
23
Structure of the DDI 2.0 standard
The DDI elements are organized in five sections:
1. Document Description. Used to document the
documentation process (“metadata on metadata”).
2. Study Description. Information about the survey such as
title, dates/method of data collection, sampling, funding,
etc.
3. Data File Description. Content, producer, version, etc.
4. Variable Description. Literal question, universe, labels,
derivation and imputation methods, etc.
5. Other Material. Description of materials related to the
study such as questionnaires, coding information, reports,
interviewer's manuals, data processing and analysis
programs, etc.
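The five sections can be pictured as five children of one root element. The sketch below builds such an empty skeleton in Python. The element names (codeBook, docDscr, stdyDscr, fileDscr, dataDscr, otherMat) are given here as the DDI Codebook names as best recalled, not taken from these slides, so they should be checked against the DDI schema; a valid DDI file also needs namespaces, attributes and required sub-elements that are omitted here.

```python
import xml.etree.ElementTree as ET

# (element name, section described on the slide). The element names are an
# assumption to be verified against the DDI Codebook schema.
SECTIONS = [
    ("docDscr", "Document Description"),
    ("stdyDscr", "Study Description"),
    ("fileDscr", "Data File Description"),
    ("dataDscr", "Variable Description"),
    ("otherMat", "Other Material"),
]


def build_skeleton():
    """Create an empty five-section document under one root element."""
    root = ET.Element("codeBook")
    for tag, label in SECTIONS:
        section = ET.SubElement(root, tag)
        section.append(ET.Comment(label))  # label each section for readers
    return root


skeleton = build_skeleton()
section_tags = [child.tag for child in skeleton]
print(ET.tostring(skeleton, encoding="unicode"))
```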
24
DDI2.0标准的结构
DDI元素由5部分组成:
1. 文档描述:用来记录文档著录过程(“元数据的元数
据”)。
2. 研究描述:关于调查的信息,例如标题、数据收集的日
期/方法、抽样、资金等等。
3. 数据文件描述:内容、生产者、版本等等。
4. 变量描述:字面问题、全域、标签、推导和估算方法,
等等。
5. 其他相关信息:描述与研究相关的材料,例如问卷、编
码信息、报告、面试官手册、数据处理和分析程序等等。
25
Exercises
Workshop participants will install the IHSN
Metadata Editor (a.k.a. Nesstar Publisher) and
document a small census dataset.
26
练习
培训班与会者将安装IHSN国际住户调查网络元数
据编辑软件(又名Nesstar发布软件)并学习记
录一个小的普查数据集。
27
Exercise data files
Content of the USB provided to participants
Chinese version of:
• Popstan census data files (2) in Stata format
• Census questionnaire
• Enumerator manual
Same content in English
Selected technical and policy guidelines
IHSN Metadata Editor software and templates
28
练习的数据文件
USB存储盘向与会者提供以下内容
中文版本:
•Stata格式的人口普查数据(2个文件)
•人口普查问卷
•统计员手册
英文内容相同
技术和政策方面的指导原则
IHSN国际住户调查网元数据编辑软件和模板
29
Exercise 1 – Installation
• Run NesstarPublisherInstaller_v4.0.9.exe to install
the software
• The next step is to install the IHSN templates:
open the Template Manager
30
练习1- 安装
• 运行NesstarPublisherInstaller_v4.0.9.exe,安装
软件
• 下一步是安装IHSN国际住户调查网络模板
打开模板管理程序
31
Exercise 1 – Installation
Click on “Import” and
select the English (EN) or
Chinese (CN) template
found in folder “Software”
Then select the added
template and click “Use”
to activate it. This will
now be the default study
template.
Repeat the exact same
process for the Resource
Description Template
32
练习1- 安装
点击“导入”,在
“Software (软件)”文件
夹中选择英语(EN)或
中文(CN)模板
然后选择要添加的模板,
点击“使用”来激活它。
这个模板将成为默认的
研究模板。
重复相同的步骤来添加
资源描述模板
33
Exercise 2 - Documentation
The next steps will be to document the Census:
- Import the data files (Stata)
- Add metadata in the Document Description,
Study Description, Data Files Description, and
Variables Description sections
- Attach and document the questionnaire and
manual as external resources
- Export the metadata to DDI (and RDF) formats
34
练习2 – 记录
接下来的步骤是记录普查:
- 导入数据文件(Stata)
- 添加文件描述,研究描述,数据文件描述,
和变量描述部分的元数据
- 将调查问卷和面试官手册作为外部资源附
加并记录
- 将元数据以DDI(和RDF)格式导出
35
When should data be documented?
Document “as you go”, not after completion of the
operation. When documentation is done as a “last step”,
much information has already been lost or was never generated.
36
数据在何时应该被记录?
“按进度”记录每一步 – 而不是在调查结束以后。如
果只在“最后一步”记录数据,许多信息已经丢失。
37
Software and guidelines
Available at www.ihsn.org
http://www.ihsn.org/home/node/117
http://www.ihsn.org/home/software/ddi-metadata-editor
38
软件和指导原则
可下载于www.ihsn.org
http://www.ihsn.org/home/node/117
http://www.ihsn.org/home/software/ddi-metadata-editor
39
Metadata and microdata dissemination
Part 2
Formulating a microdata dissemination policy,
disseminating data and metadata, and the
IHSN National Data Archive (NADA) software
40
元数据和微观数据传播
第2部分
制定一个微观数据传播政策,
数据和元数据的传播,
以及IHSN国际住户调查网络国家数据归档
(NADA)软件
41
Benefits of dissemination
• Diversity of research work. Data producers usually publish
tabular and analytical outputs. But they will never identify
all the research questions that can be addressed using the
data. Microdata dissemination encourages diversity (and
quality) of analysis.
• Credibility/acceptability of data. Broader access to
metadata and microdata demonstrates the producer’s
confidence in the data, by making replication (or correction)
possible by independent parties.
42
传播的优点
• 使研究工作多元化 :数据生产者通常发布表格和分析
输出。但他们绝不会辨识出这组数据能解决的所有研究
问题。微观数据的传播促进了分析的多样性(和质量)。
• 数据的公信力和认可度:通过让独立的第三方能够复制
(或修正)数据,对元数据和微观数据更广泛的访问显
示了生产者对数据的信心。
43
Benefits of dissemination
• Reduced duplication. Lack of access to microdata forces
users to conduct their own surveys. Microdata
dissemination reduces the risk of duplicated activities.
It also reduces the burden on respondents, and
minimizes the risk of inconsistent studies on the same topic.
• Funding. Better use of data means better return for survey
sponsors, who will thus be more inclined to support data
collection activities.
• Quality of data. It is often through the use of data that
insights for improvement for survey design can be
identified.
44
传播的优点
• 减少重复:无法获得微观数据迫使用户自己进行调查。
微观数据的传播将减少重复工作的风险。它也将减少受
访者的负担,并将同一主题不一致研究的风险降到最低。
• 资金:更好地利用数据意味着对调查赞助者更好的回报,
从而使他们更倾向于支持数据收集活动。
• 数据质量:往往在数据使用的过程中,会产生如何改进
调查设计的见解。
45
Costs and risks of dissemination
• Exposure to criticism. Quality itself often puts a brake on
microdata dissemination. Some data producers may fear
being exposed to criticism when data are not fully reliable, and
being obliged to defend their results
when challenged by secondary users.
• Loss of exclusivity. When disseminating microdata, data
owners lose their exclusive right to discoveries. This is more
of an issue for academic researchers than official producers.
46
传播的成本和风险
• 受到批评:质量本身往往会阻碍微观数据的传播。一些
数据生产者可能担心当数据不是完全可靠时会受到批评,
并且在面临二级用户质疑时,要承担为自己的结果辩论
的义务。
• 丧失专用性:微观数据的传播使数据拥有者失去了他们
对自己发现的数据的专用权。相比官方数据生产者,这
对学术研究者来说是更大的一个问题。
47
Costs and risks of dissemination
• Official vs. non-official results, and exposure to
contradiction. Dissemination of microdata may lead to a
proliferation of differing (and possibly contradictory) results
and statistics. It may become increasingly difficult to
distinguish between official figures and other sources of
statistics.
• Financial cost. Properly documenting and disseminating
microdata has a cost. This includes not only the costs of
creating and documenting microdata files, but also the
costs of creating access tools and safeguards, and of
supporting enquiries made by the research community.
48
传播的成本和风险
• 官方与非官方结果,对比揭露矛盾:微观数据的传播可
能激增不同的 - 并可能是相互矛盾的 - 结果和统计。传
播可能导致官方数据和其他来源的统计数据变得越来越
难以区分。
• 财务成本:妥善记录和传播微观数据是有成本的。这不
仅包括创建和记录微观数据文件的成本,还包括建立访
问工具和保障措施,以及向研究界提供辅助问询的成本。
49
Costs and risks of dissemination
• Confidentiality. One of the biggest challenges of microdata
dissemination is to minimize the risk of disclosure of any
data that would compromise the identity of respondents.
• Legality. Countries have their own national statistical and
data protection legislation, which dissemination must respect.
50
传播的成本和风险
• 保密性:微观数据传播的最大挑战之一,是如何尽量减
少任何由于披露数据而导致的,可能危及受访者身份保
密性的风险。
• 合法性:所有国家都有其特定的国家统计和数据保护法
例。
51
Principles - UNECE
• It is appropriate for microdata collected for official
statistical purposes to be used for statistical analysis to
support research as long as confidentiality is protected.
• Provision of microdata should be consistent with legal and
other necessary arrangements that ensure that
confidentiality of the released microdata is protected.
Managing Statistical Confidentiality and Microdata Access
- Principles and guidelines of Good Practice, by the
Conference of European Statisticians (CES) and United
Nations Economic Commission for Europe (UNECE)
52
原则 - UNECE联合国欧洲经济委员会
• 在确保保密性的前提下,研究者可以使用为了官方统计
目的而收集的微观数据,来进行统计分析并支持研究。
• 提供微观数据应当符合法律和其他必要的约定,以确保
被发布的微观数据的保密性。
管理统计保密性和微观数据访问 -良好实践的原则和指
导方针, 欧洲统计学家会议(CES) 与联合国欧洲经济委员
会(UNECE)
53
Anonymization
• Statistical agencies are charged with protecting
the confidentiality of survey respondents.
• Protecting confidentiality necessitates some sort
of data anonymization so that individual
respondents cannot be identified.
54
匿名化
• 统计机构被委以为调查受访者保密的责任。
• 为了保密,必须采取一定的数据匿名化措施,
从而使得个体受访者不会被辨识。
55
Anonymization concepts
• Identifying variables include:
– Direct identifiers, which are variables such as names, addresses,
or identity card numbers. They should be removed from the
published dataset.
– Indirect identifiers, which are characteristics whose
combination could lead to the re-identification of respondents
(e.g., region, age, sex, occupation). Such variables are needed
for statistical purposes, and should not be removed from the
published data files.
• Anonymizing the data involves determining which variables
are potential identifiers and modifying the specificity of
these variables to reduce the risk of re-identification to an
acceptable level. The challenge is to maximize the security
while minimizing the resulting information loss.
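One simple way to see why combinations of indirect identifiers matter is to count how many records share each combination: a record whose combination is unique in the file is the easiest to re-identify. The sketch below is an illustrative frequency count in plain Python, not the method used by any particular tool; the records, the choice of key variables and the threshold of 2 are assumptions made for the example.

```python
from collections import Counter

# Indirect identifiers named on the slide, used together as a key.
QUASI_IDENTIFIERS = ("region", "age", "sex", "occupation")

# Made-up records for illustration only.
records = [
    {"region": "North", "age": 34, "sex": "F", "occupation": "teacher"},
    {"region": "North", "age": 34, "sex": "F", "occupation": "teacher"},
    {"region": "South", "age": 71, "sex": "M", "occupation": "judge"},
    {"region": "South", "age": 29, "sex": "M", "occupation": "farmer"},
    {"region": "South", "age": 29, "sex": "M", "occupation": "farmer"},
]


def key_of(record, keys=QUASI_IDENTIFIERS):
    """Combination of indirect identifiers for one record."""
    return tuple(record[k] for k in keys)


def risky_records(rows, threshold=2, keys=QUASI_IDENTIFIERS):
    """Records whose key combination occurs fewer than `threshold` times."""
    counts = Counter(key_of(r, keys) for r in rows)
    return [r for r in rows if counts[key_of(r, keys)] < threshold]


at_risk = risky_records(records)
print(at_risk)
```

Here only the 71-year-old judge is flagged. Reducing the specificity of age or occupation for that record is the kind of modification the slide refers to.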
56
匿名化概念
• 识别变量包括:
– 直接识别符, 是诸如姓名、地址或身份证号码的变量。这
些变量应该从被公布的数据集中删除。
– 间接识别符, 是一些个体特征变量,若组合在一起可重新
识别受访者(例如地区、年龄、性别、职业)。这样的
变量出于统计目的需要,不应该从被公布的数据文件中
删除。
• 数据匿名化涉及确定哪些变量是潜在识别符,并修改
这些变量的特征,从而将重新识别的风险降低到一个
可接受的水平。当前的挑战是如何在保持最大程度安
全性的同时,最大限度地减少信息损失。
57
Anonymization techniques
• Removing variables (e.g., detailed geographic identification)
• Removing records (outliers)
• Global recoding (e.g., from age to age groups)
• Top- or bottom-coding (e.g., create “65+” age category)
• Local suppression (replace with missing)
• Micro-aggregation (e.g., for income variable)
• Data swapping
• Post-randomization
• Noise addition
• Resampling
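Four of these techniques are simple enough to sketch in a few lines. The example below shows variable removal, global recoding with top-coding, and local suppression on a single made-up record. The band width, the cut-off of 65 and all names are assumptions for illustration; real anonymization would normally be done with a dedicated package and a measured disclosure risk.

```python
def recode_age(age, top=65, width=10):
    """Global recoding into fixed-width bands, top-coded at `top`."""
    if age >= top:
        return f"{top}+"  # top-coding: one open-ended category
    low = (age // width) * width
    high = min(low + width - 1, top - 1)
    return f"{low}-{high}"


def drop_variables(record, names):
    """Removing variables: return a copy without the listed keys."""
    return {k: v for k, v in record.items() if k not in names}


def suppress(record, variable):
    """Local suppression: replace one value in one record with missing."""
    masked = dict(record)
    masked[variable] = None
    return masked


# Made-up record for illustration only.
person = {"name": "A. Example", "village": "V-017", "age": 71, "occupation": "judge"}

safe = drop_variables(person, {"name", "village"})
safe["age"] = recode_age(safe["age"])
safe = suppress(safe, "occupation")
print(safe)
```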
58
匿名化技术
• 删除变量(例如,详细的地理标识)
• 删除记录(离群值)
• 全局重新编码(例如,将年龄改成年龄组)
• 顶部或底部编码(例如,创建“65+”年龄组别)
• 局部抑制(更换为缺失值)
• 微聚集(例如,对于收入变量)
• 数据替换
• 后随机化
• 添加噪声
• 重新抽样
59
Anonymization tools and guidelines
Software: sdcMicro, an open source (R-based) package
Technical guidelines: http://www.ihsn.org/home/node/118
More practical guidelines are being produced by the IHSN.
NOTE: Anonymization is a complex process. It requires analytical
skills and involves some arbitrary decisions.
60
匿名化工具和指导原则
软件: sdcMicro,一个开源的(以R语言为基础的)软件包
技术指引: http://www.ihsn.org/home/node/118
IHSN国际住户调查网络正在编写更多的实际操作指引。
注释: 匿名化是一个复杂的过程。它需要分析技巧并涉及到一些
主观的判定。
61
Policy guidelines on dissemination
Formulating a microdata access policy
http://www.ihsn.org/home/node/120
62
传播的政策指引
制定一个微观数据访问政策
http://www.ihsn.org/home/node/120
63
Cataloguing
• Data and metadata need to be made visible.
• Users will benefit from advanced data discovery
tools, in particular on-line searchable catalogs.
• The IHSN developed an open source application,
compliant with the DDI standard, to help
disseminate metadata and, optionally, microdata.
This application (NADA) complements the
Metadata Editor.
64
编目
• 数据和元数据要成为用户可见的。
• 用户将受益于先进的数据发现工具,特别是可
在线搜索的目录。
• IHSN国际住户调查网络开发了一款符合DDI元
数据标准的,开放资源的应用程序,用来帮助
传播元数据和微观数据(可选)。此应用程序
(NADA)是Nesstar元数据编辑软件的一个补
充。
65
Dissemination Exercise
Workshop participants will upload their DDI
metadata (generated during the documentation
exercise) to an on-line, searchable survey catalog.
66
传播练习
培训班的与会者将把他们(在记录练习中生成)
的DDI元数据,上传到一个可在线搜索的调查
目录中。
67
Survey catalogs
100+ agencies in 65+ countries have started
establishing a microdata archive using IHSN tools
68
调查目录
在65个以上国家的100多个机构中,已经开展了
使用IHSN国际住户调查网络工具对微观数据进
行存档
69
Archiving
Part 3
Preserving data and metadata
70
存档
第3部分
保存数据和元数据
71
Issues
Common issues include:
– Loss of data and metadata, because of human error,
technical problems, or disasters such as fire or flood
– Data available, but in unreadable formats or on unreadable media
(hardware and software obsolescence)
– Data available, but undocumented
– Documentation only available in hard copy
– Multiple versions of datasets
available, with no “versioning”
information
72
问题
常见问题包括:
– 由于人为失误,技术问题,或诸如火灾或水灾等
灾害,造成数据和元数据的损失
– 存在数据,但格式/媒介无法读取(硬件和软件过
时)
– 存在数据, 但尚未记录
– 只提供硬拷贝文档
– 存在多个版本的数据集,但没有
“版本管理”信息
73
Physical threats
Physical damage can occur to hardware and media due to:
• Material instability
• Improper storage environment (temperature, humidity,
light, dust)
• Overuse (mainly for physical contact media)
• Natural disaster (fire, flood, earthquake)
• Infrastructure failure (plumbing, electrical, climate control)
• Inadequate hardware maintenance
• Human error (including improper handling)
• Sabotage (theft, vandalism)
74
实体威胁
有形损害可能因为以下因素,发生在硬件和媒介上:
• 材料的不稳定性
• 不适当的储存环境(温度、湿度、光照、灰尘)
• 过度使用(主要针对有直接接触的媒介)
• 自然灾害(火灾、水灾、地震)
• 基础设施故障(水暖、电气、气候控制)
• 硬件维护不足
• 人为失误(包括处理不当)
• 蓄意破坏(盗窃、破坏)
75
Software obsolescence
A file format may be superseded by newer versions
and no longer be supported.
[Figure: timeline of file formats, 1976 to 2008; surviving labels: XML, HTML 2]
76
软件过时
一种文件格式可能被更新的版本取代,因而不再
受到支持
[图:1976年至2008年文件格式时间轴;可辨认的标签:XML、HTML 2]
77
Hardware obsolescence
Storage media are rapidly superseded by smaller,
denser, faster media. The device needed to read an
“old” medium may no longer be manufactured.
[Figure: timeline, 1975 to 2004; labels not recoverable]
78
硬件过时
旧的存储媒介被体积更小,更密集,更快速的新
媒介所取代。阅读“旧”媒介所须的设备可能已
经停产。
[图:1975年至2004年时间轴;标签无法辨认]
79
Preservation policies
Microdata preservation refers to the management of digital
data and related metadata over time to guarantee their
long-term usability. It requires the establishment and
implementation of a preservation policy and procedures.
– Back up your data
– Ensure suitable data storage
• Refreshing media: copy digital information from one
medium to another.
• Technology preservation: preserve old operating systems,
software, media drives as a disaster recovery strategy.
• Migrating data: copy or convert data from one technology
to another, whether hardware or software.
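Refreshing media is only useful if the copy is verified. The sketch below copies a file to a new location and compares SHA-256 checksums of source and copy, using Python's standard library. The file name, directory names and the choice of SHA-256 are assumptions for illustration; an archive's actual procedure would follow its own preservation policy.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source, destination_dir):
    """Copy `source` into `destination_dir` and verify the copy bit for bit."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / Path(source).name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Checksum mismatch for {target}")
    return target


# Demonstration in a temporary directory with a stand-in data file.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "census.dta"
    original.write_bytes(b"example bytes standing in for a data file")
    copied = refresh_copy(original, Path(workdir) / "new_medium")
    copy_verified = copied.exists() and sha256_of(copied) == sha256_of(original)

print("copy verified:", copy_verified)
```

Storing the checksum alongside the file also allows later checks that a stored copy has not silently degraded.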
80
保存政策
微观数据保存是指随着时间的推移,管理数字化数据以
及相关的元数据,以保证他们的长期可使用性。这需要
建立和实施一整套保存政策及程序。
–备份您的数据
–确保适当的数据存储
• 翻新媒介:将数字信息从一种媒介复制到另一种媒
介。
• 技术保存:通过保存旧的操作系统,软件,媒体驱
动器来作为一项应急恢复策略。
• 迁移数据:将数据从一个技术复制或转换到另一个
技术,无论是硬件还是软件。
81
Guidelines
• Unlike the preservation of information on paper,
the preservation of digital information demands
constant attention.
• Guidelines: complex, but useful as a “technical
audit manual”
http://www.ihsn.org/home/node/121
82
指导原则
• 不同于在纸面上的信息保存,保存数字信息需
要不断关注。
• 指导原则:一份复杂,但有用的“技术审核手
册”
http://www.ihsn.org/home/node/121