Transcript Slide 1

New Custodians and New Practices
Digital Curation for Family History Materials
IFLA-GENLOC Satellite Meeting, 11 August 2011
Ross Harvey (Simmons College, Boston)
Introduction
The information diaspora requires new custodians of
information - including individuals
Much of this information is in digital form - digitized and
‘born digital’
Family history sources are increasingly digital
‘Old-style’ preservation doesn’t work with digital
information
Custodians (including individuals) need to adjust their
preservation strategies





2
Topics
New thinking about preservation
Digital material at risk
Digital preservation: current best practice
An aside: where collections come from
Guidelines for small organizations and individuals
Where to go next
Conclusion







3
New thinking about preservation
Preservation is …


‘concerned with maintaining or restoring access to artifacts,
documents and records’ (SAA Glossary); ‘measures taken to
extend the usable life of materials … to slow down the natural
processes of deterioration of an object’ (Wikipedia)
Paper-based preservation thinking does not work with
digital information because it:



Focuses attention on the carrier (the physical medium)
Emphasizes secure storage facilities, stable environmental
conditions
This doesn’t address preservation issues of digital objects

4
Digital material at risk: why?
Obsolescence of computers and software
Vulnerability to corruption
Lack of knowledge about best practice
Insufficient resources allocated to digital preservation
Insufficient professionals with appropriate skills
Lack of knowledge about what the best organizational
structures are






5
Questions for you
Do you back up your personal digital files?
Do you back them up according to a regular schedule?
Have you ever tried reinstating files from the backup?
How many copies of the backup files do you keep?
Where do you store them?
Have you ever had a hard disk crash?
When you upgrade to a new computer, operating system
or software version, how do you make sure you can read
your old digital files?







6
Questions for you
Backing up to a regular schedule / Checking that
backups work / Keeping multiple copies in
distributed storage


All of these are good practices for short-term storage
In libraries and archives, we are interested in long-term preservation and in ensuring the digital files can be used after time has passed – DIGITAL CURATION
This is much harder to do
7
What’s so hard about keeping digital materials?
Quantities
We create and handle lots of digital materials, e.g.
- Files created in digitizing projects
- Born-digital materials
- Internet-hosted materials
Quantities extremely large
BUT our procedures for archiving can currently handle only small quantities
8
What’s so hard about keeping digital materials?
The hardware changes fast
Osborne portable computer 1981
9
What’s so hard about keeping digital materials?
The storage media deteriorate fast
and obsolescence gets in the way
10
What’s so hard about keeping digital materials?

The software changes fast
What is this?
How would you open it?
11
What’s so hard about keeping digital materials?

The file formats change fast
What is this?
How would you open it?
12
What’s so hard about keeping digital materials?
Some of my old files: how to open them?
13
What’s so hard about keeping digital materials?
14
What’s so hard about keeping digital materials?
And there’s more:
Technical
- Lack of standards
- Access barriers (e.g. encrypted files without the encryption keys)
- Viruses
Non-technical – these are MAJOR
- Funding is not sustained over time
- Legal permissions
- Inadequate knowledge and skills
- Materials poorly identified and described
15
The inescapable conclusion

We can’t place digital objects on a shelf and leave them for 100+ years –
ongoing intervention is required
“Preservation by digitization is precisely like
running a glasshouse for plants where you
have to provide water continuously,
otherwise you will lose everything…This is
why a … digitization [project] is so
dangerous if the ‘watering’ for all eternity is
not paid, nothing is preserved” (Source:
http://www.fiji.gov.fj/publish/printer_5449.shtml)
Broken link: a digital preservation issue
16
Digital preservation: current best practice




Will summarize current best practice in digital
preservation
BUT this has been developed for use in large, well-resourced archives and libraries.
It doesn't scale down well to small libraries or archives,
small collections, private information
What is this current best practice?
17
Current best practice: open data, open
source, open everything

The open data movement
Open access
Open source
18
Current best practice: metadata

Standards: we need more
Better metadata
- Data capture
- File formats
- Metadata
- Citation
- Annotation
- Representation information
- Data interoperability
- Software integration
19
Current best practice: better understanding
Better understanding of
- The challenges
- Best practice in digital archiving
Needed by information professionals (you!)
Needed by creators of digital materials (including the general public)
20
Current best practice: better tools

Better software tools for digital curation
Useful and usable
21
Current best practice: life-cycle responses

Develop responses that take account of the life-cycle of information
- DCC Curation Lifecycle Model
- Open Archival Information System Reference Model
22
Current best practice: different kinds of
organizations

Develop organizational structures that respond to digital curation demands
McGovern, Nancy (2007) ‘A Digital Decade: Where Have We Been and Where Are We Going in Digital Preservation?’ RLG DigiNews v11 no1
23
Current best practice: new skillsets
MLS or equivalent, plus other skills such as:
‘Experience with XSLT, Perl or other scripting languages,
and/or experience with major repository platforms’
‘Knowledge of XML ... Semantic web technologies …
Experience with one or more metadata manipulation and
scripting languages: XSLT, Java, Perl, Python, or PHP’
24
An aside: where collections come from

Role of the individual in collection building
- Collector
- Compiler
- Creator
Collections eventually come to the archive or library
Many collections will include digital objects
- Photographs
- Documents, spreadsheets
- Databases
These digital objects are created by individuals
Creating 'good' digital objects is crucial for their long life
25
Guidelines for small organizations, individuals


Current best practice has been developed in large, well-resourced organizations
Can we translate it into guidelines that family history researchers, librarians, collections custodians and archivists in small organizations can apply?
Aim: to ensure digital materials are available for use in the future
26
Guidelines for small organizations, individuals
General guidelines (National Library of Australia, 2009)
Refresh files (copy them to newer storage media)
Check that the data hasn’t changed by running integrity checks
Add metadata about the processes you apply
Keep multiple copies of the file
Monitor developments in hardware, software, file formats
and standards that will have high impact on digital
preservation, and respond to them
But these ‘simple’ guidelines are still complex
27
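One way an individual might run the "integrity checks" mentioned in these guidelines is to record a checksum for every file and compare against that record later, for example after refreshing files to newer media. The sketch below is an illustration only, not part of the National Library of Australia guidelines: the function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to `folder`) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or no longer match."""
    problems = []
    for name, expected in manifest.items():
        target = folder / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

A manifest built when the files are first stored can be saved alongside them (for example as a text or JSON file) and passed to `changed_files` whenever the files are copied or moved; an empty list means nothing recorded in the manifest has changed.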
Guidelines for small organizations, individuals
Creating ‘good’ digital files
Why?
Preservation-friendly files are readable for longer;
they are easier to preserve
Principles and practices:
1. Use open software if possible (eg OpenOffice not Microsoft Word)
2. Use open formats if possible (eg .CSV not .XLS)
3. Give files a unique name (eg ‘NZ_Family_History_Newsletter_no6_11June2009’ not ‘Newsletter6’)
4. Describe your files using metadata
Record details about the file (eg format, who created it, date)
28
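The naming and metadata practices above can be sketched in a few lines of code. This is a hypothetical illustration, not a standard: the function names, the metadata field names, and the `.metadata.json` sidecar convention are all assumptions, and the date is written in ISO form (2009-06-11) rather than the slide's `11June2009` so that names sort chronologically.

```python
import json
import re
from datetime import date
from pathlib import Path


def descriptive_name(parts: list[str], when: date, extension: str) -> str:
    """Join descriptive parts and an ISO date into one file name."""
    # Replace runs of spaces or punctuation with a single underscore.
    cleaned = [re.sub(r"[^A-Za-z0-9]+", "_", part).strip("_") for part in parts]
    stem = "_".join(piece for piece in cleaned if piece)
    return f"{stem}_{when.isoformat()}.{extension.lstrip('.')}"


def write_sidecar(file_path: Path, creator: str, description: str) -> Path:
    """Write a small JSON record next to the file and return its path."""
    record = {
        "file_name": file_path.name,
        "format": file_path.suffix.lstrip(".").lower(),
        "creator": creator,
        "description": description,
        "date_recorded": date.today().isoformat(),
    }
    sidecar = file_path.parent / (file_path.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

For example, `descriptive_name(["NZ Family History", "Newsletter", "no6"], date(2009, 6, 11), "odt")` gives `NZ_Family_History_Newsletter_no6_2009-06-11.odt`. Keeping the metadata in a plain-text file beside the original means it can still be read even if the original's format becomes hard to open.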
Guidelines for small organizations, individuals
Managing digital files
Why? To avoid obsolescence issues
Principles and practices:
1. Refresh files when needed (eg copy them to newer storage media)
2. Check files after copying to make sure they haven’t changed (eg try opening some of them)
3. Always keep one copy of the original file (and at least one other copy, preferably more)
4. Decide which files are most important (eg some may be duplicates)
29
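Spotting duplicates by eye is tedious in a large personal collection. As a rough sketch (the function name is hypothetical, and this is only one possible approach), files can be grouped by a checksum of their contents so that byte-for-byte identical copies show up together, whatever they are named:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder: Path) -> list[list[str]]:
    """Group files under `folder` whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file into memory; fine for a sketch,
            # but very large files would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(folder).as_posix())
    return [names for names in by_digest.values() if len(names) > 1]
```

This only finds exact copies. Two scans of the same photograph, or a document saved in two formats, will not match, so deciding which files matter most still needs human judgment.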
Guidelines for small organizations, individuals
Storing digital files
Why? To make sure there is an accessible, unchanged
copy available
 Principles and practices:
1. Keep several copies of the files (eg at least two copies, preferably more)
2. Store them in different physical locations (eg one at home, one at work)
3. Store them on different media (eg hard disk, CD/DVD, cloud storage)
30
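Making several copies and confirming each one is unchanged can be combined into a single step. The sketch below assumes the destinations are folders already reachable from the computer (for example an external disk and a folder synced to cloud storage); the function name and the choice to stop on the first mismatch are assumptions for the example.

```python
import filecmp
import shutil
from pathlib import Path


def store_copies(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy `source` into each destination folder and confirm each copy matches."""
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 also tries to preserve timestamps and other file metadata.
        copy_path = Path(shutil.copy2(source, folder / source.name))
        # shallow=False compares the actual bytes, not just size and date.
        if not filecmp.cmp(source, copy_path, shallow=False):
            raise IOError(f"Copy differs from original: {copy_path}")
        copies.append(copy_path)
    return copies
```

A script like this covers only the copying. Placing the media in different physical locations, as the guideline recommends, remains a manual step.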
Guidelines for small organizations, individuals
Guidelines for preserving digital photographs
1. Identify where you have them stored
2. Decide which photos are most important
3. Organize the photos selected as important
4. Make copies and store them in different locations
More about this at:
http://www.digitalpreservation.gov/you/content/photos.html
31
Guidelines for small organizations, individuals
Guidelines for designing preservable web sites
1. Follow accessibility standards (eg W3C’s Web Accessibility Initiative)
2. Avoid proprietary formats (eg use HTML, CSS)
3. Maintain stable URLs (eg if changing URL, make sure there’s a redirect)
4. Design navigation carefully (eg include a sitemap)
5. Allow browsing of content, not just searching (this helps web harvesting software, eg Internet to capture all of the content)
Source: http://blog.photography.si.edu/2011/08/02/five-tips-for-designingpreservable-websites/
32
Guidelines for small organizations, individuals
Keep an eye on:
Digital Preservation in a Box
http://www.digitalpreservation.gov/register/7Outreach.pdf
Personal Archiving: Preserving Your Digital Memories
http://www.digitalpreservation.gov/you/
33
34
35
36
37
Where to go next

For lots of good advice: European projects
- DCC
- Digital Preservation Europe
In the U.S.
- NDIIPP (Library of Congress)
38
Where to go next

Cornell University’s online tutorial Digital Preservation
http://www.icpsr.umich.edu/dpm/index.html

PARADIGM (Personal Archives Accessible in Digital Media)
http://www.paradigm.ac.uk/
39
Conclusion


The need to preserve digital information is here – it
won’t go away
It is worth putting effort into:
a) Creating ‘preservation-friendly’ digital objects
b) Managing, storing personal digital objects effectively
Advice is plentiful
Just do it!


It isn’t hard
But you have to be organized
40
Ross Harvey in his office, ca 1963