Managing Your First Digitization Project
LITA 2004 National Forum
St. Louis, October 2004
Krystyna K. Matusiak
Digital Collections Librarian
UWM Libraries
University of Wisconsin-Milwaukee
1
Outline
Provide an overview of the digitization
process
Planning a digitization project
Selection of source items
Image capture and image processing
Indexing and descriptive metadata
Building an online collection
Maintaining the collection
Preservation of master files
Share the lessons learned from the project
2
Goals for the Pilot Project
Create the first digitization project at the
UWM Libraries
Build a searchable collection accessible
through the Internet
Investigate digital library best practices and
standards
Establish an infrastructure for future
digitization projects at the UWM Libraries
3
Collection
Milwaukee Repertory Theater Photographic History
http://www.uwm.edu/Library/digilib/milrep/index.htm
Image collection – includes 1,800 images
Documents 195 performances of the Milwaukee
Repertory Theater from 1977 to 1994
4
Planning Your First Project
Set clear goals for the project
Consider your audience
Define the scope and content of the project
Evaluate the source collection
Copyright
Format and size
Level of indexing
Determine
Staffing
Timeframe
Cost
5
Source Collection
Mark Avery Photography: Photographs 1977-1994
consists of thousands of black and white 35 mm
film negatives
Housed at the Archives, UWM Libraries
Finding aid available at:
http://www.uwm.edu/Libraries/arch/findaids/uwmmss155.htm
Mark Avery worked as the staff photographer for
the Milwaukee Repertory Theater Company from
1976 to 1994
He donated his collection to the Archives at the
UWM Libraries in 1999
6
Why the Mark Avery
Collection?
Share a unique resource with a wider audience
Archives at the UWM Libraries owns copyright to the
collection
Increase the visibility of the collection and
encourage new scholarly use
Improve access to the collection
Create an online resource for the UWM theater
students, researchers, and community users
Enhance intellectual control through indexing
and creation of metadata
7
William Shakespeare Romeo and Juliet
1978-1979
Thomas Hulce as Romeo and Valerie Mahaffey as Juliet
8
Arthur Miller The Crucible
1985-1986
Center: Johanna Melamed as Mary Warren. Right: Daniel Mooney as
John Proctor and Albert Farrar as Deputy Governor Danforth
9
David Mamet Glengarry Glen Ross
1985-1986
James Pickering as Williamson and Kenneth Albers as Levene
10
August Wilson Fences
1989-1990
Lawrence James as Troy
11
Moliere Tartuffe
1986-1987
James Pickering as Tartuffe
12
Brian Friel Dancing at Lughnasa
1993-1994
Richard Halverson as Jack and Rose Pickering as Kate
13
Larry Shue The Foreigner
1992-1993
James Pickering as Charlie and Tom Blair as Owen
14
Charles Dickens A Christmas Carol
1987-1988
Daniel Mooney as Ebenezer Scrooge
15
Selection of Images
The most representative images capturing
key scenes and characters of the play
The negatives were selected for scanning
after careful examination using a light table
Images selected for all 195 performances
represented in the Mark Avery Collection
16
Image Capture
Follow digital imaging standards
Use-neutral approach
Originals vs. intermediaries
Scan the photographic negatives at 4000 dpi
resolution in grayscale mode using a Nikon
4000 ED film scanner
Create digital master files
Save scanned images as uncompressed TIFFs
Assign a unique name following file naming
conventions, e.g. av00001
17
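The naming convention and scan settings above can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's actual tooling: the helper names, the five-digit zero padding (inferred from the example av00001), the 36 x 24 mm frame size, and the 8-bit grayscale depth are all assumptions.

```python
MM_PER_INCH = 25.4


def master_name(sequence: int, prefix: str = "av", digits: int = 5) -> str:
    """Return a name such as 'av00001' (prefix and padding are assumptions)."""
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    return f"{prefix}{sequence:0{digits}d}"


def scan_pixels(width_mm: float, height_mm: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a scan of the given physical size at the given dpi."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def uncompressed_bytes(pixels: tuple[int, int], bits_per_pixel: int = 8) -> int:
    """Approximate size of the raw image data, ignoring TIFF header overhead."""
    width, height = pixels
    return width * height * bits_per_pixel // 8


# Assumed: a full 35 mm frame (36 x 24 mm) scanned at 4000 dpi, 8-bit grayscale.
frame = scan_pixels(36, 24, 4000)
print(master_name(1), frame, uncompressed_bytes(frame))
```

Under these assumptions a single master file is roughly 21 MB, so 1,800 masters would need on the order of 39 GB per stored set. A 16-bit scan would double that figure.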
Image Processing
Process images using Adobe Photoshop
Remove dust marks and scratches
Correct images for tone and contrast, when
necessary
Save the changes to a working copy of the
master TIFF file
Create derivative (access) images for Web
delivery and save them in JPEG format
18
Indexing
The negatives were filed by season and
performance; no other indexing data was
available
A research process accompanied the
creation of the digital collection
Cooperation between the UWM Libraries and
Milwaukee Repertory Theater Company was
established at an early phase of the project
19
Research
Gather the indexing data: names of actors and
characters featured in the images, play titles,
authors, dates, names of other contributing artists,
such as directors, costume and set designers, and
lighting designers
Examine research materials (play scripts,
programs, and photographic prints)
Consult with subject experts
20
Browsable Collection
http://www.uwm.edu/Library/digilib/milrep/records/browse.htm
21
Building an Online Collection
In order to build an online searchable
collection you need a digital delivery system
Possible solutions
1. Develop an application in-house using a
generic database (e.g. MS Access, MySQL) +
middleware (e.g. PHP, ColdFusion)
2. Purchase a digital management program, e.g.
CONTENTdm or Luna Insight
3. Use open source digital library software,
such as Greenstone (New Zealand) or DLXS
22
Image Delivery Systems
In-house Developed Applications
Advantages
Low initial cost
Flexibility in database
and web interface
design
Disadvantages
Limited database size
High cost of
programming
Commercial Digital Management Programs
Advantages
No programming
required
Offer database structure
plus a web interface
Disadvantages
Proprietary
Offer limited
customization
23
CONTENTdm: Digital Media
Management System
A multifunction software suite used to build
and manage multimedia collections and
make them available on the Web
Import, index, store, and manage digital objects,
as well as search and display them
Can store many digital media types including
images, text documents, compound objects,
audio and video files
Designed for library and cultural heritage
collections
24
CONTENTdm: Digital Media
Management System
Built on digital library standards
XML, Dublin Core, VRA Core 3.0 (Visual Resources
Association)
Supports OAI (Open Archives Initiative) Protocol for
Metadata Harvesting
Supports single and multiple collections
Individual metadata for each collection
Capability to search across collections
Offers batch loading to WorldCat starting with
version 3.5
Version 3.7 released in August, 2004
25
Descriptive Metadata
Metadata to provide a description of the
digital object and its intellectual content
Describe objects in a consistent, standardized
way
Enhance access - provide means of searching in
multiple ways
Facilitate access to the original source
Descriptive metadata standards
Dublin Core
VRA Core 3.0 (Visual Resources Association)
26
Dublin Core
CONTENTdm provides a default metadata
template with the 15 Dublin Core elements
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
27
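A record built on the 15-element template can be sketched as follows. This is a hypothetical illustration, not the project's actual record structure or CONTENTdm's internal format: the values are taken from the slide captions, but the choice of element for each value, the identifier av00001, and the helper make_record are assumptions.

```python
DUBLIN_CORE_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def make_record(**fields: str) -> dict[str, str]:
    """Build a record, rejecting any field that is not a Dublin Core element."""
    unknown = sorted(set(fields) - set(DUBLIN_CORE_ELEMENTS))
    if unknown:
        raise KeyError(f"not Dublin Core elements: {unknown}")
    # Keep the elements in template order; unused elements are left out.
    return {name: fields[name] for name in DUBLIN_CORE_ELEMENTS if name in fields}


# Hypothetical mapping for one image; the element chosen for each value
# and the identifier are assumptions made for illustration.
record = make_record(
    Title="Romeo and Juliet",
    Creator="Mark Avery",
    Description="Thomas Hulce as Romeo and Valerie Mahaffey as Juliet",
    Date="1978-1979",
    Format="image/jpeg",
    Identifier="av00001",
    Source="Mark Avery Photography: Photographs 1977-1994",
)
for name, value in record.items():
    print(f"{name}: {value}")
```

Rejecting unknown field names early is one simple way to keep records consistent with the template before they are loaded into a delivery system.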
Customization of Metadata
Templates
28
Controlled Vocabulary
Use controlled vocabulary to ensure
consistent metadata entry
Define a list of valid terms for a field
Create controlled vocabulary lists as text files
and import them to CONTENTdm
Add cross-reference terms to the lists
29
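The idea of a list of valid terms with cross-references can be sketched as below. The text-file syntax used here (one term per line, with "variant => preferred" for a cross-reference) is an assumption made for the example and is not CONTENTdm's import format; the function names are likewise hypothetical.

```python
def load_vocabulary(text: str) -> tuple[set[str], dict[str, str]]:
    """Parse one term per line; 'variant => preferred' adds a cross-reference."""
    terms: set[str] = set()
    see: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=>" in line:
            variant, preferred = (part.strip() for part in line.split("=>", 1))
            see[variant.lower()] = preferred
            terms.add(preferred)
        else:
            terms.add(line)
    return terms, see


def normalize(value: str, terms: set[str], see: dict[str, str]) -> str:
    """Return the preferred term, or raise if the value is not in the list."""
    value = value.strip()
    if value in terms:
        return value
    if value.lower() in see:
        return see[value.lower()]
    raise ValueError(f"{value!r} is not in the controlled vocabulary")


# Example list; the names come from the slide captions, the variant is invented.
VOCAB = """
Pickering, James
Mooney, Daniel
Hulce, Tom => Hulce, Thomas
"""
terms, see = load_vocabulary(VOCAB)
print(normalize("Hulce, Tom", terms, see))
```

Checking each value against the list at entry time catches spelling variants before they fragment search results.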
Building Records with CONTENTdm
30
Collection Interface
Use a default HTML client provided by
CONTENTdm
or
Design a customized collection interface
Search page
Search Results Template – displays a number of
thumbnails and their titles
Item Display Template – displays large image
and its associated metadata
31
Search Page
32
Search Results Page
33
Item Display – Main Record
34
Resource Discovery Metadata
Metadata for discovery and retrieval of the site
on the Internet
Metadata embedded within the HTML tags of
the main (index) page
A set of Dublin Core metadata elements
describing the project on the collection level
The Dublin Core Metadata Template available from
the Nordic Metadata Project at
http://www.lub.lu.se/metadata/DC_creator.html
35
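Embedding collection-level Dublin Core in the index page can be sketched as below, following the common convention of a schema.DC link plus DC.-prefixed meta names. This is a minimal sketch, not the markup the project actually used: the function name and the element values are assumptions, and the Nordic Metadata Project template may produce somewhat different output.

```python
from html import escape

DC_SCHEMA = "http://purl.org/dc/elements/1.1/"


def dc_meta_tags(elements: dict[str, str]) -> str:
    """Render Dublin Core elements as <meta> tags for an HTML <head>."""
    lines = [f'<link rel="schema.DC" href="{DC_SCHEMA}">']
    for name, content in elements.items():
        lines.append(
            f'<meta name="DC.{escape(name.lower())}" '
            f'content="{escape(content, quote=True)}">'
        )
    return "\n".join(lines)


# Collection-level values; the element choices here are illustrative only.
print(dc_meta_tags({
    "Title": "Milwaukee Repertory Theater Photographic History",
    "Publisher": "University of Wisconsin-Milwaukee Libraries",
    "Type": "Image",
    "Coverage": "1977-1994",
}))
```

Escaping the content values keeps quotation marks and ampersands in titles from breaking the page markup.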
Maintaining the Collection
Document the digitization process
Compile documentation during the project
Write final project report
Promote the collection
Issue press releases and announcements
Schedule presentations and workshops
Update the collection with feedback from users
Enable OAI support on the server and register the
collection with an OAI harvesting service, e.g. OAIster
http://www.oaister.org/o/oaister/
Update the collection over time to keep pace with new
digital technologies and standards
36
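Once OAI support is enabled, a harvester retrieves records with plain HTTP requests. The sketch below builds two such OAI-PMH request URLs. The base URL is a placeholder, not the UWM server's address, and the helper name is hypothetical; the verbs and the oai_dc metadata prefix come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the base URL of your own OAI-enabled server.
BASE_URL = "http://example.org/oai"


def oai_request(verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    allowed = {"Identify", "ListMetadataFormats", "ListSets",
               "ListIdentifiers", "ListRecords", "GetRecord"}
    if verb not in allowed:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return f"{BASE_URL}?{urlencode({'verb': verb, **arguments})}"


print(oai_request("Identify"))
print(oai_request("ListRecords", metadataPrefix="oai_dc"))
```

Opening the Identify URL in a browser is a quick way to confirm that the repository responds before registering it with a harvesting service.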
Preservation
Two sets of master TIFF files are stored at UWM
Libraries
The archival master files can be used to create other
types of digital derivatives or high-quality prints
Document the digitization process to ensure a
long-term retention of the archival files
Follow guidelines included in the OCLC/RLG report “A
Metadata Framework to Support the Preservation of
Digital Objects”
Follow the NISO standard “Technical Metadata for
Digital Still Images” to record the metadata on the item
level
37
Preservation Metadata
Metadata for identifying master files and
maintaining them over time
Collection level metadata: compression,
resolution, and bit depth
Metadata on item level: digital file id, file size,
dimensions in pixels, scan date, and master
file location
38
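The item-level fields listed above can be modeled as a simple record. The sketch below is an assumption-laden illustration: the project recorded this metadata in an MS Access database, whereas this example writes CSV text instead, and the field names, types, and sample values are invented for the example.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class MasterFileRecord:
    """Item-level fields listed on the slide; names and types are assumptions."""
    digital_file_id: str
    file_size_bytes: int
    width_px: int
    height_px: int
    scan_date: str          # ISO 8601 date, e.g. "2003-05-14"
    master_file_location: str


def to_csv(records: list[MasterFileRecord]) -> str:
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(MasterFileRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Illustrative values only; none of them come from the actual project database.
example = MasterFileRecord("av00001", 21428820, 5669, 3780,
                           "2003-05-14", "masters/set1/av00001.tif")
print(to_csv([example]))
```

Collection-level values such as compression, resolution, and bit depth apply to every master file, so they can be recorded once instead of being repeated in each row.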
Standards & Tools
Standards used to represent content
TIFF
JPEG
Standards used to describe content
Dublin Core
Standards and tools used to represent
structure
HTML
CONTENTdm software
39
Standards & Tools
Standards, guidelines, and tools used to
record preservation metadata
OAIS (Open Archival Information System)
Information Model
NISO standard: Technical Metadata for Digital Still
Images
MS Access database
40
Lessons
Focus on the users and outcomes
Apply standards to build a robust and sustainable
collection
Avoid the hidden costs of internal development of
applications
Select a commercial digital image delivery system if
programming expertise is not available
Include time for indexing and metadata creation in
your project plan
Metadata creation can take up to 2/3 of the project time
Address the issue of master file preservation
41
UWM Libraries Digital Collections
URL: http://www.uwm.edu/Library/digilib/
42