Managing Your First Digitization Project LITA 2004 National Forum St. Louis, October 2004 Krystyna K.


Managing Your First Digitization Project
LITA 2004 National Forum
St. Louis, October 2004
Krystyna K. Matusiak
Digital Collections Librarian
UWM Libraries
University of Wisconsin-Milwaukee
Outline
• Provide an overview of the digitization process
  - Planning a digitization project
  - Selection of source items
  - Image capture and image processing
  - Indexing and descriptive metadata
  - Building an online collection
  - Maintaining the collection
  - Preservation of master files
• Share the lessons learned from the project

Goals for the Pilot Project
• Create the first digitization project at the UWM Libraries
• Build a searchable collection accessible through the Internet
• Investigate digital library best practices and standards
• Establish an infrastructure for future digitization projects at the UWM Libraries

Collection
Milwaukee Repertory Theater Photographic History
http://www.uwm.edu/Library/digilib/milrep/index.htm
• Image collection – includes 1,800 images
• Documents 195 performances of the Milwaukee Repertory Theater in the years 1977-1994

Planning Your First Project
• Set clear goals for the project
• Consider your audience
• Define the scope and content of the project
• Evaluate the source collection
  - Copyright
  - Format and size
  - Level of indexing
• Determine
  - Staffing
  - Timeframe
  - Cost

Source Collection
• Mark Avery Photography: Photographs 1977-1994 consists of thousands of black and white 35 mm film negatives
  - Housed at the Archives, UWM Libraries
  - Finding aid available at: http://www.uwm.edu/Libraries/arch/findaids/uwmmss155.htm
• Mark Avery worked as the staff photographer for the Milwaukee Repertory Theater Company from 1976 to 1994
• He donated his collection to the Archives at the UWM Libraries in 1999

Why the Mark Avery Collection?
• Share a unique resource with a wider audience
  - Archives at the UWM Libraries owns copyright to the collection
• Increase the visibility of the collection and encourage new scholarly use
• Improve access to the collection
• Create an online resource for the UWM theater students, researchers, and community users
• Enhance intellectual control through indexing and creation of metadata

William Shakespeare, Romeo and Juliet
1978-1979
Thomas Hulce as Romeo and Valerie Mahaffey as Juliet

Arthur Miller, The Crucible
1985-1986
Center: Johanna Melamed as Mary Warren. Right: Daniel Mooney as John Proctor and Albert Farrar as Deputy Governor Danforth

David Mamet, Glengarry Glen Ross
1985-1986
James Pickering as Williamson and Kenneth Albers as Levene

August Wilson, Fences
1989-1990
Lawrence James as Troy

Moliere, Tartuffe
1986-1987
James Pickering as Tartuffe

Brian Friel, Dancing at Lughnasa
1993-1994
Richard Halverson as Jack and Rose Pickering as Kate

Larry Shue, The Foreigner
1992-1993
James Pickering as Charlie and Tom Blair as Owen

Charles Dickens, A Christmas Carol
1987-1988
Daniel Mooney as Ebenezer Scrooge

Selection of Images
• The most representative images capturing key scenes and characters of the play
• The negatives were selected for scanning after careful examination using a light table
• Images selected for all 195 performances represented in the Mark Avery Collection

Image Capture
• Follow digital imaging standards
  - Use-neutral approach
  - Originals vs. intermediaries
• Scan the photographic negatives at 4000 dpi resolution in grayscale mode using a Nikon 4000 ED film scanner
• Create digital master files
  - Save scanned images as uncompressed TIFFs
• Assign a unique name following file naming conventions, e.g. av00001

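The naming convention and scan settings on this slide can be sketched in a few lines of Python. This is an illustration only, not the project's actual tooling: the helper names are hypothetical, the five-digit padding is inferred from the single example av00001, the 36 x 24 mm frame is the nominal 35 mm film frame rather than a measurement from the project, and 8 bits per pixel is an assumed bit depth.

```python
MM_PER_INCH = 25.4


def master_filename(sequence, prefix="av", digits=5, extension=""):
    """Build a name such as av00001 from a running sequence number."""
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    return f"{prefix}{sequence:0{digits}d}{extension}"


def scan_pixels(length_mm, dpi):
    """Pixels produced when a length in millimetres is scanned at a given dpi."""
    return round(length_mm / MM_PER_INCH * dpi)


# Nominal 35 mm frame (assumption), scanned at the 4000 dpi named on the slide.
width_px = scan_pixels(36.0, 4000)
height_px = scan_pixels(24.0, 4000)

# Uncompressed grayscale at an assumed 8 bits per pixel is one byte per pixel;
# the small TIFF header is ignored here.
approx_bytes = width_px * height_px

print(master_filename(1), width_px, height_px, approx_bytes)
```

Under these assumptions each master file is on the order of 20 MB, which is worth knowing when budgeting storage for two sets of masters.
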
Image Processing
• Process images using Adobe Photoshop
• Remove dust marks and scratches
• Correct images for tone and contrast, when necessary
• Save the changes in the working copy of the master TIFF file
• Create derivative (access) images for Web delivery and save them in JPEG format

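The project created its derivatives in Adobe Photoshop. As a rough sketch of the planning arithmetic behind that step, the Python below works out the pixel dimensions and file names of access images from a master. The function names, the 150 and 600 pixel edge lengths, and the master size are all assumptions made for illustration, not values from the talk.

```python
from pathlib import PurePosixPath


def fit_within(width, height, max_edge):
    """Scale (width, height) so the longer edge is at most max_edge pixels."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # never enlarge a master
    scale = max_edge / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def derivative_name(master_name, suffix=""):
    """Map a master TIFF name to the name of its JPEG derivative."""
    stem = PurePosixPath(master_name).stem
    return f"{stem}{suffix}.jpg"


MASTER_SIZE = (5669, 3780)  # assumed size of a 4000 dpi scan of a 35 mm frame
ACCESS_EDGES = {"thumbnail": 150, "display": 600}  # hypothetical target sizes

for label, edge in ACCESS_EDGES.items():
    name = derivative_name("av00001.tif", f"_{label}")
    print(label, name, fit_within(*MASTER_SIZE, edge))
```

Keeping the derivative name tied to the master's stem makes it easy to trace any Web image back to its archival file.
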
Indexing
• The negatives filed by season and performance; no other indexing data available
• A research process accompanied the creation of the digital collection
• Cooperation between the UWM Libraries and Milwaukee Repertory Theater Company was established at an early phase of the project

Research
• Gather the indexing data: names of actors and characters featured in the images, play titles, authors, dates, names of other contributing artists, such as directors, costume and set designers, and lighting designers
• Examine research materials (play scripts, programs, and photographic prints)
• Consult with subject experts

Browsable Collection
http://www.uwm.edu/Library/digilib/milrep/records/browse.htm
Building an Online Collection
• In order to build an online searchable collection you need a digital delivery system
Possible solutions
1. Develop an application in-house using a generic database (e.g. MS Access, MySQL) + middleware (e.g. PHP, ColdFusion)
2. Purchase a digital management program, e.g. CONTENTdm or Luna Insight
3. Use open source digital library software, such as Greenstone (New Zealand) or DLXS

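To give a sense of what the first option involves, here is a minimal sketch of an in-house searchable image table. It uses Python's built-in sqlite3 module purely because it is self-contained; the talk names MS Access and MySQL with PHP or ColdFusion, not SQLite. The table layout, column names, and file ids are hypothetical, and the two sample rows reuse captions shown earlier in this presentation.

```python
import sqlite3

# Placeholder file ids; the captions come from the sample slides above.
SAMPLE_ROWS = [
    ("av00001", "Romeo and Juliet", "William Shakespeare", "1978-1979",
     "Thomas Hulce as Romeo and Valerie Mahaffey as Juliet"),
    ("av00002", "The Crucible", "Arthur Miller", "1985-1986",
     "Daniel Mooney as John Proctor and Albert Farrar as Deputy Governor Danforth"),
]


def build_catalog(rows):
    """Create an in-memory catalog and load one row per image."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        "file_id TEXT PRIMARY KEY, title TEXT, author TEXT, "
        "season TEXT, description TEXT)"
    )
    conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, keyword):
    """Keyword search across title, author, and description."""
    pattern = f"%{keyword}%"
    cursor = conn.execute(
        "SELECT file_id, title FROM images "
        "WHERE title LIKE ? OR author LIKE ? OR description LIKE ? "
        "ORDER BY file_id",
        (pattern, pattern, pattern),
    )
    return cursor.fetchall()


conn = build_catalog(SAMPLE_ROWS)
print(search(conn, "juliet"))
```

Even this toy version needs a schema, a loader, and a search routine before any web interface exists, which is the programming cost the next slide weighs against a commercial package.
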
Image Delivery Systems
In-house Developed Applications
• Advantages
  - Low initial cost
  - Flexibility in database and web interface design
• Disadvantages
  - Limited database size
  - High cost of programming
Commercial Digital Management Programs
• Advantages
  - No programming required
  - Offer database structure plus a web interface
• Disadvantages
  - Proprietary
  - Offer limited customization

CONTENTdm: Digital Media Management System
• A multifunction software suite used to build and manage multimedia collections and make them available on the Web
  - Import, index, store, and manage digital objects, as well as search and display them
  - Can store many digital media types including images, text documents, compound objects, audio and video files
• Designed for library and cultural heritage collections

CONTENTdm: Digital Media Management System
• Built on digital library standards
  - XML, Dublin Core, VRA Core 3.0 (Visual Resources Association)
  - Supports OAI (Open Archives Initiative) Protocol for Metadata Harvesting
• Supports single and multiple collections
  - Individual metadata for each collection
  - Capability to search across collections
• Offers batch loading to WorldCat starting with version 3.5
  - Version 3.7 released in August 2004

Descriptive Metadata
• Metadata to provide a description of the digital object and its intellectual content
  - Describe objects in a consistent, standardized way
  - Enhance access: provide means of searching in multiple ways
  - Facilitate access to the original source
• Descriptive metadata standards
  - Dublin Core
  - VRA Core 3.0 (Visual Resources Association)

Dublin Core
• CONTENTdm provides a default metadata template with the 15 Dublin Core elements
  - Title
  - Creator
  - Subject
  - Description
  - Publisher
  - Contributor
  - Date
  - Type
  - Format
  - Identifier
  - Source
  - Language
  - Relation
  - Coverage
  - Rights

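For readers who have not seen a Dublin Core record, the sketch below serializes a handful of the 15 elements as XML using Python's standard library. It does not reproduce CONTENTdm's template or export format: the `record` wrapper element, the function name, the file id, and the mapping of values to elements are assumptions for illustration. Only the element names and the Dublin Core namespace URI are standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

ET.register_namespace("dc", DC_NS)


def build_dc_record(pairs):
    """Turn (element, value) pairs into a simple Dublin Core XML record."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in pairs:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative values: the caption comes from the sample slides,
# the identifier is a placeholder.
sample = build_dc_record([
    ("title", "Romeo and Juliet"),
    ("creator", "Mark Avery"),
    ("date", "1978-1979"),
    ("description", "Thomas Hulce as Romeo and Valerie Mahaffey as Juliet"),
    ("type", "Image"),
    ("identifier", "av00001"),
])
xml_text = ET.tostring(sample, encoding="unicode")
print(xml_text)
```

Because every element is optional and repeatable in Dublin Core, the function accepts a list of pairs rather than a fixed set of fields.
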
Customization of Metadata Templates

Controlled Vocabulary
• Use controlled vocabulary to ensure consistent metadata entry
• Define a list of valid terms for a field
• Create controlled vocabulary lists as text files and import them to CONTENTdm
• Add cross-reference terms to the lists

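The idea of a term list with cross-references can be sketched as follows. The talk does not describe the file format CONTENTdm imports, so the layout here (one preferred term per line, with `variant => preferred` lines for cross-references) is an assumption, as are the function names and the sample cross-reference. The preferred terms echo the contributor roles listed on the Research slide.

```python
VOCAB_TEXT = """\
Costume designers
Directors
Lighting designers
Set designers
Scenic designers => Set designers
"""


def load_vocabulary(text):
    """Parse a term list into preferred terms and cross-references."""
    preferred = {}
    cross_refs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=>" in line:
            variant, target = (part.strip() for part in line.split("=>", 1))
            cross_refs[variant.lower()] = target
        else:
            preferred[line.lower()] = line
    return preferred, cross_refs


def resolve(term, preferred, cross_refs):
    """Return the valid form of a term, or None if it is not in the list."""
    key = term.strip().lower()
    if key in preferred:
        return preferred[key]
    if key in cross_refs:
        return cross_refs[key]
    return None


preferred, cross_refs = load_vocabulary(VOCAB_TEXT)
print(resolve("scenic designers", preferred, cross_refs))
```

Returning None for unknown terms lets a data-entry check flag them for review instead of silently accepting a new spelling.
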
Building Records with CONTENTdm
Collection Interface
• Use a default HTML client provided by CONTENTdm
or
• Design a customized collection interface
  - Search page
  - Search Results Template – displays a number of thumbnails and their titles
  - Item Display Template – displays large image and its associated metadata

Search Page
Search Results Page
Item Display – Main Record
Resource Discovery Metadata
• Metadata for discovery and retrieval of the site on the Internet
• Metadata embedded within the HTML tags of the main (index) page
• A set of Dublin Core metadata elements describing the project on the collection level
  - The Dublin Core Metadata Template available from the Nordic Metadata Project at http://www.lub.lu.se/metadata/DC_creator.html

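Collection-level Dublin Core is conventionally embedded in a page's head as `meta` tags whose names carry a `DC.` prefix, alongside a `link` declaring the schema. The sketch below generates such tags in Python. It is not the output of the Nordic Metadata Project template; the function name is hypothetical and the choice of which elements to include is an assumption, with values taken from this presentation.

```python
from html import escape

DC_SCHEMA = "http://purl.org/dc/elements/1.1/"


def dc_meta_tags(pairs):
    """Render (element, value) pairs as HTML meta tags with a DC. prefix."""
    lines = [f'<link rel="schema.DC" href="{DC_SCHEMA}">']
    for name, value in pairs:
        content = escape(value, quote=True)  # protect quotes and ampersands
        lines.append(f'<meta name="DC.{name}" content="{content}">')
    return "\n".join(lines)


head_block = dc_meta_tags([
    ("title", "Milwaukee Repertory Theater Photographic History"),
    ("publisher", "UWM Libraries"),
    ("type", "Image"),
    ("identifier", "http://www.uwm.edu/Library/digilib/milrep/index.htm"),
])
print(head_block)
```

The resulting block would be pasted inside the `head` element of the collection's index page.
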
Maintaining the Collection
• Document the digitization process
  - Compile documentation during the project
  - Write final project report
• Promote the collection
  - Issue press releases and announcements
  - Schedule presentations and workshops
• Update the collection with feedback from users
• Enable OAI support on the server and register the collection with an OAI harvesting service, e.g. OAIster http://www.oaister.org/o/oaister/
• Update the collection over time according to new digital technologies and standards

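Once OAI support is enabled, a harvester retrieves records by sending simple HTTP requests with a `verb` parameter. The Python sketch below only builds such request URLs; it does not contact a server. The base URL is a placeholder, since the talk does not give the collection's OAI endpoint, and the function name is hypothetical. The six verbs and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments)
    return f"{base_url}?{urlencode(query)}"


BASE_URL = "https://digital.example.edu/oai"  # placeholder endpoint

identify_url = oai_request_url(BASE_URL, "Identify")
records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(identify_url)
print(records_url)
```

Opening the Identify URL in a browser is a quick way to confirm that OAI support is working before registering with a harvesting service.
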
Preservation
• Two sets of master TIFF files are stored at UWM Libraries
  - The archival master files can be used to create other types of digital derivatives or high-quality prints
• Document the digitization process to ensure long-term retention of the archival files
  - Follow guidelines included in the OCLC/RLG report “A Metadata Framework to Support the Preservation of Digital Objects”
  - Follow the NISO standard “Technical Metadata for Digital Still Images” to record the metadata on the item level

Preservation Metadata
• Metadata for identifying master files and maintaining them over time
• Collection level metadata: compression, resolution, and bit depth
• Metadata on item level: digital file id, file size, dimensions in pixels, scan date, and master file location

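The collection-level and item-level fields listed on this slide map naturally onto two small record types. The sketch below models them in Python and writes the item records as CSV. The project recorded this metadata in an MS Access database, so the CSV output is only a stand-in. The class and field names are hypothetical, and every sample value except "uncompressed" and 4000 dpi (the bit depth, file size, pixel dimensions, scan date, and path) is a placeholder, not project data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass(frozen=True)
class CollectionProfile:
    """Settings shared by every master file in the collection."""
    compression: str
    resolution_dpi: int
    bit_depth: int


@dataclass(frozen=True)
class MasterFileRecord:
    """Item-level preservation metadata for one master file."""
    file_id: str
    file_size_bytes: int
    width_px: int
    height_px: int
    scan_date: date
    master_location: str


def to_csv(records):
    """Serialize item records as CSV text with a header row."""
    buffer = io.StringIO()
    names = [field.name for field in fields(MasterFileRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["scan_date"] = record.scan_date.isoformat()
        writer.writerow(row)
    return buffer.getvalue()


# Bit depth is an assumed value; the talk specifies only grayscale mode.
profile = CollectionProfile(compression="none", resolution_dpi=4000, bit_depth=8)

# Placeholder item values for illustration only.
items = [
    MasterFileRecord("av00001", 21428820, 5669, 3780,
                     date(2004, 1, 15), "/masters/set1/av00001.tif"),
]
csv_text = to_csv(items)
print(csv_text)
```

Keeping the shared settings in one collection-level record avoids repeating them on every item, which matches the split described on the slide.
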
Standards & Tools
• Standards used to represent content
  - TIFF
  - JPEG
• Standards used to describe content
  - Dublin Core
• Standards and tools used to represent structure
  - HTML
  - CONTENTdm software

Standards & Tools
• Standards, guidelines, and tools used to record preservation metadata
  - OAIS (Open Archival Information System) Information Model
  - NISO standard: Technical Metadata for Digital Still Images
  - MS Access database

Lessons
• Focus on the users and outcomes
• Apply standards to build a robust and sustainable collection
• Avoid the hidden costs of internal development of applications
  - Select a commercial digital image delivery system if programming expertise is not available
• Include time for indexing and metadata creation in your project plan
  - Metadata creation can take up to 2/3 of the project time
• Address the issue of master file preservation

UWM Libraries Digital Collections
URL: http://www.uwm.edu/Library/digilib/