Managing Your First Digitization Project
LITA 2004 National Forum
St. Louis, October 2004

Krystyna K. Matusiak
Digital Collections Librarian
UWM Libraries
University of Wisconsin-Milwaukee
[email protected]

Outline
- Provide an overview of the digitization process
  - Planning a digitization project
  - Selection of source items
  - Image capture and image processing
  - Indexing and descriptive metadata
  - Building an online collection
  - Maintaining the collection
  - Preservation of master files
- Share the lessons learned from the project

Goals for the Pilot Project
- Create the first digitization project at the UWM Libraries
- Build a searchable collection accessible through the Internet
- Investigate digital library best practices and standards
- Establish an infrastructure for future digitization projects at the UWM Libraries

Collection
Milwaukee Repertory Theater Photographic History
http://www.uwm.edu/Library/digilib/milrep/index.htm
- Image collection: includes 1,800 images
- Documents 195 performances of the Milwaukee Repertory Theater in the years 1977-1994

Planning Your First Project
- Set clear goals for the project
- Consider your audience
- Define the scope and content of the project
- Evaluate the source collection
  - Copyright
  - Format and size
  - Level of indexing
- Determine
  - Staffing
  - Timeframe
  - Cost

Source Collection
- Mark Avery Photography: Photographs 1977-1994 consists of thousands of black and white 35 mm film negatives
- Housed at the Archives, UWM Libraries
- Finding aid available at: http://www.uwm.edu/Libraries/arch/findaids/uwmmss 155.htm
- Mark Avery worked as the staff photographer for the Milwaukee Repertory Theater Company from 1976 to 1994
- He donated his collection to the Archives at the UWM Libraries in 1999

Why the Mark Avery Collection?
- Share a unique resource with a wider audience
- Archives at the UWM Libraries owns copyright to the collection
- Increase the visibility of the collection and encourage new scholarly use
- Improve access to the collection
- Create an online resource for the UWM theater students, researchers, and community users
- Enhance intellectual control through indexing and creation of metadata

William Shakespeare, Romeo and Juliet, 1978-1979
Thomas Hulce as Romeo and Valerie Mahaffey as Juliet

Arthur Miller, The Crucible, 1985-1986
Center: Johanna Melamed as Mary Warren. Right: Daniel Mooney as John Proctor and Albert Farrar as Deputy Governor Danforth

David Mamet, Glengarry Glen Ross, 1985-1986
James Pickering as Williamson and Kenneth Albers as Levene

August Wilson, Fences, 1989-1990
Lawrence James as Troy

Moliere, Tartuffe, 1986-1987
James Pickering as Tartuffe

Brian Friel, Dancing at Lughnasa, 1993-1994
Richard Halverson as Jack and Rose Pickering as Kate

Larry Shue, The Foreigner, 1992-1993
James Pickering as Charlie and Tom Blair as Owen

Charles Dickens, A Christmas Carol, 1987-1988
Daniel Mooney as Ebenezer Scrooge

Selection of Images
- The most representative images capturing key scenes and characters of the play
- The negatives were selected for scanning after careful examination using a light table
- Images selected for all 195 performances represented in the Mark Avery Collection

Image Capture
- Follow digital imaging standards
- Use-neutral approach
- Originals vs. intermediaries
- Scan the photographic negatives at 4000 dpi resolution in grayscale mode using a Nikon 4000 ED film scanner
- Create digital master files
- Save scanned images as uncompressed TIFFs
- Assign a unique name following file naming conventions, e.g.
  av00001

Image Processing
- Process images using Adobe Photoshop
- Remove dust marks and scratches
- Correct images for tone and contrast, when necessary
- Save the changes in the working copy of master TIFF file
- Create derivative (access) images for Web delivery and save them in JPEG format

Indexing
- The negatives filed by season and performance; no other indexing data available
- A research process accompanied the creation of the digital collection
- Cooperation between the UWM Libraries and Milwaukee Repertory Theater Company was established at an early phase of the project

Research
- Gather the indexing data: names of actors and characters featured in the images, play titles, authors, dates, names of other contributing artists, such as directors, costume and set designers, and lighting designers
- Examine research materials (play scripts, programs, and photographic prints)
- Consult with subject experts

Browsable Collection
http://www.uwm.edu/Library/digilib/milrep/records/browse.htm

Building an Online Collection
- In order to build an online searchable collection you need a digital delivery system
- Possible solutions
  1. Develop an application in-house using a generic database (e.g. MS Access, MySQL) + middleware (e.g. PHP, ColdFusion)
  2. Purchase a digital management program, e.g.
     CONTENTdm or Luna Insight
  3. Use open source digital library software, such as Greenstone (New Zealand) or DLXS

Image Delivery Systems
In-house Developed Applications
- Advantages
  - Low initial cost
  - Flexibility in database and web interface design
- Disadvantages
  - Limited database size
  - High cost of programming
Commercial Digital Management Programs
- Advantages
  - No programming required
  - Offer database structure plus a web interface
- Disadvantages
  - Proprietary
  - Offer limited customization

CONTENTdm: Digital Media Management System
- A multifunction software suite used to build and manage multimedia collections and make them available on the Web
- Import, index, store, and manage digital objects, as well as search and display them
- Can store many digital media types including images, text documents, compound objects, audio and video files
- Designed for library and cultural heritage collections

CONTENTdm: Digital Media Management System
- Built on digital library standards
  - XML, Dublin Core, VRA Core 3.0 (Visual Resources Association)
  - Supports OAI (Open Archives Initiative) Protocol for Metadata Harvesting
- Supports single and multiple collections
  - Individual metadata for each collection
  - Capability to search across collections
- Offers batch loading to WorldCat starting with version 3.5
- Version 3.7 released in August 2004

Descriptive Metadata
- Metadata to provide a description of the digital object and its intellectual content
  - Describe objects in a consistent, standardized way
  - Enhance access: provide means of searching in multiple ways
  - Facilitate access to the original source
- Descriptive metadata standards
  - Dublin Core
  - VRA Core 3.0 (Visual Resources Association)

Dublin Core
CONTENTdm provides a default metadata template with the 15 Dublin Core elements
- Title
- Creator
- Subject
- Description
- Publisher
- Contributor
- Date
- Type
- Format
- Identifier
- Source
- Language
- Relation
- Coverage
- Rights

Customization of Metadata Templates

Controlled Vocabulary
- Use controlled vocabulary to ensure
  consistent metadata entry
- Define a list of valid terms for a field
- Create controlled vocabulary lists as text files and import them to CONTENTdm
- Add cross-reference terms to the lists

Building Records with CONTENTdm

Collection Interface
- Use a default HTML client provided by CONTENTdm, or
- Design a customized collection interface
  - Search page
  - Search Results Template: displays a number of thumbnails and their titles
  - Item Display Template: displays large image and its associated metadata

Search Page

Search Results Page

Item Display: Main Record

Resource Discovery Metadata
- Metadata for discovery and retrieval of the site on the Internet
- Metadata embedded within the HTML tags of the main (index) page
- A set of Dublin Core metadata elements describing the project on the collection level
- The Dublin Core Metadata Template available from the Nordic Metadata Project at http://www.lub.lu.se/metadata/DC_creator.html

Maintaining the Collection
- Document the digitization process
  - Compile documentation during the project
  - Write final project report
- Promote the collection
  - Issue press releases and announcements
  - Schedule presentations and workshops
- Update the collection with feedback from users
- Enable OAI support on the server and register the collection with an OAI harvesting service, e.g.
  OAIster http://www.oaister.org/o/oaister/
- Update the collection over time as digital technologies and standards change

Preservation
- Two sets of master TIFF files are stored at UWM Libraries
- The archival master files can be used to create other types of digital derivatives or high-quality prints
- Document the digitization process to ensure long-term retention of the archival files
- Follow guidelines included in the OCLC/RLG report “A Metadata Framework to Support the Preservation of Digital Objects”
- Follow the NISO standard “Technical Metadata for Digital Still Images” to record the metadata on the item level

Preservation Metadata
- Metadata for identifying master files and maintaining them over time
- Collection level metadata: compression, resolution, and bit depth
- Metadata on item level: digital file id, file size, dimensions in pixels, scan date, and master file location

Standards & Tools
- Standards used to represent content
  - TIFF
  - JPEG
- Standards used to describe content
  - Dublin Core
- Standards and tools used to represent structure
  - HTML
  - CONTENTdm software

Standards & Tools
- Standards, guidelines, and tools used to record preservation metadata
  - OAIS (Open Archival Information System) Information Model
  - NISO standard: Technical Metadata for Digital Still Images
  - MS Access database

Lessons
- Focus on the users and outcomes
- Apply standards to build a robust and sustainable collection
- Avoid the hidden costs of internal development of applications
- Select a commercial digital image delivery system if programming expertise is not available
- Include time for indexing and metadata creation in your project plan
  - Metadata creation can take up to 2/3 of the project time
- Address the issue of master file preservation

UWM Libraries Digital Collections
URL: http://www.uwm.edu/Library/digilib/
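
Supplementary sketch: file naming (Image Capture slide)
The Image Capture slide gives av00001 as an example of a unique file name. The sketch below shows one way such names could be generated and checked. The prefix and the five-digit padding are read off that single example; the helper names and the .tif/.jpg extensions are assumptions made for illustration, not the project's documented convention.

```python
import re

PREFIX = "av"  # assumed: taken from the slide's example name av00001
DIGITS = 5     # assumed: five zero-padded digits, as in av00001


def master_name(sequence: int, ext: str = "tif") -> str:
    """Return a name such as av00001.tif for the given sequence number."""
    if not 1 <= sequence < 10 ** DIGITS:
        raise ValueError(f"sequence must be between 1 and {10 ** DIGITS - 1}")
    return f"{PREFIX}{sequence:0{DIGITS}d}.{ext}"


# Pattern for checking existing files against the assumed convention.
NAME_PATTERN = re.compile(rf"^{PREFIX}\d{{{DIGITS}}}\.(tif|jpg)$")


def is_valid_name(name: str) -> bool:
    """True if the name matches prefix + five digits + .tif or .jpg."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    print(master_name(1))             # master TIFF for the first image
    print(master_name(1800, "jpg"))   # access JPEG for image 1800
    print(is_valid_name("av1.tif"))   # unpadded name is rejected
```

Keeping the same base name for the master TIFF and its JPEG derivative makes it straightforward to trace an access image back to its master file.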
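
Supplementary sketch: master file size (Image Capture slide)
Scanning at 4000 dpi and saving uncompressed TIFFs has direct storage consequences. The arithmetic below is a rough estimate only. It assumes the nominal 36 x 24 mm frame of 35 mm film and 8 bits per pixel; the slides say only "grayscale" and do not state the bit depth or the exact scanned area, and TIFF header overhead is ignored.

```python
MM_PER_INCH = 25.4


def scan_dimensions(width_mm: float, height_mm: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a scan of the given physical size at the given dpi."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Raw pixel data size; file headers and tags are not counted."""
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # Assumed values: nominal 35 mm frame, 8-bit grayscale.
    width, height = scan_dimensions(36, 24, 4000)
    size = uncompressed_bytes(width, height, 8)
    print(f"{width} x {height} px, about {size / 1_000_000:.1f} MB per master file")
```

Under these assumptions each master file is on the order of 20 MB, which is worth factoring into the storage plan when two sets of masters are kept.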
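
Supplementary sketch: embedded Dublin Core (Resource Discovery Metadata slide)
The Resource Discovery Metadata slide describes collection-level Dublin Core elements embedded in the HTML of the index page. A minimal sketch of generating such tags follows. The "DC." name prefix and the schema link follow the common DCMI convention for HTML meta elements; the function name and the sample values are illustrative and are not copied from the project's actual index page.

```python
from html import escape


def dc_meta_tags(fields: dict[str, str]) -> str:
    """Render Dublin Core elements as HTML <meta> tags for a page <head>."""
    # Schema declaration used by the DCMI convention for the "DC." prefix.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in fields.items():
        content = escape(value, quote=True)  # protect quotes and ampersands
        lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative collection-level values based on the slides.
    print(dc_meta_tags({
        "title": "Milwaukee Repertory Theater Photographic History",
        "publisher": "University of Wisconsin-Milwaukee Libraries",
        "identifier": "http://www.uwm.edu/Library/digilib/milrep/index.htm",
    }))
```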
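
Supplementary sketch: checking terms against a vocabulary list (Controlled Vocabulary slide)
The Controlled Vocabulary slide recommends keeping lists of valid terms as text files. The sketch below checks field values against such a list before records are built. It assumes one term per line; the inverted name form and the function names are hypothetical, and the sketch does not reproduce CONTENTdm's own import or validation behavior.

```python
def load_vocabulary(text: str) -> set[str]:
    """Parse a vocabulary list, assuming one term per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def invalid_terms(values: list[str], vocabulary: set[str]) -> list[str]:
    """Return the values that do not appear in the controlled vocabulary."""
    return [value for value in values if value not in vocabulary]


if __name__ == "__main__":
    # Hypothetical actor-name list; in practice this would be read from a file.
    actors = load_vocabulary("Pickering, James\nMooney, Daniel\nAlbers, Kenneth\n")
    entered = ["Pickering, James", "James Pickering"]
    print(invalid_terms(entered, actors))  # the uninverted form is flagged
```

Running a check like this before import catches variant spellings early, when they are cheaper to correct than after records are published.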
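
Supplementary sketch: item-level preservation records (Preservation Metadata slide)
The Preservation Metadata slide lists the item-level fields: digital file id, file size, dimensions in pixels, scan date, and master file location. The project recorded these in an MS Access database; the sketch below swaps in a plain CSV export to show the shape of such a record. The class name, field names, and sample values are all placeholders.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class MasterFileRecord:
    """Item-level preservation metadata for one master file."""
    file_id: str
    file_size_bytes: int
    width_px: int
    height_px: int
    scan_date: date
    location: str


def to_csv(records: list[MasterFileRecord]) -> str:
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(MasterFileRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = MasterFileRecord("av00001", 21428820, 5669, 3780,
                              date(2004, 1, 15), "/masters/set1/av00001.tif")
    print(to_csv([sample]))
```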