Describing Images for the Digital Environment


Translating collective description
skills to the item level
Amanda Focke, CA
Description skills for collections & groups
o As archivists we are trained to describe at
the collective level, and we are less often
called upon to describe individual
objects. However, we all know about:
Components of finding aids (data content)
Containers for that description (data structure)
Sharing description (data exchange)
How do I use my descriptive skills at the
item level?
o If you can assign a title to a collection and write
a bio note, scope note, and the rest of the
finding aid parts, you can describe single items,
too!
o How do I get started?
o What are the corresponding standards and practices for
describing and sharing items in a digital environment?
Parallel standards
Community | Material Culture | Bibliographic | Archival
Data Structure | CDWA | MARC | EAD for finding aids, Dublin Core for describing objects, MODS / METS for digital objects, TEI for encoding full text
Data Content | CCO | AACR2 (RDA) | DACS
Data Value Standards (Controlled vocabularies) | LCSH, LCNAF, and others; AAT, TGN, LCSH, LCNAF, and others
Data Format | XML | XML / ISO 2709 | XML
Data Exchange | OAI | OAI, Z39.50, SRU/SRW | OAI, “Google-ability”
Table based on “Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives
and Museums” by Mary W. Elings and Günter Waibel, First Monday, volume 12, number 3 (March 2007),
URL: http://firstmonday.org/issues/issue12_3/elings/index.html
All archival acronyms noted and described in Resources slide.
o As the previous table shows, a number of data structures are
used for archival materials, Dublin Core and MODS/METS in
particular. Each has its own guidelines for how best to use it.
o Dublin Core is currently the more common standard. It has pros
and cons. It is easy to use and can be qualified (for example,
Date.original and Date.digital), but it still doesn't allow
much granularity of information.
o MODS / METS are more complex to use but more specific in the
data they work together to hold, and they allow much finer
granularity (for example, MODS records a creation date and a
digitization date in distinct elements, dateCreated and dateCaptured).
o The difference is something like that between a simple database
table (DC) and a relational database (MODS/METS).
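The flat-table versus relational contrast can be sketched in Python. This is only an illustration, not a schema-valid record of either kind: the qualified Dublin Core labels follow the field names used in this presentation, the nested element names (titleInfo, name, originInfo, dateCreated, dateCaptured) are taken from MODS, the values come from the portrait example, and the helper `flatten` is a hypothetical name introduced here.

```python
# Flat, qualified Dublin Core: a single level of label -> value pairs.
dc_record = {
    "Title": "Caroline Wiess Law portrait",
    "Creator": "Scott, Vera Prasilova",
    "Date.original": "1934",
    "Date.digital": "2007",
}

# MODS-style nesting: related facts are grouped under parent elements,
# so a name can carry its own role and a date can say what kind of date it is.
mods_record = {
    "titleInfo": {"title": "Caroline Wiess Law portrait"},
    "name": {
        "namePart": "Scott, Vera Prasilova",
        "role": {"roleTerm": "photographer"},
    },
    "originInfo": {
        "dateCreated": "1934",
        "dateCaptured": "2007",
    },
}


def flatten(record, prefix=""):
    """Collapse a nested record into dotted label -> value pairs."""
    flat = {}
    for key, value in record.items():
        label = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, label))
        else:
            flat[label] = value
    return flat


# Flattening shows how much structure the nested form was carrying.
for label, value in flatten(mods_record).items():
    print(f"{label}: {value}")
```

Flattening the nested record yields labels such as `name.role.roleTerm`, which is roughly the kind of detail a flat Dublin Core field list has no natural place for.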
Main tools currently used in digital archives projects
Tool | What it provides
Dublin Core metadata standard | Fields used to structure information; use a current “best practices” guide for assistance
Describing Archives: A Content Standard (DACS), by SAA | Guidelines for the content of those fields
AAT, LCSH, LCNAF, and many others | Controlled vocabularies for terms to be used as Index Terms (people, places, things, subjects, formats)
To describe an item, I need…
o The original item and digital version
o The relevant standards & guidelines
o Enough subject knowledge to write a description (metadata)
This is the same approach as for
a collection or record group.
o At Rice we have an institutional repository running on the open source software
DSpace. It accepts many file types for digital objects (JPEG, TIFF, PDF, XML,
and many more) and uses Dublin Core as the metadata structure standard.
o So to put an item online, I need the digital object itself and enough
information to write metadata describing the item.
For example: finding aid parts mapped to
single item metadata parts
Finding aid parts (using DACS) | Single item metadata parts (using Dublin Core guidelines and DACS)
Title | Title
Scope and contents | Description
Bio note / Historical sketch | Description.Abstract
Use restrictions | Rights
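The mapping above can be written down as a small crosswalk. The sketch below assumes a draft description keyed by finding aid part names; the function name `crosswalk` and the sample "Arrangement" entry are hypothetical, and the sample text is borrowed from the portrait example in this presentation.

```python
# Crosswalk from the slide: finding aid part -> single item (Dublin Core) field.
FINDING_AID_TO_DC = {
    "Title": "Title",
    "Scope and contents": "Description",
    "Bio note / Historical sketch": "Description.Abstract",
    "Use restrictions": "Rights",
}


def crosswalk(finding_aid_parts):
    """Relabel finding aid parts as item-level fields.

    Returns (mapped, unmapped): a dict of item fields, and a list of
    parts that have no item-level equivalent in the crosswalk.
    """
    mapped, unmapped = {}, []
    for part, text in finding_aid_parts.items():
        field = FINDING_AID_TO_DC.get(part)
        if field is None:
            unmapped.append(part)
        else:
            mapped[field] = text
    return mapped, unmapped


draft = {
    "Title": "Caroline Wiess Law portrait",
    "Scope and contents": "This photograph of Caroline Wiess Law is part of "
    "the Vera Prasilova Scott portraiture collection.",
    "Arrangement": "(hypothetical part with no mapping on the slide)",
}

item_fields, left_over = crosswalk(draft)
print(item_fields)
print(left_over)  # ['Arrangement']
```

Returning the unmapped parts separately, rather than dropping them silently, makes it easier to spot collection-level habits that have no home in the item record.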
Example of a portrait with known creator
and subject
o Here’s an image for us to describe.
(Prasilova Scott portrait)
o I am using Dublin Core as the
structure, so I have a list of fields to
fill in.
o I am using DACS as the content
standard.
o Filling in the creator's name, dates, etc.
is straightforward in this case. But
how do I create a title for an untitled
object?
o Use DACS and DC guidelines to
fill in the less obvious fields.






Creator: Scott, Vera Prasilova
Title: Caroline Wiess Law portrait
Date.original: 1934
Date.digital: 2007
Description: ????
Description.Abstract: ????
Example of a portrait with known creator
and subject, cont.
o Description (similar to scope note)
  o This photograph of Caroline Wiess Law is part
  of the Vera Prasilova Scott portraiture
  collection. Scott was a studio photographer
  and the wife of a Rice faculty member. She
  took portraits of many prominent Houstonians
  in the late 1920s and early 1930s.
o Description.Abstract (similar to bio note)
  o Caroline Wiess Law was an oil heiress, art
  collector, and major philanthropist in
  Houston. She is perhaps best known for her
  endowment bequests to the Museum of Fine
  Arts, Houston. In 1998 the Museum honored
  her 40-year commitment as a passionate and
  dedicated supporter by renaming the main
  building in her honor. She also supported
  programs at Baylor College of Medicine and
  the University of Texas M.D. Anderson Cancer
  Center, both in Houston.
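Once the fields are filled in, the record has to be handed to the repository in some structured form. The sketch below serializes the portrait's field list to XML with Python's standard library. The `dublin_core` / `dcvalue` layout with `element` and `qualifier` attributes resembles the format DSpace uses for batch import, but whether a given repository accepts qualifiers such as "original" or "digital" is an assumption here, and `to_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Field labels and values as they appear on the slides.
record = [
    ("Creator", "Scott, Vera Prasilova"),
    ("Title", "Caroline Wiess Law portrait"),
    ("Date.original", "1934"),
    ("Date.digital", "2007"),
]


def to_xml(fields):
    """Serialize (label, value) pairs, splitting 'Element.qualifier' labels."""
    root = ET.Element("dublin_core")
    for label, value in fields:
        # "Date.original" -> element "date", qualifier "original";
        # an unqualified label such as "Title" gets qualifier "none".
        element, _, qualifier = label.partition(".")
        node = ET.SubElement(
            root,
            "dcvalue",
            element=element.lower(),
            qualifier=qualifier.lower() or "none",
        )
        node.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(record)
print(xml_text)
```

The Description and Description.Abstract text can be appended to `record` in the same way; they are left out here only to keep the output short.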
Example of a snapshot with unknown
creator and semi-known subject
o Creator: Unknown photographer
o Title: ????
o Date.original: 1957
o Date.digital: 2007
o Description: ??????
o Description.Abstract: ??????
Example of a snapshot with unknown
creator and semi-known subject, cont.
o Title:
  o Gus and Lyndall Wortham, at formal social
  event
o Description (similar to scope note)
  o Snapshot photograph of Gus S. Wortham
  and Lyndall Finley Wortham, from the Gus
  S. Wortham family and business records.
o Description.Abstract (similar to bio note)
  o Gus Sessions Wortham (1891-1976),
  businessman, civic leader, cattle rancher
  and philanthropist, served as chairman of
  the board and chief executive officer of
  American General for almost five
  decades. He and his wife, Lyndall Finley
  Wortham (1892-1980), are also
  remembered for establishing the Wortham
  Foundation for support of cultural
  organizations and parks in Houston.
That was the hard part?
o Yes, the title and description fields are typically the most labor
intensive because each item is unique.
o Each repository and each project will have to decide how complete
and detailed its metadata will be, based on available staff
and other resources.
o There are more descriptive fields, administrative fields, and
technical fields.
As with any archival description, the more time you spend on one
thing, the less time you can spend on another.
Test how long you’re taking per metadata record, and check whether that pace, scaled up to the
whole project, fits your overall digitization goals.
The other fields are important to include but are somewhat more straightforward to
implement.
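The scaling check is simple arithmetic, sketched below. All of the numbers (500 images, 12 or 9 minutes per record, 80 available staff hours) are hypothetical, as are the function names.

```python
def estimate_hours(minutes_per_record, item_count):
    """Total staff hours needed at a measured per-record pace."""
    return minutes_per_record * item_count / 60


def fits_budget(minutes_per_record, item_count, hours_available):
    """True if the whole project fits in the staff time available."""
    return estimate_hours(minutes_per_record, item_count) <= hours_available


# Hypothetical project: 500 images, a pilot batch that averaged
# 12 minutes per record, and 80 staff hours available.
print(estimate_hours(12, 500))   # 100.0
print(fits_budget(12, 500, 80))  # False
# Trimming the description to 9 minutes per record brings it to 75 hours.
print(fits_budget(9, 500, 80))   # True
```

Timing a small pilot batch first gives a realistic `minutes_per_record` before committing to a level of detail for the whole collection.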
Examples of other fields
Source: Forms part of the Sam Houston papers
(http://www.rice.edu/fondren/woodson/mss/ms049.html),
Box 1, folder 3, item 1, at the Woodson Research
Center, Fondren Library, Rice University
(http://www.rice.edu/fondren/woodson/, 713-348-1586).
Publisher: Digital version published by Woodson
Research Center, Fondren Library, Rice University.
Rights: This item is licensed under Creative
Commons License 3.0, retaining the owner’s copyright
but freely allowing use and modifications with proper
attribution.
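With descriptive, source, publisher, and rights fields all in play, a quick completeness check before upload can catch records that still have gaps. The sketch below is an assumption-laden illustration: the list of required fields is a hypothetical local policy, not something Dublin Core or DACS mandates, and it treats a run of question marks (as used on the earlier slides) as "not yet filled in".

```python
# Hypothetical local policy: fields every item record must carry.
REQUIRED_FIELDS = ["Creator", "Title", "Date.original", "Description", "Rights"]


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return required fields that are absent, empty, or only question marks."""
    missing = []
    for field in required:
        value = record.get(field, "").strip()
        if not value or set(value) == {"?"}:
            missing.append(field)
    return missing


# The snapshot example before its title and description were written.
snapshot = {
    "Creator": "Unknown photographer",
    "Title": "????",
    "Date.original": "1957",
    "Date.digital": "2007",
    "Description": "??????",
}

print(missing_fields(snapshot))  # ['Title', 'Description', 'Rights']
```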
In closing
The same skills used in collective
description can be used in single-item
description.
You just need the appropriate standards
and guidelines… and a lot of patience!
Resources
o Data Structures
  o EAD: Encoded Archival Description
    An XML language used for marking up archival finding aids, typically for use online.
    http://www.loc.gov/ead/
  o Dublin Core metadata standard
    A simple and commonly used field structure for holding metadata
    http://dublincore.org/
    An example of a best practices document: www.cdpheritage.org/cdp/documents/CDPDCMBP.pdf
  o MODS: Metadata Object Description Schema
    An XML schema intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation
    of original resource description records
    http://www.loc.gov/standards/mods/
  o METS: Metadata Encoding and Transmission Standard
    A standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, using
    an XML schema
    http://www.loc.gov/standards/mets/
  o TEI: Text Encoding Initiative
    An XML schema for encoding machine-readable texts such as books, journals, letters.
    http://www.tei-c.org/
o Data Content
  o DACS: Describing Archives: A Content Standard. SAA publication.
    DACS is an output-neutral set of rules for describing archives, personal papers, and manuscript collections, and can be
    applied to all material types.
Resources cont.
o Controlled Vocabularies
  o AAT = Art & Architecture Thesaurus, by the Getty Research Institute.
    Controlled vocabulary used for format names such as “Studio portraits”.
    http://www.getty.edu/research/conducting_research/vocabularies/aat/
  o LCNAF = Library of Congress Name Authority File.
    Authorized format of proper nouns: people, places, and things.
    http://authorities.loc.gov/
  o LCSH = Library of Congress Subject Headings.
    Authorized headings for topics / subjects.
    http://authorities.loc.gov/
  o TGN = Thesaurus of Geographic Names.
    Controlled vocabulary used for geographic names.
    http://www.getty.edu/research/conducting_research/vocabularies/tgn/
o Data Format
  o XML: Extensible Markup Language
    http://www.w3.org/XML/
o Data Exchange
  o OAI-PMH: Open Archives Initiative Protocol for Metadata Harvesting
    A low-barrier mechanism for repository interoperability. Data Providers are repositories that expose
    structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest
    that metadata.
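To make the Data Provider / Service Provider relationship concrete, the sketch below builds the kind of request a harvester sends. `ListRecords`, `metadataPrefix`, `oai_dc`, and `set` are part of the OAI-PMH protocol; the base URL and the set name "photographs" are hypothetical, and no network request is actually made.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; each repository publishes its own base URL.
BASE_URL = "https://repository.example.edu/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc is the unqualified Dublin Core format that every OAI-PMH
    data provider is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set (for example, one collection).
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


print(list_records_url(BASE_URL))
# https://repository.example.edu/oai/request?verb=ListRecords&metadataPrefix=oai_dc
print(list_records_url(BASE_URL, set_spec="photographs"))
```

A Service Provider would fetch such a URL and parse the XML response; the Dublin Core fields written at the item level are what it would receive.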
Contact
o Amanda Focke, CA
o Asst. Head of Special Collections, Fondren Library, Rice University
o [email protected]