METADATA: What is It and What can I Do with It?


METADATA
What Is It and What Can I
Do With It?
Vicki L. Gregory
Associate Professor
School of Library & Information Science
University of South Florida
What is Metadata?
• Data about data
– A library catalog
– Database records from indexing and
abstracting services
– Metatags/descriptors for information
available across a network
If the Internet is to continue to
thrive, “something very much like
traditional library services will be
needed to organize access and
preserve networked information.”
Clifford Lynch, Scientific American
Information Retrieval on the
Web: The Impossible Dream?
• Super- or metacatalog?
• Robot-generated indexes
• Encoded text
MARC is a Metadata Format
• Advantage: a way of integrating metadata into
existing library systems
• Disadvantage: personnel-intensive
Robot-Generated Indexes:
Harvesting Information
from Web Sites
• HTML META tags
• Attributes
– CONTENT
– HTTP-EQUIV
– NAME
• Example
– <META NAME="Keywords" CONTENT="metadata, Dublin Core, TEI">
– <META NAME="Description" CONTENT="Discusses the
concept of metadata, its various formats, and the
strengths and weaknesses of each.">
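A minimal sketch of how an indexing robot might harvest these tags, assuming Python and its standard html.parser module. The class name MetaTagHarvester and the sample page are hypothetical, made up for illustration; real robots differ in which tags they read and how much weight they give them.

```python
from html.parser import HTMLParser


class MetaTagHarvester(HTMLParser):
    """Collect NAME/CONTENT pairs from META tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.metadata[name.lower()] = content


# Sample page invented for this sketch, based on the example tags above.
page = """
<HTML><HEAD>
<META NAME="Keywords" CONTENT="metadata, Dublin Core, TEI">
<META NAME="Description" CONTENT="Discusses the concept of metadata.">
</HEAD><BODY>Page text goes here.</BODY></HTML>
"""

harvester = MetaTagHarvester()
harvester.feed(page)

# Split the keyword string into individual index terms.
keywords = [term.strip() for term in harvester.metadata["keywords"].split(",")]
print(keywords)
print(harvester.metadata["description"])
```

Tags that carry only HTTP-EQUIV and no NAME are skipped in this sketch; a fuller harvester would need to decide how to treat them.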
Dublin Core
• Enrichment of information about a
document provided either by the
author or a third party, such as a
library cataloger
• 15-element metadata set allowing
metadata to be attached or
embedded in a large number of Web
documents
Dublin Core Elements
• Title
• Author
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
Partial Example of Dublin
Core
• <meta NAME="DC.identifier"
CONTENT="http://www.cas.usf.edu/lis/lis6511/">
• <meta NAME="DC.author" CONTENT="Vicki L. Gregory">
• <meta NAME="DC.subject" CONTENT="collection
development, selection, weeding, preservation,
intellectual freedom">
Example (Continued)
• <meta NAME="DC.description" CONTENT="A survey
course dealing with all aspects of collection
development and collection maintenance issues.">
• <meta NAME="DC.date" CONTENT="January 5, 1999">
• <meta NAME="DC.language" CONTENT="English">
• <meta NAME="DC.format" CONTENT="HTML">
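Tags like these can be generated from a simple record rather than typed by hand. The following is a rough sketch in Python, not a standard tool: the function name dublin_core_meta_tags is hypothetical, and the "DC." name prefix is assumed as the naming convention for embedding Dublin Core in HTML.

```python
from html import escape


def dublin_core_meta_tags(record):
    """Render a dict of Dublin Core element -> value as HTML META tags.

    Hypothetical helper; assumes the "DC." prefix for element names.
    """
    lines = []
    for element, value in record.items():
        name = escape("DC." + element, quote=True)
        # Escape quotes and angle brackets so the value cannot break the tag.
        content = escape(value, quote=True)
        lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


# Values taken from the example slides above.
course = {
    "identifier": "http://www.cas.usf.edu/lis/lis6511/",
    "author": "Vicki L. Gregory",
    "language": "English",
    "format": "HTML",
}
print(dublin_core_meta_tags(course))
```

Escaping the CONTENT value matters because a description that itself contains a quotation mark would otherwise end the attribute early.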
Resource Description
Framework (RDF)
• Another effort to standardize description
and resource discovery for the Web
• Developed by World Wide Web
Consortium (W3C)
• Netscape and Microsoft have developed
tools to accommodate RDF specifications
U.S. Government Metadata
Standards
• FGDC’s CSDGM (Content Standard for
Digital Geospatial Metadata)
– minus: very complex, over 300 different elements
with differing options for application
– plus: allows sharing of data among
geographic information systems.
U.S. Government Metadata
Standards
• GILS (Government Information Locator Service)
– Federal Depository libraries required to provide at
least one GILS point of access to the public
– GILS locator records may describe libraries, and,
thus, incorporate them into the GILS system
– Rich source of data co-searchable by Z39.50
online catalogs
– GILS has incorporated MARC definitions with
one-to-one mapping
Human Selection
• Selection according to stated criteria
• Addition of descriptive metadata to aid
in retrieval
Text Encoding Initiative (TEI)
• Humanities-related text
collections
• Header element
– contains bibliographic
information about the
attached document
TEI Header
• File Description
– bibliographic information
• Encoding Description
– editing decisions when encoding document
• Profile Description
– languages used, setting, etc.
• Revision Description
– log of changes made
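The four-part layout can be sketched with Python's standard xml.etree.ElementTree module. The element names (teiHeader, fileDesc, encodingDesc, profileDesc, revisionDesc) follow TEI usage, but the placeholder contents are assumptions for illustration, and the result is a skeleton only, not a complete or validated header.

```python
import xml.etree.ElementTree as ET

header = ET.Element("teiHeader")

# File Description: bibliographic information about the electronic text.
file_desc = ET.SubElement(header, "fileDesc")
title_stmt = ET.SubElement(file_desc, "titleStmt")
ET.SubElement(title_stmt, "title").text = "Blood of the Prophets [electronic text]"
ET.SubElement(title_stmt, "author").text = "Masters, Edgar Lee, 1868-1950"

# Encoding Description: editing decisions made when encoding the document.
encoding_desc = ET.SubElement(header, "encodingDesc")
editorial = ET.SubElement(encoding_desc, "editorialDecl")
ET.SubElement(editorial, "p").text = "All poems, line groups, and lines are represented."

# Profile Description: languages used, setting, etc. (placeholder value).
profile_desc = ET.SubElement(header, "profileDesc")
lang_usage = ET.SubElement(profile_desc, "langUsage")
ET.SubElement(lang_usage, "language").text = "English"

# Revision Description: log of changes made (placeholder entry).
revision_desc = ET.SubElement(header, "revisionDesc")
ET.SubElement(revision_desc, "change").text = "Header created."

sections = [child.tag for child in header]
print(sections)
print(ET.tostring(header, encoding="unicode"))
```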
TEI Header: Partial Example
<TEIHEADER>
<FILEDESC>
<TITLESTMT>
<TITLE TYPE="245">Blood of the Prophets / by
Edgar Lee Masters as Dexter Wallace [electronic
text]</TITLE>
<AUTHOR>Masters, Edgar Lee, 1868-1950</AUTHOR>
</TITLESTMT>
TEI Example (Continued)
<EXTENT>ca. 122 kb</EXTENT>
<PUBLICATIONSTMT>
<PUBLISHER>University of Michigan Humanities Text
Initiative</PUBLISHER>
<PUBPLACE>Ann Arbor, Mich.</PUBPLACE>
</PUBLICATIONSTMT>
</FILEDESC>
<ENCODINGDESC>
<EDITORIALDECL>
<P>All poems, line groups, and lines are represented.
Indentation and table of contents have been preserved.
</P></EDITORIALDECL>
</ENCODINGDESC>
</TEIHEADER>
Future of the TEI Header
• Available to patrons on the Web by
using XML, instead of having to
convert to HTML (with corresponding
loss of information details).
Encoded Archival Description
(EAD)
• SGML DTD designed to reflect the
structure of archival finding aids and
the collections they describe.
• Response to the need for hierarchical
structure and highly contextual
information that are part of the nature
of archival information.
Crosswalks
• Mapping metadata sets
• Concerns
– data loss
– reversibility
– who owns and maintains a given
map
– map variants
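The data-loss and reversibility concerns can be made concrete with a small sketch in Python. The Dublin Core to MARC field map below is illustrative only and should be read as an assumption, not an authoritative crosswalk; the function names are hypothetical.

```python
# Illustrative map only (an assumption, not an official crosswalk).
DC_TO_MARC = {
    "title": "245",
    "author": "720",
    "contributor": "720",
    "subject": "653",
    "description": "520",
    "identifier": "856",
}


def crosswalk(record, mapping):
    """Map each element to its target field; return (mapped, dropped)."""
    mapped, dropped = {}, []
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            dropped.append(element)  # data loss: no target field in this map
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, dropped


def ambiguous_targets(mapping):
    """Target fields fed by more than one source element (not reversible)."""
    sources = {}
    for element, target in mapping.items():
        sources.setdefault(target, []).append(element)
    return {t: s for t, s in sources.items() if len(s) > 1}


record = {
    "title": "Collection Development",
    "author": "Vicki L. Gregory",
    "contributor": "A. Contributor",  # made-up value for illustration
    "coverage": "United States",      # made-up value for illustration
}
mapped, dropped = crosswalk(record, DC_TO_MARC)
print(mapped)
print(dropped)
print(ambiguous_targets(DC_TO_MARC))
```

In this sketch "coverage" is dropped because the map has no target for it, and field 720 receives both author and contributor, so mapping back from MARC cannot tell the two apart.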
Metadata have an
acknowledged role in the
organization of and access
to networked information.
Prediction
At some point in the relatively near future,
catalogers will probably be creating
metadata as commonly as they
now do MARC records.