OAI metadata: why and how


Sharing With the Open
Archives Initiative
Jenn Riley
Metadata Librarian
Indiana University
Purpose of Open Archives Initiative



“develops and promotes interoperability
standards that aim to facilitate the efficient
dissemination of content”
“has its roots in the open access and
institutional repository movements”
“Archive” defined broadly: “as a repository for
stored information”
3/20/07
Getty Technical Talk
2
Early history of the Open Archives
Initiative



Originally defined a “metadata harvesting
protocol” – OAI-PMH
Grew out of efforts to share e-prints
Original work supported by:



Digital Library Federation (DLF)
Coalition for Networked Information (CNI)
National Science Foundation (NSF)
OAI-PMH

Protocol history






Version 1.0 released January 2001
Version 1.1 released July 2001
Version 2.0 released June 2002
No further major revisions planned
Protocol for harvesting metadata, not content
No inherent assumption that the metadata
describes digital content
How OAI-PMH works
Diagram from OAI for Beginners - the Open Archives Forum online tutorial at
http://www.oaforum.org/tutorial/english/intro.htm
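The exchange between harvester and data provider comes down to ordinary HTTP requests: the harvester sends one of the six OAI-PMH verbs to the provider's base URL and gets XML back. A minimal Python sketch of how such request URLs are formed, assuming a hypothetical base URL (http://example.org/oai) and a helper name (build_request) that are not from the talk:

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}

BASE_URL = "http://example.org/oai"  # hypothetical data provider


def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    # The verb goes first, followed by any verb-specific arguments.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Ask the repository to describe itself.
print(build_request(BASE_URL, "Identify"))
# Ask for all records in simple Dublin Core.
print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

Because every data provider answers the same small set of requests, one harvester can in principle talk to any of them.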
Data providers



Set up a server that responds to harvesting
requests
Required to expose metadata in simple Dublin
Core (DC) format
Can supplement DC with metadata in any other
format expressible with an XML schema (e.g.,
CDWA Lite)
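As a rough illustration of the simple DC requirement, the sketch below builds one record in the oai_dc format using Python's standard library. The two namespace URIs are the standard ones for oai_dc and the DC element set; the function name (to_simple_dc) and the record values are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
# Use the conventional prefixes when the record is serialized.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_simple_dc(fields):
    """Build an oai_dc:dc element from (element name, value) pairs."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical record, for illustration only.
record = to_simple_dc([
    ("title", "Untitled landscape"),
    ("creator", "Unknown artist"),
    ("type", "Image"),
    ("identifier", "http://example.org/objects/123"),
])
print(ET.tostring(record, encoding="unicode"))
```

A richer format such as CDWA Lite would be offered alongside this record under its own metadata prefix, not in place of it.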
Service providers






Harvest and store metadata
Generally provide search/browse access to this
metadata
Can be general or domain-specific
Can choose to collect metadata in formats other
than DC
Can provide value-added services
Sometimes re-expose metadata to other
aggregations
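The "harvest and store" step can be sketched as parsing one ListRecords response. This is a minimal Python sketch, not a full harvester: the sample response is hand-written and abbreviated (a real response also carries responseDate and request elements), and the function name and identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, abbreviated response for illustration; not real data.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:123</identifier>
        <datestamp>2007-03-20</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Untitled landscape</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [t.text for t in rec.iterfind(".//dc:title", NS)],
        })
    # A missing or empty token means there is nothing more to fetch.
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, token or None


records, token = parse_list_records(SAMPLE_RESPONSE)
print(records, token)
```

A harvester would repeat the request with the returned resumptionToken until no token comes back, storing the records as it goes.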
Typical service provider behavior

Today




Collect and normalize metadata
Provide basic discovery
Send user back to home institution for more
information and/or access to content
Future



Metadata enrichment
Resource licensing
…
Why share metadata?

Benefits to users



One-stop searching
Aggregation of subject-specific resources
Benefits to institutions



Increased exposure for collections
Broader user base
Bringing together of distributed collections
Don’t expect users will know about your
collection and remember to visit it.
Why share metadata with OAI-PMH?





“Low barrier” protocol
Shares metadata only, not content,
simplifying rights issues
Same effort on your part to share with one or
a hundred service providers (basically)
Wide adoption in the cultural heritage sector
Quickly eclipsed methods such as Z39.50
Sharing can be hard

Some initiatives have fizzled out:
CIMI
AMICO
Some are still going:
ARTstor
RLG Cultural Materials
CAMIO and other AMICO derivatives
Art museums have been most active
Customizing for each individual aggregator isn’t
sustainable
Sharing is easier with OAI-PMH


Framework for sharing with multiple
aggregators
Museum-centric OAI initiatives are emerging



CDWA Lite from the Getty
RLG Museum Collections Sharing Working Group
Museums are beginning to explore more
open sharing models
Challenges to OAI-PMH adoption for
museums



Protocol implicitly assumes you want
metadata to be harvestable by anyone
DC a poor match for describing most
museum materials
Museums often want to share content as well
as metadata (with select partners)
One solution? Start a specialized service
provider in the community.
Some service providers




OAIster
National Science Digital Library
Sheet Music Consortium
Open Language Archives Community
“Shareable” metadata




Promotes search interoperability - “the ability
to perform a search over diverse sets of
metadata records and obtain meaningful
results” (Priscilla Caplan)
Is human understandable outside of its local
context
Is useful outside of its local context
Preferably is machine processable
Models for sharing with OAI-PMH
[Diagram: models for getting metadata to an OAI harvester. Labeled components: digital asset management system, metadata creation system, metadata creation module, transformation, XML file, OAI data provider module, stand-alone OAI data provider, Static Repository Gateway, OAI harvester. Labeled formats: DC, QDC, MODS, CDWA Lite.]
Basic metadata sharing workflow








Create metadata, thinking about shareability
Determine format(s) you wish to share your
metadata in
Transform records into versions appropriate for
sharing via OAI
Validate transformed metadata
Load transformed metadata into OAI data provider
Test with OAI Repository Explorer
Communicate with service providers
See what your metadata looks like once a service
provider harvests it
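The "validate transformed metadata" step can be partly automated. The sketch below is a lightweight sanity check in Python, assuming records are transformed to simple DC; it is not full XML Schema validation, and the function name (check_simple_dc) and sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# The fifteen elements of simple Dublin Core.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def check_simple_dc(xml_text):
    """Return a list of problems found in one simple DC record."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    for child in root:
        if not child.tag.startswith("{" + DC_NS + "}"):
            problems.append(f"element outside the DC namespace: {child.tag}")
            continue
        name = child.tag.split("}", 1)[1]
        if name not in DC_ELEMENTS:
            problems.append(f"unknown DC element: {name}")
        elif not (child.text or "").strip():
            problems.append(f"empty DC element: {name}")
    return problems


WRAPPER = ('<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
           'xmlns:dc="http://purl.org/dc/elements/1.1/">{}</oai_dc:dc>')
# Hypothetical records, for illustration only.
GOOD = WRAPPER.format("<dc:title>Untitled landscape</dc:title>")
BAD = WRAPPER.format("<dc:artist>Unknown</dc:artist><dc:title> </dc:title>")

print(check_simple_dc(GOOD))
print(check_simple_dc(BAD))
```

A check like this catches mechanical errors only; whether the metadata makes sense outside its local context still needs a human look at what a service provider displays.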
Expanding the scope


“Over time, however, the work of OAI has
expanded to promote broad access to digital
resources for eScholarship, eLearning, and
eScience”
Some experiments with sharing content




CIC Metadata Portal
Fedora Asset Actions
MPEG-21 DIDL over OAI-PMH
OAI-ORE
CIC Metadata Portal




Research project to build an OAI-based
aggregator for a consortium of academic
libraries in the Midwest
Created a version of qualified DC to indicate
the location of a thumbnail image
Integrated harvested thumbnails into search
interface
Procedure documented in January 2006 D-Lib Magazine article
Asset Actions







Grew out of need for “actionable URLs”
XML schema designed for facilitating the sharing
and manipulation of digital objects
Define core functions for digital objects of all types,
e.g., “get preview”
Begun to define further functions for specific content
types
Proof-of-concept implementation created for DLF
Aquifer project
Can be shared via OAI-PMH as a supplemental
metadata format
Documented in October 2006 D-Lib Magazine article
MPEG-21 DIDL over OAI-PMH





Repository architecture at Los Alamos National
Laboratory uses MPEG-21 DIDL for complex digital
objects
OAI-PMH repositories integral parts of the internal
repository architecture
These internal OAI-PMH repositories do not support
DC metadata – “Because mapping a DID that
represents a complex digital object to simple DC is
quite an impossible task, support of DC by these
OAI-PMH repositories is rather meaningless.”
Described in 2004 JCDL paper
Laid the groundwork for OAI-ORE
OAI-ORE





Open Archives Initiative Object Re-Use and
Exchange
Two-year Mellon-funded project beginning October
2006
Will develop specifications that allow distributed
repositories to exchange information about their
constituent digital objects
Goal is to facilitate “a new digitally-based scholarly
communication framework”
Imagine how research could be transformed if users
had seamless access to information in any
repository, anywhere, and the tools to use it
Goals for initial OAI-ORE project




Formation of an international advisory committee,
consisting of leaders in e-Science, institutional
repositories, publishing, library, and educational
technology communities.
Formation of an international working group that will
meet over the two-year period and develop the set
of ORE specifications.
Establishment and management of an experimental
deployment community that will exercise the
developed standards in a variety of contexts.
Establishment of a sustainable community to
support the widespread deployment and
management of the standards fabric.
OAI-ORE potential for cultural
materials





Original focus is on scholarship, and sharing
text and datasets
No inherent limitations to these uses
Facilitates, but doesn’t require, exchange of
actual content
Focus on complex objects better suited to
cultural materials than OAI-PMH model
It’s still early, but inherent flexibility of model
looks promising for cultural materials
For more information


[email protected]
These presentation slides
<http://www.dlib.indiana.edu/~jenlrile/presentations/getty2007/oai.ppt>

OAI home page <http://www.openarchives.org>


OAI-PMH <http://www.openarchives.org/pmh/>
OAI-ORE <http://www.openarchives.org/ore/>