OAIster != Google - University of Michigan Library
OAIster != Google
Kat Hagedorn
University of Michigan Libraries
October 26, 2007
Outline
Buzzwords
Brief history/overview of OAI
Why OAIster was created
OAIster: digital union catalog
Integration
Is Google next-gen?
Buzzwords
Next-gen, um, anything
Lib2.0/Web2.0
Z39.50/SRU, OAI, RSS…
Bottom line: users access material
where they typically find things
Expectations
OAI was developed to make it easier
(not exhaustive) to create a place “where
they typically find things”
And to find things they typically can’t find
elsewhere
OAIster was designed to be the place
What is OAI?
OAI stands for Open Archives Initiative
“…develops and promotes interoperability
standards that aim to facilitate the efficient
dissemination of content.”
Probably should have been called SAI:
Shared Archives Initiative
Includes a Protocol for Metadata Harvesting
(PMH), i.e., what we use to fill OAIster
Consists of data providers and service
providers
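A minimal sketch of what a service provider does with the protocol: build a ListRecords request, then read records and the resumptionToken out of the response. The base URL, the sample response, and the function names are illustrative assumptions, not OAIster's actual harvester; the verb, arguments, and XML namespaces come from OAI-PMH 2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a data provider."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        header = record.find(OAI_NS + "header")
        records.append({
            "oai_id": header.findtext(OAI_NS + "identifier"),
            "titles": [e.text for e in record.iter(DC_NS + "title")],
            "links": [e.text for e in record.iter(DC_NS + "identifier")],
        })
    token = root.findtext(".//" + OAI_NS + "resumptionToken")
    # An empty or missing token means the list is complete.
    return records, (token or None)


# Hypothetical response from a hypothetical repository at example.org.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
          <dc:identifier>http://example.org/objects/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    records, token = parse_list_records(SAMPLE)
    print(records)
    if token:
        print(list_records_url("http://example.org/oai", resumption_token=token))
```

A real harvester would fetch each URL over HTTP and loop until no resumptionToken comes back; the network step is left out here so the sketch runs offline.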
Metadata records
Data providers use protocol to share their
metadata records
Service providers harvest the metadata
so they can provide a service using them
Metadata needs to be
XML 1.0 compliant
UTF-8 enabled
Sufficient for discovery
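The three requirements above can be sketched as a record check. "Sufficient for discovery" is a judgment call; this sketch assumes it means a non-empty Dublin Core title and identifier, which is an assumption for illustration and not OAIster's actual rule. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"


def check_record(raw_bytes):
    """Return a list of problems found in one metadata record (empty if none)."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return ["not valid UTF-8"]
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return ["not well-formed XML"]
    problems = []
    # Assumed discovery minimum: a non-blank dc:title and dc:identifier.
    for field in ("title", "identifier"):
        values = [e.text for e in root.iter(DC_NS + field)
                  if e.text and e.text.strip()]
        if not values:
            problems.append("missing dc:" + field)
    return problems


# Hypothetical records for illustration.
GOOD = (b'<dc xmlns:dc="http://purl.org/dc/elements/1.1/">'
        b'<dc:title>Sample record</dc:title>'
        b'<dc:identifier>http://example.org/objects/1</dc:identifier></dc>')
NO_TITLE = (b'<dc xmlns:dc="http://purl.org/dc/elements/1.1/">'
            b'<dc:identifier>http://example.org/objects/2</dc:identifier></dc>')

if __name__ == "__main__":
    for raw in (GOOD, NO_TITLE, b"<dc>", b"\xff\xfe"):
        print(check_record(raw))
```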
OAI: what it is not
OAI ≠ open access
“…defining and promoting machine interfaces that facilitate
the availability of content from a variety of providers.
Openness does not mean ‘free’ or ‘unlimited’ access to the
information repositories that conform to the OAI-PMH.”
However, a large majority of OAIster
records are available to all and sundry
Perfect opportunity: freely sharing free
stuff
Why OAIster?
Initially, wanted to build the Academic HotBot
(now we would say the Academic Google)
Essentially, a union catalog of digital objects that
are not easily roboted or spidered
Currently, OAIster has more records that link to
“objects” (13+ million) than there are records
in our OPAC
What does OAIster contain?
Pre-prints, post-prints, published articles, grey
literature, scanned images, archival videos…
Harvest everything available
except obvious test repositories
Keep nearly everything
must have a valid digital object link
must have decent metadata
must be scholarly or informational
http://memory.loc.gov/mbrs/varsmp/0526.mpg
Library of Congress Digitized Historical Collections
http://name.umdl.umich.edu/ADM0370.0002.001
University of Michigan Digital Collections
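The "keep nearly everything" rules can be sketched as a filter over harvested records. Here a record is assumed to be a dict with "titles" and "links" lists, and "decent metadata" is reduced to "has a non-blank title"; both are assumptions for illustration. The link check is syntactic only and does not confirm that the URL resolves.

```python
from urllib.parse import urlparse


def has_object_link(record):
    """True if any link looks like an http(s) URL with a host."""
    for value in record.get("links", []):
        parts = urlparse(value or "")
        if parts.scheme in ("http", "https") and parts.netloc:
            return True
    return False


def has_decent_metadata(record):
    """Assumed stand-in for 'decent': at least one non-blank title."""
    return any(t and t.strip() for t in record.get("titles", []))


def keep(record):
    # "Scholarly or informational" is an editorial judgment, so it is
    # deliberately not modeled here.
    return has_object_link(record) and has_decent_metadata(record)


if __name__ == "__main__":
    harvested = [
        # Link taken from the slide; the titles are placeholders.
        {"titles": ["Placeholder title"],
         "links": ["http://memory.loc.gov/mbrs/varsmp/0526.mpg"]},
        # No resolvable object link, only an OAI identifier.
        {"titles": ["Placeholder title"], "links": ["oai:example.org:1"]},
        # Link present, but no usable title.
        {"titles": [" "],
         "links": ["http://name.umdl.umich.edu/ADM0370.0002.001"]},
    ]
    print([keep(r) for r in harvested])
```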
Why do (should) people use it?
It’s big: will pass 14 million records shortly
It’s varied: besides articles, photos, and
videos, it contains datasets, audio files,
finding aids, manuscripts…
It keeps growing: as long as they keep
paying my salary
Integration: to date
SRU Level 0
keyword access in federated search engines
connector for ExLibris MetaLib
anything else that uses SRU
Yahoo and Google
included in search indexes…
…but poorly at present, since the metadata is not used
OpenURL, currently a hack
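A sketch of the kind of keyword request a federated search engine could send over SRU. The base URL and function name are hypothetical; the parameter names (operation, version, query, startRecord, maximumRecords) are those of SRU 1.1, and the query is a bare quoted term, the simplest form of CQL.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, keywords, maximum_records=10, start_record=1):
    """Build an SRU searchRetrieve URL for a plain keyword query."""
    # Quote the term so multi-word input stays one CQL term;
    # embedded double quotes are backslash-escaped.
    cql = '"' + keywords.replace('"', '\\"') + '"'
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # example.org is a placeholder, not OAIster's SRU endpoint.
    print(sru_search_url("http://example.org/sru", "grey literature"))
```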
Integration: future
RSS
subject or specific search link from results page
alerts on new repositories
Sakaibrary / Blackboard
Facebook
searching app…but useful?
Zotero (Refworks / Endnote)
APIs to…?
What is the purpose of integration?
Google as example…
Google can’t get at everything until it starts
using OAI itself
Grey literature and other scholarly
materials in the index
Push metadata so it ranks high in the index
Google != OAIster
Does it matter if Google is next-gen?
It’s more like only-gen
Should we all either
conform to the Google look-and-feel?
or insinuate ourselves everywhere?
even though integration is a catch-up game
Questions?
Kat Hagedorn
University of Michigan Libraries
Digital Library Production Service
www.oaister.org