OAIster != Google - University of Michigan Library

OAIster != Google
Kat Hagedorn
University of Michigan Libraries
October 26, 2007
Outline
Buzzwords
Brief history/overview of OAI
Why OAIster was created
OAIster: digital union catalog
Integration
Is Google next-gen?
Buzzwords
Next-gen, um, anything
Lib2.0/Web2.0
Z39.50/SRU, OAI, RSS…
Bottom line: user accesses material where they typically find things
Expectations
OAI was developed to make it easier (not exhaustive) to create a place “where they typically find things”
And to find things they typically can’t find elsewhere
OAIster was designed to be the place
What is OAI?
• OAI stands for Open Archives Initiative
  “…develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.”
• Probably should have been called SAI: Shared Archives Initiative
• Includes a Protocol for Metadata Harvesting (PMH), i.e., what we use to fill OAIster
• Consists of data providers and service providers
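A minimal sketch of the service-provider side of harvesting, assuming a data provider that exposes unqualified Dublin Core (`oai_dc`) over OAI-PMH 2.0. The base URL `http://example.org/oai`, the function names, and the sample response are illustrative assumptions, not OAIster's actual harvester. The sketch only builds a `ListRecords` request and parses one response page; fetching over the network is left out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a data provider."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) for one ListRecords response page."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI_NS + "record"):
        identifier = rec.findtext(OAI_NS + "header/" + OAI_NS + "identifier")
        titles = [t.text for t in rec.iter(DC_NS + "title")]
        links = [i.text for i in rec.iter(DC_NS + "identifier")]
        records.append({"id": identifier, "titles": titles, "links": links})
    token = root.findtext(".//" + OAI_NS + "resumptionToken")
    # An empty or missing token means there are no further pages.
    return records, (token or None)


# Hypothetical one-record response page, trimmed to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample title</dc:title>
          <dc:identifier>http://example.org/object/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, next_token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=next_token)
```

A real harvester would loop: request a page, store the records, and follow the resumption token until none is returned.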
Metadata records
Data providers use the protocol to share their metadata records
Service providers harvest the metadata so they can provide a service using it
Metadata needs to be
  ◦ XML 1.1 compliant
  ◦ UTF-8 encoded
  ◦ Sufficient for discovery
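The three requirements can be sketched as a pre-flight check on a single record. This is an illustration under stated assumptions: the function name is ours, the parser check covers well-formedness only (not XML version or schema validity), and "sufficient for discovery" is interpreted here, arbitrarily, as having a non-empty Dublin Core title and identifier.

```python
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"


def metadata_problems(raw_bytes):
    """Return a list of problems found in one raw metadata record."""
    # 1. The bytes must decode as UTF-8.
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return ["not valid UTF-8"]

    # 2. The record must at least be well-formed XML.
    try:
        root = ET.fromstring(raw_bytes)
    except ET.ParseError:
        return ["not well-formed XML"]

    # 3. ASSUMPTION: a title and an identifier are the minimum for discovery.
    def has_text(tag):
        return any((el.text or "").strip() for el in root.iter(DC_NS + tag))

    problems = []
    if not has_text("title"):
        problems.append("no dc:title")
    if not has_text("identifier"):
        problems.append("no dc:identifier")
    return problems


good = (b'<dc xmlns:dc="http://purl.org/dc/elements/1.1/">'
        b'<dc:title>Caf\xc3\xa9 menus</dc:title>'
        b'<dc:identifier>http://example.org/1</dc:identifier></dc>')
latin1 = b'<dc><t>Caf\xe9</t></dc>'          # Latin-1 byte, not UTF-8
broken = b'<dc><title></dc>'                 # unclosed element
thin = b'<dc xmlns:dc="http://purl.org/dc/elements/1.1/"><dc:title> </dc:title></dc>'
```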
OAI: what it is not
OAI ≠ open access
  ◦ “…defining and promoting machine interfaces that facilitate the availability of content from a variety of providers. Openness does not mean ‘free’ or ‘unlimited’ access to the information repositories that conform to the OAI-PMH.”
However, a large majority of OAIster records are available to all and sundry
Perfect opportunity: freely sharing free stuff
Why OAIster?
• Initially, wanted to build the Academic HotBot (now we would say the Academic Google)
• Essentially, a union catalog of digital objects that are not easily roboted or spidered
• Currently, have more records that link to “objects” than there are records in our OPAC: 13+ million
What does OAIster contain?
• Pre-prints, post-prints, published articles, grey literature, scanned images, archival videos…
• Harvest everything available
  ◦ except obvious test repositories
• Keep nearly everything
  ◦ must have a valid digital object link
  ◦ must have decent metadata
  ◦ must be scholarly or informational
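The "valid digital object link" rule can be sketched as a filter over harvested records. This is only an illustration: the record ids and function names are hypothetical, "valid" is reduced to "looks like an http(s) URL" (no check that the link resolves), and the judgment calls about decent metadata and scholarly content are not modeled.

```python
from urllib.parse import urlparse


def has_object_link(record):
    """True if any link in the record is syntactically an http(s) URL."""
    for link in record.get("links", []):
        parts = urlparse(link or "")
        if parts.scheme in ("http", "https") and parts.netloc:
            return True
    return False


def keep_linked(records):
    """Keep only records that point at a digital object."""
    return [r for r in records if has_object_link(r)]


# Hypothetical harvested records; the first two URLs are the examples
# shown on this slide.
harvested = [
    {"id": "loc-video", "links": ["http://memory.loc.gov/mbrs/varsmp/0526.mpg"]},
    {"id": "umich-text", "links": ["http://name.umdl.umich.edu/ADM0370.0002.001"]},
    {"id": "no-link", "links": ["oai:example.org:42"]},
    {"id": "empty", "links": []},
]
kept_ids = [r["id"] for r in keep_linked(harvested)]
```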
http://memory.loc.gov/mbrs/varsmp/0526.mpg
Library of Congress Digitized Historical Collections
http://name.umdl.umich.edu/ADM0370.0002.001
University of Michigan Digital Collections
Why do (should) people use it?
It’s big: will pass 14 million records shortly
It’s varied: besides articles, photos, and videos, it contains datasets, audio files, finding aids, manuscripts…
It keeps growing: as long as they keep paying my salary
Integration: to date
• SRU Level 0
  ◦ keyword access in federated search engines
  ◦ connector for Ex Libris MetaLib
  ◦ anything else that uses SRU
• Yahoo and Google
  ◦ included in search indexes…
  ◦ …poorly currently, without use of metadata
• OpenURL, currently a hack
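For the SRU item above, keyword access amounts to a `searchRetrieve` request carrying a term-only CQL query. A minimal sketch of building such a request follows; the endpoint `http://example.org/sru`, the function name, and the choice of SRU version 1.1 are assumptions for illustration, not OAIster's actual endpoint or configuration.

```python
from urllib.parse import urlencode


def sru_keyword_url(base_url, keyword, start_record=1, max_records=10):
    """Build an SRU searchRetrieve URL for a simple keyword query."""
    # A CQL term containing whitespace must be quoted; embedded double
    # quotes are escaped with a backslash.
    if any(ch.isspace() for ch in keyword):
        keyword = '"' + keyword.replace('"', '\\"') + '"'
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",          # ASSUMPTION: server speaks SRU 1.1
        "query": keyword,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


single_term = sru_keyword_url("http://example.org/sru", "maps")
phrase = sru_keyword_url("http://example.org/sru", "grey literature")
```

A federated search engine or MetaLib-style connector would issue a URL like this and parse the XML response; paging is done by advancing `startRecord`.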
Integration: future
• RSS
  ◦ subject or specific search link from results page
  ◦ alerts on new repositories
• Sakaibrary / Blackboard
• Facebook
  ◦ searching app…but useful?
• Zotero (Refworks / Endnote)
• APIs to…?
What purpose, integration?
Google as example…
Can’t get at everything until it starts using OAI itself
Grey literature and other scholarly materials in the index
Push metadata so it ranks high in the index
Google != OAIster
Does it matter if Google is next-gen?
It’s more like only-gen
Should we all either
  ◦ conform to Google look-and-feel?
  ◦ or insinuate ourselves everywhere?
  ◦ even though integration is a catch-up game
Questions?
Kat Hagedorn
University of Michigan Libraries
Digital Library Production Service
www.oaister.org