Open Archives Initiative Primer Thomas Krichel Palmer School of Library and Information Science Long Island University With apologies to Carl Lagoze DC2001 – Tokyo, October.


Open Archives Initiative Primer
Thomas Krichel
Palmer School of Library and Information Science
Long Island University
With apologies to Carl Lagoze
DC2001 – Tokyo, October 25, 2001
Where I come from...
• Trained economist
• Early (1991) visionary of free online scholarship
• Creator of NetEc in 1993
• Principal founder of RePEc in 1997
– Largest distributed academic DL in the world
– Collection that is open for
• Contribution
• Usage
– Grown to over 100 archives, over 10 partly interoperable user services
Metadata collection process
• Free online scholarship requires academic self-documentation
• Metadata is expensive to collect
• Building a free metadata collection is difficult
• no established business model
• no established funding channels
• Only a collaborative effort will succeed.
The example of eprint servers
• attractive building block for the transformation of
scholarly communication
• but isolated efforts do not make for a scholarly
communication system
• need to federate archives
• need to interoperate with other scholarly
communication components
Example: e-print accessibility
[Figure: a collection of separate e-prints]
metadata harvesting
[Figure: metadata (Author, Title, Abstract, Identifier) harvested from the e-prints]
other examples
• within the area of scholarly communication
• already implemented in RePEc
• Sharing of log data between service providers
• Provision of non-document data for document data providers
• personal data
• institutional data
core concepts in OAI 1.1
• low-barrier interoperability
• data-provider / service-provider model
• metadata harvesting model
OAI 1.1 protocol
• HTTP based
• Reply
– XML Schema
– Self contained
• shared metadata format: Dublin Core
• parallel metadata formats: community specific
harvester / repository
[Figure: harvester and repository exchange support data and harvesting data over the OAI protocol; the repository holds the items]
OAI protocol requests
service provider: harvester
Supporting protocol requests:
• Identify
• ListMetadataFormats
• ListSets
Harvesting protocol requests:
• ListRecords
• ListIdentifiers
• GetRecord
data provider: repository
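The two groups of requests above can be written down as a small lookup. This is only a sketch: the verb names come from the slide, while the constant names and the `classify_verb` helper are hypothetical and not part of the protocol.

```python
# Verb names are taken from the slide; the grouping helper is hypothetical.
SUPPORTING_VERBS = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING_VERBS = ("ListRecords", "ListIdentifiers", "GetRecord")


def classify_verb(verb: str) -> str:
    """Return 'supporting' or 'harvesting' for a known verb, else raise ValueError."""
    if verb in SUPPORTING_VERBS:
        return "supporting"
    if verb in HARVESTING_VERBS:
        return "harvesting"
    raise ValueError(f"unknown OAI verb: {verb!r}")
```

A harvester would typically issue the supporting requests first, to learn what a repository offers, and the harvesting requests afterwards.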
HTTP encoding - requests
BASE-URL -----------> an.oa.org/OAI-script
keyword arguments --> verb=ListIdentifiers&set=S1
GET
http://an.oa.org/OAI-script?verb=ListIdentifiers&set=S1
POST
POST http://an.oa.org/OAI-script HTTP/1.0
Content-Length: 27
Content-Type: application/x-www-form-urlencoded

verb=ListIdentifiers&set=S1
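A minimal sketch of both encodings, using only the Python standard library. The base URL and the arguments are the examples from the slide; the function names `encode_get` and `encode_post` are hypothetical. Nothing is sent over the network here, the functions only build the pieces of the request.

```python
from urllib.parse import urlencode

BASE_URL = "http://an.oa.org/OAI-script"  # example base URL from the slide


def encode_get(base_url, **arguments):
    """GET: keyword arguments are appended to the base URL as a query string."""
    return f"{base_url}?{urlencode(arguments)}"


def encode_post(base_url, **arguments):
    """POST: keyword arguments travel in a form-encoded request body."""
    body = urlencode(arguments)
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Content-Length is the byte length of the body
        "Content-Length": str(len(body.encode("ascii"))),
    }
    return base_url, headers, body


url = encode_get(BASE_URL, verb="ListIdentifiers", set="S1")
target, headers, body = encode_post(BASE_URL, verb="ListIdentifiers", set="S1")
```

Either form carries the same keyword arguments; only their position in the HTTP request differs.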
HTTP encoding - responses
<?xml version="1.0" encoding="UTF-8"?>
<GetRecord
  xmlns="http://oai.namespace.uri"
  xmlns:xsi="http://w3.namespace.uri"
  xsi:schemaLocation="http://oai.namespace.uri
                      http://oai.schemaURL">
  <responseDate>2000-19-01T19:30:30-04:00</responseDate>
  <requestURL>http://an.oa.org/OAI-script?verb=GetRecord&amp;identifier=oai%3AarXiv%3A0001&amp;metadataPrefix=oai_dc</requestURL>
  <record>
    record contents
  </record>
  additional records
</GetRecord>
[Callouts on the slide: xml namespaces (the xmlns attributes), response header (responseDate and requestURL), response data (the records)]
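A sketch of how a harvester might split such a reply into response header and response data, using `xml.etree.ElementTree` from the Python standard library. The namespace URI is the placeholder from the slide, not a real OAI namespace; the `responseDate` value in the sample and the function name `split_response` are made up for illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://oai.namespace.uri"  # placeholder namespace from the slide

# Sample reply modelled on the slide; the responseDate value is illustrative.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<GetRecord xmlns="http://oai.namespace.uri">
  <responseDate>2001-10-25T19:30:30-04:00</responseDate>
  <requestURL>http://an.oa.org/OAI-script?verb=GetRecord&amp;identifier=oai%3AarXiv%3A0001&amp;metadataPrefix=oai_dc</requestURL>
  <record>record contents</record>
</GetRecord>"""


def split_response(xml_bytes):
    """Return (header dict, list of record elements) for one reply."""
    root = ET.fromstring(xml_bytes)
    ns = {"oai": OAI_NS}
    header = {
        # The root element is named after the verb that was requested.
        "verb": root.tag.split("}", 1)[-1],
        "responseDate": root.findtext("oai:responseDate", namespaces=ns),
        "requestURL": root.findtext("oai:requestURL", namespaces=ns),
    }
    records = root.findall("oai:record", ns)
    return header, records


header, records = split_response(SAMPLE)
```

Because the reply repeats the request URL, each response can be interpreted on its own, which is what "self contained" refers to.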
record
<record>
  <header>
    <identifier>oai:eg:001</identifier>
    <datestamp>1999-01-01</datestamp>
  </header>
  <metadata>
    <dc xmlns="http://purl.org/dc">
      <title>My Example</title>
    </dc>
  </metadata>
  <about>
    <ea xmlns="http://www.arXiv.org/ea">
      <usage>No restrictions</usage>
    </ea>
  </about>
</record>
[Callouts on the slide: protocol support (header), format-specific metadata (metadata), community-specific record data (about)]
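The three parts of a record can be read back with a few lines of Python. This is a sketch against the example record on the slide, with its element names and namespace URIs copied as shown there; the function name `read_record` and the prefix names in `NAMESPACES` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The example record from the slide, as bytes.
RECORD = b"""<record>
  <header>
    <identifier>oai:eg:001</identifier>
    <datestamp>1999-01-01</datestamp>
  </header>
  <metadata>
    <dc xmlns="http://purl.org/dc">
      <title>My Example</title>
    </dc>
  </metadata>
  <about>
    <ea xmlns="http://www.arXiv.org/ea">
      <usage>No restrictions</usage>
    </ea>
  </about>
</record>"""

# Prefixes are local to this sketch; the URIs are those on the slide.
NAMESPACES = {"dc": "http://purl.org/dc", "ea": "http://www.arXiv.org/ea"}


def read_record(xml_bytes):
    """Pull one value out of each part: header, metadata, about."""
    record = ET.fromstring(xml_bytes)
    return {
        "identifier": record.findtext("header/identifier"),
        "datestamp": record.findtext("header/datestamp"),
        "title": record.findtext("metadata/dc:dc/dc:title", namespaces=NAMESPACES),
        "usage": record.findtext("about/ea:ea/ea:usage", namespaces=NAMESPACES),
    }


fields = read_record(RECORD)
```

The header is the only part a harvester needs in order to run the protocol; the metadata and about parts depend on the format and the community.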
selective harvesting - datestamps
[Figure: harvest only the records in the repository whose datestamps fall within a date range]
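A sketch of a date-restricted harvesting request. The slide does not name the arguments; this example assumes the `from` and `until` arguments of the OAI protocol together with the `metadataPrefix` argument seen in the GetRecord example. The function name `list_records_url` and the chosen dates are illustrative.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, until_date=None):
    """Build a ListRecords URL, optionally limited to a datestamp range."""
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # 'from' and 'until' are assumed argument names (not shown on the slide).
    if from_date is not None:
        arguments["from"] = from_date.isoformat()
    if until_date is not None:
        arguments["until"] = until_date.isoformat()
    return f"{base_url}?{urlencode(arguments)}"


url_1999 = list_records_url(
    "http://an.oa.org/OAI-script", "oai_dc", date(1999, 1, 1), date(1999, 12, 31)
)
```

A harvester that remembers the date of its last visit can pass it as the lower bound and so fetch only records that changed since then.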
selective harvesting - sets
[Figure: harvest only the records in the repository that belong to a given set; the repository is partitioned into sets S1 and S2]
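Seen from the repository side, harvesting within a set amounts to a membership filter. The sketch below is a toy model, not repository code: the `Record` class, the identifiers `oai:eg:002` and `oai:eg:003`, and the assumption that one record may sit in both S1 and S2 are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    identifier: str
    sets: frozenset = field(default_factory=frozenset)


def harvest_within_set(records, set_spec):
    """Return the records that belong to the requested set, in original order."""
    return [record for record in records if set_spec in record.sets]


# Hypothetical repository contents; S1 and S2 are the set names from the slide.
repository = [
    Record("oai:eg:001", frozenset({"S1"})),
    Record("oai:eg:002", frozenset({"S2"})),
    Record("oai:eg:003", frozenset({"S1", "S2"})),
]

in_s1 = [record.identifier for record in harvest_within_set(repository, "S1")]
```

On the wire, the same selection is expressed by the `set=S1` keyword argument shown in the request examples.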
Communication re OAI
• lists: subscribe via http://www.openarchives.org
• oai-general list
• oai-implementers list
• web: http://www.openarchives.org
• FAQ: http://www.openarchives.org/faq.htm
• mail: [email protected]
revision of specifications
• Specifications currently frozen for 12-18 months:
• stable for experimentation; not definitive
• minimize risk for early adopters
• maximize chances for future interoperability
across communities
The technical committee are working on the
“definitive” specifications
The technical committee
- Herbert Van de Sompel (British Library)
- Carl Lagoze (Cornell U)
- Thomas Krichel (Long Island U & RePEc)
- Jeff Young (OCLC)
- Tim Cole (U of Illinois at Urbana Champaign)
- Hussein Suleman (Virginia Tech)
- Simeon Warner (LANL & arXiv)
- Michael Nelson (NASA & NACA)
- Caroline Arms (Library of Congress)
- Muhammad Zubair (Old Dominion U & ARC)
- Steven Bird (U Penn & Open Language Archive Community)
- Robert Tansley (MIT & DSpace)
- Andy Powell, UK (UKOLN)
- Mogens Sandfaer, Denmark (DTV)
- Thomas Severiens, Germany
- Thomas Baron, Switzerland (CERN)
- Les Carr, UK (U of Southampton)
- Thomas Place, Netherlands (Tilburg U)
Current activities
The technical committee is currently working through a list of technical
issues related to the protocol
A new specification is due to be drafted in 2002-02
Alpha testing is due to start in 2002-04
The new specification will be released shortly after that.
Thank you for your attention!
Thomas Krichel
Palmer School of Library and Information Science
720 Northern Boulevard
Brookville NY 11548-1300
USA
http://openlib.org/home/krichel