Open Archives Initiative Primer Thomas Krichel Palmer School of Library and Information Science Long Island University With apologies to Carl Lagoze DC2001 – Tokyo, October.
Open Archives Initiative Primer

Thomas Krichel
Palmer School of Library and Information Science
Long Island University

With apologies to Carl Lagoze

DC2001 – Tokyo, October 25, 2001

Where I come from...
• Trained economist
• Early (1991) visionary of free online scholarship
• Creator of NetEc in 1993
• Principal founder of RePEc in 1997
  – Largest distributed academic DL in the world
  – Collection that is open for
    • Contribution
    • Usage
  – Grown to over 100 archives, over 10 partly interoperable user services

Metadata collection process
• Free online scholarship requires academic self-documentation
• Metadata is expensive to collect
• Building a free metadata collection is difficult
  • no established business model
  • no established funding channels
• Only a collaborative effort will succeed.

The example of eprint servers
• attractive building block for the transformation of scholarly communication
• but isolated efforts do not make for a scholarly communication system
  • need to federate archives
  • need to interoperate with other scholarly communication components

Example: e-print accessibility
[Diagram: separate e-prints]

Example: e-print accessibility
[Diagram: metadata harvesting across e-prints; harvested e-print metadata: Author, Title, Abstract, Identifier]

other examples
• within the area of scholarly communication
• already implemented in RePEc
• Sharing of log data between service providers
• Provision of non-document data for document data providers
  • personal data
  • institutional data

core concepts in OAI 1.1
• low-barrier interoperability
• data-provider / service-provider model
• metadata harvesting model

OAI 1.1 protocol
HTTP based
Reply
• XML Schema
• Self contained
• shared metadata format: Dublin Core
• parallel metadata formats: community specific

harvester / repository
[Diagram: harvester and repository connected by the oai protocol; labels: support data, harvesting data]
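A minimal sketch of what "HTTP based" means for a harvester: a request is the repository's base URL followed by URL-encoded keyword arguments. The Python below is illustrative only and not part of the OAI specification. The function name `build_request_url` is hypothetical; the base URL and argument values are the example values used on the request and response slides.

```python
from urllib.parse import urlencode


def build_request_url(base_url, verb, **arguments):
    """Return a GET URL: base URL, '?', then URL-encoded keyword arguments.

    Hypothetical helper for illustration; the verb always comes first,
    followed by any further keyword arguments in the order given.
    """
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Example values taken from the slides (an.oa.org is a placeholder host).
BASE_URL = "http://an.oa.org/OAI-script"

list_url = build_request_url(BASE_URL, "ListIdentifiers", set="S1")
record_url = build_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:arXiv:0001",  # ':' is percent-encoded as %3A
    metadataPrefix="oai_dc",
)

print(list_url)
print(record_url)
```

The same encoded argument string can be sent as the body of a POST request with Content-Type application/x-www-form-urlencoded instead of being appended to the URL.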
OAI protocol requests
[Diagram: service provider (harvester) sends requests to data provider (repository)]
Supporting protocol requests:
• Identify
• ListMetadataFormats
• ListSets
Harvesting protocol requests:
• ListRecords
• ListIdentifiers
• GetRecord

HTTP encoding - requests
BASE-URL -----------> an.oa.org/OAI-script
keyword arguments --> verb=ListIdentifiers&set=S1

GET
    http://an.oa.org/OAI-script?verb=ListIdentifiers&set=S1

POST
    POST http://an.oa.org/OAI-script HTTP/1.0
    Content-Length: 27
    Content-Type: application/x-www-form-urlencoded

    verb=ListIdentifiers&set=S1

HTTP encoding - responses

    <?xml version="1.0" encoding="UTF-8" ?>
    <GetRecord xmlns="http://oai.namespace.uri"
               xmlns:xsi="http://w3.namespace.uri"
               xsi:schemaLocation="http://oai.namespace.uri http://oai.schemaURL">
      <responseDate>2000-01-19T19:30:30-04:00</responseDate>
      <requestURL>http://an.oa.org/OAI-script?verb=GetRecord&amp;identifier=oai%3AarXiv%3A0001&amp;metadataPrefix=oai_dc</requestURL>
      <record> record contents </record>
      additional records
    </GetRecord>

Callouts on the slide: xml namespaces, response header, response data

record

    <record>
      <header>
        <identifier>oai:eg:001</identifier>
        <datestamp>1999-01-01</datestamp>
      </header>
      <metadata>
        <dc xmlns="http://purl.org/dc">
          <title>My Example</title>
        </dc>
      </metadata>
      <about>
        <ea xmlns="http://www.arXiv.org/ea">
          <usage>No restrictions</usage>
        </ea>
      </about>
    </record>

Callouts on the slide: protocol support, format-specific metadata, community-specific record data

selective harvesting - datestamps
• harvest within date range
[Diagram: records in a repository]

selective harvesting - sets
• harvest within set
[Diagram: records in a repository, sets S1 and S2]

Communication re OAI
• lists: subscribe via http://www.openarchives.org
  • oai-general list
  • oai-implementers list
• web: http://www.openarchives.org
• FAQ: http://www.openarchives.org/faq.htm
• mail: [email protected]

revision of specifications
• Specifications currently frozen for 12-18 months:
  • stable for experimentation; not definitive
  • minimize risk for early adopters
  • maximize chances for future interoperability across communities
• The technical committee is working on the “definitive” specifications

The technical committee
- Herbert Van de Sompel (British Library)
- Carl Lagoze (Cornell U)
- Thomas Krichel (Long Island U & RePEc)
- Jeff Young (OCLC)
- Tim Cole (U of Illinois at Urbana Champaign)
- Hussein Suleman (Virginia Tech)
- Simeon Warner (LANL & arXiv)
- Michael Nelson (NASA & NACA)
- Caroline Arms (Library of Congress)
- Muhammad Zubair (Old Dominion U & ARC)
- Steven Bird (U Penn & Open Language Archive Community)
- Robert Tansley (MIT & DSpace)
- Andy Powell, UK (UKOLN)
- Mogens Sandfaer, Denmark (DTV)
- Thomas Severiens, Germany
- Thomas Baron, Switzerland (CERN)
- Les Carr, UK (U of Southampton)
- Thomas Place, Netherlands (Tilburg U)

Current activities
• The committee is currently working on a list of technical issues related to the protocol
• A new specification is due to be drafted in 2002-02
• Alpha testing will start in 2002-04
• The new specification will be released shortly after that.

Thank you for your attention!

Thomas Krichel
Palmer School of Library and Information Science
720 Northern Boulevard
Brookville NY 11548-1300 USA
http://openlib.org/home/krichel