OAI Protocol Implementation on Indonesia Digital Library

Extending The OAI Protocol
as the Data Integration Framework
for the Digital Library Network
in the Third World
Ismail Fahmi, Ismail Khalil Ibrahim,
Donny Fauzan, Rurie Muharto
Knowledge Management Research Group
ITB
IndonesiaDLN
Outline
Introduction
Background & Motivation
Basic Concept of OAI Protocol
Extending OAI Protocol for
IndonesiaDLN
Introduction
Digital Library
A digital library is a vast collection of
entities stored and maintained by multiple
information sources including databases,
image banks, file systems, email systems,
the Web, and applications providing
structured or semi-structured data.
Network of Digital Libraries
Why a network?
• Physically distributed information sources
• Heterogeneous storage
• Autonomous content and data formats
Goal
Provide users with a uniform interface to access, relate, and
combine data stored in multiple, distributed, autonomous, and
possibly heterogeneous information sources.
Background & Motivation
Challenges
• Interoperability
• Unreliable, low-speed, and high-cost Internet connections
• Most knowledge and information sources do not have a dedicated
Internet connection (IndonesiaDLN partners are split 50:50
between dedicated and dial-up connections)
• A centralized data provider solution results in slow responses,
high costs, and disappointment due to unreliable connections.
Data Integration Architecture
Virtual approach:
• Large number of information sources
• Rapidly changing data
• Unpredictable client needs
• Queries over vast amounts of data from a very large number of
information sources
Data Integration Architecture
Materialization approach:
• Predictable portions of the available information required
• High-performance queries required
• Access to private copies
• Requirement to save information that is not maintained by the source
Data Integration Architecture
The approach best suited to a Digital Library Network in the third
world is the Materialization / Harvesting Approach, because of:
• The need for fast query responses
• Low-quality Internet connections
• Limited availability of dedicated Internet connections at
knowledge sources
OAI : Basic Concept
Open Archives Initiative
http://www.openarchives.org
OAI Objectives
The Open Archives Initiative has been
set up to create a forum to discuss and
solve matters of interoperability
between preprint solutions, as a way
to promote their global acceptance.
Paul Ginsparg, Rick Luce & Herbert Van de Sompel
OAI Implementers
• arXiv: physics, mathematics, non-linear systems and computer science (Los Alamos)
• clinmed: Clinical Medicine and Health Research Netprints
• CogPrints: U. Southampton
• CSTC: Computer Science Teaching Center, digital library of peer-reviewed teaching resources for computer science educators
• ETD: Virginia Tech
• HeinOnline: Law journals from Cornell U
• HUBerlin: Humboldt University at Berlin/Germany Document Server
• NACA: NACA Technical Report Server, scanned reports of the National Advisory Committee for Aeronautics (1917-1958), the predecessor organisation to NASA
• NCSTRL: Networked Computer Science Technical Reference Library
• NDLTD: Networked Digital Library of Theses and Dissertations
• OLA: Open Language Archives, U Pennsylvania
• OVP: Open Video Project, U North Carolina
• WCR: Web Characterisation Repository, database of meta-information relating to trace files, tools and publications that are relevant to characterisation of the World Wide Web
• T&D Worldcat: OCLC
OAI Definitions & Concept
A repository is a network-accessible server to which OAI protocol
requests can be submitted.
A record is an XML-encoded byte stream that a repository returns in
response to an OAI protocol request for metadata from an item in that
repository. OAI records are organized into header, metadata, and about.
The header is required for the harvesting process and consists of two
parts: the unique identifier, which is the key for extracting metadata
from an item in a repository, and the datestamp of creation, deletion,
or last modification.
The metadata is a single manifestation of the metadata from an item.
The OAI protocol supports multiple metadata formats.
About is an optional container that holds data about the metadata of
the record, such as rights information and terms and conditions for
usage [17].
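The record structure described above can be modelled as a small data type. This is only an illustrative sketch: the class and field names (`Header`, `Record`, `metadata`, `about`) are assumptions chosen for this example, not names defined by the protocol or by the GDL software.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Header:
    identifier: str  # unique key of the item within the repository
    datestamp: str   # date of creation, deletion, or last modification


@dataclass
class Record:
    header: Header
    # One metadata format per record, e.g. a dict of Dublin Core fields.
    metadata: dict = field(default_factory=dict)
    # Optional data about the metadata, e.g. a rights statement.
    about: Optional[str] = None


# Example values taken from the GetRecord response shown later in the slides.
record = Record(
    header=Header("oai:gdlhub.indonesiaDLN.org:agriknow-2001-36", "2001-05-28"),
    metadata={"creator": "Ismail Fahmi"},
)
```

Keeping `about` optional mirrors the definition: a record is valid without it.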
OAI Definitions & Concept
• Data Providers administer systems that support the
OAI protocol as a means of exposing metadata about
the content in their systems.
• Service Providers issue OAI protocol requests to
the systems of data providers and use the returned
metadata as a basis for building value-added
services.
OAI Data Flow
• A local data provider harvests metadata from remote data
providers and then serves requests from service providers.
• A service provider acts as the interface for users (searching,
browsing, etc.).
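The data flow above can be sketched with an in-memory stand-in for a local data provider. Everything here is hypothetical (the class `LocalDataProvider`, its methods, and the tuple layout); it only illustrates the idea of keeping a local copy that is refreshed by harvesting and then served to service providers.

```python
class LocalDataProvider:
    """In-memory stand-in for a data provider that harvests from others."""

    def __init__(self):
        self.records = {}  # identifier -> (datestamp, metadata)

    def harvest(self, remote_records):
        """Merge (identifier, datestamp, metadata) tuples, keeping the newest copy.

        Returns the number of records that were added or updated.
        """
        updated = 0
        for identifier, datestamp, metadata in remote_records:
            current = self.records.get(identifier)
            # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
            if current is None or datestamp > current[0]:
                self.records[identifier] = (datestamp, metadata)
                updated += 1
        return updated

    def list_identifiers(self, from_date=None):
        """Serve identifiers to a service provider, optionally from a given date."""
        return sorted(
            identifier
            for identifier, (datestamp, _) in self.records.items()
            if from_date is None or datestamp >= from_date
        )
```

Because queries are answered from the local copy, a slow or offline remote provider delays only the next harvest, not the user's search.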
OAI Protocol Specification
• Uses HTTP as the transport protocol
• Encodes requests as HTTP URLs
• Encodes responses in XML
OAI Service Requests
• GetRecord returns the metadata of a single item, selected by its
identifier and metadata format.
• Identify is a request for information about the repository as a
whole. It returns information such as the name of the
repository, the version of the protocol, and the email address
of the administrator.
• ListIdentifiers lists the identifiers of all objects, or of those within
a given date range and/or within a given set.
• ListMetadataFormats returns the list of all metadata
formats supported by the archive.
• ListRecords lists complete metadata for all objects, or for those
within a given date range and/or within a given set.
• ListSets lists the sets (and subsets, recursively) contained
within the repository.
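Since each request is just a verb plus arguments encoded in a URL, building one takes a few lines. The sketch below is illustrative: `build_request_url` is a hypothetical helper, and the verb set is the one from OAI-PMH 2.0, including GetRecord, which this deck uses in a later example. The base URL is the one shown in the slides.

```python
from urllib.parse import urlencode

# Verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "GetRecord",
    "Identify",
    "ListIdentifiers",
    "ListMetadataFormats",
    "ListRecords",
    "ListSets",
}


def build_request_url(base_url, verb, **arguments):
    """Return the request URL for an OAI verb and its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # 'from' is a Python keyword, so callers pass it as 'from_'.
        params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


url = build_request_url(
    "http://gdlhub.indonesiadln.org/OAI/response.php",
    "ListIdentifiers",
    metadataPrefix="oai_dc",
)
```

Rejecting unknown verbs on the client side avoids a round trip over a slow connection just to receive an error response.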
OAI-PMH
(Protocol For Metadata Harvesting)
[Diagram: a Service Provider, a Local Network, and three Data Providers]
OAI-PMH
(Protocol For Metadata Harvesting)
Service Provider (digilib.itb.ac.id) to Data Provider (gdlhub.indonesiadln.org):
http://gdlhub.indonesiadln.org/OAI/response.php?verb=ListIdentifiers&metadataPrefix=oai_dc
Response:
<?xml version="1.0" encoding="utf-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
  http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <request verb="ListIdentifiers"
    metadataPrefix="oai_dc">http://localhost</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-607</identifier>
      <datestamp>2002-07-02</datestamp>
    </header>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-603</identifier>
      <datestamp>2002-03-10</datestamp>
    </header>
    . . .
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-1998-521</identifier>
      <datestamp>1998-01-12</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>
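A harvester has to pull the identifier and datestamp out of each header in a response like the one above. The following is a minimal sketch using Python's standard library; the function name `parse_list_identifiers` and the shortened `SAMPLE` document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_identifiers(xml_text):
    """Return (identifier, datestamp) pairs from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    headers = root.findall("oai:ListIdentifiers/oai:header", OAI_NS)
    return [
        (
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        )
        for header in headers
    ]


# Shortened version of the response shown in the slides.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-607</identifier>
      <datestamp>2002-07-02</datestamp>
    </header>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-603</identifier>
      <datestamp>2002-03-10</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

pairs = parse_list_identifiers(SAMPLE)
```

The namespace mapping matters: every element in the response lives in the OAI-PMH 2.0 namespace, so an unqualified search for `header` would find nothing.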
OAI-PMH
(Protocol For Metadata Harvesting)
Request (Data Provider to Data Provider):
http://digilib.itb.ac.id/OAI/response.php?verb=GetRecord&identifier=oai:gdlhub.indonesiaDLN.org:agriknow-2001-36&metadataPrefix=oai_dc
Response:
<?xml version="1.0" encoding="utf-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
  http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <request verb="GetRecord" identifier="oai:gdlhub.indonesiaDLN.org:agriknow-2001-36"
    metadataPrefix="oai_dc">http://localhost</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2001-36</identifier>
        <datestamp>2001-05-28</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
          http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>Outcome Reports of National Networked Digital Library,
          Indonesia</dc:title>
          <dc:creator>Ismail Fahmi</dc:creator>
          <dc:description>A project undertaken by Computer Network Research Group (CNRG) and
          Knowledge Management Research Group (KMRG) ITB, funded by IDRC Canada under its Pan Asia
          Networking R&D Grants Program. Web site: http://idln.itb.ac.id. <p><p><p>Abstract<p>This
          ...
          forward the revolution in electronic scholarly publishing so that it may lead to
          universities playing a more active and cost-effective role in the production,
          organization, preservation and dissemination of knowledge</dc:description>
          <dc:publisher>JBPEISMAIL</dc:publisher>
          <dc:date>2001-05-28</dc:date>
          <dc:type>res</dc:type>
          <dc:identifier>jbpeismail-gdl-res-2001-ismail-1-idrc</dc:identifier>
          <dc:language>English</dc:language>
          <dc:rights>Copyright © 2000 by Ismail Fahmi. Verbatim copying and distribution of this
          entire article is permitted by author in any medium, provided this notice is
          preserved.</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
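The Dublin Core fields of a GetRecord response can be collected into a dictionary. This is a sketch, not the GDL implementation: `parse_dublin_core` is a hypothetical helper and `SAMPLE` is a shortened, well-formed excerpt of the response above.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_dublin_core(xml_text):
    """Return {field name: [values]} for the oai_dc metadata of one record."""
    root = ET.fromstring(xml_text)
    dc = root.find("oai:GetRecord/oai:record/oai:metadata/oai_dc:dc", NS)
    fields = {}
    if dc is None:
        return fields
    for element in dc:
        # Tags look like '{namespace}title'; keep only the local name.
        name = element.tag.rsplit("}", 1)[-1]
        # Dublin Core elements may repeat, so collect values in a list.
        fields.setdefault(name, []).append((element.text or "").strip())
    return fields


SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2001-36</identifier>
      <datestamp>2001-05-28</datestamp>
    </header>
    <metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:creator>Ismail Fahmi</dc:creator>
        <dc:date>2001-05-28</dc:date>
        <dc:language>English</dc:language>
      </oai_dc:dc>
    </metadata>
  </record></GetRecord>
</OAI-PMH>"""

fields = parse_dublin_core(SAMPLE)
```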
Extending OAI Protocol for
IndonesiaDLN
IndonesiaDLN : The Network
o ACPTUNSYIAH
o SUPTIAIN
o SBPTIAIN
o SAPTUNSRAT
o RIPTIAIN
o KBPTUNTAN
o JKIKCMC
o JKLPNDPDII
o KSPTIAIN
o JKPKBPPK
o SSPTIAIN
o SGPTUNHALU
o JKPKELNUSA
o LAPTIAIN
o JKPKFORLINK
o YOPTIAIN
o SNPTIAIN
o JKPKKIH
o JKPKLEMHANNAS
o JKPNPNRI
o JKPTBINUS
o JKPTBIS
o JBKMRGGREY
o JBPTITBAR
o JKPTIAIN
o JIPTIAIN
o JBPTITBBI
o JKPTPERBANAS
o JBPKBATAN
o JBPTITBPSUD
o JIPTSTIKOMSBY
o JBPKINSTY
o JKPTYARSI
o JIPTUBAYA
o JBPTITBTI
o JBPKPBA
o JKUNINDFS
o JIPTUMM
o JBPKPERSIS
o JBPTSTIEKES
o JKUNUAJ
o JBPKSALMAN
o JBPTUNPADLP
o JIIJKLIB
o JBPTUPBJJUTB
o JBPTIAIN
o JIIYPIA
o JBPTUPI
o IJPTUNCEN
IndonesiaDLN Stats
Partners
• Registered: 86
• Active: 35
Connection
• Dedicated: 43
• Temporary: 43
OAI Extension
Background idea:
• Many resource providers are connected to the
Internet only temporarily, e.g. through a dial-up
connection, and some sit behind a proxy
• Indonesia's network connections pose barriers,
e.g. limited Internet bandwidth capacity
OAI Extension
Solution:
• A data-uploading function resolves these
problems: a provider with connection
problems can put its data into a Data Provider
• That Data Provider can then be
harvested by others through OAI-PMH
Interoperability through
Metadata Harvesting and Posting
Mechanism
Harvesting Mechanism
– metadata harvesting with the OAI v2.0 protocol
Posting Mechanism
– for non-dedicated/temporary servers;
both data and metadata are involved.
– has been implemented in the GDL
software environment.
OAI Extension:
Framework Definitions and Concepts
• The Protocol for Metadata Posting (PMP)
involves two participants:
– Data Provider
• Receives uploaded metadata to build its own
value-added services
– Service Provider
• Puts metadata into the data provider
Protocol Requests and
Responses:
Six verbs can be used as requests:
• Connect opens a connection to the hub data provider; the request
carries the PUBLISHER_ID, its serial number, and
epoch_time as arguments.
• Disconnect closes the connection to the current hub data
provider.
• PutRecord puts a record into the repository.
• PutListRecords submits the list of records to be
uploaded; the lists are compared so that only the newer
metadata is uploaded.
• PutFileFragment puts a fragment of a file on the server.
• MergeFileFragments merges the uploaded fragments of a
file on the server.
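The comparison step behind PutListRecords can be sketched as follows. The slides do not say how records are compared, so this example assumes a comparison by datestamp; `select_records_to_upload` and its dictionary arguments are hypothetical.

```python
def select_records_to_upload(local, remote):
    """Return identifiers whose local copy is missing from or newer than the hub.

    Both arguments map identifier -> ISO datestamp (YYYY-MM-DD).
    Assumption: "newer" is decided by comparing datestamps.
    """
    return sorted(
        identifier
        for identifier, datestamp in local.items()
        # ISO dates compare correctly as plain strings.
        if identifier not in remote or datestamp > remote[identifier]
    )


local = {"a": "2002-07-02", "b": "2002-03-10", "c": "1998-01-12"}
remote = {"a": "2002-01-01", "b": "2002-03-10"}
pending = select_records_to_upload(local, remote)
```

Uploading only the difference keeps a dial-up session short, which is the point of the extension.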
Protocol for Metadata Posting
• First, the service provider issues Connect to the Hub Data
Provider.
• If authentication succeeds, the data provider gives the service
provider an id, and the remaining verbs can then be used. If
authentication fails, the remaining verbs cannot be
used.
• The service provider then puts a single metadata record with
PutRecord, or several records with PutListRecords. In both
cases the response reports whether the upload
succeeded.
• The service provider can also put a file with PutFileFragment,
which uploads the file to the Data Provider in pieces. The pieces are
then merged with a MergeFileFragments request.
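The fragment-and-merge idea can be illustrated locally, without any network code. This is a sketch under stated assumptions: the function names, the fixed fragment size, and the index-keyed dictionary on the receiving side are all hypothetical, since the slides do not define the fragment format.

```python
def split_into_fragments(data: bytes, fragment_size: int):
    """Cut a file's bytes into consecutive fragments of at most fragment_size."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [
        data[start:start + fragment_size]
        for start in range(0, len(data), fragment_size)
    ]


def merge_fragments(fragments):
    """Rebuild the file from {index: bytes}, whatever order the pieces arrived in."""
    return b"".join(fragments[index] for index in sorted(fragments))


data = b"0123456789"
pieces = split_into_fragments(data, 4)

# Simulate fragments arriving out of order on the receiving side.
received = {2: pieces[2], 0: pieces[0], 1: pieces[1]}
rebuilt = merge_fragments(received)
```

Small fragments mean that a dropped dial-up connection costs one fragment rather than the whole file.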
OAI-PMP
(Protocol For Metadata Posting)
Data Provider (dial-up client) to Hub/Central Data Provider (gdlhub.indonesiadln.org):
http://gdlhub.indonesiadln.org/OAI/OAI-PMP-script.php?verb=Connect&providerId=JBPTITBPP&providerSerialNumber=28H3oIZETdASw&epochTime=1031366328
Response:
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMP xmlns="http://www.indonesiadln.org/OAI/1.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.indonesiadln.org/OAI/1.0/
  http://www.indonesiadln.org/OAI/1.0/OAI-PMP.xsd">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request verb="Connect"
    providerId="JBPTITBPP"
    providerSerialNumber="28H3oIZETdASw"
    epochTime="1031366328">http://agri/OAI-PMPscript.php</request>
  <Connect>
    <sessionId>ca612fe33acc768d4aa2f5940238c8ae</sessionId>
  </Connect>
</OAI-PMP>
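A client can build the Connect request and read the session id from the response in a few lines. This is only a sketch: `build_connect_url` and `parse_session_id` are hypothetical helpers, and `SAMPLE` is a shortened, well-formed version of the response above with matching open and close tags.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

PMP_NS = {"pmp": "http://www.indonesiadln.org/OAI/1.0/"}


def build_connect_url(base_url, provider_id, serial_number, epoch_time):
    """Return the URL of a Connect request with the arguments shown in the slides."""
    query = urlencode({
        "verb": "Connect",
        "providerId": provider_id,
        "providerSerialNumber": serial_number,
        "epochTime": epoch_time,
    })
    return f"{base_url}?{query}"


def parse_session_id(xml_text):
    """Return the session id from a Connect response, or None if absent."""
    root = ET.fromstring(xml_text)
    return root.findtext("pmp:Connect/pmp:sessionId", namespaces=PMP_NS)


url = build_connect_url(
    "http://gdlhub.indonesiadln.org/OAI/OAI-PMP-script.php",
    "JBPTITBPP",
    "28H3oIZETdASw",
    1031366328,
)

SAMPLE = """<OAI-PMP xmlns="http://www.indonesiadln.org/OAI/1.0/">
  <Connect>
    <sessionId>ca612fe33acc768d4aa2f5940238c8ae</sessionId>
  </Connect>
</OAI-PMP>"""

session_id = parse_session_id(SAMPLE)
```

A missing `sessionId` yields `None`, which a client could treat as a failed authentication before attempting any further verb.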
Conclusion
• IndonesiaDLN has a standard
interoperability framework based
on both Metadata Harvesting and
Metadata Uploading
Final Remarks
• It is hoped that this protocol model can be
implemented in third-world countries
such as Indonesia.
Credits
Ismail Fahmi
Ismail Khalil Ibrahim
Donny Fauzan
Rurie Muharto
Thank You