OAI Protocol Implementation on Indonesia Digital Library
Extending The OAI Protocol
as the Data Integration Framework
for the Digital Library Network
in the Third World
Ismail Fahmi, Ismail Khalil Ibrahim,
Donny Fauzan, Rurie Muharto
Knowledge Management Research Group
ITB
IndonesiaDLN
Outline
Introduction
Background & Motivation
Basic Concept of OAI Protocol
Extending OAI Protocol for
IndonesiaDLN
Introduction
Digital Library
A digital library is a vast collection of
entities stored and maintained by multiple
information sources including databases,
image banks, file systems, email systems,
the Web, and applications providing
structured or semi-structured data.
Network of Digital Libraries
Why a Network?
Physically distributed information sources
Heterogeneous storage
Autonomous content and data format
Goal
provide users with a uniform interface to access, relate, and
combine data stored in multiple, distributed, autonomous, and
possibly heterogeneous information sources.
Background & Motivation
Challenges
Interoperability
Unreliable, low-speed, and high-cost Internet connections
Most knowledge and information sources do not have a dedicated
Internet connection (IndonesiaDLN partners are split 50:50
between dedicated and dial-up connections)
A centralized data provider solution results in slow responses,
high costs, and disappointment due to unreliable connections.
Data Integration Architecture
Virtual approach:
Large number of information sources
Rapid change of data
Unpredictable client needs
Queries to vast amount of data
from very large number of
information sources
Data Integration Architecture
Materialization approach:
Predictable portions of the
available information are required
High-performance queries are required
Access to private copies
Requirement to save information
that is not maintained by the source
Data Integration Architecture
The approach that suits a Digital
Library Network in the third world is the
Materialization / Harvesting Approach,
because of:
The need for fast query responses
Low-quality Internet connections
Limited availability of dedicated Internet
connections at knowledge sources
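The materialization idea can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory `LocalStore` and record dictionaries whose field names (`identifier`, `datestamp`, `title`) and sample values are illustrative only and not taken from IndonesiaDLN software: metadata is copied while a connection is available, and later queries are answered from the local copy without using the network.

```python
from dataclasses import dataclass, field


@dataclass
class LocalStore:
    """Hypothetical materialized copy of harvested metadata."""
    records: dict = field(default_factory=dict)

    def harvest(self, remote_records):
        """Copy records from a remote source, keeping the newest copy per identifier."""
        for rec in remote_records:
            known = self.records.get(rec["identifier"])
            # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
            if known is None or rec["datestamp"] > known["datestamp"]:
                self.records[rec["identifier"]] = rec

    def search(self, word):
        """Answer a query from the local copy only (no network access)."""
        word = word.lower()
        return sorted(
            ident for ident, rec in self.records.items()
            if word in rec["title"].lower()
        )


store = LocalStore()
store.harvest([
    {"identifier": "oai:example:1", "datestamp": "2002-03-10", "title": "Rice farming"},
    {"identifier": "oai:example:2", "datestamp": "2002-07-02", "title": "Digital library"},
])
hits = store.search("library")
```

Because the query runs against the private copy, its response time does not depend on the quality of the link to the original source.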
OAI : Basic Concept
Open Archives Initiative
http://www.openarchives.org
OAI Objectives
The Open Archives Initiative has been
set up to create a forum to discuss and
solve matters of interoperability
between preprint solutions, as a way
to promote their global acceptance.
Paul Ginsparg, Rick Luce & Herbert Van de Sompel
OAI Implementers
arXiv: physics, mathematics, non-linear systems and computer science (Los Alamos)
clinmed: Clinical Medicine and Health Research Netprints
CogPrints: U. Southampton
CSTC: Computer Science Teaching Center, digital library of peer-reviewed teaching resources for computer science educators
ETD: Virginia Tech
HeinOnline: law journals from Cornell U.
HUBerlin: Humboldt University at Berlin/Germany Document Server
NACA: NACA Technical Report Server, scanned reports of the National Advisory Committee for Aeronautics (1917-1958), the predecessor organisation to NASA
NCSTRL: Networked Computer Science Technical Reference Library
NDLTD: Networked Digital Library of Theses and Dissertations
OLA: Open Language Archives, U. Pennsylvania
OVP: Open Video Project, U. North Carolina
WCR: Web Characterisation Repository, database of meta-information relating to trace files, tools and publications that are relevant to characterisation of the World Wide Web
T&D Worldcat: OCLC
OAI Definitions & Concept
A repository is a network-accessible server to which OAI protocol
requests can be submitted.
A record is an XML-encoded byte stream that a repository returns
in response to an OAI protocol request for metadata from
an item in that repository. OAI records are organized into header,
metadata, and about.
The header is necessary for the harvesting process and consists of two
parts: the unique identifier, which is the key for extracting metadata from an item
in a repository, and the datestamp of creation, deletion, or last
modification.
Metadata is a single manifestation of the metadata from an item. The
OAI protocol supports multiple metadata formats.
About is an optional container that holds data about the metadata of the
record, such as rights information and terms and conditions for usage [17].
OAI Definitions & Concept
Data Providers administer systems that support the
OAI protocol as a means of exposing metadata about
the content in their systems
Service Providers issue OAI protocol requests to
the systems of data providers and use the returned
metadata as a basis for building value-added
services.
OAI Data Flow
The local data provider harvests metadata from remote data
providers, and then serves requests from the service provider
The service provider acts as the interface for the users (searching,
browsing, etc.)
OAI Protocol Specification
Uses HTTP as the transport protocol
Encodes requests as HTTP URLs
Encodes response data as XML
OAI Service Requests
Identify is a request for information about the repository as a
whole. It returns information such as the name of the
repository, the version of the protocol, and the email address
of the administrator.
ListIdentifiers lists the identifiers of all objects, or of those within a given
date range and/or a given set.
ListMetadataFormats returns the list of all metadata
formats supported by the archive.
ListRecords lists the complete metadata of all objects, or of those within a
given date range and/or a given set.
ListSets lists the sets (and subsets, recursively) contained
within the repository.
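Since each request is an HTTP URL carrying a verb and its arguments, a request can be assembled with a few lines of Python. This is a minimal sketch, not part of any IndonesiaDLN code: the helper name `build_request` is hypothetical, the base URL is the one shown in this deck, and GetRecord (the OAI-PMH verb for fetching a single record) is included alongside the verbs listed above.

```python
from urllib.parse import urlencode

# Verbs defined by OAI-PMH; GetRecord fetches one record by identifier.
OAI_VERBS = {
    "Identify", "ListIdentifiers", "ListMetadataFormats",
    "ListRecords", "ListSets", "GetRecord",
}


def build_request(base_url, verb, **arguments):
    """Return the request URL for one OAI-PMH verb (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb comes first, followed by its arguments in the order given.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


BASE_URL = "http://gdlhub.indonesiadln.org/OAI/response.php"
list_url = build_request(BASE_URL, "ListIdentifiers", metadataPrefix="oai_dc")
identify_url = build_request(BASE_URL, "Identify")
```

A date range is passed with the `from` and `until` arguments. Because `from` is a Python keyword, it has to be supplied through a dictionary, for example `build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc", **{"from": "2002-01-01"})`.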
OAI-PMH
(Protocol For Metadata Harvesting)
[Diagram: Service Provider, Data Providers, Local Network]
OAI-PMH
(Protocol For Metadata Harvesting)
Service Provider (digilib.itb.ac.id) to Data Provider (gdlhub.indonesiadln.org):
http://gdlhub.indonesiadln.org/OAI/response.php?verb=ListIdentifiers&metadataPrefix=oai_dc
<?xml version="1.0" encoding="utf-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
    http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <request verb="ListIdentifiers"
      metadataPrefix="oai_dc">http://localhost</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-607</identifier>
      <datestamp>2002-07-02</datestamp>
    </header>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-603</identifier>
      <datestamp>2002-03-10</datestamp>
    </header>
    . . .
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-1998-521</identifier>
      <datestamp>1998-01-12</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>
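A harvester reading a ListIdentifiers response like the one above only needs the identifier and datestamp of each header. The following minimal sketch uses Python's standard `xml.etree.ElementTree`; the function name `list_headers` is hypothetical, and the sample string is a shortened copy of the response shown here (XML declaration and elided headers left out).

```python
import xml.etree.ElementTree as ET

# Every OAI-PMH 2.0 element lives in this namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="ListIdentifiers" metadataPrefix="oai_dc">http://localhost</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-607</identifier>
      <datestamp>2002-07-02</datestamp>
    </header>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-603</identifier>
      <datestamp>2002-03-10</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def list_headers(xml_text):
    """Return (identifier, datestamp) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [
        (
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        )
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


headers = list_headers(SAMPLE_RESPONSE)
```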
OAI-PMH
(Protocol For Metadata Harvesting)
http://digilib.itb.ac.id/OAI/response.php?verb=GetRecord&identifier=oai:gdlhub.indonesiaDLN.org:agriknow-2001-36&metadataPrefix=oai_dc
<?xml version="1.0" encoding="utf-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
    http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <request verb="GetRecord" identifier="oai:gdlhub.indonesiaDLN.org:agriknow-2001-36"
      metadataPrefix="oai_dc">http://localhost</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2001-36</identifier>
        <datestamp>2001-05-28</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
            http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>Outcome Reports of National Networked Digital Library, Indonesia</dc:title>
          <dc:creator>Ismail Fahmi</dc:creator>
          <dc:description>A project undertaken by Computer Network Research Group (CNRG) and
Knowledge Management Research Group (KMRG) ITB, funded by IDRC Canada under its Pan Asia
Networking R&D Grants Program. Web site: http://idln.itb.ac.id. <p><p><p>Abstract<p>This
...
forward the revolution in electronic scholarly publishing so that it may lead to
universities playing a more active and cost-effective role in the production,
organization, preservation and dissemination of knowledge</dc:description>
          <dc:publisher>JBPEISMAIL</dc:publisher>
          <dc:date>2001-05-28</dc:date>
          <dc:type>res</dc:type>
          <dc:identifier>jbpeismail-gdl-res-2001-ismail-1-idrc</dc:identifier>
          <dc:language>English</dc:language>
          <dc:rights>Copyright © 2000 by Ismail Fahmi. Verbatim copying and distribution of this
entire article is permitted by author in any medium, provided this notice is
preserved.</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
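The Dublin Core fields of a GetRecord response such as the one above can be collected into a dictionary. This is a minimal sketch with Python's standard `xml.etree.ElementTree`; the function name `dublin_core` is hypothetical, and the sample is a shortened, well-formed stand-in for the record shown here (the long description is left out).

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE_RECORD = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2001-36</identifier>
      <datestamp>2001-05-28</datestamp>
    </header>
    <metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Outcome Reports of National Networked Digital Library, Indonesia</dc:title>
        <dc:creator>Ismail Fahmi</dc:creator>
        <dc:date>2001-05-28</dc:date>
        <dc:language>English</dc:language>
      </oai_dc:dc>
    </metadata>
  </record></GetRecord>
</OAI-PMH>"""


def dublin_core(xml_text):
    """Map each Dublin Core element name to the list of its values."""
    root = ET.fromstring(xml_text)
    container = root.find(".//oai_dc:dc", NAMESPACES)
    if container is None:
        raise ValueError("no oai_dc:dc element in response")
    fields = {}
    for child in container:
        # Tags look like "{http://purl.org/dc/elements/1.1/}title".
        name = child.tag.split("}", 1)[1]
        fields.setdefault(name, []).append((child.text or "").strip())
    return fields


fields = dublin_core(SAMPLE_RECORD)
```

Values are kept in lists because Dublin Core allows any element to repeat within one record.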
Extending OAI Protocol for
IndonesiaDLN
IndonesiaDLN : The Network
o ACPTUNSYIAH
o SUPTIAIN
o SBPTIAIN
o SAPTUNSRAT
o RIPTIAIN
o KBPTUNTAN
o JKIKCMC
o JKLPNDPDII
o KSPTIAIN
o JKPKBPPK
o SSPTIAIN
o SGPTUNHALU
o JKPKELNUSA
o LAPTIAIN
o JKPKFORLINK
o YOPTIAIN
o SNPTIAIN
o JKPKKIH
o JKPKLEMHANNAS
o JKPNPNRI
o JKPTBINUS
o JKPTBIS
o JBKMRGGREY
o JBPTITBAR
o JKPTIAIN
o JIPTIAIN
o JBPTITBBI
o JKPTPERBANAS
o JBPKBATAN
o JBPTITBPSUD
o JIPTSTIKOMSBY
o JBPKINSTY
o JKPTYARSI
o JIPTUBAYA
o JBPTITBTI
o JBPKPBA
o JKUNINDFS
o JIPTUMM
o JBPKPERSIS
o JBPTSTIEKES
o JKUNUAJ
o JBPKSALMAN
o JBPTUNPADLP
o JIIJKLIB
o JBPTUPBJJUTB
o JBPTIAIN
o JIIYPIA
o JBPTUPI
o IJPTUNCEN
IndonesiaDLN Stats
Partners
• Registered: 86
• Active: 35
Connection
• Dedicated: 43
• Temporary: 43
OAI Extension
Background Idea:
• Many resource providers are connected to
the Internet only temporarily,
e.g. over dial-up connections, and some
sit behind a proxy
• Barriers in Indonesia's network
connectivity,
e.g. limited Internet bandwidth capacity
OAI Extension
Solution:
• A data uploading function resolves
these problems: a provider with connection
problems can put its data into a Data Provider
• The Data Provider can then be
harvested by others through OAI-PMH
OAI Extension
Interoperability through
Metadata Harvesting and Posting
Mechanism
Harvesting Mechanism
OAI v2.0 protocol metadata harvesting
Posting Mechanism
– non-dedicated server/temporary server,
both data and metadata involved.
– has been implemented in the GDL
software environment.
OAI Extension:
Framework Definitions and Concepts
• Protocol for Metadata Posting (PMP)
involves two participants:
– Data Provider
• Retrieves uploaded metadata to build its own
value-added services
– Service Provider
• Puts metadata into the data provider
Protocol Requests and
Responses:
There are six verbs that can be used as requests:
Connect opens a session; the request carries keyword arguments
containing the PUBLISHER_ID, its serial number, and
epoch_time.
Disconnect ends the connection to the current hub data
provider.
PutRecord puts a record into the repository.
PutListRecords puts the list of records that need to be
uploaded, comparing them so that only newer metadata
is uploaded.
PutFileFragment puts a fragment of a file onto the server.
MergeFileFragments merges the uploaded file fragments
on the server.
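The slides do not say how PutListRecords compares records. The minimal Python sketch below therefore assumes one plausible rule, labeled here as an assumption: a record is uploaded when the hub has no copy of it, or when the local datestamp is later than the hub's. The function name `select_newer`, the dictionary layout, and the sample datestamps are hypothetical.

```python
def select_newer(local_stamps, hub_stamps):
    """Return identifiers whose local copy is missing from, or newer than, the hub's.

    Both arguments map identifier -> ISO datestamp (YYYY-MM-DD), so plain
    string comparison orders the dates correctly. This rule is an assumption,
    not the documented PMP behaviour.
    """
    to_upload = []
    for identifier, datestamp in local_stamps.items():
        hub_datestamp = hub_stamps.get(identifier)
        if hub_datestamp is None or datestamp > hub_datestamp:
            to_upload.append(identifier)
    return sorted(to_upload)


local = {
    "agriknow-2002-607": "2002-07-02",  # not yet on the hub
    "agriknow-2002-603": "2002-03-10",  # unchanged
    "agriknow-1998-521": "1998-01-12",  # modified after the hub's copy
}
hub = {
    "agriknow-2002-603": "2002-03-10",
    "agriknow-1998-521": "1997-12-01",
}
pending = select_newer(local, hub)
```

Sending only the pending records keeps the upload small, which matters on a dial-up link.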
Protocol for Metadata Posting
First, the service provider issues Connect to the hub data
provider.
The data provider returns an id to the service provider if
authentication succeeds. The remaining verbs can then be
used; if authentication fails, the remaining verbs
cannot be used.
The service provider then puts a single metadata record with
PutRecord, or several metadata records with PutListRecords. In essence,
each request receives a response giving the status of the
upload: success or not.
The service provider can also put a file with PutFileFragment,
in which pieces of the file are uploaded to the data provider and then
merged with a MergeFileFragments request.
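The posting flow above (connect, upload file fragments, merge, disconnect) can be simulated without a network. The following Python sketch is an illustration under stated assumptions only: `FakeHub` and `split_file` are hypothetical names, authentication is reduced to matching a provider id against a serial number, and nothing here reflects the actual GDL implementation.

```python
import secrets


class FakeHub:
    """Hypothetical in-memory stand-in for a hub data provider."""

    def __init__(self, known_providers):
        self.known_providers = known_providers  # provider id -> serial number
        self.sessions = set()
        self.fragments = {}  # filename -> {fragment index: bytes}
        self.files = {}      # filename -> merged bytes

    def connect(self, provider_id, serial_number):
        """Return a session id on success, or None if authentication fails."""
        if self.known_providers.get(provider_id) != serial_number:
            return None
        session_id = secrets.token_hex(16)
        self.sessions.add(session_id)
        return session_id

    def _check(self, session_id):
        # The remaining verbs are only usable after a successful Connect.
        if session_id not in self.sessions:
            raise PermissionError("not connected")

    def put_file_fragment(self, session_id, filename, index, data):
        self._check(session_id)
        self.fragments.setdefault(filename, {})[index] = data

    def merge_file_fragments(self, session_id, filename):
        self._check(session_id)
        parts = self.fragments.pop(filename)
        # Join fragments by index so arrival order does not matter.
        self.files[filename] = b"".join(parts[i] for i in sorted(parts))
        return len(self.files[filename])

    def disconnect(self, session_id):
        self.sessions.discard(session_id)


def split_file(data, size):
    """Cut a byte string into fragments of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


hub = FakeHub({"JBPTITBPP": "28H3oIZETdASw"})
session = hub.connect("JBPTITBPP", "28H3oIZETdASw")
payload = b"example file contents uploaded over a slow link"
for index, chunk in enumerate(split_file(payload, 8)):
    hub.put_file_fragment(session, "report.txt", index, chunk)
merged_size = hub.merge_file_fragments(session, "report.txt")
hub.disconnect(session)
```

Uploading in small fragments means that a dropped dial-up connection costs at most one fragment rather than the whole file.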
OAI-PMP
(Protocol For Metadata Posting)
Data Provider (dial-up client) to Hub/Central Data Provider (gdlhub.indonesiadln.org):
http://gdlhub.indonesiadln.org/OAI/OAI-PMP-script.php?verb=Connect&providerId=JBPTITBPP&providerSerialNumber=28H3oIZETdASw&epochTime=1031366328
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMP xmlns="http://www.indonesiadln.org/OAI/1.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.indonesiadln.org/OAI/1.0/
    http://www.indonesiadln.org/OAI/1.0/OAI-PMP.xsd">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request verb="Connect"
      providerId="JBPTITBPP"
      providerSerialNumber="28H3oIZETdASw"
      epochTime="1031366328">http://agri/OAI-PMPscript.php</request>
  <Connect>
    <sessionId>ca612fe33acc768d4aa2f5940238c8ae</sessionId>
  </Connect>
</OAI-PMP>
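A client could build the Connect request and read the session id from the reply roughly as follows. This minimal Python sketch takes the URL, argument names, and namespace from the example above; the helper names `connect_url` and `session_id` are hypothetical, and the sample reply is a shortened, well-formed stand-in whose root element name is an assumption (the lookup does not depend on it).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

PMP_NS = {"pmp": "http://www.indonesiadln.org/OAI/1.0/"}
HUB_URL = "http://gdlhub.indonesiadln.org/OAI/OAI-PMP-script.php"


def connect_url(provider_id, serial_number, epoch_time):
    """Build the Connect request URL with the arguments shown in the example."""
    query = urlencode({
        "verb": "Connect",
        "providerId": provider_id,
        "providerSerialNumber": serial_number,
        "epochTime": epoch_time,
    })
    return f"{HUB_URL}?{query}"


def session_id(xml_text):
    """Return the sessionId from a Connect response, or None if absent."""
    root = ET.fromstring(xml_text)
    return root.findtext(".//pmp:sessionId", namespaces=PMP_NS)


SAMPLE_REPLY = """<OAI-PMP xmlns="http://www.indonesiadln.org/OAI/1.0/">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <Connect>
    <sessionId>ca612fe33acc768d4aa2f5940238c8ae</sessionId>
  </Connect>
</OAI-PMP>"""

url = connect_url("JBPTITBPP", "28H3oIZETdASw", 1031366328)
sid = session_id(SAMPLE_REPLY)
```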
Conclusion
• IndonesiaDLN has a standard
interoperability framework that is based
on both Metadata Harvesting and
Metadata Uploading
Final Remarks
• It is hoped that this protocol model can be
implemented in third-world countries
such as Indonesia.
Credits
Ismail Fahmi
Ismail Khalil Ibrahim
Donny Fauzan
Rurie Muharto
Thank You