Short DOI presentation

Digital Object Identifier
Norman Paskin, International DOI Foundation
What is DOI?
Digital Object Identifier
• A unique identifier for "a piece of content" on digital networks
• Digital object interoperability
Analogy: the physical bar code
• A unique identifier for "a piece of content" in the physical world
• single, common tool: UPC/EAN Bar Code
• many uses: once assigned, usable by anyone in chain
• wide community support made it work
• self-sustaining cost recovery model etc.
• standard – helps to integrate systems efficiently
The EAN/UPC (bar) code: more than just a quick way to get through the checkout line
[Figure: the bar code travelling along the supply chain: Manufacturer, Distributor, Receiving, Just-in-Time Ordering, Store, Inventory computer, Cash register, Financial systems, Head Office.]
Unique product identifier also used across supply chain for:
- Billing/Payments
- Sales Tracking
- Financial Reporting
- # goods shipped
- sales by store or region
- etc.
Don't we have identifiers already?
• SKUs etc are not "actionable" consistently
– (They can be included in a DOI)
• URL is a Location
– DOI is a Name (URI, URN); things move, may be at multiple places
• Interoperability is KEY
– Horizontal: across media
– Vertical: across supply chain intermediaries
• Need to do more than just "locate"
– (Web) services; rights management
• Need to be independent of platform
– Web, mobile, broadcast, other networks
What is the DOI?
"The DOI is the UPC (Bar Code) for objects of intellectual property on the Internet."
• 1. Uniquely identifies content - therefore enables computers to execute transactions of all kinds
• 2. Provides a stable, persistent link to the content itself or to services
• 3. Can be used to automate services
How?
• Show DOI as combination of components
– use existing standards
• Show examples of services (applications) built on DOI
– Examples here web-based but DOI applies to all platforms
DOI components: Numbering, Description, Action, Policies
Numbering
DOI syntax can include any existing identifier, formal or informal, of any entity
• An identifier "container" e.g.
• 10.1234/5678
• 10.1234/0-7645-4889-1
• 10.1234/978-0-7645-4889-1
• 10.1234/ISBN 0764548891
• 10.1234/Norman_presentation
• 10.1234/2002-10-08-Editeur
• etc
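The "container" idea can be sketched in a few lines of code. This is an illustration only: it assumes the rule visible in the examples above (a prefix beginning with "10.", then "/", then a suffix that may be any string), and the function name split_doi is hypothetical.

```python
def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = doi.partition("/")
    # Assumed rule: the prefix starts with "10." and the suffix is not empty.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


examples = [
    "10.1234/5678",
    "10.1234/978-0-7645-4889-1",
    "10.1234/ISBN 0764548891",
    "10.1234/Norman_presentation",
]
for doi in examples:
    prefix, suffix = split_doi(doi)
    print(f"{prefix:8} | {suffix}")
```

The suffix is carried through untouched, which is what lets an existing identifier such as an ISBN sit inside a DOI.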
Description
<indecs> framework: DOI can describe any form of intellectual property, at any level of granularity
• Metadata
• Kernel metadata
– Normalised using indecsDD
– ISO identifiers adopting same approach?
• Application profile metadata
– Mapped using indecsDD
• Any existing schema (ONIX, SCORM, MARC, GRID…)
• Standard way of accessing
– "Hooks" to OpenURL, UDDI, etc
• XML expressions
Action
Handle resolution allows a DOI to link to any & multiple pieces of current data
• Resolve from DOI to:
– Location (URL) - persistence
– Multiple locations
– Metadata
– Services
– Nested DOIs (related objects etc)
– Extensible: new types
• Standard internet protocol
• Scalable, proven
• In use elsewhere
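A minimal sketch of what resolving one DOI to several typed values could look like. The in-memory dictionary only stands in for a resolution directory; the type names, the example.org URLs and both function names are invented for illustration and are not the Handle System's actual interface.

```python
# Hypothetical stand-in for a resolution directory: one DOI, several typed values.
DIRECTORY = {
    "10.1234/5678": [
        ("url", "http://example.org/content/5678"),
        ("url", "http://mirror.example.org/5678"),
        ("metadata", "http://example.org/meta/5678.xml"),
        ("service", "http://example.org/permissions?id=5678"),
    ],
}


def resolve(doi, wanted_type=None):
    """Return (type, data) pairs for a DOI, optionally filtered by type."""
    values = DIRECTORY.get(doi, [])
    return [(t, d) for t, d in values if wanted_type in (None, t)]


def replace_values(doi, type_, new_data):
    """Assigner-side update: swap all values of one type; the DOI is untouched."""
    kept = [(t, d) for t, d in DIRECTORY[doi] if t != type_]
    DIRECTORY[doi] = kept + [(type_, d) for d in new_data]


print(resolve("10.1234/5678", "url"))   # two locations
replace_values("10.1234/5678", "url", ["http://new.example.org/5678"])
print(resolve("10.1234/5678", "url"))   # same DOI, new location
```

Anyone holding the DOI keeps using the same string after the content moves; only the directory entry changes.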
Policies
DOI policies allow any business model for practical implementations
• Common rules of the road (IDF)
– Governance and agreed scope, policy, rules
• Cost-recovery (self-sustaining)
• Registration agencies (cf ISBN)
• Each can develop own sector rules, business model, services, fees, metadata etc
– DOI at cost
– DOI free
– DOI with other services
– etc
[Figure: the four components shown together, labelled "extensible".]
• DOI syntax can include any existing identifier, formal or informal, of any entity
• DOI policies allow any business model for practical implementations
• <indecs> framework: DOI can describe any form of intellectual property, at any level of granularity
• Handle resolution allows a DOI to link to any & multiple pieces of current data
Persistent identifier
• Resolution provides persistence
• Easily seen in web applications - DOI never changes, but URL does:
Handle resolution allows a DOI to link to any & multiple pieces of current data
[Figure: many URLs (printed identifiers, bookmarks, etc) point directly at the Content.]
[Figure: the Content has moved; the same URLs now return "404 File not found".]
"Linkrot": recent estimates 16% in 6 months
[Figure: the Assigner records the Content's URL in a DOI directory; users hold DOIs, which resolve through the directory to the current URL.]
[Figure: DOIs held across the Internet resolve through DOI directories to the Content.]
More than just "locate"
[Figure: a DOI resolves through the DOI directory to a Response Page offering:]
• purchase content
• view free excerpt
• get related items
• get additional metadata
• request permissions
[Figure: the same Response Page, reached from a DOI held by a Bookstore.]
Metadata efficiency
<indecs> framework: DOI can describe any form of intellectual property, at any level of granularity
• Text objects (ONIX)
• Art objects (CIDOC)
• Learning objects (SCORM)
• Audio objects (GRID)
• Video objects (SMPTE)
• etc
Common single mapping
Adding value: services
• Acrobat plug-in as focus example here (web based)
• Four example demonstrations shown here:
– Version (provide a dynamic update version of the pdf in hand)
– Multiple resolution (retrieve multiple data: a URL and some metadata in this case)
– CrossRef (retrieve a standard set of metadata and use it in an application, a citation builder)
– Rights (very simple e-commerce interface as an illustration)
Adobe plug-in concept: what
[Figure: a PDF document viewed through Acrobat Reader. The Plug-In (with a cache) reads the DOI, e.g. doi:10.123/456, and adds buttons to the Tool Bar, such as "Forward Linking Service" and "AN Other Service".]
• Buttons "pop up" dynamically as services become available
• DOI is not visible - within pdf package (like File/Properties in Word, etc)
Demo 1 – Version
[Figure: the plug-in Tool Bar queries the Handle System over the Internet and receives the Handle Record below.]
Handle Record for DOI cnri.test.jsn/pdf
TYPE            DATA
url             http://host-4-211/book-newversion.pdf
last_modified   2002-06-13T14:06:03-03:00
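The version check in this demo can be sketched as a timestamp comparison. The record copies the two values shown for this demo; the local timestamp and the function name are hypothetical, and a real plug-in would fetch the record from the Handle System rather than from a dictionary.

```python
from datetime import datetime

# Values as shown in the Handle record for this demo.
handle_record = {
    "url": "http://host-4-211/book-newversion.pdf",
    "last_modified": "2002-06-13T14:06:03-03:00",
}


def newer_version_url(record, local_modified):
    """Return the URL of a newer copy, or None if the local copy is current.

    Both timestamps must carry a UTC offset so they can be compared.
    """
    remote = datetime.fromisoformat(record["last_modified"])
    local = datetime.fromisoformat(local_modified)
    return record["url"] if remote > local else None


# Hypothetical timestamp of the PDF the reader already has.
print(newer_version_url(handle_record, "2002-05-01T09:00:00-03:00"))
```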
Demo 2 – MultiRes
[Screenshots: plug-in tool bar with "Related links" retrieved through the DOI.]
Demo 3 – CrossRef
[Screenshots: plug-in tool bar.]
Demo 4 – Rights
[Screenshots: plug-in tool bar with a "Rights button!" callout; XMP label.]
What we have done
Put the DOI data in functional units in the DOI record; and the knowledge of what to do with them in the client
– Demonstrated with an end-user client (Acrobat) but equally applicable to middleware
– No constraints on adding additional functional units to a given DOI
– A common approach: could use same Handle record to manage pdf, html, mobile, etc., hence efficient in deploying content across platforms
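One way to picture "data in the record, knowledge in the client": the client holds a table of handlers keyed by value type and offers a button only for the types it understands. The type names, handler functions and record contents below are all hypothetical.

```python
# Hypothetical handlers: what this particular client knows how to do.
def open_location(data):
    return f"open {data}"


def show_rights(data):
    return f"show rights offer from {data}"


HANDLERS = {"url": open_location, "rights": show_rights}

# Hypothetical DOI record made of functional units (type, data).
record = [
    ("url", "http://example.org/book.pdf"),
    ("rights", "http://example.org/rights/book"),
    ("forward_links", "http://example.org/cited-by/book"),  # unknown to this client
]


def available_buttons(record, handlers):
    """List the types this client can act on; unknown types are skipped."""
    return [type_ for type_, _ in record if type_ in handlers]


def press(button, record, handlers):
    """Run the handler for the first value of the given type."""
    for type_, data in record:
        if type_ == button:
            return handlers[type_](data)
    raise KeyError(button)


print(available_buttons(record, HANDLERS))
print(press("rights", record, HANDLERS))
```

Adding a new functional unit to the record does not break older clients, and a client that later learns a new handler picks the unit up without any change to the record.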
How can we use it?
• Assign DOIs to any content; or use assigned DOIs
• Potentially: identification of other entities as well as content - e.g. "parties" (such as Companies: "Interparty" project and related "indecs" work; use existing numbers?)
• As a key for an e-commerce mechanism e.g. copy protection, tamper-proofing, signatures, micropayments: NOT a part of DOI, but enabled by it (added value services, tools)
• How you use it is defined by you - it's a tool
Who is using it now?
• Several hundred organisations
• Several million DOIs
• Example: CrossRef (www.crossref.org)
• The top 160 publishers of technical articles
• around 3 million DOIs per year since Jan 2001
• Use DOI to maintain links between them ("citations"); allows each to use their own numbering system
• Local copies; versions; links to supplementary material
Registration Agencies (Oct 2002)
• CrossRef
• Learning Objects Inc (USA, DoD)
• Enpia (Korea: government endorsed)
• CDI (Consultancy, E Books, CORBIS, etc)
• CAL (Australian RRO)
• TSO (The Stationery Office)
Others being discussed e.g.
• Multilingual European DOI Registration Agency
• Content ID Forum (Japan)
• Music Industry project
DOI Registration Agencies 2000-2002
• Default RA (IDF/CNRI): Reston, 1997
• CrossRef: non-profit start-up consortium of existing large publishers; Boston, 2000
• Learning Objects Network: for-profit start-up distance education software firm; Vermont, 2001
• Content Directions: for-profit start-up DOI consulting firm; New York, 2001
• Enpia Systems: for-profit existing software firm; Korea, 2001
• The Stationery Office: for-profit existing publisher spun out of HMSO; London, 2002
• Copyright Agency, Ltd.: for-profit existing rights clearinghouse; Australia, 2002
• Multilingual European DOI Registration Agency (MEDRA): non-profit consortium of existing European government publisher organizations; not yet appointed
Who is behind it?
• International DOI Foundation (IDF)
• Open member organisation, launched 1998
• Members: publishing, technology, intermediaries
• Modelled on W3C, and on the Bar code development
• www.doi.org
More information?
• www.doi.org: main source of information
• DOI-News brief monthly mailing list
• DOI Reports to members (extensive)
• DOI Handbook; FAQ
Appendix
• Supplementary material
• DOI Application profiles concept
• Supporting IDF: benefits
• IDF development path
• DOI and internet standards
DOI Application Profiles
Each Profile can be thought of as built from the kernel + extensions:
• DOI AP = compulsory kernel for any DOI + metadata for application
Metadata elements
• An application profile (here AP10) may be defined in terms of another scheme, e.g. ONIX
• Must have mapping for each element, e.g. ONIX "Page" = iid 734 (DOI Term Set)
• DOI users can see metadata as all defined in DOI terms
• The advantage is in additional schemes/mappings (here AP27 alongside AP10)
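The mapping idea can be sketched as a lookup table from scheme-specific element names to DOI Term Set identifiers. Only the ONIX "Page" = iid 734 pair comes from the slides; every other entry, the second scheme name and the function name are invented placeholders.

```python
# (scheme, element) -> DOI Term Set identifier.
TERM_MAP = {
    ("ONIX", "Page"): 734,          # from the slides
    ("ONIX", "Title"): 9001,        # hypothetical
    ("OtherScheme", "pages"): 734,  # hypothetical second scheme, same DOI term
    ("OtherScheme", "name"): 9001,  # hypothetical
}


def to_doi_terms(scheme, record):
    """Re-key a record by DOI term identifier; raises KeyError if an element is unmapped."""
    return {TERM_MAP[(scheme, element)]: value for element, value in record.items()}


onix = to_doi_terms("ONIX", {"Page": "12", "Title": "A Book"})
other = to_doi_terms("OtherScheme", {"pages": "12", "name": "A Book"})
print(onix == other)  # both records now use the same keys
```

Each scheme is mapped once to the common term set, so n schemes need n mappings rather than n(n-1)/2 pairwise ones.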
Benefits of supporting DOIs
• Persistent identification
– Not just a location
– Permanent, trackable, name
– Stays the same if ownership, location, control changes
– No need to update customers if location changes
• Can incorporate existing identifiers
– Standard e.g. ISBN, ISSN, ISMN, SICI, ISRC
– Non-standard / public e.g. PII
– Private e.g. workflow, internal production
– Assigned by the publisher or on his behalf
• Can interoperate metadata standards
– Application profiles, kernel metadata, indecsDD
Using DOIs
• Automated link from DOI to any (and multiple) points
– Controlled by the assigner
– e.g. Multiple locations; purchase options; additional info; access control can be made available and controlled globally by the publisher. Can be invoked globally by an intermediary, etc.
• Build your own custom features: entirely extensible architecture
• Generic applicability; any form of intellectual property, any granularity (text, music, audio..)
– Simple standard metadata associated with each DOI to ensure interoperability
• Conforms to, and works with, existing standards
Business benefits
• Promotes ready use of material in a legal, controllable manner
• Proven, implemented, real system in use now
– e.g. CrossRef: 160+ publishers, around 3 million DOIs per year since Jan 2001, around 2 million resolutions per month, supports existing businesses
• Demonstrated unique additional features
– multiple resolution; DOI-APs
– use of these limited only by your imagination
• Low risk
– not a proprietary system; available at low cost
– controlled by neutral, not-for-profit Foundation with single aim
– built on open standards
– comprehensive effort reduces risk of "dead-end": Asia as well as EU, US; multimedia e.g. text, music, software
Leverage other activities
IDF participates in other efforts
• W3C, IETF DRM activities
• PRISM, ONIX, indecs2…..
• ISO TC46, ISO MPEG
• NISO, WIPO, etc
• Music industry: GRID, CR Forum
• Content ID Forum (Japan)
• Indecs
• TV Anytime
etc
No one company can participate in all these
Why is support needed?
• if this is desirable, it must be paid for
• membership supports development until operating federation takes over
• community invests now to get benefit for all
• coordinated work to provide efficient operation
• ensure consistent deployment and avoid fragmentation
• prevent conflicts and promote efficiency
• outreach to other efforts
Benefits of supporting IDF
• Ensure the DOI is widely implemented
– Existing applications need underpinning of consistent rules, infrastructure, and wide uptake
• Ensure Content community sets standards
– Technology standards are not enough (Napster)
– No other existing forum is doing this: W3C, OEBF, MPEG21 etc. all looking at parts
• DOI results from extensive work by AAP, IPA, STM (1997+) - a consistent development path
• IDF has strong position, and support
– Content and technology communities are represented
• Promote collaboration
– interoperate with others; reduce costs, prevent mistakes
– provide a common platform but retain ability to build added-value services
Benefits of supporting IDF
• Cost effective way of gaining access to expertise
– Cost is equivalent to 2-3 man days per month of one consultant (even at highest membership level)
– Detailed monthly briefings on other activities (WIPO, W3C, IETF, MPEG, ISO, OEBF, SIIA, etc), and more expertise available on request
• Preferential access to business opportunities:
– IDF makes connections between members and potential applications: explore possible business opportunities at low risk
– Early access to results of prototypes, plans
• Share cost of development of prototypes
– Costs can be shared by participants
• Influence the course of the IDF
– participate in working groups, annual meeting, prototypes, board
Registration Agencies
• An additional business opportunity for some members
• Build on the features and acceptance of the system
– build on existing services or offer new services
– management of content, management of metadata, etc.
• RAs may build as little or as much as they wish on this
– simple assignment, through to a wide range of services
• RAs determine their own fate:
– IDF provides federal structure for infrastructure, predictable costs and governance model
– open market structure for applications
• Business opportunity is a shared risk:
– DOI service supported by multiple RAs and multiple applications
– Shared costs of the infrastructure
– common infrastructure encourages common added-value tools
DOI: development path
[Figure: development path from initial implementation to full implementation: single redirection (persistent identifier), then metadata, then multiple resolution; alongside, activity tracking of W3C, WIPO, NISO, ISO, UDDI etc.]
A continuing development activity
DOI: components
• A number (or “name”)
– assign a number to something
– (compare: telephone number)
• A description
– what the number is assigned to
– (compare: directory entry)
• An action
– make the number do something
– (compare: the telephone system)
• Policies
– how to get a phone number; billing
– (compare: social structures)
Our aim: Building infrastructure
“Imagine a country where nobody can identify who owns what,
addresses cannot easily be verified, people cannot be made to pay
their debts, resources cannot conveniently be turned into money,
ownership cannot be divided into shares, descriptions of assets are
not standardized and cannot easily be compared, and the rules that
govern property vary from neighbourhood to neighbourhood or even
street to street. You have just put yourself into the life of a
developing country or former communist nation”
“The Mystery of Capital: Why Capitalism
Succeeds in the West and Fails Everywhere
Else” by Hernando de Soto (2000)
Our aim: Building infrastructure
“One of the most important things a formal property system does is
transform assets from a less accessible condition to a more accessible
condition, so that they can do additional work. Unlike physical assets,
representations are easily combined, divided, mobilized, and used to
stimulate business deals. By uncoupling the economic features of an
asset from their rigid, physical state, a representation makes the
asset "fungible" - able to be fashioned to suit practically any
transaction.”
“The Mystery of Capital: Why Capitalism
Succeeds in the West and Fails Everywhere
Else” by Hernando de Soto (2000)
DOI: provide the tools
for representations of
intellectual property
Internet standards: DOI, URN and URL
• Distinguish two issues:
1. The technical specification of "what is" a URN and a URI
2. What this means for practical implementation
1. Internet specs
• See DOI handbook chapter 4
– 4.9 DOI as a URI
– 4.10 DOI as a URN
– equally true of all HDLs – DOIs are HDLs
• Aim: DOIs are persistent across time and unique across network space
• DOIs are URIs (formally draft specification)
• DOIs are URNs (in effect)
• URN and URI proponents disagree
– "the URN wars"
1. Internet specs
http://www.w3.org/addressing
(But largely from IETF, W3C did not see need for URN)
[Figure: URI as the enclosing set, containing URN and URL. URL schemes: ftp:, gopher:, http:. URN scheme: urn:, with Resolution (N2L).]
DOI as URI
• IETF formal spec "URI scheme for Digital Object Identifier"
– Paskin, Norman; Neylon, Eamonn; Hammond, Tony; Sun, Sam; Uniform Resource Identifier (URI) scheme for Digital Object Identifiers (DOIs); http://www.ietf.org/internet-drafts/draft-paskin-doi-uri-00.txt (February 2002)
• An abstract specification (uri:doi:)
– Would be doi: (like tel:) [uri: is not part of the uri spec, unlike urn:]
• May be a pure name or de-referenced by any service
– The namespace provides its own mechanism ("Bootstrapping")
• RFC 2396: UTF-8 encoding allows non-Roman characters
• On its own, it's just a specification!
• Requires code distribution for any implementation
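As a rough illustration of the encoding point, the sketch below percent-encodes the UTF-8 bytes of a DOI and prefixes the proposed doi: scheme. Which characters the draft leaves unescaped is not stated here, so keeping only "/" unescaped is an assumption, as are the function names; the second example DOI is invented.

```python
from urllib.parse import quote, unquote


def doi_to_uri(doi):
    """Percent-encode a DOI (as UTF-8) and prefix the 'doi:' scheme."""
    # Assumption: "/" stays literal; everything outside the unreserved set is escaped.
    return "doi:" + quote(doi, safe="/")


def uri_to_doi(uri):
    """Reverse of doi_to_uri."""
    if not uri.startswith("doi:"):
        raise ValueError(uri)
    return unquote(uri[len("doi:"):])


print(doi_to_uri("10.1234/ISBN 0764548891"))
print(doi_to_uri("10.1234/日本語"))
```

The round trip through uri_to_doi returns the original string, so a DOI containing non-Roman characters survives transport as plain ASCII.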
DOI as URN
• URN is less clear:
– Higher level situation muddy
– Set of IETF drafts that define URN
– Set of registered namespaces (e.g. isbn)
• DOI could be but isn't - no advantage
• Unlike URI, provides a specific DNS-based middle layer (RDS) to find the appropriate resolution service
• Scalability and security questioned; and:
• Little or no resolution implementation
– Resolution proposed is one specific way:
– NAPTR (Name Authority Pointer) turns urn:hdl:10.1000/1 into http://hdl.handle.net/10.1000
– Recently DDDS (Dynamic Delegation Discovery System): variant of NAPTR
DOI as URN
• urn:isbn:123456789 can be defined; but what does it do over and above isbn:123456789?
– neither has a readily available, well known, global resolution
• What if NAPTR were widely deployed (5 years on)?
• Some advantage: could redirect from one URL proxy to another
– urn:doi to http://dx.doi.org/ redirect to http://dx2.doi.org
• But this is a "regular expression": not software
• And still worries about DNS issues
– "Gratuitous use of DNS"
– DNS name servers are widely distributed: inertia
– No security of resolution
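The "regular expression, not software" point can be illustrated with a single rewrite rule of the kind a NAPTR record carries. The rule below is a hypothetical one written in Python's regex syntax, not an actual published DNS record, and it keeps the whole handle in the resulting URL.

```python
import re

# Hypothetical rewrite rule: (pattern, replacement).
RULE = (r"^urn:hdl:(.+)$", r"http://hdl.handle.net/\1")


def rewrite(urn, rule=RULE):
    """Apply one rewrite rule; return None if the URN does not match."""
    pattern, replacement = rule
    if re.match(pattern, urn) is None:
        return None
    return re.sub(pattern, replacement, urn)


print(rewrite("urn:hdl:10.1000/1"))
```

Pointing the same URNs at a different proxy would mean changing only the replacement string, which is the redirect advantage noted above; nothing in the rule itself performs or secures the resolution.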
1. Internet specs
• Persistence across time and network space desirable
• Do not want to bet on the URN logic of putting a resolution system in front of resolution systems
– Especially the one proposed
• But
– DOIs ARE URIs (formally)
– DOIs ARE URNs (in effect)
• But: this is not the most important issue!
2. Practical implementation
• Irrespective of all this URI/URN specification, DOIs are still needed, still useful, still valid
• A DOI is more than HDL
– Adds Policy, business rules, business model
– Adds Metadata specifications (cf ISBN, EAN, Visa)
• e.g. Mappings:
– Ensures semantic integrity
– A technical exercise:
– A term is assigned a unique value in the iDD
– Given a genealogy and ContextDescription
– Other information added
– A mapped term becomes part of the dictionary
• Hence will become more useful as it grows
– Consensual between the two things being mapped
– Painstaking, but once-only
– Specialist services requiring intellectual input
2. Practical implementation
• On this topic, see
• DOI Handbook Ch. 3.6: Social infrastructure
• DOI Handbook Ch. 6: on The Handle System and using HDL without DOIs
• DOI Handbook Ch. 13: on RAs and using DOIs without RAs