Metadata for product information in the book trade


XML and e-commerce in
book and serials publishing
Francis J Cave
Francis Cave Digital Publishing
8th November 2000
SGML UK Meeting:
XML in Publishing
Projects / initiatives using XML
• Digital Object Identifier (DOI) metadata
– kernel metadata
– genre metadata
• Digital rights management
– XrML (ContentGuard)
– ODRL
– <indecs>
• “Book trade” product information
– EPICS
– ONIX
• XML/EDI initiatives
Digital Object Identifier
• International DOI Foundation
• An identifier for intellectual property entities
– persistent
• the identifier remains the same, regardless of ownership
or location of the entity it identifies
– actionable
• the identifier is part of a system that includes technology
(the Handles system) for resolving DOIs to specific
locations
– interoperable
• the identifier uses existing identification schemes, such as
ISBN, ISSN, SICI and PII
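The "actionable" property can be illustrated with a small sketch. It assumes the public proxy at https://doi.org/ (the slide names only the Handle System, not a proxy address) and uses an illustrative DOI, `10.1000/example`. The helper `doi_to_url` is hypothetical; it only builds the URL and makes no network request.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy; the slide itself names only the Handle System.
DOI_PROXY = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build the URL a resolver would redirect to the entity's current location."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path; keep "/" as is.
    return DOI_PROXY + quote(doi, safe="/")


print(doi_to_url("10.1000/example"))
```

Because the DOI stays the same when ownership or location changes, only the resolver's record needs updating; links built this way keep working.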
Digital Object Identifier
• DOI kernel metadata
– needed to provide a minimal description of the identified
entity:
• DOI
• genre (e.g. journal article, e-book, photo)
• unique identifier (e.g. ISBN, ISMN)
• title
• entity type (e.g. work, physical manifestation)
• origination (e.g. original, excerpt)
• primary agent (agent identifier)
• agent role (e.g. publisher, distributor)
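As a rough sketch of what a kernel record might look like when delivered in XML, the eight kernel elements can be modelled as a simple record. The class name, field names, XML element names and all sample values below are assumptions made for illustration; they are not the DOI Foundation's actual format.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class KernelMetadata:
    """The eight kernel elements; the field names are hypothetical."""
    doi: str
    genre: str
    unique_identifier: str
    title: str
    entity_type: str
    origination: str
    primary_agent: str
    agent_role: str


def to_xml(record: KernelMetadata) -> str:
    """Serialise the record as one XML element per kernel field."""
    root = ET.Element("kernel")  # hypothetical root element name
    for f in fields(record):
        ET.SubElement(root, f.name).text = getattr(record, f.name)
    return ET.tostring(root, encoding="unicode")


# All values are placeholders, not real registrations.
record = KernelMetadata(
    doi="10.1000/example",
    genre="e-book",
    unique_identifier="ISBN 0-00-000000-0",
    title="An Example Title",
    entity_type="work",
    origination="original",
    primary_agent="agent-0001",
    agent_role="publisher",
)
print(to_xml(record))
```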
Digital Object Identifier
• DOI genre metadata
– extension of the kernel metadata to meet the
description requirements of a specific genre
– focus on rights metadata
– must follow the <indecs> model
• So what’s this all got to do with XML...?
– registration of DOIs will in future involve mandatory
delivery of DOI metadata in XML
– XML format is likely to be harmonised with ONIX
• Finding out more...
– http://www.doi.org/
Digital Rights Management
• XrML – eXtensible Rights Markup Language
– developed at Xerox PARC
– XML-based language for specifying rights and issuing
permissions associated with works
– licensed by ContentGuard, a Xerox spin-off supported
by Microsoft, Adobe and others
– http://www.xrml.org/
• ODRL – Open Digital Rights Language
– developed by OASIS
• editor Renato Iannella, IPR Systems
– standardised rights language with XML binding
– http://odrl.net/
Book trade metadata
• EDItEUR Product Information Communication
Standard (EPICS)
• ONIX International
EPICS
• Abstract data dictionary
– Defines semantics of product information for the book
trade (books, serials, looseleafs, electronic...)
– Defines an abstract data model for product information
in terms of data elements, composites and value
domains
– Does not define a concrete syntax
– Based upon the principles defined in the <indecs>
metadata schema
EPICS
• Pilot XML expression
– DTD development based upon RDF
– Strong EDI “feel” in choice of syntax
– March 2000: exploring harmonisation with MUZE
MerchEnt
– Development suspended in March as priority switched
to development of ONIX International
• Current status
– Maintained as the abstract foundation for ONIX
– A basis (primary source) for other “book” and allied
trade metadata standards
ONIX International
• True XML application
– XML DTD now ... XML Schema later
– ONIX messages must be valid XML documents
• One DTD, two implementation levels
– Level 1 – meets original US ONIX requirements
• minimal use of composites
• short “numeric” element type names
• aimed at low-tech SMEs
– Level 2 – implements most of the EPICS semantics
• wider use of composites (but fewer than in EPICS)
• choice of short and verbose element type names
• aimed at larger publishers
ONIX International
• ONIX descriptive metadata element groups:
– Product numbers
– Product form
– Series details
– Set details
– Title
– Authorship (contributors)
– Conference
– Edition
– Language
– Pagination and other content
– Subject
– Audience
– Publisher
– Publishing dates
– Territorial rights
– Dimensions
– Descriptions and other text
– Links to image/audio/video
– Prizes
– Replaced by / alternative format
– Supplier, availability and prices
– Sales promotion information
ONIX International
• Metadata group example: “authorship”
– organised into metadata about each individual contributor
(author, editor, translator,...)
– individual contributor information comprises some combination
of any of the following data elements:
• contributor sequence number (display ordering)
• contributor role
• person name
• person name, inverted
• structured person name (up to six elements)
• professional position
• affiliation
• corporate contributor name
• biographical note
• contributor description (other than biographical information)
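The "some combination of any of the following" rule suggests a record in which every element is optional. The sketch below models that; the class name, field names and the `populated` helper are hypothetical and are not ONIX element names. The sample values come from the contributor example used elsewhere in this talk.

```python
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


@dataclass
class Contributor:
    """One contributor; every element is optional (field names are hypothetical)."""
    sequence_number: Optional[int] = None
    role: Optional[str] = None
    person_name: Optional[str] = None
    person_name_inverted: Optional[str] = None
    structured_name: Tuple[str, ...] = ()
    professional_position: Optional[str] = None
    affiliation: Optional[str] = None
    corporate_name: Optional[str] = None
    biographical_note: Optional[str] = None
    description: Optional[str] = None

    def __post_init__(self) -> None:
        # The slide allows a structured person name of up to six elements.
        if len(self.structured_name) > 6:
            raise ValueError("a structured person name has at most six elements")

    def populated(self) -> dict:
        """Return only the elements that were actually supplied."""
        return {k: v for k, v in asdict(self).items() if v not in (None, ())}


schur = Contributor(role="A01", person_name_inverted="Schur, Norman W")
print(schur.populated())
```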
ONIX International
• XML-coded example of a contributor composite,
using short element type names (Level 1):
<contributor>
  <b035>A01</b035>
  <b037>Schur, Norman W</b037>
  <b044>A Harvard graduate in Latin and Italian
  literature, Norman Schur attended the University of
  Rome and the Sorbonne before returning to the United
  States to study law at Harvard and Columbia Law
  Schools. Now retired from legal practice, Mr Schur is
  a fluent speaker and writer of both British and
  American English</b044>
</contributor>
ONIX International
• XML-coded example of a contributor composite,
using verbose element type names:
<Contributor>
  <ContributorRole>A01</ContributorRole>
  <PersonNameInverted>Schur, Norman W</PersonNameInverted>
  <BiographicalNote>A Harvard graduate in Latin and
  Italian literature, Norman Schur attended the
  University of Rome and the Sorbonne before returning
  to the United States to study law at Harvard and
  Columbia Law Schools. Now retired from legal
  practice, Mr Schur is a fluent speaker and writer of
  both British and American English</BiographicalNote>
</Contributor>
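Since the short and verbose forms carry the same content, converting between them is a matter of renaming elements. The sketch below covers only the four tag pairs visible in the two examples; the mapping table and the `to_verbose` helper are illustrative and are not part of any ONIX tooling.

```python
import xml.etree.ElementTree as ET

# Only the pairs shown in the two slide examples; a real table would be longer.
SHORT_TO_VERBOSE = {
    "contributor": "Contributor",
    "b035": "ContributorRole",
    "b037": "PersonNameInverted",
    "b044": "BiographicalNote",
}


def to_verbose(element: ET.Element) -> ET.Element:
    """Return a copy of the element tree with short tags renamed to verbose ones."""
    # Tags missing from the table are left unchanged.
    renamed = ET.Element(SHORT_TO_VERBOSE.get(element.tag, element.tag), element.attrib)
    renamed.text = element.text
    renamed.tail = element.tail
    for child in element:
        renamed.append(to_verbose(child))
    return renamed


short = ET.fromstring(
    "<contributor><b035>A01</b035><b037>Schur, Norman W</b037></contributor>"
)
print(ET.tostring(to_verbose(short), encoding="unicode"))
```

Inverting the dictionary would give the conversion in the other direction.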
ONIX International
• Current status
– Current release (1.1) was issued in July 2000
– Minor corrections in August 2000
– Next release (1.2) expected end mid-November 2000
– Schema under development
• to enable verification of element content
– Extensions in progress for:
• e-books (for release 1.3 in December 2000)
• video (for release in 2001)
• rights metadata
XML/EDI initiatives
• Book Industry Communication (BIC) / EDItEUR
are evaluating XML as an alternative to EDI for
trade with libraries
– draft DTD for library book order messages
• Direction likely to be influenced by the Global
Commerce Initiative (GCI) and ebXML
• Finding out more...
– EDItEUR: http://www.editeur.org/
– GCI: http://globalcommerceinitiative.org/
– ebXML: http://www.ebxml.org/