Developing a Metadata Exchange
Format for Mathematical Literature
David Ruddy
Project Euclid
Cornell University Library
DML 2010 Paris
7 July 2010
History
• Part of the early DML/WDML discussions
• Initial version of MLAP (qualified Dublin Core),
2004-2005
• Effort on a simple DC profile in 2005-2006
– Thierry Bouche, Thomas Fischer, Claude Goutorbe,
David Ruddy
• Dublin Core community refines and
documents its concept of an Application
Profile, 2007-2009
Dublin Core Application Profile
• Dublin Core Abstract Model
– Essentially, an RDF model
• All properties, vocabularies, and syntax
encoding schemes identified by URIs
• Global semantic interoperability
• Semantic web, linked data
DCAP Compliance
• Functional requirements
• Domain model
• Description set profile
• Usage guidelines
• Syntax guidelines
MLAP Functional Requirements
• Typical functions of bibliographic records: find,
identify, select, obtain
• Multilingual support
• Potential capabilities:
– Linking to name authority records
– Citation analysis
– Embedded OpenURL Context Objects
– Rich subject analysis
MLAP: Out of Scope
• Description of publications not available
online
• Identification and description of distinct FRBR
entities (supporting version control)
• Structured author/contributor descriptions
• Machine-processable descriptions of access
embargo periods
MLAP Domain Model
• Entities of the application profile, and their
relationships
[Diagram: entities Publication, Publication Container, and Agent; Creator relationship; cardinalities 0..1 and 0..n]
MLAP Description Set Profile
• Defines how metadata records adhere to the
Description Set Model
• DSP uses a DC constraint language
– Statement templates
– Value constraints
• XML expression of the MLAP DSP:
http://projecteuclid.org/documents/metadata/mlap/mlap_dsp.xml
MLAP Property Namespaces
• DCMI Metadata Terms
• PRISM: Publishing Requirements for Industry
Standard Metadata
• DC Collections Metadata Terms
MLAP Usage Guidelines
• Human-readable presentation of DSP
• Additional content value rules and/or
recommendations
• Examples
• MLAP usage guidelines (HTML):
http://projecteuclid.org/documents/metadata/mlap/
MLAP Syntax Guidelines
• The Description Set Model is neutral regarding
syntactic encoding of description sets
• DC provides specifications for how description
sets may be serialized in plain text, XML,
RDF/XML, and in XHTML meta tags
• MLAP usage guidelines encode examples in plain
text, with alternate encodings in XML, and
eventually RDF/XML
• Neutral approach allows for multiple ways to
exchange metadata
@prefix dcterms: <http://purl.org/dc/terms/>
DescriptionSet (
  Description (
    ResourceURI ( <http://example.org/a/resource/uri> )
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "<div xmlns="http://www.w3.org/1998/Math/MathML">On
          <math alttext="$L$"><mi>L</mi></math>-functions of twisted
          <math alttext="$4$"><mn>4</mn></math>-dimensional
          quaternionic Shimura varieties</div>"
        Language ( en )
        SyntaxEncodingSchemeURI ( <http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> )
      )
    )
  )
)
<?xml version="1.0" encoding="utf-8"?>
<dcds:descriptionSet
    xmlns:dcds="http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/">
  <dcds:description dcds:resourceURI="http://example.org/a/resource/uri">
    <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/title">
      <dcds:literalValueString xml:lang="en"
          dcds:sesURI="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
        <div xmlns="http://www.w3.org/1998/Math/MathML">
          On <math alttext="$L$"><mi>L</mi></math>-functions of
          twisted <math alttext="$4$"><mn>4</mn></math>-dimensional
          quaternionic Shimura varieties</div>
      </dcds:literalValueString>
    </dcds:statement>
  </dcds:description>
</dcds:descriptionSet>
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/a/resource/uri">
    <dcterms:title rdf:parseType="Literal">
      <div xmlns="http://www.w3.org/1998/Math/MathML">
        On <math alttext="$L$"><mi>L</mi></math>-functions of
        twisted <math alttext="$4$"><mn>4</mn></math>-dimensional
        quaternionic Shimura varieties</div>
    </dcterms:title>
  </rdf:Description>
</rdf:RDF>
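A consumer of such a record may want a plain-text fallback for the MathML title. The following is a minimal sketch, not part of MLAP itself: it assumes the RDF/XML shape shown in the example above and substitutes each <math> element with its alttext attribute. The function names plain_text_title and _flatten are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Same record as the RDF/XML example, without the XML declaration.
RECORD = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/a/resource/uri">
    <dcterms:title rdf:parseType="Literal">
      <div xmlns="http://www.w3.org/1998/Math/MathML">
        On <math alttext="$L$"><mi>L</mi></math>-functions of
        twisted <math alttext="$4$"><mn>4</mn></math>-dimensional
        quaternionic Shimura varieties</div>
    </dcterms:title>
  </rdf:Description>
</rdf:RDF>"""


def _flatten(elem):
    """Return the text of elem, replacing each MathML <math> by its alttext."""
    if elem.tag == f"{{{MATHML_NS}}}math":
        # Fall back to the raw MathML text content if alttext is absent.
        text = elem.get("alttext") or "".join(elem.itertext())
    else:
        text = (elem.text or "") + "".join(_flatten(child) for child in elem)
    return text + (elem.tail or "")


def plain_text_title(record_xml):
    """Extract dcterms:title from the first rdf:Description as plain text."""
    root = ET.fromstring(record_xml)
    title = root.find(f"{{{RDF_NS}}}Description/{{{DCTERMS_NS}}}title")
    if title is None:
        raise ValueError("record has no dcterms:title")
    # Collapse the line breaks and indentation of the XML literal.
    return " ".join(_flatten(title).split())


if __name__ == "__main__":
    print(plain_text_title(RECORD))
```

The alttext values in the example are TeX fragments, so the result keeps the mathematics readable in contexts that cannot render MathML.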
Minimal Record Requirements
• Four required elements:
<dcterms:title>
<dcterms:issued>
<dcterms:bibliographicCitation>
<prism:url>
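The four-element minimum can be checked mechanically. The sketch below is an illustration only: it assumes a record has already been parsed into a mapping from prefixed property names to lists of string values, which is a hypothetical in-memory form and not something MLAP defines. The function name missing_required and the sample values are likewise invented for the example.

```python
from typing import List, Mapping, Sequence

# The four properties listed as required for a minimal MLAP record.
REQUIRED_PROPERTIES = (
    "dcterms:title",
    "dcterms:issued",
    "dcterms:bibliographicCitation",
    "prism:url",
)


def missing_required(record: Mapping[str, Sequence[str]]) -> List[str]:
    """Return the required properties that are absent or have only empty values."""
    return [
        prop
        for prop in REQUIRED_PROPERTIES
        if not any(value.strip() for value in record.get(prop, ()))
    ]


if __name__ == "__main__":
    # Hypothetical, incomplete record used only to exercise the check.
    draft = {
        "dcterms:title": ["An example title"],
        "prism:url": ["http://example.org/a/resource/uri"],
    }
    print(missing_required(draft))
```

An empty result means the record meets the minimum; any other result names the properties still to be supplied.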
Potential for Rich Records
• Multilingual values for many properties
• MathML in titles and abstracts
• Complete reference lists
• OpenURL Context Objects for described
publication and all referenced resources
Dedicated Identifiers
• For example:
<prism:url>
for the publication’s HTTP URI, instead of
<dcterms:identifier>
• Also:
<prism:issn>
<prism:eIssn>
<prism:isbn>
<prism:doi>
• Likewise for the publicationContainer entity: dedicated identifier properties
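When converting legacy records that carry every identifier in a single generic field, a simple classifier can route each value to a dedicated property. This is a rough sketch under stated assumptions: the regular expressions are simplified heuristics, the fallback to dcterms:identifier is a choice made for this example and not an MLAP rule, ISBNs are left out, and the syntax of an ISSN alone cannot distinguish prism:issn from prism:eIssn. The function name identifier_property is hypothetical.

```python
import re

# Ordered heuristics: the first matching pattern decides the property.
_PATTERNS = [
    ("prism:doi", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("prism:url", re.compile(r"^https?://\S+$", re.IGNORECASE)),
    # Print and electronic ISSNs share this form; the caller must decide
    # whether prism:eIssn is the more appropriate property.
    ("prism:issn", re.compile(r"^\d{4}-\d{3}[\dXx]$")),
]


def identifier_property(value: str) -> str:
    """Guess a dedicated property for an identifier string."""
    value = value.strip()
    for prop, pattern in _PATTERNS:
        if pattern.match(value):
            return prop
    # Assumption for this sketch: unrecognised values stay generic.
    return "dcterms:identifier"


if __name__ == "__main__":
    for sample in ("http://example.org/a/resource/uri", "10.1234/abc.5678", "1234-567X"):
        print(sample, "->", identifier_property(sample))
```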
Unresolved Issues
• Optimized for serial literature
• Contributor property
– Not easy to capture a role attribute
– Potential solutions add complexity
• MSC codes do not have URIs