An abstract model for DCMI metadata descriptions Andy Powell

An abstract model for DCMI
metadata descriptions
DC Usage Board meeting at DC2003, Seattle
September/October 2003
Andy Powell
[email protected]
UKOLN, University of Bath, UK
http://www.ukoln.ac.uk/
UKOLN is supported by:
I am going to…
• assume people have read the current
‘Abstract Model’ working draft
• propose a revised (more generic)
abstract model
• look at some of the issues that have
been raised
• encourage discussion of the revised
model and the issues
• consider what happens next with the
abstract model document
DC-2003 - Seattle, Sept/Oct 2003
Major issues
• why develop an abstract model?
• what is ‘qualified DC’? why
limit to DCMI properties?
• what is a ‘record’?
• what is ‘simple DC’? why limit to DCMES?
• what is a ‘value’?
• where does DCSV fit in?
• relationship to ‘application profiles’?
• relationship to RDF?
• abstract model and dumb-down?
Why?
• non-syntax-based view of what constitutes
a DC metadata description
• need to understand what kinds of
descriptions we are trying to encode
• best done without reference to any
particular syntax
• allows us to compare and contrast the
capabilities of different encodings
• syntax X supports feature Y but syntax Z
doesn’t
• supports better mappings between
syntaxes
What is qualified DC?
• general feeling that restricting the abstract
model for ‘qualified DC’ to DCMI
properties is too limiting
• real world applications typically go
beyond this
• therefore, need to re-model at more
generic level
• DCMI Abstract Model
(callout: ‘frankly my dear, I don’t give a DAM’)
DCMI abstract model
• a description is made up of one or more
properties and their associated values
• each property is an attribute of the
resource being described
therefore… each description is
about one, and only one, resource
(the 1:1 principle)
• properties may be repeated
• a record is a set of descriptions about
one or more related resources
use of the word record may be a problem?
DCMI abstract model (2)
• each value is a resource
• each value may be denoted by a value
string
• each value string may have an associated
encoding scheme
• each encoding scheme is identified by an
encoding scheme URI
• each value string may have an associated
language (e.g. en-GB)
a value string is a ‘simple’, human-readable string
DCMI abstract model (3)
• each value may be identified by a value
URI
• each value may have an associated rich
value (some marked-up text, an image, a
video, some audio, etc. or some
combination thereof)
• each value may have some associated
related metadata
related metadata is a description of
a related resource – e.g. metadata
about the person who is the creator
of a document…
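The three ‘DCMI abstract model’ slides can be read as a small data structure. The Python sketch below is one possible reading, not something taken from the working draft: the class and field names (Value, Statement, Description, Record) are illustrative assumptions, and the ‘my’ namespace URI is a placeholder because the slides elide the real one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Value:
    """A value is always a resource; every way of pointing at it is optional."""
    value_string: Optional[str] = None          # 'simple', human-readable string
    value_string_lang: Optional[str] = None     # e.g. "en-GB"
    encoding_scheme_uri: Optional[str] = None   # identifies the encoding scheme
    value_uri: Optional[str] = None             # identifies the value resource
    rich_value: Optional[bytes] = None          # marked-up text, image, audio, ...
    related_metadata: Optional["Description"] = None  # description of the value


@dataclass
class Statement:
    """One property of the described resource, with its associated value."""
    property_uri: str
    value: Value


@dataclass
class Description:
    """About one, and only one, resource (the 1:1 principle)."""
    statements: List[Statement]                 # properties may be repeated
    resource_uri: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.statements:
            raise ValueError("a description needs one or more properties")


@dataclass
class Record:
    """A set of descriptions about one or more related resources."""
    descriptions: List[Description] = field(default_factory=list)


DC = "http://purl.org/dc/elements/1.1/"
MY = "http://example.org/my/"  # hypothetical; the slides elide the real namespace

# A description of the creator, attached to the value as related metadata.
person = Description(statements=[Statement(MY + "name", Value(value_string="Andy Powell"))])
creator = Value(value_string="Andy Powell", related_metadata=person)
document = Description(statements=[Statement(DC + "creator", creator)])
record = Record(descriptions=[document, person])
```

Here the record holds two descriptions (the document and its creator), which matches the slide's idea of a record as a set of descriptions about related resources.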
What is a record?
• a record is a set of descriptions about
one or more related resources, e.g.
• a description of a resource and a
description of its creator
• a description of a resource, a rights
statement about the resource and a
description of the description
• note: a description is about a single
resource and is made up of one or more
properties and their associated values
What is a value?
• a value is the physical or conceptual entity
that is associated with a property when it
is used to describe a resource
• a person (physical)
• an organisation (physical)
• a subject (conceptual)
• a country (physical)
• a type (conceptual)
• etc.
• therefore, in the abstract model,
a value is always a resource
A value is always a resource
• in the DCMI abstract model, a value is
always a resource
• the value resource may
• be identified by a value URI
• be denoted by a string value and/or
a rich value
• have some associated related
metadata
• …but the value is always a resource!
• I think this has an impact on the RDF
encodings??
But some problems…
• some problems with wording of existing
DCMES definitions…
• CCP (creator, contributor, publisher) element
values defined to be a ‘…resource…’
• relation, identifier and source defined
to be a ‘…reference to a resource…’
• rights defined to be either a
‘…resource…’ or a ‘link to a service that
provides a resource…’
• problem: too much of the model is
embedded into the definition!
What is qualified DC?
• a ‘qualified DC record’ is …
• any record that
• conforms to the DCMI abstract
model
• contains a description that uses at
least one DCMI term
however, this means that it is
probably not possible to define a
single XML schema for qualified DC
records – but can provide a template
XML schema
What is simple DC?
• a ‘simple DC record’ is …
• any record that
• conforms to the DCMI abstract
model
• comprises only a single description
• uses only properties taken from
DCMES
• makes no use of value URIs,
encoding schemes, rich values or
related metadata
…or to put it differently
• a simple DC record is made up of a single
description
• that description is made up of one or more
properties and their associated values
• each property is an attribute of the
resource being described
• each property must be one of the 15
DCMES elements
• properties may be repeated
• each value is denoted by a value string
• each value string may have an associated
language (e.g. en-GB)
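The conditions on this slide can be checked mechanically. The sketch below is only an illustration under assumed conventions: a record is a list of descriptions, a description is a list of (property URI, value) pairs, and a value is a dict whose keys (value_string, lang, value_uri and so on) are names invented here.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 DCMES elements.
DCMES = {DC_NS + name for name in (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights")}

# Simple DC allows a value string and, optionally, its language. Nothing else.
ALLOWED_VALUE_KEYS = {"value_string", "lang"}


def is_simple_dc(record):
    """Return True if the record meets the 'simple DC' conditions listed above."""
    if len(record) != 1:                 # a single description only
        return False
    description = record[0]
    if not description:                  # one or more properties
        return False
    for property_uri, value in description:
        if property_uri not in DCMES:    # each property is a DCMES element
            return False
        if "value_string" not in value:  # each value is denoted by a value string
            return False
        if set(value) - ALLOWED_VALUE_KEYS:  # no value URI, scheme, rich value, ...
            return False
    return True


simple = [[(DC_NS + "title", {"value_string": "An abstract model", "lang": "en-GB"}),
           (DC_NS + "creator", {"value_string": "Andy Powell"})]]
with_value_uri = [[(DC_NS + "subject", {"value_string": "D08.586.682.075.400",
                                        "value_uri": "http://example.org/terms/x"})]]
```

With these inputs, is_simple_dc(simple) holds while is_simple_dc(with_value_uri) does not, because the second value carries a value URI.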
…or to put it differently
• simple DC is an ‘application profile’ that
only uses terms taken from the DCMES
simple DC and value URIs
• all values in simple DC are denoted
using only a value string
• the value string can be a URI…
• …but there is nothing to formally
indicate that the value string is a URI
• simple DC software applications may
choose to guess which value strings are
URIs and which aren’t
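One way such guessing might look is sketched below. This is a heuristic invented for illustration, not a DCMI rule: the function name and the list of ‘likely’ schemes are assumptions, and any real application would pick its own.

```python
from urllib.parse import urlsplit

# Assumed allowlist; without one, a string such as "Title:Subtitle"
# would be mistaken for a URI with the scheme "title".
LIKELY_SCHEMES = {"http", "https", "ftp", "urn", "mailto", "info"}


def looks_like_uri(value_string):
    """Guess whether a simple DC value string is meant to be a URI."""
    if not value_string or any(ch.isspace() for ch in value_string):
        return False
    try:
        parts = urlsplit(value_string)
    except ValueError:       # e.g. malformed bracketed host
        return False
    return parts.scheme in LIKELY_SCHEMES
```

The guess can still be wrong in both directions, which is the slide's point: simple DC gives no formal signal either way.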
Simple DC and audience
• why isn’t dcterms:audience included in
‘simple DC’?
• because single namespace is simpler
than multiple namespaces
• dc:xxx and dcterms:xxx
• because static definition is simpler
than one that grows over time
• audience + … + …
• because, arguably, audience not part of
the ‘core’
• the ‘t-shirt’ problem
Abstract model and DCSV?
• DCSV provides mechanism for encoding
‘markup’ in value string
• thus DCSV runs slightly counter to the
abstract model
• DCSV better handled as ‘related metadata’
• e.g. Period provides related metadata
about a conceptual ‘period in time’
• impact? XML enc. good – string enc. bad?
• suggest no new proposals based on DCSV
for the time being
What is a DCAP?
• a Dublin Core Application Profile (as
currently defined) declares the
properties and encoding schemes used
to construct a description as used
within a particular application
• problems…
• DCAPs don’t currently cover the
whole abstract model
• DCAPs define what a description is –
but most ‘applications’ need defining
at the record level
RDF vs. abstract model
• what is the relationship between RDF
and the abstract model?
• RDF provides richest encoding syntax
currently
• full encoding of all features of the
model
• but expect to see model fully
implemented in XML as well
• (expect HTML syntax to always be a
partial implementation)
Dumb-down
• intelligent vs. dumb, element vs. value
• element dumb-down (dumb)
• ignore anything that isn’t [DCMES/an element]
• element dumb-down (intelligent)
• resolve sub-properties until you get to
[DCMES/an element]
• value dumb-down (dumb)
• use value URI or value string as value string
• value dumb-down (intelligent)
• use knowledge of related metadata, or value
string to create new value string
• resolve sub-classes/broader terms
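The element and value cases above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function names, the tiny DCMES subset, the sub-property table (dcterms:created refining dc:date) and the choice to prefer the value string over the value URI. Intelligent value dumb-down is not attempted.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Illustrative subset of DCMES and of the sub-property declarations.
DCMES = {DC_NS + "date", DC_NS + "creator", DC_NS + "subject"}
SUB_PROPERTY_OF = {DCTERMS_NS + "created": DC_NS + "date"}


def element_dumb_down(property_uri, intelligent=False):
    """Return a DCMES element URI, or None if the property must be ignored."""
    if property_uri in DCMES:
        return property_uri
    if not intelligent:
        return None                      # dumb: ignore anything that isn't DCMES
    seen = set()
    while property_uri in SUB_PROPERTY_OF and property_uri not in seen:
        seen.add(property_uri)           # guard against cyclic declarations
        property_uri = SUB_PROPERTY_OF[property_uri]
        if property_uri in DCMES:
            return property_uri          # intelligent: resolved via sub-properties
    return None


def value_dumb_down(value):
    """Dumb value dumb-down: use the value string or the value URI as value string."""
    return value.get("value_string") or value.get("value_uri")


def dumb_down(description, intelligent=False):
    """Reduce (property URI, value dict) pairs to (DCMES URI, string) pairs."""
    result = []
    for property_uri, value in description:
        element = element_dumb_down(property_uri, intelligent)
        string = value_dumb_down(value)
        if element is not None and string is not None:
            result.append((element, string))
    return result


description = [
    (DCTERMS_NS + "created", {"value_string": "2003-09-28"}),
    (DCTERMS_NS + "audience", {"value_string": "teachers"}),
    (DC_NS + "subject", {"value_uri": "http://example.org/terms/D08"}),
]
```

Dumb element dumb-down keeps only the dc:subject statement from this description; the intelligent variant also keeps the creation date, rewritten as dc:date, and still drops dcterms:audience because it has no DCMES parent.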
sub-properties and classes
• RDFS and human-readable declarations
of DCMI terms refer to sub-properties
and sub-classes
• however, these don’t formally appear in
the abstract model (except as part of
dumb-down)
• where do these fit into the model?
• I think they belong in the
‘grammatical principles’ document
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>[email protected]</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
Example RDF description using dc:creator…
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>[email protected]</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:creator, rdfs:label, my:name, my:email, my:affiliation. Literals: ‘Andy Powell…’, ‘Andy Po…’, ‘a.powell@uko…’, ‘UKOLN, Univ…’]
…and the RDF model it represents.
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>[email protected]</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:creator, rdfs:label, my:name, my:email, my:affiliation. Literals: ‘Andy Powell…’, ‘Andy Po…’, ‘a.powell@uko…’, ‘UKOLN, Univ…’. Part of the graph is marked ‘relatedMetadata’]
But… we don’t want to embed all this information into every instance metadata record, do we?
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:creator, rdfs:label. Literal: ‘Andy Powell…’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <my:name>Andy Powell</my:name>
    <my:email>[email protected]</my:email>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: my:name, my:email, my:affiliation. Literals: ‘Andy Po…’, ‘a.powell@uko…’, ‘UKOLN, Univ…’]
Need to separate part of the information out and store it in a single place – in this case in a directory service…
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:creator, rdfs:label. Literal: ‘Andy Powell…’. Value node labelled ‘valueURI’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <my:name>Andy Powell</my:name>
    <my:email>[email protected]</my:email>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: my:name, my:email, my:affiliation. Literals: ‘Andy Po…’, ‘a.powell@uko…’, ‘UKOLN, Univ…’. Subject node labelled ‘valueURI’]
To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:creator, rdfs:label. Literal: ‘Andy Powell…’. Value node labelled ‘valueURI’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <my:name>Andy Powell</my:name>
    <my:email>[email protected]</my:email>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: my:name, my:email, my:affiliation. Literals: ‘Andy Po…’, ‘a.powell@uko…’, ‘UKOLN, Univ…’. Subject node labelled ‘valueURI’; the document as a whole is labelled ‘relatedMetadataURI’]
The document containing this information is itself an RDF resource (the ‘relatedMetadata’) and has a URI
Example 1 – dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>[email protected]</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:creator, rdfs:label, rdfs:seeAlso. Literal: ‘Andy Powell…’. Value node labelled ‘valueURI’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:my="http://purl.org…">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>[email protected]</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: my:name, my:email, my:affiliation. Literals: ‘Andy Po…’, ‘a.powell@uko…’, ‘UKOLN, Univ…’. Subject node labelled ‘valueURI’; the document as a whole is labelled ‘relatedMetadataURI’]
Use rdfs:seeAlso to form linkage between description and relatedMetadata…
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
        <rdfs:label>Formate Dehydrogenase</rdfs:label>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
Example RDF description using dc:subject (taken from Qualified DC in RDF recommendation)…
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
        <rdfs:label>Formate Dehydrogenase</rdfs:label>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:subject, rdfs:label, rdf:value, rdf:type. Literals: ‘Formated…’, ‘D08.586…’. Class node: dcterms:MESH]
…and the RDF model it represents.
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
        <rdfs:label>Formate Dehydrogenase</rdfs:label>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:subject, rdfs:label, rdf:value, rdf:type. Literals: ‘Formated…’, ‘D08.586…’. Class node: dcterms:MESH. Part of the graph is marked ‘relatedMetadata’]
But… we don’t want to embed all this information into every instance metadata record, do we?
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:subject, rdfs:label, rdf:type. Literal: ‘Formated…’. Class node: dcterms:MESH]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>
[Diagram: RDF graph. Literals: ‘D08.586…’, ‘Formated…’. Class node: dcterms:MESH]
Need to separate part of the information out and store it in a single place – in this case with the terminology owner…
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:subject, rdfs:label, rdf:type. Literal: ‘Formated…’. Class node: dcterms:MESH. Value node labelled ‘valueURI’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>
[Diagram: RDF graph. Literals: ‘D08.586…’, ‘Formated…’. Class node: dcterms:MESH. Subject node labelled ‘valueURI’]
To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:subject, rdfs:label, rdf:type. Literal: ‘Formated…’. Class node: dcterms:MESH. Value node labelled ‘valueURI’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>
[Diagram: RDF graph. Literals: ‘D08.586…’, ‘Formated…’. Class node: dcterms:MESH. Subject node labelled ‘valueURI’; the document as a whole is labelled ‘relatedMetadataURI’]
The document containing this information is itself an RDF resource (the ‘relatedMetadata’) and has a URI
Example 2 – dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: RDF graph. Arcs: dc:subject, rdfs:label, rdfs:seeAlso, rdf:type. Literal: ‘Formated…’. Class node: dcterms:MESH. Value node labelled ‘valueURI’]
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>
[Diagram: RDF graph. Literals: ‘D08.586…’, ‘Formated…’. Class node: dcterms:MESH. Subject node labelled ‘valueURI’; the document as a whole is labelled ‘relatedMetadataURI’]
Use rdfs:seeAlso to form linkage between description and relatedMetadata…
Abstract DC model
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=http://www….
         xmlns:rdfs=http://www.w3.org/…
         xmlns:dc=http://purl.org/dc/…
         xmlns:dcterms="http://purl.org…">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>
[Diagram: the two RDF graphs from the previous slides (arcs dc:subject, rdfs:label, rdfs:seeAlso, rdf:type; literals ‘Formated…’, ‘D08.586…’; class node dcterms:MESH) annotated with abstract model terms: resource, property, valueURI, valueString, (valueStringLang), encodingScheme, relatedMetadata, relatedMetadataURI]
In terms of abstract DC model we now have: resource, property, valueURI, valueString (and valueStringLang), encodingScheme, relatedMetadata