Linked data: Who needs it? - American Library Association


Linked data: The play’s the thing
Ed Jones, National University (San Diego)
ALA Annual Conference (New Orleans)
Waiting for the Semantic Web
VLADIMIR: Well, shall we move to the Semantic Web?
ESTRAGON: Yes, let’s move to the Semantic Web.
[They don’t move.]
[apologies to Samuel Beckett]
Outline
1. The playground
2. Playground rules
3. The players
   1. Playing nice
   2. Playing sort of nice
   3. Playing not so nice
The playground
[Linked open data cloud diagram, by Richard Cyganiak and Anja Jentzsch, http://lod-cloud.net/]
Who we play with
Playground rules
Ranganathan’s first law of linked data:
Data is for use
[or, for the true die-hard, Data are for use]
Corollary 1 to Ranganathan’s first law
Functional granularity
BISG DP on ISTC:
What gets an ISTC? Moby-Dick alternatives:
1. Every version is Moby-Dick (one ISTC)
2. All versions derive from an Ur-parent (Melville scholar) (one ISTC for Ur-parent, one ISTC for each derivative)
3. Some versions derive from different texts (librarian) (one ISTC for Ur-parent, one ISTC for each derivative text)
4. Some versions are augmented by introduction and notes that are separate works (“an even more pedantic librarian, dancing angels on the head of a pin”) (one ISTC for …, one ISTC for each component (introduction, biographical note, etc.))
Corollary 2 to Ranganathan’s first law
“If you build it, they will come”
1. There is (or will be [maybe, hopefully]) a lot of linkable data out there
2. Others will want some of our data and make links
3. We will want some of theirs and make links
The players



Martin Prince (Playing nice)
Ralph Wiggum (Playing sort of nice)
Nelson Muntz (Not playing nice)
Martin Prince: Playing nice
©2009 Twentieth Century
Fox Film Corporation
Playing nice: Tim Berners-Lee’s rules for linked data
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs so that they can discover more things.
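Rules 2 and 3 can be sketched in a few lines of Python. This is only an illustration, not anything shown in the talk: it prepares (without sending) an HTTP lookup that asks for RDF instead of HTML. The URI is one of the RDA element URIs quoted later in these slides; the helper name `build_lookup` and the choice of media types are assumptions, and whether a given server honors the Accept header is not guaranteed.

```python
from urllib.request import Request

# An RDA element URI quoted elsewhere in this talk; any HTTP URI would do.
ELEMENT_URI = "http://RDVocab.info/Elements/placeOfPublicationManifestation"


def build_lookup(uri: str) -> Request:
    """Prepare, but do not send, a GET request that asks for RDF.

    Hypothetical helper for illustration only.
    """
    if not uri.startswith(("http://", "https://")):
        # Rule 2: a name that is not an HTTP URI cannot simply be looked up.
        raise ValueError(f"not an HTTP URI: {uri}")
    # Rule 3: ask for a standard RDF serialization (media types are an assumption).
    return Request(uri, headers={"Accept": "application/rdf+xml, text/turtle;q=0.9"})


request = build_lookup(ELEMENT_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

An identifier such as info:lccn/ca35000361 would be rejected by this helper: it names a thing (rule 1) but gives a client nothing to look up over HTTP (rule 2).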
How we play: Group 3
How we play: Group 2
How we play: RDA element sets and value vocabularies
Wie sie spielen (“How they play”): GND
Summary: Martin (Sir Tim’s rules)
1. There are some nice resource files of FRBR Group 2 and 3 entities available in RDF
2. RDA-specific vocabularies are making headway
3. The Germans are eating our lunch
Ralph Wiggum: Playing sort of nice
©2009 Twentieth Century
Fox Film Corporation
Functional granularity
How much granularity in RDA?
Too little?
Too much?
Just right?
Too little granularity

Sometimes it may be more useful to express an attribute
at a more granular level

RDA 7.12 Language of the content (captions)
RDA 7.14 Accessibility content (closed captions)





[041]17 $a en $j de $2 iso639-1
[546]\\ $a Closed captioning in German.
http://RDVocab.info/Elements/languageOfTheContentExpression
http://RDVocab.info/Elements/accessibilityContentExpression
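The granularity gap in this example can be made concrete with a small Python sketch. It assumes the 041 subfields are available as a plain string like the one on the slide; `parse_subfields` is a hypothetical helper, not part of any MARC library. The MARC field keeps the language of the content ($a) apart from the language of the captions ($j), whereas a single "language of the content" value list does not.

```python
def parse_subfields(field: str) -> dict:
    """Split a '$a en $j de' style string into {code: [values]} (illustrative only)."""
    subfields = {}
    for chunk in field.split("$"):
        chunk = chunk.strip()
        if not chunk:
            continue
        code, value = chunk[0], chunk[1:].strip()
        subfields.setdefault(code, []).append(value)
    return subfields


f041 = parse_subfields("$a en $j de $2 iso639-1")

# Coarse view: everything lands in one "language of the content" bucket.
coarse = sorted(f041.get("a", []) + f041.get("j", []))

# Finer view: the role of each language survives.
fine = {"content": f041.get("a", []), "captions": f041.get("j", [])}

print(coarse)
print(fine)
```

In the coarse view a consumer can no longer tell that German applies only to the captions; that information survives only in the free-text 546 note.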
Too little granularity

RDA 2.15.1.4: Record identifiers in accordance with any
prescribed display format; otherwise precede the
identifier with a relevant trade name or agency


020 ISBN (scope: binding / publisher / unit / acidity / ebook format / etc.)
022 ISSN
024 Lots of others (ISMN / EAN / DOI / GTIN-14 / etc.)

http://RDVocab.info/Elements/identifierForTheManifestation

Too little or too much granularity?






RDA 2.8.2.3 Recording Place of Publication
RDA 2.20.7.3 Details Relating to Publication Statements
[260] $a Nizhny Novgorod : $b Izd-vo “Spasibo!”, $c 2008.
[500] $a Published in Nizhny Novgorod, South Carolina.
http://RDVocab.info/Elements/placeOfPublicationManifestation
http://RDVocab.info/Elements/noteOnPublicationStatementManifestation
Too much granularity?
RDA 2.7 Production Statement → 260 $a-c
  RDA 2.7.2 Place of Production → 260 $a
  RDA 2.7.3 Parallel Place of Production → 260 $a
  RDA 2.7.4 Producer’s Name → 260 $b
  RDA 2.7.5 Parallel Producer Name → 260 $b
  RDA 2.7.6 Date of Production → 260 $c
RDA 2.8 Publication Statement → 260 $a-c
  Ditto
RDA 2.9 Distribution Statement → 260 $a-c
  Ditto
ISBD RDF

http://iflastandards.info/ns/isbd/elements/[property]

hasPlaceOfPublicationProductionDistribution
<info:lccn/ca35000361> <isbd:P1016> “Paris”
hasNameOfPublisherProducerDistributor
<info:lccn/ca35000361> <isbd:P1017> “Pagnerre”
hasDateOfPublicationProductionDistribution
<info:lccn/ca35000361> <isbd:P1018> “1862”



hasPublicationProductionDistributionEtcArea
<info:lccn/ca35000361> <isbd:P1162> “Paris : Pagnerre, 1862.”
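As a rough sketch of how the four statements above relate, the following Python expands the isbd: prefix to the namespace given on this slide, writes each statement as an N-Triples line, and rebuilds the aggregated area string from the three granular values. The function names are hypothetical, and the punctuation pattern covers only this simple single-place, single-publisher case.

```python
ISBD = "http://iflastandards.info/ns/isbd/elements/"  # namespace from the slide
SUBJECT = "info:lccn/ca35000361"


def ntriple(subject: str, prop: str, literal: str) -> str:
    """Serialize one statement with a plain literal object as an N-Triples line."""
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{ISBD}{prop}> "{escaped}" .'


def area_statement(place: str, name: str, date: str) -> str:
    """Assemble the area string with ISBD punctuation (simplest case only)."""
    return f"{place} : {name}, {date}."


place, name, date = "Paris", "Pagnerre", "1862"

triples = [
    ntriple(SUBJECT, "P1016", place),  # place of publication, production, distribution
    ntriple(SUBJECT, "P1017", name),   # name of publisher, producer, distributor
    ntriple(SUBJECT, "P1018", date),   # date of publication, production, distribution
    ntriple(SUBJECT, "P1162", area_statement(place, name, date)),  # whole area
]

for line in triples:
    print(line)
```

Going from the granular elements to the area string is mechanical; going the other way means parsing punctuation, which is one reason the level at which data is recorded matters.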
How do we use the Publication, etc., area?
Sometimes others have more granular metadata
RDA: Publisher’s name (transcribed string from preferred source, may be publisher name or publisher imprint)
ONIX: Publisher (controlled name)
ONIX: Imprint (controlled name)
Example: “China's Multilateral Co-operation in Asia and the Pacific”
Taylor & Francis Group publishes it under its Routledge imprint
[260] $a Milton Park, Abingdon, Oxon ; $a New York : $b Routledge, $c c2010.
And even more granular…







<Publisher>
<PublishingRole> <b291> [List 45]
01=Publisher
02=Co-publisher
03=Sponsor
05=Host/distributor of electronic content
06=Published for/on behalf of
07=Published in association with
etc.
<NameCodeType> <b241> [List 44]
<NameCodeTypeName> <b242>
<NameCodeValue> <b243>
<PublisherName> <b081>
</Publisher>
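To show what the extra granularity buys, here is a minimal Python sketch that reads a Publisher composite and an Imprint composite and labels the publishing role using the List 45 codes quoted above. The XML fragment is a hypothetical one assembled from the facts on the previous slide (Taylor & Francis Group publishing under its Routledge imprint); it is not the actual ONIX record for that title, and a real ONIX message carries many more elements.

```python
import xml.etree.ElementTree as ET

# Subset of ONIX List 45 as quoted on the slide.
ROLE_LABELS = {
    "01": "Publisher",
    "02": "Co-publisher",
    "03": "Sponsor",
    "05": "Host/distributor of electronic content",
    "06": "Published for/on behalf of",
    "07": "Published in association with",
}

# Hypothetical fragment for illustration only.
SAMPLE = """
<Product>
  <Imprint><ImprintName>Routledge</ImprintName></Imprint>
  <Publisher>
    <PublishingRole>01</PublishingRole>
    <PublisherName>Taylor &amp; Francis Group</PublisherName>
  </Publisher>
</Product>
"""


def publishing_statements(xml_text: str) -> dict:
    """Return the imprint plus each publisher name keyed by its role label."""
    root = ET.fromstring(xml_text.strip())
    result = {"Imprint": root.findtext("Imprint/ImprintName")}
    for publisher in root.findall("Publisher"):
        role = publisher.findtext("PublishingRole")
        # Fall back to the raw code if it is not in our partial list.
        result[ROLE_LABELS.get(role, role)] = publisher.findtext("PublisherName")
    return result


statements = publishing_statements(SAMPLE)
print(statements)
```

The MARC 260 $b for the same book records only "Routledge", so the distinction between imprint and publisher is available from the ONIX side but not from the transcribed string.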
Summary: Ralph
RDA granularity may be like in Goldilocks and the Three Bears:
1. Some elements maybe too granular
2. Some elements maybe not granular enough
3. Some (probably most) elements just right
Nelson Muntz: Not playing nice
©2009 Twentieth Century
Fox Film Corporation
Legacy format
An analogy

RDA = Acela Express

MARC 21 = Too-close tracks and outdated catenary support
Legacy data
Links to resources (FRBR Group 1):
  Work/expression possibly via machine population of subfield $0
  Otherwise manifestation level only, mainly serials/IR only (fields 760-787)
Links to agents (FRBR Group 2):
  Probably via machine population of subfield $0
Links to subjects (FRBR Group 3):
  Probably via machine population of subfield $0
Links to other vocabularies:
  Nope
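"Machine population of subfield $0" could look roughly like the following Python sketch: normalize the heading string found in a legacy field, look it up in an index of authority headings, and append a $0 carrying the matching URI. Everything here is an assumption made for illustration: the in-memory index, the example.org URI, the crude normalization, and the representation of a field as (code, value) pairs. A real process would match against actual authority files and would need to handle ambiguous or near matches.

```python
import unicodedata

HEADING_CODES = {"a", "b", "c", "d", "q"}  # subfields treated as part of the heading


def match_key(heading: str) -> str:
    """Crude match key: Unicode-normalize, casefold, collapse spaces, drop trailing punctuation."""
    text = unicodedata.normalize("NFKC", heading).casefold()
    return " ".join(text.split()).rstrip(" .,;")


# Hypothetical authority index; the URI is a placeholder, not a real identifier.
AUTHORITY_INDEX = {
    match_key("Melville, Herman, 1819-1891"): "http://example.org/authority/melville-herman",
}


def populate_subfield_0(subfields: list) -> list:
    """Return a copy of the field with $0 appended when the heading matches the index."""
    if any(code == "0" for code, _ in subfields):
        return list(subfields)  # already linked; leave untouched
    heading = " ".join(value for code, value in subfields if code in HEADING_CODES)
    uri = AUTHORITY_INDEX.get(match_key(heading))
    return list(subfields) + ([("0", uri)] if uri else [])


field_100 = [("a", "Melville, Herman,"), ("d", "1819-1891.")]
linked = populate_subfield_0(field_100)
unmatched = populate_subfield_0([("a", "Jones, Ed")])

print(linked)
print(unmatched)
```

Headings that do not match stay as plain strings, which is why the slide says "probably" and "possibly" rather than promising full coverage.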
Where will Area 4 go?
AACR2 X.4 (MARC 260) Publication, distribution, etc., area
  Place of publication, distribution, etc.
  Publisher, distributor, etc.
  Etc.
Maps to … ???
RDA 2.7 Production Statement
  Place of Production
  Producer’s Name
  Etc.
RDA 2.8 Publication Statement
  Ditto
RDA 2.9 Distribution Statement
  Ditto
Summary
What to link?
  Whatever we find useful (functional granularity)
How does RDA play with the SW?
  Potentially well. However…
  We have to get on the SW to play there
  We need appropriate levels of granularity to accommodate / exploit existing data
  We need to link more outside our little corner of the linked data cloud
I ♥ ISBD
Новая эстонская новелла : 1990-е годы : Пер. с эстон. / Сост.: Пирет
Вийрес; Послесл.: Каяр Прууль. - Таллинн : Aleksandra, Cop. 1999. - 302,
[1] с. : портр.; 21 см. - (Библиотека журнала Таллинн; 7).
ISBN 9985-827-41-4
Художественная литература -- Эстония -- Эстонская литература -- 2-ая
пол. 20 в. -- Рассказы -- Сборник разных авторов
Хранение: 2P 8/44-2;
Not sure what this means