eXtensible Catalog eXtensible Catalog: Tools for the creation and use of RDA, FRBRized and linked data David Lindahl eXtensible Catalog Organization University of Rochester, River Campus.

eXtensible Catalog
eXtensible Catalog:
Tools for the creation and use of RDA,
FRBRized and linked data
David Lindahl
eXtensible Catalog Organization
University of Rochester, River Campus Libraries
Rochester, NY
LITA National Forum
September 30, 2011
Funders and Sponsors
Major Funding
• Andrew W. Mellon Foundation
Sponsors
• Consortium of Academic and Research Libraries
in Illinois (CARLI)
• Kyushu University
• University of North Carolina at Charlotte
• University of Rochester
User Research
Problem:
• User research is of limited value if a library
doesn’t have control over its discovery
environment
• Our solution:
– Develop our own software (eXtensible Catalog)
– Offer a modular architecture (4 “toolkits”)
– Build in tons of configurability
– Use established standards and protocols
– Give it away (open source)
XC User Research Approach
• What articles, books and other resources had
researchers used most recently?
– How did they know the items existed?
– How did they obtain them?
– How did they use them?
• How do they keep current in their fields?
User Research Findings
• Users want to choose between versions of a
resource, see relationships between resources
– Underlying XC metadata is based on FRBR model:
works, expressions, manifestations, etc.
– Use some RDA data elements in FRBR structure
– Metadata services to aggregate/group FRBR entities
in the User Interface
User Research Findings
• Users have preferred material and format
types, depending upon their projects
– Show online materials only
– Exclude microforms
• Users want to know why items appear on a
search result list
– Show keywords in context
Acting on User Research Findings
XC: “Taking Control” of metadata
• More Control over Metadata
• More Options for Customizing the User Interface
XC Schema
• Dublin Core terms (DCMI) – all
• RDA – subset of elements and role designators
• XC elements (newly-defined)
– when necessary to contain MARC vocabularies, linking fields, etc.
Discovery Interface
Translating User Research Findings into XC Functionality
FRBR Structure - Pyramid
Work
Expression   Expression
Manifestation   Manifestation   Manifestation
Holdings   Holdings   Holdings   Holdings
FRBR Structure - Hourglass
Work   Work   Work
Expression   Expression   Expression
Manifestation
Holdings   Holdings   Holdings
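The pyramid and hourglass shapes can be sketched as flat lists of records that point upward to their parents. This is only an illustration: the `Record` class, the `uplinks` field, the record ids, and the choice of which child links to which parent are all assumptions made for the example, not XC's actual data model.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

LEVELS = ["work", "expression", "manifestation", "holdings"]


@dataclass
class Record:
    """Hypothetical FRBR-style record that links upward to its parents."""
    record_id: str
    level: str                                          # one of LEVELS
    uplinks: List[str] = field(default_factory=list)    # ids one level up


def shape(records):
    """Count the records at each FRBR level, from work down to holdings."""
    counts = Counter(r.level for r in records)
    return [counts[level] for level in LEVELS]


# Pyramid: one work fans out into more records at each lower level.
pyramid = [
    Record("w1", "work"),
    Record("e1", "expression", ["w1"]),
    Record("e2", "expression", ["w1"]),
    Record("m1", "manifestation", ["e1"]),
    Record("m2", "manifestation", ["e1"]),
    Record("m3", "manifestation", ["e2"]),
    Record("h1", "holdings", ["m1"]),
    Record("h2", "holdings", ["m2"]),
    Record("h3", "holdings", ["m3"]),
    Record("h4", "holdings", ["m3"]),
]

# Hourglass: three works meet in a single manifestation, held three times.
hourglass = [
    Record("w1", "work"),
    Record("w2", "work"),
    Record("w3", "work"),
    Record("e1", "expression", ["w1"]),
    Record("e2", "expression", ["w2"]),
    Record("e3", "expression", ["w3"]),
    Record("m1", "manifestation", ["e1", "e2", "e3"]),
    Record("h1", "holdings", ["m1"]),
    Record("h2", "holdings", ["m1"]),
    Record("h3", "holdings", ["m1"]),
]

print(shape(pyramid))    # [1, 2, 3, 4]
print(shape(hourglass))  # [3, 3, 1, 3]
```

Storing links on the child rather than the parent lets one manifestation point at several expressions, which is what the hourglass case needs.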
Software Overview
Discovery, Metadata Management, and Connectivity
XC Software
• Drupal Toolkit: User Interface (Search, Browse)
• MST Toolkit: Metadata Services (Cleanup, Format Convert)
• OAI Toolkit: ILS Connectivity (Synchronize data with XC)
• NCIP Toolkit: ILS Connectivity (Circ. status, Account info)
Each toolkit is eXtensible with add-on packages:
• User Interface Features
• More Metadata Services
• ILS Export Scripts
• XSLT Scripts
• ILS connectors
XC Software
[Diagram: the four toolkits (Drupal, MST, OAI, NCIP) connected to a Voyager ILS through two Voyager “Driver” components; labeled flows: Metadata, User Interface, Live Circ. Data]
Drupal Toolkit
Drupal Toolkit Features
• Search/Browse
• Customization and theming
• Platform for applications
– Library website
– Modules add functionality
Drupal Toolkit In Use
• Cute.Catalog @ Kyushu University
• “Creating Communities” @ Denver Public Library
Metadata Services Toolkit (MST)
MST Features
• Collect metadata from repositories
• Process metadata with services:
– Normalize
– Convert
– Merge
– Add identifiers
• Platform for building new services
MST In Use
• Demonstration Server @ Rochester
• Perseus Digital Library @ Tufts University (dev.)
• Union Catalogue @ Ministerio de Cultura, Madrid, Spain
Metadata Services Toolkit
• Sources (via OAI-PMH): ILS, ILS, IR, Digital Repository
• XC Metadata Services Toolkit: MST decides which services to run, and in which order, to process incoming records
– Cleanup: MARC Normalization, DC Normalization
– Format conversion: MARC to XC Transformation, DC to XC Transformation
– Merge: XC Aggregation
– Add Identifiers: XC Authority
• Output (via OAI-PMH): Discovery Service
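The diagram shows OAI-PMH on both sides of the MST. As a rough illustration of what one harvesting request looks like, the sketch below builds an OAI-PMH `ListRecords` URL. The base URL is a placeholder, and `oai_dc` is used only because every OAI-PMH repository must support it; the metadata prefixes an XC installation actually exposes are not stated here.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `from_date` (YYYY-MM-DD) limits the harvest to records changed since
    that date, which is how an incremental synchronization would be requested.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder repository address, for illustration only.
print(list_records_url("https://example.org/oai", from_date="2011-09-30"))
```

A harvester would fetch this URL, process the returned records, and keep requesting with the `resumptionToken` the repository returns until the list is complete.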
Creating XC Schema data from MARC
• Parse MARCXML records into linked FRBR-based records
• Holdings can be separate or embedded
• Manage uplinks
[Diagram]
– MARCXML Bibliographic record: XC Work, XC Expression, XC Manifestation
– MARCXML Holdings record: XC Holdings (004 “Uplink”)
– Links between records: Work Expressed, Expression Manifested, Manifestation Held
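In MARC holdings records, field 004 carries the control number (field 001) of the related bibliographic record, which is what makes an uplink possible. The sketch below resolves that link with plain dictionaries. The function name, the record ids, and the `manifestation_held` key are assumptions for illustration, not the MST's actual implementation.

```python
def link_holdings(manifestation_by_bib_id, holdings_records):
    """Attach each holdings record to the manifestation built from its bib record.

    manifestation_by_bib_id: maps a bibliographic 001 value to the id of the
        manifestation record created from that bibliographic record.
    holdings_records: dicts with the holdings' own "001" and the "004" uplink.
    Returns (linked, unresolved holdings ids).
    """
    linked, unresolved = [], []
    for holdings in holdings_records:
        manifestation_id = manifestation_by_bib_id.get(holdings["004"])
        if manifestation_id is None:
            # The bibliographic record has not arrived yet (or never will).
            unresolved.append(holdings["001"])
        else:
            linked.append({
                "holdings": holdings["001"],
                "manifestation_held": manifestation_id,
            })
    return linked, unresolved


# Hypothetical ids, for illustration only.
manifestation_by_bib_id = {"b1001": "xc-m-1"}
holdings_records = [
    {"001": "h1", "004": "b1001"},
    {"001": "h2", "004": "b9999"},
]
linked, unresolved = link_holdings(manifestation_by_bib_id, holdings_records)
print(linked)
print(unresolved)
```

Keeping unresolved holdings in a separate list is one way to "manage uplinks" when holdings and bibliographic records arrive in different batches.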
Following one MARC record through XC
Steps:
1. Convert from raw MARC to MARCXML (minor cleanup)
2. Normalize MARCXML (major cleanup)
3. Transform from MARCXML to XC (FRBRize)
4. Aggregate at each FRBR level (match and merge)
5. Index records / create WEMs (one for each unique Manifestation)
Data is ready for search and faceted browse
[Diagram: MARC, then MARCXML (dirty), then MARCXML (clean), then XC W, E and M records, which are matched and merged with other XC records and indexed as WEMs]
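Steps 2 through 4 above can be sketched as three small functions chained together. Everything here is a toy stand-in: the field names, the cleanup rule, and the (author, title) match key are assumptions chosen for illustration, the expression level is left out, and none of it reflects the MST's real normalization or matching rules.

```python
from collections import defaultdict


def normalize(record):
    """Step 2 stand-in: trim whitespace and trailing punctuation from each value."""
    return {key: value.strip().rstrip(" :/,") for key, value in record.items()}


def transform(record):
    """Step 3 stand-in: split a flat record into a work part and a manifestation part."""
    work = {"author": record["author"], "title": record["title"]}
    manifestation = {"id": record["id"]}
    return work, manifestation


def aggregate(pairs):
    """Step 4 stand-in: works with the same (author, title) key are matched and merged."""
    manifestations_by_work = defaultdict(list)
    for work, manifestation in pairs:
        key = (work["author"].lower(), work["title"].lower())
        manifestations_by_work[key].append(manifestation["id"])
    return dict(manifestations_by_work)


# Two hypothetical bibliographic records describing the same work.
records = [
    {"id": "bib-1", "author": "Sawyer-Lauçanno, Christopher, 1951-", "title": "E.E. Cummings :"},
    {"id": "bib-2", "author": "Sawyer-Lauçanno, Christopher, 1951-", "title": " E.E. Cummings"},
]
merged = aggregate(transform(normalize(r)) for r in records)
print(merged)
```

Because the titles only match after cleanup, the example shows why normalization has to run before aggregation.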
Connectivity Tools
• OAI Toolkit (ILS Connectivity: synchronize data with XC)
– Synchronizes metadata with XC
– Cleans up MARC data
– Uses export scripts
• NCIP 2 Toolkit (ILS Connectivity: circ. status, account info)
– Looks up circulation status
– Places requests (renew, hold)
– Retrieves user account information
– Enables resource sharing
• Evergreen ILS   OCLC WorldCat Navigator
• SirsiDynix Symphony   PALCI’s EZBorrow
– Test bed available now!
NCIP 2 Toolkit: Testbed
RDA and FRBR
Helping libraries make the transition
U.S. RDA Test Coordinating Committee
Overall Recommendation:
“…the Coordinating Committee recommends
that RDA should be implemented by LC, NAL,
and NLM no sooner than January 2013.”
Bottom line…by January 2013…
Libraries will be able to use RDA in MARC
and RDA in a non-MARC environment
at the same time.
XC provides one option for doing this
U.S. RDA Test Coordinating Committee
Recommended Tasks and Action Item:
“Solicit demonstrations of prototype input and
discovery systems that use the RDA element
set (including relationships)...”
Timeframe for completion: within 18 months.
Breaking down the Recommendation
“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”
What XC Provides
• prototype: XC is near production-ready
• input: MARC data (bulk)
• discovery: XC has a discovery interface
• RDA element set: Uses subsets of RDA elements and roles to date
• including relationships: Primary relationships between work, expression and item so far
XC: Facilitating the Transition
XC enables risk-free experimentation with RDA
data while the library community develops a
successor to MARC
XC can serve as a “bridge” between using RDA
in MARC-based systems and in emerging
applications
Linked Data in XC
Library of Congress statement, May 13, 2011
Transforming our Bibliographic Framework
“Experiment with Semantic Web and linked
data technologies to see what benefits to the
bibliographic framework they offer our
community and how our current models need
to be adjusted to take fuller advantage of these
benefits.”
Semantic Web and Linked Data
• The Semantic Web refers to a set of
technologies that allow computers to
understand the meaning of information on the
web
• Linked data is a mechanism for exposing,
sharing and connecting data on the web
Semantic Web and Linked Data
• If everything has a unique identifier, then
information from one website can be related
to information from another via a computer
program
• Everything includes people, places, things,
vocabularies, metadata elements, web
documents, …
Getting Started
To create Linked Data, we need:
–Software to transform legacy data
–Analysis: mapping of legacy metadata
to Linked Data properties
Converting MARC to Linked Data
• What XC software can do:
– Convert MARC codes to vocabulary values
– Remove extraneous data
– Normalize inconsistencies
– Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records
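As a small illustration of converting MARC codes to vocabulary values, the sketch below maps a few three-letter codes from the MARC Code List for Languages to readable labels. The table is deliberately tiny, and the function name and fallback behavior are assumptions; which vocabularies XC actually maps, and how, is not specified here.

```python
# A few entries from the MARC Code List for Languages (not the full list).
MARC_LANGUAGE_LABELS = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
    "jpn": "Japanese",
}


def language_label(marc_code):
    """Return a display label for a MARC language code.

    Stray whitespace and capitalization are normalized first. Codes not in
    the table are returned in cleaned form so no data is lost.
    """
    code = marc_code.strip().lower()
    return MARC_LANGUAGE_LABELS.get(code, code)


print(language_label(" ENG "))  # English
print(language_label("xyz"))    # xyz
```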
Best Practices for Linked Data
By attempting to follow best practices in XC
for Linked Data, we hope to facilitate
eventual output of XC metadata in RDF.
- Unique identifiers for XC metadata records
- Data elements from registered schemas
- Registered vocabularies
RDF Triple
• Subject: This resource
• Predicate: has subject
• Object: Poets, American
URIs for each?
RDF Triple – Record identifiers
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: has subject
• Object: Poets, American
Identifiers for XC Schema records
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:rdvocab="http://rdvocab.info/Elements" xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:rdarole="http://rdvocab.info/roles">
<xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
<dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
<rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
<rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
<xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
<xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
</xc:entity>
</xc:frbr>
A persistent, globally unique identifier for each XC Schema record
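To show how the identifier and subject headings in a record like the one above could be read by a program, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded record is a trimmed copy of the sample (two subject elements only), and the function name and returned dictionary layout are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "xc": "http://www.extensiblecatalog.info/Elements",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

# Trimmed copy of the sample work record (only the elements used below).
RECORD = """<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>"""


def entity_summary(xml_text):
    """Return the id, FRBR type, and LCSH subject strings of the first entity."""
    root = ET.fromstring(xml_text)
    entity = root.find("xc:entity", NS)
    type_attr = "{%s}type" % NS["xsi"]   # ElementTree's name for xsi:type
    lcsh = [
        subject.text
        for subject in entity.findall("xc:subject", NS)
        if subject.get(type_attr) == "dcterms:LCSH"
    ]
    return {"id": entity.get("id"), "type": entity.get("type"), "lcsh": lcsh}


summary = entity_summary(RECORD)
print(summary["id"])
print(summary["lcsh"])
```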
RDF Triple - Registered Data Elements
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
• Object: Poets, American
XC Schema Elements
• Dublin Core terms (DCMI) – all
• RDA – subset of elements and role designators
• XC elements (newly-defined) – when necessary to enable XC system functionality
XC Schema “work” record: data elements
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:rdvocab="http://rdvocab.info/Elements" xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:rdarole="http://rdvocab.info/roles">
<xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
<dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
<rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
<rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
<xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
<xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
</xc:entity>
</xc:frbr>
Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC
RDF Triple - Registered Vocabularies
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
• Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)
XC Work record with embedded URI for LCSH “Poets, American”
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" …
xmlns:subjid="id.loc.gov/authorities">
<xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
…
<xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
<xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject>
<xc:temporal>20th century</xc:temporal>
<xc:type>Biography</xc:type>
</xc:entity>
RDF Triple
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
• Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)
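The triple built up over the preceding slides can be written out in N-Triples, a simple line-based RDF syntax. The sketch below is only an illustration of that serialization: the helper name is hypothetical, and XC itself is described here as hoping to facilitate eventual RDF output, not as producing it.

```python
def to_ntriple(subject, predicate, obj, obj_is_uri=True):
    """Serialize one triple as an N-Triples line.

    URIs go in angle brackets; a plain string object becomes a quoted
    literal with backslashes and double quotes escaped.
    """
    if obj_is_uri:
        obj_term = f"<{obj}>"
    else:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj_term} ."


SUBJECT = "oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"
PREDICATE = "http://www.extensiblecatalog.info/Elements/subject"

# Object as a text string (before a registered vocabulary URI is available).
print(to_ntriple(SUBJECT, PREDICATE, "Poets, American", obj_is_uri=False))

# Object as the id.loc.gov URI shown on the slide.
print(to_ntriple(SUBJECT, PREDICATE, "http://id.loc.gov/authorities/sh85103735#concept"))
```

The two calls mirror the progression in the slides: the same subject and predicate, first with a literal object and then with a URI from a registered vocabulary.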
XC Software is “Linked Data Ready”
• Converts metadata to FRBR entities with RDA
elements and roles
• Adds identifiers for “things”
• Provides a platform for service development
• Synchronizes with existing tools
– Cataloging staff client
– Institutional repository
Download XC software at
eXtensibleCatalog.org