IMLS NLG Collection Registry &
Item-Level Metadata Repository
at the University of Illinois
Timothy W. Cole ([email protected])
Mathematics Librarian &
Professor of Library Administration
University of Illinois at Urbana-Champaign (USA)
Open Archives Forum Workshop
University of Bath
4 September 2003
http://dli.grainger.uiuc.edu/Publications/TWCole/OAForumWkshpBath/
IMLS NLG Program

Institute of Museum and Library Services (IMLS)




U.S. Federal grant-making agency, est. 1996
Goal to foster leadership, innovation, lifetime learning
$244 million annual budget
IMLS National Leadership Grant Program



Currently about $20 million per year
Library, Museum, & Library-Museum Collaborations
Funds research & demonstration, digitization,
preservation, model programs, new technology
IMLS Digital Collections Framework
“IMLS Framework of Guidance for Building Good Digital
Collections” published November 2001
http://www.imls.gov/pubs/forumframework.htm

Product of 8-member IMLS Digital Library Forum, with
participation from National Science Digital Library (NSF)

Differentiates digital collections & digital libraries

Articulates principles & frames discussion of best practices

Links to resources, models, & exemplary projects

Will be sustained by National Information Standards Org.
Recommendations from the IMLS Forum
Four General Recommendations to IMLS:
1.
Digital collections built with support of public funds can and should
be held to standards that support interoperability, reusability, and
persistence.
2.
IMLS should maintain its own registry of funded digital collections.
3.
Because so much of the IMLS constituency consists of small and
medium-sized organizations without sophisticated in-house
technical support, the IMLS should also consider projects to
develop infrastructure services that lower barriers to NSDL
contribution by smaller organizations.
4.
IMLS should encourage the integration of an archiving component
into every project plan by requiring a description of how data will
be preserved.

Collection description and registry for National Leadership
grant projects with digital content
  Enhance discoverability; all registry fields searchable
Item level metadata repository via OAI-PMH
  Demonstrate potential of metadata for interoperability
  Facilitate reuse of information resources
Research question:
How can resource developers best represent collections
and items to meet the needs of service providers and end
users?
Project Website: http://imlsdcc.grainger.uiuc.edu/
Project Scope
95 NLG projects with associated digital collections
51 of these are/were collaborative projects
All together 237 institutions involved


[Bar chart: Breakdown of Institutions (237 total). Categories: Academic Libraries, Museums, Historical Societies, Public Libraries, State Libraries, Non-Profit Organizations, Botanical Gardens/Herbaria, School Districts, Archives, Library Consortia, Academic Dept/Institute, Other. Bar values as extracted, not matched to categories: 83, 52, 17, 14, 8, 8, 8, 5, 5, 11, 5, 21. Vertical axis: 0 to 90.]

A Diverse Community


Wide variation in technical skills and technology
infrastructure & policy
Mix of library, museum, and archive traditions



Diverse perspectives on IP policy, use and presentation of
metadata and primary resources
Diverse embedded knowledge structures
Wide range of vocabularies and descriptive practices



Metadata created for diverse purposes
Local vocabularies for type, subject, coverage, audience
Wide range of granularity
Prior Work – Mellon OAI Grants

July 2001, Andrew W. Mellon Foundation awarded 7
grants for OAI-related research ($1.5 mil. total)
Primary focus: demonstrate utility of OAI metadata
harvesting in context of scholarly inquiry
Research Libraries Group (RLG)
University of Michigan
University of Illinois at Urbana-Champaign
Emory University / Southeastern Library Network
Woodrow Wilson International Center
University of Virginia
See: http://www.arl.org/newsltr/217/waters.html

University of Illinois Mellon OAI Project


July 2001 – May 2003
Primary Objectives:


Create & demonstrate OAI tools
Build portal to aggregated metadata describing cultural
heritage resources




Initially – For OAI testing & research
Long-term – As a sustained resource
Investigate using EAD metadata in OAI context
Research utility of aggregated metadata
University of Illinois Cultural Heritage Portal

Harvests 25 OAI Providers




Academic libraries &
archives
Digital library projects
Historical societies
Aggregates 479,000
metadata items



55% text / sheet music
40% image / multimedia
5% archival / museum
http://oai.grainger.uiuc.edu
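A minimal sketch of one harvesting step of the kind a portal like this depends on: build an OAI-PMH ListRecords request, then read record identifiers and any resumptionToken from the response. The base URL, identifiers, and sample response below are hypothetical, and the sketch parses a canned response rather than contacting a real provider. It is an illustration only, not the portal's actual harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumptionToken or None) from a response."""
    root = ET.fromstring(xml_text)
    ids = [h.findtext(OAI_NS + "identifier") for h in root.iter(OAI_NS + "header")]
    token = root.findtext(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    return ids, (token or None)  # an empty token marks the last page


# Hypothetical response, trimmed to headers only for brevity.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>oai:example.org:item-1</identifier>
      <datestamp>2003-09-04</datestamp>
    </header></record>
    <record><header>
      <identifier>oai:example.org:item-2</identifier>
      <datestamp>2003-09-04</datestamp>
    </header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until none is returned.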
Current Projects Addressing Similar Issues
NSDL
Digital library of resource collections
and services, organized in support
of science education at all levels.
NOF-Digitize / EnrichUK
Description and aggregation of
digitized collections funded by the
New Opportunities Fund
Minerva Project
Creating an agreed European
common platform,
recommendations and guidelines
about digitization, metadata, long-term accessibility and preservation
Technical Challenges

NLG Awardees have diverse technical resources




Limited knowledge of / tools for working with XML
Limited knowledge of community metadata schemas
Limited knowledge of / access to CGI capabilities
Early NLG projects have no resources earmarked for
sharing metadata

Technical implementations not always built with reuse
and interoperability in mind
OAI Readiness Among NLG Projects
[Pie chart: NLG Projects (95 total) and OAI. Categories: OAI data provider; Aware / In Development; No OAI experience; Likely no item level metadata. Slice values as extracted, not matched to categories: 11%, 14%, 19%, 56%.]
OAI for Static Repositories

Lower barrier option for exposing relatively static and
small collections of metadata




Designed to scale well to about 5,000 metadata records
Provider serves static XML file (no CGI required)
3rd party gateway generates valid OAI responses
Supports only a subset of OAI options


No sets, deleted records, resumptionTokens
DateStamp granularity limited to YYYY-MM-DD
Preliminary alpha version of OAI-SR guidelines available:
http://www.openarchives.org/OAI/2.0/guidelines-static-repository.htm
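The constraints listed above can be turned into a quick suitability check. This is a sketch under stated assumptions: the function name and its arguments are hypothetical, and the "about 5,000" figure from the slide is treated as a hard cutoff purely for illustration.

```python
import re

MAX_RECORDS = 5000  # "about 5,000" on the slide; used as a hard cutoff here
DAY_GRANULARITY = re.compile(r"\d{4}-\d{2}-\d{2}")


def static_repository_problems(datestamps, uses_sets=False, tracks_deletions=False):
    """List reasons a collection may not fit the static repository model.

    datestamps: one datestamp string per metadata record.
    """
    problems = []
    if len(datestamps) > MAX_RECORDS:
        problems.append(f"{len(datestamps)} records exceeds about {MAX_RECORDS}")
    if uses_sets:
        problems.append("sets are not supported")
    if tracks_deletions:
        problems.append("deleted records are not supported")
    bad = [d for d in datestamps if not DAY_GRANULARITY.fullmatch(d)]
    if bad:
        problems.append(f"{len(bad)} datestamp(s) not in YYYY-MM-DD form")
    return problems


ok = static_repository_problems(["2003-09-04"])
not_ok = static_repository_problems(
    ["2003-09-04", "2003-09-04T12:00:00Z"], uses_sets=True
)
```

An empty list means none of the listed restrictions is violated; it does not by itself validate the XML file.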
OAI Static Repository Gateways

SR Gateways support CGI extended path

SR Gateways typically cache static repository XML files




SR Gateway lists all SRs available through gateway in
<friends> element (dynamic discovery of SRs)
SR Gateways assumed to support automated
self-registration of SRs
SRs should make themselves available through a
single SR Gateway
SR Gateway applications available on SourceForge.net
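A sketch of how the extended path might be composed. It assumes the convention in the published static repository guidelines, where the base URL seen by harvesters is the gateway URL followed by the static repository URL without its "http://" prefix; the alpha version referenced in this talk may differ. Both URLs and the function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit


def gateway_base_url(gateway_url, static_repo_url):
    """Compose the OAI base URL for a static repository served via a gateway.

    Assumption: base URL = gateway URL + "/" + repository URL minus "http://".
    """
    parts = urlsplit(static_repo_url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http:// static repository URLs")
    return gateway_url.rstrip("/") + "/" + parts.netloc + parts.path


def identify_request(gateway_url, static_repo_url):
    """Build an Identify request against the gateway-mediated base URL."""
    base = gateway_base_url(gateway_url, static_repo_url)
    return base + "?" + urlencode({"verb": "Identify"})


base = gateway_base_url(
    "http://gateway.example.org/oai",
    "http://library.example.edu/sr/coll.xml",
)
request = identify_request(
    "http://gateway.example.org/oai",
    "http://library.example.edu/sr/coll.xml",
)
```

Because the repository location travels in the path, the harvester still sees an ordinary OAI-PMH base URL and needs no gateway-specific logic.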
Example of a Static Repository XML File
Identify
OAI_DC Record
MARC21 XML Record
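The slide's example file is not reproduced in this transcript. As a stand-in, the sketch below generates a small static repository file with an Identify section and one oai_dc record. The element layout follows the published static repository guidelines as an assumption; the alpha version current at the time of this talk may differ, and a MARC21 XML record is not shown. The repository name, URLs, email address, and record content are all hypothetical.

```python
import xml.etree.ElementTree as ET

SR = "http://www.openarchives.org/OAI/2.0/static-repository"
OAI = "http://www.openarchives.org/OAI/2.0/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("sr", SR), ("oai", OAI), ("oai_dc", OAI_DC), ("dc", DC)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def build_static_repository(name, base_url, admin_email, records):
    """records: non-empty list of (identifier, datestamp, {dc_element: value})."""
    repo = ET.Element(q(SR, "Repository"))

    identify = ET.SubElement(repo, q(SR, "Identify"))
    for tag, text in (
        ("repositoryName", name),
        ("baseURL", base_url),
        ("protocolVersion", "2.0"),
        ("adminEmail", admin_email),
        ("earliestDatestamp", min(r[1] for r in records)),
        ("deletedRecord", "no"),          # no deleted-record support
        ("granularity", "YYYY-MM-DD"),    # day granularity only
    ):
        ET.SubElement(identify, q(OAI, tag)).text = text

    formats = ET.SubElement(repo, q(SR, "ListMetadataFormats"))
    fmt = ET.SubElement(formats, q(OAI, "metadataFormat"))
    ET.SubElement(fmt, q(OAI, "metadataPrefix")).text = "oai_dc"
    ET.SubElement(fmt, q(OAI, "schema")).text = (
        "http://www.openarchives.org/OAI/2.0/oai_dc.xsd"
    )
    ET.SubElement(fmt, q(OAI, "metadataNamespace")).text = OAI_DC

    listing = ET.SubElement(repo, q(SR, "ListRecords"), {"metadataPrefix": "oai_dc"})
    for identifier, datestamp, fields in records:
        record = ET.SubElement(listing, q(OAI, "record"))
        header = ET.SubElement(record, q(OAI, "header"))
        ET.SubElement(header, q(OAI, "identifier")).text = identifier
        ET.SubElement(header, q(OAI, "datestamp")).text = datestamp
        metadata = ET.SubElement(record, q(OAI, "metadata"))
        dc = ET.SubElement(metadata, q(OAI_DC, "dc"))
        for element, value in fields.items():
            ET.SubElement(dc, q(DC, element)).text = value
    return ET.tostring(repo, encoding="unicode")


xml_text = build_static_repository(
    "Example Historical Society",
    "http://gateway.example.org/oai/society.example.org/sr.xml",
    "admin@example.org",
    [("oai:society.example.org:coverlet-1", "2003-09-04",
      {"title": "Cotton coverlet", "type": "Image"})],
)
```

The resulting string is what the provider would save as a single XML file on an ordinary web server.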
Working with Turnkey Solutions

OAI provider service now built into many popular
digital library applications
  ContentDM, Encompass, DLXS, DSpace, EPrints.org
  Facilitates participation in OAI-PMH metadata sharing
Some implementations may be limited
  Many support oai_dc metadata schema only
  May have limited feature set (e.g., no resumptionToken)
  Metadata mappings may not be configurable
Community needs to advocate requirements strongly
Metadata Issues

Wide range of metadata schemas in use

Variations in descriptive practices & traditions

Use of Dublin Core fields
Granularity

What is being described


Different approaches to IP rights issues
Metadata Schemas Used By NLG Projects

MARC and Dublin Core most common schemas
Includes qualified DC & DC with extra fields
24 projects - multiple schemas
14 of these using Dublin Core in combination with another schema
[Bar chart: Metadata Schemas in Use (axis 0 to 40). MARC 35; Dublin Core 35; TEI 10; EAD 10; Unknown 12; Locally developed 17.]
DC element usage (from Mellon)
Records containing subject & description element

Provider type                                              SUBJECT  DESCRIPTION
Digital libraries (10 total, 122,719 records)              78%      36%
Museums, hist. societies, etc. (6 total, 255,800 records)  93%      93%
Academic libraries (7 total, 235,294 records)              15%      13%

Many different controlled and local vocabularies in use
Granularity: a record may describe a collection
of coins — or one coin
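Usage figures of this kind can be computed from harvested oai_dc records by counting the records that carry at least one non-empty instance of each element. The sketch below shows one way to do that; the function names and the four sample records are hypothetical and are not the Mellon project data.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"


def element_usage(records, elements=("subject", "description")):
    """Percentage of records with at least one non-empty dc:<element>."""
    counts = dict.fromkeys(elements, 0)
    total = 0
    for xml_text in records:
        total += 1
        root = ET.fromstring(xml_text)
        for name in elements:
            # Empty or whitespace-only elements do not count as usage.
            if any((e.text or "").strip() for e in root.iter(DC + name)):
                counts[name] += 1
    return {name: (100.0 * n / total if total else 0.0) for name, n in counts.items()}


def dc_record(**fields):
    """Build a tiny oai_dc record string for the demonstration below."""
    body = "".join(f"<dc:{k}>{v}</dc:{k}>" for k, v in fields.items())
    return (
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">' + body + "</oai_dc:dc>"
    )


SAMPLE = [
    dc_record(title="Coverlet", subject="Textiles", description="Woven coverlet"),
    dc_record(title="Coin", subject="Numismatics"),
    dc_record(title="Letter", subject="Correspondence", description=""),
    dc_record(title="Map"),
]

usage = element_usage(SAMPLE)
```

Grouping the input by provider type before calling the function would give one row per group, as in the table above.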
Describe the digital object?
Excerpt of record describing a cotton coverlet
Description: Digital image of a single-sized cotton coverlet for a bed with
embroidered butterfly design. Handmade by Anna F. Ginsberg Hayutin.
Source: Materials: cotton and embroidery floss. Dimensions: 71 in. x 86 in.
Markings: top right hand corner has 1 1/2 in. x 1/2 in. label cut outs at
upper left and right hand side for head board; fabric is woven in a variation
of a rib weave; color each of yellow and gray; hand-embroidered cotton
butterflies and flowers from two shades of each color of embroidery floss blue, pink, green and purple and single top 20 in. bordered with blue and
black cotton embroidery thread; stitches used for embroidery: running
stitch, chain stitch, French knot and back stitches; selvage edges left
unfinished; lower edges turned under and finished with large gray running
stitches made with embroidery floss.
Format: Epson Expression 836 XL Scanner with Adobe Photoshop version
5.5; 300 dpi; 21-53K bytes. Available via the World Wide Web.
Coverage: —
Date Created: 2001-09-19 09:45:18; Updated: 20011107162451; Created:
2001-04-05; Created: 1912-1920?
Type: Image
Or describe the analog object?
Excerpt of record describing Am. woven coverlet
Description: Materials: Textile--Multi, Pigment—Dye; Manufacturing Process:
Weaving--Hand, Spinning, Dyeing, Hand-loomed blue wool and white linen
coverlet, worked in overshot weave in plain geometric variant of a
checkerboard pattern.Coverlet is constructed from finely spun, indigo-dyed
wool and undyed linen, woven with considerable skill. Although the pattern is
simpler, the overall craftsmanship is higher than 1934.01.0094A. - D.
Schrishuhn, 11/19/99 This coverlet is an example of early "overshot" weaving
construction, probably dating to the 1820's and is not attributable to any
particular weaver. -- Georgette Meredith, 10/9/1973
Source: —
Format: 228 x 169 x 1.2 cm (1,629 g)
Coverage: Euro-American; America, North; United States; Indiana? Illinois?
Date: Early 19th c. CE
Type: cultural; physical object; original
Various Concerns About IP Rights

Overcoming reluctance to share metadata because
of IP rights issues




Concern that sharing metadata is giving away most
valuable asset
Uncertain whether license limits metadata sharing
Uncertain whether to share metadata describing
licensed information resources
Machine-readable IP rights attributes needed to
facilitate reuse
Portal Design Issues

How best to organize aggregated metadata for browse
  Need scalable ways to build / implement classifications
  Need better methods for clustering and grouping
  Utilize relationships & ties to collection descriptions
How best to implement basic & advanced searching
  Precise searching hard due to metadata usage variations
  Limited normalization possible; more work needed
  Robust search & ranking across large aggregations hard
Need more audience-specific designs
  Need more dynamic & interactive designs
  Need better support of educational & instructional uses
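To illustrate what limited normalization can look like, the sketch below normalizes the date strings that appear in the cotton coverlet record shown earlier in this talk. The function name and the list of accepted formats are assumptions made for illustration; values that match no known format are left unresolved rather than guessed.

```python
from datetime import datetime

# Formats seen in the coverlet record's date field; an illustrative list only.
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y%m%d%H%M%S", "%Y-%m-%d")


def normalize_date(value):
    """Return an ISO YYYY-MM-DD string, or None if the value is not understood."""
    text = value.strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # e.g. uncertain ranges such as "1912-1920?" stay unresolved


RAW = ["2001-09-19 09:45:18", "20011107162451", "2001-04-05", "1912-1920?"]
normalized = [normalize_date(v) for v in RAW]
```

Three of the four values normalize cleanly; the uncertain date range does not, which is the kind of case where more work is needed.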
Portal Design – Mellon Project Experience

Limited focus group testing



23 student teachers in honors-level C & I class
Assignment to students: Use the site in preparing a
lesson plan for high school social studies class
Process




Introduced site & “aggregated metadata” concept
Focus group interviews conducted
Students’ papers examined
Transaction logs analyzed
A Few Observations from Test
1. Users expected all links to point to digital objects



Some records pointed to finding aids
Some records pointed to collection’s web site
Some records pointed to Library books on the shelf
2. Users unable to make use of search results


Simple searches produced 1000s of unranked results
Advanced search (with limits) rarely used
3. Distinction between portal and data providers
unimportant to users
Rethinking what “online access” means

To librarian & curator

To student teacher
Closing Thought – Considering OAI in Context

Descriptive, item-level metadata alone insufficient
  Must be used in combination with collection descriptions, user
  annotations, machine generated clustering, …
Distinction between collection & item blurs in DLs
  Complex objects – TEI, EAD, METS
  Granularity – should museum describe every arrowhead in
  end-user search & discovery system?
  Relationships between items provide context
  Need to tie collection registry to item-level repository
OAI-PMH not limited to item-level descriptive metadata