Getting Started with CONTENTdm

Corey Harper, University of Oregon
Terry Reese, Oregon State University
OLA - April 8, 2005
Road Map
• Introduction to CONTENTdm's field properties and search interfaces
• Determining important access points
• Discussion of Dublin Core mapping
• CONTENTdm's controlled vocabulary structure
• Setting up controlled vocabularies
• Use of home-grown vocabularies
• Metadata interoperability
• Shared standards and best practices
• Western States Best Practices
• Working with Western Waters
• Managing controlled vocabularies between projects
• Demo of some of OSU's and UO's collections
• Q&A
Terry Reese - Corey Harper
Oregon Library Association - April 8, 2005
Introducing CONTENTdm

Search Interfaces
3+-3.8 provides three primary search interfaces:
1. Browse Search
2. Advanced Search
3. Custom Search
Four display methods:
• Grid view
• Bibliographic view
• Thumbnail view
• Title view
Introducing CONTENTdm: Browse Interface
• Browses all items in the collection
  - Browse is ordered alphabetically
  - No skip characters: leading words such as "the", "a", "I" and "an" are used to determine order
• The pages of CONTENTdm compound objects are not shown in the browse interface.
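The effect of having no skip characters can be sketched in a few lines of Python. This is only an illustration of the ordering described above, not CONTENTdm's own code: the titles are made up, and case-insensitive comparison is an assumption.

```python
def browse_order(titles):
    """Sort titles alphabetically WITHOUT skipping leading articles.

    Assumption: comparison is case-insensitive; CONTENTdm's real
    collation rules may differ.
    """
    return sorted(titles, key=str.casefold)


# Hypothetical titles, for illustration only.
titles = [
    "The Dalles Dam",
    "Columbia River Gorge",
    "An Early Map",
    "A Guide to Oregon Rivers",
]

for title in browse_order(titles):
    print(title)
```

Because "The" is not skipped, "The Dalles Dam" files under T rather than D, which is worth keeping in mind when writing titles.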

Introducing CONTENTdm: Browse Interface (Grid view)
Introducing CONTENTdm: Browse Interface (Thumbnail view)
Introducing CONTENTdm: Browse Interface (Bibliographic view)
Introducing CONTENTdm: Browse Interface (Title view)
Introducing CONTENTdm: Browse Interface (Custom view): Hyperlink Example
Introducing CONTENTdm: Advanced Interface
• Two types of searches:
  - Searching across all fields
  - Searching on a particular field
Introducing CONTENTdm: Setting up field properties
• Field properties are set in the administration area
Dublin Core Mappings
• Title
• Subject
• Description
• Creator
• Publisher
• Contributors
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
• Audience
• Title-Alternative
• Description-Table Of Contents
• Description-Abstract
• Date-Created
• Date-Valid
• Date-Available
• Date-Issued
• Date-Modified
• Format-Extent
• Format-Medium
• Relation-Is Version Of
• Relation-Has Version
• Relation-Is Replaced By
• Relation-Replaces
• Relation-Is Required By
• Relation-Requires
• Relation-Is Part Of
• Relation-Has Part
• Relation-Is Referenced By
• Relation-References
• Relation-Is Format Of
• Relation-Has Format Of
• Relation-Conforms To
• Coverage-Spatial
• Coverage-Temporal
• Audience-Mediator
• None
Data Types:
• Text
• Date (format: dd/mm/yyyy)
• Full Text Search (OCR’d data)
Introducing CONTENTdm: Starting a project
• What do we scan?
  - The first and most important part of the collection building process.
  - Every institution has great stuff to digitize, but...
  - Digital collections need to be treated like traditional materials, i.e., vetted or proposed by an organization’s subject specialists.
Introducing CONTENTdm: Starting a project
• Working with stakeholders:
  - Working with your subject specialists can help you to identify:
    1. Organizational stakeholders (departments, groups, etc.)
    2. Outside stakeholders (both public and private)
    3. Field-specific thesauri and classification terms that could be incorporated into the project.
Introducing CONTENTdm: Starting a project
• Determining access points:
  - What metadata will be present?
  - How will it be entered?
  - What best practices or standards will be used in generating the metadata?
  - What metadata will your stakeholders expect to be present?
  - What metadata elements will be searchable? Which will use controlled vocabulary? How will your administrative metadata be stored?
Introducing CONTENTdm: Starting a project
• Access points into the collection:
  - Once the collection has been built, how will it be accessed? Search types?
  - Example:
    http://digitalcollections.library.oregonstate.edu/archives/
    http://digitalcollections.library.oregonstate.edu/dna/
Mapping to Dublin Core
• 15 Dublin Core elements
• 16th element: Audience
• 26 Qualified DC elements (from the dcterms namespace)
• None: a special value meaning the field is not mapped
Dublin Core Mapping
OAI-PMH
• Open Archives Initiative Protocol for Metadata Harvesting
• I’ll be talking about this some, as will Terry
• OAI-PMH is a protocol layered over HTTP
• Response format is encoded as XML
• Defines the format for requests and responses used for harvesting metadata from collections
• Used to build federated search interfaces
• http://www.openarchives.org/
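As a rough illustration of "a protocol layered over HTTP" with XML responses, the sketch below builds an OAI-PMH request URL and pulls titles out of a response. The verb `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH specification; the base URL, the function names, and the sample response are hypothetical and stand in for a real repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def build_request(base_url, verb, **arguments):
    """Build an OAI-PMH request: an ordinary HTTP GET with a query string."""
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


def titles_from_response(xml_text):
    """Collect every dc:title found in an OAI-PMH XML response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(f"{{{DC_NS}}}title")]


# Hypothetical repository address; substitute your own server's OAI base URL.
url = build_request("http://example.org/oai/oai.php", "ListRecords",
                    metadataPrefix="oai_dc")
print(url)

# Hand-written sample response (trimmed), used instead of a live request.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:demo/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(titles_from_response(SAMPLE_RESPONSE))
```

A harvester for a federated search interface would fetch the URL, parse each record this way, and index the results.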
Some notes on mapping
• Effect on searching: across collections, search is based on the DC mapping
• Effect on OAI output:
  - 15 elements and nothing more.
  - Qualified elements “dumb down” to Simple DC
  - “none” and “Audience” aren’t included in OAI results
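The "dumb down" rule can be sketched as follows. This is an illustration of the behaviour described above, not CONTENTdm's implementation; the function names, the field labels, and the sample record are hypothetical, and mapping names are assumed to follow the "Element-Qualifier" pattern used in the mapping list.

```python
# The 15 Simple Dublin Core elements.
SIMPLE_DC = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(mapping_name):
    """Reduce a mapping such as 'Date-Created' to its Simple DC element.

    Returns None for mappings that are not harvested ('None', 'Audience'
    and its qualifier).
    """
    base = mapping_name.split("-", 1)[0].strip().lower()
    if base == "contributors":  # the mapping list uses the plural form
        base = "contributor"
    return base if base in SIMPLE_DC else None


def to_simple_dc(record, field_mappings):
    """Build the Simple DC view of a record, as an OAI harvester would see it."""
    simple = {}
    for field, value in record.items():
        element = dumb_down(field_mappings.get(field, "None"))
        if element is not None:
            simple.setdefault(element, []).append(value)
    return simple


# Hypothetical collection fields and their DC mappings.
field_mappings = {
    "Title": "Title",
    "Date Taken": "Date-Created",
    "Intended Users": "Audience",
    "Scanner Notes": "None",
}
record = {
    "Title": "Celilo Falls",
    "Date Taken": "1950",
    "Intended Users": "Students",
    "Scanner Notes": "600 dpi",
}
print(to_simple_dc(record, field_mappings))
```

Anything mapped to "None" or "Audience" simply disappears from the harvested record, so information that matters to outside users should be mapped to one of the 15 elements or a qualifier of one.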
More on Mapping
• Collaborative projects should determine what information to map to Dublin Core fields
• Western States documentation provides guidance on mapping
• Effect on search results at centralized search interfaces, e.g. Western Waters
Effects of Bad Mapping
• Poor mapping decisions can cause problems:
  - Cluttered results in cross-collection searches
  - Cluttered results in federated searching
  - Inconsistency in where pieces of information are found
  - Important information not harvested
Decisions are rarely final
• CONTENTdm is extremely flexible
  - Can easily change mappings, indexing, display, vocabularies
  - Can add and delete fields at any time
• This can be both good and bad
Editing Field Properties
Controlled Vocabularies in CONTENTdm
• Supports SEE FROM type cross-references
• No support for SEE ALSO or hierarchical (BT, NT, RT) references
• One term per line in a text file
• Cross-references: x-ref USE heading
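A vocabulary file in the format just described (one term per line, cross-references written as "x-ref USE heading") can be read with a short parser. This is a minimal sketch: it assumes the separator is the literal text " USE " surrounded by spaces, and the sample terms are made up.

```python
def parse_vocabulary(lines):
    """Split vocabulary lines into headings and SEE FROM cross-references.

    Assumption: a cross-reference line looks like 'variant USE heading'.
    """
    headings = []
    cross_refs = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if " USE " in line:
            variant, heading = line.split(" USE ", 1)
            cross_refs[variant.strip()] = heading.strip()
        else:
            headings.append(line)
    return headings, cross_refs


def resolve(term, cross_refs):
    """Follow a cross-reference to its authorized heading, if there is one."""
    return cross_refs.get(term, term)


# Hypothetical file contents, for illustration only.
sample_lines = ["Dams", "Irrigation", "Barrages USE Dams", ""]
headings, cross_refs = parse_vocabulary(sample_lines)
print(headings)
print(resolve("Barrages", cross_refs))
```

With a real file, the same function can be fed `open(path, encoding="utf-8")` in place of the sample list.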
Vocabularies on the Server
• Stored as text files in the vocab folder:
  - [nickname].txt; e.g. subjec.txt
• Additional vocabulary files stored in the text_search folder:
  - voc.[nickname]: vocabulary terms used in instance data
  - use.[nickname]: x-refs to terms used in instance data
  - vocuse.[nickname]: both terms and x-refs
Vocabulary Index Generator
• Terry’s PHP script to create hyperlinked lists of vocabulary terms:
  - http://oregonstate.edu/~reeset/contentdm/downloads.html
• Excellent for “Browse by Subject” pages
• Configurable to include x-refs
• Other useful tools available at this URL
Available Vocabularies
• Software comes with TGM-I
• LCSH and MeSH available from the User Support Center (requires login)
• MeSH: 29,000 entries, headings only
• LCSH: 156,411 entries (64,959 headings and 91,452 x-refs)
  - X-refs include 400, 410, 411 and 450 references from authority records coded as 150 with 008 position 09 “a”
Vocabulary Administration
Vocabulary Verification
Other Vocabularies
• Adding terms to existing vocabularies
  - Be careful: update when new versions become available
• Creating CONTENTdm-formatted versions of other vocabularies:
  - DC Type
  - MIME Type
  - GSAFD
• Creating home-grown vocabularies from scratch
Home-Grown Vocabularies
• May be useful to combine vocabularies or create new ones.
  - Example: combining terms from GSAFD and from TGM-II (GMGPC)
• Controlling a list of Collection Names, Collection Identifiers, Source Conditions, etc.
• Generating vocabs from a field's contents
Temporary Vocabularies
• Instance data for fields using vocabularies:
  - Term A; Term B; Term C
• Use the same syntax for non-controlled fields that contain multiple entries
  - repeatable fields in the DC sense
• Create vocabs from field contents for normalization and quality control
  - Browse the defined vocabulary to locate problems
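The idea of building a temporary vocabulary from field contents can be sketched outside CONTENTdm as well. The example below splits semicolon-delimited values and counts each distinct term so that misspellings and case variants stand out. The function names and sample values are hypothetical; inside CONTENTdm the same check is done by browsing the generated vocabulary.

```python
from collections import Counter


def split_terms(field_value):
    """Split 'Term A; Term B; Term C' into a list of trimmed terms."""
    return [term.strip() for term in field_value.split(";") if term.strip()]


def build_vocabulary(field_values):
    """Count every distinct term found across a field's values."""
    counts = Counter()
    for value in field_values:
        counts.update(split_terms(value))
    return counts


# Hypothetical subject-field values with two deliberate inconsistencies.
values = ["Rivers; Dams; Irrigation", "Dams;Irigation", "rivers"]
vocab = build_vocabulary(values)

# Listing terms alphabetically puts near-duplicates next to each other.
for term in sorted(vocab, key=str.casefold):
    print(term, vocab[term])
```

In the listing, "Irigation" sits beside "Irrigation" and "rivers" beside "Rivers", which is the kind of problem this quality-control pass is meant to surface.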
CONTENTdm and metadata interoperability
• Issues to consider:
  - Interoperability between metadata formats
    - Dublin Core => MARC, etc.
  - Interacting with federated searching
    - Interacting with federated search tools
    - Understanding how your metadata could be harvested via OAI.
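A Dublin Core => MARC conversion is, at its simplest, a lookup table from DC elements to MARC tags and subfields. The sketch below shows the shape of such a crosswalk. The tag choices are assumptions based on the Library of Congress unqualified Dublin Core to MARC crosswalk and should be verified against it before use; the function name and sample record are hypothetical, and indicators are ignored.

```python
# Assumed subset of a DC-to-MARC crosswalk: element -> (tag, subfield code).
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "language": ("546", "a"),
    "rights": ("540", "a"),
}


def crosswalk(dc_record):
    """Convert {element: [values]} into sorted (tag, subfield, value) tuples.

    Elements missing from the table are skipped rather than guessed at.
    """
    fields = []
    for element, values in dc_record.items():
        target = DC_TO_MARC.get(element)
        if target is None:
            continue
        tag, code = target
        for value in values:
            fields.append((tag, code, value))
    return sorted(fields)


# Hypothetical Simple DC record.
dc_record = {
    "title": ["Celilo Falls"],
    "subject": ["Dams", "Rivers"],
    "coverage": ["Oregon"],  # not in the table above, so it is skipped
}
for tag, code, value in crosswalk(dc_record):
    print(f"{tag} ${code} {value}")
```

A real converter would also merge publisher and date into a single 260 field and set indicators, which this sketch leaves out.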
CONTENTdm and metadata interoperability
• Building federated collections:
  - By considering metadata interoperability you can build federated tools based on OAI:
    http://fluffy.library.oregonstate.edu/contentdm/search/index.php
  - Build tools that integrate with other federated search tools:
    http://fluffy.library.oregonstate.edu/a9/search.php
Shared standards and best practices
• Metadata interoperability and shared standards go hand in hand.
• Shared standards are essentially a “trust” contract between groups of users that their metadata will conform to a specific set of rules.
  - Examples: MARC & AACR2
Shared standards: Western Waters best practices
• Shared digital library standards:
  - Western States Dublin Core best practices
    http://www.cdpheritage.org/resource/metadata/wsdcmbp/
Working with Western Waters
• Western Waters Digital Library: a federated CONTENTdm catalog of 12 academic libraries.
• Projects contributed to the WWDL require metadata that meets both local and consortial standards.
• As with any consortial arrangement, compromise is sometimes going to be necessary.
Working with Western Waters
• More stuff
Managing controlled vocabularies between projects
• CONTENTdm has no built-in facility to share controlled vocabularies between projects.
• Two methods:
  1. Use the Unix diff utility to locate differences between the in-use vocabularies of two projects, then manually add missing elements or correct conflicts between projects.
  2. Build your own management tool:
     Example: http://fluffy.library.oregonstate.edu/contentdm/builder.html
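The first method, comparing the in-use vocabularies of two projects, can also be sketched in Python for sites without a Unix diff at hand. This is an illustration under stated assumptions, not a CONTENTdm feature: the function names and sample terms are hypothetical, the input is assumed to be one term per line (as in the voc.[nickname] files), and "conflict" here means only a difference in capitalization.

```python
def load_terms(lines):
    """Read one term per line into a set, ignoring blank lines."""
    return {line.strip() for line in lines if line.strip()}


def compare_vocabularies(terms_a, terms_b):
    """Report terms unique to each project, plus case-only conflicts."""
    only_a = sorted(terms_a - terms_b)
    only_b = sorted(terms_b - terms_a)
    folded_b = {term.casefold(): term for term in terms_b}
    conflicts = sorted(
        (term, folded_b[term.casefold()])
        for term in terms_a
        if term not in terms_b and term.casefold() in folded_b
    )
    return only_a, only_b, conflicts


# Hypothetical in-use vocabularies from two projects.
project_a = load_terms(["Dams", "Irrigation", "Water rights"])
project_b = load_terms(["Dams", "irrigation", "Rivers"])

only_a, only_b, conflicts = compare_vocabularies(project_a, project_b)
print("Only in A:", only_a)
print("Only in B:", only_b)
print("Case conflicts:", conflicts)
```

As with diff, the output is only a worklist; adding the missing terms and resolving conflicts is still a manual step.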
Demo Collections
• Content
Q/A
Contact Us
• Terry Reese - [email protected]
• Corey Harper - [email protected]