Getting Started with CONTENTdm
Corey Harper, University of Oregon
Terry Reese, Oregon State University
OLA - April 8, 2005
Road Map
Introduction to CONTENTdm's field properties and search interfaces
Determining important access points
Discussion of Dublin Core mapping
CONTENTdm's controlled vocabulary structure
Setting up controlled vocabularies
Use of home grown vocabularies
Metadata interoperability
Shared standards and best practices
Western States Best Practices
Working with Western Waters
Managing controlled vocabularies between projects
Demo of some OSU and UO collections
Q&A
Terry Reese - Corey Harper
Oregon Library Association - April 8, 2005
Introducing CONTENTdm
Search Interfaces
3+-3.8 provides three primary search interfaces:
1. Browse Search
2. Advanced Search
3. Custom Search
4 display methods
Grid view
Bibliographic view
Thumbnail view
Title view
Introducing CONTENTdm: Browse Interface
Browses all items in the collection
Browse ordered alphabetically
No skip characters (the, a, I, an are used to determine order)
The pages of CONTENTdm compound objects are not shown in the browse interface.
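The practical effect of "no skip characters" is that leading articles take part in the sort. The sketch below is plain Python, not CONTENTdm code; the sample titles are hypothetical, and case-insensitive comparison is an assumption made for illustration.

```python
# Hypothetical titles; not taken from any real collection.
titles = [
    "The Dalles Dam",
    "A View of Crater Lake",
    "Bonneville Dam",
    "An Oregon Trail Diary",
]

ARTICLES = ("the ", "a ", "an ")


def browse_key(title):
    """Sort key with no skip characters: the whole title, articles included."""
    return title.lower()


def skip_key(title):
    """For contrast only: a key that ignores one leading article."""
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered


no_skip_order = sorted(titles, key=browse_key)
skip_order = sorted(titles, key=skip_key)

print(no_skip_order)
print(skip_order)
```

With no skip characters, "The Dalles Dam" files under T rather than D, which is worth keeping in mind when deciding how titles are entered.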
Introducing CONTENTdm: Browse Interface
(Grid view)
Introducing CONTENTdm: Browse Interface
(Thumbnail view)
Introducing CONTENTdm: Browse Interface
(Bibliographic view)
Introducing CONTENTdm: Browse Interface
(Title view)
Introducing CONTENTdm: Browse Interface
(Custom view)
Hyperlink Example
Introducing CONTENTdm: Advanced
Interface
Two types of searches
Searching across all fields
Searching on a particular field
Introducing CONTENTdm: Setting up field
properties
Field properties set in the administration area
Introducing CONTENTdm: Setting up field
properties
Dublin Core Mappings
Title, Subject, Description, Creator, Publisher, Contributors, Date, Type,
Format, Identifier, Source, Language, Relation, Coverage, Rights, Audience
Title-Alternative
Description-Table Of Contents, Description-Abstract
Date-Created, Date-Valid, Date-Available, Date-Issued, Date-Modified
Format-Extent, Format-Medium
Relation-Is Version Of, Relation-Has Version
Relation-Is Replaced By, Relation-Replaces
Relation-Is Required By, Relation-Requires
Relation-Is Part Of, Relation-Has Part
Relation-Is Referenced By, Relation-References
Relation-Is Format Of, Relation-Has Format Of
Relation-Conforms To
Coverage-Spatial, Coverage-Temporal
Audience-Mediator
None
Data Types:
Text
Date (format: dd/mm/yyyy)
Full Text Search (OCR’d data)
Introducing CONTENTdm: Starting a project
What do we scan?
The first and most important part of the
collection building process.
Every institution has great stuff to digitize but….
Digital collections need to be treated like
traditional materials, i.e., vetted or proposed
by an organization’s subject specialists.
Introducing CONTENTdm: Starting a project
Working with stakeholders:
Working with your subject specialists can help you to identify:
1. Organizational stakeholders (departments, groups, etc.)
2. Outside stakeholders (both public and private)
3. Field-specific thesauri and classification terms that could be incorporated into the project.
Introducing CONTENTdm: Starting a project
Determining access points:
What metadata will be present?
How will it be entered?
What best practices or standards will be used in
generating the metadata?
What metadata will your stakeholders expect to be
present?
What metadata elements will be searchable? Which
will use controlled vocabulary? How will your
administrative metadata be stored?
Introducing CONTENTdm: Starting a project
Access points into the collection:
Once the collection has been built – how will it
be accessed? Search types?
Example:
http://digitalcollections.library.oregonstate.edu/archives/
http://digitalcollections.library.oregonstate.edu/dna/
Mapping to Dublin Core
15 Dublin Core Elements
16th element – Audience
26 Qualified DC Elements (from dcterms
namespace)
None – Special value – field is not mapped
Dublin Core Mapping
OAI-PMH
Open Archives Initiative Protocol for Metadata
Harvesting
I’ll be talking about this some, as will Terry
OAI-PMH is a protocol, layered over HTTP
Response format encoded as XML
Defines format for requests and responses used for
harvesting metadata from collections
Used to build federated search interfaces
http://www.openarchives.org/
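As a rough illustration of the request and response shapes, the sketch below builds a ListRecords request URL and pulls titles out of a hand-written sample response. The base URL, the set name, and the sample record are all hypothetical; only the `verb`, `metadataPrefix`, and `set` arguments and the two XML namespaces come from the OAI-PMH specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the OAI base URL of your own server.
BASE_URL = "http://example.org/oai"

NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request; set_spec optionally limits it to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# A hand-written sample response, trimmed to the parts used below.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:demo/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
          <dc:subject>Rivers</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def titles_from_response(xml_text):
    """Collect every dc:title value found in a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    return [element.text for element in root.iterfind(".//dc:title", NAMESPACES)]


print(list_records_url(BASE_URL, set_spec="demo"))
print(titles_from_response(SAMPLE_RESPONSE))
```

A real harvester would also fetch the URL over HTTP and follow resumption tokens for large result sets; both are left out here to keep the sketch short.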
Some notes on mapping
Effect on searching – across collections,
search based on DC Mapping
Effect on OAI output
15 elements and nothing more.
Qualified elements “dumb down” to Simple DC
“none” and “Audience” aren’t included in OAI
results
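The "dumb down" behavior can be sketched in a few lines. This is an illustration of the idea, not CONTENTdm's implementation: the mapping labels follow the "Element-Qualifier" style of the mapping list, and the field values are made up.

```python
# The 15 Simple Dublin Core elements that survive in OAI output.
SIMPLE_DC = {
    "title", "subject", "description", "creator", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(mapping_label):
    """Return the Simple DC element for a mapping label, or None if the
    field would not appear in OAI output ("None", "Audience", and so on)."""
    base = mapping_label.split("-", 1)[0].strip().lower()
    if base == "contributors":  # the mapping list uses the plural label
        base = "contributor"
    return base if base in SIMPLE_DC else None


def oai_record(fields):
    """Group (mapping label, value) pairs under their Simple DC elements."""
    record = {}
    for mapping_label, value in fields:
        element = dumb_down(mapping_label)
        if element is not None:
            record.setdefault(element, []).append(value)
    return record


# Hypothetical field values for one item.
fields = [
    ("Title", "Celilo Falls"),
    ("Date-Created", "1950"),
    ("Coverage-Spatial", "Columbia River"),
    ("Audience", "General public"),
    ("None", "Scanned by student staff"),
]
print(oai_record(fields))
```

Note that "Date-Created" and any other Date qualifier all land in the same `date` element, so a harvester can no longer tell them apart.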
More on Mapping
Collaborative projects should determine what
information to map to Dublin Core fields
Western States documentation provides
guidance on mapping
Effect on search results at centralized search
interfaces, e.g. Western Waters
Effects of Bad Mapping
Poor mapping decisions can cause problems
Cluttered results in cross collection searches
Cluttered results in federated searching
Inconsistency in where pieces of information are
found
Important information not harvested
Decisions are rarely final
CONTENTdm is extremely flexible
Can easily change mappings, indexing, display,
vocabularies
Can add and delete fields at any time
This can be both good and bad
Editing Field Properties
Controlled Vocabularies in CONTENTdm
Supports SEE FROM type cross-references
No support for SEE ALSO or hierarchical
(BT, NT, RT)
One term per line in a text file
Cross-references: x-ref USE heading
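A minimal sketch of reading such a file follows. It assumes the cross-reference delimiter is the literal text " USE " with a space on each side, and the sample terms are hypothetical; check a real vocabulary file before relying on either assumption.

```python
def parse_vocabulary(lines):
    """Split vocabulary lines into headings and cross-references.

    Assumes one entry per line and " USE " as the cross-reference marker.
    """
    terms = set()
    xrefs = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if " USE " in line:
            variant, heading = line.split(" USE ", 1)
            xrefs[variant.strip()] = heading.strip()
        else:
            terms.add(line)
    return terms, xrefs


def resolve(term, terms, xrefs):
    """Return the authorized heading for a term, or None if it is unknown."""
    if term in terms:
        return term
    return xrefs.get(term)


# Hypothetical vocabulary content.
sample = ["Dams", "Rivers", "Streams USE Rivers", ""]
terms, xrefs = parse_vocabulary(sample)
print(resolve("Streams", terms, xrefs))
```

Because only SEE FROM references are supported, each variant points at exactly one heading, which is why a plain dictionary is enough here.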
Vocabularies on the Server
Stored as text files in vocab folder:
[nickname].txt; e.g. subjec.txt
Additional vocabulary files stored in the
text_search folder:
voc.[nickname] – Vocabulary terms used in instance data
use.[nickname] – X-refs to terms used in instance data
vocuse.[nickname] – Both terms and X-refs
Vocabulary Index Generator
Terry’s PHP Script to create hyperlinked lists
of vocabulary terms:
http://oregonstate.edu/~reeset/contentdm/downloads.html
Excellent for “Browse by Subject” pages
Configurable to include x-refs
Other useful tools available at this URL
Available Vocabularies
Software comes with TGM-I
LCSH and MeSH available from User Support
Center (requires login)
MeSH – 29,000 entries – headings only
LCSH – 156,411 entries – 64,959 headings &
91,452 x-refs.
X-Refs include 400, 410, 411 and 450 references from
authority records coded as 150 with 008 bit 09 “a”
Vocabulary Administration
Vocabulary Verification
Other Vocabularies
Adding terms to existing vocabularies
Be careful: update when new versions available
Creating CONTENTdm-formatted versions of other
vocabularies:
DC Type
MIME Type
GSAFD
Creating home-grown vocabularies from scratch
Home Grown Vocabularies
May be useful to combine vocabularies or
create new ones.
Example: combining terms from GSAFD and
from TGM-II (GMGPC)
Controlling a list of Collection Names,
Collection Identifiers, Source Conditions,
etc.
Generating vocabs from a field's contents
Temporary Vocabularies
Instance data for fields using vocabularies
Term A; Term B; Term C
Use same syntax for non-controlled fields that
contain multiple entries
repeatable fields in the DC sense
Create vocabs from field contents for normalization
and quality control
Browse the defined vocabulary to locate problems
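The quality-control idea can be tried outside CONTENTdm as well. The sketch below splits semicolon-delimited field values into terms, counts them, and flags terms that differ only by case or spacing. The field values are hypothetical, and the "looks like the same term" rule is an assumption chosen for illustration.

```python
from collections import Counter


def split_terms(field_value):
    """Split a 'Term A; Term B; Term C' value into trimmed terms."""
    return [term.strip() for term in field_value.split(";") if term.strip()]


def vocabulary_from_fields(field_values):
    """Count how often each distinct term appears across all records."""
    counts = Counter()
    for value in field_values:
        counts.update(split_terms(value))
    return counts


def likely_variants(counts):
    """Group terms that differ only by case or whitespace."""
    groups = {}
    for term in counts:
        key = " ".join(term.lower().split())
        groups.setdefault(key, []).append(term)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical subject field values from three records.
values = ["Rivers; Dams; Irrigation", "rivers; Dams", "Irrigation ;Canals"]
counts = vocabulary_from_fields(values)
print(sorted(counts))
print(likely_variants(counts))
```

Terms that appear only once are often worth a second look too, since typos tend to show up as singletons.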
CONTENTdm and metadata
interoperability
Issues to consider:
Interoperability between metadata formats
Dublin Core => MARC, etc.
Interacting with federated search tools
Understanding how your metadata could be
harvested via OAI.
CONTENTdm and metadata
interoperability
Building federated collections:
By considering metadata interoperability you
can build federated tools based on OAI:
http://fluffy.library.oregonstate.edu/contentdm/search/index.php
Build tools that integrate with other federated
search tools:
http://fluffy.library.oregonstate.edu/a9/search.php
Shared standards and best practices
Metadata interoperability and shared
standards go hand in hand.
Shared standards are essentially a “trust”
contract between groups of users that their
metadata will conform to a specific set of
rules.
Examples: MARC & AACR2
Shared standards: Western Waters best practices
Shared digital library standards:
Western States Dublin Core best practices
http://www.cdpheritage.org/resource/metadata/wsdcmbp/
Working with Western Waters
Western Waters Digital Library – Federated
CONTENTdm catalog of 12 academic libraries.
Projects contributed to the WWDL require
metadata that meets both local and consortial
standards.
As with any consortial arrangement –
compromise is sometimes going to be necessary.
Working with Western Waters
More stuff
Managing controlled vocabularies between
projects
CONTENTdm has no built-in facility to
share controlled vocabularies between
projects.
Two methods:
1) Use the Unix diff utility to locate differences
between the in-use vocabularies of two projects,
then manually add missing terms or
correct conflicts between the projects.
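Where diff is not available, the same comparison can be approximated with set operations. This swaps in a different technique from the diff approach named above, and the term lists are hypothetical stand-ins for the in-use vocabularies of two projects.

```python
def compare_vocabularies(terms_a, terms_b):
    """Report which terms are unique to each project and which are shared."""
    a, b = set(terms_a), set(terms_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": sorted(a & b),
    }


def case_conflicts(terms_a, terms_b):
    """Find terms that match across projects except for capitalization."""
    lowered_b = {term.lower(): term for term in terms_b}
    conflicts = []
    for term in terms_a:
        other = lowered_b.get(term.lower())
        if other is not None and other != term:
            conflicts.append((term, other))
    return sorted(conflicts)


# Hypothetical in-use terms from two projects.
project_a = ["Dams", "Rivers", "Water rights"]
project_b = ["Dams", "Irrigation", "Water Rights"]

report = compare_vocabularies(project_a, project_b)
print(report)
print(case_conflicts(project_a, project_b))
```

The correction step stays manual either way: the report only tells you where the two projects disagree, not which form should win.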
2) Build your own management tool:
Example:
http://fluffy.library.oregonstate.edu/contentdm/builder.html
Demo Collections
Content
Q/A
Contact Us
Terry Reese - [email protected]
Corey Harper – [email protected]