Image Databases in Practice


Metadata for Asset Management
Peter B. Hirtle
Co-Director
Cornell Institute for Digital Collections
1
Problem: Imaging projects
produce many digital files
2
Problem redux…
How do you locate, manage, and
display scanned images?
4
One possible answer:


Put identifying information into the file
header
Problems with this approach



Hard to search and retrieve
May change over time
May not be able to migrate data
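
As a rough illustration of the file-header approach, the sketch below writes an identifying note into a PNG text chunk and reads it back. PNG is an assumption made only because it can be built with the Python standard library; the keyword "Title" and the value "Item 0001" are hypothetical. Note that finding one image by its title this way means opening and parsing every file, which is the search problem listed above.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def make_png_with_text(keyword: str, text: str) -> bytes:
    """Return a 1x1 grayscale PNG carrying a single tEXt chunk."""
    # IHDR: width, height, bit depth, color type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: filter byte 0 followed by one black pixel
    idat = zlib.compress(b"\x00\x00")
    text_data = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"tEXt", text_data)
        + make_chunk(b"IDAT", idat)
        + make_chunk(b"IEND", b"")
    )


def read_text_chunks(png: bytes) -> dict:
    """Walk the chunk list and collect every tEXt keyword/value pair."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        length, chunk_type = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, value = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found


png_bytes = make_png_with_text("Title", "Item 0001")
print(read_text_chunks(png_bytes))
```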
5
Second approach
Use an image management system to
manage images:
A software application (often a database)
used for organizing, managing, and
providing access to digital media
6
Image management system

Provides tools for searching (descriptive metadata)
Provides public and internal links to the images (structural metadata)
Provides the control elements needed for short- and long-term access (administrative metadata)
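
One way to picture the three roles is a small relational layout with one table per metadata type. The sketch below uses SQLite from the Python standard library; every table name, column name, and sample value is a hypothetical chosen for illustration, not taken from any of the systems or standards discussed here.

```python
import sqlite3


def build_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with one table per metadata type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Descriptive metadata: what users search on
        CREATE TABLE descriptive (
            image_id TEXT PRIMARY KEY, title TEXT, creator TEXT, subject TEXT);
        -- Structural metadata: which files make up an item, and in what role
        CREATE TABLE structural (
            image_id TEXT, sequence INTEGER, file_path TEXT, role TEXT);
        -- Administrative metadata: what is needed to manage access over time
        CREATE TABLE administrative (
            image_id TEXT, scan_date TEXT, resolution_dpi INTEGER, rights TEXT);
        """
    )
    # Hypothetical sample record
    conn.execute(
        "INSERT INTO descriptive VALUES (?, ?, ?, ?)",
        ("img0001", "Campus view", "Unknown", "Architecture"),
    )
    conn.executemany(
        "INSERT INTO structural VALUES (?, ?, ?, ?)",
        [
            ("img0001", 1, "masters/img0001.tif", "master"),
            ("img0001", 2, "web/img0001.jpg", "access"),
        ],
    )
    conn.execute(
        "INSERT INTO administrative VALUES (?, ?, ?, ?)",
        ("img0001", "2000-01-15", 600, "Public domain"),
    )
    conn.commit()
    return conn


def find_access_copies(conn: sqlite3.Connection, subject: str) -> list:
    """Search by descriptive metadata, then follow structural links to files."""
    rows = conn.execute(
        """
        SELECT d.title, s.file_path
        FROM descriptive AS d
        JOIN structural AS s ON s.image_id = d.image_id
        WHERE d.subject = ? AND s.role = 'access'
        ORDER BY d.title
        """,
        (subject,),
    )
    return rows.fetchall()


catalog = build_catalog()
print(find_access_copies(catalog, "Architecture"))
```

Keeping the three kinds of metadata in separate tables is only one possible design; it makes it easier to change one kind (for example, adding a new derivative file) without touching the others.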
7
Metadata for image management

No single accepted standard for each type of metadata

Descriptive metadata: MARC, DC, MOA2, EAD, VRA, Open Archives Initiative
Structural metadata: LC RFPs, MOA2, DOIs
Administrative metadata: DIG 35, NISO draft standard, MOA2, in-process preservation standards such as CEDARS
8
Key concept: metadata is
seldom fixed
You will be massaging the metadata throughout the life of the project:
  To conform to emerging standards
  To adjust to new technical environments
  To add functionality
Once you start a digital project, you are
committed to it for life
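
A typical instance of this ongoing work is remapping locally named fields to an emerging standard. The sketch below shows a simple crosswalk to Dublin Core element names. The local field names and the sample record are hypothetical; only the target element names (creator, title, subject, date, identifier) come from Dublin Core.

```python
# Hypothetical local field names mapped to Dublin Core element names.
LOCAL_TO_DC = {
    "photographer": "creator",
    "caption": "title",
    "keywords": "subject",
    "date_taken": "date",
    "accession_no": "identifier",
}


def to_dublin_core(record: dict) -> tuple:
    """Split a local record into mapped Dublin Core fields and leftovers.

    Returning the unmapped fields separately keeps them visible, so that
    nothing is silently dropped during a migration.
    """
    converted = {}
    unmapped = {}
    for field, value in record.items():
        target = LOCAL_TO_DC.get(field)
        if target is None:
            unmapped[field] = value
        else:
            converted[target] = value
    return converted, unmapped


# Hypothetical record from a local database
local_record = {
    "photographer": "Unknown",
    "caption": "Campus view",
    "accession_no": "img0001",
    "shelf": "B-12",
}
dc_fields, leftovers = to_dublin_core(local_record)
print(dc_fields)
print(leftovers)
```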
9
So where do you get an image
management solution?


No single off the shelf solution
Solutions vary according to:
  complexity
  performance
  cost
10
What is the “ideal solution”…?

Dependent upon your needs:
  size of database
  expected demand for images
  volatility of the data
  available technical resources
11
Other elements to consider....




Access to a controlled thesaurus
Flexibility in database design
The expected life-span of the data
If permanent, the potential for migration
  Adherence to database standards
  Adherence to data content standards
12
Three classes of solutions

Generic database applications
  Desktop
  Client/server
Specialized image management programs
SGML-based solutions
13
Generic database applications

Most common desktop programs
  MS Access, Filemaker Pro
Client/server applications
  Oracle, Informix (including Illustra), 4th Dimension, object-oriented applications
14
Demo Here
15
Advantages to desktop
programs




Low initial cost for desktop programs
Desktop programs are relatively easy to
program and use
Simple data import and export
Growing 3rd-party market of add-ons
(especially web tools)
16
Disadvantages

Desktop solutions limited in size (< 10,000?)
Few standardized data structures
Web interfaces require customization
High costs of programming
  explicit with large applications
  hidden but real with desktop
17
Specialized image
management programs

“Desktop” examples:
  Canto’s Cumulus: http://www.canto-software.com/
  ImageAXS: http://www.dascorp.com
  Portfolio (formerly Fetch): http://www.extensis.com/products/Portfolio/
  Content (shown here)
18
Advantages





Pre-defined data structure
Built-in links to images
Some are cross-platform
Some have built-in links to the web
Overall, less programming expertise
required
19
Disadvantages




Fixed data structure
Proprietary database structures
Limited customization possible
Web access is primarily via scripts
20
Larger client/server image
management programs
Library software
Museum-oriented programs
Document management programs
Digital library solutions
Other programs for newspaper photos, stock photos, multimedia asset management, etc.

21
Library systems

Image-enabled library catalogs include
  VTLS
  CARL
  OCLC Sitesearch
  Endeavor’s Voyager and ENCOMPASS
  RLG has a system in development
All library systems will head in this direction
22
Advantages


Ready links between catalog and digital images
Built on common data structures
  MARC or Dublin Core
Increased likelihood they will exploit library-specific metadata
Greater possibility for shared resources
23
Disadvantages




Poor integration between images and
text
No common repository standard
No shared standard for utilizing
metadata
Administrative hurdles

Do digital imaging and Library Systems talk
to each other?
24
SGML and XML-based systems



A new approach: using metadata encoded
with SGML or XML
Based on document type definitions (DTD)
Examples:



Photographs using EAD: California Heritage
project
Text using Ebind (electronic binding DTD)
Agora’s complete management system
25
Why consider SGML?


Based on an international standard
DTDs may themselves become standards
  Example: MOA2
May be more appropriate for text-oriented description
Links to other SGML or XML-encoded
resources are possible
26
Disadvantages to SGML





Little native client support for SGML
SGML engines may not be as powerful as
relational databases
XML databases are just being developed
Native SGML software tends to be expensive
Often it is easier to store data in a database,
and write it out with SGML or XML tags for
exchange or export
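
The last point can be sketched in a few lines: keep the records in a relational table and generate tagged output only when it is needed. The example below uses SQLite and ElementTree from the Python standard library. The table, the column names, and the `records`/`record` element names are hypothetical and do not follow any published DTD; a real export would map columns to the elements of whichever DTD was chosen.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_sample_db() -> sqlite3.Connection:
    """Create a tiny in-memory table with one hypothetical image record."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (image_id TEXT, title TEXT, creator TEXT)")
    conn.execute(
        "INSERT INTO images VALUES (?, ?, ?)",
        ("img0001", "Barns & silos", "Unknown"),
    )
    conn.commit()
    return conn


def export_records(conn: sqlite3.Connection) -> str:
    """Write every row out as XML, one element per column."""
    root = ET.Element("records")
    cursor = conn.execute(
        "SELECT image_id, title, creator FROM images ORDER BY image_id"
    )
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = ET.SubElement(root, "record")
        for name, value in zip(columns, row):
            # ElementTree escapes characters such as & when serializing
            ET.SubElement(record, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


print(export_records(make_sample_db()))
```

Because the tags are generated at export time, the database stays the working copy and the markup can be regenerated whenever the target format changes.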
27
Summary




No single imagebase package is likely to meet
all your needs
Plan on continuously modifying databases,
interfaces, and metadata
Monitor closely the work developing image
database standards in the area of greatest
interest to you
Avoid if possible the hidden costs of internal
development
28