Models for Digital Libraries
CSC 9010 - week 2
The 5S model is the work of Edward A. Fox and his students
at Virginia Tech. These slides rely heavily on that work.
Week 2 goals
• Discuss the reading of “As We May Think”
• Review of points in the discussion of
What is a Digital Library?
• Introduce a formal model of digital
libraries
• Use the 5S model to direct thinking
about the design of a DL
• Briefly introduce systems to be used
Our systems
• 7 Linux machines, remotely accessible
• Bare machines with just a basic system
• We will install Apache and the rest of
the web infrastructure, as well as the DL
software.
• Detailed instructions will be available
next week.
Memex
• Vannevar Bush’s vision
– How far have we come?
– What did you notice about this article: style, content,
background, or anything else?
– Did the article suggest anything you would not want to see
happen?
Image source:
kelty.rice.edu/375/images/memex/camera.jpg
http://www.knowledgesearch.org/presentations/etcon/images/memex.gif
MyLifeBits
• Gordon Bell and Microsoft
• http://www.guardian.co.uk/science/story/0,3605,1674359,00.html
“Gordon Bell doesn't need to remember, but has no chance of
forgetting. At the age of 71, he is recording as much of his life as
modern technology will allow, storing it all on a vast database: a
digital facsimile of a life lived.
If he goes for a walk, a miniature camera that dangles from his
neck snaps pictures every minute or so, immediately committing
the scene to a memory built not of neurons but ones and
noughts. If he wanders into a cafe, sensors note the change in
light, the shift of temperature and squirrel the information away.
Conversations are recorded and steps logged thanks to a GPS
receiver carried with him.”
Related work
• Walden’s Path
– http://www.csdl.tamu.edu/walden/
– System used by itself or as a service within a
digital library
– Allows a user to make a path through a set of
related resources and save the path for reuse at a
later time.
• Used to allow a teacher to “blaze a trail” through a
collection of materials to help students find their way from
a starting point to a goal.
• Also for recording personal trips through a collection of
material to be revisited.
Moving Forward
• Last week
– Looked at what a library is
• Now
– How do we translate that to a digital entity?
• Information resources, including digital
libraries, are very complex systems.
– A formal model helps to capture the essence of
the system and give special attention to specific
areas
– The model also allows developers of digital
libraries to have a check list of areas to consider
and develop well.
The 5S model
• Streams
– The flow of information in various formats
• Structures
– Organizational aspects of the DL
• Spaces
– Views of components; real or abstract images
• Scenarios
– Services and behaviors
• Societies
– Communities and relationships among them
5S summary
Each model is summarized by its primitives, formalisms, and objectives.
• Stream
– Primitives: Text; video; audio; software program
– Formalisms: Sequences; types
– Objectives: Describes properties of the DL content, encoding and
textual material or particular forms of multimedia data.
• Structure
– Primitives: Collection; catalog; hypertext; document; metadata;
organizational tools
– Formalisms: Graphs; nodes; links; labels; hierarchies
– Objectives: Specifies organizational aspects of the DL content
• Space
– Primitives: User interface; index; retrieval model
– Formalisms: Sets; operations; vector space; measure space;
probability space
– Objectives: Defines logical and presentational views of several
DL components
• Scenarios
– Primitives: Service; event; condition; action
– Formalisms: Sequence diagrams; collaboration diagrams
– Objectives: Details the behavior of DL services
• Societies
– Primitives: Community; managers; actors; classes; relationships;
attributes; operators
– Formalisms: Object-oriented modeling constructs; design patterns
– Objectives: Defines managers responsible for running DL services,
actors that use those services, and relationships among them
Source: http://www.dlib.vt.edu/projects/5S-Model/
Etana - A DL for archeology
An example application of 5S
[Figure: 5S model of Etana, a DL for an archeological site. Labels recovered from the diagram:]
• Society model
– Archaeologist; General public; Service Manager
• Scenario model
– Services: Value added; Domain specific; Repository building;
Information Satisfaction
• Space model
– Geographic space; Metric space; User interface
• Structure model
– Region; *Site; *Partition; *Sub-partition; *Locus; *Container;
*Artifact
– Metadata
– Taxonomies: Spatial; Temporal; Artifact-specific
• Stream model
– Text; Video; Audio; Drawing; Photo; 3D
Source: E. A. Fox http://feathers.dlib.vt.edu/
Case study: Subjects of
interest for creating a DL:
• 6 - History of ___
• 14 - Personal Photos
• 6 - Computing Topic
• 9 - Poems, essays or other literature
• 9 - Cars, trucks
• 18 - Movies, TV programs, or other media
• 9 - Graduate programs - computing
• 6 - Wildlife
• 6 - Pets
• 2 - Gardening
• 3 - Course syllabi
• 5 - Other: ___
The 2 most popular: personal photos
and movies, etc. Use these as a
working example.
Applying the model, informally
Personal Photos; Movie, TV, media
• Stream - what types of data? Gif, jpg, avi?
• Structure - How are the elements organized? Is
there a hierarchy? Are there multiple structures?
• Spaces - How will we index the items? How will
we divide them into related groups?
• Scenarios - what services will we provide? What
information do we need to provide those
services?
• Societies - who is the library intended to serve?
Remember to include agents and other
processes as well as users.
In your group, choose one or the other. Start with stream, scenarios,
societies.
More formally: Definitions
• Definition: A stream is a sequence
whose co-domain is a nonempty set.
• Definition: A structure is a tuple (G, L, F)
where G = (V,E) is a directed graph with
vertex set V and edge set E, L is a set
of label values, and F is a labeling
function.
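The two definitions above can be sketched in Python. This is only an illustration under stated assumptions: a finite stream is modeled as a tuple, and the labeling function F is assumed to label both vertices and edges, which is one common reading of "labeling function". The album, photo names, and labels are hypothetical.

```python
from dataclasses import dataclass


def is_stream(seq, codomain):
    """Return True if codomain is nonempty and every element of seq lies in it."""
    return len(codomain) > 0 and all(x in codomain for x in seq)


@dataclass
class Structure:
    vertices: set    # V
    edges: set       # E, a set of (source, target) pairs
    labels: set      # L
    labeling: dict   # F, assumed here to map every vertex and edge to a label

    def is_valid(self):
        # Every edge must connect two known vertices.
        edges_ok = all(u in self.vertices and v in self.vertices
                       for (u, v) in self.edges)
        # F must be defined on exactly V union E, with values drawn from L.
        domain = self.vertices | self.edges
        labeling_ok = (set(self.labeling) == domain
                       and all(lab in self.labels
                               for lab in self.labeling.values()))
        return edges_ok and labeling_ok


# A stream of bytes: the co-domain is the set of byte values 0..255.
photo_bytes = tuple(b"GIF89a")
byte_values = set(range(256))

# A hypothetical photo album as a labeled directed graph.
album = Structure(
    vertices={"album", "photo1", "photo2"},
    edges={("album", "photo1"), ("album", "photo2")},
    labels={"collection", "item", "contains"},
    labeling={
        "album": "collection",
        "photo1": "item",
        "photo2": "item",
        ("album", "photo1"): "contains",
        ("album", "photo2"): "contains",
    },
)
```

Calling `is_stream(photo_bytes, byte_values)` and `album.is_valid()` checks each object against its definition.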
Definitions, cont’d
• Definition: A space is a measurable space,
measure space, probability space, vector
space, topological space, or metric space
– A vector space is a representation for the set of
elements in a collection. The vector representing
each element encodes the characteristics held by
that element; it both connects that element to
others that are similar and distinguishes it from
those that are different.
– We will do an exercise to illustrate.
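A small sketch of the vector space idea, assuming hypothetical photos described by four yes/no characteristics. Cosine similarity is used here as one common way to compare vectors; the slides do not prescribe a particular measure.

```python
import math

# Hypothetical characteristics; each photo gets a 1 or 0 per characteristic.
FEATURES = ("beach", "family", "pets", "night")

photos = {
    "photo_a": (1, 1, 0, 0),
    "photo_b": (1, 1, 0, 1),
    "photo_c": (0, 0, 1, 0),
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(name):
    """Return the name of the other photo whose vector is closest to this one."""
    others = (n for n in photos if n != name)
    return max(others, key=lambda n: cosine_similarity(photos[name], photos[n]))
```

Here `photo_a` and `photo_b` share two characteristics, so their vectors are close, while `photo_c` shares none with `photo_a` and has similarity 0.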
Definitions - 3
• Definition: A scenario is a sequence of related
transition events (e_1, e_2, …, e_n) on state set S
such that e_k = (s_k, s_{k+1}) for 1 <= k <= n.
– More easily visualized, a scenario is a path in a
directed graph, G = (S, Σ_e), where vertices
correspond to states in the state set S and
directed edges are equivalent to events in a set of
events, Σ_e, and correspond to transitions between
states.
– Scenarios must be implemented to make a
working system.
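The path view of a scenario can be sketched as follows. The states and events are hypothetical, chosen for a "search for a photo" service; each event is stored as a (state before, state after) pair, matching e_k = (s_k, s_{k+1}).

```python
# Hypothetical state set S for a photo search service.
STATES = {"start", "query_entered", "results_shown", "photo_viewed"}

# Hypothetical event set: each event is a transition (state_before, state_after).
EVENTS = {
    ("start", "query_entered"),
    ("query_entered", "results_shown"),
    ("results_shown", "photo_viewed"),
    ("results_shown", "query_entered"),  # the user refines the query
}


def is_scenario(events):
    """Check that events form a path in the graph (STATES, EVENTS).

    Every event must be a known transition, and each event must start
    in the state where the previous event ended.
    """
    if not events:
        return False
    if any(e not in EVENTS for e in events):
        return False
    return all(prev[1] == nxt[0] for prev, nxt in zip(events, events[1:]))


search = [
    ("start", "query_entered"),
    ("query_entered", "results_shown"),
    ("results_shown", "photo_viewed"),
]
```

`is_scenario(search)` holds because each event begins where the previous one ended; a list that skips from `query_entered` straight to an event starting in `results_shown` would be rejected.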
Definitions - 4
• Definition: A society is a tuple (C, R) where
– C = {c_1, c_2, …, c_n} is a set of conceptual
communities, each community referring to a set of
individuals of the same class or type (e.g. actors,
activities, components, hardware, software, data);
– R = {r_1, r_2, …, r_m} is a set of relationships, each
relationship being a tuple r_j = (e_j, i_j) where e_j is a
Cartesian product c_{k_1} x c_{k_2} x … x c_{k_{n_j}},
1 <= k_1 < k_2 < … < k_{n_j} <= n, which specifies the
communities involved in the relationship, and i_j is an activity.
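A minimal sketch of the society definition, with hypothetical communities and one hypothetical relationship. Communities are named sets of individuals; a relationship names the communities it involves and the activity that links them.

```python
import itertools
from dataclasses import dataclass

# C: communities, each a set of individuals of the same type (names are hypothetical).
communities = {
    "patrons": {"alice", "bob"},
    "librarians": {"carol"},
    "search_services": {"photo_search"},
}


@dataclass(frozen=True)
class Relationship:
    members: tuple   # names of the communities involved (the factors of e_j)
    activity: str    # i_j, the activity linking those communities

    def is_valid(self, communities):
        # Every named community must exist, and none may appear twice,
        # mirroring the strict ordering k_1 < k_2 < ... in the definition.
        return (all(name in communities for name in self.members)
                and len(set(self.members)) == len(self.members))

    def tuples(self, communities):
        # The Cartesian product of the member communities.
        return set(itertools.product(*(communities[n] for n in self.members)))


# R: here a single relationship between patrons and a search service.
searches = Relationship(("patrons", "search_services"), "searches")
```

`searches.tuples(communities)` yields every (patron, service) pair that could take part in the activity, which is what the Cartesian product in the definition expresses.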
Summary - Week 2
• Continued to explore what a digital
library really is
• Introduced some formal concepts for
modeling a DL
• Briefly discussed the installation and
operation of our own DLs.