JCDL 2009 - Doctoral Consortium PowerPoint presentation (be careful, there are mouse click and timed animations)


Unsupervised Creation of Small World Networks for the Preservation of Digital Objects
Charles L. Cartledge
Michael L. Nelson
Old Dominion University
Department of Computer Science
Norfolk, Virginia
Order of Presentation
• Technology enablers
• Constraints
• Simple rules for Complex Behavior
• Simulation approach
• Simulation results
• Future work
SP145
JCDL Short Paper Presentation
2
Motivation
[Figure: timeline along a “Time” axis marked 1907, 2007, and 2107]
Technology Enablers
Cost data: http://www.archivebuilders.com/whitepapers/22011p.pdf
Constraints
“… Tomorrow we could see the National Library of Medicine abolished by Congress, Elsevier dismantled by a corporate raider, the Royal Society declared bankrupt, or the University of Michigan Press destroyed by a meteor. All are highly unlikely, but over a long period of time unlikely events will happen. …” (emphasis CLC)
W. Y. Arms, “Preservation of Scientific Serials: Three Current Examples,” Journal of Electronic Publishing, Dec., 1999
[Slide annotations, pairing not recoverable from the transcript: 75, 12 –, 80, 101 yrs; 5 – 60 yrs; “Those that die, do so in avg. 23 yrs.”]
Expectancy data: http://www.cdc.gov/nchs/data/nvsr/nvsr57/nvsr57_14.pdf
http://www.lbl.gov/Science-Articles/Archive/ssc-and-future.html
http://www.dod.mil/brac/
http://www.hq.nasa.gov/office/pao/97budget/zbr.txt
http://www2.westminster-mo.edu/wc_users/homepages/staff/brownr/ClosedCollegeIndex.htm
Picture: Patricia W. and J Douglas Perry Library, Old Dominion University
Reynolds’s Rules for Flocking
• Collision Avoidance: avoid collisions with nearby flock mates
• Velocity Matching: attempt to match velocity with nearby flock mates
• Flock Centering: attempt to stay close to nearby flock mates
My interpretation
• Namespace collision avoidance
• Following others to available storage locations
• Deleting copies of one’s self to provide room for late arrivers
Images and rules:
http://www.red3d.com/cwr/boids/
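One way to picture the three interpreted rules is the minimal Python sketch below. It is an illustration under stated assumptions, not the authors' implementation: the `Host` class, the `try_store` and `candidate_hosts` helpers, and the policy of letting the most-replicated DO give up its copy are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical storage host with a fixed number of replica slots."""
    name: str
    capacity: int
    replicas: set = field(default_factory=set)  # names of DOs stored here

    def has_room(self):
        return len(self.replicas) < self.capacity


def candidate_hosts(neighbour_hosts):
    """Rule 2 (assumed reading): follow neighbours to the hosts they use.

    neighbour_hosts is a list of host lists, one per neighbouring DO.
    Returns the distinct hosts in first-seen order.
    """
    seen = []
    for hosts in neighbour_hosts:
        for host in hosts:
            if host not in seen:
                seen.append(host)
    return seen


def try_store(do_name, host, replica_count, min_replicas):
    """Try to place one replica of do_name on host; return True on success.

    replica_count maps each DO name to its current number of replicas.
    """
    # Rule 1 (assumed reading): avoid a namespace collision on this host.
    if do_name in host.replicas:
        return False
    # Rule 3 (assumed reading): on a full host, a DO holding more than the
    # minimum deletes its copy to make room for a late arriver that is
    # still below the minimum.
    if not host.has_room():
        donors = [d for d in host.replicas if replica_count[d] > min_replicas]
        if not donors or replica_count[do_name] >= min_replicas:
            return False
        donor = max(donors, key=lambda d: (replica_count[d], d))
        host.replicas.remove(donor)
        replica_count[donor] -= 1
    host.replicas.add(do_name)
    replica_count[do_name] += 1
    return True
```

For example, on a full one-slot host holding a DO with four replicas, a newcomer with none displaces it when `min_replicas` is 3, while a second copy of the resident DO is refused.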
Types of Graphs
                         Regular   Small World   Random
Path length              Long      Shorter       Short
Clustering coefficient   High      Still high    Low
(Each graph has 20 vertices and 40 edges.)
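The comparison above can be reproduced roughly with the sketch below, which builds a 20-vertex, 40-edge ring lattice and measures both quantities. The rewiring step is the Watts-Strogatz procedure, used here only as a familiar way to move from regular toward random; it is an assumption for illustration and not the unsupervised method this presentation describes. All function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def ring_lattice(n, k):
    """Regular graph: each vertex links to its k nearest ring neighbours."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def rewire(adj, p, rng):
    """Move the far end of each edge with probability p (edge count kept)."""
    n = len(adj)
    new = {v: set(nbrs) for v, nbrs in adj.items()}
    edges = sorted((v, w) for v in adj for w in adj[v] if v < w)
    for v, w in edges:
        if rng.random() < p:
            choices = [u for u in range(n) if u != v and u not in new[v]]
            if choices:
                u = rng.choice(choices)
                new[v].discard(w)
                new[w].discard(v)
                new[v].add(u)
                new[u].add(v)
    return new


def clustering_coefficient(adj):
    """Mean over vertices of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # vertices with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over reachable ordered vertex pairs (BFS)."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


regular = ring_lattice(20, 4)  # 20 vertices, 40 edges
for p in (0.0, 0.1, 1.0):      # regular, small-world-like, random-like
    g = rewire(regular, p, random.Random(1))
    print(p, average_path_length(g), clustering_coefficient(g))
```

For the unrewired lattice the clustering coefficient is exactly 0.5 and the average path length is 55/19; the rewired values depend on the random seed.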
Desirable Graph Properties
Unsupervised Small World Graph Creation
• gamma = 0.0
• alpha = 0.99
• 0.2 <= beta <= 0.66
• gamma < 0.6
• gamma = 0.7
• alpha = 0.99
CC is shown as dark lines
L is shown as light lines
Phases/Activities
Creation (Human or archivist activities)
Wandering (Autonomous activities)
Connecting (Autonomous activities)
Flocking (Autonomous activities)
Creation
Any DO
Wandering
[Animation: DOs A, B, C, and D. The question “Who are you connected to?” is asked four times; the replies shown are “Connected to: <Nil>”, “Connected to: A”, “Connected to: B, C”, and “Connected to: A”.]
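The question-and-reply exchange on this slide can be sketched as a short traversal. This is a minimal illustration, assuming a breadth-first visiting order (the slides do not state the order) and a hypothetical `wander` helper; the example graph is one reading of the replies shown, with A connected to B and C.

```python
from collections import deque


def wander(first_contact, connections):
    """Visit DOs reachable from first_contact, asking each one
    "Who are you connected to?" and recording the reply.

    connections maps a DO name to the set of DOs it is connected to.
    Returns a dict of DO name -> sorted list of reported connections.
    """
    learned = {}
    queue = deque([first_contact])
    while queue:
        do = queue.popleft()
        if do in learned:
            continue  # already asked this DO
        reply = sorted(connections.get(do, ()))  # an empty reply plays the role of <Nil>
        learned[do] = reply
        queue.extend(n for n in reply if n not in learned)
    return learned


# Hypothetical graph consistent with the replies on the slide.
graph = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
print(wander("B", graph))
```

Everything the wandering DO learns comes from these local replies, which is the sense in which the resulting graph is built from locally gleaned data.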
Connecting
[Animation: DOs A, B, C, and D, with edges labeled “Possible connection” and “Connection established”.]
Flocking
[Animation labels: A, B, C, D, A’, A’’, C’, C’’, D’, D’’]
Typical Simulation Parameters
• alpha = 0.5
• beta = 0.6
• gamma = 0.1
• Number of DOs = 1000
• Number of hosts = 1000
• Min number desired replicas = 3
• Max number desired replicas = 10
• Max number of replicas per host = 20
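These parameters could be collected in a small configuration object, as in the sketch below. The class and field names are hypothetical, and alpha, beta, and gamma are carried as opaque values because the slides do not define them. The capacity check is plain arithmetic added for illustration: 1000 DOs at up to 10 replicas need 10,000 slots, and 1000 hosts at 20 replicas each offer 20,000.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationParameters:
    """Typical values from the slide; names are assumptions."""
    alpha: float = 0.5
    beta: float = 0.6
    gamma: float = 0.1
    num_dos: int = 1000
    num_hosts: int = 1000
    min_replicas: int = 3
    max_replicas: int = 10
    max_replicas_per_host: int = 20

    def __post_init__(self):
        if self.min_replicas > self.max_replicas:
            raise ValueError("min_replicas must not exceed max_replicas")

    def slots_needed(self):
        """Slots used if every DO reaches its maximum desired replicas."""
        return self.num_dos * self.max_replicas

    def slots_available(self):
        """Total replica slots across all hosts."""
        return self.num_hosts * self.max_replicas_per_host

    def capacity_ok(self):
        return self.slots_needed() <= self.slots_available()


params = SimulationParameters()
print(params.slots_needed(), params.slots_available(), params.capacity_ok())
```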
Simulation Results and Analysis
Future work
• Test the autonomous graphs for resilience to error and attack
• Test what happens when a graph becomes disconnected
• Test what happens when a disconnected graph becomes re-connected
Conclusions
• We have shown that Digital Objects can autonomously create small world graphs based on locally gleaned data
• These graphs can be used for long-term preservation
• We intend to study these graphs, focusing on their tolerance to isolated and widespread failures
And that concludes my presentation.
Backup Information
• Equations for Average Path Length and Clustering Coefficients
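The equations themselves did not survive in this transcript. The standard definitions for an undirected graph with n vertices are given below, on the assumption that these are the ones intended. The symbols are not taken from the slide: d(v_i, v_j) is the shortest-path length between two vertices, k_i is the degree of vertex v_i, and e_i is the number of edges among the neighbours of v_i.

```latex
% Average path length over all ordered pairs of distinct vertices
L = \frac{1}{n(n-1)} \sum_{i \neq j} d(v_i, v_j)

% Local clustering coefficient of vertex v_i, and its mean over the graph
C_i = \frac{2 e_i}{k_i (k_i - 1)}, \qquad
C = \frac{1}{n} \sum_{i=1}^{n} C_i
```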