JCDL 2009 - Doctoral Consortium PowerPoint presentation (be careful, there are mouse click and timed animations)
Unsupervised Creation of Small World Networks for the Preservation of Digital Objects
Charles L. Cartledge, Michael L. Nelson
Old Dominion University, Department of Computer Science, Norfolk, Virginia

Order of Presentation
• Technology enablers
• Constraints
• Simple rules for Complex Behavior
• Simulation approach
• Simulation results
• Future work

SP145 JCDL Short Paper Presentation

Motivation
[Figure: timeline with marks at 1907, 2007, and 2107; axis: Time]

Technology Enablers
[Figure: cost data]
Cost data: http://www.archivebuilders.com/whitepapers/22011p.pdf

Constraints
“ … Tomorrow we could see the National Library of Medicine abolished by Congress, Elsevier dismantled by a corporate raider, the Royal Society declared bankrupt, or the University of Michigan Press destroyed by a meteor. All are highly unlikely, but over a long period of time unlikely events will happen. …” (emphasis CLC)
W. Y. Arms, “Preservation of Scientific Serials: Three Current Examples,” Journal of Electronic Publishing, Dec., 1999
Slide annotations (expectancy figures, as extracted): 75; 12 – 80; 101 yrs; “Those that die, do so in 5 – 60 yrs”; avg. 23 yrs.
Expectancy data:
http://www.cdc.gov/nchs/data/nvsr/nvsr57/nvsr57_14.pdf
http://www.lbl.gov/Science-Articles/Archive/ssc-and-future.html
http://www.dod.mil/brac/
http://www.hq.nasa.gov/office/pao/97budget/zbr.txt
http://www2.westminster-mo.edu/wc_users/homepages/staff/brownr/ClosedCollegeIndex.htm
Picture: Patricia W. and J Douglas Perry Library, Old Dominion University

Reynolds’s Rules for Flocking
• Collision Avoidance: avoid collisions with nearby flock mates
• Velocity Matching: attempt to match velocity with nearby flock mates
• Flock Centering: attempt to stay close to nearby flock mates
My interpretation
• Namespace collision avoidance
• Following others to available storage locations
• Deleting copies of one’s self to provide room for late arrivers
Images and rules: http://www.red3d.com/cwr/boids/

Types of Graphs
                          Regular   Small World   Random
Path length               Long      Shorter       Short
Clustering coefficient    High      Still high    Low
(Each graph has 20 vertices and 40 edges.)

Desirable Graph Properties
[Figure only; no text recoverable]

Unsupervised Small World Graph Creation
• gamma = 0.0
• alpha = 0.99
• 0.2 <= beta <= 0.66
• gamma < 0.6
• gamma = 0.7
• alpha = 0.99
CC is shown as dark lines
L is shown as light lines

Phases/Activities
• Creation (Human or archivist activities)
• Wandering (Autonomous activities)
• Connecting (Autonomous activities)
• Flocking (Autonomous activities)

Creation
[Figure: a new DO and "Any DO"]

Wandering
[Figure: DOs A, B, C, and D. The wandering DO asks each DO it meets "Who are you connected to?" The replies shown are "Connected to: <Nil>", "Connected to: A", "Connected to: B, C", and "Connected to: A".]

Connecting
[Figure: DOs A, B, C, and D, with edges labeled "Connection established" and "Possible connection"]

Flocking
[Figure: DOs A, B, C, and D with replicas A’, A’’, C’, C’’, D’, and D’’ spread across hosts]

Typical Simulation Parameters
• alpha = 0.5
• beta = 0.6
• gamma = 0.1
• Number of DOs = 1000
• Number of hosts = 1000
• Min number desired replicas = 3
• Max number desired replicas = 10
• Max number of replicas per host = 20

Simulation Results and Analysis
[Figure only; no text recoverable]

Future work
• Test the autonomous graphs for resilience to error and attack
• Test what happens when a graph becomes disconnected
• Test what happens when a disconnected graph becomes re-connected

Conclusions
• We have shown that Digital Objects can autonomously create small world graphs based on locally gleaned data
• These graphs can be used for long term preservation
• We intend to study these graphs focusing on their tolerance to isolated and widespread failures

And that concludes my presentation.

Backup Information
• Equations for Average Path Length and Clustering Coefficients
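
The backup slide names equations for average path length (L) and clustering coefficient (CC), but the transcript does not reproduce them. The sketch below assumes the usual small-world definitions: L is the mean shortest-path length over all vertex pairs, and CC is the mean, over vertices, of the fraction of possible links among a vertex's neighbours that actually exist. It applies them to a regular ring lattice with 20 vertices and 40 edges, matching the size quoted on the Types of Graphs slide. The function names are illustrative and do not come from the presentation, and the lattice is only a stand-in for the "Regular" graph, not the authors' simulation.

```python
from collections import deque
from itertools import combinations


def ring_lattice(n, k):
    """Regular ring lattice: each vertex links to its k nearest neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered vertex pairs (BFS per source)."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


def clustering_coefficient(adj):
    """Mean over vertices of (links among neighbours) / (possible links among neighbours)."""
    coeffs = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # assumption: vertices with fewer than 2 neighbours count as 0
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(links / (k * (k - 1) / 2))
    return sum(coeffs) / len(coeffs)


# 20 vertices, each linked to its 4 nearest neighbours, gives 40 edges.
graph = ring_lattice(20, 4)
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
path_length = average_path_length(graph)
clustering = clustering_coefficient(graph)
print(edge_count, round(path_length, 3), round(clustering, 3))
```

For this lattice the clustering coefficient is 0.5 and the average path length is 55/19, roughly 2.89. Rewiring a few edges at random would be expected to shorten the path length while leaving clustering comparatively high, which is the small-world regime the Types of Graphs slide describes.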