High contrast colours will help audiences to read text


Working collaboratively in digital preservation

Welcome!

We’re just waiting for everyone to join and then we’ll get started...

Working collaboratively in digital preservation

Paul Wheatley
SPRUCE Project Manager, University of Leeds
Twitter: @prwheatley
http://openplanetsfoundation.org/blogs/paul

Housekeeping

• Welcome!
• http://bit.ly/OPFcolab
• Please mute your microphone
• Please open your chat box
• If you can’t hear me please let me know!

Overview

• Community fail: what has gone wrong in the digital preservation community?
• The benefits of collaboration: how working together is good for us all, but most importantly you
• Get involved: a sample of the best collaborative initiatives and how you can be part of them
• The end result: what we can achieve when we collaborate

Community fail: what has gone wrong in the digital preservation community?

• Community fail
• Preservation costing example
• Tools and user needs example
• Finding tools example

Community fail

• Insufficient communication, awareness, sharing and user-driven development
• Duplication / reinvention / insufficient reuse of existing tools/approaches
• Impact:
  – Real challenges on the ground fail to be solved
  – Tools not fit for purpose
  – Best practice? Virtually non-existent
  – Practitioners struggle

Example: Digital preservation costing initiatives

• LIFE 1, 2 and 3: Projects to explore digital preservation costing, and develop costing models
• Cost Model for Digital Preservation (CMDP): Project at the Royal Danish Library and the Danish National Archives to develop a new cost model. Currently covers Planning, Migrations and Ingest
• Keeping Research Data Safe 1 and 2 (KRDS): Cost model and benefits analysis for preserving research data
• Presto Prime cost model for digital storage
• Cost Estimation Toolkit (CET): Data centre costing model and toolkit, from NASA Goddard
• Cost Model for Small Scale Automated Digital Preservation Archives (Strodl and Rauber)
• APARSEN Project activity focused on digital preservation costing
• EPSRC and JISC study on Cost analysis of cloud computing for research
• Cost forecasting model for new digitization projects (Excel and web tool under development) (Karim Boughida, Martha Whittaker, Linda Colet, Dan Chudnov)
• DP4lib business and cost model for a digital preservation service
• DANS Costs of Digital Archiving Volume 2 Project, focusing on preservation and dissemination of research datasets
• Blue Ribbon Task Force on Sustainable Digital Preservation and Access
• Economic Sustainability Reference Model
• ENSURE Project - Enabling kNowledge Sustainability Usability and Recovery for Economic value
• Cost Model for Electronic Health Records (Bote, Fernandez-Feijoo, and Ruiz)
• 4C: EU funded project on costing. Due to commence in 2012. Led by JISC

http://wiki.opf-labs.org/display/CDP/Home
An extended blog-rant on why this typifies a big #fail for our community

It gets worse...

• Screening the Future: Managing the cost of archiving, master class, 22nd May, California
• Nordbib: The Costs and Benefits of Keeping Knowledge, 11th June, Copenhagen
• JCDL2012: Models for Digital Cost Analysis, 14th June, Washington DC

Blue Screen of Death Image courtesy of Bill Lefurgy, from the Atlas of Digital Damages

The tools don’t work....

“10 Years on we are still pretty much talking about the same things... Tools like DROID and PRONOM etc. didn’t work properly then, and they still don’t work properly now.”

Steve Knight, New Zealand National Library, iPRES2012 (blogged by Inge Angevaare: How are we doing as a community?)

Mismatch of solutions to user needs

• Tools and services that solve digital preservation problems we don’t have
• Tools and services that are difficult to use, difficult to integrate with a user’s setup/workflow/technology
• Focus on preservation planning, migration, emulation, file format obsolescence...
• In the first instance practitioners need better characterisation:
  – Appraisal / assessment
  – Risk identification
  – Quality assurance

Put the users in the driving seat

• Give the users and practitioners more of a voice
• Capture and articulate the challenges more effectively
• Share, discuss and refine our requirements

Example: Finding digital preservation tools

• Where do you look when you need a preservation tool to solve a particular problem?

Too many lists, not enough collaboration

• One tool registry
• Utilised (and supported) by the big organisations
• Anyone can edit it
• Links to the code, links to user experiences

The benefits of collaboration: how working together is good for us all, but most importantly you

• Community
• Individual
• Note of caution

Benefits of collaboration to the community

• Understanding the problem
  – Capture the challenges
  – Share requirements
  – Focused solutions to the problems we have
• Understanding approaches to solving the problem
  – What works well
  – What doesn’t work so well
  – Best practice guidance
• Developing a solution
  – Pool development resource
  – Solve a shared problem, you have a group of users
  – Bigger impact, better solutions

Benefits of collaboration (individually)

• Learning opportunity
• New ways of working
• More efficient
• Raises your profile
• Reassuring and supporting
• Fun

Note of caution: the flip side

• Overheads, e.g. comms and coordination
• Removal of all redundancy is probably a bad thing in digital preservation!
• Dependence on others is a risk
• Choose your partnerships carefully

Get involved: a sample of the best collaborative initiatives and how you can be part of them

• Atlas of Digital Damages
• OPF Format Corpus
• Stack Exchange
• Mashups

The SPRUCE Mashup

Identify and solve concrete preservation problems

• 3 day workshops for ~30 people
• Practitioners bring along digital collections
• We identify preservation challenges
• Pair up practitioners with technical experts
• Apply existing open source tools to solve the problems
• In doing so, we exchange knowledge about digital preservation
• Begin to develop a supportive community

Glasgow Mashup April 2012

Mashups: some observations

• Almost every Mashup solution from 5 different events utilised existing (non-DP) tools
  – The challenge is finding the right tool to apply
  – The group’s collective knowledge is very useful here
• Most valuable aspect is the conversation and knowledge sharing
• Practitioners and developers *can* understand each other!
• Agile development offers a number of advantages

The end result: what we can achieve when we collaborate

• Jpylyzer
• Datasets, Issues and Solutions
• File Format November
• Golden rules

Jpylyzer example

• New characterisation + validation tool for JPEG2000
  – Produced by SCAPE and OPF
  – Development and operation driven by use cases
• E.g. JP2 used at scale in mass digitisation efforts; truncation is a common potential problem, yet existing tools didn’t check (quite complex) end of file conditions
• Flawed creation tools omit critical metadata in created files
• Validation use case: check JP2 files from 3rd party digitisation suppliers against a profile
  – A number of organisations had the same needs
  – Shared example files enabled testing of solution
  – Focused tool: easy to use, easy to embed
  – Now part of Goobi!
  – http://openplanetsfoundation.org/software/jpylyzer
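To give a flavour of the truncation use case, here is a minimal Python sketch of two cheap byte-level checks on a JP2 file. It is not how jpylyzer works and is no substitute for it: jpylyzer's validation is far more thorough. The function name `quick_jp2_check` is hypothetical, and the sketch assumes the codestream is the last thing in the file, which the format does not require, so a valid file with trailing boxes would be flagged.

```python
import os

# 12-byte signature box that opens every JP2 file.
JP2_SIGNATURE = bytes.fromhex("0000000C6A5020200D0A870A")
# End-of-codestream (EOC) marker that closes a JPEG 2000 codestream.
EOC_MARKER = b"\xff\xd9"


def quick_jp2_check(path):
    """Return a list of problems found by two simple byte-level checks.

    Hypothetical helper for illustration only. Assumes the codestream
    is the final element in the file.
    """
    problems = []
    with open(path, "rb") as f:
        # Read only the first bytes, then jump to the last two bytes,
        # so large images are not loaded into memory.
        head = f.read(len(JP2_SIGNATURE))
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(size - len(EOC_MARKER), 0))
        tail = f.read()
    if head != JP2_SIGNATURE:
        problems.append("missing JP2 signature box")
    if tail != EOC_MARKER:
        problems.append("no EOC marker at end of file (possibly truncated)")
    return problems
```

An empty list means neither check found a problem; it does not mean the file is valid. A batch received from a supplier could be screened this way first, with anything flagged (and ideally everything else too) passed to jpylyzer for proper validation.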

Golden rules for collaboration in digital preservation

• Share, share and share again
  – Open licensing FTW
  – Think about the best location
• Utilise existing infrastructure
  – Flickr, Github, Stack Exchange, etc...
• Before you start a new initiative/development/whatever...
  – Check there isn’t something you can build on
• A document dies when it’s published...
  – A wiki has the chance to live on if it’s of interest to a community
• If you’re not on twitter, you’re missing out

In summary

Collaboration is:

• good for you and for those you collaborate with
• quick and easy to get involved in
• actually quite good fun!

So please get involved:

• http://bit.ly/spruce-collaborate

Thanks for listening! Any questions?

http://bit.ly/spruce-collaborate
Paul Wheatley
SPRUCE Project Manager, University of Leeds
Twitter: @prwheatley
http://openplanetsfoundation.org/blogs/paul

Cartoon images courtesy of digitalbevaring.dk