This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0 An Introduction to the UK Digital Curation Centre Dr Liz Lyon, DCC Associate Director.
An Introduction to the UK Digital Curation Centre
Dr Liz Lyon, DCC Associate Director Outreach Director, UKOLN, University of Bath, UK
CURL/SCONUL Workshop, December 2005
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
Funded by:
Digital | Curation | Centre

Overview
• About the Digital Curation Centre
  – Organisation and structure
• What is digital curation?
  – e-Research cycle
• DCC activities
  – Development activity
  – Research agenda
  – Advisory services
  – Outreach programme

UK Digital Curation Centre
• Development activities
• Research agenda
• Delivering services
• Outreach Programme
• http://www.dcc.ac.uk/

DCC people (some of them…)
• Management & Co-ordination
  – Director Chris Rusbridge (University of Edinburgh)
• Community Support & Outreach
  – Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
  – Led by Professor Seamus Ross (HATII, University of Glasgow)
• Development
  – Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)
• Research
  – Led by Professor Peter Buneman (University of Edinburgh)

What is digital curation?
• For later use? Static: Data preservation
• In use now (and the future)? Dynamic: Data curation
“maintaining and adding value to a trusted body of digital information for current and future use”

(Very simple) e-Research Cycle and Data Curation
[Diagram labels]
• (New) knowledge extraction: data mining, modelling, analysis, synthesis
• Formulate hypothesis / ideas, test, experiment, observe: data creation, collection & capture
• Adding value: Data linking, annotation, visualisation, simulation
• Data management storage & validation: description, deposit, self-archiving, preservation, certification
• Scholarly communications: data disclosure, publication, citation, discovery, re-use
• Data processing (repeated label)
• e-Infrastructure
• Open access
• Collaboration

Engineering Product Information
EPSRC Grand Challenge Project, Prof Chris McMahon, University of Bath

– Access Grid
– Collaborative telematic art
– Modify spaces for performers
– Interplay: Hallucinations

Data capture & integration into research workflows
• R4L Repository for the Laboratory Project (JISC-funded): automated data capture from instrumentation, deposit of results (chemistry)
• SMART TEA electronic Laboratory notebook + annotations

[Diagram repeated: (Very simple) e-Research Cycle and Data Curation]

The scholarly knowledge cycle. Liz Lyon, Ariadne, July 2003.
© Liz Lyon (UKOLN, University of Bath), 2005
[Diagram labels]
• Presentation services: subject, media-specific, data, commercial portals
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Data analysis, transformation, mining, modelling
• Resource discovery, linking, embedding (repeated label)
• Searching, harvesting, embedding
• Aggregator services: national, commercial
• Learning object creation, re-use
• Harvesting metadata
• Research & e-Science workflows
• Learning & Teaching workflows
• Deposit / self-archiving (repeated label)
• Repositories: institutional, e-prints, subject, data, learning objects
• Validation (repeated label)
• Publication
• Peer-reviewed publications: journals, conference proceedings
• Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
• Quality assurance bodies

Disciplinary data-centres

eBank UK Project
http://www.ukoln.ac.uk/projects/ebank-uk/
• Two key themes:
  – Open access to datasets
  – Linking research data to publications and to learning
• UKOLN, University of Southampton, University of Manchester
• e-Science application ‘Combechem’: Grid-enabled combinatorial chemistry + National Crystallography Service
• Resource Discovery Network / PSIgate physical sciences portal

A data repository entry

Access to the underlying data: complex objects

ecrystals.chem.soton.ac.uk
Data descriptions
• Validation, publication & discovery of data models & schema
• Managing complex objects
• Metadata packaging standards
  – METS
  – MPEG 21 DIDL
• Semantic descriptions
  – Formal controlled vocabularies
  – High-level and domain ontologies
  – Inter-disciplinary discovery
• Informal approaches: Web 2.0 “folksonomies”

Trusted digital repositories
• Audit Checklist for Certification
• Draft Report published August 2005
• Research Libraries Group RLG-NARA Taskforce
• Defined criteria under 4 categories
  – Organisation
  – Functions, processes & procedures
  – Designated community & usability
  – Technologies & technical infrastructure

OAIS Reference Model

DCC: Development
• “DCC Approach to Digital Curation” based on the Reference Model for an Open Archival Information System (OAIS); ISO standard, 14721:
  – Monitoring international standards
  – Development of a Representation Information (RI) registry/repository (DCC-RR)
  – Recommendations for tools and methods for generating Representation Information
  – Creating test-beds for digital curation tools
Development info: see http://dev.dcc.ac.uk for details of Wiki and email list, open to all

[Diagram repeated: (Very simple) e-Research Cycle and Data Curation]

Persistent identifiers for data citation
• Identify use cases: depositor, author, service provider, reader, publisher, ?
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• Added value services: CrossRef, resolution service, integration (Globus), look-up service
• Domain identifiers: e.g. International Chemical Identifier (InChI) codes
• Google molecules using InChIs demo: Peter Murray-Rust, University of Cambridge
• DCC Workshop June 2005 Glasgow

One approach to data citation using DOIs
• Publication & citation of scientific primary data project: National Library for Science & Technology (TIB), University of Hanover, Germany; STD-DOI Project http://www.std-doi.de
• DOI registry for datasets
• Data publication agents: World Data Center Climate, GeoForschungsZentrum Potsdam
• Data requirements: quality control, long-term curation, use DOI resolver
• Exemplar data citation:
  – Kamm, H; Machon, L; Donner, S (2004): Gas chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktbgeoch-gaschr-p

[Diagram repeated: (Very simple) e-Research Cycle and Data Curation]

Adding value: eBank linking data to publications

Linking research to learning - embedding eBank aggregator service in a science portal for student learners

Adding value through annotation
DCC Research at the University of Edinburgh
• Scientific databases: Annotation scoping report
• AstroDAS: distributed annotation servers in astronomy
• New annotation model + prototype: top-ranked demonstration at recent DB conference

DCC Research agenda
• Publishing & integrating scientific databases
• ‘Archiving’ past states of volatile databases
• Database provenance and annotation
• Organisational dynamics of trusted repositories
• Automating metadata extraction
• Cost-benefit analysis of data curation
• Rights and responsibilities
  – “Public domain, public interest, public funding” paper, Waelde & McGinley

[Diagram repeated: (Very simple) e-Research Cycle and Data Curation]

Facilitate “post-processing” and knowledge extraction
Enable the acquisition of newly-derived information and knowledge
• Run complex algorithms over primary datasets
• Mining (data, text, structures)
• Modelling (economic, climate, mathematical, biological)
• Analysis (statistical, lexical, pattern matching, gene)

DCC Case Study published: Wide Field Astronomy Unit

Supporting the community
• DCC Outreach & Services:
  – [email protected] (legal - technical guidance)
  – Curation Manual 45 chapters planned, Briefing Papers
  – Workshops: Future-proofing Institutional Web sites, Jan 19-20, London
  – Information Days: regional
  – 1st International DCC Conference, Bath Sept 2005
  – PV2005 November, Edinburgh
  – 2nd International Conference November 2006 Glasgow tbc

• www.ijdc.net
• Peer-review Editorial Board
• Peter Buneman Editor (research)
• Production editor Richard Waller
• Papers for submission are very welcome!
• 1st issue soon….

Associates Network
• Goals: Develop understanding, share best practice, advance research, promote recognition, develop consensus
• Membership: International groups, national bodies, industry partners, funders, research groups, HEIs, FEIs, individuals……
• Benefits: Early access to R&D outputs, advisory services, training, input to definition and design, community participation
Discussion Forum www.dcc.ac.uk
Please join us!

Developing skills & collaboration
• NSF Report: “Data scientist”
• Develop hybrid skills
• Embed in u/g, p/g curriculum
• Facilitate community collaboration:
  – Researchers
  – Data centres
  – Libraries & archives
• New roles???
• Achieve cultural change

Thank you. Questions?
[email protected]
Join the DCC Associates Network at www.dcc.ac.uk