eBank UK : linking research data, learning and scholarly communications. Dr Liz Lyon, UKOLN, University of Bath Dr Simon Coles, School of Chemistry,
eBank UK: linking research data, learning and scholarly communications

Dr Liz Lyon, UKOLN, University of Bath
Dr Simon Coles, School of Chemistry, University of Southampton
JISC Joint Programmes Meeting 2005

The wider context

Why create the e-Framework?
The JISC strategic context. Sarah Porter, 2005

JISC Information Environment architecture
Diagram labels:
• Content providers: JISC-funded, institutional, external
• Brokers, aggregators, catalogues, indexes
• Portals: subject, media-specific, institutional
• Learning management systems
• Shared infrastructure: authentication/authorisation (Athens), service registries, metadata schema registries, identifier services, institutional profiling services, OpenURL link servers, terminology services
• End-user desktop/browser
© Andy Powell (UKOLN, University of Bath), 2005. This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0.

Diagram labels:
• Presentation services: subject, media-specific, data, commercial portals
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Data analysis, transformation, mining, modelling
• Aggregator services: national, commercial
• Repositories: institutional, e-prints, subject, data, learning objects
• Research & e-Science workflows
• Learning & Teaching workflows
• Learning object creation, re-use
• Resource discovery, linking, embedding
• Searching, harvesting, embedding
• Harvesting metadata
• Deposit / self-archiving
• Validation
• Publication
The scholarly knowledge cycle. Liz Lyon, Ariadne, July 2003.
© Liz Lyon (UKOLN, University of Bath), 2005. This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0.
• Deposit / self-archiving
• Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
• Peer-reviewed publications: journals, conference proceedings
• Validation
• Quality assurance bodies

eScience: the data deluge
• Data Overload!
• EPSRC National Crystallography Service
• How do we disseminate?

[Diagram: the scholarly knowledge cycle repeated, with “Aggregator services: eBank UK”]
• Presentation services: subject, media-specific, data, commercial portals
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Data analysis, transformation, mining, modelling
• Aggregator services: eBank UK
• Repositories: institutional, e-prints, subject, data, learning objects
• Research & e-Science workflows
• Learning & Teaching workflows
• Learning object creation, re-use
• Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
• Peer-reviewed publications: journals, conference proceedings
• Quality assurance bodies
• Resource discovery, linking, embedding
• Searching, harvesting, embedding
• Harvesting metadata
• Deposit / self-archiving
• Validation
• Publication

The eBank UK Project

eBank UK: background
• JISC-funded: September 2003; Phase 2: February 2005
• UKOLN at the University of Bath (lead), University of Southampton, University of Manchester
• Exemplar: e-Science testbed ‘CombeChem’
  – Grid-enabled combinatorial chemistry
  – Crystallography, laser and surface chemistry examples
  – Development of an e-Lab using pervasive computing technology
  – National Crystallography Service
• Resource Discovery Network / PSIgate physical sciences portal
• http://www.ukoln.ac.uk/projects/ebank-uk/

The project team
• UKOLN: Michael Day, Monica Duke, Rachel Heery, Traugott Koch, Liz Lyon, + Andy Powell
• Southampton: Les Carr, Simon Coles, Jeremy Frey, Chris Gutteridge, Mike Hursthouse, Andrew Milstead
• Manchester: John Blunden-Ellis

Data Flow in eBank UK
Diagram labels:
• Create
• Deposit
• HTML deposition interface
• Submit
• Store/link
• Institutional repository: data files, metadata
• Present (HTML): local archive search interface
• Present (OAI-PMH)
• Harvest (XML)
• eBank aggregator
• Index and search
• Service provider interfaces, e.g. subject portal

eBank data model
Diagram labels:
• Dataset (several), linked from the crystal structure by dcterms:references
• Crystal structure (data holding)
• Crystal structure report (HTML)
• ebank_dc record (XML), with dc:identifier and dc:type=“CrystalStructure” and/or “Collection”
• Institutional repository
• Harvesting (OAI-PMH, oai_dc)
• Harvesting (OAI-PMH, ebank_dc)
• eBank UK aggregator service
• PSIgate portal
• Searching, linking and embedding
• Linking
• dcterms:isReferencedBy
• Eprint “jump-off” page (HTML)
• Eprint manifestation (e.g. PDF)
• Deposit
• ePrint UK aggregator service
• dc:identifier
Model input: Andy Powell, UKOLN.
• Harvesting (OAI-PMH, oai_dc)
• Eprint oai_dc record (XML), with dc:type=“Eprint” and/or “Text”
• Subject service
• Searching, linking and embedding

CombeChem: An EPSRC pilot project
Diagram labels: Simulation, Video, Diffractometer, Properties, Analysis, Structures Database, Properties e-Lab, X-Ray e-Lab, Grid Middleware

Crystallography data: The publication problem
[Figure: chemical structure diagrams; figures shown: 2,000,000, 25,000,000 and 300,000]

Crystallography workflow
Data categories: RAW DATA, DERIVED DATA, RESULTS DATA
• Initialisation: mount new sample, set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File)
• Validation: chemical & crystallographic checks
• Report: generate Crystal Structure Report

A data repository entry

Access to the underlying data
ecrystals.chem.soton.ac.uk

Harvesting: OAIster

Aggregating: search & discover

Linking data to publications

eBank embedded in a science portal

Current Developments: Deposition and validation tools
• Validation
• File format manipulation

Current Developments: Integration into crystallographic publishing practices
• Publishers’ seal of approval

Current Developments: Ontologies for aggregating, linking & discovery
• Transform the ‘list’ into an ‘ontology’
• Embed ontology into the deposition process
• Publish keywords in OAI
• Aggregators use keywords for linking with the broader literature
• Researchers use keyword ontology in search and discovery services
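The data model and ontology slides describe harvested Dublin Core records that carry dc:identifier, dc:type, dcterms:references and published keywords. As a minimal sketch of how an aggregator might read such a record, the Python below parses a hand-written XML fragment. The `record` wrapper element, the example.org URLs and the use of dc:subject for keywords are assumptions made for illustration; the slides do not give the ebank_dc schema itself.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core namespaces.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical record: the <record> wrapper and all URLs are made up.
SAMPLE = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:identifier>http://repository.example.org/structure/1</dc:identifier>
  <dc:type>CrystalStructure</dc:type>
  <dc:subject>crystallography</dc:subject>
  <dcterms:references>http://repository.example.org/dataset/1a</dcterms:references>
  <dcterms:references>http://repository.example.org/dataset/1b</dcterms:references>
</record>
"""


def parse_record(xml_text):
    """Return the linking-related fields of one Dublin Core record."""
    root = ET.fromstring(xml_text.strip())

    def values(tag):
        # Collect the text of every direct child with this qualified name.
        return [el.text.strip() for el in root.findall(tag, NS) if el.text]

    return {
        "identifier": values("dc:identifier"),
        "type": values("dc:type"),
        "keywords": values("dc:subject"),
        "references": values("dcterms:references"),
    }


if __name__ == "__main__":
    parsed = parse_record(SAMPLE)
    for field, found in parsed.items():
        print(field, found)
```

The `references` list holds the dataset links a service could follow from a crystal structure record, and `keywords` is the field an aggregator could match against keywords in the related literature.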
eBank: linking to learning
• Embedding in e-Learning processes
• Evaluating the pedagogical benefits
  – MChem course
  – Chemical informatics course

Issues and challenges

1. Issues: research data as content
• Sharing it!
• Data diversity
  – Homo- or heterogeneous
  – Raw and derived / processed
  – Sensitivity
  – Fast or slow growth in volume
• Repository evolution:
  – Likelihood to scale up (from bytes to petabytes)
  – Quality assurance (from the start)
  – Community-based standards development (“folksonomies”)
  – Build robust services

2. Issues: generic data models, metadata schema & terminology
• Validation against other schema
  – CCLRC Scientific Data Model Vs 2
• Complex digital objects and packaging options
  – METS
  – MPEG 21 DIDL
• Terminologies
  – Domain: crystallography
  – Inter-disciplinary, e.g. biomaterials
  – Metadata enhancement: subject keyword additions to datasets based on knowledge of keywords in related publications
  – Meaningful resource discovery?

3. Issues: linking and identifiers
• Links to individual datasets within an experiment
• Links to all datasets associated with an experiment or a data collection
• Links to derived eprints and published literature
• Context sensitive linking: find me
  – Datasets by this author / creator
  – Datasets related to this subject
  – Learning objects by this author / creator
  – Learning objects related to this subject
• Identifiers and persistence
  – “generic”
  – domain: International Chemical Identifier (InChI code)
• Resource discovery: Google Scholar?
• Provenance: authenticity, authority, integrity?

4. Issues: embedding and workflow
• Into the crystallographic publishing community
  – International Union of Crystallography
• Into the chemistry research workflow
  – SMART TEA Digital Lab Book e-synthesis Lab
  – Other analytical techniques and instrumentation
  – RAE procedures?
• Into the curriculum and e-Learning workflows
  – MChem course
  – Undergraduate Chemical Informatics courses

Next in Phase 2…
• Full embedding into the crystallographic research and publishing communities
• Chemistry workflow embedding
  – R4L: Repository for the Laboratory
  – Related sub-domains of chemistry: SPECTRa
• e-Learning embedding and pedagogic evaluation
  – Assess role in u/g chemical informatics courses
  – Introducing school children to e-research
• Enabling interdisciplinary research
  – Physical, mathematical, earth, environmental and engineering sciences

Thank you. Questions?
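The “find me” queries listed under context sensitive linking (datasets or learning objects by this creator, or related to this subject) can be pictured as lookups in a small index. The Python below is only a sketch of that idea, not the project's implementation: the `Record` fields, the `LinkIndex` class, and every identifier, creator name and subject term are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    identifier: str   # e.g. a persistent URL; values below are made up
    kind: str         # "dataset" or "learning object"
    creator: str
    subjects: tuple


class LinkIndex:
    """In-memory index answering 'find me X by creator / by subject'."""

    def __init__(self, records):
        self._by_creator = defaultdict(list)
        self._by_subject = defaultdict(list)
        for rec in records:
            self._by_creator[(rec.kind, rec.creator)].append(rec.identifier)
            for subject in rec.subjects:
                self._by_subject[(rec.kind, subject)].append(rec.identifier)

    def by_creator(self, kind, creator):
        # Return a copy so callers cannot mutate the index.
        return list(self._by_creator.get((kind, creator), []))

    def by_subject(self, kind, subject):
        return list(self._by_subject.get((kind, subject), []))


# Hypothetical sample data.
RECORDS = [
    Record("id:ds-1", "dataset", "A. Author", ("crystallography",)),
    Record("id:ds-2", "dataset", "B. Author", ("crystallography", "biomaterials")),
    Record("id:lo-1", "learning object", "A. Author", ("crystallography",)),
]
INDEX = LinkIndex(RECORDS)

if __name__ == "__main__":
    print(INDEX.by_creator("dataset", "A. Author"))
    print(INDEX.by_subject("dataset", "crystallography"))
    print(INDEX.by_subject("learning object", "crystallography"))
```

Keying each lookup on the pair (kind, value) keeps the four queries from the slide symmetrical: the same two methods serve datasets and learning objects.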