Middleware Selection
GridPP13 – Durham – July’05
Robin Middleton

Introduction
• LCG Baseline Service Group Report
  – http://lcg.web.cern.ch/LCG/peb/bs/BSReport-v1.0.pdf
• Services:
  – Storage Element; Basic File Transfer; Reliable File Transfer; Catalogue Services; Data Management Tools; Compute Element; Workload Management; VO Agents; VO Membership Services; Database Services; Posix-like I/O; Application software Installation Tools; Job Monitoring; Reliable Messaging; Information system
• Experiment priorities for Services
• A quick look at OMII
• Conclusions
06 November 2015 GridPP Middleware Roadmap

Storage Element
• Characteristics:
  – Mass storage, disk pool, disk cache front-end
  – gridFTP service transfers in/out
  – local POSIX-like I/O
  – Authentication/authorisation & audit/accounting
  – SRM (version x.y) Interface
    • fine detail required evolves through service challenges
• Implementations:
  – CASTOR (SRM v1.1); dCache (SRM v1.1++); LCG-DPM (SRM v1.1 & v2.1); DRM (SRM v1.1)
• Goal: functionality in place by end 2005

Basic File Transfer
• Characteristics:
  – Critical
  – Absolutely reliable
    • fault tolerant against machine failures
    • load-balancing at sites
  – Transfer bandwidths at all sites need careful planning
• Implementations:
  – gridFTP - from GT2; eventually migrate to GT4 version when proven
  – srmCopy - layered on top of gridFTP and preferred interface

Reliable File Transfer
• Characteristics:
  – Layered on Basic Data Transfer
  – Request scheduling (relative VO priorities)
  – 3rd party transfers
  – Interaction with grid catalogues
• Implementations:
  – gLite FTS (used to define base interfaces & functionality for any other future implementations)
  – Globus RFT (currently does not layer on srmCopy, though transfer restarts are possible)
  – also FTD (AliEn), Don Quixote (Atlas), PhedEx (CMS) & LHCb system

Catalogue Services
• Characteristics:
  – Vary significantly between experiments; dependent information (bookkeeping, metadata, etc.)
  – Implements “collections” – datasets, fileblocks, …
  – More than data files – e.g. log files, …
  – Mappings : LFN, GUID, SURL, …
  – Hierarchical namespace
  – Access control
  – Bulk operations
  – Interfaces to POOL, WMS, Posix-like I/O
  – Centralised catalogue
• Implementations:
  – LCG File Catalogue – fulfils requirements
  – FireMan (gLite) – fulfils requirements
  – Globus RLS (does not implement all required interfaces)
• Experiment plans:
  – AliEn FC (Alice, LHCb ?)
  – Atlas based on LFC, Fireman & Globus RLS (in US)
  – CMS possibly LFC, Fireman, other ?

Data Management Tools
• Implementations:
  – LCG-2 provides complete set of tools for replica management, catalogue interaction and manipulation
  – gLite has similar toolset
  – POOL provides back-end catalogue manipulation
• Future:
  – wish to see convergence on single toolset, but exact composition depends on catalogue choices !
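The "Mappings : LFN, GUID, SURL" item on the Catalogue Services slide can be illustrated with a small sketch. This is a toy in-memory catalogue, not the API of LFC, FireMan or Globus RLS; the class name ToyCatalogue, its methods, and the example.org hosts are all hypothetical. It only shows the idea that a logical file name resolves to one GUID, which in turn resolves to one or more storage URLs (replicas).

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class ToyCatalogue:
    """Hypothetical in-memory catalogue: LFN -> GUID -> list of SURLs."""
    lfn_to_guid: dict = field(default_factory=dict)
    guid_to_surls: dict = field(default_factory=dict)

    def register(self, lfn: str, surl: str) -> str:
        """Record a replica for a logical file name and return the file's GUID."""
        # A new GUID is only stored the first time this LFN is seen.
        guid = self.lfn_to_guid.setdefault(lfn, str(uuid.uuid4()))
        self.guid_to_surls.setdefault(guid, []).append(surl)
        return guid

    def replicas(self, lfn: str) -> list:
        """Resolve a logical file name to the SURLs of all known replicas."""
        guid = self.lfn_to_guid[lfn]
        return list(self.guid_to_surls[guid])


# Illustrative names only; these paths and hosts are made up.
catalogue = ToyCatalogue()
lfn = "/grid/demo/run1/file.root"
first_guid = catalogue.register(lfn, "srm://se1.example.org/data/file.root")
second_guid = catalogue.register(lfn, "srm://se2.example.org/data/file.root")
print(catalogue.replicas(lfn))
```

Registering the same logical name twice adds a second replica under the same GUID, which is the property the replica management tools rely on.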

Compute Element
• Characteristics:
  – Job submission to local batch
  – Provide resource availability information
  – Availability of accounting information
  – Job status queries
  – Authentication, authorisation based on VOMS
• Implementations:
  – Globus gatekeeper (LCG-2 & OSG); ARC (NorduGrid)
  – Info : standardised on GLUE schema
  – Accounting : standardised on GGF accounting schema
  – Job status : R-GMA (LCG-2.4)
  – new CE (gLite – based on Condor-C) will be evaluated to replace Globus GRAM-based CEs

Workload Management
• Characteristics:
  – Express resource requirements
  – Service matches against resource availability & submits to best match
  – Interfaces to Data Location Interface, Storage Index
• Implementations:
  – LCG-2 Resource Broker
  – Condor-G
  – gLite RB (push & pull)
• Future:
  – expect WM systems to evolve and mature

VO Agents
• Characteristics:
  – perform activities for an experiment
    • job submission / monitoring
    • file transfer scheduling
    • database update scheduling
• Implementations:
  – currently jobs running in batch – not ideal
  – generic solution needed

VO Membership Service
• Characteristics:
  – register users
  – generate extended proxy certificates
  – handle authorisation for use of resources
• Implementations:
  – all providers participating in LCG have/will have a VOMS service
  – mechanisms for mapping users/groups and to provide access control vary depending on local policies & requirements

Database Services
• Characteristics:
  – Provide back-ends for file catalogues, metadata, conditions, etc.
  – Write access limited to experiment software managers
  – Reliable, scalable services based on reliable hardware
  – Required at Tier-0, Tier-1 and some Tier-2
  – Some replication to remote sites of centralised databases
• Implementations:
  – Oracle
  – MySQL (at some smaller sites)

Posix-like I/O
• Characteristics:
  – Support intermediate libraries : POOL, ROOT, …
  – Support direct from application code
  – Communicate with Grid File Catalogues (allows LFN / GUID access)
• Implementations:
  – LCG GFAL (Grid File Access Library)
  – gLiteIO (access control via Fireman catalogue)
  – xrootd
• Comment:
  – remote (from other sites) I/O not expected, files should be replicated locally

Application Software Installation Tools
• Characteristics:
  – VO specific installation of software
  – Validation of installed software
  – Write access limited to experiment software managers
  – Publish installed software in information system
• Implementations:
  – toolset in LCG-2
  – experiment specific solutions

Job Monitoring
• Characteristics:
  – monitor & trace submitted grid jobs
  – Instrumentation of job wrapper scripts
  – VO-level monitoring of resource usage of the VO
• Implementations:
  – partial solution in LCG-2 workload management system
  – LCG-2 publish Resource Broker info of every job in RGMA (CPU time, wall clock time, memory usage, etc.)
  – ARC Middleware (NorduGrid)

Reliable Messaging
• Characteristics:
  – messaging between applications, services & users
  – reliable, asynchronous
• Implementations:
  – some experiment specific solutions
  – a common trustworthy service would be of value; making use of existing open source or public domain tools

Information System
• Characteristics:
  – (not seen as a baseline service,
    but still crucial)
  – Info published through application interactions with other services
  – Schema must be adequate to describe services and their parameters
• Implementations:
  – GLUE schema exists and proposed as standard for LCG (update expected in Q4 2005); common between EGEE, OSG & NorduGrid

Baseline Services Report - Experiment Priorities

Service                              Alice  Atlas  CMS   LHCb
Storage Element                      A      A      A     A
Basic transfer tools                 A      A      A     A
Reliable file transfer service       A      A      A/B   A
Catalogue services                   B      B      B     B
Catalogue & Data management tools    C      C      C     C
Compute element                      A      A      A     A
Workload management                  B/C    A      A     C
VO agents                            A      A      A     A
VO Membership service                A      A      A     A
Database Services                    A      A      A     A
Posix-like I/O                       C      C      C     C
Application software installation    C      C      C     C
Job monitoring tools                 C      C      C     C
Reliable messaging service           C      C      C     C
Information system                   A      A      A     A

A: High priority, mandatory
B: Standard solutions required but expts could select different implementations
C: Desirable common solution, but not essential

OMII
• Evaluated within EGEE - reported at Athens meeting
  – based on web service standards; fully decentralised
  – might fulfil LCG requirements (superficially at least) in
    • user account management
    • job submission
    • file transfer
    • X.509 based security infrastructure
  – Users need accounts at all resource sites; sites queried for resources for each job; manual site selection; applications pre-installed at all participating sites
  – No services for data management (other than file transfer)
  – No support for catalogues, databases, mass storage, VO management
• A major undertaking to use in meeting LHC requirements and would need significant additions of other tools… not clear what advantages would result. Possible use to leverage local resources (e.g. shared with non-HEP), but don’t underestimate work in forging interoperability and ongoing maintenance.
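The experiment priority table from the Baseline Services report lends itself to a small worked example. The sketch below encodes the grades as plain Python data and lists the services that all four experiments rate as A. The helper name mandatory_for_all is hypothetical, and the Workload management row (B/C, A, A, C) is an assumption about which service the unlabelled row of grades belongs to.

```python
EXPERIMENTS = ("Alice", "Atlas", "CMS", "LHCb")

# Grades per service, in the order given by EXPERIMENTS.
# The "Workload management" row is an assumed reading of the table.
PRIORITIES = {
    "Storage Element": ("A", "A", "A", "A"),
    "Basic transfer tools": ("A", "A", "A", "A"),
    "Reliable file transfer service": ("A", "A", "A/B", "A"),
    "Catalogue services": ("B", "B", "B", "B"),
    "Catalogue & Data management tools": ("C", "C", "C", "C"),
    "Compute element": ("A", "A", "A", "A"),
    "Workload management": ("B/C", "A", "A", "C"),
    "VO agents": ("A", "A", "A", "A"),
    "VO Membership service": ("A", "A", "A", "A"),
    "Database Services": ("A", "A", "A", "A"),
    "Posix-like I/O": ("C", "C", "C", "C"),
    "Application software installation": ("C", "C", "C", "C"),
    "Job monitoring tools": ("C", "C", "C", "C"),
    "Reliable messaging service": ("C", "C", "C", "C"),
    "Information system": ("A", "A", "A", "A"),
}


def mandatory_for_all(table):
    """Return the services graded exactly 'A' by every experiment."""
    return [name for name, grades in table.items()
            if all(grade == "A" for grade in grades)]


common = mandatory_for_all(PRIORITIES)
for service in common:
    print(service)
```

Mixed grades such as A/B are deliberately not counted as A here, so a service only appears in the result when every experiment treats it as mandatory.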

Conclusion
• LCG brings together components from a number of sources and packages them in a common framework
• No single project (EDG, gLite, VDT, OMII, …) provides all the answers
• Baseline services report is an end-user view
  – Are there holes in the analysis ?
  – What about ?
    • Security infrastructure
    • Information system infrastructure
    • Grid monitoring & Operations toolsets
• LCG/EGEE releases are best placed to meet LHC needs in the UK
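As a footnote to the Workload Management slide, the step "matches against resource availability & submits to best match" can be sketched in a few lines. This is not how the LCG-2 Resource Broker, Condor-G or the gLite RB are implemented; the classes, field names, site names and the ranking rule (prefer sites that already hold the input dataset, then the most free CPUs) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical view of what a site publishes about itself."""
    name: str
    free_cpus: int
    max_wall_hours: int
    local_datasets: frozenset


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical resource requirements expressed by a job."""
    cpus: int
    wall_hours: int
    dataset: str


def matches(job, resource):
    """A resource qualifies if it satisfies every hard requirement."""
    return (resource.free_cpus >= job.cpus
            and resource.max_wall_hours >= job.wall_hours)


def rank(job, resource):
    """Assumed ranking: data locality first, then free capacity."""
    return (job.dataset in resource.local_datasets, resource.free_cpus)


def best_match(job, resources):
    """Return the highest-ranked qualifying resource, or None if none qualify."""
    candidates = [r for r in resources if matches(job, r)]
    return max(candidates, key=lambda r: rank(job, r), default=None)


sites = [
    Resource("site-a", 10, 48, frozenset({"ds1"})),
    Resource("site-b", 100, 48, frozenset()),
    Resource("site-c", 200, 12, frozenset({"ds1"})),
]
job = JobRequest(cpus=4, wall_hours=24, dataset="ds1")
chosen = best_match(job, sites)
print(chosen.name if chosen else "no match")
```

In this made-up example site-c is excluded by its wall-clock limit, and site-a is preferred over site-b because it already holds the dataset, in line with the comment on the Posix-like I/O slide that files should be replicated locally.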