Middleware Selection
GridPP13 – Durham – July’05
Robin Middleton
Introduction
• LCG Baseline Service Group Report
– http://lcg.web.cern.ch/LCG/peb/bs/BSReport-v1.0.pdf
• Services:
– Storage Element; Basic File Transfer; Reliable File Transfer;
Catalogue Services; Data Management Tools; Compute Element;
Workload Management; VO Agents; VO Membership Services;
Database Services; Posix-like I/O; Application software Installation
Tools; Job Monitoring; Reliable Messaging; Information system
• Experiment priorities for Services
• A quick look at OMII
• Conclusions
06 November 2015
GridPP Middleware Roadmap
Storage Element
• Characteristics:
– Mass storage, disk pool, disk cache front-end
– gridFTP service transfers in/out
– local POSIX-like I/O
– Authentication/authorisation & audit/accounting
– SRM (version x.y) Interface
• fine detail required evolves through service challenges
• Implementations:
– CASTOR (SRM v1.1); dCache (SRM v1.1++); LCG-DPM
(SRM v1.1 & v2.1); DRM (SRM v1.1)
• Goal: functionality in place by end 2005
Basic File Transfer
• Characteristics:
– Critical
– Absolutely reliable
• fault tolerant against machine failures
• load-balancing at sites
– Transfer bandwidths at all sites need careful planning
• Implementations:
– gridFTP - from GT2; eventually migrate to GT4 version
when proven
– srmCopy - layered on top of gridFTP and preferred
interface
Reliable File Transfer
• Characteristics:
– Layered on Basic Data Transfer
– Request scheduling (relative VO priorities)
– 3rd party transfers
– Interaction with grid catalogues
• Implementations:
– gLite FTS (used to define base interfaces & functionality for any
other future implementations)
– Globus RFT (currently does not layer on srmCopy, though transfer
restarts are possible)
– also FTD (AliEn), Don Quixote (Atlas), PhEDEx (CMS) & the LHCb
system
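
The characteristics above (layering on a basic transfer, scheduling by relative VO priority) can be sketched as a small priority queue with retries. This is a minimal illustration only, not the gLite FTS or Globus RFT interface: the class ToyTransferQueue, the basic_copy callable and the priority numbers are hypothetical, and a real service would add persistence, third-party transfers and catalogue interaction.

```python
import heapq
import itertools


class ToyTransferQueue:
    """Hypothetical reliable-transfer layer on top of a basic copy function."""

    def __init__(self, vo_priority, basic_copy, max_attempts=3):
        self._vo_priority = vo_priority    # assumed mapping: VO name -> rank (lower is served first)
        self._basic_copy = basic_copy      # stand-in for a gridFTP/srmCopy call
        self._max_attempts = max_attempts
        self._heap = []
        self._order = itertools.count()    # tie-breaker: FIFO order within one VO

    def submit(self, vo, source, dest):
        """Queue one transfer request for the given VO."""
        entry = (self._vo_priority[vo], next(self._order), vo, source, dest)
        heapq.heappush(self._heap, entry)

    def run(self):
        """Process all queued requests; return a list of (vo, source, dest, ok)."""
        results = []
        while self._heap:
            _, _, vo, source, dest = heapq.heappop(self._heap)
            ok = False
            for _ in range(self._max_attempts):
                try:
                    self._basic_copy(source, dest)
                    ok = True
                    break
                except OSError:
                    continue               # transient failure: retry the basic transfer
            results.append((vo, source, dest, ok))
        return results
```

The retry loop is what turns a basic transfer into a "reliable" one in this sketch; the heap ordering is one simple way to express relative VO priorities.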
Catalogue Services
• Characteristics:
– Vary significantly between experiments: experiment-dependent information
(bookkeeping, metadata, etc.)
– Implements “collections” – datasets, fileblocks, …
– More than data files – e.g. log files, …
– Mappings : LFN, GUID, SURL, …
– Hierarchical namespace
– Access control
– Bulk operations
– Interfaces to POOL, WMS, Posix-like I/O
– Centralised catalogue
• Implementations:
– LCG File Catalogue – fulfils requirements
– FireMan (gLite) – fulfils requirements
– Globus RLS (does not implement all required interfaces)
• Experiments plans:
– AliEn FC (Alice, LHCb ?)
– Atlas based on LFC, Fireman & Globus RLS (in US)
– CMS possibly LFC, Fireman, other ?
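
The LFN / GUID / SURL mappings and the hierarchical namespace listed among the catalogue characteristics can be illustrated with a toy in-memory model. This is a minimal sketch only, not the LFC, FireMan or RLS API: the class ToyFileCatalogue, its method names and the example.org storage URLs are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    guid: str                                   # globally unique identifier of the file
    surls: list = field(default_factory=list)  # one storage URL per physical replica


class ToyFileCatalogue:
    """Hypothetical in-memory catalogue: LFN -> GUID -> list of SURLs."""

    def __init__(self):
        self._lfn_to_guid = {}
        self._entries = {}

    def register(self, lfn, surl):
        """Register a new logical file with its first replica; return the GUID."""
        if lfn in self._lfn_to_guid:
            raise ValueError(f"LFN already registered: {lfn}")
        guid = str(uuid.uuid4())
        self._lfn_to_guid[lfn] = guid
        self._entries[guid] = CatalogueEntry(guid, [surl])
        return guid

    def add_replica(self, lfn, surl):
        """Record an additional physical copy of an existing logical file."""
        self._entries[self._lfn_to_guid[lfn]].surls.append(surl)

    def replicas(self, lfn):
        """Resolve an LFN to the SURLs of all known replicas."""
        return list(self._entries[self._lfn_to_guid[lfn]].surls)

    def list_dir(self, prefix):
        """Hierarchical namespace: LFNs under a directory-like prefix."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(l for l in self._lfn_to_guid if l.startswith(prefix))
```

The indirection through the GUID is the point of the sketch: a logical name can gain replicas without the name or identifier seen by the experiment changing.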
Data Management Tools
• Implementations:
– LCG-2 provides complete set of tools for replica
management, catalogue interaction and manipulation
– gLite has similar toolset
– POOL provides back-end catalogue manipulation
• Future:
– wish to see convergence on single toolset, but exact
composition depends on catalogue choices!
Compute Element
• Characteristics:
– Job submission to local batch
– Provide resource availability information
– Availability of accounting information
– Job status queries
– Authentication, authorisation based on VOMS
• Implementations:
– Globus gatekeeper (LCG-2 & OSG); ARC (NorduGrid)
– Info : standardised on GLUE schema
– Accounting : standardised on GGF accounting schema
– Job status : R-GMA (LCG-2.4)
– new CE (gLite – based on Condor-C) will be evaluated to replace
Globus GRAM-based CEs
Workload Management
• Characteristics:
– Express resource requirements
– Service matches against resource availability & submits
to best match
– Interfaces to Data Location Interface, Storage Index
• Implementations:
– LCG-2 Resource Broker
– Condor-G
– gLite RB (push & pull)
• Future:
– expect WM systems to evolve and mature
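
The matching step described under Characteristics (express requirements, match against resource availability, submit to the best match) can be sketched in a few lines. This is an illustration under stated assumptions, not the LCG-2 or gLite Resource Broker logic: the function name match, the dictionary keys and the ranking rule (most free CPUs) are all hypothetical and are not GLUE schema attributes.

```python
def match(job, resources):
    """Return the name of the best matching resource, or None if nothing fits.

    job and resources are plain dicts with hypothetical keys chosen for
    this sketch only.
    """
    candidates = [
        r for r in resources
        if job["vo"] in r["supported_vos"]              # VO must be authorised
        and r["free_cpus"] >= job["cpus"]               # enough free slots
        and r["max_wall_hours"] >= job["wall_hours"]    # queue limit long enough
    ]
    if not candidates:
        return None
    # Assumed ranking: prefer the most free CPUs; break ties alphabetically.
    best = min(candidates, key=lambda r: (-r["free_cpus"], r["name"]))
    return best["name"]
```

Separating the filter (hard requirements) from the rank (preference) mirrors the requirements/rank split commonly used in matchmaking; a real broker would also consult data location.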
VO Agents
• Characteristics
– perform activities for an experiment
• job submission / monitoring
• file transfer scheduling
• database update scheduling
• Implementations:
– currently jobs running in batch – not ideal
– generic solution needed
VO Membership Service
• Characteristics:
– register users
– generate extended proxy certificates
– handle authorisation for use of resources
• Implementations:
– all providers participating in LCG have/will have a
VOMS service
– mechanisms for mapping users/groups and for providing
access control vary depending on local policies &
requirements
Database Services
• Characteristics:
– Provide back-ends for file catalogues, metadata,
conditions, etc.
– Write access limited to experiment software managers
– Reliable, scalable services based on reliable hardware
– Required at Tier-0, Tier-1 and some Tier-2
– Some replication to remote sites of centralised
databases
• Implementations:
– Oracle
– MySQL (at some smaller sites)
Posix-like I/O
• Characteristics:
– Support intermediate libraries : POOL, ROOT, …
– Support direct access from application code
– Communicate with Grid File Catalogues (allows LFN / GUID
access)
• Implementations:
– LCG GFAL (Grid File Access Library)
– gLiteIO (access control via Fireman catalogue)
– xrootd
• Comment:
– remote (from other sites) I/O not expected, files should be
replicated locally
Application Software Installation Tools
• Characteristics:
– VO specific installation of software
– Validation of installed software
– Write access limited to experiment software managers
– Publish installed software in information system
• Implementations:
– toolset in LCG-2
– experiment specific solutions
Job Monitoring
• Characteristics:
– monitor & trace submitted grid jobs
– Instrumentation of job wrapper scripts
– VO-level monitoring of resource usage of the VO
• Implementations:
– partial solution in LCG-2 workload management system
– LCG-2 publishes Resource Broker info for every job in R-GMA (CPU time, wall clock time, memory usage, etc.)
– ARC Middleware (NorduGrid)
Reliable Messaging
• Characteristics:
– messaging between applications, services & users
– reliable, asynchronous
• Implementations:
– some experiment specific solutions – a common
trustworthy service would be of value; making use of
existing open source or public domain tools
Information System
• Characteristics:
– (not seen as a baseline service, but still crucial)
– Info published through application interactions with
other services
– Schema must be adequate to describe services and
their parameters
• Implementations:
– GLUE schema exists and proposed as standard for LCG
(update expected in Q4 2005); common between EGEE,
OSG & NorduGrid
Baseline Services Report - Experiment Priorities

Service                             Alice   Atlas   CMS     LHCb
Storage Element                     A       A       A       A
Basic transfer tools                A       A       A       A
Reliable file transfer service      A       A       A/B     A
Catalogue services                  B       B       B       B
Catalogue & Data management tools   C       C       C       C
Compute element                     A       A       A       A
Workload management                 B/C     A       A       C
VO agents                           A       A       A       A
VO Membership service               A       A       A       A
Database Services                   A       A       A       A
Posix-like I/O                      C       C       C       C
Application software installation   C       C       C       C
Job monitoring tools                C       C       C       C
Reliable messaging service          C       C       C       C
Information system                  A       A       A       A

A: High priority, mandatory
B: Standard solutions required but expts could select different implementations
C: Desirable common solution, but not essential
OMII
• Evaluated within EGEE - reported at Athens meeting
– based on web service standards; fully decentralised
– might fulfil LCG requirements (superficially at least) in
• user account management
• job submission
• file transfer
• X.509 based security infrastructure
– Users need accounts at all resource sites; sites queried for resources for each
job; manual site selection; applications pre-installed at all participating sites
– No services for data management (other than file transfer)
– No support for catalogues, databases, mass storage, VO management
• A major undertaking to use in meeting LHC requirements and would
need significant additions of other tools… not clear what advantages
would result. Possible use to leverage local resources (e.g. shared with
non-HEP), but don’t underestimate work in forging interoperability and
ongoing maintenance.
Conclusion
• LCG brings together components from a number of
sources and packages them in a common framework
• No single project (EDG, gLite, VDT, OMII, …) provides all
the answers
• Baseline services report is an end-user view
– Are there holes in the analysis?
– What about:
• Security infrastructure
• Information system infrastructure
• Grid monitoring & Operations toolsets
• LCG/EGEE releases are best placed to meet LHC needs in
the UK