
LCG
LHC Computing Grid Project
Status of High Level Planning
LHCC, 5 July 2002
Les Robertson, Project Leader
CERN, IT Division
[email protected]

Fundamental Goal of the LCG
To help the experiments’ computing projects get the best, most reliable and accurate physics results from the data coming from the detectors.

Phase 1 – 2002-05: prepare and deploy the environment for LHC computing
Phase 2 – 2006-08: acquire, build and operate the LHC computing service

Phase 1 – High-level Goals
To prepare and deploy the environment for LHC computing:
- development/support for applications – libraries, tools, frameworks, data management (inc. persistency), …….. common components
- develop/acquire the software for managing a distributed computing system on the scale required for LHC – the local computing fabric, integration of the fabrics into a global grid
- put in place a pilot service –
  - “proof of concept” – the technology and the distributed analysis environment
  - platform for learning how to manage and use the system
  - provide a solid service for physics and computing data challenges
- produce a TDR describing the distributed LHC computing system for the first years of LHC running
- maintain opportunities for re-use of developments outside the LHC programme

Background & Environment
- the detailed requirements of the applications component of the project are being defined with the experiments – this work started early this year, but it will be another 18 months before the full scope is fully defined
- basic requirements for the computing facility come from the report of the LHC Computing Review (February 2001)
  - evolving due to review of trigger rates, event sizes, experience with program prototypes – and will continue to change as experience is gained with applications and the analysis model is developed
- technology is in continuous evolution –
  - driven by market forces (processors, storage, networking, ..)
  - and by government-funded research (grid middleware)
- we have to follow these developments – remain flexible, open to change

Background & Environment (ii)
- Regional Computing Centres –
  - impressive experience of providing distributed services to LHC experiments – must now learn how to collaborate much more closely to provide the integrated service promised by the Grid vision
  - established user communities – wider than LHC – many external constraints
- project funding is from many sources, each with its own constraints
The project is getting under way in an environment where –
- there is already a great deal of activity
- requirements are changing as understanding and experience develop
- some fundamental parts of the environment are evolving more or less independently of the project and LHC

Funding Sources
- Regional centres – providing resources for LHC experiments
  - in many cases the facility is shared between experiments (LHC and non-LHC) and maybe with other sciences
- Grid projects – suppliers and maintainers of middleware
- CERN personnel and materials – including special contributions from member and observer states
- Experiment resources –
  - people participating in common applications developments, data challenges, ..
  - computing resources provided through Regional Centres
- Industrial contributions

- The project has differing degrees of management control and influence
- Some of the funding has been provided because HEP & LHC are seen as computing ground-breakers for Grid technology
  -- so we must deliver for LHC and show the relevance for other sciences
  -- also must be sensitive to potential opportunities for non-HEP funding of Phase 2

The LHC Computing Grid Project – Organisation
[Organisation chart: LHCC – reviews; Common Computing RRB (funding agencies) – resources; Project Overview Board; Software and Computing Committee (SC2) – requirements, monitoring; Project Execution Board – reports.]

SC2 & PEB Roles
- SC2 includes the four experiments and Tier 1 Regional Centres
- SC2 identifies common solutions and sets requirements for the project
  - may use an RTAG – Requirements and Technical Assessment Group
  - limited scope, two-month lifetime with intermediate report
  - one member per experiment + experts
- PEB manages the implementation
  - organising projects, work packages
  - coordinating between the Regional Centres
  - collaborating with Grid projects
  - organising grid services
- SC2 approves the work plan, monitors progress

SC2 Monitors Progress of the Project
- Receives regular status reports from the PEB
- Written status report every 6 months
  - milestones, performance, resources
  - estimates of time and cost to complete
- Organises a peer review
  - about once a year
  - presentations by the different components of the project
  - review of documents
  - review of planning data

RTAG status
In the application software area:
  data persistency                       finished – 05apr02
  software support process               finished – 06may02
  mathematical libraries                 finished – 02may02
  detector geometry description          started
  Monte Carlo generators                 starting
  applications architectural blueprint   started
In the fabric area:
  mass storage requirements              finished – 03may02
In the Grid technology and deployment area:
  Grid technology use cases              finished – 07jun02
  Regional Centre categorisation         finished – 07jun02

Current status of RTAGs (and available reports) on www.cern.ch/lcg/sc2

Project Execution Organisation
Four areas – each with an area project manager:
- Applications
- Grid Technology
- Fabrics
- Grid Deployment

Applications Area
Area manager – Torre Wenaus
- Open weekly applications area meeting
- Software Architects Committee
  - process for taking LCG-wide software decisions
- Importance of RTAGs to define scope
- Common projects
  - everything that is not an experiment-specific component is a potential candidate for a common project
  - important changes are under way
    - new persistency strategy
    - evolution from Geant 3 towards Geant 4 and Fluka
  - good time to define common solutions, but there will be inevitable delays in agreeing requirements and organising common resources
  - long-term advantages in use of resources, support, maintenance

Applications Area – Key work packages
- Object persistency system
  - agreement on a hybrid solution (ROOT plus a Relational Database Management System) – a toy sketch follows this list
- Software process
- Common frameworks for simulation and analysis
- Proposal on event generation RTAG
- Architectural blueprint RTAG started – opening the way to RTAGs/work on analysis components?
- Grid middleware requirements defined
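
To make the hybrid approach concrete, here is a minimal sketch (an illustration only, not LCG code; the table layout, function names and file names are invented): bulk event payloads would be streamed to ROOT files, while a relational catalogue records only where each event lives.

    import sqlite3

    # Toy catalogue for a hybrid event store: event payloads live in ROOT
    # files; the relational database records only where each event is.
    db = sqlite3.connect("event_catalogue.db")
    db.execute("""CREATE TABLE IF NOT EXISTS events (
                      run INTEGER, event INTEGER,
                      file_name TEXT, tree_entry INTEGER,
                      PRIMARY KEY (run, event))""")

    def register(run, event, file_name, tree_entry):
        # Record that an event has been streamed to a ROOT file.
        db.execute("INSERT OR REPLACE INTO events VALUES (?, ?, ?, ?)",
                   (run, event, file_name, tree_entry))
        db.commit()

    def locate(run, event):
        # Return (file_name, tree_entry); reading the payload itself would
        # use ROOT I/O, which is outside this sketch.
        return db.execute("SELECT file_name, tree_entry FROM events "
                          "WHERE run = ? AND event = ?", (run, event)).fetchone()

    register(1234, 42, "run1234.root", 17)
    print(locate(1234, 42))    # -> ('run1234.root', 17)

The design point: bulk data stays in files suited to streaming object I/O, while the database carries only small, queryable metadata.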

Candidate RTAGs (from launch workshop)
Simulation tools
Event processing framework
Detector description & model
Distributed analysis interfaces
Conditions database
Distributed production systems
Data dictionary
Small-scale persistency
Interactive frameworks
Software testing
Statistical analysis
Software distribution
Detector & event visualization
OO language usage
Physics packages
LCG benchmarking suite
Framework services
Online notebooks
C++ class libraries

Completing the RTAGs – setting the requirements – will take about 2 years

What is a Grid?

The MONARC Multi-Tier Model (1999)
[Diagram: Tier 0 (CERN) – recording, reconstruction; Tier 1 centres – full service (e.g. IN2P3, FNAL, RAL); Tier 2 centres (Uni n, Lab a, Uni b, Lab c); department clusters; desktops.]
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
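
Purely as an illustration of the tier hierarchy (site lists abbreviated; the roles given for the lower tiers are a paraphrase, not from the slide):

    # Sketch of the MONARC multi-tier model. Site lists are abbreviated and
    # the lower-tier roles are paraphrased, not taken from the diagram.
    MONARC_TIERS = [
        ("Tier 0",     ["CERN"],                 "recording, reconstruction"),
        ("Tier 1",     ["IN2P3", "FNAL", "RAL"], "full service"),
        ("Tier 2",     ["Uni n", "Lab a", "Uni b", "Lab c"], "regional capacity"),
        ("Department", ["department clusters"],  "group-level computing"),
        ("Desktop",    ["end-user machines"],    "interactive access"),
    ]

    for name, sites, role in MONARC_TIERS:
        print(f"{name:<10} {role:<26} e.g. {', '.join(sites)}")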

Building a Grid – the virtual LHC Computing Centre
[Diagram: collaborating computer centres joined into a grid, serving virtual organisations such as the Alice VO and the CMS VO.]

Virtual Computing Centre
The user sees the image of a single cluster, and does not need to know
- where the data is
- where the processing capacity is
- how things are interconnected
- the details of the different hardware
and is not concerned by the local policies of the equipment owners and managers. (A toy sketch of this location transparency follows below.)
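
A minimal sketch of that transparency, assuming an invented replica catalogue and load table (none of the names or numbers come from the talk): the user names only a logical file, and the grid layer decides which replica, at which site, to use.

    # Toy replica catalogue: logical file name -> replicas at different sites.
    # All names and numbers are invented for illustration.
    replica_catalogue = {
        "lfn:run1234-esd": [
            ("CERN", "castor:/lhc/run1234-esd"),
            ("FNAL", "enstore:/lhc/run1234-esd"),
        ],
    }
    site_load = {"CERN": 0.9, "FNAL": 0.3}    # pretend monitoring data

    def open_logical(lfn):
        # The user never says where the data or the processing capacity is;
        # the grid layer picks the least-loaded site holding a replica.
        replicas = replica_catalogue[lfn]
        return min(replicas, key=lambda site_pfn: site_load[site_pfn[0]])

    print(open_logical("lfn:run1234-esd"))    # -> ('FNAL', 'enstore:/lhc/run1234-esd')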

Grid Technology Area
Area manager – Fabrizio Gagliardi
- Ensures that the appropriate middleware is available
- Dependency on deliverables supplied and maintained by the “Grid projects”
  - many R&D projects in Europe and the US with strong HEP participation/leadership
  - immature technology – evolving, parallel developments –
    - conflict between new functionality and stability
    - scope for divergence, especially trans-Atlantic
  - it is proving hard to get the first “production” grids going – from demonstration to service
  - can these projects provide long-term support and maintenance?
- HICB (High Energy & Nuclear Physics Intergrid Collaboration Board)
  - GLUE – recommendations for compatible US-European middleware
- LCG will have to make hard decisions on middleware towards the end of this year

Fabric Area
Area manager – Bernd Panzer
- Tier 1, 2 centre collaboration
  - develop/share experience on installing and operating a Grid
  - exchange information on planning and experience of large fabric management
  - look for areas for collaboration and cooperation
- Grid-Fabric integration middleware
- Technology assessment
  - likely evolution, cost estimates
- CERN Tier 0+1 centre
  - automated systems management package
  - evolution & operation of the CERN prototype – integrating the base LHC computing services into the LCG grid

Grid Deployment Area
- Grid Deployment Area manager – not yet appointed
- Job is to set up and operate a Global Grid Service
  - stable, reliable, manageable Grid for Data Challenges and regular production work
  - integrating computing fabrics at Regional Centres
  - learn how to provide support, maintenance, operation
- Grid Deployment Board – Mirco Mazzucato
  - Regional Centre senior management
  - Grid deployment standards and policies
    - authentication, authorisation, formal agreements, computing rules, sharing, reporting, accounting, .. (a toy authorisation sketch follows below)
  - first meeting in September
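
As one small, hypothetical example of the policy machinery listed above: Globus-era grids mapped certificate subject names to local accounts through a “grid-mapfile”. The sketch below imitates that mapping decision; the entries are invented.

    # Toy imitation of a Globus-style grid-mapfile: certificate subject
    # (distinguished name) -> local account. Entries are invented.
    GRID_MAP = {
        "/C=CH/O=CERN/CN=Some Physicist": "atlas001",
        "/C=UK/O=RAL/CN=Another User":    "cms002",
    }

    def authorise(subject_dn):
        # Authentication (proving ownership of the DN) happens elsewhere,
        # e.g. with X.509 certificates; this is only the mapping decision.
        account = GRID_MAP.get(subject_dn)
        if account is None:
            raise PermissionError("no mapping for " + subject_dn)
        return account

    print(authorise("/C=CH/O=CERN/CN=Some Physicist"))    # -> atlas001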

Grid Deployment Teams – the plan
[Diagram: suppliers’ integration teams (DataGrid middleware, Trillium – US grid middleware, common applications s/w) deliver tested releases to a certification, build & distribution team; an LCG infrastructure coordination & operation layer covers grid operation, user support (call centre, …) and fabric operation at regional centres A, B, …, X, Y.]

Status of Planning
- Launch workshop in March 2002 – established broad priorities
- Establishing the high-level goals, deliverables and milestones
- Beginning to build the PBS and WBS – as the staff builds up and the detailed requirements and possibilities emerge
- Detailed planning will take some time – ~end of 2002, beginning of 2003 – many things are not yet clear
  - Applications requirements – need further work by SC2 (RTAGs)
  - Grid Technology – negotiation of deliverables from Grid projects
  - Grid Deployment – agreements with Regional Centres (GDB)
- This is computing – success requires flexibility – getting the right balance between
  - reliable, tested, solid technology
  - exploiting leading-edge developments that give major benefits
  - early recognition of de facto standards

Proposed High Level Milestones

Tactics
- First data is in 2007 – LCG should focus on long-term goals –
  - the difficult problems of distributed data analysis: unpredictable (chaotic) usage patterns; masses of data; batch and interactive
  - reliable, stable, dependable services
- LCG must leverage current solutions and set realistic targets
  - short term (this year):
    - use current (classic) solutions for physics data challenges (event productions)
    - consolidate (stabilise, maintain) middleware – and see it used for physics
    - learn what a “production grid” really means by working with the Grid R&D projects
    - get the new data persistency prototype going
  - medium term (next year):
    - make a first release of the persistency system
    - set up a reliable global grid service with limited but well-understood functionality
      - not too many nodes, but on three continents
    - stabilise it
    - grow it to include all active Tier 2 centres, with support for some Tier 3 centres

Proposed Level 1 Milestones
M1.1  – June 03       First Global Grid Service (LCG-1) available
                      -- this milestone and M1.3 defined in detail by end 2002
M1.2  – June 03       Hybrid Event Store (Persistency Framework) available for general users
M1.3a – November 03   LCG-1 reliability and performance targets achieved
M1.3b – November 03   Distributed batch production using grid services
M1.4  – May 04        Distributed end-user interactive analysis
                      -- detailed definition of this milestone by November 03
M1.5  – December 04   “50% prototype” (LCG-3) available
                      -- detailed definition of this milestone by June 04
M1.6  – March 05      Full Persistency Framework
M1.7  – June 05       LHC Global Grid TDR

Proposed Level 1 Milestones – timeline
[Chart: the milestones above plotted by quarter, 2002–2005; applications milestones (Hybrid Event Store available for general users, distributed production using grid services, distributed end-user interactive analysis, Full Persistency Framework) and grid milestones (First Global Grid Service LCG-1 available, LCG-1 reliability and performance targets, “50% prototype” LCG-3 available, LHC Global Grid TDR).]

Major Risks
- Complexity of the project – Regional Centres, Grid projects, experiments, funding sources and funding motivation
- Grid technology
  - immaturity
  - number of development projects
  - US-Europe compatibility
- Phase 1 funding at CERN
  - about 60% of materials funding not yet identified
    - includes the investments to prepare the CERN Computer Centre for the giant computing fabrics needed in Phase 2
  - but the personnel requirements are largely fulfilled by special contributions

LCG and the LHCC
- LCG Phase 1 was approved by Council
  - deliverables are – common applications tools and components
                    – a TDR for the Phase 2 computing facility
- We do not have an LHCC-approved proposal as a starting point
- LHCC referees have been appointed
- During the rest of this year, while the detailed planning is being done, we need some discussion with the referees to:
  - ensure that the LHCC has the background and planning information it needs
  - agree on the Level 1 milestones to be tracked by the LHCC
  - agree on reporting style and frequency