LCG
LHC Computing Grid Project
Status of High Level Planning
LHCC 5 July 2002
Les Robertson, Project Leader
CERN, IT Division
[email protected]
last update: 07/11/2015 04:07
les robertson - cern-it 1
LCG
Fundamental Goal of the LCG
To help the experiments’ computing projects
get the best, most reliable and accurate
physics results from the data coming from
the detectors
Phase 1 – 2002-05
prepare and deploy the environment for LHC
computing
Phase 2 – 2006-08
acquire, build and operate the LHC
computing service
LCG
Phase 1 - High-level Goals
To prepare and deploy the environment for LHC computing
  development/support for applications – libraries, tools, frameworks, data management (inc. persistency), …….. common components
  develop/acquire the software for managing a distributed computing system on the scale required for LHC – the local computing fabric, integration of the fabrics into a global grid
  put in place a pilot service –
    “proof of concept” – the technology and the distributed analysis environment
    platform for learning how to manage and use the system
    provide a solid service for physics and computing data challenges
  produce a TDR describing the distributed LHC computing system for the first years of LHC running
  maintain opportunities for re-use of developments outside the LHC programme
Background & Environment
LCG
o the detailed requirements of the applications component of the project are being defined with the experiments – this work started early this year, but it will be another 18 months before the full scope is fully defined
o basic requirements for the computing facility come from the report of the LHC Computing Review - February 2001
  o evolving due to review of trigger rates, event sizes, experience with program prototypes – and will continue to change as experience is gained with applications and the analysis model is developed
o technology is in continuous evolution –
  o driven by market forces (processors, storage, networking, ..)
  o and by government-funded research (grid middleware)
  we have to follow these developments - remain flexible, open to change
Background & Environment (ii)
LCG
o Regional Computing Centres –
  o impressive experience of providing distributed services to LHC experiments – must now learn how to collaborate much more closely to provide the integrated service promised by the Grid vision
  o established user communities – wider than LHC – many external constraints
o project funding is from many sources, each with its own constraints

The project is getting under way in an environment where –
o there is already a great deal of activity
o requirements are changing as understanding and experience develop
o some fundamental parts of the environment are evolving more or less independently of the project and LHC
Funding Sources
LCG
Regional centres – providing resources for LHC experiments
  in many cases the facility is shared between experiments (LHC and non-LHC) and maybe with other sciences
Grid projects – suppliers and maintainers of middleware
CERN personnel and materials - including special contributions from member and observer states
Industrial contributions
Experiment resources –
  people participating in common applications developments, data challenges, ..
  computing resources provided through Regional Centres

• The project has differing degrees of management control and influence
• Some of the funding has been provided because HEP & LHC are seen as computing ground-breakers for Grid technology
  -- so we must deliver for LHC and show the relevance for other sciences
  -- also must be sensitive to potential opportunities for non-HEP funding of Phase 2
The LHC Computing Grid Project
Organisation
LCG
[Organisation chart: the project reports to the LHCC, which reviews it, and to the Common Computing RRB (funding agencies), which provides resources; under the Project Overview Board sit the Software and Computing Committee (SC2) – requirements, monitoring – and the Project Execution Board.]
LCG
SC2 & PEB Roles
SC2 includes the four experiments, Tier 1 Regional Centres
SC2 identifies common solutions and sets requirements for the
project
may use an RTAG – Requirements and Technical Assessment
Group
limited scope, two-month lifetime with intermediate report
one member per experiment + experts
PEB manages the implementation
organising projects, work packages
coordinating between the Regional Centres
collaborating with Grid projects
organising grid services
SC2 approves the work plan, monitors progress
LCG
SC2 Monitors Progress of the Project
Receives regular status reports from the PEB
Written status report every 6 months
milestones, performance, resources
estimates of time and cost to complete
Organises a peer-review
about once a year
presentations by the different components of the
project
review of documents
review of planning data
LCG
RTAG status
in application software area
  data persistency                        finished – 05apr02
  software support process                finished – 06may02
  mathematical libraries                  finished – 02may02
  detector geometry description           started
  Monte Carlo generators                  starting
  applications architectural blueprint    started
in fabric area
  mass storage requirements               finished – 03may02
in Grid technology and deployment area
  Grid technology use cases               finished – 07jun02
  Regional Centre categorisation          finished – 07jun02
Current status of RTAGs (and available reports) on www.cern.ch/lcg/sc2
LCG
Project Execution Organisation
Four areas – each with area project manager
  Applications
  Grid Technology
  Fabrics
  Grid deployment
Applications Area
LCG
Area manager – Torre Wenaus
Open weekly applications area meeting
Software Architects Committee
process for taking LCG-wide software decisions
Importance of RTAGs to define scope
Common projects
everything that is not an experiment-specific component is a
potential candidate for a common project
important changes are under way
new persistency strategy
evolution from Geant 3 towards Geant 4 and Fluka
good time to define common solutions, but there will be
inevitable delays in agreeing requirements, organising common
resources
long term advantages in use of resources, support, maintenance
LCG
Applications Area
Key work packages
  Object persistency system
    agreement on hybrid solution (ROOT + Relational Database Management System)
  Software process
  Common frameworks for simulation and analysis
  Proposal on event generation RTAG
  Architectural blueprint RTAG started
    – opening the way to RTAGs/work on analysis components?
  Grid middleware requirements defined
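The hybrid persistency idea above – bulk object data streamed to files, with a relational catalogue tracking which file holds which event – can be sketched very roughly as follows. This is purely illustrative, not the project's actual design: `sqlite3` stands in for the RDBMS, `pickle` stands in for ROOT object streaming, and the class and method names are invented.

```python
import pickle
import sqlite3
import tempfile
from pathlib import Path

# Illustrative sketch only (not LCG code): objects go to flat files,
# while a relational catalogue records where each event lives.
class HybridEventStore:
    def __init__(self, directory):
        self.dir = Path(directory)
        self.db = sqlite3.connect(str(self.dir / "catalogue.db"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(event_id INTEGER PRIMARY KEY, filename TEXT)"
        )

    def write(self, event_id, event):
        # Stream the object to its own file, then register it in the catalogue.
        filename = f"event_{event_id}.pkl"
        (self.dir / filename).write_bytes(pickle.dumps(event))
        self.db.execute("INSERT INTO events VALUES (?, ?)", (event_id, filename))
        self.db.commit()

    def read(self, event_id):
        # Look up the file in the catalogue, then load the object back.
        row = self.db.execute(
            "SELECT filename FROM events WHERE event_id = ?", (event_id,)
        ).fetchone()
        return pickle.loads((self.dir / row[0]).read_bytes())

with tempfile.TemporaryDirectory() as d:
    store = HybridEventStore(d)
    store.write(1, {"tracks": 42})
    print(store.read(1))  # -> {'tracks': 42}
```

The point of the split is that the bulky payload stays in cheap sequential files while the catalogue stays queryable and transactional.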
Candidate RTAGs
LCG
(from launch workshop)
Simulation tools
Event processing framework
Detector description & model
Distributed analysis interfaces
Conditions database
Distributed production systems
Data dictionary
Small scale persistency
Interactive frameworks
Software testing
Statistical analysis
Software distribution
Detector & event visualization
OO language usage
Physics packages
LCG benchmarking suite
Framework services
Online notebooks
C++ class libraries
Completing the RTAGs - setting the requirements –
will take about 2 years
LCG
What is a Grid?
The MONARC Multi-Tier Model
(1999)
LCG
[Diagram: Tier 0 at CERN – recording, reconstruction; Tier 1 centres (e.g. IN2P3, FNAL, RAL) – full service; Tier 2 centres at universities and labs; then departments and desktops.]
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
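The tier hierarchy can be pictured as a simple tree; the sketch below is illustrative only (tier names and roles come from the slide, the data structure and helper are invented for this example).

```python
# Illustrative sketch: the MONARC tier hierarchy as nested dictionaries.
# Tier numbers, sites and roles are from the slide; the structure is invented.
MONARC_MODEL = {
    "tier": 0, "site": "CERN", "role": "recording, reconstruction",
    "children": [
        {"tier": 1, "site": site, "role": "full service",
         "children": [{"tier": 2, "site": "universities and labs"}]}
        for site in ("IN2P3", "FNAL", "RAL")
    ],
}

def sites_at(node, tier):
    """Collect all site names at a given tier by walking the tree."""
    found = [node["site"]] if node["tier"] == tier else []
    for child in node.get("children", []):
        found += sites_at(child, tier)
    return found

print(sites_at(MONARC_MODEL, 1))  # -> ['IN2P3', 'FNAL', 'RAL']
```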
LCG
Building a Grid
[Diagram: the collaborating computer centres are joined into a Grid – a virtual LHC Computing Centre – with experiment virtual organisations (Alice VO, CMS VO) spanning the centres.]
LCG
Virtual Computing Centre
The user –
  sees the image of a single cluster
  does not need to know
    - where the data is
    - where the processing capacity is
    - how things are interconnected
    - the details of the different hardware
  and is not concerned by the local policies of the equipment owners and managers
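The transparency described above can be sketched as a toy broker. Everything here is invented for illustration – the site catalogue, the function names, and the scheduling rule: the user submits work against a dataset name and never learns which site ran it.

```python
# Illustrative sketch of location transparency (not real grid middleware):
# a hypothetical catalogue maps sites to the datasets they host.
SITES = {
    "CERN": {"raw-2007"},
    "FNAL": {"reco-2007"},
    "RAL": {"reco-2007", "analysis-skim"},
}

def submit(dataset, job):
    """Run `job` at some site that hosts `dataset`; the caller never sees which."""
    for site, datasets in SITES.items():
        if dataset in datasets:
            return job(dataset)  # a real grid would dispatch remotely here
    raise LookupError(f"no site hosts {dataset}")

result = submit("reco-2007", lambda ds: f"histogram from {ds}")
print(result)  # -> histogram from reco-2007
```

The design point is that data location drives placement, but that decision stays inside the broker rather than with the user.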
LCG
Grid Technology Area
Area Manager – Fabrizio Gagliardi
Ensures that the appropriate middleware is available
Dependency on deliverables supplied and maintained by
the “Grid projects”
Many R&D projects in Europe and US with strong HEP
participation/leadership
Immature technology – evolving, parallel developments –
conflict between new functionality and stability
scope for divergence, especially trans-Atlantic
It is proving hard to get the first “production” grids going - from
demonstration to service
Can these projects provide long-term support and maintenance?
HICB (High Energy & Nuclear Physics Intergrid Collaboration Board)
GLUE - recommendations for compatible US-European middleware
LCG will have to make hard decisions on middleware towards
the end of this year
Fabric Area
LCG
Area Manager – Bernd Panzer
Tier 1,2 centre collaboration
develop/share experience on installing and operating a Grid
exchange information on planning and experience of large fabric
management
look for areas for collaboration and cooperation
Grid-Fabric integration middleware
Technology assessment
likely evolution, cost estimates
CERN Tier 0+1 centre
Automated systems management package
Evolution & operation of CERN prototype –
integrating the base LHC computing services in the LCG grid
LCG
Grid Deployment Area
Grid Deployment Area manager – not yet appointed
Job is to set up and operate a Global Grid Service
stable, reliable, manageable Grid for – Data Challenges and
regular production work
integrating computing fabrics at Regional Centres
learn how to provide support, maintenance, operation
Grid Deployment Board – Mirco Mazzucato
Regional Centre senior management
Grid deployment standards and policies
authentication, authorisation, formal agreements,
computing rules, sharing, reporting, accounting, ..
first meeting in September
LCG
Grid Deployment Teams – the plan
[Diagram: suppliers’ integration teams (DataGrid middleware, Trillium - US grid middleware, common applications s/w, …) provide tested releases to certification, build & distribution; LCG infrastructure coordination & operation spans grid operation and a user support call centre; fabric operation runs at each regional centre (A, B, …, X, Y).]
LCG
Status of Planning
Launch workshop in March 2002 – established broad priorities
Establishing the high-level goals, deliverables and milestones
Beginning to build the PBS and WBS – as the staff builds up and
the detailed requirements and possibilities emerge
Detailed planning will take some time - ~end of 2002, beginning
2003 - many things are not yet clear
Applications requirements – need further work by SC2 (RTAGs)
Grid Technology – negotiation of deliverables from Grid projects
Grid Deployment – agreements with Regional Centres (GDB)
This is computing – success requires flexibility – getting the right
balance between
reliable, tested, solid technology
exploiting leading edge developments that give major benefits
early recognition of de facto standards
LCG
Proposed High Level Milestones
Tactics
LCG
First data is in 2007 – LCG should focus on long-term goals –
the difficult problems of distributed data analysis: unpredictable (chaotic)
usage patterns; masses of data; batch and interactive
reliable, stable, dependable services
LCG must leverage current solutions, set realistic targets
short term (this year):
use current (classic) solutions for physics data challenges (event
productions)
consolidate (stabilise, maintain) middleware – and see it used for physics
learn what a “production grid” really means by working with the Grid R&D
projects
get the new data persistency prototype going
medium term (next year):
  make a first release of the persistency system
  set up a reliable global grid service with limited but well understood functionality
    not too many nodes, but in three continents
  stabilise it
  grow it to include all active Tier 2 centres, with support for some Tier 3 centres
LCG
Proposed Level 1 Milestones
M1.1 - June 03        First Global Grid Service (LCG-1) available
                      -- this milestone and M1.3 defined in detail by end 2002
M1.2 - June 03        Hybrid Event Store (Persistency Framework) available for general users
M1.3a - November 03   LCG-1 reliability and performance targets achieved
M1.3b - November 03   Distributed batch production using grid services
M1.4 - May 04         Distributed end-user interactive analysis
                      -- detailed definition of this milestone by November 03
M1.5 - December 04    “50% prototype” (LCG-3) available
                      -- detailed definition of this milestone by June 04
M1.6 - March 05       Full Persistency Framework
M1.7 - June 05        LHC Global Grid TDR
Proposed Level 1 Milestones
LCG
[Timeline chart, quarterly 2002–2005: application milestones (Hybrid Event Store available for general users, distributed production using grid services, distributed end-user interactive analysis, Full Persistency Framework) and grid milestones (First Global Grid Service (LCG-1) available, LCG-1 reliability and performance targets, “50% prototype” (LCG-3) available, LHC Global Grid TDR).]
Major Risks
LCG
Complexity of the project – Regional Centres, Grid projects,
experiments, funding sources and funding motivation
Grid technology
immaturity
number of development projects
US-Europe compatibility
Phase 1 funding at CERN
about 60% of materials funding not yet identified
includes the investments to prepare the CERN Computer
Centre for the giant computing fabrics needed in Phase 2
but the personnel requirements are largely fulfilled by
special contributions
LCG
LCG and the LHCC
LCG Phase 1 was approved by Council
deliverables are – common applications tools and components
– TDR for Phase 2 computing facility
We do not have an LHCC-approved proposal as a starting point
LHCC Referees have been appointed
During the rest of this year, while the detailed planning is being done, we need some discussion with the referees to
  ensure that the LHCC has the background and planning information it needs
  agree on the Level 1 milestones to be tracked by the LHCC
  agree on reporting style and frequency