Status of the LCG Project
LCG
LHC Computing Grid Project
Creating a Global Virtual Computing
Centre for Particle Physics
ACAT’2002
27 June 2002
Les Robertson
IT Division, CERN
last update: 20/07/2015 18:34
les robertson - cern-it 1
Summary
LCG – The LHC Computing Grid Project
requirements, funding, creating a Grid
areas of work
grid technology
computing fabrics
deployment
operating a grid
Plan for the LCG Global Grid Service
A few remarks
Summary of Computing Capacity Required for all LHC Experiments in 2007
source: CERN/LHCC/2001-004 - Report of the LHC Computing Review - 20 February 2001
(ATLAS with 270Hz trigger)

                      ---------- CERN ----------   Regional   Grand
                      Tier 0    Tier 1    Total    Centres    Total
Processing (K SI95)    1,727       832    2,559      4,974    7,533
Disk (PB)                1.2       1.2      2.4        8.7     11.1
Magnetic tape (PB)      16.3       1.2     17.6       20.3     37.9
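
The totals in the capacity table can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the figures are transcribed from the table, while the dictionary layout, the function names and the 0.15 rounding tolerance are assumptions made for this example. It also computes the share of the grand total that sits at CERN, which is the number behind "small fraction of the analysis at CERN".

```python
# Capacity figures transcribed from the table (source: CERN/LHCC/2001-004).
# Units: processing in K SI95, disk and tape in PB.
CAPACITY = {
    "cern_tier0":  {"cpu": 1727, "disk": 1.2,  "tape": 16.3},
    "cern_tier1":  {"cpu": 832,  "disk": 1.2,  "tape": 1.2},
    "cern_total":  {"cpu": 2559, "disk": 2.4,  "tape": 17.6},
    "regional":    {"cpu": 4974, "disk": 8.7,  "tape": 20.3},
    "grand_total": {"cpu": 7533, "disk": 11.1, "tape": 37.9},
}

# Assumption: published figures are rounded, so allow a small mismatch.
TOLERANCE = 0.15


def column_sum(columns, resource):
    """Add up one resource over the named table columns."""
    return sum(CAPACITY[c][resource] for c in columns)


def is_consistent(parts, total, resource):
    """True if the parts add up to the stated total, within rounding."""
    return abs(column_sum(parts, resource) - CAPACITY[total][resource]) <= TOLERANCE


def cern_share(resource):
    """Fraction of the grand total that is installed at CERN."""
    return CAPACITY["cern_total"][resource] / CAPACITY["grand_total"][resource]


if __name__ == "__main__":
    for resource in ("cpu", "disk", "tape"):
        ok = (is_consistent(["cern_tier0", "cern_tier1"], "cern_total", resource)
              and is_consistent(["cern_total", "regional"], "grand_total", resource))
        print(f"{resource}: consistent={ok}, CERN share={cern_share(resource):.0%}")
```

On these figures CERN holds roughly a third of the processing capacity and under a quarter of the disk, so most of the capacity is at the Regional Centres.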
Funding dictates –
Worldwide distributed computing system
Small fraction of the analysis at CERN
Batch analysis – using 12-20 large regional centres
how to use the resources efficiently
establishing and maintaining a uniform physics environment
Data exchange and interactive analysis involving tens of
smaller regional centres, universities, labs
Summary - Project Goals
Goal –
Prepare and deploy the LHC computing environment
applications - tools, frameworks, environment, persistency
computing system → global grid service
cluster → automated fabric
collaborating computer centres → grid
CERN-centric analysis → global analysis environment
This is not another grid technology project –
it is a grid deployment project
Two Phases
The first phase of the project – 2002-2005
preparing the prototype computing environment, including
support for applications – libraries, tools, frameworks,
common developments, …
global grid computing service
funded by Regional Centres, CERN, special contributions to
CERN by member and observer states, middleware
developments by national and regional Grid projects
manpower OK
hardware at CERN - ~40% funded
Phase 2 – construction and operation of the initial LHC
Computing Service – 2005-2007
at CERN – missing funding of ~80M CHF
Funding
Funding agencies have little enthusiasm for investing more in
particle physics
HEP seen as a ground-breaker in computing
initiator of the Web
track record of exploiting leading edge computing
effective global collaborations
real need – for data as well as computation
one of the few application areas with real cross-border data needs
LHC in sync with
-- emergence of Grid technology
-- explosion of network bandwidth
We must deliver on Phase 1 for LHC and show the relevance for other sciences
Building a Grid
[Figure: a computing centre cluster. Labels: application servers, data cache, mass storage, WAN]
Cluster Fabric
autonomic computing
automated management: installation, configuration, maintenance, monitoring, error recovery, …
- reliability
- cost containment
The MONARC Multi-Tier Model
(1999)
[Figure: the MONARC multi-tier model. Labels:
Tier 0 - recording, reconstruction: CERN
Tier 1 – full service: IN2P3, FNAL, RAL
Tier2: Uni n, Lab a, Uni b, Lab c
Department
Desktop]
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
Building a Grid
[Figure: Collaborating Computer Centres]
Building a Grid
The virtual LHC Computing Centre
[Figure labels: Grid, Collaborating Computer Centres, Alice VO, CMS VO]
Virtual Computing Centre
The user
- sees the image of a single cluster
- does not need to know
    - where the data is
    - where the processing capacity is
    - how things are interconnected
    - the details of the different hardware
- and is not concerned by the conflicting policies of the equipment owners and managers
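
The "image of a single cluster" can be pictured as a thin facade in front of the sites. The following is a minimal sketch under stated assumptions, not a description of any LCG middleware: `Site`, `VirtualComputingCentre`, `submit` and the placement rule (the site that holds the dataset and has the most free CPUs) are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One collaborating computer centre (hypothetical model)."""
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)


class VirtualComputingCentre:
    """Presents many sites to the user as if they were a single cluster."""

    def __init__(self, sites):
        self._sites = list(sites)
        self._placements = {}  # job id -> site name, not exposed to the user
        self._next_id = 1

    def submit(self, dataset, cpus):
        """Run a job on `dataset`; the caller never names a site."""
        candidates = [s for s in self._sites
                      if dataset in s.datasets and s.free_cpus >= cpus]
        if not candidates:
            raise RuntimeError(f"no site can currently run a job on {dataset!r}")
        # Illustrative policy: pick the candidate with the most free CPUs.
        site = max(candidates, key=lambda s: s.free_cpus)
        site.free_cpus -= cpus
        job_id = self._next_id
        self._next_id += 1
        self._placements[job_id] = site.name
        return job_id
```

The user-facing call takes only a dataset name and a CPU count; where the data is and where the capacity is stay inside the facade.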
Project Implementation
Organisation
Four areas
Applications (see Matthias Kasemann’s presentation)
Grid Technology
Fabrics
Grid deployment
Grid Technology Area
Leveraging Grid R&D Projects
• significant R&D funding for Grid middleware
• risk of divergence
  and is that good or bad?
• global grids need standards
• useful grids need stability
• hard to do this in the current state of maturity
• will we recognise and be willing to migrate to the winning solutions?
[Figure labels: US projects; European projects; Many national, regional Grid projects -- GridPP(UK), INFN-grid(I), NorduGrid, Dutch Grid, …]
Grid Technology Area
Ensuring that the appropriate middleware is
available
Supplied and maintained by the “Grid projects”
It is proving hard to get the first “production” data
intensive grids going as user services
Can the grid projects provide long-term support and
maintenance?
Trade-off between new functionality and stability
The Trans-Atlantic Issue
Bridging the ATLANTIC is essential for the project
HICB – High Energy and Nuclear Physics Intergrid
Collaboration Board
GLUE – Grid Laboratory Uniform Environment
compatible middleware and infrastructure
Funded by DataTAG and iVDGL
Certificates - OK
Schemas – under way, working with the wider Globus
world, getting complicated – probably OK
Middleware components – not yet clear – but close
collaboration on
File replication
Job scheduling
Collaboration with Grid Projects
LCG must deploy a GLOBAL GRID
essential to have compatible middleware &
grid infrastructure
better – have identical middleware
We are banking on GLUE
But we have to make some choices towards the end of the
year
Services are about stability, support, maintenance
Can the R&D grid projects take commitments for long term
maintenance of their middleware?
Scope of Fabric Area
Tier 1,2 centre collaboration
Grid-Fabric integration middleware
(DataGrid WP4)
Automated systems management package
Technology assessment (PASTA III) started
CERN Tier 0+1 centre
Grid Deployment Area
The aim is to build
a general computing service
for a very large user population
of independently-minded scientists
using a large number of independently managed sites
This is NOT a collection of sites providing pre-defined services
it is the user’s job that defines the service
it is current research interests that define the workload
it is the workload that defines the data distribution
DEMAND - Unpredictable & Chaotic
But the SERVICE had better be - Available & Reliable
Grid Deployment – current status
Experiments can do (and are doing) their event production using
distributed resources with a variety of solutions
classic distributed production
– send jobs to specific sites, simple bookkeeping
some use of Globus, and some of the HEP Grid tools
other integrated solutions (ALIEN)
The hard problem for distributed computing
is data analysis – ESD (Event Summary Data) and AOD (Analysis Object Data)
chaotic workload
unpredictable data access patterns
this is where new Grid technology is needed
resource broker, replica management, ..
this is the problem that the LCG has to solve
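
To make the roles of a resource broker and replica management concrete, here is a toy sketch. It assumes a replica catalogue that maps each dataset to the sites holding a copy, and a broker that prefers a site already holding the data and otherwise copies the data to the site with the most free CPUs. The function names, the policy and the figures in the usage note are hypothetical; real middleware has to weigh far more (network cost, policies, queue state).

```python
def choose_site(dataset, cpus_needed, free_cpus, replicas):
    """Pick a site for one analysis job.

    free_cpus: dict mapping site name -> idle CPUs
    replicas:  dict mapping dataset -> set of sites holding a copy
               (a stand-in for a replica catalogue)
    Returns (site, needs_copy); site is None if the job cannot be placed.
    """
    able = {s for s, n in free_cpus.items() if n >= cpus_needed}
    if not able:
        return None, False
    known_copies = replicas.get(dataset, set())
    if not known_copies:
        return None, False  # unknown dataset: there is nothing to copy from
    holders = known_copies & able
    if holders:
        # Data-local placement: run where a replica already exists.
        return max(sorted(holders), key=free_cpus.get), False
    # No holder has capacity: run elsewhere and replicate the data there.
    return max(sorted(able), key=free_cpus.get), True


def run_job(dataset, cpus_needed, free_cpus, replicas):
    """Place a job and update the (toy) catalogue and capacity bookkeeping."""
    site, needs_copy = choose_site(dataset, cpus_needed, free_cpus, replicas)
    if site is None:
        return None
    if needs_copy:
        replicas[dataset].add(site)  # register the new replica
    free_cpus[site] -= cpus_needed
    return site
```

For example, with `free_cpus = {"cern": 10, "ral": 200, "fnal": 500}` and `replicas = {"aod-run1": {"cern", "ral"}}`, a 50-CPU job on `"aod-run1"` is placed at `"ral"` without moving data. A following 180-CPU job no longer fits at either holder, so it goes to `"fnal"` and a new replica is registered there. This is the sense in which the workload ends up defining the data distribution.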
Grid Operation
[Figure: grid operation. Components: User, Local site, Local user support, Local operation, Call Centre, Grid Operations Centre, Grid information service, Grid logging & bookkeeping, Virtual Organisation, Network Operations Centre. Flows: queries, monitoring & alarms, corrective actions, grid operations]
Grid Operation
We do not know how to do this
Probably nobody knows –
looks like network operation, but there are many
more variables to be watched and adjusted;
looks like multi-national commercial systems, but
we have no central ownership, control
A 24 hour service is needed
– round the clock and round the world
Setting up the
LHC Global Grid Service
First data is in 2007
LCG must learn from current solutions, leverage the tools coming
from the grid projects, show that grids are useful
but set realistic targets
short term (this year):
use current solutions for physics data challenges (event
productions)
consolidate (stabilise, maintain) middleware
learn what a “production grid” really means by working with
DataGrid and VDT
medium term (next year):
Set up a reliable global grid service – initially only a few larger
centres, but on three continents
Stabilise it
Several times the capacity of the CERN facility
and as easy to use
Having stabilised this base service –
showing that we can run a solid service for the
experiments
then – progressive evolution –
integrate all of the Regional Centre resources
provided for LHC
improve quality, reliability, predictability
integrate new middleware functionality – possibly once
per year
migrate to de facto standards as soon as they emerge
Final comments
It is not just about distributing computation,
it is also about managing distributed data (lots of it!)
and maintaining a single view of the environment
All these parallel developments, rapidly changing technology ..
may be good in the long term, but we must deploy
a global grid service next year
A dependable, reliable 24 X 7 service is essential
and not so easy to do with all these sites and all that data
Service Quality is the Key to Acceptance of Grids
Reliable OPERATION will be the factor that limits the size of
practical Grids
We are getting funding because of the relevance for other
sciences, engineering, business –
keeping things general, main-line must remain a high priority