The EGEE project: building a grid
infrastructure for Europe
Bob Jones
EGEE Technical Director
4th Annual Workshop on Linux Clusters
For Super Computing
24 October 2003
EGEE is proposed as a project funded by the
European Union under contract IST-2003-508833
LCSC2003
24 October 2003 - 1
EGEE: Goals
• Create a Europe-wide, production-quality Grid infrastructure
on top of present and future EU research network (RN) infrastructure
• Provide distributed European research communities with
“round-the-clock” access to major computing resources,
independent of geographic location
• Change of emphasis from grid development to grid deployment
• Support many application domains with one large-scale
infrastructure that will attract new resources over time
• Provide training and support for end-users
EGEE: Strategy
• Leverage current and planned national and regional Grid
programmes, building on
  • the results of existing projects such as DataGrid and others
  • the EU Research Network GEANT and work closely with relevant
    industrial Grid developers and NRENs
• Support Grid computing needs common to the different
communities
  • integrate the computing infrastructures and agree on common
    access policies
• Exploit International connections (US and AP)
• Provide interoperability with other major Grid initiatives such as the
US NSF Cyberinfrastructure, establishing a worldwide Grid
infrastructure
EGEE: Partners
• Leverage national resources in a more effective way for
broader European benefit
• 70 leading institutions in 27 countries organised into
regional federations
EGEE Activities
24% Joint Research
• JRA1: Middleware Engineering and Integration
• JRA2: Quality Assurance
• JRA3: Security
• JRA4: Network Services Development
48% Services
• SA1: Grid Operations, Support and Management
• SA2: Network Resource Provision
28% Networking
• NA1: Management
• NA2: Dissemination and Outreach
• NA3: User Training and Education
• NA4: Application Identification and Support
• NA5: Policy and International Cooperation
Emphasis in EGEE is on operating a production grid and supporting the end-users
Starts 1st April 2004 for 2 years (1st phase) with EU funding of ~32M€
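As a rough illustration of what the activity shares could mean in money terms, the sketch below applies the 24% / 48% / 28% split to the ~32M€ of EU funding. This is an assumption: the slide does not state that the percentages refer to the EU contribution (they may refer to effort or total cost), and the 32M€ figure is itself approximate. The names `EU_FUNDING_MEUR`, `SHARES_PERCENT` and `split_funding` are hypothetical.

```python
# Approximate EU funding for the first phase, in millions of euro (from the slide).
EU_FUNDING_MEUR = 32.0

# Activity shares as listed on the slide. ASSUMPTION: they are treated here as
# shares of the EU funding, which the slide does not explicitly say.
SHARES_PERCENT = {
    "Joint Research": 24,
    "Services": 48,
    "Networking": 28,
}


def split_funding(total_meur, shares_percent):
    """Return the amount per activity area, in M EUR, for the given shares."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("shares must add up to 100%")
    return {name: total_meur * pct / 100 for name, pct in shares_percent.items()}


if __name__ == "__main__":
    for name, amount in split_funding(EU_FUNDING_MEUR, SHARES_PERCENT).items():
        print(f"{name}: ~{amount:.2f} M EUR")
```

Under this assumption, Services would account for roughly half of the funding, which matches the stated emphasis on operating a production grid.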
EGEE Service Activity (I)
Create, operate, support and manage a production quality infrastructure
1 Operations Management Centre – OMC
• Coordinator for CICs and for ROCs
• Team to oversee operations – problems
resolved, performance targets, etc.
• Operations Advisory Group to advise on policy
issues, etc.
5 Core Infrastructure Centres – CIC
• Day-to-day operation management – implement
operational policies defined by OMC
• Monitor state, initiate corrective actions,
eventual 24x7 operation of grid infrastructure
• Provide resource and usage accounting,
security incident response coordination, ensure
recovery procedures
~11 Regional Operations Centres – ROC
• Provide front-line support to users and resource
centres
• Support new resource centres joining EGEE in
the regions
EGEE Service Activity (II)
Resource Centers
Month 1: 10 RCs
Month 15: 20 RCs

Region                  CPU nodes   Disk (TB)   CPU nodes    Disk (TB)
                        (Month 1)   (Month 1)   (Month 15)   (Month 15)
CERN                    900         140         1800         310
UK + Ireland            100         25          2200         300
France                  400         15          895          50
Italy                   553         60.6        679          67.2
North                   200         20          2000         50
South West              250         10          250          10
Germany + Switzerland   100         2           400          67
South East              146         7           322          14
Central Europe          385         15          730          32
Russia                  50          7           152          36
Totals                  3084        302         8768         936
RCs are not funded via the project
Expect to attract many more RCs
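The resource figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes each region's four numbers are, in order, CPU nodes and disk (TB) at month 1, then CPU nodes and disk (TB) at month 15; that column assignment is reconstructed from the flattened slide. It sums each column and compares it with the listed "Totals" row. The names `ROWS`, `STATED_TOTALS`, `column_sums` and `compare_with_stated` are illustrative only.

```python
# (region, CPU nodes month 1, disk TB month 1, CPU nodes month 15, disk TB month 15)
# ASSUMPTION: column order reconstructed from the flattened slide text.
ROWS = [
    ("CERN", 900, 140, 1800, 310),
    ("UK + Ireland", 100, 25, 2200, 300),
    ("France", 400, 15, 895, 50),
    ("Italy", 553, 60.6, 679, 67.2),
    ("North", 200, 20, 2000, 50),
    ("South West", 250, 10, 250, 10),
    ("Germany + Switzerland", 100, 2, 400, 67),
    ("South East", 146, 7, 322, 14),
    ("Central Europe", 385, 15, 730, 32),
    ("Russia", 50, 7, 152, 36),
]

COLUMNS = ("cpu_m1", "disk_tb_m1", "cpu_m15", "disk_tb_m15")

# The "Totals" row exactly as it appears in the transcript.
STATED_TOTALS = (3084, 302, 8768, 936)


def column_sums(rows):
    """Sum each numeric column over all regions."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 5))


def compare_with_stated(rows, stated):
    """Return (column, computed sum, stated total, matches) per column.

    Sums are rounded to whole units before comparison, since the totals
    row lists whole numbers while some disk figures are fractional.
    """
    return [
        (name, round(total, 1), listed, round(total) == listed)
        for name, total, listed in zip(COLUMNS, column_sums(rows), stated)
    ]


if __name__ == "__main__":
    for name, computed, listed, ok in compare_with_stated(ROWS, STATED_TOTALS):
        print(f"{name}: computed {computed}, listed {listed}, match={ok}")
```

With the figures as transcribed, three columns agree with the listed totals after rounding, while the month-15 CPU column sums to 9428 against a listed 8768. The transcript gives no indication of which figure is in error.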
The Northern Region ROC
• Joint operation between SARA (Netherlands) and Swedish
Infrastructure for Computing (SNIC)
  • Collaboration body formed in North European Grid Cluster (NEG)
• The SNIC part led by KTH PDC (Parallelldatorcentrum)
• Operation:
  • Negotiate service level agreements (SLA) with committed resource
    centres (RC) in the Nordic countries and Estonia
  • Deploy and support EGEE grid middleware - includes documentation
    and training
  • Monitor and support 24/7 operation of Grid resources
  • Ensure collaboration with other Grid initiatives in the region - Nordic
    Data Grid, Nordugrid, Swegrid, NOTUR, CSC,…
Per Öster <[email protected]
The Northern Region ROC
• Organisation and tasks (in short)
  • ROC manager
    • Coordinate and lead activity
    • Coordinate activity with the other 8 ROCs
    • Interact with the CIC
  • Deployment team
    • Validate software releases
    • Deploy software and aid the RCs in resolving implementation issues
    • Aid support team in middleware support issues
  • Support team
    • Monitor Grid resources
    • Support RCs in operation of Grid middleware
    • Operate call centre for user and RC support
Per Öster <[email protected]
EGEE Service Activity (III)
• Network Provision
  • Ensures EGEE access to network services provided by GEANT and
    the NRENs to link users, resources and operational management
  • Tasks
    o Definition of requirements
    o Specification of services
    o Definition of network access policies
    o Monitoring of service level provision
GEANT is the high-speed pan-European backbone
linking National Research and Educational Networks
(NRENs)
EGEE Middleware Activity (I)
• Hardening and re-engineering of existing
middleware functionality, leveraging the
experience of partners
• Activity concentrated in a few major centers
• Key services: Resource Access
  • Data Management (CERN)
  • Information Collection and Accounting (UK)
  • Resource Brokering (Italy)
  • Quality Assurance (France)
  • Grid Security (Northern Europe; includes Nordugrid partners)
  • Middleware Integration (CERN)
  • Middleware Testing (CERN)
EGEE Middleware Activity (II)
• Provide robust, supportable middleware components
  • Select, re-engineer, integrate identified Grid Services, evolve towards
    Services Oriented Architecture and multiple platforms
• All software available to other projects via Open Source licence
• Selection of Middleware based on requirements of
  • the applications (Bio & HEP) and the Operations
  • Support and evolution of the middleware components
• Evolve towards OGSI, define a re-engineering process, address multi-platform,
multiple implementations and interoperability issues
• Define defect handling processes and responsibilities
Quality Assurance
EGEE and LCG (I)
• Strong links already established between EDG and LCG and this
approach will continue in the scope of EGEE
• The core infrastructure of the LCG and EGEE grids will be operated as
a single service, and will grow out of LCG service
  • LCG includes US and Asia
  • EGEE includes other sciences
  • Substantial part of infrastructure common to both
• The ROCs provide local support for Resource Centres and users
  • Similar to LCG primary sites
  • Some ROCs and LCG primary sites will be merged
EGEE and LCG (II)
LCG Deployment Manager will be the EGEE Operations Manager
Production Middleware deployment in EGEE
[Figure: production middleware releases LCG-1 and LCG-2 (Globus 2 based), followed by EGEE-1 and EGEE-2 (OGSA based); contributing sources labelled EDG, VDT, LCG, ..., EGEE]
EGEE Implementation Plans
• Initial service will be based on the LCG infrastructure
(production service where most resources are allocated)
• Will need a certification test-bed system
  • For debugging and problem resolving of the production system
• Must deploy a development service
  • Runs the candidate next software release for production
  • Treated as a reliable facility (but with less support than the
    production service)
EGEE Networking Activity (I)
• Dissemination and outreach
  • Led by TERENA
• User training and induction
  • Led by Univ. Edinburgh (NeSC)
• Application identification and support
  • Two pilot application centers (for high
    energy physics and biomedical grids)
  • One more generic component dealing
    with longer term recruitment and support
    of other communities
• Policy and International cooperation
  • Establish Grid policy forum
  • Guide work with international standards
    bodies (GGF etc.)
  • Coordinate relations with other projects
    (EU and beyond)
Map points indicate federations
and are not geographically precise
EGEE Networking Activity (II)
• User training and Induction
  • Over 25 courses for more than 1000 people in the first 2 years
  • Induction & advanced user courses
  • Application developer training and middleware retreat
EGEE Networking Activity (III)
• EGEE scope: all-inclusive for academic applications
• Open to industrial and socio-economic world as well
• The major success criterion of EGEE: how many satisfied users from
how many different domains?
• 5000 users (3000 after year 2) from at least 5 disciplines
• 2 Pilot Application Domains: Physics & Bioinformatics
Application domains and timelines are for illustration only
EGEE and Industry
• Industrial participation encouraged both as potential end-users and IT
technology and service suppliers
• Normally through national and regional Grid EGEE federations
• EGEE will organise an Industry Forum to keep selected Industrial and
Commercial interested parties in close contact
• Services developed in first phase of 2 years (2004-5) may be tendered
to Industry in second phase (2006-7)
Summary
• EGEE represents the change from grid development to large-scale
deployment, building on the results of existing projects such as DataGrid,
NorduGrid, LCG and others
• The goal is to support many application domains with one large-scale
infrastructure that will attract new resources over time
• Training and support for end-users is an important activity of the project
• A path for providing a continuously available grid service is established
(EDG, LCG, EGEE)
• Grid middleware will be re-engineered to produce an OGSI-based
implementation addressing the needs of the applications
• The project will start in April 2004 - first phase will last 2 years.
Negotiations have been successfully completed with the European
Commission and planning for the transition to EGEE is underway