www.eu-egee.org
INFSO-RI-508833 Enabling Grids for E-sciencE
EGEE - a worldwide Grid infrastructure
opportunities for the biomedical community Bob Jones
Technical Director EGEE CERN, Switzerland
MIE 2005 Healthgrid workshop, Geneva, 29 August 2005
The largest e-Infrastructure: EGEE
• Objectives
– consistent, robust and secure service grid infrastructure
– improving and maintaining the middleware
– attracting new resources and users from industry as well as science
• Structure
– 71 leading institutions in 27 countries, federated in regional Grids
– leveraging national and regional grid activities worldwide
– funded by the EU with ~32 M Euros for first 2 years starting 1st April 2004
EGEE Activities
• 48 % service activities (Grid Operations, Support and Management, Network Resource Provision)
• 24 % middleware re-engineering (Quality Assurance, Security, Network Services Development)
• 28 % networking (Management, Dissemination and Outreach, User Training and Education, Application Identification and Support, Policy and International Cooperation)
Emphasis in EGEE is on operating a production grid and supporting the end-users
EGEE Infrastructure
[Figure: site map of the EGEE infrastructure, in collaboration with LCG, NorduGrid and Grid3/OSG; status 25 July 2005]
Grid monitoring
• Operation of Production Service: real-time display of grid operations
• Accounting Information
– GIIS Monitor + Monitor Graphs
– Sites Functional Tests
– GOC Data Base
– Scheduled Downtimes
– Live Job Monitor
– GridIce (VO + Fabric View)
– Certificate Lifetime Monitor
Such tools help the operations staff to ensure the sites work continuously
EGEE infrastructure usage
[Figure: average job duration, January 2005 – June 2005, for the main VOs]
• Infrastructure is continuously used by many groups
EGEE pilot applications (I)
• High-Energy Physics (HEP)
– Provides computing infrastructure (LCG) for experiments at CERN in Geneva
– Challenging: thousands of processors world-wide; generating petabytes of data; ‘chaotic’ use of grid with individual user analysis (thousands of users interactively operating within experiment VOs)
[Figure: photo labelled "Mont Blanc (4810 m)" and "Downtown Geneva"]
• Biomedical Applications
– Similar computing and data storage requirements
– Major additional challenge: security & access to data in many formats
BioMed Overview
• Infrastructure
– ~2000 CPUs
– ~21 TB of disk
– in 12 countries
[Figure: map of resource centres (labels include PADOVA and BARI): 15 resource centres, 17 CEs, 16 SEs]
• >50 users in 7 countries working with 12 applications
• 18 research labs
• ~80,000 jobs launched since 04/2004
• ~10 CPU years
[Figure: number of BIOMED jobs per month, 2004-09 to 2005-03; vertical axis from 0 to 25,000]
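As a rough reading of the BioMed figures above, and assuming (this is not stated on the slide) that the ~10 CPU years were consumed by the ~80,000 jobs, the average job used roughly an hour of CPU time. A small Python sketch of that arithmetic:

```python
# Back-of-the-envelope estimate. Assumption: the ~10 CPU years quoted on the
# slide were consumed by the ~80,000 BIOMED jobs (not stated explicitly).
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def mean_cpu_hours_per_job(cpu_years: float, jobs: int) -> float:
    """Average CPU time per job, in hours."""
    return cpu_years * HOURS_PER_YEAR / jobs


avg_hours = mean_cpu_hours_per_job(cpu_years=10, jobs=80_000)
print(f"about {avg_hours:.1f} CPU hours per job")
```

If that assumption holds, the workload is dominated by many short jobs rather than a few long ones.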
Bioinformatics
• GPS@: Grid Protein Sequence Analysis
– Gridified version of NPSA web portal
  Offering protein databases and sequence analysis algorithms to bioinformaticians (3000 hits per day)
  Need for large databases and a large number of short jobs
– Objective: increased computing power
– Status: 9 bioinformatics software packages gridified
– Grid added value: open to a wider community with larger bioinformatic computations
• xmipp_MLrefine
– 3D structure analysis of macromolecules
  From (very noisy) electron microscopy images
  Maximum likelihood approach to find the optimal model
– Objective: study molecule interaction and chemical properties
– Status: algorithm being optimised and ported to 3D
– Grid added value: parallel computation of independent jobs on different resources
Medical imaging
• GATE
– Radiotherapy planning
  Improvement of precision by Monte Carlo simulation
  Processing of DICOM medical images
– Objective: very short computation time compatible with clinical practice
– Status: development and performance testing
– Grid Added Value: parallelisation reduces computing time
• CDSS
– Clinical Decision Support System
  Assembling knowledge databases
  Using image classification engines
– Objective: access to knowledge databases from hospitals
– Status: from development to deployment, some medical end users
– Grid Added Value: ubiquitous, managed access to distributed databases and engines
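The point that parallelisation reduces computing time for Monte Carlo workloads can be illustrated with a toy sketch: the simulation is split into independently seeded chunks that could each run as a separate job, and the partial results are merged at the end. This is not GATE code. The pi-estimation kernel, the function names and the thread pool are stand-ins chosen for illustration only.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_chunk(seed: int, n_samples: int) -> int:
    """One independent 'job': count random points that fall inside the
    unit quarter circle."""
    rng = random.Random(seed)  # private generator, so chunks share no state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_chunks: int, samples_per_chunk: int) -> float:
    """Run the chunks concurrently and merge their partial counts."""
    seeds = range(n_chunks)
    sizes = [samples_per_chunk] * n_chunks
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        total_hits = sum(pool.map(run_chunk, seeds, sizes))
    return 4.0 * total_hits / (n_chunks * samples_per_chunk)


pi_estimate = estimate_pi(n_chunks=4, samples_per_chunk=20_000)
print(f"pi is approximately {pi_estimate:.2f}")
```

In CPython the threads will not actually speed up this CPU-bound loop; the pool only stands in for independent jobs that a grid would place on different machines. Giving every chunk its own seed keeps the chunks statistically independent and the merged result reproducible.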
Medical imaging
• SiMRI3D
– 3D Magnetic Resonance Image Simulator
  MRI physics simulation, parallel implementation
  Very compute intensive
– Objective: offering an image simulator service to the research community
– Status: parallelised and now running on EGEE resources
– Grid Added Value: enables simulation of high-res images
• gPTM3D
– Interactive tool to segment and analyse medical images
  A non gridified version is distributed in several hospitals
  Need for very fast scheduling of interactive tasks
– Objectives: shorten computation time using the grid
  Interactive reconstruction time: < 2 min and scalable
– Status: development of the gridified version being finalized
– Grid Added Value: permanent availability of resources
Grid middleware
• The Grid relies on advanced software, called middleware, which interfaces between resources and the applications
– Finds convenient places for the application to be run
– Optimises use of resources and access to data
– Deals with authentication to the different sites that are used
– Runs the job & monitors progress
– Recovers from problems
– Transfers the result back to the scientist
[Figure: middleware releases: LCG-1, LCG-2 (Globus 2 based); gLite-1, gLite-2 (Web services based)]
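The middleware responsibilities listed on this slide can be sketched as a toy job lifecycle. The sketch is purely illustrative and does not use the LCG or gLite interfaces: the `Site` and `Job` classes, the ranking rule (authorised sites first filtered, then data-local sites preferred) and the retry policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Site:
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)      # data held locally
    trusted_vos: set = field(default_factory=set)   # VOs allowed to run here


@dataclass
class Job:
    vo: str
    dataset: str
    task: Callable[[Site], str]  # hypothetical payload, returns a result name


def rank_sites(job: Job, sites: List[Site]) -> List[Site]:
    """Find convenient places: keep sites that accept the job's VO and have
    free CPUs, then prefer sites that already hold the data."""
    usable = [s for s in sites if job.vo in s.trusted_vos and s.free_cpus > 0]
    return sorted(usable, key=lambda s: (job.dataset not in s.datasets, -s.free_cpus))


def submit(job: Job, sites: List[Site]) -> Optional[str]:
    """Run the job; on failure, recover by trying the next-best site.
    Returns the result handed back to the user, or None if every site failed."""
    for site in rank_sites(job, sites):
        try:
            return f"{site.name}: {job.task(site)}"
        except RuntimeError:
            continue  # recover from the problem by resubmitting elsewhere
    return None


def flaky_task(site: Site) -> str:
    """Toy payload that fails on one site to exercise the recovery path."""
    if site.name == "site-a":
        raise RuntimeError("worker node failure")
    return "result.dat"


sites = [
    Site("site-a", free_cpus=10, datasets={"images"}, trusted_vos={"biomed"}),
    Site("site-b", free_cpus=4, trusted_vos={"biomed"}),
    Site("site-c", free_cpus=50, datasets={"images"}, trusted_vos={"hep"}),
]
job = Job(vo="biomed", dataset="images", task=flaky_task)
outcome = submit(job, sites)
print(outcome)  # site-b: result.dat
```

Here `site-c` is excluded because it does not accept the job's VO, `site-a` is tried first because it holds the data, and the job is resubmitted to `site-b` after the simulated failure.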
Drug Discovery
• Grid-enabled drug discovery process for neglected diseases
– In silico docking: compute probability that potential drugs will dock with a target protein
– To speed up drug development and reduce its cost
• WISDOM (Wide In Silico Docking On Malaria)
– Drug Discovery Data Challenge
– 11 July – 19 August
– 46 million docked ligands produced (typical for computer clusters: 100 000 ligands)
– Equivalent to 80 CPU years
– 1000 computers in 15 countries used simultaneously
– Millions of files (adding up to a few TB of data)
– Never done on a large scale production infrastructure
– Never done for a neglected disease
• Next steps
– Sort through data to identify potential drugs
– Develop the next steps of the process (molecular dynamics)
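The WISDOM figures can be cross-checked with some simple arithmetic. The sketch below assumes that the 80 CPU years cover all 46 million dockings, that the 1000 computers were equally loaded, and that the data challenge took place in 2005 (the slide gives no year); none of these is stated explicitly.

```python
from datetime import date

SECONDS_PER_YEAR = 365 * 24 * 3600

cpu_years = 80
docked_ligands = 46_000_000
computers = 1000

# Average CPU cost of one docking, if the 80 CPU years cover all ligands.
cpu_seconds_per_docking = cpu_years * SECONDS_PER_YEAR / docked_ligands

# Ideal elapsed time if 1000 CPUs were kept fully busy the whole time.
ideal_wall_clock_days = cpu_years * 365 / computers

# Length of the data challenge window; the year 2005 is an assumption.
window_days = (date(2005, 8, 19) - date(2005, 7, 11)).days

print(f"about {cpu_seconds_per_docking:.0f} CPU seconds per docked ligand")
print(f"about {ideal_wall_clock_days:.1f} days of ideal wall-clock time")
print(f"{window_days} days between 11 July and 19 August")
```

Under these assumptions each docking costs a little under a minute of CPU time, and the ideal wall-clock time fits inside the 11 July to 19 August window, which is consistent with the slide's claim that the computers were used simultaneously.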
User information & support
• More than 140 training events across many countries
– >2000 people trained
– induction; application developer; advanced; retreats
– Material archive online with >200 presentations
• Public and technical websites constantly evolving to expand information available and keep it up to date
• 3 conferences organized
– ~300 @ Cork
– ~400 @ Den Haag
– ~450 @ Athens
• Pisa: 4th project conference, 24-28 October ’05
– Registration open: http://public.eu-egee.org/conferences/4th/
From Phase I to II
• From 1st EGEE EU Review in February 2005:
– “The reviewers found the overall performance of the project very good.”
– “… remarkable achievement to set up this consortium, to realize appropriate structures to provide the necessary leadership, and to cope with changing requirements.”
• EGEE I
– Large scale deployment of EGEE infrastructure to deliver production level Grid services with selected number of applications
• EGEE II
– Natural continuation of the project’s first phase
– Emphasis on providing an infrastructure for e-Science
  increased support for applications
  increased multidisciplinary Grid infrastructure
  more involvement from Industry
– Extending the Grid infrastructure world-wide
  increased international collaboration (Asia-Pacific is already a partner!)
Summary
• Grids are a powerful new tool for science
• Several applications are already benefiting from Grid technologies (biomedical is a good example)
• Europe is strong in the development of Grids also thanks to the success of EGEE and related projects
• EGEE offers:
– A mechanism for linking together the people, resources and data of your scientific community
– Continuous monitoring of the status of your Virtual Organisation
– A set of middleware for gridifying applications with documentation, training and support
– Regular forums for linking with grid experts, other communities and industry
• EGEE-II will further extend support for user communities and applications
Contacts
• EGEE Website
http://www.eu-egee.org
• How to join
http://public.eu-egee.org/join/
• EGEE Project Office