e-Science Technology/Middleware
(Grid, Cyberinfrastructure)
Gap Analysis
and OMII
SEAG Meeting
DTI June 20 2003
Geoffrey Fox, Indiana University
David Walker, Cardiff University
Note: for this presentation the terms e-Science
Technology/Middleware, Grid, and
Cyberinfrastructure are NOT distinguished
Features of Study
• Draft report distributed to TAG April 28; revised June 12 2003;
finished apart from final review and addition of references
  – A: Summary
  – B: Technology/Project/Worldwide Service Context
  – C: Gaps by Category
  – D: Proposed Action Plan for OMII
  – E: Appendix of UK activities of relevance
• Interviewed 80 people -- reasonably complete within the UK
• Extracted and categorized over 120 comments (gaps)
organized into about 35 technology areas with 5 Grid styles of
operation (as in P2P) and 6 functionalities (as in Information
Grid)
• Developed an action plan that is being used to guide the Core
e-Science effort (UK OMII, the Open Middleware Infrastructure
Initiative) to produce robust, useable e-Science (Grid)
infrastructure by 2006
• Interview part of the project ran from mid-February to early April,
with a worldwide review based on a literature survey in April/May
Features of Gap Analysis
• These are identified UK gaps in a worldwide context; our
prejudice is that the gaps are worldwide, but establishing this
was not the mandate of the study
• Examined requirements and services already
understood/developed for e-Science (reasonably
broad coverage) and e-Business, e-Government
and e-Services (inevitably rather spotty coverage)
• Gaps divided into four broad areas
  – Near-term Technical
  – Education and Support
  – Research (not well separated from Near-term Technical)
  – Perception and Organization
• Appendix lists over 60 significant UK services (perhaps
clustered together) and tools – in the context of a total of
some 150 worldwide Grid services
Categorization of Technical Gaps and Grid Services
[Figure: area diagram of Grid Services – split into Application Specific, Resource Specific and Generic layers above Compute Resources – with the numbered technology areas: 1: Architecture and Style; 2: Basic Technology Runtime and Hosting Environment; 3: Security; 4: Workflow; 5: Notification; 6: Meta-data; 7: Information; 8: Compute/File; 9: Other; 10: Portals/PSEs; 11: Network.]
Taxonomy of Grid Functionalities
• Compute/File Grid: Run multiple jobs with distributed compute and
data resources (Global “UNIX Shell”)
• Desktop Grid: “Internet Computing” and “Cycle Scavenging” with a
secure sandbox on large numbers of untrusted computers
• Information Grid: Grid service access to distributed information,
data and knowledge repositories
• Complexity or Hybrid Grid: Hybrid combination of Information and
Compute/File Grid emphasizing integration of experimental data,
filters and simulations
• Campus Grid: Grid supporting University community computing
• Enterprise Grid: Grid supporting a company’s enterprise
infrastructure
Note: the term Data Grid is not used consistently in the community,
so it is avoided
Hybrid Grid Computing Model
[Figure: a Hybrid Grid combining OGSA-DAI data access with Grid services for analysis, control and visualization, and with HPC simulation. This type of Grid integrates with parallel computing, e.g. HPC(x); distributed filters massage data for simulation.]
Taxonomy of Grid Operational Style
• Semantic Grid: Integration of Grid and Semantic Web meta-data and
ontology technologies
• Peer-to-peer Grid: Grid built with peer-to-peer mechanisms
• Lightweight Grid: Grid designed for rapid deployment and minimum
life-cycle support costs
• Collaboration Grid: Grid supporting collaborative tools like the
Access Grid, whiteboard and shared applications
• R3 or Autonomic Grid: Fault tolerant and self-healing Grid
(Robust Reliable Resilient = R3)
“Central” Architecture/Functionality/Style Gaps
• Substantial comments on “hosting environments”, OGSI and
“permeating principles”
  – Agreement on Web service model
[Figure: layered diagram of the “Central Services and Architecture” built on Web services (WS) – 1: Hosting Environment; 2: OGSI Web service Enhancements; 3: Permeating Principles and Policies; 4: Key OGSA Services – with 5: OGSA-compliant System Grid Services and 6: Domain-Specific (Application) Grid Services above, marked as the Specific Gaps and as “modular” services natural for distributed teams.]
Central Gaps
Permeating Principles and Policies:
• Meta-data rich, Message-linked Web Services as the permeating
paradigm
• “User” Component Model such as “Enterprise JavaBean (EJB)” or .NET
• Service Management framework including a possible Factory mechanism
• High level Invocation Framework describing how you interact with
system components (see the sketch after this list)
  – This could for example be used to allow the system to be built
from either W3C or GGF style (OGSI) Web Services and to protect
the user from changes in their specifications
• Security is a service, but the need for fine-grain selective
authorization encourages a Policy context that sets the rules for
each particular Grid
  – Currently OGSA supports policies for routing, security and
resource use
• The Grid Fabric, or set of resources, needs mechanisms to manage
it. This includes automatic recording of meta-data and
configuration of software
• Quality of service (QoS) for the Network; this implies performance
monitoring and bandwidth reservation services
  – Challenging, as end-to-end and not just backbone QoS is needed
• Messaging systems like MQSeries from IBM provide robustness
through asynchronous delivery, and can abstract destinations and
allow customization of content, such as converting between
different interface specifications. Messaging is built on transport
mechanisms which can be used to implement QoS and to virtualize
ports
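The Invocation Framework idea above can be made concrete with a minimal sketch. Everything named below (ServiceBinding, GridInvoker, the "/ogsi/" URL convention) is a hypothetical illustration, not part of the OGSI or W3C specifications: application code calls one interface, and the framework decides whether the target is an OGSI Grid service or a plain Web service.

// Hypothetical sketch of a high-level invocation framework that shields
// applications from OGSI vs plain-Web-service differences.
interface ServiceBinding {
    // Invoke a named operation with a serialized (e.g. SOAP) payload.
    String invoke(String operation, String payload) throws Exception;
}

class OgsiBinding implements ServiceBinding {
    private final String url;
    OgsiBinding(String url) { this.url = url; }
    public String invoke(String operation, String payload) {
        // A real binding would add OGSI extensions (handle resolution,
        // service data queries) before making the SOAP call.
        return "<ogsi-response op='" + operation + "' to='" + url + "'/>";
    }
}

class PlainWsBinding implements ServiceBinding {
    private final String url;
    PlainWsBinding(String url) { this.url = url; }
    public String invoke(String operation, String payload) {
        // A real binding would POST a plain SOAP envelope over HTTP.
        return "<ws-response op='" + operation + "' to='" + url + "'/>";
    }
}

public class GridInvoker {
    // The framework picks the binding; user code never changes if the
    // GGF or W3C specifications evolve, only the bindings do.
    static ServiceBinding bind(String serviceUrl) {
        return serviceUrl.contains("/ogsi/")      // illustrative convention
                ? new OgsiBinding(serviceUrl)
                : new PlainWsBinding(serviceUrl);
    }

    public static void main(String[] args) throws Exception {
        ServiceBinding job = bind("http://grid.example.org/ogsi/JobFactory");
        System.out.println(job.invoke("createService", "<job/>"));
    }
}

The design point is the one made above: changes in the W3C or GGF specifications are absorbed by the bindings, protecting user code.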
World Wide Grid Service Activities I
• This was implicit in the original report for TAG and is now being
made explicit, based on interviews plus a survey of major worldwide
activities
• Commercial activities especially those of IBM, Avaki, Platform,
Sun, Entropia and United Devices
• The GT2 and GT3 Globus Toolkits. Here we are effectively covering
not just the Globus team but also the major projects, such as the
NASA Information Power Grid, that have blazed the trail of
“productizing” Grids.
  – Note that we can “already” see GT3 (Grid Service) like
functionality from GT2 wrapped with the various (Java, Perl,
Python, CORBA) CoG kits, so GT2 capabilities can be classified
as Services
• Trillium (GriPhyN, iVDGL and PPDG) and NEESgrid, the major
NSF (DoE for PPDG) projects in the USA.
– Condor from the University of Wisconsin which is being integrated into Grid
services through the Trillium and NMI activities.
• The NSF Middleware Initiative (NMI) packaging a suite of Globus,
Condor and Internet2 software.
– This has overlaps with the VDT (Virtual Data Toolkit from GriPhyN)
World Wide Grid Service Activities II
• Unicore (GRIP), GridLab, the European Data Grid (EDG) and
LCG (LHC Computing Grid)
– There are many other (some 20) EU projects, but these have most
of the technology development
• Storage Resource Broker SRB-MCAT from SDSC
• The DoE Science Grid and related activities such as the Common
Component Architecture (CCA) project
• Examination of services from a collection of portal projects in the
US from Argonne, Indiana, Michigan, NCSA and Texas.
– This includes best-practice discussion on portals from the
Global Grid Forum.
• Review of contributions to the recent book Grid Computing:
Making the Global Infrastructure a Reality edited by Fran Berman,
Geoffrey Fox and Tony Hey, John Wiley & Sons, Chichester,
England, ISBN 0-470-85319-0, March 2003
– This includes other major projects like Cactus, NetSolve, Ninf
• Some 6 Core and other application-specific UK e-Science projects
Categories of Worldwide Grid Services
• Types of Grid
  – R3
  – Lightweight
  – P2P
  – Federation and Interoperability
• Core Infrastructure and Hosting Environment
  – Service Management
  – Component Model
  – Service wrapper/Invocation
  – Messaging
• Security Services
  – Certificate Authority
  – Authentication
  – Authorization
  – Policy
• Workflow Services and Programming Model
  – Composition/Development
  – Languages and Programming
  – Compiler
  – Enactment Engines (Runtime)
• Notification Services
• Metadata and Information Services
  – Basic including Registry
  – Semantically rich Services and meta-data
  – Information Aggregation (events)
  – Provenance
• Information Grid Services
  – OGSA-DAI/DAIT
  – Integration with compute resources
  – P2P and database models
• Compute/File Grid Services
  – Job Submission
  – Job Planning, Scheduling, Management
  – Access to Remote Files, Storage and Computers
  – Replica (cache) Management
  – Virtual Data
  – Parallel Computing
• Other services including
  – Grid Shell
  – Accounting
  – Fabric Management
  – Visualization, Data-mining and Computational Steering
  – Collaboration
• Portals and Problem Solving Environments
• Network Services
  – Performance
  – Reservation
  – Operations
Features of Worldwide Grid Services
• UK activities have a strong web service and Information Grid
emphasis
– Important compute/file activities as well (White Rose,
RealityGrid, UK part of EDG etc.)
• Non-UK activities are predominantly focused on Compute/File Grids
– Submit jobs in distributed UNIX shell (Gridshell) fashion
– Gather data from instruments (accelerator, satellite, medical
device); process in batch mode mapping between filesets
• Little emphasis on lightweight or R3 Grids but NSF in USA and
EDG have aimed at better support and software quality
– EDG has a useful “tension” between technology-focused and
application-focused working groups
– NMI and even GT3 have changed packaging and added a service
view, but have not changed the “underlying” architecture for
robustness
• Coordinated set of Portal activities in USA
• Little work on integrating parallel computing and the Grid,
although TeraGrid in the USA could change this
• Gaps are omissions/deficiencies in UK or worldwide Grid
services of importance to UK e-Science
Central Gaps:
Gaps in Grid Styles and Execution Environment
• Need for both robust (fault tolerant) and lightweight
(suitable for small groups) Grid styles identified
– Peer-to-peer style supports smaller decentralized virtual
organizations
• Noted opportunities for modern middleware ideas to
be used – lightweight, message-based
• Noted that Enterprise JavaBeans are not optimized for science,
which has high-volume dataflow
• Federated Grid Architecture natural for integration of
heterogeneous functionality, style and security
• Bioinformatics and other fields require integration of
Information and Compute/File Grids
[Figure: Overlapping Heterogeneous Dynamic Grid Islands – a dynamic lightweight peer-to-peer Collaboration/Training Grid links a teacher and students across an Enterprise Grid, an Information Grid, a Compute Grid (resources R1, R2) and a Campus Grid.]
[Figure: (a) Layered OGSA Grid – application services and core services all sit behind a single OGSA Interface. (b) Federated OGSA Grid – Grid-1 and Grid-2, each with its own OGSA or non-OGSA interface, are joined by OGSA Mediation.]
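As a minimal sketch of the mediation idea in figure (b), under invented assumptions (the interfaces, class names and routing rule below are all illustrative, not from the OGSA specification): the mediator presents one entry point and translates each request into the target island's native dialect.

// Invented sketch of OGSA mediation between two Grid islands.
interface IslandInterface {
    String submit(String jobDescription);   // each island's native interface
}

public class OgsaMediator {
    private final IslandInterface grid1;    // e.g. an OGSA-compliant island
    private final IslandInterface grid2;    // e.g. a non-OGSA legacy island

    OgsaMediator(IslandInterface g1, IslandInterface g2) {
        this.grid1 = g1;
        this.grid2 = g2;
    }

    // One OGSA-style entry point; the mediator routes the request and
    // translates it into the chosen island's job-description dialect.
    public String submit(String jobDescription, boolean dataHeavy) {
        IslandInterface target = dataHeavy ? grid2 : grid1;  // invented routing rule
        String dialect = (target == grid1) ? "[ogsa] " : "[legacy] ";
        return target.submit(dialect + jobDescription);
    }

    public static void main(String[] args) {
        OgsaMediator mediator = new OgsaMediator(
                job -> "Grid-1 accepted " + job,
                job -> "Grid-2 accepted " + job);
        System.out.println(mediator.submit("analyse dataset D", true));
    }
}

The contrast with figure (a) is that neither island has to adopt the single shared OGSA interface; heterogeneity is absorbed at the mediation point.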
Many Gaps in Generic Services
• Some gaps, like Workflow and Notification, are about making
production versions of current projects
– Appendix shows workflow from DAME, DiscoveryNet, EDG,
Geodise, ICENI, myGrid, Unicore plus Cardiff, NEReSC ….
• RGMA and Semantic Grid offer improved meta-data
and Information services compared to UDDI and MDS
(Globus)
– Need comprehensive federated Information service
• Security requires an architecture supporting dynamic fine-grain
authorization (see the sketch after this list)
• UK e-Science has pioneered Information Grids, but the gap is
continuation of OGSA-DAI, integration with other services, and
P2P decentralized models
• Functionality of Compute/File Grids quite advanced
but services probably not robust enough for LCG or
Campus Grids
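A hedged sketch of what dynamic fine-grain authorization could mean in practice; the names here (PolicyContext, the subject/service/operation triple) are invented for illustration. The point is that each operation on each service is checked against the Grid's policy context at call time, rather than via a static all-or-nothing mapping.

import java.util.HashSet;
import java.util.Set;

// Hypothetical per-Grid policy context for dynamic fine-grain
// authorization; a real system would evaluate richer policies
// (roles, delegation, time limits) rather than a flat grant set.
class PolicyContext {
    private final Set<String> grants = new HashSet<>();

    void grant(String subjectDN, String service, String operation) {
        grants.add(subjectDN + "|" + service + "|" + operation);
    }

    // Checked at call time, per operation, not once per login.
    boolean authorize(String subjectDN, String service, String operation) {
        return grants.contains(subjectDN + "|" + service + "|" + operation);
    }
}

public class AuthorizationDemo {
    public static void main(String[] args) {
        PolicyContext policy = new PolicyContext();
        policy.grant("CN=Alice", "OGSA-DAI", "query");

        System.out.println(policy.authorize("CN=Alice", "OGSA-DAI", "query"));  // true
        System.out.println(policy.authorize("CN=Alice", "OGSA-DAI", "update")); // false
    }
}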
Gaps in Other Grid services
• Portals and User Interfaces – noted gap that Grid Computing
Environment “best practice” of component-based user interfaces
matching component-based middleware is not being used
• Programming Models (using workflow runtime)
• Fabric Management (should be integrated with
central service management and Information
system), Computational Steering, Visualization,
Datamining, Accounting, Gridmake, Debugging,
Semantic Grid tools (consistent with Information
system), Collaboration, provenance
• Application-specific services
• Note: the new production central infrastructure can support
both research and production services of this type
Some Non-Technical Gaps (Sections 9 and 11)
• Some confusion as to “future” of Grid
software and how projects should evolve
to match evolution of Globus, OGSA etc.
• Correspondingly need special attention to
education (training) in rapidly changing
technologies
• Need dedicated testbeds and repositories
• Current e-Science projects are typically
aimed at “demonstrator” and not broadly
deployable “production” software
– This was the correct initial strategy, and it supports the new
focus for the next phase of core e-Science
ACTION PLAN
[Figure: Action Plan structure – an Architecture and Project Coordination team and a Technology Repository and Testbed Team supporting Distributed Sub-project Teams.]
Action Plan (UK-OMII) Structure
• Technology Repository and Testbed Team (MTT)
– Compliance testing
– Tracking, and training coordination with pro-active alerting on
technology status/directions
– Approximately 6 people
• Architecture and Project Coordination (SECT)
– Agile Software Engineering and Project Management
– Central technology architecture and development
– Work with Advisory board (eSSEAB ) meeting about once per month
initially
– 6-12 “professional” people in 1-2 physical sites (single leadership)
– Clear relationship to application requirements
– Some debate as to “where architecture is” (eSSEAB or SECT)
• Distributed Sub-project Teams
– “Independent” activities as now but aiming at deployable production
software with software engineering and deployment done through
SECT/MTT
• Set of focused workshops to refine key services and architecture
– e.g. service management, messaging, workflow, integration of
OGSA-DAI with Compute/File Grids (just a representative set)
Central UK-OMII Projects
• Develop Grid infrastructure supporting
  – Robust Reliable Resilient (R3) styles [Essential]
  – Lightweight [Desirable]
  – Peer-to-peer styles [Desirable]
• Could involve asynchronous messaging, federated security
(fine-grain authorization), an “e-ScienceBean” (sketched after
this slide), and invocation frameworks “virtualizing” service
component structure, allowing Grids to have either OGSI-compliant
or traditional Web services
• Integrate network monitoring/ reservation/ management
including end-to-end network operations
• Support critical policies like security, provenance
• Powerful Service management (research needed here, but it
appears to us clear how to do much better than current systems)
• Need to federate and/or interoperate a world of “Grid Islands”
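The study names an “e-ScienceBean” without specifying it. Purely as an illustrative assumption, the sketch below shows one way such a component contract could differ from an EJB: alongside container-managed lifecycle calls it exposes a bulk dataflow method, suited to the high-volume scientific streams noted earlier. All names are invented.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

// Purely illustrative contract for an "e-ScienceBean": unlike an EJB
// session bean's fine-grained remote calls, the component processes a
// high-volume data stream, suiting scientific dataflow pipelines.
interface EScienceBean {
    void init(Properties config);                 // container-managed setup
    void process(InputStream in, OutputStream out) throws IOException;
    void destroy();                               // container-managed teardown
}

class PassThroughFilter implements EScienceBean {
    public void init(Properties config) { }
    public void process(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);    // stream bytes straight through
        }
    }
    public void destroy() { }
}

public class EScienceBeanDemo {
    public static void main(String[] args) throws IOException {
        EScienceBean filter = new PassThroughFilter();
        filter.init(new Properties());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        filter.process(new ByteArrayInputStream("raw instrument data".getBytes()), out);
        filter.destroy();
        System.out.println(out);    // prints: raw instrument data
    }
}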
Positioning of UK OMII
• Support W3C/OASIS standards and if possible avoid forcing
users to build services that need more than standard Web
service tools
• OGSI supported as an option with invocation framework
(hosting environment) supplying this and other additional
functionality as needed
• Work with OGSA to produce interface standards that are
outside W3C/OASIS today (e.g. job submit)
– Try to move standards to W3C
– Is it correct or useful to suggest OGSA look at federation rather than/as
well as interoperability?
– Likely to need OGSA-OMII “standards” where GGF has yet to decide
• Decide on honest and defensible positioning with respect to
existing projects such as IBM, Avaki, Globus
• All Software registered by MTT/SECT should use OGSA/OMII
service model
Essential Services in Action Plan
(layer 4)
• Application-level Notification, as opposed to the low-level
notification needed in service management (see the JMS sketch
below)
• Workflow runtime supporting transactions and high
volume dataflow
– Different e-Science programming models/languages can
use same runtime and be developed independently
• Federated Distributed Information System
– From low level service registration through high-level
semantic metadata (separated or integrated)
– Support of service semantics most quoted “gap”
(Semantic Grid leadership important)
– Support P2P, Central (MDS style) and service-based
(SDE) metadata
– Here as elsewhere can collaborate with GT3, EDG …
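Application-level notification of this kind could, for example, be layered on a standard publish/subscribe API. The sketch below uses JMS (javax.jms), which messaging systems such as MQSeries implement; the JNDI names and the eScience/JobStatus topic are assumptions for illustration, not part of any e-Science specification.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import javax.naming.InitialContext;

public class NotificationSubscriber {
    public static void main(String[] args) throws Exception {
        // A JMS provider (e.g. MQSeries) is assumed to be configured;
        // the JNDI names below are illustrative only.
        InitialContext jndi = new InitialContext();
        ConnectionFactory factory =
                (ConnectionFactory) jndi.lookup("ConnectionFactory");
        Topic topic = (Topic) jndi.lookup("eScience/JobStatus");

        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(topic);

        // Asynchronous delivery: e.g. a workflow engine or portal
        // reacting to application-level job-state events.
        consumer.setMessageListener(new MessageListener() {
            public void onMessage(Message message) {
                try {
                    System.out.println("event: " + ((TextMessage) message).getText());
                } catch (JMSException e) {
                    e.printStackTrace();
                }
            }
        });
        connection.start();    // begin receiving notifications
    }
}

Asynchronous delivery through such a broker also gives the robustness benefits noted for MQSeries-style messaging earlier in this report.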
Specific Grid Services (layers 5, 6)
• Core Domain Grid Services cover the critical Services
for major Grid functionalities
– Information Grid: OGSA-DAIT
– Compute/File Grid: work with LCG, EDG (follow-on),
Trillium (USA) on robust infrastructure
• New central (R3) architecture affects strategy
• Include Campus Grid support
– Hybrid Grids (Complexity Grids) integrating computing
(filters, transformations) possibly on major parallel
computing facilities and data repository access for
Bioinformatics, Environmental (Earth) Science, Virtual
Observatories ……
• Other Services as identified in Gap Analysis with
distributed teams working on different services in
concert with central team for software engineering and
OGSA interfaces as appropriate