NASA’s IPG - West University of Timișoara


Grid Computing
The 21st Century Paradigm for Service-Oriented Utility Computing
What is Grid?
Grid Projects & Applications
Grid Technologies
What is Grid?
A type of parallel and distributed system that enables the sharing,
selection, & aggregation of geographically distributed resources:
– Computers – PCs, workstations, clusters, supercomputers,
laptops, notebooks, mobile devices, PDA, etc;
– Software – e.g., ASPs renting expensive special purpose
applications on demand;
– Catalogued data and databases – e.g. transparent access to
human genome database;
– Special devices/instruments – e.g., radio telescope –
SETI@home searching for life in the galaxy.
– People/collaborators.
depending on their availability, capability, cost, and user QoS requirements,
for solving large-scale problems/applications,
thus enabling the creation of “virtual organizations” (VOs).
(1) Distributed
(2) High-throughput
(3) On-demand
(4) Data-intensive
(5) Collaborative
(6) Multimedia
Resources = assets,
capabilities, and knowledge
• Capabilities (e.g. application codes, analysis
• Compute Grids (PC cycles, commodity
clusters, HPC)
• Data Grids
• Experimental Instruments
• Knowledge Services
• Virtual Organisations
• Utility Services
Grid's main idea
• To treat CPU cycles and software like commodities.
• Enable the coordinated use of geographically
distributed resources – in the absence of central
control and existing trust relationships.
• Computing power is produced much like utilities such
as power and water are produced for consumers.
• Users will have access to “power” on demand
• “When the Network is as fast as the computer’s
internal links, the machine disintegrates across the
Net into a set of special purpose appliances” – Gilder
Technology Report June 2000
Computational Grids and
Electric Power Grids
Power Grid analogy
– Power producers: machines, software, networks, storage systems
– Power consumers: user applications
Applications draw power from the Grid the way appliances draw
electricity from the power utility.
– Seamless, High-performance, Ubiquitous, Dependable
Why the Computational Grid is like the Electric Power Grid
– Electric power is ubiquitous
– Don’t need to know the source of the power (transformer, generator) or
the power company that serves it
Why the Computational Grid is different from the Electric Power Grid
– Wider spectrum of performance
– Wider spectrum of services
– Access governed by more complicated issues: Security, Performance
Distributed Computing
• Concept has been around for two decades
• Basic idea: run a scheduler across systems
to run processes on the least-used systems first
– Maximize utilization
– Minimize turnaround time
• Have to load executables and input files to
selected resource
– Shared file system
– File transfers upon resource selection
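The basic idea above (run each process on the least-used system first) can be sketched in a few lines of Python. This is only an illustration, not the scheduler of any particular system: the `Host` class, the host names, and the fixed `cost_per_job` load increment are assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Host:
    load: float                       # current utilisation, 0.0 (idle) upwards
    name: str = field(compare=False)  # hosts are ordered by load only


def schedule(jobs, hosts, cost_per_job=0.25):
    """Assign each job to the currently least-used host.

    cost_per_job is an assumed, fixed load increment per job.
    Returns a list of (job, host_name) pairs.
    """
    heap = [Host(h.load, h.name) for h in hosts]  # copy; caller's hosts stay untouched
    heapq.heapify(heap)
    placement = []
    for job in jobs:
        host = heapq.heappop(heap)                # least-used host first
        placement.append((job, host.name))
        host.load += cost_per_job                 # account for the newly placed job
        heapq.heappush(heap, host)
    return placement


hosts = [Host(0.8, "ws1"), Host(0.0, "ws2"), Host(0.3, "ws3")]
placement = schedule(["job-a", "job-b", "job-c"], hosts)
print(placement)
```

A real scheduler would also stage executables and input files to the selected host, through a shared file system or a file transfer, as the last bullet notes.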
Examples of Distributed Computing
• Workstation farms etc
– Generally share file system
• SETI@home project, Entropia, etc.
– Only one source code; copies correct
binary code and input data to each system
• Napster, Gnutella: file/data sharing
• NetSolve
– Runs numerical kernel on any of multiple
independent systems, much like a Grid
Internet Computing Projects
Area of interest
[email protected], eOn, Distributed
Particle Accelerator Design, Analytical Spectroscopy
Research, Evolutionary Research
Life Science
[email protected], [email protected], [email protected],
Folderol, Distributed Folding, Find-a-Drug, Drug Design
Optimization Lab, Community TSC
Cryptography, ECCp-109
[email protected], ZetaGrid, ECMNET, GRISK,
Proth/Wilson/Wieferich/Mersenne Prime Search,
Factorizations of Cyclotomic Numbers
Why cluster and grid computing
• Clusters and grids increasingly interesting
more workstations
higher performance per workstation
faster interconnecting networks
price/performance competitive with MPP
enormous unused capacity
cyclic availability
Differences parallel/clusters/grids
• Clusters are inherently inhomogeneous
– intrinsic differences in performance, memory,
– dynamically changing “background load”
– ownership of nodes
• Grids add
– differences in administration
– disjoint file systems
– security etc.
P2P, cluster, Internet computing
vs. grid computing
• Peer-to-peer networks (e.g. Kazaa) fall within the
definition of grid computing (the resource shared
is the storage capacity of each node)
P2P Working Group is part of the Global Grid Forum
• A cluster is a resource that can be shared; a grid is
a cluster of clusters
• Internet computing: a VO is assembled for a
particular project and disbanded once the project
is complete; the shared resource is the
Internet-connected desktop
Why go Grid?
Hot subject
Try it, experience it to learn the potential
Will enable true ubiquitous computing in future
Today, proven in some areas: intraGrids
But still long way to World Wide Grid
State-of-the-art techniques and tools are difficult
Short term goals? Use another technology
Does your system have Grid characteristics?
– Distributed users, large scale and heterogeneous
resources, across domains
• Grids enable much more than apps running on multiple
– virtual operating system: provides global workspace/address space
via a single login
– automatically manages files, data, accounts, and security issues
– connects other resources (archival data facilities, instruments,
devices) and people (collaborative environments)
• Inevitable (at least in HPC):
– leverages computational power of all available systems
– manages resources as a single system--easier for users
– provides most flexible resource selection and management, load
– researchers’ desire to solve bigger problems will always outpace
performance increases of single systems; just as multiple processors
are needed, ‘multiple multiprocessors’ will be deemed so
• Resources have different functions, but
multiple classes of resources are necessary for
most interesting problems.
• Power of any single resource is small
compared to aggregations of resources
• Network connectivity is increasing rapidly in
bandwidth and availability
• Large problems require teamwork and
What do users want ?
• Grid Consumers
– Execute jobs for solving varying problem size and
– Benefit by selecting and aggregating resources wisely
– Tradeoff timeframe and cost
• Grid Providers
– Contribute (“idle”) resource for executing consumer
– Benefit by maximizing resource utilisation
– Tradeoff local requirements & market opportunity
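The consumer's trade-off between timeframe and cost can be illustrated with a small selection routine. This is a sketch under stated assumptions, not an algorithm taken from the slides: the `Resource` fields, the prices, and the greedy cheapest-first heuristic are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    jobs_per_hour: float   # throughput the provider offers (assumed known)
    price_per_job: float   # provider's charge, in arbitrary units


def select_resources(resources, n_jobs, deadline_hours):
    """Greedy heuristic: add the cheapest resources until the deadline is feasible.

    Returns the chosen resources, or None if all of them together are too slow.
    """
    needed_rate = n_jobs / deadline_hours
    chosen, rate = [], 0.0
    for r in sorted(resources, key=lambda r: r.price_per_job):
        if rate >= needed_rate:
            break                      # deadline already achievable, stop spending
        chosen.append(r)
        rate += r.jobs_per_hour
    return chosen if rate >= needed_rate else None


pool = [
    Resource("fast", jobs_per_hour=50, price_per_job=5.0),
    Resource("cheap", jobs_per_hour=10, price_per_job=1.0),
    Resource("mid", jobs_per_hour=20, price_per_job=2.0),
]
relaxed = select_resources(pool, n_jobs=100, deadline_hours=4)
tight = select_resources(pool, n_jobs=100, deadline_hours=2)
print([r.name for r in relaxed], [r.name for r in tight])
```

A relaxed deadline is met with the cheaper resources alone; tightening the deadline forces the consumer to buy the expensive one as well.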
Grid projects & applications
Older projects
1996: Ian Foster, Steven Tuecke, and Carl Kesselman of ANL (USA): infrastructure for
interconnecting the most important high-performance computing centres, the I-WAY project.
1998: the international community of grid users and grid standards, the Global Grid Forum.
RO: 2002 and 2003: several grid projects through the InfoSoc programme.
EU-FP5: 2000/2001: EuroGrid, DataGrid, and Damien (infrastructure, middleware: Geant; Unicore).
EU-FP5: 2001/2002: middleware, applications; GridLab – testbed platform, CrossGrid –
grid interoperability, simulations, EGSO – astrophysics, GRIA – industrial, DataTag – transatlantic
platform, GRIP – interoperability.
EU-FP5: 2002/2003: applications; Avo – astrophysics, FlowGrid – simulation, OpenMolGrid – molecular,
GRACE – search, COG – ontologies, MOSES – semantic web, BioGrid – biological, GEMSS –
medical, SeLeNe – e-learning, MamoGrid – medical, EGEE – security, NOMAD – service
discovery (approx. 20 projects).
EU-FP6: 2003/2004, ‘Grid for complex problem solving’.
National projects: UK – e-Science (80 projects), including GridPP, Comb-e-Chem, AstroGrid,
MyGrid, GEODISE, DAME, DiscoveryNet, RealityGrid, OGSA-DA; France: INRIA runs a
series of projects: Algorille – resource management on the grid, Apache – multi-criteria scheduling,
Grand – desktop grid, Oasis – trusted grid, Paris – numerical simulations, Remap – network servers,
Sardies – monitoring, MPICH-V – MPI for the grid. In other countries: Japan – Grid Data Farm, ITBL;
Netherlands – VLAM, DutchGrid; Italy – INFN Grid; Ireland – EireGrid; Poland – PIONIERGrid;
Hungary – DemoGrid, JiniGrid; Australia – SimGrid, Economy Grid, WWG.
USA: NASA Information Power Grid, DOE Science Grid, NSF National Virtual Observatory, NSF
GriPhyN, DOE Particle Physics Data Grid, NSF DTF TeraGrid, DOE ASCI DISCOM Grid, DOE
Earth Systems Grid, DOE FusionGrid, NEESGrid, NIH BIRN, NSF iVDGL.
IBM has built a Grid Toolkit based on Globus, and Sun a One-Grid-Engine.
Maturation of Grid Computing
• Research focus moving from building of basic infrastructure and
application demonstrations to
– Middleware
– Usable production environments
– Application performance
– Scalability -> Globalization
• Development, research, and integration happening outside of
the original infrastructure groups
• Grids becoming a first-class tool for scientific communities
– GriPhyN (Physics), BIRN (Neuroscience), NVO (Astronomy),
• Widespread interest from government in developing
computational Grid platforms; in US
NSF’s Cyberinfrastructure
NASA’s Information Power Grid
DOE’s Science Grid
Grid Applications
• Distributed HPC (Supercomputing):
– Computational science.
• High-Capacity/Throughput Computing:
– Large scale simulation/chip design & parameter studies.
• Content Sharing (free or paid)
– Sharing digital contents among peers (e.g., Napster)
• Remote software access/renting services:
– Application service providers (ASPs) & Web services.
• Data-intensive computing:
– Drug Design, Particle Physics, Stock Prediction...
• On-demand, real-time computing:
– Medical instrumentation & Mission Critical.
• Collaborative Computing:
– Collaborative design, Data exploration, education.
• Service Oriented Computing (SOC):
– Towards economic-based Utility Computing: New paradigm, new
applications, new industries, and new business.
Grid Projects
– Nimrod-G
– Gridbus
– GridSim
– Virtual Lab
– DISCWorld
– GrangeNet
– coming up
– Cactus
– UK eScience
– EU Data Grid
– EuroGrid
– MetaMPI
– XtremeWeb
– and many more.
– I-Grid
– Ninf
– DataFarm
– Globus
– Legion
– Sun Grid Engine
– AppLeS
– Condor-G
– Jxta
– NetSolve
– AccessGrid
– and many more...
Cycle Stealing & .com Initiatives
– [email protected], ….
– Entropia, UD, Parabon,….
Public Forums
– Global Grid Forum
– Australian Grid Forum
– CCGrid conference
– P2P conference
Vision for the Information Power Grid is to promote a
revolution in how NASA addresses large-scale science and
engineering problems by providing persistent
infrastructure for
– “highly capable” computing and data management services
that, on-demand, will locate and co-schedule the multi-Center
resources needed to address large-scale and/or
widely distributed problems
– the ancillary services that are needed to support the
workflow management frameworks that coordinate the
processes of distributed science and engineering problems
US Grid Projects
NASA Information Power Grid
DOE Science Grid
NSF National Virtual Observatory
DOE Particle Physics Data Grid
NSF DTF TeraGrid
DOE Earth Systems Grid
DOE FusionGrid
EU Grid Projects
DataGrid (CERN, ..)
EuroGrid (Unicore)
DataTag (TTT…)
Astrophysical Virtual Observatory
GRIP (Globus/Unicore)
GRIA (Industrial applications)
GridLab (Cactus Toolkit)
CrossGrid (Infrastructure Components)
EGSO (Solar Physics)
National Grid Projects
UK e-Science Grid
Japan – Grid Data Farm, ITBL
Netherlands – VLAM, DutchGrid
Germany – UNICORE, Grid proposal
France – Grid funding approved
Italy – INFN Grid
Eire – Grid-Ireland
Poland – PIONIER Grid
Switzerland - Grid proposal
Hungary – DemoGrid, Grid proposal
ApGrid – AsiaPacific Grid proposal
UK e-Science Initiative
£75M is for Grid Applications in all areas of science and engineering
£10M for Supercomputer upgrade
£35M ‘Core Program’ to encourage development of generic ‘industrial strength’
Grid middleware
‘Grid Starter Kit’ Version 1.0 available for distribution from July 2001
Particle Physics and Astronomy (PPARC)
– links to EU DataGrid, CERN LHC Computing Project, US GriPhyN and PPDataGrid
Projects, and iVDGL Global Grid Project
– links to EU AVO and US NVO projects
Engineering and Physical Sciences (EPSRC)
Comb-e-Chem: Structure-Property Mapping
DAME: Distributed Aircraft Maintenance Environment
Reality Grid: A Tool for Investigating Condensed Matter and Materials
My Grid: Personalised Extensible Environments for Data Intensive in silico Experiments in Biology
GEODISE: Grid Enabled Optimisation and Design Search for Engineering
Discovery Net: High Throughput Sensing Applications
Biology, Medical and Environmental Science: Dynamic Brain Atlas, Biodiversity,
Chemical Structures, Mouse Genes, Robotic Astronomy, Collaborative Visualisation, Medical Imaging/VR
Globus Press Release
12th November 2001
• 12 Companies adopt Globus Toolkit as
Standard Grid Technology Platform
• 5 new US companies
- Compaq, Cray, SGI, Sun, Veridian
• 3 new Japanese vendors
- Fujitsu, Hitachi, NEC
• 3 US companies increasing commitment
- IBM, Microsoft, Entropia
• Platform will provide commercial version
Grid Technologies
Grid Requirements
Identity & authentication
Authorization & policy
Resource discovery
Resource characterization
Resource allocation
(Co-)reservation, workflow
Distributed algorithms
Remote data access
High-speed data transfer
Performance guarantees
Monitoring Adaptation
Intrusion detection
Resource management
Accounting & payment
Fault management
System evolution
Some Grid Requirements –
User Perspective
• Single allocation: if any at all
• Single sign-on: authentication to any Grid
resources authenticates for all others
• Single compute space: one scheduler for all Grid resources
• Single data space: can address files and data from
any Grid resources
• Single development environment: Grid tools and
libraries that work on all grid resources
The Security Problem
• Resources being used may be extremely valuable &
the problems being solved extremely sensitive
• Resources are often located in distinct administrative domains
– Each resource may have own policies & procedures
• The set of resources used by a single computation
may be large, dynamic, and/or unpredictable
– Not just client/server
• It must be broadly available & applicable
– Standard, well-tested, well-understood protocols
– Integration with wide variety of tools
The Resource Management
Enabling secure, controlled remote
access to computational resources and
management of remote computation
– Authentication and authorization
– Resource discovery & characterization
– Reservation and allocation
– Computation monitoring and control
Some Grid Usage Models
• Distributed computing: job scheduling on Grid resources
with secure, automated data transfer
• Workflow: synchronized scheduling and automated data
transfer from one system to the next in a pipeline (e.g. compute, viz, storage)
• Coupled codes, with pieces running on different systems
• Meta- applications: parallel apps spanning multiple systems
• Some models are similar to models already being used, but
are much simpler due to:
– single sign-on
– automatic process scheduling
– automated data transfers
• But Grids can encompass new resources like sensors and
instruments, so new usage models will arise
Grid-based Computation:
Locate “suitable” computers
Authenticate with appropriate sites
Allocate resources on those computers
Initiate computation on those computers
Configure those computations
Select “appropriate” communication methods
Compute with “suitable” algorithms
Access data files, return output
Respond “appropriately” to resource changes
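The first few steps of this list (locate, authenticate, allocate, initiate) can be sketched as a toy pipeline. Everything here is an in-memory stand-in invented for the example: the catalog format, the site names, and the function names are assumptions, and no real Grid middleware is called.

```python
def locate(catalog, min_cpus):
    """Locate "suitable" computers: here, any site with enough free CPUs."""
    return [site for site in catalog if site["free_cpus"] >= min_cpus]


def authenticate(site, credentials):
    """Authenticate with a site: here, a lookup in the site's set of known users."""
    return credentials["user"] in site["known_users"]


def allocate(site, cpus):
    """Allocate resources: reserve CPUs and return a record of the allocation."""
    site["free_cpus"] -= cpus
    return {"site": site["name"], "cpus": cpus}


def run(allocation, task, data):
    """Initiate the computation and return its output (run locally in this toy)."""
    return {"site": allocation["site"], "result": task(data)}


def grid_compute(catalog, credentials, task, data, cpus=2):
    for site in locate(catalog, cpus):
        if not authenticate(site, credentials):
            continue                   # refused here, so try the next suitable site
        allocation = allocate(site, cpus)
        return run(allocation, task, data)
    raise RuntimeError("no suitable, accessible site found")


catalog = [
    {"name": "site-a", "free_cpus": 1, "known_users": {"alice"}},
    {"name": "site-b", "free_cpus": 8, "known_users": {"bob"}},
    {"name": "site-c", "free_cpus": 4, "known_users": {"alice"}},
]
outcome = grid_compute(catalog, {"user": "alice"}, sum, [1, 2, 3])
print(outcome)
```

Here site-a is too small, site-b refuses the user, and the job lands on site-c. The later steps in the list (choosing communication methods, adapting to resource changes) are left out of the sketch.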
Leading Grid Middleware
Globus Toolkit (mainly developed at ANL and USC)
Service-oriented toolkit from the Globus project, to
be used in Grid applications, not targeted at end users
Services for resource selection and allocation,
authentication, file system access and file transfer,
Largest user-base in projects worldwide
Open-source software, commercial support by IBM
and Platform Computing
Leading Grid Middleware
UNICORE (European development)
Originally developed as a standard gateway for job
submission to supercomputers at HPC centers
Comfortable GUI for job definition and monitoring
(abstracted from system peculiarities)
Extended to a Grid job environment, including
similar services as provided by the Globus
Hierarchical job structure, dependencies between
tasks at different sites, automatic file transfer
GRIP project: Integration of Globus into UNICORE,
(task submission into Globus Grid)
Open-source software, commercial support by
Leading Grid Middleware
LEGION (Univ. of Virginia)
Combines distributed resources into a
single virtual computer, “OS for a
distributed machine”
Legion shell providing services such as
naming, file system, security, process
generation, inter-process comm., I/O,
resource management
Open-source software, commercial support
by Avaki
Examples of Grid
Programming Technologies
• MPICH-G2: Grid-enabled message passing
• CoG Kits, GridPort: Portal construction
• GDMP, Data Grid Tools, SRB: replica management,
collection management
• Condor-G: simple workflow management
• Legion: object models for Grid computing
• NetSolve: Network enabled solver
• Cactus: Grid-aware numerical solver framework,
application focus
Standardization Activities
• Web services (under development at W3C), in particular:
– Simple Object Access Protocol (SOAP)
– Web Service Description Language (WSDL)
– Web Service Inspection Language (WSIL)
– Web Service Flow Language (WSFL, for workflows)
• Open Grid Service Architecture (OGSA, on-going at GGF)
– Merger of toolkit model (Globus) with service-oriented
approach of Web services
– Globus 3.0: partial implementation of OGSA
• Semantic Web (W3C)
– Resource Description Framework (RDF) for metadata
interoperability, based on XML
• OGSI = Open Grid Service Infrastructure
– Specs from GGF OGSI working group
– Defines what makes a Grid service
– Based on Web services
– Naming, life cycle, state, notification
– portTypes definitions, WSDL 1.2 draft
• OGSA = Open Grid Service Architecture
– Specs from GGF OGSA working group
– Defines a list of fundamental Grid services, and how
they cooperate
– Work in progress
OGSA services
• Open Grid Service Architecture, being defined by GGF
OGSA working group
• In ubiquitous Grid platform, there is common need for
some essential set of interfaces, behaviors, resource
models, and bindings
• OGSA defines the core set of services essential for grid,
their functionality and interrelationships
• Work in progress. Last draft Oct 3, 2003
• Core services: service interaction, management,
communication, security
• Non-core: data, program execution, resource management
What is a Grid service
• Defined by OGSI (GGF working group)
• Is a Web service with extensions, which are:
– Name (handle GSH, reference GSR)
– Lifetime management (factories, persistent and
transient services)
– State (Service Data)
– Notification as well as querying
• WSDL 1.2 draft (gwsdl: namespace)
• Definitions of portTypes
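The extensions listed above (a name, lifetime management through factories, queryable state, and notification) can be pictured with a toy Python model. This is a conceptual sketch only, not OGSI or any Globus API: the class names, the `gsh://example.invalid/...` handle format, and the callback signature are all invented for the illustration.

```python
import itertools
import time


class GridService:
    """Toy stand-in for a transient Grid service instance."""

    def __init__(self, handle, lifetime_s):
        self.handle = handle                           # stable name (role of a GSH)
        self.termination_time = time.time() + lifetime_s
        self.service_data = {}                         # queryable state ("Service Data")
        self._subscribers = []

    def alive(self):
        return time.time() < self.termination_time

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set_data(self, key, value):
        self.service_data[key] = value
        for callback in self._subscribers:             # notification on state change
            callback(self.handle, key, value)

    def query(self, key):
        return self.service_data.get(key)


class Factory:
    """Creates service instances and resolves handles to live instances."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._instances = {}

    def create(self, lifetime_s=60.0):
        handle = f"gsh://example.invalid/counter/{next(self._ids)}"  # made-up format
        self._instances[handle] = GridService(handle, lifetime_s)
        return handle

    def resolve(self, handle):
        """Return the current reference for a handle (role of a GSR)."""
        service = self._instances.get(handle)
        if service is None or not service.alive():
            self._instances.pop(handle, None)          # expired instances are discarded
            raise LookupError(f"no live service for {handle}")
        return service


factory = Factory()
handle = factory.create(lifetime_s=60.0)
service = factory.resolve(handle)
events = []
service.subscribe(lambda h, key, value: events.append((key, value)))
service.set_data("count", 1)
print(handle, service.query("count"), events)
```

The handle stays fixed while the object it resolves to may change or expire, which is the distinction between handle and reference that the slide draws.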
Globus Toolkit
Globus Grid Services
The Globus toolkit provides a range of basic Grid services
- Security, information, fault detection, communication,
resource management, ...
These services are simple and orthogonal
- Can be used independently, mix and match
Programming model independent
- For each there are well-defined APIs
- Standards are used extensively
E.g., LDAP, GSS-API, X.509, ...
You don't program in Globus, it's a set of tools like Unix
The Globus Alliance
• Globus Project ™, since 1996
– Ian Foster (Argonne National Lab),
– Carl Kesselman (University of Southern California’s
Information Sciences Institute)
• Develop protocols, middleware and tools for Grid
• Globus Alliance, since Sept 2003
• International scope
– University of Edinburgh’s EPCC
– Swedish Center for Parallel Computers (PDC)
– Advisory council of Academic Affiliates from Asia-Pacific, Europe, US
Globus Toolkit
• GT2 (2.4 released in 2002): reference
implementation of Grid fabric protocols
GRAM for job submissions
MDS for resource discovery
GridFTP for data transfer
GSI security
• GT3 (3.0 released July 2003): redesign
– OGSI based
– Grid services, built on SOAP and XML
• GT3.2 released March 31, 2004
Globus Toolkit Services
• Job submission and management (GRAM)
– Uniform Job Submission
• Security (GSI)
– PKI-based Security (Authentication) Service
• Information services (MDS)
– LDAP-based Information Service
• Remote file management (GASS) and transfer
– Remote Storage Access Service
• Remote Data Catalogue and Management Tools
– Support by Globus 2.0 released in 2002
– Resource selection and allocation (GIIS, GRIS)
Resource Specification Language
• Common notation for exchange of information
between components
– Syntax similar to MDS/LDAP filters
• RSL provides two types of information:
– Resource requirements: Machine type, number of
nodes, memory, etc.
– Job configuration: Directory, executable, args,
• API provided for manipulating RSL
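As a rough illustration of what an RSL job description looks like, the Python sketch below renders a dictionary as a conjunction of `(attribute=value)` pairs. It is not the RSL manipulation API mentioned above. The attribute names (`executable`, `arguments`, `directory`, `count`, `maxMemory`) and the quoting rule are given from memory of GT2-era RSL and should be treated as assumptions to check against the Globus documentation.

```python
def to_rsl(attributes):
    """Render a dict as a conjunctive RSL string: &(name=value)(name=value)..."""

    def render(value):
        if isinstance(value, (list, tuple)):       # e.g. several arguments
            return " ".join(render(v) for v in value)
        text = str(value)
        # Quote values containing characters that have a meaning in RSL syntax
        # (assumed rule: wrap in double quotes, double any embedded quote).
        if text == "" or any(c in text for c in ' ()="&|+'):
            return '"' + text.replace('"', '""') + '"'
        return text

    return "&" + "".join(
        f"({name}={render(value)})" for name, value in attributes.items()
    )


job = {
    "executable": "/bin/echo",                 # job configuration
    "arguments": ["hello", "grid world"],
    "directory": "/tmp",
    "count": 4,                                # resource requirement: processes
    "maxMemory": 256,                          # resource requirement: memory
}
rsl = to_rsl(job)
print(rsl)
```

The output mixes both kinds of information the slide lists: resource requirements (`count`, `maxMemory`) and job configuration (`executable`, `arguments`, `directory`).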
Job Submission Interfaces
• Globus Toolkit includes several command line
programs for job submission
– globus-job-run: Interactive jobs
– globus-job-submit: Batch/offline jobs
– globusrun: Flexible scripting infrastructure
• Advanced Grid Job Management Systems
– General purpose
• Nimrod-G, Condor-G, etc
– Application specific
• Active Sheet Cactus, Web portals
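A script that drives the command line programs above might assemble its invocations as follows. This sketch only builds and prints the argument lists and does not run anything. The contact string `gatekeeper.example.org` is a placeholder, and the assumed calling convention (tool, contact, executable, arguments) should be checked against the toolkit's own documentation.

```python
import shlex


def job_command(tool, contact, executable, *args):
    """Build the argv list for one of the job submission programs named above."""
    if tool not in {"globus-job-run", "globus-job-submit"}:
        raise ValueError(f"unsupported tool: {tool}")
    return [tool, contact, executable, *args]


# "gatekeeper.example.org" is a placeholder contact string, not a real host.
interactive = job_command("globus-job-run", "gatekeeper.example.org", "/bin/date")
batch = job_command("globus-job-submit", "gatekeeper.example.org", "/bin/sleep", "60")

print(shlex.join(interactive))
print(shlex.join(batch))
```

On a machine with the toolkit installed and a valid proxy credential, such a list could be handed to `subprocess.run`.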
Meaning of GT3 in the community
• Most commonly referenced project
• Traditionally “de facto standard”
• Acknowledged leadership by academia and
industry (IBM,…)
• BSD style license allows for commercial usage
• However, it is only a reference implementation.
Now standards = GGF
• GT undergoes constant changes
• With business entering grid, commercial
implementations may soon catch up
Alternatives to GT3
• Protocol level interoperability:
– “Grid compliant” >= “implements OGSI”
• Other OGSI implementations
OGSI.NET (U.Virginia)
pyGlobus (LBNL)
.NET (U.Edinburgh)
PERL (U. Manchester)
UNICORE (Fujitsu)
• Commercial OGSI compliant products by:
– Avaki, Platform, Data Synapse, …
• Web service alternative: Grid App Framework
Useful References
• Global Grid Forum: working meeting
• HPDC: major academic conference
• Other meetings include IPDPS, CCGrid,
EuroGlobus, Globus Retreats
• Book (Morgan Kaufmann):
• Perspective on Grids: “The Anatomy of the Grid:
Enabling Scalable Virtual Organizations”, IJSA, 2001,