

An Overview of
Computational Grid
Technologies
Marlon Pierce
Community Grids Laboratory
Indiana University
[email protected]
Grids in I533 Context
[Figure: layered services diagram. Labels: Workflow, Information, Sharing, Ontology Services; Gaussian, PubChem, etc.; Logical File; Data Mining Systems; General Data Services, General Exec Services, General File Services; Web Service Core Specifications; Security, Reliability, etc.; Client Environments: Portals, Taverna, etc. Verbal description on next slide.]
Grids in I533 Context



- I533 covers a diverse set of topics.
- (Web) Services are the core abstraction.
  - Execution Services: computational chemistry, data mining, text processing
  - Data Services: PubChem, OGSA-DAI
  - Information and metadata services: ontologies, information discovery and sharing
  - Orchestration services (workflow): Taverna, BPEL, etc.
- Grids are collections of services with some glue.
  - Decentralized security, information system agreements (from monitoring to metadata), abstract execution protocols, etc.
  - Service Oriented Architecture
Brief History of Grids


- The term “Grid Computing” was coined by Dr. Larry Smarr, then director of NCSA, back in 1992.
- The original concept: computing power should be available on demand, for a fee.
  - Just like the electrical power grid.
- Today, Grids are thought of as federations of services that span organizations.
- Grids are usually driven by science applications.
  - Most core funding comes from the DOE, NSF, UK e-Science, and other scientific agencies in the EU, Japan, China, Korea, etc.
  - These agencies all cooperate to some degree.
  - DOD has its own version of things, the Global Information Grid, that is currently unrelated.
  - IBM, MS, Oracle, Sun, etc. have varying degrees of interest.
...
Grid Computing Research

- Historically, grid computing has been targeted at simplifying access to high performance computing and giant scientific data sets.
  - Example: NSF TeraGrid includes both hardware and software along with a common administration infrastructure.
  - www.teragrid.org
  - IU is one of the partners.
- There are many overviews of Grid computing.
  - See for example Globus World presentations from 2004, 2005.
  - They show lots of “gee whiz” pictures of big science problems using the Grid.
  - They usually mention SETI@home, and more recently, Google and BitTorrent.
  - These annoy me.
  - SETI@home has nothing to do with Grid computing.
Grid Computing Research

- Grid computing is large scale distributed computing research.
  - “Middleware”
  - It’s not the pervasive computing power Grid originally envisioned.
  - As long as it’s research, we get to keep working on it.
- I’ll examine some key technologies for building a Grid installation, but not “the” Grid.
- There is no Grid!
  - Dr. Dave Semeraro has his doubts.
Some Desirable Grid Characteristics


- Grids are collections of services.
  - Accessing computational facilities to run codes.
  - Accessing remote databases, data warehouses and file systems.
  - Transferring large data sets.
  - Accessing remote instruments and sensors.
- Collections are created from multiple partners: Virtual Organizations
  - Must support decentralized management.
  - Common security abstraction layer
    - Authentication: required and solved.
    - Authorization: Research 4Ever!
  - Common information infrastructure
    - Monitoring hardware and networks: required and solved.
    - Finding resources (i.e. “Semantic Grid”): Research 4Ever!
  - Ex: TeraGrid combines NCSA, SDSC, IU, TACC, ORNL, Purdue, ...
- Generations
  - Generation 1: UNIX daemons, command-line clients, protocol-based.
  - Generation 2: Based on Web Service standards.
Virtual Organization View of Deployment
[Figure: diagram of physical organisations and virtual organisations.]
I. Foster, www.usipv6.com/ppt/fosteripv6andGridJune2003.ppt
Grid Computing Software Examples
- Globus Toolkit (ANL, ISI): Job managers for science applications, Grid security frameworks, file management tools, etc.
- Condor (UW): A job scheduler and cycle scavenger optimally running applications on available resources. “High throughput computing”
- Storage Resource Broker (SDSC): Middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets.
- OMII: UK e-Science program’s software arm.
- OGSA-DAI (U. Edinburgh): From UK e-Science program. Wraps XML and relational databases as Grid services and provides a workflow client library for query processing.
Making Interoperable Tools


- There are a large number of Grid-related research projects and tools.
- They need some common protocols.
  - Not just wire protocols but also security procedure protocols.
- Two most important:
  - GSI: a global security system.
  - GRAM: a global method for executing remote operations.
- Grid standards and would-be standards are defined through the Global Grid Forum.
- We will concentrate on the Globus Toolkit in these lectures, but GSI and GRAM are important to several other projects.
  - Condor, SRB, Sun Grid Engine, etc.
Globus Services Landscape
[Figure: Globus services landscape diagram, annotated “We’ll start here.”]
www.griphyn.org/documents/document_server/uploaded_documents/doc--1515--GT4_GriPhyN.ppt
Grid Security Infrastructure
An overview
Grid Security Infrastructure Keywords

- Public Key Infrastructure (PKI)
  - Most Grids use asymmetric encryption keys.
  - Based on OpenSSL but with GSSAPI extensions.
  - Users have a public key and a private key.
    - Public keys can decrypt messages encrypted by private keys and vice versa.
    - Public key: encrypts a message.
    - Private key: signs a message. Only you have the private key, so only you can generate that specific signature.
  - I encrypt with your public key and sign with my private key.
    - Only you can decrypt, and you know it came from me.
  - PKI tools are part of Java’s SDK, so try them out.
- Certificate Authorities: establishing trust.
  - Can you trust a public key?
  - Yes, if you trust the signer.
  - Large Grids have CAs.
  - You can run your own with SimpleCA.
  - CAs can be hierarchical.
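The encrypt-with-the-public-key, sign-with-the-private-key pattern above can be illustrated with textbook RSA on deliberately tiny numbers. This is a sketch for intuition only: the primes, exponent, and function names are illustrative assumptions, a single key pair plays both roles, and real GSI relies on X.509 certificates with full-size keys and padding.

```python
# Textbook RSA with tiny primes: for intuition only, never for real security.
P, Q = 61, 53                 # toy primes (assumed for illustration)
N = P * Q                     # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                        # public exponent
D = pow(E, -1, PHI)           # private exponent (modular inverse, Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt with the recipient's public key (E, N)."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key (D, N) can decrypt."""
    return pow(ciphertext, D, N)


def sign(message: int) -> int:
    """Only the holder of the private key can produce this signature."""
    return pow(message, D, N)


def verify(message: int, signature: int) -> bool:
    """Anyone can check a signature with the signer's public key."""
    return pow(signature, E, N) == message


if __name__ == "__main__":
    secret = 65
    assert decrypt(encrypt(secret)) == secret
    assert verify(secret, sign(secret))
    assert not verify(secret + 1, sign(secret))
```

In a real exchange the sender would encrypt with the recipient's public key and sign with the sender's own private key, so two key pairs are involved.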
More Keywords: GSS API




- Generic Security Service API (GSSAPI)
  - PKI is slow and symmetric keys are much faster.
  - GSSAPI establishes a “context” between two communicators by sharing a secret symmetric session key.
  - Very similar protocol to WS-SecureConversation.
- Java implementation is part of the standard SDK release.
  - Try it out, but it requires Kerberos.
- GSI uses the GSSAPI to establish security contexts.
- We will see how to program clients in the next lecture.
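To see why a shared session key is attractive, here is a minimal sketch that uses Python's standard `hmac` module as a stand-in for the per-message protection a security context provides. It is not the GSSAPI: the key exchange that PKI would protect is simply assumed to have happened, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumption: both parties already share this key, for example after a
# PKI-protected handshake. Here it is just generated locally.
session_key = secrets.token_bytes(32)


def protect(message: bytes, key: bytes) -> bytes:
    """Return an integrity tag computed with the shared session key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check(message: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(protect(message, key), tag)


if __name__ == "__main__":
    msg = b"submit job 42"
    tag = protect(msg, session_key)
    assert check(msg, tag, session_key)
    assert not check(b"submit job 43", tag, session_key)
```

Each message costs one keyed hash rather than a public-key operation, which is the speed advantage the slide refers to.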
Single Sign On and Delegation

- Single Sign On
  - A “Grid” implies that you can access lots of machines, but not necessarily anonymously.
    - Charged for usage: supercomputer centers issue allocations.
  - SSO is the ability to log in once, get a ticket, and access many machines without constantly providing username and password.
  - GSI is very similar to a somewhat older system called Kerberos, which you can still get.
- Delegation is the security concept that supports this.
  - In practice, GSI handles delegation by re-signing credentials.
  - Take advantage of hierarchical CA organization for trust.
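Delegation by re-signing can be pictured as a chain of credentials, each issued by the holder of the previous one, ending at a trusted CA. The toy model below only checks names and expiry times; the `Credential` class, its fields, and the example names are hypothetical, and the signature verification that real GSI performs at every link is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str      # who this credential identifies
    issuer: str       # who signed it
    not_after: float  # expiry time, seconds since the epoch


def chain_is_valid(chain: list[Credential], trusted_cas: set[str], now: float) -> bool:
    """Walk from the newest proxy back to a trusted CA.

    chain[0] is the most recently delegated proxy; chain[-1] is the
    user's long-term credential issued by a CA. Signature checks are
    deliberately left out of this sketch.
    """
    if not chain:
        return False
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False          # broken link in the delegation chain
    if any(cred.not_after <= now for cred in chain):
        return False              # some credential in the chain has expired
    return chain[-1].issuer in trusted_cas


if __name__ == "__main__":
    user = Credential("/CN=Alice", "/CN=Example CA", not_after=2000.0)
    proxy = Credential("/CN=Alice/CN=proxy", "/CN=Alice", not_after=1500.0)
    assert chain_is_valid([proxy, user], {"/CN=Example CA"}, now=1000.0)
    assert not chain_is_valid([proxy, user], {"/CN=Example CA"}, now=1600.0)
```

Short proxy lifetimes limit the damage if a delegated credential leaks, which is one reason the proxy in this example expires before the user's long-term credential.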
Credential Delegation in GSI
Butler et al, http://www.globus.org/alliance/publications/papers/butler.pdf
A Public Key
rainier.extreme.indiana.edu% more usercert.pem
Bag Attributes
localKeyID: 01 00 00 00
subject=/DC=org/DC=doegrids/OU=People/CN=Marlon Pierce 64229
issuer= /DC=org/DC=DOEGrids/OU=Certificate Authorities/CN=DOEGrids CA 1
-----BEGIN CERTIFICATE-----
MIIDJjCCAg6gAwIBAgICFMYwDQYJKoZIhvcNAQEFBQAwaTETMBEGCgmSJomT8ixk
----------------------[Stuff deleted]----------------------
rlCbtrvQjT79qYIutfFSxwre52OV7p7f/3Uufj0wO4f4hq5Jt05uofQU
-----END CERTIFICATE-----
A Private Key
rainier.extreme.indiana.edu% more userkey.pem
Bag Attributes
localKeyID: 01 00 00 00
1.3.6.1.4.1.311.17.1: Microsoft Enhanced Cryptographic Provider v1.0
friendlyName: 6f50c542f27d23ca349e371673b2ff8d_2586cc29-aa584f69-b023-bbcac12e129e
Key Attributes
X509v3 Key Usage: 10
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,42533BEF0D5016EB
xxQ8IF5UL1rFeWm4hbZBNYNB5TpHl8FqeRPOJk03fltcHyETdndP4GJqLNxHMcxk
fy9As9v49HDSpHde/3jMu9L9q8LXSkG6WmFZgI35nsqjCTcstMdNnZ2P+jxp9sk7
-----------------------[Stuff Deleted]-----------------------
1rts6i6ZDYFzsCpnu+rOsa0kolp+r0zRI0uiiIbOxU9jOtVTiHPsUg==
-----END RSA PRIVATE KEY-----
MyProxy Credential Repository

- Private keys are troublesome and dangerous.
  - You need to put one on every machine that you may use for initial login.
  - This increases the chance it will get stolen.
  - Can be placed on expensive smart cards.
- Solution: MyProxy Server
  - On-line credential repository.
  - Issues short-term keys to any client that knows the username and password.
  - Very convenient for Web portal applications.
J. Basney, http://grid.ncsa.uiuc.edu/myproxy/talks.html
Grid as a Virtual Organization


- Now that we have an SSO, we can set this up across many different partner sites.
- Use one super-CA or at least mutually trust our partner CAs.
  - That is, my org will trust messages signed by your CA.
- This is the beginning of a “Virtual Organization”.
- Real organizations contribute resources to the VO.
- VOs can be long-lived.
  - TeraGrid, Open Science Grid
- Ad-hoc Grids are more of a research issue.
GSI in Action: GridFTP



- GSI is not a service itself.
- You use it to build secure services.
- These services inherit several capabilities.
  - They can authenticate to each other.
  - Messages are secure.
    - Encrypted, non-repudiated, tamper-proof, replay-proof, etc.
  - You can delegate to remote services to take an action on your behalf.
- GridFTP is an example of a GSI-enabled service.
  - File operations and transfers, based on the standard IETF FTP protocol.
  - Supports parallel TCP.
  - Supports striping: several GridFTP servers can act as a logical GridFTP server, each working on a different data subset.
  - A nice summary: www.nesc.ac.uk/talks/563/Day2_1020_GridFTP.ppt
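As a rough picture of what "each working on a different data subset" means, the sketch below divides a file of a given size into contiguous byte ranges, one per server or parallel stream. The function name is hypothetical and this is not GridFTP's actual block-distribution scheme, only an illustration of the partitioning idea.

```python
def split_ranges(total_bytes: int, parts: int) -> list[tuple[int, int]]:
    """Split [0, total_bytes) into `parts` contiguous (start, end) ranges."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(total_bytes, parts)
    ranges = []
    start = 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)   # spread the remainder
        ranges.append((start, start + length))
        start += length
    return ranges


if __name__ == "__main__":
    # Three workers sharing a 10-byte file.
    print(split_ranges(10, 3))   # [(0, 4), (4, 7), (7, 10)]
```

Each worker can then read and send its own range independently, and the receiver reassembles the pieces by offset.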
GridFTP Third Party Transfer Cartoon
[Figure: a GridFTP client holding a credential issues the request “Move File X to Host B.” with a delegated credential. Host A runs the GridFTP source server; Host B runs the GridFTP destination server.]
GridFTP Clients

- Command line clients
  - globus-url-copy
  - uberftp
- Programming interfaces: build your own client.
  - Java and Python CoG Kits
  - Java CoG reviewed next lecture.
Grid Resource Allocation
Management (GRAM)
What Is GRAM?




- GRAM is a protocol for mapping generic user requests to specific actions.
- Heritage: must execute jobs on supercomputers.
  - Interactive: use Unix fork.
  - Queue Systems: PBS, LSF, Condor, Sun Grid Engine, etc.
- This must take place as the user.
  - Allocation accounting, logging, general peace of mind at stodgy HPC centers.
- Note this is very different from e-Business.
  - You don’t need a database account to buy something from Amazon.
Pre-Web Service GRAM Components
[Figure: pre-Web Service GRAM components. The client makes MDS client API calls to locate resources (MDS: Grid Index Info Server) and to get resource info (MDS: Grid Resource Info Server, which queries the current status of the resource). It makes GRAM client API calls to request resource allocation and process creation, and receives GRAM client API state change callbacks. Inside the site boundary, behind the Globus Security Infrastructure, the Gatekeeper receives the request and creates a Job Manager, which parses it with the RSL Library and works with the Local Resource Manager to allocate and create processes and to monitor and control them. Annotation: “Yikes...”]
GRAM Job Specifications

- The major purpose of GRAM is to execute one or more remote commands on the user’s behalf.
  - Abstract UNIX shell, PBS, Condor, etc.
- So how do you specify the command?
- Pre-Web Service Grids (i.e. based on Globus 2) use the Resource Specification Language (RSL).
- Web Service Grids (i.e. based on Globus 4) use the XML Job Description Language.
GRAM Client Tools




- You can execute remote commands using client tools.
- We will develop Java clients next time.
- GT 2 command line examples (with RSL)
  - globusrun: all purpose client
  - globus-job-run: interactive jobs
  - globus-job-submit: batch jobs
  - globus-job-cancel: stop batch jobs
- GT 4 command line examples (with JDL)
  - globusrun-ws: all purpose client
  - globus-job-run-ws: interactive job submission
  - globus-job-submit-ws: batch job submission
  - globus-job-clean-ws: stop batch jobs
Sample RSL String



- This is an argument to globusrun.
- Use this to execute “echo” and “mpi-hello”.
(* Multijob Request *)
+(&(executable = /bin/echo)
   (arguments = Hello, Grid From Subjob 1)
   (resource_manager_name = resource-manager-1.globus.org)
   (count = 1)
 )
 (&(executable = mpi-hello)
   (arguments = Hello, Grid From Subjob 2)
   (resource_manager_name = resource-manager-2.globus.org)
   (count = 2)
   (jobtype = mpi)
 )
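The shape of a multijob RSL string like the one above can be made concrete with a small generator. This is a sketch that mirrors the slide's syntax only: the helper names are hypothetical, no quoting or validation is performed, and it is not part of any Globus client library.

```python
def subjob_to_rsl(attrs: dict) -> str:
    """Render one subjob as an RSL conjunction: (&(name = value)...)."""
    relations = "".join(f"({name} = {value})" for name, value in attrs.items())
    return f"(&{relations})"


def multijob_to_rsl(subjobs: list) -> str:
    """Render several subjobs as one multi-request: +(&...)(&...)."""
    return "+" + "".join(subjob_to_rsl(job) for job in subjobs)


if __name__ == "__main__":
    request = multijob_to_rsl([
        {
            "executable": "/bin/echo",
            "arguments": "Hello, Grid From Subjob 1",
            "resource_manager_name": "resource-manager-1.globus.org",
            "count": 1,
        },
        {
            "executable": "mpi-hello",
            "arguments": "Hello, Grid From Subjob 2",
            "resource_manager_name": "resource-manager-2.globus.org",
            "count": 2,
            "jobtype": "mpi",
        },
    ])
    print(request)
```

The leading `+` marks a multi-request, and each `&` groups the relations that must all hold for one subjob.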
A Very Simple Job Description
<job>
    <executable>/bin/echo</executable>
    <directory>/tmp</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>this is an example string </argument>
    <environment>
        <name>PI</name>
        <value>3.141</value>
    </environment>
    <stdin>/dev/null</stdin>
    <stdout>stdout</stdout>
    <stderr>stderr</stderr>
</job>
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/user-index.html#s-wsgram-user-commandline
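Because the job description is plain XML, it can be inspected with any XML parser. The sketch below reads the sample above with Python's standard `xml.etree.ElementTree`; the `summarize_job` helper is hypothetical, and it assumes the un-namespaced form shown on the slide.

```python
import xml.etree.ElementTree as ET

JOB_XML = """
<job>
    <executable>/bin/echo</executable>
    <directory>/tmp</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>this is an example string </argument>
    <environment>
        <name>PI</name> <value>3.141</value>
    </environment>
    <stdin>/dev/null</stdin>
    <stdout>stdout</stdout>
    <stderr>stderr</stderr>
</job>
"""


def summarize_job(xml_text: str) -> dict:
    """Pull the command line and environment out of a job description."""
    job = ET.fromstring(xml_text.strip())
    env = {}
    for entry in job.findall("environment"):
        env[entry.findtext("name")] = entry.findtext("value")
    return {
        "executable": job.findtext("executable"),
        "directory": job.findtext("directory"),
        "arguments": [arg.text for arg in job.findall("argument")],
        "environment": env,
    }


if __name__ == "__main__":
    print(summarize_job(JOB_XML))
```

The repeated `argument` elements keep their order, so the list maps directly onto the command line of the executable.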
More Details on Job Submission



- The full Job Description Schema is here:
  - http://www.globus.org/toolkit/docs/4.0/execution/wsgram/schemas/gram_job_description.html#SchemaProperties
- You can do much more complicated things.
  - Run sequences of jobs.
  - Stage files with GridFTP.
  - Delegate jobs to other GRAMs.
- But this is controversial.
  - Lots of people have worked on job management workflow systems.
  - Several based on Apache Ant, for example.
  - BPEL is the Web Service standard.
Grids and Web Services
Globus Services Landscape
[Figure: Globus services landscape diagram, annotated “Now we are up here.”]
www.griphyn.org/documents/document_server/uploaded_documents/doc--1515--GT4_GriPhyN.ppt
Grids and Web Services




- The requirements of Grids are very similar to those of Service Oriented Architecture-based systems.
- Grid and Web Service integration began in 2002.
  - Open Grid Services Architecture: “Physiology of the Grid” paper by Foster et al.
  - Aborted start in Globus Toolkit 3, OGSI.
  - Current Globus Toolkit 4 much more successful.
- OGSA-DAI, Condor, and SRB all have Web Service interfaces.
- Many UK e-Science projects also follow a similar approach.
  - Sometimes referred to as the “WS-I+” approach to distinguish it from the Globus/IBM approach.
  - See http://grids.ucs.indiana.edu/ptliupages/publications/WebServiceGrids.pdf
  - See OMII releases.
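GT4 GRAM Structure: WSRF/WSN Poster Child
[Figure: GT4 GRAM structure across service host(s) and compute element(s). The client delegates a credential to the Delegation service and talks to the GRAM services in the GT4 Java Container. The GRAM services send transfer requests to RFT File Transfer, which drives GridFTP over FTP control channels; FTP data flows between the compute element's GridFTP server and GridFTP on the remote storage element(s). On the compute element, local job control goes through sudo and the GRAM adapter to the local scheduler, which runs the user job.]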
www.griphyn.org/documents/document_server/uploaded_documents/doc--150VDS_1.4_Plans.2005.0429.ppt
Reliable File Transfer: Third Party Transfer
www.griphyn.org/documents/document_server/uploaded_documents/doc--150VDS_1.4_Plans.2005.0429.ppt




- Fire-and-forget transfer
- Web services interface
- Many files & directories
- Integrated failure recovery
[Figure: the RFT client sends SOAP messages to the RFT service and optionally receives notifications. The RFT service drives two GridFTP servers. Each server has a protocol interpreter, a master DSI, and slave DSIs with IPC receivers connected by IPC links; data channels connect the slave DSIs of the two servers.]
Grid Web Service Extensions



- WSDL and SOAP form the core of Grid services.
- WS-Addressing and WS-Security family are important.
- Globus and friends are working to extend core Web Service standards through OASIS.
  - WS-Resource Framework (WSRF): modeling stateful resources.
  - WS-Notification: Web Service version of one-to-many messaging.
Stateful Resources and Grids


- Web Service Architectures and thus Grids are really message oriented, not RPC based.
  - All state should be in the SOAP message.
  - This allows messages to go through many SOAP intermediaries.
- Request/response does not really map to Grid requirements.
  - Services may take hours or days to complete, so need callbacks.
    - Ex: computational chemistry codes on TeraGrid, RFT for many TB of data.
  - Services may need to push information to listeners.
    - “Big file 1 is done, now move big file 2”
- Grid resources may also come and go.
  - Instruments typically generate data at scheduled times.
  - Down for maintenance, upgrades, reconfiguration, etc.
- WSRF and WS-Notification attempt to solve these Grid requirements.
Web Service Resource Framework

- WSRF is a collection of WSDL specifications and associated messages.
  - WS-Resource
  - WS-ResourceProperties
  - WS-ResourceLifetime
  - WS-ServiceGroup
  - WS-BaseFault
  - See http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsrf
WS-Resource



- The WS-Resource decouples a (stateful) resource from the Web Service that accesses it.
- For example, a database is a resource that may be accessed through a Web Service.
- The resource may be defined by metadata.
  - Our database needs to provide clues to the type of data it contains.
  - Need this for discovery.
  - This metadata is contained in WS-ResourceProperties.
Goals of WS-ResourceProperties


- Provide a metadata property framework for describing resources.
- Provide a Web Service interface for performing operations on these properties.
  - Query and retrieve properties.
  - Update values on a resource (controversial).
  - Subscribe to property changes.
- Use XML Schemas to hold WSDL message definitions that define the resource properties.
- Associate these messages with WSDL portTypes.
- The actual values of the Schema are in an XML document.
  - Store it in memory, put it in a database, derive it at query time, ...
- This requires some understanding of WSDL and SOAP.
- Upcoming lecture will cover this.
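The three operations listed above (query, update, subscribe) can be sketched against an in-memory XML property document. Everything here is an illustrative assumption: the element names, the `ResourceProperties` class, and its methods are invented for this example and do not come from the WS-ResourceProperties schema or any WSRF toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical property document for a database resource.
PROPERTIES_XML = """
<DatabaseProperties>
    <Vendor>ExampleDB</Vendor>
    <TableCount>12</TableCount>
    <ContentType>chemical structures</ContentType>
</DatabaseProperties>
"""


class ResourceProperties:
    """In-memory property document with get, update, and subscribe."""

    def __init__(self, xml_text: str) -> None:
        self._root = ET.fromstring(xml_text.strip())
        self._listeners = []

    def get(self, name: str):
        """Query and retrieve one property value (None if absent)."""
        return self._root.findtext(name)

    def update(self, name: str, value: str) -> None:
        """Update a property value, then notify subscribers."""
        element = self._root.find(name)
        if element is None:
            element = ET.SubElement(self._root, name)
        element.text = value
        for listener in self._listeners:
            listener(name, value)

    def subscribe(self, listener) -> None:
        """Register a callback invoked as listener(name, value) on change."""
        self._listeners.append(listener)


if __name__ == "__main__":
    props = ResourceProperties(PROPERTIES_XML)
    props.subscribe(lambda name, value: print(f"{name} changed to {value}"))
    props.update("TableCount", "13")
    print(props.get("TableCount"))
```

In an actual WSRF service these operations would be SOAP messages defined in WSDL rather than local method calls.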
Goals of WS-ResourceLifetime

- Resources may have lifetimes.
  - For example, your quantum chemistry calculation may take a few hours.
  - This may be associated with a WS-Resource.
- WS-ResourceLifetime defines methods for
  - Destroying a resource at some future time (and t=0 allowed).
  - Learning the lifetime of a resource.
  - Extending the lifetime of a resource.
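The three lifetime operations above can be mimicked with a small class that tracks a scheduled termination time. This is a toy model under stated assumptions: the `ManagedResource` class and its method names are hypothetical, times are plain numbers in seconds, and none of it reflects the actual WS-ResourceLifetime message formats.

```python
class ManagedResource:
    """Toy resource with a scheduled termination time (in seconds)."""

    def __init__(self, name: str, termination_time: float) -> None:
        self.name = name
        self.termination_time = termination_time
        self.destroyed = False

    def remaining(self, now: float) -> float:
        """Learn the lifetime: seconds left before scheduled termination."""
        return max(0.0, self.termination_time - now)

    def extend(self, new_termination_time: float) -> None:
        """Extend the lifetime; this sketch never shortens it."""
        if self.destroyed:
            raise RuntimeError(f"{self.name} is already destroyed")
        self.termination_time = max(self.termination_time, new_termination_time)

    def destroy_at(self, when: float, now: float) -> None:
        """Schedule destruction; when <= now destroys immediately (t=0)."""
        self.termination_time = when
        self.sweep(now)

    def sweep(self, now: float) -> None:
        """Destroy the resource once its termination time has passed."""
        if now >= self.termination_time:
            self.destroyed = True


if __name__ == "__main__":
    calc = ManagedResource("quantum-chemistry-run", termination_time=100.0)
    calc.extend(200.0)          # the calculation needs more time
    calc.sweep(now=150.0)       # still alive at t=150
    assert not calc.destroyed
    calc.destroy_at(150.0, now=150.0)
    assert calc.destroyed
```

A hosting environment would call something like `sweep` periodically so that abandoned resources are reclaimed without client action.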
WS-Notification Core Specs

- WS-BaseNotification
  - Specs for controlling publications and subscriptions of events (i.e. resource property changes).
  - Subscribers subscribe directly to publishers.
- WS-Topics
  - Topics are used to organize messages.
  - You may publish or subscribe to a topic rather than a specific resource endpoint.
- WS-BrokeredNotification
  - Brokers decouple publishers from subscribers.
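How topics and brokers decouple the two sides can be shown with an in-process sketch. The `Broker` class and its methods are hypothetical stand-ins; real WS-Notification exchanges are SOAP messages between Web Service endpoints, not Python callbacks.

```python
from collections import defaultdict


class Broker:
    """Toy notification broker: publishers and subscribers share only topics."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback) -> None:
        """Register callback(topic, message) for one topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver message to every subscriber of topic; return the count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("transfers", lambda t, m: print(f"[{t}] {m}"))
    broker.publish("transfers", "Big file 1 is done, now move big file 2")
```

The publisher never learns who is listening, and a subscriber never needs the publisher's endpoint, only the topic name.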
WS-Notification


- Stateful resources will need to notify one or more listeners when their state changes.
- For example, a Web lecture has many events.
  - Beginning and end of the lecture.
  - Changes in slides.
  - To my knowledge, no one has tried this.
- Real examples based on WS-GRAM, RFT.
A Skeptical View of WSRF



- WSRF has several independent implementations.
  - WSRF.NET (UV), Python (LBL), Perl (UK), C/C++ (ANL), ...
- But is this critical mass?
  - What about MS, Oracle, and other big Web Service players?
- OASIS specification approval is glacial.
  - Many specs, even if approved, have died on the vine for lack of backing.
  - Many more are a mess because of complicated dependencies.
    - WS-Addressing has released many versions, screwing up many dependent specs.
- Competing specs exist.
  - MS’s WS-Eventing, for example.
- “Semantic Grid” uses an entirely different approach for metadata.
  - RDF, OWL provide more natural modeling of metadata than tree-based XML Schemas.
- Ignores UDDI as an information system.
- I ran out of room.
Future Challenges




Real time interaction
Joy of use
Intuitive user interface
Global scalability



1000s of simultaneous users
Addictive
(Observation courtesy Prof. Fran Berman)