GGF14 Workshop: GT4 Status and Experiences


Globus Toolkit® 4
Ian Foster
Argonne National Laboratory
University of Chicago
Univa Corporation
2
Credits


Globus Toolkit v4 is the work of many
talented Globus Alliance members, at

Argonne Natl. Lab & U.Chicago

USC Information Sciences Institute

National Center for Supercomputing Applns

U. Edinburgh

Swedish PDC

Univa Corporation

Other contributors at other institutions
Supported by DOE, NSF, UK EPSRC, and
other sources
3
On April 29, 2005 the
Globus Alliance released
the finest version of the
Globus Toolkit to date!
Don’t take our word for it!
Read the UK eScience Evaluation of GT4
www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf
(Reachable from www.globus.org, under “News”)
4
Overview

Background and Globus approach

Globus Toolkit: current capabilities

Future directions

Related tools
5
“… A new age has dawned in scientific
and engineering research, pushed by
continuing progress in computing,
information, and communication
technology, and pulled by the expanding
complexity, scope and scale of today’s
challenges. The capacity of this
technology has crossed thresholds that
now make possible a comprehensive
cyberinfrastructure on which to build
new types of scientific and engineering
knowledge environments and
organizations, and to pursue research in
new ways and with increased efficacy…”
National Science Foundation Blue Ribbon Advisory Panel, 2003
8
History
In the early 90s, I (Foster) and others (e.g.,
Carl Kesselman, USC-ISI) enjoyed helping
scientists apply distributed computing


Opportunities seemed ripe for the picking
Application of technology always uncovers
new and interesting requirements

Science is cool

Big/innovative science is even cooler
9
History (continued)
While helping to build/integrate a diverse
range of applications, the same problems
kept showing up over and over again




Too many different security systems
Too many different scheduling/execution
mechanisms
Too many different storage systems
Too many different monitoring/status/event
systems
10
What Kinds of Applications?

Computation intensive





Interactive simulation (climate modeling)
Large-scale simulation and analysis (galaxy
formation, gravity waves, event simulation)
Engineering (parameter studies, linked models)
Data intensive

Experimental data analysis (e.g., physics)

Image & sensor analysis (astronomy, climate)
Distributed collaboration


Online instrumentation (microscopes, x-ray)
Remote visualization (climate studies, biology)
Engineering (large-scale structural testing)
11
Key Common Feature
The size and/or complexity of the problem
requires that people in several
organizations collaborate and share
computing resources, data, instruments
12
An Example Problem





The Large Hadron
Collider (LHC)
Largest machine
ever built by humans!
Located at CERN,
Geneva Switzerland
Particle accelerator and
collider with a
circumference of 16.8
miles
Scheduled to go into
production in 2007
13
An Example Problem (continued)



Will generate
10 Petabytes
(10^7 Gigabytes)
of information
per year
This information
must be processed
and stored somewhere
It is beyond the scope
of a single institution to
manage this problem
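As a rough back-of-the-envelope check of the figures above, the sketch below converts 10 petabytes per year into gigabytes and into an average sustained data rate. It assumes decimal (SI) units and a 365-day year; neither assumption is stated on the slide, and the function names are illustrative only.

```python
# Back-of-the-envelope check, assuming SI (decimal) units and a 365-day year.
PETABYTE = 10**15  # bytes
GIGABYTE = 10**9   # bytes
MEGABYTE = 10**6   # bytes
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def yearly_volume_in_gb(petabytes_per_year: int) -> int:
    """Convert a yearly volume in petabytes to gigabytes."""
    return petabytes_per_year * PETABYTE // GIGABYTE


def sustained_rate_mb_per_s(petabytes_per_year: int) -> float:
    """Average rate in MB/s needed to move the yearly volume continuously."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / MEGABYTE


if __name__ == "__main__":
    print(yearly_volume_in_gb(10))             # 10 PB expressed in GB
    print(round(sustained_rate_mb_per_s(10)))  # average MB/s over a full year
```

Under these assumptions, 10 PB per year is 10^7 GB and corresponds to a continuous average of a little over 300 MB/s, before any reprocessing or replication.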
14
Virtual Organizations
Distributed resources and people
Linked by networks, crossing admin domains
Sharing resources, common goals
Dynamic
[Diagram: resources (R) grouped into virtual organizations VO-A and VO-B]
15
Virtual Organizations
Distributed resources and people
Linked by networks, crossing admin domains
Sharing resources, common goals
Dynamic
Fault tolerant
[Diagram: resources (R) grouped into virtual organizations VO-A and VO-B]
16
The Globus Approach
17
The Role of the Globus Toolkit


A collection of solutions to problems that
come up frequently when building
collaborative distributed applications
Heterogeneity


A focus, in particular, on overcoming
heterogeneity for application developers
Standards
We capitalize on and encourage use of
existing standards (IETF, W3C, OASIS, GGF)
 GT also includes reference implementations
of new/proposed standards in these
organizations

18
Layers in the Grid
19
A Typical eScience Use of Globus:
Network for Earthquake Eng. Simulation
Links instruments, data,
computers, people
20
Without the Globus Toolkit
[Diagram: components arranged in four layers]
Components: Web Browser, Simulation Tool, Data Viewer Tool, Web Portal, Registration Service, Chat Tool, Credential Repository, Certificate authority, Telepresence Monitor, Data Catalog, Compute Servers, Cameras, Database services
Counts shown: Application Developer 10, Off the Shelf 12, Globus Toolkit 0, Grid Community 0
Users work with client applications
Application services organize VOs & enable access to other services
Collective services aggregate &/or virtualize resources
Resources implement standard access & management interfaces
21
With the Globus Toolkit
[Diagram: components arranged in four layers]
Components: Web Browser, Simulation Tool, Data Viewer Tool, CHEF, CHEF Chat Teamlet, MyProxy, Telepresence Monitor, Globus GRAM, Globus Index Service, Globus DAI, Globus MCS/RLS, Globus Certificate Authority, Compute Servers, Cameras, Database services
Counts shown: Application Developer 2, Off the Shelf 9, Globus Toolkit 4, Grid Community 4
Users work with client applications
Application services organize VOs & enable access to other services
Collective services aggregate &/or virtualize resources
Resources implement standard access & management interfaces
22
The Globus Toolkit:
“Standard Plumbing” for the Grid

Not turnkey solutions, but building blocks &
tools for application developers & system
integrators


Easier to reuse than to reinvent


Some components (e.g., file transfer) go farther than
others (e.g., remote job submission) toward end-user relevance
Compatibility with other Grid systems comes for free
Today the majority of the GT public interfaces
are usable by application developers and
system integrators


Relatively few end-user interfaces
In general, not intended for direct use by end users
(scientists, engineers, marketing specialists)
23
The Application-Infrastructure Gap
Dynamic
and/or
Distributed
Applications
Shared Distributed Infrastructure
B
A
1
1
9
9
24
Bridging the Gap:
Grid Infrastructure
Users

Service-oriented applications



Wrap applications as
services
Compose applications
into workflows
Service-oriented Grid
infrastructure

Provision physical
resources to support
application workloads
Composition
Workflows
Invocation
Appln
Service
Appln
Service
Provisioning
25
Grid Infrastructure


Distributed management

Of physical resources

Of software services

Of communities and their policies
Unified treatment



Build on Web services framework
Use WS-RF, WS-Notification (or
WS-Transfer/Man) to
represent/access state
Common management
abstractions & interfaces
Globus is Open Source
Grid Infrastructure

Implement key Web services standards


Software for Grid infrastructure




Service-enable new & existing resources
E.g., GRAM on computer, GridFTP on
storage system, custom application services
Uniform abstractions & mechanisms
Tools to build applications that exploit Grid
infrastructure


State, notification, security, …
Registries, security, data management, …
Enabler of a rich tool & service ecosystem
26
27
An eBusiness Use of Globus:
SAP Demonstration @ GlobusWorld

3 Globus-enabled applns:




CRM: Internet Pricing Configurator (IPC)
CRM: Workforce
Management (WFM)
Web Browsers / Batch Processes
SCM: Advanced Planner
& Optimizer (APO)
Applications modified to:


Adjust to varying
demand & resources
Use Globus to discover
& provision resources
(typically several thousand requests)
Request:
Price Query
1
IPC
Server
2
IPC
Delegation of
Dispatcher
Request
2
IPC
Response: PricelistServer
Depending on:
- Time
- Discount
- Number of Items
-… 3
SAP AG R/3 Internet Pricing
& Configurator (IPC)
28
Overview

Background and Globus approach

Globus Toolkit: current capabilities

Future directions

Related tools
29
The Globus Toolkit is
a Collection of Components


A set of loosely-coupled components, with:

Services and clients

Libraries

Development tools
GT components are used to build Grid-based applications and services


GT can be viewed as a Grid SDK
GT components can be categorized across
two different dimensions

By broad domain area

By protocol support
30
GT Domain Areas

Core runtime


Security


Provision, deploy, & manage services
Data management


Apply uniform policy across distinct systems
Execution management


Infrastructure for building new services
Discover, transfer, & access large data
Monitoring

Discover & monitor dynamic services
31
GT Protocols


Web service protocols
 WSDL, SOAP
 WS Addressing, WSRF, WSN
 WS Security, SAML, XACML
 WS-Interoperability profile
Non Web service protocols
 Standards-based, such as GridFTP
 Custom
32
“Stateless” vs. “Stateful” Services
[Diagram: Client sends move (A to B) to FileTransferService]
Without state, how does client:
Determine what happened (success/failure)?
Find out how many files completed?
Receive updates when interesting events arise?
Terminate a request?
Few useful services are truly “stateless”, but
WS interfaces alone do not provide built-in
support for state
33
FileTransferService
(without WSRF)
FileTransfer
Service
move
move (A to B) : transferID
Client
whatHappen
state tellMeWhen
cancel

Developer reinvents wheel for each new service



Custom management and identification of state:
transferID
Custom operations to inspect state synchronously
(whatHappen) and asynchronously (tellMeWhen)
Custom lifetime operation (cancel)
34
WSRF in a Nutshell


Service
State representation: Resource, Resource Property
State identification: Endpoint Reference (EPR)
State Interfaces: GetRP, QueryRPs, GetMultipleRPs, SetRP
Lifetime Interfaces: SetTerminationTime, ImmediateDestruction
Notification Interfaces: Subscribe, Notify
ServiceGroups
35
FileTransferService (w/ WSRF)
FileTransferService
createResource
Transfer
getRP
RPs
queryRPs
createResource (A to B) : EPR
Client
destroy

Developer specifies custom method to createResource
and leaves the rest to WSRF standards:



State exposed as Resource + Resource Properties and
identified by Endpoint Reference (EPR)
State inspected by standard interfaces (GetRP, QueryRPs)
Lifetime management by standard interfaces (Destroy)
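The division of labour described above can be sketched in a few lines of plain Python. This is an in-memory illustration of the pattern only, not the GT4 or WSRF API: the class names, the `urn:transfer:` reference format, and the property names `status` and `filesCompleted` are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TransferResource:
    """State for one transfer, exposed as named resource properties."""
    source: str
    destination: str
    properties: dict = field(default_factory=dict)


class FileTransferService:
    """Toy service: one custom factory operation plus generic state operations."""

    def __init__(self):
        self._resources = {}

    def create_resource(self, source, destination):
        """Custom operation: create state and return a reference to it.

        The returned string stands in for an Endpoint Reference (EPR).
        """
        epr = f"urn:transfer:{uuid.uuid4()}"
        self._resources[epr] = TransferResource(
            source, destination, {"status": "Pending", "filesCompleted": 0}
        )
        return epr

    def get_resource_property(self, epr, name):
        """Generic inspection of a single property (GetRP stand-in)."""
        return self._resources[epr].properties[name]

    def query_resource_properties(self, epr, predicate):
        """Generic query over all properties (QueryRPs stand-in)."""
        props = self._resources[epr].properties
        return {k: v for k, v in props.items() if predicate(k, v)}

    def destroy(self, epr):
        """Generic lifetime operation (Destroy stand-in)."""
        del self._resources[epr]


if __name__ == "__main__":
    service = FileTransferService()
    epr = service.create_resource("A", "B")
    print(service.get_resource_property(epr, "status"))
    service.destroy(epr)
```

Only `create_resource` is specific to file transfer; the other three operations would look the same for any stateful service, which is the point the slide makes.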
Globus Toolkit version 2 (GT2)
Web
Services
Components
Pre-WS
Authentication
Authorization
GridFTP
Security
Data Mgmt
Grid Resource Monitoring
Alloc. Mgmt & Discovery
(GRAM)
(MDS)
Execution
Mgmt
Info
Services
C Common
Libraries
Common
Runtime
Non-WS
Components
Globus Toolkit version 3 (GT3)
Community Data Access
Authorization & Integration
WS
Authentication
Authorization
Pre-WS
Authentication
Authorization
Reliable
File
Transfer
Grid Resource
Alloc. Mgmt
(WS GRAM)
GridFTP
Grid Resource Monitoring
Alloc. Mgmt & Discovery
(GRAM)
MDS3
(MDS)
Replica
Location
Security
Data Mgmt
Java
WS Core
C Common
Libraries
eXtensible
IO (XIO)
Execution
Mgmt
Info
Services
Common
Runtime
Web
Services
Components
Non-WS
Components
Globus Toolkit version 4 (GT4)
Grid
Telecontrol
Protocol
Community
Scheduling
Framework
Python
WS Core
Community Data Access Workspace
Authorization & Integration Management
Trigger
C
WS Core
Reliable
File
Transfer
Grid Resource
Allocation &
Management
Index
Java
WS Core
Pre-WS
Authentication
Authorization
GridFTP
Pre-WS
Pre-WS
Grid Resource Monitoring
Alloc. & Mgmt & Discovery
C Common
Libraries
Credential
Mgmt
Replica
Location
www.globus.org
eXtensible
IO (XIO)
Security
Data Mgmt
Authentication
Authorization
Execution
Mgmt
Contrib/
Preview
Deprecated
WebMDS
Delegation
Data
Replication
Core
Info
Services
Common
Runtime
Web
Services
Components
Non-WS
Components
39
Globus Toolkit:
Open Source Grid Infrastructure
Globus Toolkit v4
www.globus.org
Data
Replication
Credential
Mgmt
Replica
Location
Grid
Telecontrol
Protocol
Delegation
Data Access
& Integration
Community
Scheduling
Framework
WebMDS
Python
Runtime
Community
Authorization
Reliable
File
Transfer
Workspace
Management
Trigger
C
Runtime
Authentication
Authorization
GridFTP
Grid Resource
Allocation &
Management
Index
Java
Runtime
Security
Data
Mgmt
Execution
Mgmt
Info
Services
Common
Runtime
40
4.0 is not a typical “.0” release,
but the culmination of months of testing
[Diagram: release timeline showing the CVS trunk and stable release branches, with stable releases 3.0.0, 3.0.1, 3.0.2, 3.2.0, 3.2.1, 4.0.0, 4.0.1 and development releases 3.3.0 and 3.9.0 through 3.9.5]
41
GT4 Components
Your
Your
CC
Client
Client
SERVER
Your
Your
Python
Python
Client
Client
Java Services in Apache Axis Python hosting,
Plus GT Libraries and Handlers
GT Libraries
Pre-WS MDS
C WS
Core
Pre-WS GRAM
pyGlobus
WS Core
RLS
Your
C
Service
MyProxy
Your
Python
Service
SimpleCA
X.509 credentials =
common authentication
CAS
OGSA-DAI
GTCP
Delegation
Index
Trigger
Archiver
Your
Your
Java
Java
Service
Service
GRAM
RFT
Interoperable
WS-I-compliant
SOAP messaging
Your
Your
CC
Client
Client
Your
Your
Java
Java
Client
Client
Your
Your
Python
Python
Client
Client
GridFTP
Your
Your
Java
Java
Client
Client
CLIENT
C Services using GT
Libraries and Handlers
42
Our Goals for GT4

Usability, reliability, scalability, …




Web service components have quality equal
or superior to pre-WS components
Documentation at acceptable quality level
Consistency with latest standards (WS-*,
WSRF, WS-N, etc.) and Apache platform

WS-I Basic Profile compliant

WS-I Basic Security Profile compliant
New components, platforms, languages

And links to larger Globus ecosystem
44
GT4 Web Services Runtime




Supports both GT (GRAM, RFT, Delegation,
etc.) & user-developed services
Redesign to enhance scalability, modularity,
performance, usability
Leverages existing WS standards

WS-I Basic Profile: WSDL, SOAP, etc.

WS-Security, WS-Addressing
Adds support for emerging WS standards


WS-Resource Framework, WS-Notification
Java, Python, & C hosting environments

Java is standard Apache
45
GT4 WS Core in a Nutshell
[Diagram: Service with EPRs, Resource, and RPs; operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
Operation Providers: pre-built implementations of WSRF operations
Notification implementation: Topics, TopicSet, Embedded Notification Consumer service
Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)
47
GT4 WS Core in a Nutshell
[Diagram: Service Container hosting several Services, each with EPRs, Resources, RPs, a ResourceHome, and the standard WSRF operations]
Service Container: host multiple services in container; one JVM process
…more details: based on AXIS service container, processes SOAP messages, ResourceContext extension.
48
GT4 WS Core in a Nutshell
[Diagram: Service Container hosting several Services, each with EPRs, Resources, RPs, a ResourceHome, and the standard WSRF operations, plus PIP and PDP components]
Secure Communication: Transport, Message, Conversation (Transport demonstrates best performance)
Configurable Security Policies: Policy Information Points (PIPs), Policy Decision Points (PDP) -- chained
Example authorization PDPs: GridMap, SAML implementations, XACML policies
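A chain of PIPs and PDPs can be pictured as follows. This is a minimal sketch in plain Python, not the GT4 authorization framework: the request format, the function names, the sample subject names, and the rule that every PDP must permit the request are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
PIP = Callable[[Request], Dict[str, str]]
PDP = Callable[[Request], bool]


def vo_attribute_pip(request: Request) -> Dict[str, str]:
    """Hypothetical PIP: attach a VO attribute looked up from the subject name."""
    vo_membership = {"/O=Grid/CN=Alice": "VO-A"}  # illustrative data only
    return {"vo": vo_membership.get(request["subject"], "")}


def make_gridmap_pdp(gridmap: Dict[str, str]) -> PDP:
    """PDP that permits only subjects listed in a grid-map style table."""
    def pdp(request: Request) -> bool:
        return request["subject"] in gridmap
    return pdp


def make_vo_pdp(allowed_vo: str) -> PDP:
    """PDP that permits only requests carrying the expected VO attribute."""
    def pdp(request: Request) -> bool:
        return request.get("vo") == allowed_vo
    return pdp


def authorize(request: Request, pips: List[PIP], pdps: List[PDP]) -> bool:
    """Run PIPs to collect attributes, then require every PDP to permit.

    Requiring unanimous permits is an assumption of this sketch.
    """
    enriched = dict(request)
    for pip in pips:
        enriched.update(pip(enriched))
    return all(pdp(enriched) for pdp in pdps)


if __name__ == "__main__":
    gridmap = {"/O=Grid/CN=Alice": "alice", "/O=Grid/CN=Bob": "bob"}
    chain = [make_gridmap_pdp(gridmap), make_vo_pdp("VO-A")]
    print(authorize({"subject": "/O=Grid/CN=Alice"}, [vo_attribute_pip], chain))
```

Keeping information gathering (PIPs) separate from decisions (PDPs) means a new policy source can be added to the chain without touching the service code.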
49
GT4 WS Core in a Nutshell
[Diagram: Service Container hosting several Services, each with EPRs, Resources, RPs, a ResourceHome, and the standard WSRF operations, plus PIP, PDP, WorkManager, DB Conn Pool, and JNDI Directory]
WorkManager: “thread pool”, site independent “work” manager
Apache Database Connection Pool library (JDBC “DataSource” implementation)
JNDI Directory: manages internal, shared objects (ResourceHomes, WorkManager, Configuration objects,…)
50
GT4 WS Core in a Nutshell
[Diagram: the Service Container, with its Services, PIP, PDP, WorkManager, DB Conn Pool, and JNDI Directory, shown inside Apache Tomcat]
Deploy Service Container “standalone” or within Apache Tomcat
51
GT4 Web Services Runtime
Custom
Web
Services
Custom
GT4
WSRF Web WSRF Web
Services
Services
WS-Addressing, WSRF,
WS-Notification
WSDL, SOAP, WS-Security
Registry
Administration
GT4 Container
User Applications
52
Modeling State in Web Services
Resource
allocation
Authentication
& Authorization
are applied to
all requests
Factory
service
State inspection
Lifetime mgmt
Notifications
Service
requestor
(e.g., user
application)
Discovery
Stateful
Entities
Register
Stateful
Entity
Interactions standardized using WSDL and SOAP
Registry
53
WSRF & WS-Notification

Naming and bindings (basis for virtualization)




Lifecycle (basis for fault resilient state mgmt)

Resources created by services following factory pattern

Resources destroyed immediately or scheduled
Information model (basis for monitoring, discovery)

Resource properties associated with resources

Operations for querying and setting this info

Asynchronous notification of changes to properties
Service groups (basis for registries, collective svcs)


Every resource can be uniquely referenced, and has one or
more associated services for interacting with it
Group membership rules & membership management
Base Fault type
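The lifecycle and information-model bullets above can be illustrated with a small in-memory sketch: a resource with a scheduled termination time and callbacks that fire when a property changes. This is plain Python written for illustration, not the WSRF or WS-Notification API; the class, method, and property names are assumptions.

```python
import time


class NotifyingResource:
    """Toy stateful resource with property-change notification and a lifetime."""

    def __init__(self, properties, termination_time=None):
        self._properties = dict(properties)
        self._subscribers = {}
        self.termination_time = termination_time  # seconds since epoch, or None

    def subscribe(self, prop, callback):
        """Register a callback invoked as callback(name, old, new) on change."""
        self._subscribers.setdefault(prop, []).append(callback)

    def get_property(self, name):
        return self._properties[name]

    def set_property(self, name, value):
        old = self._properties.get(name)
        self._properties[name] = value
        if old != value:
            for callback in self._subscribers.get(name, []):
                callback(name, old, value)

    def is_expired(self, now=None):
        """True once the scheduled termination time has been reached."""
        now = time.time() if now is None else now
        return self.termination_time is not None and now >= self.termination_time


def sweep(resources, now=None):
    """Return only the resources whose termination time has not passed."""
    return {key: r for key, r in resources.items() if not r.is_expired(now)}


if __name__ == "__main__":
    events = []
    resource = NotifyingResource({"status": "Pending"}, termination_time=time.time() + 60)
    resource.subscribe("status", lambda n, old, new: events.append((n, old, new)))
    resource.set_property("status", "Active")
    print(events)
```

Scheduled destruction is what makes the state management fault resilient: if a client disappears without calling destroy, a periodic sweep still reclaims the resource.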
WSRF/WSNs Compared
(HPDC 2005)
54
GT4-Java
GT4-C
pyGridWare
WSRF::Lite
WSRF.NET
Languages supported
Java
C
Python
Perl
C#/C++/VBasic, etc.
WS-Security password profile
Yes
No
In progress
In progress
Yes
WS-Security X.509 profile
Yes
In progress
Yes
In progress
Yes
WS-SecureConversation
Yes
No
Yes
No
Yes
TLS/SSL
Yes
Yes
Yes
Yes
Yes
Multiple
Multiple
Callout
None
Yes
Not default
Yes
Yes
Yes
Memory Footprint
JVM + 10M
22 KB
12 MB
12 MB
Depends
Memory size per WS-Resource
Depends on
resource state
70B
Depends on
resource state
0 (file/DB) or
10B (process)
Depends on resource
state
Unmodified hosting environment
Yes
No
Yes
Yes (Apache)
Yes
Compliance with WS-I Basic
Profile
Yes
Yes
Yes
In progress
Yes
Compliance with
WS-I Basic Security Profile
Yes
Yes
Yes
No
Yes
Log4J
Yes
Yes
Yes
WSE diagnostics
WS-ResourceLifetime
Yes
Yes
Yes
Yes
Yes
WS-ResourceProperties
Yes
Yes
Yes
Yes
Yes
WS-ServiceGroup
Yes
Yes
Yes
Yes
Yes
WS-BaseFaults
Yes
Yes
Yes
Yes
Yes
WS-BaseNotification
Yes
Consumer
Yes
No
Yes
WS-BrokeredNotification
Partial
No
No
No
Yes
WS-Topics
Partial
Partial
Partial
No
Partial
Authorization
Persistence of WS-Resources
Logging
55
GT4 WS Core Performance
GetRP Test: distributed client and service on same LAN (times in milliseconds)
[Chart: GetRP times with No Security, X509 Signing, and HTTPS]
56
(1) Message-level security (times in milliseconds)
Operation   GT4 Java   GT4 C   GT4 Python   WSRF.NET
GetRP       181.96     14.77   140.50       81.39
SetRP       182.04     14.99   142.21       82.48
CreateR     188.46     14.98   132.26       96.22
DestroyR    182.03     15.76   136.12       86.89
Notify      219.51     N/A     244.93       101.57
(2) Transport-level security (times in milliseconds)
Operation   GT4 Java   GT4 C   GT4 Python   WSRF.NET
getRP       11.46      2.85    149.67       12.91
setRP       11.47      2.86    150.79       12.3
createR     18.00      2.82    132.60       20.84
destroyR    14.92      2.71    149.21       16.05
Notify      29.26      9.67    169.07       45.0
“WSRF/WSNs Compared,” HPDC 2005.
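To make the gap between the two security modes concrete, the sketch below computes, for the GetRP operation, how many times slower message-level security is than transport-level security in each implementation. The timings are copied from the two tables above as transcribed; the dictionary and function names are illustrative.

```python
# GetRP timings in milliseconds, as transcribed from the two tables above.
MESSAGE_LEVEL_GETRP_MS = {
    "GT4 Java": 181.96,
    "GT4 C": 14.77,
    "GT4 Python": 140.50,
    "WSRF.NET": 81.39,
}
TRANSPORT_LEVEL_GETRP_MS = {
    "GT4 Java": 11.46,
    "GT4 C": 2.85,
    "GT4 Python": 149.67,
    "WSRF.NET": 12.91,
}


def slowdown(message_ms, transport_ms):
    """Ratio of message-level to transport-level time per implementation."""
    return {name: message_ms[name] / transport_ms[name] for name in message_ms}


if __name__ == "__main__":
    ratios = slowdown(MESSAGE_LEVEL_GETRP_MS, TRANSPORT_LEVEL_GETRP_MS)
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.1f}x")
```

With these figures the ratio is roughly 16x for GT4 Java, about 5x for GT4 C, and about 6x for WSRF.NET, while the GT4 Python figures are nearly the same in both modes.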
58
Globus Security

Control access to shared services



Address autonomous management, e.g.,
different policy in different work-groups
Support multi-user collaborations

Federate through mutually trusted services

Local policy authorities rule
Allow users and application communities to
set up dynamic trust domains

Personal/VO collection of resources working
together based on trust of user/VO
59
Virtual Organization (VO) Concept
Virtual Community C
Person B
(Administrator)
Compute Server C1'
Person A
(Principal Investigator)


Person E
(Researcher)
Person D
(Researcher)
Person B
(Staff)
Compute Server C2
File server F1
(disk A)
Compute Server C1
Person A
(Faculty)
Person C
(Student)
Organization A
Person D File server F1
(Staff) (disks A and B)
Compute Server C3
Person E
(Faculty)
Person F
(Faculty)
Organization B
VO for each application or workload
Carve out and configure resources for a
particular use and set of users
60
GT4 Security
Authz Callout:
SAML, XACML
SSL/WS-Security
with Proxy
Services (running
Certificates
on user’s behalf)
Access
Compute
Center
Rights
CAS or VOMS
issuing SAML
or X.509 ACs
Users
Rights
Local policy
on VO identity
or attribute
authority
MyProxy
VO
Rights’
KCA
61
GT4 Security


Public-key-based authentication
Extensible authorization framework based
on Web services standards

SAML-based authorization callout


Integrated policy decision engine


As specified in GGF OGSA-Authz WG
XACML policy language, per-operation policies,
pluggable
Credential management service

MyProxy (One time password support)

Community Authorization Service

Standalone delegation service
62
GT4’s Use of Security Standards
Supported,
but slow
Supported,
but insecure
Fastest,
so default
63
GT-XACML Integration

eXtensible Access Control Markup Language

OASIS standard, open source implementations

XACML: sophisticated policy language

Globus Toolkit ships with XACML runtime



Included in every client and server built on GT

Turned-on through configuration
… that can be called transparently from
runtime and/or explicitly from application …
… and we use the XACML “model” for
our Authz Processing Framework
64
GT Authorization Framework
65
Other Security Services Include …


MyProxy

Simplified credential management

Web portal integration

Single-sign-on support
KCA & kx.509


SimpleCA


Bridging into/out-of Kerberos domains
Online credential generation
PERMIS

Authorization service callout
67
GT4 Data Management


Stage/move large data to/from nodes

GridFTP, Reliable File Transfer (RFT)

Alone, and integrated with GRAM
Locate data of interest


Replicate data for performance/reliability


Replica Location Service (RLS)
Distributed Replication Service (DRS)
Provide access to diverse data sources


File systems, parallel file systems,
hierarchical storage: GridFTP
Databases: OGSA DAI
68
GridFTP in GT4
100% Globus code
No licensing issues
Stable, extensible
IPv6 Support
XIO for different transports
Striping: multi-Gb/sec wide area transport
27 Gbit/s on 30 Gbit/s link
Pluggable
Front-end: e.g., future WS control channel
Back-end: e.g., HPSS, cluster file systems
Transfer: e.g., UDP, NetBLT transport
[Chart: Bandwidth vs. Striping, disk-to-disk on TeraGrid; x-axis Degree of Striping, y-axis Bandwidth (Mbps); one series each for # Stream = 1, 2, 4, 8, 16, 32]
69
Reliable File Transfer:
Third Party Transfer

Fire-and-forget transfer

Web services interface

Many files & directories

Integrated failure recovery

Has transferred 900K files
RFT Client
SOAP
Messages
RFT Service
GridFTP Server
Master
DSI
Protocol
Interpreter
GridFTP Server
Data
Channel
Data
Channel
IPC Link
IPC
Receiver
Notifications
(Optional)
Protocol
Interpreter
Master
DSI
IPC Link
Slave
DSI
Data
Channel
Data
Channel
Slave
DSI
IPC
Receiver
70
Replica Location Service




Identify location of files
via logical to physical
name map
Distributed indexing of
names, fault tolerant
update protocols
GT4 version scalable &
stable
Managing ~40 million
files across ~10 sites
Local DB   Update send (secs)   Bloom filter (secs)   Bloom filter (bits)
10K        <1                   2                     1M
1M         2                    24                    10M
5M         7                    175                   50M
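The combination of a logical-to-physical name map with compact Bloom filter summaries sent to an index can be sketched as follows. This is a simplified illustration in plain Python, not the RLS implementation or its protocol: the class names, the filter size, the hashing scheme, and the example names and URLs are all assumptions.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a Python integer

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for position in self._positions(item):
            self.bits |= 1 << position

    def might_contain(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))


class LocalReplicaCatalog:
    """Per-site map from logical file names to physical locations."""

    def __init__(self, site):
        self.site = site
        self._mappings = {}

    def register(self, logical_name, physical_url):
        self._mappings.setdefault(logical_name, []).append(physical_url)

    def lookup(self, logical_name):
        return list(self._mappings.get(logical_name, []))

    def summary(self):
        """Build the compact summary that would be sent to an index."""
        bloom = BloomFilter()
        for logical_name in self._mappings:
            bloom.add(logical_name)
        return bloom


class ReplicaIndex:
    """Index holding one summary per site; asks only sites that may match."""

    def __init__(self):
        self._sites = {}

    def update(self, catalog):
        self._sites[catalog.site] = (catalog, catalog.summary())

    def locate(self, logical_name):
        results = []
        for catalog, bloom in self._sites.values():
            if bloom.might_contain(logical_name):
                results.extend(catalog.lookup(logical_name))
        return results


if __name__ == "__main__":
    site_a = LocalReplicaCatalog("siteA")
    site_a.register("lfn://example/run42.dat", "gsiftp://host-a.example.org/data/run42.dat")
    index = ReplicaIndex()
    index.update(site_a)
    print(index.locate("lfn://example/run42.dat"))
```

In this sketch the index keeps a direct reference to each catalog for brevity; in a distributed deployment it would hold only the summary and contact the site over the network.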
Reliable Wide Area Data
Replication
71
LIGO Gravitational Wave Observatory
Birmingham•
Cardiff
AEI/Golm
Replicating >1 Terabyte/day to 8 sites
>30 million replicas so far
MTBF = 1 month www.globus.org/solutions
72
OGSA-DAI


Provide service-based access to structured
data resources as part of Globus
Specify a selection of interfaces tailored to
various styles of data access—starting with
relational and XML
73
The OGSA-DAI Framework
Application
Client Toolkit
OGSA-DAI service
Engine
SQLQuery
readFile
XPath
XSLT
GZip
GridFTP
Activities
JDBC
XMLDB
File
Data
Resources
SQL
MySQL
DB2
Server
XIndice
SWISS
PROT
Databases
74
Extensibility Example
OGSA-DAI service
Engine
SQLQuery
Multiple
JDBC
SQL GDS
SQL
SQL
JDBC
JDBC
MySQL
SQL
SQL
JDBC
JDBC
OGSA-DAI: A Framework
for Building Applications

Supports data access, insert and update




Supports data delivery





SOAP over HTTP
FTP; GridFTP
E-mail
Inter-service
Supports data transformation



Relational: MySQL, Oracle, DB2, SQL Server,
Postgres
XML: Xindice, eXist
Files – CSV, BinX, EMBL, OMIM, SWISSPROT,…
XSLT
ZIP; GZIP
Supports security

X.509 certificate based security
75
76
OGSA-DAI: Other Features

A framework for building data clients


A framework for developing functionality



Client toolkit library for application
developers
Extend existing activities, or implement
your own
Mix and match activities to provide
functionality you need
Highly extensible


Customise our out-of-the-box product
Provide your own services, client-side
support, and data-related functionality
78
Execution Management (GRAM)

Common WS interface to schedulers



Unix, Condor, LSF, PBS, SGE, …
More generally: interface for process
execution management

Lay down execution environment

Stage data

Monitor & manage lifecycle

Kill it, clean up
A basis for application-driven provisioning
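The lifecycle steps listed above (lay down environment, stage data, monitor and manage, kill and clean up) can be read as a small state machine. The sketch below is illustrative only: the state names and allowed transitions are assumptions made for this example and are not taken from the GRAM specification.

```python
from enum import Enum


class JobState(Enum):
    # Illustrative state names, not GRAM's own.
    UNSUBMITTED = "unsubmitted"
    STAGE_IN = "stage_in"
    ACTIVE = "active"
    CLEAN_UP = "clean_up"
    DONE = "done"


ALLOWED = {
    JobState.UNSUBMITTED: {JobState.STAGE_IN, JobState.DONE},
    JobState.STAGE_IN: {JobState.ACTIVE, JobState.CLEAN_UP},
    JobState.ACTIVE: {JobState.CLEAN_UP},
    JobState.CLEAN_UP: {JobState.DONE},
    JobState.DONE: set(),
}


class Job:
    """Tracks a job through the lifecycle and records every state visited."""

    def __init__(self):
        self.state = JobState.UNSUBMITTED
        self.history = [self.state]

    def advance(self, new_state):
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.name} to {new_state.name}")
        self.state = new_state
        self.history.append(new_state)

    def kill(self):
        """Terminate the job, always passing through clean-up if it had started."""
        if self.state is JobState.DONE:
            return
        if self.state is JobState.UNSUBMITTED:
            self.advance(JobState.DONE)
            return
        if self.state is not JobState.CLEAN_UP:
            self.advance(JobState.CLEAN_UP)
        self.advance(JobState.DONE)


if __name__ == "__main__":
    job = Job()
    job.advance(JobState.STAGE_IN)
    job.advance(JobState.ACTIVE)
    job.kill()
    print([state.name for state in job.history])
```

Making clean-up a mandatory state on the way to completion reflects the "kill it, clean up" bullet: a killed job still releases what it laid down.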
79
GT4 WS GRAM


2nd-generation WS implementation
optimized for performance, flexibility,
stability, scalability
Streamlined critical path


Flexible credential management


Use only what you need
Credential cache & delegation service
GridFTP & RFT used for data operations

Data staging & streaming output

Eliminates redundant GASS code
80
GT4 WS GRAM Architecture
Service host(s) and compute element(s)
Job events
Client
Delegate
Delegation
Transfer
request
RFT File
Transfer
SEG
Compute element
Local job control
sudo
GT4 Java Container
GRAM
GRAM
services
services
GRAM
adapter
GridFTP
FTP
control
Local
scheduler
User
job
FTP data
GridFTP
Remote
storage
element(s)
81
GT4 WS GRAM Architecture
[Diagram: same architecture as on the previous slide]
Delegated credential can be:
Made available to the application
Used to authenticate with RFT
Used to authenticate with GridFTP
84
WS GRAM Performance

Time to submit a basic GRAM job
Pre-WS GRAM: < 1 second
 WS GRAM: 2 seconds


Concurrent jobs
Pre-WS GRAM: 300 jobs
 WS GRAM: 32,000 jobs


Various studies are underway to test latest
software
85
GT4 WS GRAM Performance
Sustained Job Load
Per Client Thread (N)
Number of Client Threads (M)
1
2
4
8
16
32
64
128
1
7
15
29
57
80
69
69
70
2
15
29
58
79
74
70
70
64
4
29
58
78
77
68
69
52
69
8
59
77
77
72
65
27
16
77
77
75
64
27
32
76
75
68
64
67
64
75
73
70
66
65
128
80
72
64
63
71
69
50
All numbers are simple jobs/minute, no delegation or staging
86
Workspace Service:
The Hosted Activity
Policy
Client
Negotiate access
Initiate activity
Monitor activity
Control activity
Activity
Environment
Interface
Resource provider
87
Activities Can Be Nested
Client
Policy
Client
Client
Environment
Interface
Resource provider
88
For Example …
Deploy service
Deploy container
Deploy virtual machine
Deploy hypervisor/OS
Procure hardware
JVM
JVM
VM
VM
Hypervisor/OS
Physical machine
Provisioning, management, and monitoring at all levels
89
Dynamic Service Deployment
Community
A
• Community
scheduling logic
• Data distribution
• Community
management
• Science services
• ...
…
Community
Z
Requirements:
• Community
control
• Persistence
• Resource
guarantees
• Noninterference
90
Virtual Machine Costs
[Chart: job startup scenarios, time (in seconds), broken down into VM setup, VM boot, job setup, and GRAM job; scenarios shown: GRAM job, job in booted VM, GRAM job in paused VM]
91
Virtual OSG Clusters
OSG
OSG cluster
Xen hypervisors
TeraGrid cluster
93
Monitoring and Discovery

“Every service should be monitorable and
discoverable using common mechanisms”



WSRF/WSN provides those mechanisms
A common aggregator framework for
collecting information from services, thus:

MDS-Index: XPath queries, with caching

MDS-Trigger: perform action on condition

(MDS-Archiver: XPath on historical data)
Deep integration with Globus containers &
services: every GT4 service is discoverable

GRAM, RFT, GridFTP, CAS, …
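The Index and Trigger roles described above can be illustrated with a minimal sketch. This is not MDS4 code: the XML element names, the `Index` class and the `trigger` function are hypothetical, and Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical aggregated document; element names are illustrative,
# not the real MDS4 schema.
INDEX_XML = """
<Index>
  <Entry service="GRAM" host="hostA"><FreeCPUs>12</FreeCPUs></Entry>
  <Entry service="GridFTP" host="hostB"><FreeCPUs>0</FreeCPUs></Entry>
  <Entry service="RFT" host="hostA"><FreeCPUs>3</FreeCPUs></Entry>
</Index>
""".strip()


class Index:
    """Answers path queries over an aggregated document, caching results."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)
        self.cache = {}

    def query(self, path):
        # Evaluate each distinct path once; later calls reuse the cached list.
        if path not in self.cache:
            self.cache[path] = self.root.findall(path)
        return self.cache[path]


def trigger(index, path, condition, action):
    """Run action on every matching entry for which condition holds."""
    fired = []
    for entry in index.query(path):
        if condition(entry):
            action(entry)
            fired.append(entry)
    return fired


index = Index(INDEX_XML)
on_host_a = index.query("./Entry[@host='hostA']")

alerts = []
trigger(
    index,
    "./Entry",
    condition=lambda e: int(e.findtext("FreeCPUs")) == 0,
    action=lambda e: alerts.append(e.get("service")),
)

print([e.get("service") for e in on_host_a], alerts)
```

The cache here never expires, which is a simplification; a real aggregator would have to refresh entries as registered services update their properties.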
94
GT4 Monitoring & Discovery
[Figure: clients (e.g., WebMDS) and users query MDS-Index services hosted in GT4 containers via WS-ServiceGroup registration and WSRF/WSN access. Services such as GRAM, RFT and GridFTP appear as registered sources, with automated registration in the container and an adapter using custom protocols for non-WSRF entities.]
95
Index Server Performance



As the MDS4 Index grows, query rate falls and
response time rises, although both change
sublinearly
Response time slows due to increasing
data transfer size

Full Index is being returned

Response is re-built for every query
Real question – how much over simple WSN performance?
96
Information Providers



GT4 information providers collect
information from some system and make it
accessible as WSRF resource properties
Growing number of information providers

Ganglia, CluMon, Nagios

SGE, LSF, OpenPBS, PBSPro, Torque
Many opportunities to build additional ones

E.g., network monitoring, storage systems,
various sensors
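As a rough illustration of what an information provider does, the sketch below collects values from a local system (disk usage, as a stand-in for a storage system) and publishes them as an XML property document. The function names and element names are hypothetical and do not follow the GT4 provider interface or any WSRF schema.

```python
import shutil
import xml.etree.ElementTree as ET


def collect_storage(path="."):
    """Collect raw values from the underlying system (here: local disk usage)."""
    usage = shutil.disk_usage(path)
    return {"TotalBytes": usage.total, "FreeBytes": usage.free}


def to_resource_properties(name, values):
    """Serialize collected values as a simple XML property document.

    The element and attribute names are assumptions for illustration only.
    """
    root = ET.Element("ResourceProperties", {"provider": name})
    for key, value in values.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


doc = to_resource_properties("storage", collect_storage())
print(doc)
```

The split between a collection step and a publication step is the point of the sketch: a provider for a batch scheduler or a network monitor would swap out only `collect_storage`.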
97
GT4 Summary
[Figure: GT4 component summary. CLIENT side: your C, Java, and Python clients. SERVER side: Java services in Apache Axis plus GT libraries and handlers; Python hosting plus GT libraries; C services using GT libraries and handlers; each can host your own Java, Python, or C service. Components shown: Pre-WS MDS, Pre-WS GRAM, GridFTP, RLS, MyProxy, SimpleCA, CAS, OGSA-DAI, GTCP, Delegation, Index, Trigger, Archiver, GRAM, RFT, C WS Core, pyGlobus WS Core. Clients and servers are linked by interoperable WS-I-compliant SOAP messaging, with X.509 credentials as the common authentication.]
GT4 Documentation is Much Improved!
99
Overview

Background and Globus approach

Globus Toolkit: current capabilities

Future directions

Related tools
100
The Globus Commitment
to Open Source


Globus was first established as an open
source project in 1996
The Globus Toolkit is open source to:

allow for inspection: for consideration in standardization processes

encourage adoption: in pursuit of ubiquity and interoperability

encourage contributions: harness the expertise of the community
The Globus Toolkit is distributed under the
(BSD-style) Apache License version 2
101
The Future:
Structure


NSF Community Driven Improvement of
Globus Software (CDIGS) project

5 years of funding for GT enhancement

Regular Globus roadmaps outlining plans
GlobDev
http://dev.globus.org

Apache-like community development site

Community governance of components

“Globus Toolkit” & other related software

Open for business early 2006

“Globus Alliance” = “GlobDev committers”
102
GlobDev

The current set of Globus components will
be organized into several “Globus Projects”

Projects release products

Each project will have its own group of
“Committers”

Committers are responsible for governance
on matters relating to their products

The “Globus Management Committee” will

provide overall guidance and conflict
resolution

approve the creation of new Globus Projects
103
The Future:
Content


We now have a solid and extremely
powerful Web services base
Next, we will build an expanded open
source Grid infrastructure



Virtualization
New services for provisioning, data
management, security, VO management

End-user tools for application development

Etc., etc.
And of course responding to user requests
for other short-term needs
105
Short-Term Priorities:
Security






Improve GSI error reporting & diagnostics
Secure password, one-time password,
Kerberos support for initial log on
Trust roots, use of GridLogon
Identity/attribute assertions in GT auth.
callouts (e.g., Shib, PERMIS, VOMS, SAML)
Extend CAS admin & policy support
Security logging with management control
for audit purposes
106
Short-Term Priorities:
Data Management

Space & bandwidth management in
GridFTP

Concurrency in globus-url-copy

Priorities in RFT

Data replication service

Enhance policy support in data services

Physical file name creation service

Scalable & distributed metadata manager
107
Short-Term Priorities:
Execution Management

Implement GGF JSDL once finalized

Advance reservation support

Policy-driven restart of “persistent” jobs

Improved information collection for jobs

Improved management of job collections

Credential refresh

Development of workspace service


Integration of virtual machines (Xen,
VMware) and associated services
Windows port of WS GRAM
108
Short-Term Priorities:
Information Services

Many more information sources, including
gateways to other systems

Automated configuration of monitoring

Specialized monitoring displays

Performance optimization of registry

Archiver service

Helper tools to streamline integration of
new information sources
109
Short-Term Priorities:
WS Core

Streamlined container configuration

Remote management interface

Dynamic service deployment

Service isolation: multiple service
instances

WS-Notification, subscription performance

Full functionality in C WS Core

Optimized WS-ServiceGroup support

WS-SecureConversation support
110
What to Expect from the
Globus Alliance in the Coming Months

Support for users of GT4




Working to make sure the toolkit meets
user needs

Answering questions on the mailing lists

Further improving documentation
Normal evolution of performance,
scalability and feature enhancements
Further development of tools and services
in support of VOs
Expanding contributions to Globus
111
Overview

Background and Globus approach

Globus Toolkit: current capabilities

Future directions

Related tools
112
The Globus Ecosystem

Globus components address core issues
relating to resource access, monitoring,
discovery, security, data movement, etc.

GT4 being the latest version

A larger Globus ecosystem of open source
and proprietary components provides
complementary components

A growing list of components

These components can be combined to
produce solutions to Grid problems

We’re building a list of such solutions
113
Many Tools Build on, or Can
Contribute to, GT4-Based Grids










Condor-G, DAGMan
MPICH-G2
GRMS
Nimrod-G
Ninf-G
Open Grid Computing Env.
Commodity Grid Toolkit
GriPhyN Virtual Data System
Virtual Data Toolkit
GridXpert Synergy












Platform Globus Toolkit
VOMS
PERMIS
GT4IDE
Sun Grid Engine
PBS scheduler
LSF scheduler
GridBus
TeraGrid CTSS
NEES
IBM Grid Toolbox
…
114
Documenting the Grid Ecosystem
The Grid Ecosystem: Software Components for Grid Systems and Applications
www.grids-center.org
115
Example Solutions

Portal-based User Reg. System (PURSE)

VO Management Registration Service

Service Monitoring Service

TeraGrid TGCP Tool

Lightweight Data Replicator

GriPhyN Virtual Data System
116
Condor-G


The Condor Project @ U Wisconsin Madison
develops software for high-throughput
computing on collections of distributed
compute resources
Condor-G is an interface to GRAM created
by the Condor team that allows users to
submit jobs to GRAM servers
117
GridShib

Allows the use of Shibboleth-transported
attributes for authorization in GT4
deployments

And, more generally, SAML support

2-year project started December 1, 2004

Participants


Von Welch, UIUC/NCSA (PI)

Kate Keahey, UChicago/Argonne (PI)

Frank Siebenlist, Argonne

Tom Barton, UChicago
Beta software released September 16, 2005
118
Handle System


The Handle System from CNRI
(http://www.handle.net) is a general-purpose
global name service enabling
secure name resolution over the internet
The Handle System-GT Integration Project
leverages the Handle System for identifier
and resolution services through tight
integration with GT4’s Web services
protocols
119
MPICH-G2


MPICH-G2, developed at Northern Illinois
University and Argonne National Lab, is a
grid-enabled implementation of the MPI
v1.1 standard
MPICH-G2 is implemented using the
pre-WS GRAM component in GT4;
integration with GT4 WS GRAM is expected
in the near future
120
Nimrod/G


Nimrod is a specialized parametric
modeling system from Monash University
Nimrod/G uses a simple declarative
parametric modeling language to express
parameter sweep experiments. Based on
GT4 WS services, Nimrod/G enables the
formulation, execution and monitoring of
multiple individual parametric experiments
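For readers unfamiliar with parameter sweeps, the idea can be sketched in a few lines of Python. This is not Nimrod/G's declarative language: the parameter names, the `expand_sweep` helper and the `run_experiment` stand-in are all hypothetical.

```python
from itertools import product

# Hypothetical sweep declaration: each parameter lists the values to explore.
parameters = {
    "pressure": [1.0, 2.0],
    "temperature": [300, 350, 400],
}


def expand_sweep(params):
    """Expand a declaration into one job description per parameter combination."""
    names = list(params)
    combos = product(*(params[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def run_experiment(point):
    """Stand-in for the real model; a Grid tool would dispatch this remotely."""
    return point["pressure"] * point["temperature"]


jobs = expand_sweep(parameters)
results = [(job, run_experiment(job)) for job in jobs]

for job, value in results:
    print(job, value)
```

Each entry in `jobs` is independent of the others, which is what makes sweeps a natural fit for execution across many Grid resources.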
121
Ninf-G4


Ninf-G4, from AIST, is a reference
implementation of the GGF standard
GridRPC API
Ninf-G4 provides higher-level
programming APIs for the development
and execution of parallel applications on
the Grid
122
PERMIS


PERMIS is an EU-funded Privilege
Management service that implements
Role-Based Access Control
Thanks to the work of the UK Grid
Engineering Task Force, services running in
a Java WS Core container can use PERMIS
via GT4’s SAML authorization callouts
123
SRB


SRB is a package from SDSC providing a
uniform interface for connecting to
network-based heterogeneous data
resources
GT4’s GridFTP includes an interface to SRB
data sources, and vice versa
124
Sun Grid Engine


Sun Grid Engine is an open source
distributed resource management system
from Sun Microsystems
In a collaboration between the London
e-Science Centre, Gridwise and MCNC, the
Sun Grid Engine has been integrated with
GT4
125
Tell Us About Your
Grid Tools & Solutions



We list links to related projects on the
“Related Software” page of the Globus Toolkit
web site: www.globus.org/toolkit/tools/
“Solutions” are documented on the Globus
web site: www.globus.org/solutions/
If we’ve got details wrong or you have a
GT4-related tool to list on our website,
please send mail to [email protected]
126
Questions?