Globus Virtual Workspaces

An Update
SC 2007, Reno, NV
Kate Keahey
Argonne National Laboratory
University of Chicago
Motivation and Background

Why Virtual Workspaces?
  Quality of Service
    We get: batch-style provisioning
      One size fits all
      Side-effect of job scheduling
    We need: advance reservations, urgent computing, periodic, best-effort, and others
      Separation of job scheduling and resource management
      E.g. workflow-based apps and batch apps have different needs
  Quality of Life
    We have: “a 100 nodes we cannot use”
    Complex applications
      Hard to install
      Require validation
    Separation of environment preparation and resource leasing
SC07, Reno, NV
Virtual Workspaces: http://workspace.globus.org
What are Virtual Workspaces?
  A dynamically provisioned environment
    Environment definition: we get exactly the (software) environment we need on demand.
    Resource allocation: provision the resources the workspace needs (CPUs, memory, disk, bandwidth, availability), allowing for dynamic renegotiation to reflect changing requirements and conditions.
  Implementation
    Traditional means: publishing, automated configuration, coarse-grained enforcement
    Virtual Machines: encapsulated configuration and fine-grained enforcement
  Paper: “Virtual Workspaces: Achieving Quality of Service and Quality of Life in the Grid”
Virtual Machines
  [Figure: three VMs, each running applications on its own guest OS (Linux, NetBSD, Windows), on top of a Virtual Machine Monitor (VMM) / Hypervisor, which runs on the hardware. VMMs named: Parallels, Xen, VMWare, UML, KVM, etc.]
  Bring your environment with you
  Fast to deploy, enables short-term leasing
  Excellent enforcement, performance isolation
  Very good isolation
Globus Virtual Workspaces: How Do They Work?

Virtual Workspaces: Vital Stats
  The GT4 Virtual Workspace Service (VWS) allows an authorized client to deploy and manage workspaces on-demand.
    GT4 WSRF front-end (one per site)
      Leverages GT core and services, notifications, security, etc.
    Follows WS-Agreement provisioning model
    Currently implements workspaces as Xen VMs
      Other implementations could also be used
    Implements multiple deployment modes
      Best-effort, leasing, etc.
  Current release 1.3 (November ’07)
  Globus incubator project
  More information at: http://workspace.globus.org
Deploying Workspaces Remotely
  [Figure: the VWS Service in front of a pool of nodes]
  Workspace
    Workspace metadata
      Pointer to the image
      Logistics information
    Deployment request
      CPU, memory, node count, etc.
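The two parts of a workspace listed above (metadata and deployment request) can be pictured as a small data structure. The following is a minimal sketch only: the class names, field names, and the example image URI are assumptions made for illustration and are not the actual VWS schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WorkspaceMetadata:
    """Describes what the workspace is (field names are illustrative)."""
    image_uri: str                                  # pointer to the VM image
    logistics: dict = field(default_factory=dict)   # e.g. networking needs


@dataclass(frozen=True)
class DeploymentRequest:
    """Describes the resources the workspace asks for."""
    cpus: int
    memory_mb: int
    node_count: int

    def problems(self) -> list:
        """Return human-readable problems; an empty list means valid."""
        issues = []
        if self.cpus < 1:
            issues.append("cpus must be at least 1")
        if self.memory_mb < 1:
            issues.append("memory_mb must be at least 1")
        if self.node_count < 1:
            issues.append("node_count must be at least 1")
        return issues


@dataclass(frozen=True)
class Workspace:
    metadata: WorkspaceMetadata
    request: DeploymentRequest

    def total_memory_mb(self) -> int:
        """Memory needed across all requested nodes."""
        return self.request.memory_mb * self.request.node_count
```

Keeping the environment description separate from the resource request mirrors the separation of environment preparation and resource leasing argued for earlier in the talk.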
Interacting with Workspaces
  The workspace service publishes information on each workspace as standard WSRF Resource Properties.
  Users can query those properties to find out information about their workspace (e.g. what IP the workspace was bound to).
  Users can interact directly with their workspaces the same way they would with a physical machine.
  [Figure: the VWS Service and its pool nodes, inside the Trusted Computing Base (TCB)]
Workspace Service Components
  Workspace WSRF front-end that allows clients to deploy and manage virtual workspaces
  Workspace back-end: resource manager for a pool of physical nodes
    Deploys and manages workspaces on the nodes
  Each node must have a VMM (Xen) installed, as well as the workspace control program that manages individual nodes
  Contextualization creates a common context for a virtual cluster
  [Figure: the VWS Service and its pool nodes, inside the Trusted Computing Base (TCB)]
Workspace Service Components
  GT4 WSRF front-end
    Leverages GT core and services, notifications, security, etc.
    Follows the OGF WS-Agreement provisioning model
      Publishes available lease terms
      Provides lease descriptions
  Workspace Service back-end
    Currently focused on Xen
    Works with multiple Resource Managers
  Workspace Control
  Contextualization
    Put the virtual appliance in its deployment context
Managing Resources with Virtual Workspaces

Workspace Back-Ends
  Default resource manager (basic slot fitting)
    Commercial datacenter technology would also fit
  Challenge: finding Xen-enabled resources
    Amazon Elastic Compute Cloud (EC2)
      Selling cycles as Xen VMs
      Software similar to Workspace Service
      No virtual clusters, contextualization, fine-grain allocations, etc.
    Solution: develop a back-end to EC2
      Grid credential admission -> EC2 charging model
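The slides do not spell out what "basic slot fitting" does. As a rough illustration only, the sketch below assumes a first-fit policy over CPU and memory, with all-or-nothing placement for multi-node requests; the `Node` class and `fit_slots` function are hypothetical names, not part of the Workspace Service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    free_cpus: int
    free_memory_mb: int


def fit_slots(nodes: List[Node], cpus: int, memory_mb: int,
              count: int) -> Optional[List[str]]:
    """First-fit: place `count` identical VMs, or return None and change nothing."""
    placement: List[str] = []
    # Work on tentative values so a failed request leaves no partial claim.
    tentative: Dict[str, List[int]] = {
        n.name: [n.free_cpus, n.free_memory_mb] for n in nodes
    }
    for _ in range(count):
        for n in nodes:
            free = tentative[n.name]
            if free[0] >= cpus and free[1] >= memory_mb:
                free[0] -= cpus
                free[1] -= memory_mb
                placement.append(n.name)
                break
        else:
            return None  # no node can host this VM
    # Commit the claim only once every VM has a slot.
    for n in nodes:
        n.free_cpus, n.free_memory_mb = tentative[n.name]
    return placement
```

A real resource manager would also track time (lease start and end); this sketch fits slots at a single instant only.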
Virtual Workspaces for STAR
  STAR image configuration
    A virtual cluster composed of an OSG headnode and STAR worker nodes
  Using the workspace service over EC2 to provision resources
    Allocations of up to 100 nodes
    Dynamically contextualized for out-of-the-box cluster
  with thanks to Jerome Lauret and Doug Olson of the STAR project
[Figure: accelerated display of a workflow job state (Y = job number, X = job state). Panels show running-job counts at BNL, WSU, Fermi, NERSC PDSF, and EC2 (via Workspace Service), with labels for job completion and file recovery. With thanks to Jerome Lauret and Doug Olson of the STAR project.]
Workspace Back-Ends
  Default resource manager (basic slot fitting)
    Commercial datacenter technology would also fit
  Challenge: finding Xen-enabled resources
    Amazon Elastic Compute Cloud (EC2)
      Selling cycles as Xen VMs
      Software similar to Workspace Service
      No virtual clusters, contextualization, fine-grain allocations, etc.
    Solution: develop a back-end to EC2
      Grid credential admission -> EC2 charging model
  Challenge: integrating VMs into current provisioning models
    Solution: gliding in VMs with the Workspace Pilot
Providing Resources: The Workspace Pilot
  Challenge: find the simplest way to integrate VMs into current provisioning models
  Glide-ins (Condor): poor man’s resource leasing
    Best-effort semantics: submit a job “pilot” that claims resources but does not run a job
  The Workspace Pilot
    Resources booted to dom0
    Pilot adjusts memory
    VWS leases “slots” to VMs
    Functional closure: kill-all facility, etc.
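To make the pilot steps above concrete, here is a toy model of the memory bookkeeping on one node. It is a sketch under stated assumptions: that dom0 initially holds all of the node's memory, that the pilot shrinks dom0 to a fixed floor to free memory for VM slots, and that ending the pilot kills every VM and returns the memory. The class `PilotNode` and its methods are hypothetical and make no Xen calls.

```python
class PilotNode:
    """One physical node booted into dom0, as seen by a hypothetical pilot."""

    def __init__(self, total_mb: int, dom0_floor_mb: int):
        self.total_mb = total_mb
        self.dom0_floor_mb = dom0_floor_mb
        self.dom0_mb = total_mb      # assumption: dom0 starts with all memory
        self.vms = {}                # VM name -> memory in MB
        self.pilot_active = False

    def free_mb(self) -> int:
        """Memory not held by dom0 or by any leased VM slot."""
        return self.total_mb - self.dom0_mb - sum(self.vms.values())

    def start_pilot(self) -> int:
        """Shrink dom0 to its floor; return the memory freed for VM slots."""
        self.dom0_mb = self.dom0_floor_mb
        self.pilot_active = True
        return self.free_mb()

    def lease_slot(self, vm: str, memory_mb: int) -> bool:
        """Lease a slot to a VM if the pilot is running and memory allows."""
        if not self.pilot_active or memory_mb > self.free_mb():
            return False
        self.vms[vm] = memory_mb
        return True

    def end_pilot(self) -> list:
        """Kill-all: remove every VM and give the memory back to dom0."""
        killed = sorted(self.vms)
        self.vms.clear()
        self.dom0_mb = self.total_mb
        self.pilot_active = False
        return killed
```

The kill-all step is what gives the pilot functional closure in this model: once the pilot job ends, the node looks to the local scheduler exactly as it did before.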
Workspace Control
  VM control
    Starting, stopping etc.
    To be replaced by Xen API
  Integrating into the network
    Assigning MAC addresses and IP addresses
    DHCP Delivery tool
    Building up a trusted networking layer
  VM image propagation
  Image management and reconstruction
    Creating blank partitions
  Talks to the workspace service via ssh
    To be replaced
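As an illustration of the "assigning MAC addresses and IP addresses" step, the sketch below pairs each usable IP address in a network with a unique MAC address. The function name and the sequential numbering scheme are assumptions; the only external fact used is that `00:16:3e` is the MAC prefix reserved for Xen guests.

```python
import ipaddress
from typing import Iterator, Tuple

XEN_OUI = "00:16:3e"  # MAC prefix reserved for Xen guests


def address_pairs(network: str) -> Iterator[Tuple[str, str]]:
    """Yield (MAC, IP) pairs for successive usable hosts in `network`."""
    for index, ip in enumerate(ipaddress.ip_network(network).hosts()):
        if index >= 1 << 24:
            return  # the 24-bit MAC suffix space is exhausted
        suffix = index.to_bytes(3, "big")
        mac = XEN_OUI + ":" + ":".join(f"{b:02x}" for b in suffix)
        yield mac, str(ip)
```

A fixed MAC-to-IP pairing like this is what lets a DHCP-based delivery tool hand each VM the address that the service recorded for it.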
Workspace Back-Ends
  Default resource manager (basic slot fitting)
    Commercial datacenter technology would also fit
  Challenge: finding Xen-enabled resources
    Amazon Elastic Compute Cloud (EC2)
      Selling cycles as Xen VMs
      Software similar to Workspace Service
      No virtual clusters, contextualization, fine-grain allocations, etc.
    Solution: develop a back-end to EC2
      Grid credential admission -> EC2 charging model
  Challenge: integrating VMs into current provisioning models
    Solution: gliding in VMs with the Workspace Pilot
  Long-term solutions
    Interleaving soft and hard leases
    Providing better articulated leasing models
    Developed in the context of existing schedulers
So -- you’ve deployed* some VMs… Now What?
  *Do they have public IP addresses? Do they actually represent something useful? (BTW, I need an OSG cluster.) Can the VMs find out about each other? Can they share storage? How do they integrate into the site storage/account system? Do they have host certificates? And gridmapfile? And all the other things that will integrate them into my VO?
Virtual Clusters
  Challenge: what is a virtual cluster?
    A more complex virtual machine
      Networking, shared storage, etc. that will be portable across sites and implementations
    Available at the same time and sharing a common context
    Example:
      A set of worker nodes with some edge services in front and NFS-based shared storage
  Solution: management of ensembles and sharing
    Configurable cluster deployment
      A set of worker nodes
      A few Edge Services enabling access to those nodes
    Exporting and sharing a common context
      Configuring and joining context
    Networking
      Edge Services have public IPs
      Worker nodes are on a private network shared with the Edge Services
  Paper: “Virtual Clusters for Grid Communities”, CCGrid 2006
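The networking layout described above (Edge Services with public IPs, all nodes on a shared private network) can be sketched as a simple address plan. The function name, node names, and return format are assumptions made for illustration.

```python
import ipaddress
from itertools import islice
from typing import Dict, List


def plan_cluster_network(edge: List[str], workers: List[str],
                         public_ips: List[str],
                         private_net: str) -> Dict[str, Dict[str, str]]:
    """Give every node a private address and each edge service a public one."""
    if len(public_ips) < len(edge):
        raise ValueError("not enough public IPs for the edge services")
    names = edge + workers
    # Take only as many private addresses as the cluster needs.
    hosts = list(islice(ipaddress.ip_network(private_net).hosts(), len(names)))
    if len(hosts) < len(names):
        raise ValueError("private network is too small for the cluster")
    plan = {name: {"private": str(ip)} for name, ip in zip(names, hosts)}
    for name, pub in zip(edge, public_ips):
        plan[name]["public"] = pub
    return plan
```

Failing before any address is handed out keeps the plan all-or-nothing, in line with the requirement that the nodes of a virtual cluster be available at the same time.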
Contextualization
  Challenge: Putting a VM in the deployment context of the Grid, site, and other VMs
    Assigning and sharing IP addresses, name resolution, application-level configuration, etc.
  Solution: Management of Common Context
    Configuration-dependent
      provides & requires
    Common understanding between the image “vendor” and deployer
    Mechanisms for securely delivering the required information to images across different implementations
  [Figure: a contextualization agent and the Common Context (IP, hostname, pk)]
  Paper: “A Scalable Approach To Deploying And Managing Appliances”, TeraGrid conference 2007
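The "provides & requires" idea can be illustrated with a small matching routine: each node declares the roles it provides and the roles it requires, and the common context tells every node who provides what it needs. This is a sketch under assumptions: the dictionary layout, role names, and function name are invented here, and secure delivery of the result to the images is out of scope.

```python
from typing import Dict, List


def build_common_context(nodes: Dict[str, dict]) -> Dict[str, Dict[str, List[dict]]]:
    """Match each node's `requires` against roles provided in the cluster."""
    # Index: role name -> identity records of the nodes providing that role.
    providers: Dict[str, List[dict]] = {}
    for name, node in nodes.items():
        for role in node.get("provides", []):
            providers.setdefault(role, []).append(
                {"name": name, "ip": node["ip"],
                 "hostname": node["hostname"], "pk": node["pk"]})
    context: Dict[str, Dict[str, List[dict]]] = {}
    for name, node in nodes.items():
        needed = {}
        for role in node.get("requires", []):
            if role not in providers:
                raise LookupError(
                    f"{name} requires {role!r}, which no node provides")
            needed[role] = providers[role]
        context[name] = needed
    return context
```

For example, a worker that requires an `nfs_server` role would receive the IP, hostname, and pk of the head node that provides it, which is the information the figure shows flowing through the common context.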
Where do VM images come from?

Appliance Management
  Short term solution: Marketplaces
    The Workspace Marketplace
      http://workspace.globus.org/vm/marketplace.html
      Providing described images for scientific community
    Appliance providers and marketplaces
  Long-term solution: Appliance Providers
    Automated image production, attestation and signing
    Automated management
  Collaboration with configuration management communities and projects
    rPath company: the rBuilder project (DOE SBIR)
    Bcfg2, adopted on many ANL resources
    OSFarm @ CERN OpenLab, serving the scientific community
    Appliance providers
Workspace Ecosystem
  Appliance Providers: OSFarm, rPath, CohesiveFT, bcfg2, etc.
    marketplaces of all kinds
    Virtual Organizations: configuration, attestation, maintenance
  Resource Providers: local clusters, Grid resource providers (TeraGrid, OSG), commercial providers (EC2, Sun, slicehost)
    Provisioning a resource, not a platform
  Middleware: appliances --> resources
    manage appliance deployment
    Combining networks and storage
    VWS, EC2, In-Vigo
Parting Thoughts
  VMs are the raw materials from which a working system can be built
    But we still have to build it!
    Technical challenges: taking one step at a time
    Social/procedural challenges
  Division of labor
    Resource providers
    Appliance providers
    Can we build trust between these two groups?
  If you think we can help you out, give us a call:
    http://workspace.globus.org
Acknowledgements
  Workspace team:
    Kate Keahey
    Tim Freeman
    Borja Sotomayor
  Funding
    NSF SDCI “Missing Links”
    NSF CSR “Virtual Playgrounds”
    DOE CEDPS Project
  With thanks to many collaborators:
    Jerome Lauret (STAR, BNL), Doug Olson (STAR, LBNL), Marty Wesley (rPath), Stu Gott (rPath), Ken Van Dine (rPath), Predrag Buncic (Alice, CERN), Haavard Bjerke (CERN), Rick Bradshaw (Bcfg2, ANL), Narayan Desai (Bcfg2, ANL), Duncan Penfold-Brown (Atlas, uvic), Ian Gable (Atlas, uvic), David Grundy (Atlas, uvic), Ti Leggit (University of Chicago), Greg Cross (University of Chicago), Mike Papka (University of Chicago/ANL)