
PlanetLab: a worldwide testbed for
distributed computing
David Culler
UC Berkeley
•with Larry Peterson, Tom Anderson, Mic Bowman, Timothy Roscoe, Brent Chun,
•Frans Kaashoek, Mike Wawrzoniak, ....
www.planet-lab.org
PlanetLab is …
http://www.planet-lab.org
• A novel world-wide testbed
• 205 machines at 85 sites in 19 countries
– Towards thousands
– Universities, Internet 2, co-location centers
– 230 research projects
10/1/2003
PlanetLab - DISC
... is many, many vantage points
on the internet
• The Internet in the middle
• Close to you wherever you are
• A truly global network perspective
• A place to bring dist. algorithms to reality
Where did it come from?
• Sense of wonder
– what would be the next important thing to do in extreme
networked systems post cluster, post yahoo, post inktomi, post
akamai, post gnutella, post bubble?
• Sense of angst
– NRC: “looking over the fence at networks”
» ossified internet (intellectually, infrastructure, system)
» next internet likely to emerge as overlay on current one (again)
» it will be defined by its services, not its transport
• Sense of excitement
A new look at internet services
Planetary-Scale Services
• Services and applications spread over the web
– Proximity => low latency, high bandwidth, predictable, reliable
– Perspective => adapt to load, delays, failures, $ on a global scale
• Content-distribution Networks and Peer-to-Peer
sharing just the tip of the iceberg.
• Academic Community developing the architectural
building blocks to enable many kinds of distributed
services
– scalable translation,
– dist. storage,
– dist. events,
– instrumentation,
– management
Key Concept: Overlay networks
Overlay network routing
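The slide's point — an overlay can route around the Internet "in the middle" — can be sketched as a one-hop detour decision in the style of RON. This is a toy illustration: the node names and latency values are made up.

```python
# Toy sketch of RON-style overlay routing: forward through another
# overlay node when the indirect path beats the direct Internet path.
# The latency matrix (milliseconds) is invented for illustration.

def best_route(latency, src, dst):
    """Return (path, cost): direct path vs. best one-hop detour."""
    best_path, best_cost = [src, dst], latency[src][dst]
    for mid in latency:
        if mid in (src, dst):
            continue
        cost = latency[src][mid] + latency[mid][dst]
        if cost < best_cost:
            best_path, best_cost = [src, mid, dst], cost
    return best_path, best_cost

latency = {
    "MIT":  {"MIT": 0,   "UCB": 100, "Rice": 40},
    "UCB":  {"MIT": 100, "UCB": 0,   "Rice": 45},
    "Rice": {"MIT": 40,  "UCB": 45,  "Rice": 0},
}
print(best_route(latency, "MIT", "UCB"))  # detour via Rice: 40 + 45 = 85 < 100
```

The same comparison, done continuously with live measurements, is what lets an overlay adapt to congestion and failures faster than BGP.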
Key missing element – hands-on experience
• Researchers had no vehicle to try out
their next n great ideas in this space
• Lots of simulations
• Lots of emulation on large clusters
– emulab, millennium, modelnet
• Lots of folks calling their 17 friends before the
next deadline
– RON testbed
• but not the surprises and frustrations of
experience at scale to drive innovation
March 02 “Underground Meeting”
see http://www.cs.berkeley.edu/~culler/planetlab
Washington: Tom Anderson, Steven Gribble, David Wetherall, Gaetano Borriello
MIT: Frans Kaashoek, Hari Balakrishnan, Robert Morris, David Andersen
Intel Research: David Culler, Timothy Roscoe, Sylvia Ratnasamy, Milan Milenkovic
Berkeley: Ion Stoica, Joe Hellerstein, Eric Brewer, John Kubiatowicz
Princeton: Larry Peterson, Randy Wang, Vivek Pai
Duke: Amin Vahdat, Jeff Chase
Rice: Peter Druschel
Utah: Jay Lepreau
CMU: Srini Seshan, Hui Zhang, Satya
UCSD: Stefan Savage
Columbia: Andrew Campbell
ICIR: Scott Shenker, Eddie Kohler
Guidelines (1)
• Thousand viewpoints on “the cloud” is what matters
– not the thousand servers
– not the routers, per se
– not the pipes
Guidelines (2)
• and you must have the vantage points of the crossroads
– primarily co-location centers
Guidelines (3)
• Each service needs an overlay covering many
points
– logically isolated
• Many concurrent services and applications
– must be able to slice nodes => VM per service
– service has a slice across large subset
• Must be able to run each service / app over long
period to build meaningful workload
– traffic capture/generator must be part of facility
• Consensus on “a node” more important than
“which node”
Guidelines (4)
Management, Management, Management
• Test-lab as a whole must be up a lot
– global remote administration and management
» mission control
– redundancy within
• Each service will require its own remote management
capability
• Testlab nodes cannot “bring down” their site
– generally not on main forwarding path
– proxy path
– must be able to extend overlay out to user nodes?
• Relationship to firewalls and proxies is key
Guidelines (5)
• Storage has to be a part of it
– edge nodes have significant capacity
• Needs a basic well-managed capability
– but growing to the seti@home model should be considered at
some stage
– may be essential for some services
Confluence of Technologies
• Cluster-based scalable distribution, remote execution, management,
monitoring tools
– UCB Millennium, OSCAR, ..., Utah Emulab, ...
• CDNS and P2Ps
– Gnutella, Kazaa, ...
• Proxies routine
• Virtual machines & Sandboxing
– VMWare, Janos, Denali, ..., web-host slices (EnSim)
• Overlay networks becoming ubiquitous
– xBone, RON, Detour, ..., Akamai, Digital Island, ...
• Service Composition Frameworks
– yahoo, ninja, .net, websphere, Eliza
• Established internet ‘crossroads’ – colos
• Web Services / Utility Computing
• Authentication infrastructure (grid)
• Packet processing (layer 7 switches, NATs, firewalls)
• Internet instrumentation
Outcome
• “Mirror of Dreams” project
• K.I.S.S.
– Building Blocks, not solutions
– no big standards, OGSA-like, meta-hyper-supercomputer
• Compromise
– A basic working testbed in the hand is much better than
“exactly my way” in the bush
• “just give me a bunch of (virtual) machines
spread around the planet,.. I’ll take it from there”
• small distr. arch team, builders, users
a novel system architecture
• Distributed means of acquiring a slice of virtual
machines spanning much of the planet
Tension of Dual Roles
• Research testbed
– run fixed-scope experiments
– large set of geographically distributed machines
– diverse & realistic network conditions
• Deployment platform for novel services
– run continuously
– develop a user community that provides realistic workload
[figure: design → deploy → measure cycle]
Growing up quick
• “Underground” meeting March 2002
• Intel seeds effort
– First 100 nodes
– Operational support
• First node up July 2002
• By SOSP (deadline March 2003) 25% of accepted
papers refer to PlanetLab
• Each following conference
has seen dramatic load
– OSDI
– NSDI
A Rich research agenda
• Content Dist. Networks
– CoDeeN, ESM, UltraPeer emulation, Gnutella mapping
• Global System Architecture
– Slices, management, distribution, ...
• Network measurement
– Scriptroute, PlanetProbe, I3, etc.
• Application-level multicast
– ESM, Scribe, TACT, etc.
• Overlay Networks
– RON, RON++, ESM, XBone, ABone, etc.
• Distributed Hash Tables
– Chord, Tapestry, Pastry, Bamboo, etc.
• Wide-area distributed storage
– Oceanstore, SFS, CFS, Palimpsest, IBP
• Resource allocation
– Sharp, Slices, XenoCorp, Automated contracts
• Distributed query processing
– PIER, IrisLog, Sophia, etc.
• Management and Monitoring
– Ganglia, InfoSpect, Scout Monitor, BGP Sensors, etc.
• Virtualization and Isolation
– Xen, Denali, VServers, SILK, Mgmt VMs, etc.
• Router Design implications
– NetBind, Scout, NewArch, Icarus, etc.
• Testbed Federation
– NetBed, RON, XenoServers
• Etc., etc., etc.
Architecture principles
• “Slices” as fundamental resource unit
– distributed set of (virtual machine) resources
– a service runs in a slice
– resources allocated / limited per-slice (proc, bw, namespace)
• Distributed Resource Control
– host controls node, service producer, service consumers
• Unbundled Management
– provided by basic services (in slices)
– instrumentation and monitoring a fundamental service
• Application-Centric Interfaces
– evolve from what people actually use
• Self-obsolescence
– everything we build should eventually be replaced by the community
– initial centralized services only bootstrap distributed ones
Slice-ability
• Each service runs in a slice of PlanetLab
– distributed set of resources (network of virtual machines)
– allows services to run continuously
• VM monitor on each node enforces slices
– limits fraction of node resources consumed
– limits portion of name spaces consumed
• Challenges
– global resource discovery
– allocation and management
– enforcing virtualization
– security
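The VM-monitor enforcement described above can be sketched as a per-slice admission check on one node. This is a hypothetical illustration, not PlanetLab's actual node-manager API; the slice name and caps are invented.

```python
# Hypothetical sketch of per-slice enforcement on one node: each
# slice has a cap on the fraction of node resources (CPU share) and
# name spaces (ports) it may consume; the node rejects requests that
# would exceed the slice's cap. Caps and slice names are illustrative.

class Node:
    def __init__(self, caps):
        self.caps = caps                       # slice -> {"cpu": %, "ports": n}
        self.used = {s: {"cpu": 0, "ports": 0} for s in caps}

    def request(self, slice_name, resource, amount):
        cap = self.caps[slice_name][resource]
        used = self.used[slice_name][resource]
        if used + amount > cap:
            return False                       # enforce the slice's limit
        self.used[slice_name][resource] = used + amount
        return True

node = Node({"codeen": {"cpu": 25, "ports": 4}})
assert node.request("codeen", "cpu", 20)       # within the 25% share
assert not node.request("codeen", "cpu", 10)   # would exceed it
```

The real challenge, as the slide notes, is doing this discovery and allocation globally, not just per node.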
Unbundled Management
• Partition management into orthogonal services
– resource discovery
– monitoring system health
– topology management
– manage user accounts and credentials
– software distribution and updates
• Approach
– management services run in their own slice
– allow competing alternatives
– engineer for innovation (define minimal interfaces)
Distributed Resource Control
• At least two interested parties
– service producers (researchers)
» decide how their services are deployed over available nodes
– service consumers (users)
» decide what services run on their nodes
• At least two contributing factors
– fair slice allocation policy
» both local and global components (see above)
– knowledge about node state
» freshest at the node itself
Application-Centric Interfaces
• Inherent problems
– stable platform versus research into platforms
– writing applications for temporary testbeds
– integrating testbeds with desktop machines
• Approach
– adopt popular API (Linux) and evolve implementation
– eventually separate isolation and application interfaces
– provide generic “shim” library for desktops
Research Thrusts
Open Content Distribution Networks
Codeen – Vivek Pai @ Princeton
Content Distribution Networks
• CoDeeN (Princeton)
– Largest open proxy
– First real service
– “The Dark Side of the Web: An Open Proxy’s View”
• Infranet/IRIS (MIT)
• ESM (CMU)
• UltraPeer emulation (UCB)
• Gnutella mapping (UWash)
• Gnutella measurement (UChicago)
• Delegate
– push out planetlab distribution via proxies
Application-Level Multicast
[figures: multicast overlays — Druschel (Rice); Srini (CMU)]
Application-level multicast
• End-System Multicast (CMU)
– Broadcast sigcomm over planetlab
• Scribe (Rice)
• TACT (Duke)
• Internet Backplane (UTK)
Distributed Hash Tables
• Chord (MIT, UCB) / Koorde (MIT)
• Tapestry / Bamboo (UCB)
• Pastry (Rice)
• Kademlia (NYU)
• + cast of thousands using them
Global Objects
[figure: name → address lookup]
• Global objects drawn from a large namespace
– 128 – 256 bits
– Table would have more rows than atoms in universe
• Objects could be anywhere on the planet
Distributed Hash Tables (DHT)
• Combine lookup and routing by doing a series of
– small lookup => next hop
– route
DHT Design
• Locate object from large namespace anywhere
within small factor (<2) of knowing its address
• Dozens of competing alternative based on
different mathematical structures
– CAN, Chord, Pastry, Tapestry, Plaxton, Viceroy, Kademlia,
Skipnet, Symphony, Koorde, Apocrypha, Land, ORDI …
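The core idea shared by these designs — hash a name into the large identifier space, then find the responsible node — can be sketched in the Chord style. The 8-bit ring and node IDs are illustrative only; as noted above, real deployments use 128–256-bit identifiers.

```python
import hashlib

# Toy Chord-style DHT sketch: names hash into a large identifier
# space, and each object lives on the first node whose ID is at or
# after the object's ID around the ring (its "successor").
# The 8-bit ring and these node IDs are illustrative assumptions.

RING = 2 ** 8            # real systems use 2**128 or 2**160

def ident(name):
    """Hash a name into the ring's identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

def successor(node_ids, key):
    """First node clockwise from key on the ring."""
    candidates = sorted(node_ids)
    for n in candidates:
        if n >= key:
            return n
    return candidates[0]  # wrap around past the top of the ring

nodes = [10, 70, 150, 230]
key = ident("some-object")
print(key, "->", successor(nodes, key))
```

A real DHT does not sort all node IDs at one place; each node keeps a small routing table and resolves the successor in O(log n) hops, which is what gives the "within a small factor of knowing its address" property.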
[figure: DHT routing structures over a 3-bit identifier space]
Empirical Comparison (Rhea, Roscoe,Kubi)
• 79 PlanetLab nodes, 400 ids per node
• Performed by the Tapestry side
Redefined set of critical issues
• Not dilation once converged, but behavior under churn
• Convergence of approaches
• Convergence of interfaces
Dimensions
• Topology
– Maintain robust set of options at each step
• Routing: iterative vs recursive
• Link monitoring
• Recovery
– Proactive vs periodic or damped
• What’s located
• API
Ossified or fragile?
• One group forgot to turn off an experiment
– after 2 weeks of a router being pinged every 2 seconds, the ISP contacted ISI and threatened to shut them down.
• One group failed to initialize destination address
and ports (and had many virtual nodes on each
of many physical nodes)
– worked OK when tested on a LAN
– trashed flow-caches in routers
– probably generated a lot of unreachable-destination traffic
– triggered port-scan alarms at ISPs (port 0)
– n^2 probe packets trigger other alarms
Distributed Storage
• Phase 0 provides basic copy scripts
– community calls for global nfs / afs !!!
• Internet Backplane Protocol (Tenn)
– basic transport and storage of variable sized blocks (in depots)
– intermittently available, untrusted, bounded duration
– do E2E redundancy, encryption, permanence
• Cooperative File System (MIT, UCB)
– FS over DHASH (replicated blocks) over Chord
» PAST distributes whole files over Pastry
– distributed read-only file storage
• Palimpsest (IRB/Cambridge)
– Unmanaged, permanence through replication
• Ocean store (UCB)
– versioned updates of private, durable storage over untrusted servers
OceanStore (Kubiatowicz)
RAID distributed over the whole Internet
Dipping in to OceanStore Prototype
• Routine studies on thousands of virtual nodes across a hundred PlanetLab sites
• Efficiency of dissemination tree
– more replicas allows more of the bytes to move across fast
links
Network measurement
• ScriptRoute (UWashington)
• PlanetProbe (Cambridge)
• I3 (UC Berkeley)
• Network Weather Service (UTK)
• BGP multiviews (Princeton)
• Ping (everyone)
Watching the internet in the middle
[figures: scp of 4 MB to MIT, Rice, CIT, confirming Padhye SIGCOMM ’98 (83 machines, 11/1/02, Sean Rhea — basis for the DHT comparison); synthetic coordinates on 143 RON + PlanetLab nodes (c/o Frans Kaashoek); i3 weather service on 110 machines (c/o Ion Stoica)]
Towards an instrumentation service
• Critical underlying issue
– All the design techniques are evaluated relative to the raw internet
– Sophisticated services observe and adapt to the internet
• every overlay, DHT, and multicast is measuring the internet
in the middle
• they do it in different ways
• they do different things with the data
• Can this be abstracted into a customizable instrumentation
service?
– Share common underlying measurements
– Reduce ping, scp load
– Grow down into the infrastructure
Internet Measurement
Representative Sample of the Internet?
Distributed Query Processing at internet scale
• PIER (UCB, ICIR, IRB)
– Dataflow adaptive query processing over DHT
• Sophia (Princeton)
– Distributed prolog
• IrisNet/IrisLog (Intel Pittsburgh/CMU)
– Xpath (xml query) over DNS
• Distributed Search (MIT)
• TAG (Intel Berkeley)
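The in-network style these systems share (e.g. TAG's aggregation trees) can be sketched as partial aggregates combined up a tree, so only small summaries cross the network instead of raw tuples. The tree shape and readings below are invented for illustration.

```python
# Sketch of TAG-style in-network aggregation: each node computes a
# partial aggregate (count, sum) over its local data, and parents
# merge their children's partials on the way up the tree, so only
# constant-size summaries traverse the network.
# The tree and local readings are illustrative assumptions.

def aggregate(node, children, local):
    """Combine a node's local (count, sum) with its subtree's."""
    count, total = local[node]
    for child in children.get(node, []):
        c, t = aggregate(child, children, local)
        count, total = count + c, total + t
    return count, total

children = {"root": ["a", "b"], "a": ["c"]}
local = {"root": (1, 10), "a": (1, 20), "b": (1, 30), "c": (1, 40)}
count, total = aggregate("root", children, local)
print("avg =", total / count)   # (10 + 20 + 30 + 40) / 4 = 25.0
```

Decomposing an aggregate into mergeable partials is what makes AVG, COUNT, and MAX cheap at internet scale; holistic aggregates like MEDIAN do not decompose this way.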
[figure: PIER architecture — declarative queries flow through a query optimizer and catalog manager into a core relational execution engine, over a DHT wrapper and storage manager; the DHT’s overlay routing runs over the physical IP network; applications include network monitoring and other user apps]
Does This Work for Real?
[figure: scale-up performance, 1 MB source data per node — real network vs. simulation]
the Gaetano advice
• for this to be successful, it will need the support
of network and system administrators at all the
sites...
• it would be good to start by building tools that
made their job easier
ScriptRoute (Spring, Wetherall, Anderson)
• Traceroute provides a way to measure from you
out
• 100s of traceroute servers have appeared to help
debug connectivity problems
– very limited functionality
• => provide a simple instrumentation sandbox at many sites in the internet
– TTL, MTU, BW, congestion, reordering
– safe interpreter + network guardian to limit impact
» individual and aggregate limits
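The "safe interpreter + network guardian" with individual and aggregate limits can be sketched as two layers of token buckets: every packet a script sends must pass both its own budget and the node-wide one. The rates, bursts, and script name are illustrative assumptions, not ScriptRoute's actual parameters.

```python
import time

# Sketch of ScriptRoute's network-guardian idea: measurement scripts
# run sandboxed, and each probe must pass a per-script rate limit AND
# an aggregate (whole-node) rate limit. Limit values are illustrative.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst          # tokens/sec, max tokens
        self.tokens, self.last = burst, time.monotonic()

    def allow(self, n=1):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

class Guardian:
    def __init__(self):
        self.aggregate = TokenBucket(rate=100, burst=100)   # node-wide cap
        self.per_script = {}

    def may_send(self, script):
        bucket = self.per_script.setdefault(script, TokenBucket(10, 10))
        return bucket.allow() and self.aggregate.allow()

g = Guardian()
sent = sum(g.may_send("probe.sr") for _ in range(50))
print(sent)   # capped near the per-script burst of 10
```

The two-layer check is the point: no single script can flood, and no coalition of scripts can exceed what the node as a whole is allowed to emit.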
Example: reverse trace
[figure: reverse trace between UW and Google]
• underlying debate: open, unauthenticated,
community measurement infrastructure vs
closed, engineered service
• see also Princeton BGP multilateration
Ossified or brittle?
• Scriptroute set off several alarms
• Low bandwidth traffic to lots of ip addresses
brought routers to a crawl
• Lots of small TTLs but not exactly Traceroute
packets...
• An ISP installed a filter blocking a subnet at Harvard and sent notice to the network administrator without human intervention
– Is innovation still allowed?
NetBait Serendipity
• Brent Chun built a simple http server on port 80
to explain what planetlab was about and to direct
inquiries to planet-lab.org
• It also logged requests
• Sitting just outside the firewall of ~40
universities...
• the world’s largest honey pot
• the number of worm probes from compromised machines was shocking
• imagine the epidemiology
• see netbait.planet-lab.org
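Counting worm probes the way NetBait did can be sketched as signature matching over HTTP request logs. The log lines are invented, but the `default.ida` and `cmd.exe` patterns are the well-known Code Red and Nimda probe fingerprints.

```python
# Sketch of NetBait-style worm fingerprinting: scan HTTP request logs
# for the URL patterns Code Red and Nimda probes are known to use,
# and tally probes per signature. The log lines are illustrative.

SIGNATURES = {
    "Code Red": "default.ida",   # Code Red hits /default.ida?NNNN...
    "Nimda": "cmd.exe",          # Nimda probes for ...\winnt\system32\cmd.exe
}

def classify(log_lines):
    counts = {name: 0 for name in SIGNATURES}
    for line in log_lines:
        for name, pattern in SIGNATURES.items():
            if pattern in line:
                counts[name] += 1
    return counts

log = [
    "GET /default.ida?NNNNN HTTP/1.0",
    "GET /scripts/..%255c../winnt/system32/cmd.exe?/c+dir HTTP/1.0",
    "GET /index.html HTTP/1.0",
]
print(classify(log))   # {'Code Red': 1, 'Nimda': 1}
```

Bucketing these counts by day, across ~40 vantage points, is exactly the per-signature time series the next slides plot.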
One example
[figure: worm probes per day at one node, 1/5/2003 – 3/16/2003, peaking near 250/day — Code Red vs. Nimda]
• The monthly code-red cycle in the large?
• What happened in March?
No, not Iraq
[figure: worm probes per day, 3/1/2003 – 3/20/2003, peaking near 1400/day — Code Red, Nimda, and Code Red II.F]
• A new voracious worm appeared and displaced the older Code Red
Netbait view of March
But where is the real action
• Management, Management, Management
• Truly distributed resource allocation and
management
– Perhaps the first truly meaningful computational economy
8. Management and Monitoring
• Ganglia (UCB, Intel Berkeley)
• InfoSpect (Intel Berkeley)
• Scout Monitor (Intel SSL, Princeton)
• BGP Sensors (Princeton)
• Service Utilities (Duke)
• PlanetLab NMS (IRB/UWash)
...is an incubator for the next
generation of the internet
Underlay: the new thin waist?
routing, topology services
sink down into the internet
“the next internet will be created as an overlay on the current one”
What is Planet-Lab about?
• Create the open infrastructure for invention of the next
generation of wide-area (“planetary scale”) services
– post-cluster, post-yahoo, post-CDN, post-P2P, ...
• Potentially, the foundation on which the next Internet
can emerge
– think beyond TCP/UDP/IP + DNS + BGP + OSPF... as to what the net
provides
– building-blocks upon which services and applications will be based
– “the next internet will be created as an overlay in the current one” (NRC)
• A different kind of network testbed
– not a collection of pipes and giga-pops
– not a distributed supercomputer
– geographically distributed network services
– alternative network architectures and protocols
• Focus and Mobilize the Network / Systems Research
Community to define the emerging internet
www.planet-lab.org
Current Institutions (partial)
Academia Sinica, Taiwan
Boston University
Caltech
Carnegie Mellon University
Chinese Univ of Hong Kong
Columbia University
Cornell University
Datalogisk Institut Copenhagen
Duke University
Georgia Tech
Harvard University
HP Labs
Intel Research
Johns Hopkins
Lancaster University
Lawrence Berkeley Laboratory
MIT
Michigan State University
National Tsing Hua Univ.
New York University
Northwestern University
Princeton University
Purdue University
Rensselaer Polytechnic Inst.
Rice University
Rutgers University
Stanford University
Technische Universitat Berlin
The Hebrew Univ of Jerusalem
University College London
University of Arizona
University of Basel
University of Bologna
University of British Columbia
UC Berkeley
UCLA
UC San Diego
UC Santa Barbara
University of Cambridge
University of Canterbury
University of Chicago
University of Illinois
University of Kansas
University of Kentucky
University of Maryland
University of Massachusetts
University of Michigan
University of North Carolina
University of Pennsylvania
University of Rochester
USC / ISI
University of Technology Sydney
University of Tennessee
University of Texas
University of Toronto
University of Utah
University of Virginia
University of Washington
University of Wisconsin
Uppsala University, Sweden
Washington University in St Louis
Wayne State University
Join the fun ... www.planet-lab.org
• It is just beginning
– towards a representative sample of the internet
• Working Groups
– Virtualization
– Common API for DHTs
– Dynamic Slice Creation
– System Monitoring
– Applications
– Software Distribution Tools
• Building the consortium
• Hands-on experience with wide-area services at scale is
mothering tremendous innovation
– nothing “just works” in the wide-area at scale
• Rich set of research challenges ahead
– reach for applications (legal please)
• Pick up the bootCD, ... throw in your nodes
Thanks
… is a novel academic / industry
collaboration
• Inspired by University research
• Seeded by Intel
• Architected and led by academic community
• Developed and maintained by combined effort
• Growing an industrial consortium
– Hosted at Princeton with UCB and UWash
5. Resource Allocation
• SHARP (Duke, Intel Berkeley)
• Slices (Intel Berkeley, Princeton)
• Automated Contracts (UCB)
• DSlice (Intel Berkeley, Princeton)
• XenoCorp (Cambridge)
10. Virtualization and Isolation
• Xen (Cambridge)
• Denali (UWash)
• Vservers (Intel Berkeley)
• Mgmt VMs (Intel SSL)
• SILK/Scout (Princeton)
• DSlice (Intel Berkeley)
11. Router Design
• NetBind (Columbia)
• Scout/SILK (Princeton)
• NewArch (MIT)
• Icarus (UWash)
• CapabilityIP (UWash/Intel Berkeley)
12. Testbed Federation
• PlanetLab (IRB, Princeton, UW)
• Emulab/Netbed (Utah)
• RON (MIT)
• Grid (you know)
• Xenoservers (Cambridge)