PlanetLab: a worldwide testbed for distributed computing
David Culler, UC Berkeley
• with Larry Peterson, Tom Anderson, Mic Bowman, Timothy Roscoe, Brent Chun, Frans Kaashoek, Mike Wawrzoniak, ...
www.planet-lab.org
PlanetLab is … (http://www.planet-lab.org)
• A novel world-wide testbed
• 205 machines at 85 sites in 19 countries
  – towards thousands
  – universities, Internet2, co-location centers
  – 230 research projects

10/1/2003 PlanetLab - DISC

… is many, many vantage points on the internet
• The Internet in the middle
• Close to you wherever you are
• A truly global network perspective
• A place to bring distributed algorithms to reality

Where did it come from?
• Sense of wonder
  – what would be the next important thing to do in extreme networked systems: post-cluster, post-yahoo, post-inktomi, post-akamai, post-gnutella, post-bubble?
• Sense of angst
  – NRC: "Looking Over the Fence at Networks"
    » ossified internet (intellectually, infrastructure, system)
    » next internet likely to emerge as overlay on current one (again)
    » it will be defined by its services, not its transport
• Sense of excitement

A new look at internet services

Planetary-Scale Services
• Services and applications spread over the web
  – proximity => low latency, high bandwidth, predictable, reliable
  – perspective => adapt to load, delays, failures, $ on a global scale
• Content-distribution networks and peer-to-peer sharing are just the tip of the iceberg
• Academic community developing the architectural building blocks to enable many kinds of distributed services
  – scalable translation, dist. storage, dist. events, instrumentation, management

Key Concept: Overlay Networks

Overlay network routing

key missing element – hands-on experience
• Researchers had no vehicle to try out their next n great ideas in this space
• Lots of simulations
• Lots of emulation on large clusters
  – emulab, millennium, modelnet
• Lots of folks calling their 17 friends before the next deadline
  – RON testbed
• but not the surprises and frustrations of experience at scale to drive innovation

March 02 "Underground Meeting"
(see http://www.cs.berkeley.edu/~culler/planetlab)
• Washington: Tom Anderson, Steven Gribble, David Wetherall, Gaetano Borriello
• MIT: Frans Kaashoek, Hari Balakrishnan, Robert Morris
• Intel Research: David Culler, Timothy Roscoe, Sylvia Ratnasamy, Milan Milenkovic, Satya (CMU)
• Duke: Amin Vahdat, David Anderson, Jeff Chase
• Berkeley: Ion Stoica, Joe Hellerstein, Eric Brewer, Kubi
• Princeton: Larry Peterson, Randy Wang, Vivek Pai
• Rice: Peter Druschel
• Utah: Jay Lepreau
• CMU: Srini Seshan, Hui Zhang
• UCSD: Stefan Savage
• Columbia: Andrew Campbell
• ICIR: Scott Shenker, Eddie Kohler

Guidelines (1)
• Thousand viewpoints on "the cloud" is what matters
  – not the thousand servers
  – not the routers, per se
  – not the pipes

Guidelines (2)
• and you must have the vantage points of the crossroads
  – primarily co-location centers

Guidelines (3)
• Each service needs an overlay covering many points
  – logically isolated
• Many concurrent services and applications
  – must be able to slice nodes => VM per service
  – service has a slice across large subset
• Must be able to run each service / app over long period to build meaningful workload
  – traffic capture/generator must be part of facility
• Consensus on "a node" more important than "which node"

Guidelines (4)
Management, Management, Management
• Test-lab as a whole must be up a lot
  – global remote administration and management
    » mission control
  – redundancy within
• Each service will require its own remote management capability
• Testlab nodes cannot "bring down" their site
  – generally not on main forwarding path
  – proxy path
  – must be able to extend overlay out to user nodes?
• Relationship to firewalls and proxies is key

Guidelines (5)
• Storage has to be a part of it
  – edge nodes have significant capacity
• Needs a basic well-managed capability
  – but growing to the seti@home model should be considered at some stage
  – may be essential for some services

Confluence of Technologies
• Cluster-based scalable distribution, remote execution, management, monitoring tools
  – UCB Millennium, OSCAR, ..., Utah Emulab, ...
• CDNs and P2Ps
  – Gnutella, Kazaa, ...
• Proxies routine
• Virtual machines & sandboxing
  – VMWare, Janos, Denali, ..., web-host slices (EnSim)
• Overlay networks becoming ubiquitous
  – xBone, RON, Detour, ..., Akamai, Digital Island, ...
• Service composition frameworks
  – yahoo, ninja, .net, websphere, Eliza
• Established internet 'crossroads' – colos
• Web services / utility computing
• Authentication infrastructure (grid)
• Packet processing (layer 7 switches, NATs, firewalls)
• Internet instrumentation

Outcome
• "Mirror of Dreams" project
• K.I.S.S.
  – building blocks, not solutions
  – no big standards, nothing OGSA-like, no meta-hyper-supercomputer
• Compromise
  – a basic working testbed in the hand is much better than "exactly my way" in the bush
• "just give me a bunch of (virtual) machines spread around the planet ... I'll take it from there"
• small distributed architecture team: builders and users

a novel system architecture
• Distributed means of acquiring a slice of virtual machines spanning much of the planet

Tension of Dual Roles
• Research testbed
  – run fixed-scope experiments
  – large set of geographically distributed machines
  – diverse & realistic network conditions
• Deployment platform for novel services
  – run continuously
  – develop a user community that provides realistic workload
• design, deploy, measure

Growing up quick
• "Underground" meeting March 2002
• Intel seeds effort
  – first 100 nodes
  – operational support
• First node up July 2002
• By SOSP (deadline March 2003) 25% of accepted papers refer to PlanetLab
• Each following conference has seen dramatic load
  – OSDI
  – NSDI

A Rich Research Agenda
• Global system architecture
  – slices, management, distribution
• Content distribution networks
  – CoDeeN, ESM, UltraPeer emulation, Gnutella mapping
• Network measurement
  – Scriptroute, PlanetProbe, I3, etc.
• Application-level multicast
  – ESM, Scribe, TACT, etc.
• Overlay networks
  – RON, RON++, ESM, XBone, ABone, etc.
• Distributed hash tables
  – Chord, Tapestry, Pastry, Bamboo, etc.
• Wide-area distributed storage
  – Oceanstore, SFS, CFS, Palimpsest, IBP
• Resource allocation
  – SHARP, Slices, XenoCorp, Automated Contracts
• Distributed query processing
  – PIER, IrisLog, Sophia, etc.
• Management and monitoring
  – Ganglia, InfoSpect, Scout Monitor, BGP Sensors, etc.
• Virtualization and isolation
  – Xen, Denali, VServers, SILK, Mgmt VMs, etc.
• Router design implications
  – NetBind, Scout, NewArch, Icarus, etc.
• Testbed federation
  – NetBed, RON, XenoServers
• Etc., etc., etc.
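The core abstraction named above, acquiring a slice of virtual machines spanning many nodes, can be sketched in miniature. This is an illustrative model only; `Slice` and `create_slice` are invented names, not the actual PlanetLab interface:

```python
# Hypothetical sketch of slice acquisition: pick a subset of nodes and
# record per-node resource caps. Names and defaults are illustrative,
# not PlanetLab's real API.

class Slice:
    def __init__(self, name, nodes, cpu_share, disk_mb):
        self.name = name
        self.nodes = list(nodes)    # one virtual machine per node
        self.cpu_share = cpu_share  # fraction of each node's CPU
        self.disk_mb = disk_mb      # per-node disk budget

def create_slice(name, available_nodes, want, cpu_share=0.1, disk_mb=512):
    """Allocate a slice over the first `want` available nodes."""
    if want > len(available_nodes):
        raise ValueError("not enough nodes for requested slice")
    return Slice(name, available_nodes[:want], cpu_share, disk_mb)

nodes = ["planetlab1.cs.princeton.edu",
         "planetlab2.millennium.berkeley.edu",
         "planetlab1.lcs.mit.edu"]
s = create_slice("my_service", nodes, want=2)
```

A real system must also discover nodes, enforce the caps at each node's VM monitor, and expire slices; the sketch captures only the shape of the abstraction.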
Architecture Principles
• "Slices" as fundamental resource unit
  – distributed set of (virtual machine) resources
  – a service runs in a slice
  – resources allocated / limited per-slice (proc, bw, namespace)
• Distributed resource control
  – host controls node; service producers, service consumers
• Unbundled management
  – provided by basic services (in slices)
  – instrumentation and monitoring a fundamental service
• Application-centric interfaces
  – evolve from what people actually use
• Self-obsolescence
  – everything we build should eventually be replaced by the community
  – initial centralized services only bootstrap distributed ones

Slice-ability
• Each service runs in a slice of PlanetLab
  – distributed set of resources (network of virtual machines)
  – allows services to run continuously
• VM monitor on each node enforces slices
  – limits fraction of node resources consumed
  – limits portion of name spaces consumed
• Challenges
  – global resource discovery
  – allocation and management
  – enforcing virtualization
  – security

Unbundled Management
• Partition management into orthogonal services
  – resource discovery
  – monitoring system health
  – topology management
  – manage user accounts and credentials
  – software distribution and updates
• Approach
  – management services run in their own slice
  – allow competing alternatives
  – engineer for innovation (define minimal interfaces)

Distributed Resource Control
• At least two interested parties
  – service producers (researchers)
    » decide how their services are deployed over available nodes
  – service consumers (users)
    » decide what services run on their nodes
• At least two contributing factors
  – fair slice allocation policy
    » both local and global components (see above)
  – knowledge about node state
    » freshest at the node itself

Application-Centric Interfaces
• Inherent problems
  – stable platform versus research into platforms
  – writing applications for temporary testbeds
  – integrating testbeds with desktop machines
• Approach
  – adopt popular API (Linux) and evolve implementation
  – eventually separate isolation and application interfaces
  – provide generic "shim" library for desktops

Research Thrusts

Open Content Distribution Networks
• CoDeeN – Vivek Pai @ Princeton

Content Distribution Networks
• CoDeeN (Princeton)
  – largest open proxy
  – first real service
  – "The Dark Side of the Web: An Open Proxy's View"
• Infranet/IRIS (MIT)
• ESM (CMU)
• UltraPeer emulation (UCB)
• Gnutella mapping (UWash)
• Gnutella measurement (UChicago)
• Delegate – push out planetlab distribution via proxies

Application-Level Multicast
(Druschel, Rice; Srini, CMU)
• End-System Multicast (CMU)
  – broadcast SIGCOMM over PlanetLab
• Scribe (Rice)
• TACT (Duke)
• Internet Backplane (UTK)

Distributed Hash Tables
• Chord (MIT, UCB) / Koorde (MIT)
• Tapestry / Bamboo (UCB)
• Pastry (Rice)
• Kademlia (NYU)
• + cast of thousands using them

Global Objects: name => address lookup
• Global objects drawn from a large namespace
  – 128 – 256 bits
  – table would have more rows than atoms in the universe
• Objects could be anywhere on the planet

Distributed Hash Tables (DHT)
• Combine lookup and routing by doing a series of
  – small lookup => next hop
  – route: name => address

DHT Design
• Locate object from large namespace anywhere within small factor (<2) of knowing its address
• Dozens of competing alternatives based on different mathematical structures
  – CAN, Chord, Pastry, Tapestry, Plaxton, Viceroy, Kademlia, Skipnet, Symphony, Koorde, Apocrypha, Land, ORDI …
[figure: example identifier space with prefix routing over 3-bit ids, hops h=1 and h=2]
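The lookup-plus-routing idea behind these DHTs can be illustrated with a minimal consistent-hashing ring. This sketch omits the per-node routing tables (fingers, prefix tables) that give real designs their O(log n) hop counts; it simply hashes names into a 128-bit space and finds each object's successor node on a sorted ring:

```python
import hashlib
from bisect import bisect_right

# Minimal consistent-hashing sketch of the DHT idea: object names hash
# into a huge identifier space, and the node whose id most closely
# follows the object's id on the ring is responsible for it. Real DHTs
# (Chord, Pastry, Tapestry, ...) replace the global sorted list with
# small per-node routing tables.

def node_id(name):
    # 128-bit identifier drawn from the hash of the name
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.ids = sorted(node_id(n) for n in nodes)
        self.by_id = {node_id(n): n for n in nodes}

    def lookup(self, key):
        """Return the node responsible for key: its successor on the ring."""
        k = node_id(key)
        i = bisect_right(self.ids, k) % len(self.ids)  # wrap past the top
        return self.by_id[self.ids[i]]

ring = Ring(["node-%d" % i for i in range(8)])
owner = ring.lookup("some-object")
```

Adding or removing a node moves only the keys adjacent to it on the ring, which is why this structure underlies most of the designs listed above.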
Empirical Comparison (Rhea, Roscoe, Kubi)
• 79 PlanetLab nodes, 400 ids per node
• performed by the Tapestry side

Redefined set of critical issues
• Not dilation once converged, but behavior under churn
• Convergence of approaches
• Convergence of interfaces

Dimensions
• Topology
  – maintain robust set of options at each step
• Routing: iterative vs recursive
• Link monitoring
• Recovery
  – proactive vs periodic or damped
• What's located
• API

Ossified or fragile?
• One group forgot to turn off an experiment
  – after 2 weeks of a router being pinged every 2 seconds, the ISP contacted ISI and threatened to shut them down
• One group failed to initialize destination address and ports (and had many virtual nodes on each of many physical nodes)
  – worked OK when tested on a LAN
  – trashed flow-caches in routers
  – probably generated a lot of unreachable-destination traffic
  – triggered port-scan alarms at ISPs (port 0)
  – n^2 probe packets trigger other alarms

Distributed Storage
• Phase 0 provides basic copy scripts
  – community calls for global nfs / afs !!!
• Internet Backplane Protocol (Tenn)
  – basic transport and storage of variable-sized blocks (in depots)
  – intermittently available, untrusted, bounded duration
  – do E2E redundancy, encryption, permanence
• Cooperative File System (MIT, UCB)
  – FS over DHash (replicated blocks) over Chord
    » PAST distributes whole files over Pastry
  – distributed read-only file storage
• Palimpsest (IRB/Cambridge)
  – unmanaged, permanence through replication
• OceanStore (UCB)
  – versioned updates of private, durable storage over untrusted servers

OceanStore (Kubiatowicz)
• RAID distributed over the whole Internet

Dipping in to the OceanStore Prototype
• Routine studies on thousands of virtual nodes across a hundred PlanetLab sites
• Efficiency of dissemination tree
  – more replicas allow more of the bytes to move across fast links

Network measurement
• ScriptRoute (UWashington)
• PlanetProbe (Cambridge)
• I3 (UC Berkeley)
• Network Weather Service (UTK)
• BGP multiviews (Princeton)
• Ping (everyone)

Watching the internet in the middle
[figures: scp of 4 MB to MIT, Rice, CIT confirming Padhye SIGCOMM98 (83 machines, 11/1/02, Sean Rhea, basis for DHT comparison); synthetic coordinates over 143 RON+PlanetLab machines, c/o Frans Kaashoek; i3 weather service over 110 machines, c/o Ion Stoica]

Towards an instrumentation service
• Critical underlying issue
  – all the design techniques are evaluated relative to the raw internet
  – sophisticated services observe and adapt to the internet
• every overlay, DHT, and multicast is measuring the internet in the middle
  – they do it in different ways
  – they do different things with the data
• Can this be abstracted into a customizable instrumentation service?
  – share common underlying measurements
  – reduce ping, scp load
  – grow down into the infrastructure

Internet Measurement

Representative Sample of the Internet?
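The "share common underlying measurements" idea can be sketched as a memoizing probe cache: services ask the instrumentation layer for a path latency, and a recent answer is reused instead of re-probing. Here `probe_fn` and the 30-second freshness window are illustrative assumptions, not any real service's interface:

```python
import time

# Sketch of a shared-measurement service: memoize recent probe results
# so concurrent overlays reuse one measurement instead of each pinging
# the same path themselves.

class MeasurementCache:
    def __init__(self, probe_fn, max_age=30.0):
        self.probe_fn = probe_fn   # function (src, dst) -> latency
        self.max_age = max_age     # seconds a result stays fresh
        self.cache = {}            # (src, dst) -> (timestamp, value)
        self.probes_sent = 0       # how many real probes we issued

    def latency(self, src, dst):
        now = time.time()
        hit = self.cache.get((src, dst))
        if hit is not None and now - hit[0] < self.max_age:
            return hit[1]                    # fresh enough: reuse it
        self.probes_sent += 1
        value = self.probe_fn(src, dst)      # actually measure
        self.cache[(src, dst)] = (now, value)
        return value

mc = MeasurementCache(lambda s, d: 42.0)     # stand-in probe function
a = mc.latency("A", "B")
b = mc.latency("A", "B")                     # second call hits the cache
```

Even this toy version shows the payoff the slide argues for: two consumers of the same path cost one probe, not two.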
Distributed Query Processing at internet scale
• PIER (UCB, ICIR, IRB)
  – dataflow adaptive query processing over DHTs
• Sophia (Princeton)
  – distributed prolog
• IrisNet/IrisLog (Intel Pittsburgh/CMU)
  – XPath (XML query) over DNS
• Distributed Search (MIT)
• TAG (Intel Berkeley)

[figure: PIER architecture – declarative queries compiled by a query optimizer (with catalog manager) into query plans run by a core relational execution engine over a DHT wrapper and storage manager, with overlay routing on the IP network; applications include network monitoring and other user apps. Does this work for real? scale-up performance, 1 MB source data/node, real network vs simulation]

the Gaetano advice
• for this to be successful, it will need the support of network and system administrators at all the sites...
• it would be good to start by building tools that make their job easier

ScriptRoute (Spring, Wetherall, Anderson)
• Traceroute provides a way to measure from you out
• 100s of traceroute servers have appeared to help debug connectivity problems
  – very limited functionality
• => provide a simple instrumentation sandbox at many sites in the internet
  – TTL, MTU, BW, congestion, reordering
  – safe interpreter + network guardian to limit impact
    » individual and aggregate limits

Example: reverse trace, UW <=> Google
• underlying debate: open, unauthenticated, community measurement infrastructure vs closed, engineered service
• see also Princeton BGP multilateration

Ossified or brittle?
• Scriptroute set off several alarms
• Low-bandwidth traffic to lots of IP addresses brought routers to a crawl
• Lots of small TTLs, but not exactly traceroute packets...
• An ISP installed a filter blocking a subnet at Harvard and sent notice to the network administrator without human intervention
  – Is innovation still allowed?
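The "individual and aggregate limits" that ScriptRoute's network guardian enforces can be sketched with token buckets: one bucket per measurement script plus one for the whole node, so a script's send rate is capped even when it misbehaves. Rates and burst sizes below are made-up illustrations, not ScriptRoute's actual parameters:

```python
# Sketch of a "network guardian": token buckets cap a measurement
# script's packet rate, with both per-script and aggregate budgets.
# All numbers here are illustrative.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst  # tokens/sec, max tokens
        self.tokens, self.last = burst, 0.0

    def allow(self, now, cost=1):
        # refill proportionally to elapsed time, capped at burst
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

class Guardian:
    def __init__(self):
        self.aggregate = TokenBucket(rate=100, burst=100)  # whole node
        self.per_script = {}

    def may_send(self, script, now):
        bucket = self.per_script.setdefault(
            script, TokenBucket(rate=10, burst=10))
        # both the script's own budget and the node-wide budget must allow it
        return bucket.allow(now) and self.aggregate.allow(now)

g = Guardian()
sent = sum(g.may_send("trace1", now=0.0) for _ in range(20))
# a 20-packet blast is cut off at the per-script burst of 10
```

The same structure gives the aggregate guarantee: even many well-behaved scripts together cannot exceed the node-wide bucket.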
NetBait Serendipity
• Brent Chun built a simple http server on port 80 to explain what PlanetLab was about and to direct inquiries to planet-lab.org
• It also logged requests
• Sitting just outside the firewall of ~40 universities...
• the world's largest honey pot
• the number of worm probes from compromised machines was shocking
• imagine the epidemiology
• see netbait.planet-lab.org

One example
[figure: probes per day, 1/5/2003 – 3/16/2003, Code Red and Nimda, peaking near 250/day]
• The monthly Code Red cycle in the large?
• What happened in March?

No, not Iraq
[figure: probes per day, 3/1/2003 – 3/20/2003, Code Red, Nimda, and Code Red II.F, peaking near 1400/day]
• A new voracious worm appeared and displaced the older Code Red

Netbait view of March

But where is the real action?
• Management, Management, Management
• Truly distributed resource allocation and management
  – perhaps the first truly meaningful computational economy

8. Management and Monitoring
• Ganglia (UCB, Intel Berkeley)
• InfoSpect (Intel Berkeley)
• Scout Monitor (Intel SSL, Princeton)
• BGP Sensors (Princeton)
• Service Utilities (Duke)
• PlanetLab NMS (IRB/UWash)

... is an incubator for the next generation of the internet

Underlay: the new thin waist?
• routing, topology services sink down into the internet
• "the next internet will be created as an overlay on the current one"

What is Planet-Lab about?
• Create the open infrastructure for invention of the next generation of wide-area ("planetary scale") services
  – post-cluster, post-yahoo, post-CDN, post-P2P, ...
• Potentially, the foundation on which the next Internet can emerge
  – think beyond TCP/UDP/IP + DNS + BGP + OSPF... as to what the net provides
  – building blocks upon which services and applications will be based
  – "the next internet will be created as an overlay on the current one" (NRC)
• A different kind of network testbed
  – not a collection of pipes and giga-pops
  – not a distributed supercomputer
  – geographically distributed network services
  – alternative network architectures and protocols
• Focus and mobilize the network / systems research community to define the emerging internet

www.planet-lab.org

Current Institutions (partial)
Academia Sinica, Taiwan; Boston University; Caltech; Carnegie Mellon University; Chinese Univ of Hong Kong; Columbia University; Cornell University; Datalogisk Institut Copenhagen; Duke University; Georgia Tech; Harvard University; HP Labs; Intel Research; Johns Hopkins; Lancaster University; Lawrence Berkeley Laboratory; MIT; Michigan State University; National Tsing Hua Univ.; New York University; Northwestern University; Princeton University; Purdue University; Rensselaer Polytechnic Inst.;
Rice University; Rutgers University; Stanford University; Technische Universität Berlin; The Hebrew Univ of Jerusalem; University College London; University of Arizona; University of Basel; University of Bologna; University of British Columbia; UC Berkeley; UCLA; UC San Diego; UC Santa Barbara; University of Cambridge; University of Canterbury; University of Chicago; University of Illinois; University of Kansas; University of Kentucky; University of Maryland; University of Massachusetts; University of Michigan; University of North Carolina; University of Pennsylvania; University of Rochester; USC / ISI; University of Technology Sydney; University of Tennessee; University of Texas; University of Toronto; University of Utah; University of Virginia; University of Washington; University of Wisconsin; Uppsala University, Sweden; Washington University in St Louis; Wayne State University

Join the fun ... www.planet-lab.org
• It is just beginning
  – towards a representative sample of the internet
• Working groups
  – Virtualization
  – Common API for DHTs
  – Dynamic Slice Creation
  – System Monitoring
  – Applications
  – Software Distribution Tools
• Building the consortium
• Hands-on experience with wide-area services at scale is mothering tremendous innovation
  – nothing "just works" in the wide-area at scale
• Rich set of research challenges ahead
  – reach for applications (legal, please)
• Pick up the bootCD ... throw in your nodes

Thanks

… is a novel academic / industry collaboration
• Inspired by university research
• Seeded by Intel
• Architected and led by the academic community
• Developed and maintained by combined effort
• Growing an industrial consortium
  – hosted at Princeton with UCB and UWash

Where did it come from?

5. Resource Allocation
• SHARP (Duke, Intel Berkeley)
• Slices (Intel Berkeley, Princeton)
• Automated Contracts (UCB)
• DSlice (Intel Berkeley, Princeton)
• XenoCorp (Cambridge)

10. Virtualization and Isolation
• Xen (Cambridge)
• Denali (UWash)
• Vservers (Intel Berkeley)
• Mgmt VMs (Intel SSL)
• SILK/Scout (Princeton)
• DSlice (Intel Berkeley)

11. Router Design
• NetBind (Columbia)
• Scout/SILK (Princeton)
• NewArch (MIT)
• Icarus (UWash)
• CapabilityIP (UWash/Intel Berkeley)

12. Testbed Federation
• PlanetLab (IRB, Princeton, UW)
• Emulab/Netbed (Utah)
• RON (MIT)
• Grid (you know)
• Xenoservers (Cambridge)
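Ticket-style resource allocation of the kind SHARP explores, holding a claim on distributed resources and delegating a portion of it to someone else, can be sketched as follows. The `Claim`/`delegate` names and fields are invented for illustration and are not SHARP's actual interface:

```python
# Toy sketch of subdividing a resource claim, loosely inspired by
# ticket-based systems such as SHARP. A claim grants capacity on a set
# of nodes for a period; a holder may delegate part of what it holds,
# never more.

class Claim:
    def __init__(self, nodes, cpu_share, expires):
        self.nodes = frozenset(nodes)
        self.cpu_share = cpu_share  # fraction of CPU on each node
        self.expires = expires      # expiry time (abstract units)

    def delegate(self, nodes, cpu_share):
        """Carve out a sub-claim; it may not exceed what we hold."""
        if not set(nodes) <= self.nodes:
            raise ValueError("delegating nodes we do not hold")
        if cpu_share > self.cpu_share:
            raise ValueError("delegating more CPU than we hold")
        return Claim(nodes, cpu_share, self.expires)

root = Claim(["n1", "n2", "n3"], cpu_share=0.5, expires=100)
sub = root.delegate(["n1", "n2"], cpu_share=0.2)
```

Real systems add cryptographic signing so delegations can be verified, and accountability so over-subscription can be detected; the invariant shown here (a sub-claim never exceeds its parent) is the core of the idea.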