Transcript Slide 1
Coflow
A Networking Abstraction For Cluster Applications
Mosharaf Chowdhury, Ion Stoica
UC Berkeley

Slide 2
Cluster Applications
Multi-Stage Data Flows
  » Computation interleaved with communication
Computation
  » Distributed
  » Runs on many machines
Communication
  » Structured
  » Between machine groups
[Figure label: Driver]

Slide 3
Communication Abstraction
A Flow
  » Sequence of packets
  » Independent
  » Often the unit for network scheduling, traffic engineering, load balancing, etc.
Multiple Parallel Flows
  » Independent
  » Yet, semantically bound
  » Shared objective
[Figure labels: Driver; Minimize Completion Time]

Slide 4
Coflow
A collection of flows between two groups of machines that are bound together by application-specific semantics
Captures
  1. Structure
  2. Shared Objective
  3. Semantics

Slide 5
We Want To…
Better schedule the network
  » Intra-coflow
  » Inter-coflow
Write the communication layer of a new application
  » Without reinventing the wheel
Add unsupported coflows to an application, or
Replace an existing coflow implementation
  » Independent of applications

Slide 6
[Figure, layered from top to bottom: Cluster Applications; Coflow API; The Network (Physically or Logically Centralized Controller)]

Slide 7
Coflow API
Goals
  1. Separate intent from mechanisms
  2. Convey application-specific semantics to the network

Slide 8
Coflow API
  create(SHUFFLE) → handle
  put(handle, id, content)
  get(handle, id) → content
  terminate(handle)
[Figure: MapReduce job timeline; labels: Driver, Shuffle finishes, Job finishes]

Slide 9
Coflow Flexibility
Choice of algorithms
  » Default
  » WSS [1]
Choice of mechanism
  » App vs. Network layer
  » Pull vs. Push
[Figure labels: mappers, shuffle, reducers]
1. Orchestra, SIGCOMM’2011

Slide 10
Coflow Flexibility
@driver
  b ← create(BCAST)
  …
  put(b, id, content)
  …
  terminate(b)
@mapper
  get(b, id)
  …
[Figure labels: driver (JobTracker), broadcast, mappers, shuffle, reducers]

Slide 11
Coflow Flexibility
@driver
  b ← create(BCAST)
  s ← create(SHUFFLE, ord=[b ~> s])
  put(b, id, content)
  …
  terminate(b)
  terminate(s)
@mapper
  get(b, id)
  put(s, ids1)
  …
[Figure labels: driver (JobTracker), broadcast, mappers, shuffle, reducers]

Slide 12
Throughput-Sensitive Applications
Minimize Completion Time
[Figure label: After 2 seconds]

Slide 13
Throughput-Sensitive Applications
Minimize Completion Time
[Figure labels: After 2 seconds; After 4 seconds; After 7 seconds]

Slide 14
Throughput-Sensitive Applications
Minimize Completion Time
Free up resources without hurting application-perceived communication time
[Figure labels: After 2 seconds; After 7 seconds]

Slide 15
Latency-Sensitive Applications
[Figure: search query "HotNets 2012" handled by an aggregation tree; labels: Top-level Aggregator, Mid-level Aggregators, Workers]

Slide 16
Latency-Sensitive Applications
[Figure: search query "HotNets 2012" handled by an aggregation tree; labels: Top-level Aggregator, Mid-level Aggregators, Workers. "Meet Deadline [1,2]" is marked at two levels of the tree.]
Search results shown on the slide:
  » HotNets-XI: Home Page
    conferences.sigcomm.org/hotnets/2012/
    The Eleventh ACM Workshop on Hot Topics in Networks (HotNets-XI) will bring together people with interest in computer networks to engage in a lively debate ...
  » Workshop | acm sigcomm HotNets
    www.sigcomm.org/events/hotnets-workshop
    The Workshop on Hot Topics in Networks (HotNets) was created in 2002 to discuss early-stage, creative ... HotNets-XI, Seattle, WA area, October 29-30, 2012.
  » HotNets-XI: Call for Papers
    conferences.sigcomm.org/hotnets/2012/cfp.shtml
    The Eleventh ACM Workshop on Hot Topics in Networks (HotNets-XI) will bring together researchers in computer networks and systems to engage in a lively ...
  » Coflow accepted at HotNets'2012
    www.mosharaf.com/blog/2012/09/.../coflow-accepted-at-hotnets201...
    Sep 13, 2012 – Update: Coflow camera-ready is available online! Tell us what you think! Our position paper to address the lack of a networking abstraction for ...
Limit impact to as few requests as
1. D3, SIGCOMM’2011
2. PDQ, SIGCOMM’2012

Slide 17
One More Thing…
  1. Critical Path Scheduling
  2. OpenTCP
  3. Structured Streams
  4. …

Slide 18
Coflow
A semantically-bound collection of flows
Conveys application intent to the network
  » Allows better management of network resources
  » Provides greater flexibility in designing applications
Mosharaf Chowdhury
http://www.mosharaf.com/
UC Berkeley

Slide 19
Critical Path Scheduling
Communication of a cluster application is represented by a partially-ordered set of coflows
Network allocation takes place among these partially-ordered sets of coflows
[Figure: graph of coflows; node labels: S, A, A, S, S, B, S, S]

Slide 20
Coflow API
  Operation                              Returns    Caller
  create(PATTERN, [opt])                 handle     Driver
  put(handle, id, content, [opt])        result     Sender
  get(handle, id, [opt])                 content    Receiver
  terminate(handle, [opt])               result     Driver

Slide 21
Throughput-Sensitive Applications
Minimize Completion Time [1]
[Figure: MapReduce framework data flow; labels: Map Stage, Local shuffle finishes (twice), Reduce Stage, Shuffle finishes, Job finishes]
1. Orchestra, SIGCOMM’2011

Slide 22
Coflow Resource Allocation
1. Weights [Across Apps]
Weighted sharing between coflows
@driver
  shuffle1 ← create(SHUFFLE, weight=1)
  shuffle2 ← create(SHUFFLE, weight=2)
  …
[Figure labels: Job 1: mappers, shuffle1, reducers; Job 2: mappers, shuffle2, reducers]

Slide 23
Coflow Resource Allocation
2. Priorities [Across Apps]
Strict priorities
@driver
  shuffle1 ← create(SHUFFLE, pri=3)
  shuffle2 ← create(SHUFFLE, pri=5)
  …
[Figure labels: Job 1: mappers, shuffle1, reducers; Job 2: mappers, shuffle2, reducers]

Slide 24
Coflow Resource Allocation
3. Dependencies [Within Apps]
finishes_before (~>)
@driver
  b ← create(BCAST)
  shuffle2 ← create(SHUFFLE, ord=[b ~> shuffle2])
  agg ← create(AGGR, ord=[shuffle2 ~> agg])
[Figure labels: Job 1: mappers, shuffle1, reducers; Job 2: driver, broadcast (b), mappers, shuffle2, reducers, aggregation (agg)]

Slide 25
Coflow Resource Allocation
Communication of a cluster application is represented by a partially-ordered set of coflows
Network allocation takes place among these partially-ordered sets of coflows
[Figure: graph of coflows; node labels: S, A, A, S, S, B, S, S]
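
Editor's sketch (not part of the slides): the four operations in the Coflow API table (create, put, get, terminate) and the weight option from the resource allocation slides can be illustrated with a toy, in-memory Python class. Everything here is an assumption made for illustration: the class names ToyCoflowController and Coflow, the use of integer handles, the block_id parameter (standing in for the slides' id), the default weight of 1, and the weighted_shares helper. A real system would move data over the network and schedule flows; this sketch only shows how the calls fit together.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Coflow:
    """State kept for one coflow (hypothetical structure, not from the slides)."""
    pattern: str
    opts: dict
    data: dict = field(default_factory=dict)
    active: bool = True


class ToyCoflowController:
    """In-memory stand-in for a controller behind the coflow API (illustrative only)."""

    def __init__(self):
        self._coflows = {}
        self._next_handle = itertools.count(1)

    def create(self, pattern, **opts):
        """Driver side: register a coflow and return its handle."""
        handle = next(self._next_handle)
        self._coflows[handle] = Coflow(pattern, opts)
        return handle

    def put(self, handle, block_id, content):
        """Sender side: make `content` available under `block_id`."""
        self._active(handle).data[block_id] = content
        return True

    def get(self, handle, block_id):
        """Receiver side: fetch the content stored under `block_id`."""
        return self._active(handle).data[block_id]

    def terminate(self, handle):
        """Driver side: mark the coflow as finished."""
        self._active(handle).active = False
        return True

    def weighted_shares(self):
        """Share of capacity per active coflow, proportional to its `weight` option.

        Assumption: a coflow created without `weight` counts as weight 1.
        """
        weights = {h: c.opts.get("weight", 1)
                   for h, c in self._coflows.items() if c.active}
        total = sum(weights.values())
        return {h: w / total for h, w in weights.items()}

    def _active(self, handle):
        """Look up a coflow and reject calls on one that was terminated."""
        coflow = self._coflows[handle]
        if not coflow.active:
            raise ValueError(f"coflow {handle} has been terminated")
        return coflow


if __name__ == "__main__":
    ctrl = ToyCoflowController()
    shuffle1 = ctrl.create("SHUFFLE", weight=1)
    shuffle2 = ctrl.create("SHUFFLE", weight=2)
    ctrl.put(shuffle1, "block-0", b"map output")
    print(ctrl.get(shuffle1, "block-0"))
    print(ctrl.weighted_shares())
    ctrl.terminate(shuffle1)
    print(ctrl.weighted_shares())
```

In this sketch, terminating shuffle1 leaves shuffle2 as the only active coflow, so it receives the whole share. That is one simple reading of "weighted sharing between coflows", not a description of the authors' allocation mechanism.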