Lecture #1 - Wayne State University


Peer-to-Peer Systems

ECE7650 P2P Service 1-1

Outline

- What and Why P2P?
- Examples of P2P Applications
  - File Sharing
  - Voice over IP
  - Streaming
- Overview of P2P Architecture

What is P2P?

“P2P is a class of applications that take advantage of resources – storage, cycles, content, human presence – available at the edges of the Internet. Because accessing these decentralized resources means operating in an environment of unstable connectivity and unpredictable IP addresses, P2P nodes must operate outside the DNS system and have significant, or total, autonomy from central servers.”

– Clay Shirky (www.shirky.com)

Defining Characteristics

1. Significant autonomy from central servers
2. Exploits resources at the edges of the Internet
   - storage and content
   - CPU cycles
   - human presence
3. Resources at the edge have intermittent connectivity, being added & removed

Examples of P2P Usages

- File Sharing and Content Delivery: BitTorrent, eDonkey, Gnutella, etc.
- P2P Communication and application-level multicast
  - Voice over IP and Instant Messaging: Skype
  - Video Streaming: PPLive
- Distributed Processing: SETI@Home, PlanetLab, etc.
- Distributed databases
- Collaboration and distributed games
- Ad hoc networks

Current Status (as of 10/2010)

- The Cisco Visual Networking Index projects that P2P traffic, mainly file sharing, will double by 2014, growing to 7+ petabytes per month
- P2P's share of all Internet bandwidth has dropped from 75% five years ago to 39% at the end of 2009
- Roughly 46% of all traffic in 2014 will be attributed to Internet video

Client/Server Architecture

- Well known: the server is a data source, and clients request data from the server
- Very successful model: WWW (HTTP), FTP, Web services, etc.

[Figure: clients connected to a server through the Internet. From http://project-iris.net/talks/dht-toronto-03.ppt]

Client/Server Limitations

- Scalability is hard to achieve
- Presents a single point of failure
- Requires administration
- Unused resources at the network edge

P2P systems try to address these limitations.

P2P Overlay Network

- Provide and consume data
- Any node can initiate a connection
- No centralized data source
- Overlay graph
  - Virtual edge: a TCP connection, or simply a pointer to an IP address
- Overlay maintenance (see the sketch below)
  - Periodically ping to make sure the neighbor is still alive
  - Or verify liveness while messaging
  - If a neighbor goes down, may want to establish a new edge
  - A new node needs to bootstrap
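The following is a minimal sketch (not from the slides) of the ping-based overlay maintenance described above; the neighbor representation, timeouts, and the probe-by-TCP-connect stand-in for a ping are illustrative assumptions.

```python
import socket
import time

class OverlayNode:
    """Illustrative overlay peer that keeps its neighbor set alive."""

    def __init__(self, neighbors, ping_interval=30.0, timeout=2.0):
        self.neighbors = set(neighbors)      # set of (ip, port) tuples
        self.ping_interval = ping_interval   # seconds between liveness checks
        self.timeout = timeout               # how long to wait for a probe

    def is_alive(self, peer):
        """Probe a peer by opening a short-lived TCP connection (a stand-in for a ping)."""
        try:
            with socket.create_connection(peer, timeout=self.timeout):
                return True
        except OSError:
            return False

    def maintain(self, bootstrap_candidates):
        """Periodically ping neighbors; if one is down, try to establish a new edge."""
        while True:
            for neighbor in list(self.neighbors):
                if not self.is_alive(neighbor):
                    self.neighbors.discard(neighbor)          # drop the dead edge
                    for candidate in bootstrap_candidates:    # look for a replacement
                        if candidate not in self.neighbors and self.is_alive(candidate):
                            self.neighbors.add(candidate)
                            break
            time.sleep(self.ping_interval)
```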

Overlay networks

[Figure: an overlay network layered on top of the IP network; overlay edges are logical links between end hosts.]

Overlays: all in the application layer

- Tremendous design flexibility
  - Topology, maintenance
  - Message types
  - Protocol
  - Messaging over TCP or UDP
- Underlying physical net is transparent to the developer
  - But some overlays exploit proximity


Main Components of a P2P app

- A web portal of the application, aka login server
  - Checks the availability of services as well as peers
- Directory server, furnishing peer-availability info, aka tracker
- Peers, which join and leave at will, obey behavior rules and implement the business logic of the app
  - Discovery: find out if a particular service is available, its data (meta-data about the service), and the peers holding the actual data
  - Location: acquire location info about the tracker of a service and info about peers having the data; report its own location and the data it already possesses
  - Data transfer: push vs. pull approaches for data exchange; structured vs. unstructured interconnect network of peers


P2P Goals and Benefits

- Service availability: efficient use of resources
  - Unused bandwidth, storage, and processing power at the "edge of the network"
- Performance and scalability
  - No central information, communication, or computation bottleneck
  - A certain rate of population growth can be supported while maintaining a stable level of performance
- Reliability
  - Replicas, geographic distribution
  - No single point of failure
- Robustness to change (or dynamism)
- Ease of administration
  - Nodes self-organize
  - Built-in fault tolerance, replication, and load balancing
- Increased autonomy
- Trust: security, privacy (anonymity), incentives

Outline

- What and Why P2P?
- Examples of P2P Applications
  - File Sharing
  - Voice over IP
  - Streaming
- Overview of P2P Architecture

P2P file sharing

Example:
- Alice runs a P2P client application on her notebook computer
- Intermittently connects to the Internet; gets a new IP address for each connection
- Asks for "Hey Jude"
- The application displays other peers that have a copy of "Hey Jude"
- Alice chooses one of the peers, Bob
- The file is copied from Bob's PC to Alice's notebook via HTTP
- While Alice downloads, other users upload from Alice
- Alice's peer is both a Web client and a transient Web server

All peers are servers = highly scalable!

Key Issues in P2P File Sharing

- Search: the file-sharing system has to support a convenient and accurate file-search user interface.
- Peer selection: the file-sharing system has to support an efficient peer-selection mechanism so as to minimize download time.
- Connection: peers should be able to set up more or less stable data-transfer connections so that file data packets can be exchanged efficiently.
- Performance: the key performance metrics are download time and availability.

How Did it Start?

- A killer application: Napster
  - Free music over the Internet
- Key idea: share the storage and bandwidth of individual (home) users

Main Challenges

- Find where a particular file is stored
  - Note: problem similar to finding a particular page in web caching
- Nodes join and leave dynamically

[Figure: nodes A-F in an overlay; node A issues a query "E?" to locate file E.]

P2P file sharing Architectures

- Centralized Directory: a central directory keeps track of peer IPs and their shared content. Example: Napster and Instant Messaging
- Distributed Query Flooding: peers keep their own shared directory, and content is located in nearby peers. Example: the Gnutella protocol
- Distributed Heterogeneous Peers: proprietary protocol; group leaders with high bandwidth act as central directories searched by connected peers. Example: KaZaA

P2P: centralized directory

Original “Napster” design:
1) When a peer connects, it informs the central server of its IP address and content
2) Alice queries the centralized directory server for "Hey Jude"
3) Alice requests the file directly from Bob

[Figure: peers register with the centralized directory server (step 1); Alice queries it (step 2) and then downloads from Bob (step 3).]

Napster: Example

[Figure: the directory maps files A-F to machines m1-m6; a query for E returns m5, and the requester fetches E directly from m5.]

Napster: History

History:
- 5/99: Shawn Fanning (freshman, Northeastern U.) founds the Napster online music service
- 12/99: first lawsuit
- 3/00: 25% of UWisc traffic is Napster
- 2000: est. 60M users
- 2/01: US Circuit Court of Appeals rules Napster knew users were violating copyright laws
- 7/01: # simultaneous online users: Napster 160K, Gnutella 40K
- Now: trying to come back

http://www.napster.com

P2P: problems with centralized directory

- Single point of failure: if the central directory crashes, the whole application goes down
- Performance bottleneck: the central server maintains a large database
- Copyright infringement

File transfer is decentralized, but locating content is highly centralized.

Query flooding: Gnutella

- Fully distributed: no central server
- Public-domain protocol
- Many Gnutella clients implement the protocol
- Peers discover other peers through Gnutella hosts that maintain and cache lists of available peers; discovery is not part of the Gnutella protocol

Overlay network: a graph
- Edge between peers X and Y if there is a TCP connection between them
- All active peers and edges form the overlay network
- An edge is not a physical link but a logical link
- A given peer will typically be connected to fewer than 10 overlay neighbors

Gnutella: Example

Assume m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; ...

[Figure: the query "E?" floods from m1 through m2 and m3, then on to m4 and m5; m5, which holds E, answers.]

Gnutella: Peer joining or leaving

1. Joining peer X must find some other peer in the Gnutella network: use a list of candidate peers.
2. X sequentially attempts to make TCP connections with peers on the list until a connection is set up with some peer Y.
3. X sends a Ping message to Y; Y forwards the Ping message. The frequency of Ping messages is not part of the protocol, but they should be minimized.
4. All peers receiving the Ping message respond with a Pong message containing the number of files they share and their total size in kbytes.
5. X receives many Pong messages; it can then set up additional TCP connections.
6. When a peer leaves the network, other peers sequentially try to connect to other peers.

A sketch of this joining procedure appears below.
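A toy version of steps 1-5, assuming a plain-text PING/PONG encoding; real Gnutella descriptors are binary and carry a descriptor ID, TTL, and hop count, so this only illustrates the control flow.

```python
import socket

def join_network(candidate_peers, timeout=3.0):
    """Steps 1-2: sequentially try candidate peers until one accepts a TCP connection."""
    for host, port in candidate_peers:
        try:
            return socket.create_connection((host, port), timeout=timeout)  # connected to Y
        except OSError:
            continue  # this candidate is unreachable; try the next one
    raise RuntimeError("no candidate peer reachable")

def ping_pong(conn):
    """Steps 3-5: send a Ping and collect Pong replies (toy text encoding)."""
    conn.sendall(b"PING\n")
    conn.settimeout(3.0)
    pongs = []
    try:
        for line in conn.recv(4096).decode().splitlines():
            if line.startswith("PONG"):
                # assumed reply format: PONG <ip> <port> <num_files> <total_kbytes>
                pongs.append(tuple(line.split()[1:]))
    except socket.timeout:
        pass
    return pongs  # the addresses learned here can seed additional TCP connections
```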

Gnutella Scoped Flooding

Searching by flooding:
- If you don't have the file you want, query 7 of your neighbors.
- If they don't have it, they contact 7 of their neighbors, for a maximum hop count of 10.
- Requests are flooded, but there is no tree structure.
- No looping, but packets may be received twice.
- Reverse-path forwarding

[Figure from http://computer.howstuffworks.com/file-sharing.htm]

Gnutella protocol Query

- A Query message (each with a MessageID) is sent over existing TCP connections.
- Peers forward the Query message, keep track of the last socket the message arrived on together with its MessageID, and decrement the peer-count field.
- A QueryHit message is sent back over the reverse path using the MessageID, so that peers can remove QueryHit messages from the network.
- File transfer: HTTP
- Limited-scope query flooding has been implemented: a peer-count field of the query is decremented at each peer, and the query is returned to the sender when the count reaches 0.
- The maximum number of edges of a Gnutella overlay network with N nodes is N(N-1)/2.

A simulation sketch of this scoped flooding follows.

Gnutella vs Napster

- Distributes file location and decentralizes lookup
- Idea: multicast the request
- How to find a file:
  - Send the request to all neighbors
  - Neighbors recursively multicast the request
  - Eventually a machine that has the file receives the request and sends back the answer
- Advantages: totally decentralized, highly robust
- Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)

Recap: P2P file sharing Arch

- Centralized Directory: a central directory keeps track of peer IPs and their shared content. Example: Napster and Instant Messaging
- Distributed Query Flooding: peers keep their own shared directory, and content is located in nearby peers. Example: the Gnutella protocol
- Distributed Heterogeneous Peers: proprietary protocol; group leaders with high bandwidth act as central directories searched by connected peers. Example: KaZaA

Exploiting heterogeneity: KaZaA

- Proprietary protocol; encrypts the control traffic but not the data files
- Each peer is either a group leader or assigned to a group leader
  - TCP connection between a peer and its group leader
  - TCP connections between some pairs of group leaders
- A group leader tracks the content in all its children

[Figure: ordinary peers, group-leader peers, and neighboring relationships in the overlay network]

KaZaA: Querying

- Each file has a hash and a descriptor
- A client sends a keyword query to its group leader
- The group leader responds with matches: for each match, metadata, hash, and IP address
- If the group leader forwards the query to other group leaders, they respond with matches
- Limited-scope query flooding is also implemented by KaZaA
- The client then selects files for downloading: HTTP requests using the hash as identifier are sent to peers holding the desired file

KaZaA tricks to improve performance

- Request queuing: each peer can limit the number of simultaneous uploads (~3-7) to avoid long delays
- Incentive priorities: the more a peer uploads, the higher its priority to download
- Parallel downloading of a file across peers: a peer can download different portions of the same file from different peers using the byte-range header of HTTP (see the sketch below)
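A sketch of the parallel byte-range trick using only Python's standard library; the peer URLs are hypothetical and must all serve the same file and honor HTTP Range requests.

```python
import concurrent.futures
import urllib.request

def fetch_range(url, start, end):
    """Download bytes [start, end] of the file using the HTTP Range header."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return start, resp.read()

def parallel_download(peer_urls, file_size, chunk_size=1 << 20):
    """Fetch different portions of the same file from different peers in parallel."""
    ranges = [(off, min(off + chunk_size, file_size) - 1)
              for off in range(0, file_size, chunk_size)]
    parts = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fetch_range, peer_urls[i % len(peer_urls)], s, e)
                   for i, (s, e) in enumerate(ranges)]
        for fut in concurrent.futures.as_completed(futures):
            offset, data = fut.result()
            parts[offset] = data
    return b"".join(parts[off] for off in sorted(parts))  # reassemble in order
```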

BitTorrent

- P2P file-sharing communication protocol
- 130 million installations as of 2008 Q1
- tracker: each torrent has an infrastructure node which keeps a record of the peers participating in the torrent
- torrent: a group of peers exchanging chunks of a file
- A joining peer obtains a list of peers from the tracker and begins trading chunks with them

BitTorrent (1)

- The file is divided into 256 KB chunks
- Peer joining a torrent:
  - has no chunks, but will accumulate them over time
  - registers with the tracker to get a list of peers and connects to a subset of peers ("neighbors") concurrently over TCP
- Alice's neighboring peers may fluctuate over time
- Alice periodically asks each of her neighbors for the list of chunks they have (pull chunks)
- While downloading, a peer uploads chunks to other peers
- Peers may come and go
- Once a peer has the entire file, it may (selfishly) leave or (altruistically) remain

BitTorrent (2)

Which chunk to pull first?
- At any given time, different peers have different subsets of the file chunks
- Periodically, a peer (Alice) asks each neighbor for the list of chunks they have
- Alice issues requests for her missing chunks: rarest first

Which requests to respond to first: tit-for-tat trading
- Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  - re-evaluate the top 4 every 10 secs
- Every 30 secs: randomly select another peer and start sending it chunks
  - the new peer may join the top 4
- Random selection allows new peers to get chunks, so they can start to trade
- The trading algorithm helps eliminate the free-riding problem

A sketch of both policies follows.
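A sketch of the two policies just described, rarest-first chunk selection and tit-for-tat unchoking; the data structures (sets of chunk IDs, measured per-neighbor download rates) are assumptions for illustration.

```python
import random
from collections import Counter

def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk held by the fewest neighbors (rarest first)."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    candidates = [c for c in counts if c not in my_chunks]
    return min(candidates, key=lambda c: counts[c]) if candidates else None

def choose_unchoked(download_rates, all_neighbors, optimistic=True):
    """Tit-for-tat: serve the 4 neighbors currently uploading to us at the highest
    rate, plus (every 30 s) one randomly chosen, optimistically unchoked neighbor."""
    top4 = sorted(download_rates, key=download_rates.get, reverse=True)[:4]
    unchoked = set(top4)
    if optimistic:
        others = [n for n in all_neighbors if n not in unchoked]
        if others:
            unchoked.add(random.choice(others))   # lets newcomers earn a trading slot
    return unchoked
```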

File Distribution: Server-Client vs P2P

Question: How much time does it take to distribute a file of size F from one server to N peers?

- u_s: server upload bandwidth
- u_i: peer i upload bandwidth
- d_i: peer i download bandwidth

(The network core is assumed to have abundant bandwidth.)

File distribution time: server-client

- The server sequentially sends N copies: NF/u_s time
- Client i takes F/d_i time to download

Time to distribute F to N clients using the client/server approach:

d_cs = max{ NF/u_s, F/min_i(d_i) }

This increases linearly in N (for large N).

File distribution time: P2P

- The server must send one copy: F/u_s time
- Client i takes F/d_i time to download
- NF bits must be downloaded in aggregate
- The fastest possible aggregate upload rate is u_s + Σ_i u_i

d_P2P = max{ F/u_s, F/min_i(d_i), NF/(u_s + Σ_i u_i) }

Server-client vs. P2P: example

Client upload rate = u, F/u = 1 hour, u_s = 10u, d_min ≥ u_s

[Figure: minimum distribution time (hours) versus N from 0 to 35; the client-server time grows linearly with N while the P2P time stays bounded.]

The sketch below reproduces these curves numerically.

Issues with P2P

- Free riding (free loading)
  - Two types of free riding:
    - Downloading but not sharing any data
    - Not sharing any interesting data
  - On Gnutella:
    - 15% of users contribute 94% of the content
    - 63% of users never responded to a query (they didn't have "interesting" data)
- No ranking: what is a trusted source? ("spoofing")

P2P Case study: Skype

- P2P (PC-to-PC, PC-to-phone, phone-to-PC) Voice-over-IP (VoIP) application; also IM
- Proprietary application-layer protocol (inferred via reverse engineering)
- Hierarchical overlay: Skype clients (SC), supernodes (SN), and a Skype login server
- Founded by the same people as Kazaa
- Acquired by eBay in 2005 for $2.6B
- 300 million accounts by 2008 Q1

Skype: making a call

- User starts Skype
- SC registers with a supernode
  - using a cached SN list, or a list of bootstrap SNs that are hardwired
  - any node could be an SN
- SC logs in (authenticates) with the Skype login server
- Call: SC contacts an SN with the callee ID
  - The SN contacts other SNs (unknown protocol, maybe flooding) to find the address of the callee and returns the address to the SC
- SC directly contacts the callee, over TCP
- Up/down link bandwidth of about 5 kbytes/sec
- Over 60% of hosts are behind a NAT. How to call them?

NAT: Network Address Translation

NAT translation table: WAN-side addr 138.76.29.7, 5001 ↔ LAN-side addr 10.0.0.1, 3345

1: Host 10.0.0.1 sends a datagram to 128.119.40.186, 80 (S: 10.0.0.1, 3345; D: 128.119.40.186, 80)
2: The NAT router changes the datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001 and updates the table
3: The reply arrives with destination address 138.76.29.7, 5001
4: The NAT router changes the datagram dest addr from 138.76.29.7, 5001 back to 10.0.0.1, 3345

A toy model of this translation follows.
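A toy model of the translation table in the four steps above; port allocation starting at 5001 mirrors the example but is otherwise an arbitrary choice.

```python
class NAT:
    """Toy NAT router that rewrites (private IP, port) <-> (public IP, port)."""

    def __init__(self, wan_ip):
        self.wan_ip = wan_ip
        self.table = {}          # (lan_ip, lan_port) -> wan_port
        self.next_port = 5001    # next WAN-side port to hand out

    def outbound(self, src, dst):
        """Steps 1-2: rewrite the source of an outgoing datagram, recording the mapping."""
        if src not in self.table:
            self.table[src] = self.next_port
            self.next_port += 1
        return (self.wan_ip, self.table[src]), dst

    def inbound(self, src, dst):
        """Steps 3-4: rewrite the destination of a reply back to the private address."""
        for lan_addr, wan_port in self.table.items():
            if (self.wan_ip, wan_port) == dst:
                return src, lan_addr
        raise KeyError("no translation entry for this reply")

nat = NAT("138.76.29.7")
translated_src, dst = nat.outbound(("10.0.0.1", 3345), ("128.119.40.186", 80))
reply_src, reply_dst = nat.inbound(("128.119.40.186", 80), translated_src)
print(translated_src, reply_dst)   # ('138.76.29.7', 5001) ('10.0.0.1', 3345)
```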

NAT traversal problem in Skype

Relay approach:
- The NATed client establishes a connection to the relay (1)
- The external client connects to the relay (2)
- The relay bridges packets between the two connections (3)

[Figure: NATted host 10.0.0.1 behind NAT router 138.76.29.7, the external client, and the relay in between.]

Teleconference in Skype

- Involves multiple parties in a VoIP session
- A mixing operation is required: merging several streams of voice packets for delivery to the receivers
- Possible mixing approaches:
  - Each user has its own multicast tree
  - One designated node handles mixing and subsequent distribution
  - A single multicast tree
  - Decoupled: one tree for mixing and another for distribution

Skype Outage in Dec 2010

- On Dec 22, 2010, the P2P Skype network became unstable and suffered a critical failure. It lasted about 24 hours.
- Causes:
  - A cluster of support servers responsible for offline instant messaging became overloaded.
  - As a result, some Skype clients received delayed responses from the overloaded servers. In one version of the Skype for Windows client (version 5.0.0152), the delayed responses were not properly processed, causing clients running the affected version to crash.
  - Around 50% of all Skype users globally were running this version, and the crashes caused approximately 40% of those clients to fail. These clients included 25-30% of the publicly available supernodes, which also failed as a result.

Skype Outage in Dec 2010

Causes (cont'd):
- The failure of 25-30% of the supernodes in the P2P network resulted in an increased load on the remaining supernodes.
- Supernodes have a built-in mechanism to protect themselves and to avoid adverse impact on the systems hosting them when operational parameters fall outside expected ranges.
- The increased supernode traffic led to some of these parameters exceeding normal limits, so more supernodes started to shut down. This further increased the load on the remaining supernodes and created a positive feedback loop, which led to the near-complete failure a few hours after the triggering event.

How to recover from this outage?

Skype Reading List

- Baset, S. A. and Schulzrinne, H. G. An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol. In Proceedings of INFOCOM 2006.
- Caizzone, G., et al. Analysis of the Scalability of the Overlay Skype System. In Proceedings of ICC 2008.
- Kho, W., Baset, S. A., and Schulzrinne, H. G. Skype Relay Calls: Measurements and Experiments. In Proceedings of INFOCOM 2008.
- Gu, X., et al. peerTalk: A Peer-to-Peer Multiparty Voice-over-IP System. IEEE TPDS, 19(4):515-528, 2008.
- Lars Rabbe. CIO update: Post-mortem on the Skype outage. http://blogs.skype.com/en/2010/12/cio_update.html

P2P Video Streaming

A strong case for P2P:
- YouTube costs $1B+ for network bandwidth
- >1 Mbps last-mile download bandwidth, and several hundred Kbps upload bandwidth, are widely available

Examples of systems:
- Tree-based push approach: peers are organized into a tree structure for data delivery, with each packet being disseminated over the same overlay structure
- Pull-based data-driven approach: nodes maintain a set of partners and periodically exchange data-availability info with their partners
  - SplitStream, CoolStreaming, PPStream, PPLive

Video Streaming

A generic architecture for retrieving, storing, and playing back video packets:
- The original video packet stream is broken down into chunks, each with a unique chunk ID
- Buffer map: indicates the presence/absence of video chunks at the node
- A peer requests missing chunks from its connected peers

Key Issues

- Chunk size: determines the size of the buffer map, which needs to be exchanged frequently
- Replication strategies: how video chunks should be cached, dependent on the video coding
  - Selection and replacement of videos
  - Prefetching
- Chunk selection: which missing chunks should be downloaded first: sequential, rarest first, anchor first (see the sketch below)
  - Anchor first: get all the chunks located at predefined anchor points
- Transmission strategies: to maximize download rate and minimize overhead
  - Parallel download from multiple peers, which may cause duplication
  - Selective download of different chunks from multiple peers
  - Sequential download of chunks from peer to peer
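A sketch comparing the three chunk-selection policies above, given the partners' buffer maps; buffer maps are modeled simply as sets of chunk IDs, which is an assumption for illustration.

```python
from collections import Counter

def select_chunk(missing, buffer_maps, policy="rarest", anchors=()):
    """Pick the next missing chunk to request.
    missing: chunk IDs this peer still needs; buffer_maps: peer -> set of chunk IDs held."""
    missing = set(missing)
    available = set().union(*buffer_maps.values()) & missing
    if not available:
        return None
    if policy == "sequential":            # fill the playback buffer in order
        return min(available)
    if policy == "anchor" and anchors:    # grab predefined anchor points first
        anchored = available & set(anchors)
        return min(anchored) if anchored else min(available)
    counts = Counter()                    # rarest first: fewest copies among partners
    for chunks in buffer_maps.values():
        counts.update(chunks & missing)
    return min(available, key=lambda c: counts[c])

# Example: chunk 7 is held by only one partner, so "rarest" requests it first.
maps = {"p1": {3, 4, 5}, "p2": {3, 4, 7}, "p3": {3, 5}}
print(select_chunk({3, 4, 5, 6, 7}, maps, policy="rarest"))   # -> 7
```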

Scalable Coding

- Scalable Video Coding (SVC): splits images into hierarchical layers, with each successive layer improving the image quality
  - Extension of the H.264/MPEG-4 video compression standard
  - E.g., in MPEG-2, the I-frames can be encoded as the first layer, the first P-frames as the second layer, the second P-frames as the third layer, and the B-frames as even higher layers
- Multiple Description Coding (MDC): a video stream is encoded into multiple substreams (aka descriptions), with different importance as to restoring the original contents at the viewer's machine
- MDC creates independent descriptions (bitstreams), while SVC creates hierarchically dependent layers

P2P Streaming Example: PPTV

- PPLive: free P2P-based IPTV
- As of January 2006, the PPLive network provided 200+ channels with 400,000 daily users on average.
- The bit rates of video programs mainly range from 250 Kbps to 400 Kbps, with a few channels as high as 800 Kbps.
- The video content is mostly feeds from TV channels in Mandarin. The channels are encoded in two video formats: Windows Media Video (WMV) or Real Video.
- The encoded video content is divided into chunks and distributed through the P2P network.
- A peer downloads video chunks from multiple peers watching the same channel. This peer may also upload cached video chunks to multiple peers.
- Received video chunks are reassembled in order and buffered in a queue of the PPLive TV engine, forming a local streaming file in memory.
- When the streaming file length crosses a predefined threshold, the PPLive TV engine launches the media player, which downloads video content from the local HTTP streaming server.
- After the buffer of the media player fills up to the required level, the actual video playback starts.
- When PPLive starts, the PPLive TV engine downloads media content from peers aggressively to minimize playback start-up delay. When the media player receives enough content and starts to play the media, the streaming process gradually stabilizes. The PPLive TV engine then streams data to the media player at the media playback rate.

Measurement setup

- "Insights into PPLive: A Measurement Study of a Large-Scale P2P IPTV System" by X. Hei et al.
- One residential and one campus PC "watched" channel CCTV3
- The other residential and campus PCs "watched" channel CCTV10
- Each of these four traces lasted about 2 hours
- From the PPLive web site, CCTV3 is a popular channel with a 5-star popularity grade, and CCTV10 is less popular with a 3-star popularity grade

Session durations

- Signaling versus video sessions
- All sessions are TCP based
- The median video session is about 20 seconds, and about 10%

Video traffic breakdown among sessions

Start-up delays

Two types of start-up delay:
- the delay from when a channel is selected until the streaming player pops up
- the delay from when the player pops up until playback actually starts

The player pop-up delay is in general 10-15 seconds and the player buffering delay is around 10-15 seconds, so the total start-up delay is around 20-30 seconds. Nevertheless, some less popular channels have total start-up delays of up to 2 minutes.

Upload-download rates

Estimating the redundancy ratio

- It is possible to download the same video blocks more than once
- Excluding TCP/IP headers, determine the total streaming payload of the downloaded traffic
- Using a video-traffic filtering heuristic (packet size > 1200 B), extract the video traffic
- Given the playback interval and the media playback speed, obtain a rough estimate of the media segment size
- Compute the redundant traffic as the difference between the total received video traffic and the estimated media segment size

A sketch of this estimation follows.
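A sketch of the heuristic above; the packet payload sizes and the channel playback rate are inputs the measurement tooling would have to provide.

```python
def redundancy_ratio(payload_sizes, playback_seconds, playback_rate_bps):
    """Estimate the share of downloaded video traffic that was redundant.
    payload_sizes: per-packet payload sizes in bytes, TCP/IP headers already excluded."""
    video_bytes = sum(size for size in payload_sizes if size > 1200)  # video-packet filter
    segment_bytes = playback_seconds * playback_rate_bps / 8          # estimated media segment size
    redundant = max(0.0, video_bytes - segment_bytes)
    return redundant / video_bytes if video_bytes else 0.0

# e.g. a 2-hour trace of a 350 Kbps channel:
# ratio = redundancy_ratio(trace_payload_sizes, 2 * 3600, 350_000)
```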

Dynamics of video participants

Peer arrivals & departures

Geographic distribution of peers

Video Streaming Reading

- X. Hei, et al. "A Measurement Study of a Large-Scale P2P IPTV System." 2007.

Unstructured vs Structured P2P

- The systems we described do not offer any guarantees about their performance (or even correctness)
- Structured P2P:
  - Scalable guarantees on the number of hops to answer a query
  - Maintains all other P2P properties (load balance, self-organization, dynamic nature)
- Approach: Distributed Hash Tables (DHTs)

Distributed Hash Tables (DHT)

A hash table is a data structure that uses a hash function to map id values (keys) to their associated values.
- Stores (key, value) pairs
  - The key is like a filename
  - The value can be the file contents, or a pointer to the location
- Goal: efficiently insert/lookup/delete (key, value) pairs
- Each peer stores a subset of the (key, value) pairs
- Core operation: find the node responsible for a key
  - Map the key to a node
  - Efficiently route insert/lookup/delete requests to this node
  - Allow for frequent node arrivals/departures

A minimal sketch of this key-to-node mapping follows.
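A minimal sketch of the core operation, mapping a key to the responsible node with consistent hashing on a small identifier ring; real DHTs route the request hop by hop instead of consulting a global view as this toy class does.

```python
import bisect
import hashlib

M = 2 ** 16          # size of the identifier space (2^m, with a small m for the example)

def ident(name: str) -> int:
    """Consistent hash of a key or node name into the identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % M

class ToyDHT:
    """Assigns each key to the node whose ID is its closest successor on the ring."""

    def __init__(self, node_names):
        self.ring = sorted(ident(n) for n in node_names)
        self.store = {node_id: {} for node_id in self.ring}

    def responsible(self, key):
        idx = bisect.bisect_left(self.ring, ident(key)) % len(self.ring)
        return self.ring[idx]          # first node ID >= key ID, wrapping around

    def put(self, key, value):
        self.store[self.responsible(key)][key] = value

    def get(self, key):
        return self.store[self.responsible(key)].get(key)

dht = ToyDHT(["nodeA", "nodeB", "nodeC"])
dht.put("song.mp3", "peer 10.0.0.7")   # the value could be a pointer to the data's location
print(dht.get("song.mp3"))
```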

DHT Design Goals

An “overlay” network with:
- Flexible mapping of keys to physical nodes
- Small network diameter
- Small degree (fanout)
- Local routing decisions
- Robustness to churn
- Routing flexibility
- Decent locality (low “stretch”)

A “storage” or “memory” mechanism with:
- No guarantees on persistence
- Maintenance via soft state

Basic Ideas

- Keys are associated with globally unique integer IDs of size m bits (for large m)
- The key ID space (search space) is uniformly populated: keys are mapped to IDs using (consistent) hashing
- A node is responsible for indexing all the keys in a certain subspace (zone) of the ID space
- Nodes have only partial knowledge of other nodes' responsibilities

DHT ID Assignment

- Assign an integer identifier to each peer in the range [0, 2^n - 1]
  - Each identifier can be represented by n bits
- Require each key to be an integer in the same range
- Rule: assign each key to the peer that has the closest ID
- dataID, nodeID: any metric space will do

Circular DHT (1)

[Figure: peers with IDs 1, 3, 4, 5, 8, 10, 12, 15 arranged in a ring.]

- Each peer is only aware of its immediate successor and predecessor
- "Overlay network"

Circular DHT (2)

- O(N) messages on average to resolve a query when there are N peers
- Define "closest" as the closest successor

[Figure: peer 0001 asks "Who's responsible for key 1110?"; the query is forwarded around the ring through 0011, 0100, 0101, 1000, 1010, and 1100 until peer 1111 answers "I am".]

A lookup sketch that walks the ring this way follows.

Circular DHT with Shortcuts

Who's responsible for key 1110?
- Each peer keeps track of the IP addresses of its predecessor, its successor, and its shortcuts
- Reduced from 6 to 2 messages

[Figure: the same ring of peers 1, 3, 4, 5, 8, 10, 12, 15, now with shortcut edges.]

Design Space

- How many shortcut neighbors should each peer have, and which peers should they be?
- It is possible to design shortcuts so that each peer has O(log N) neighbors and a query takes O(log N) messages
  - E.g., Chord [Stoica 2001]
- High overhead for routing-table maintenance in a large-scale network; compromise lookup efficiency?
  - Constant-degree DHT with degree O(1) and O(log N) query efficiency
  - E.g., Cycloid [Shen'06]

Peer Churn

[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15.]

- To handle peer churn, require each peer to know the IP addresses of its two successors
- Each peer periodically pings its two successors to see if they are still alive
- Example: peer 5 abruptly leaves
  - Peer 4 detects the failure; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8's immediate successor its second successor
- What if peer 13 wants to join?

Content Addressable Network (CAN) : 2D Space

- The space is divided between nodes
- All nodes together cover the entire space
- Each node covers either a square or a rectangular area with side ratio 1:2 or 2:1
- Example: node n1:(1, 2) is the first node to join and covers the entire space

[Figure: a 7x7 coordinate grid owned entirely by n1.]

CAN Example: 2D Space

- Node n2:(4, 2) joins; the space is divided between n1 and n2

[Figure: the grid split between n1 and n2.]

CAN Example: 2D Space

- Node n3:(3, 5) joins; the space is divided further

[Figure: the grid split among n1, n2, and n3.]

CAN Example: 2D Space

- Nodes n4:(5, 5) and n5:(6, 6) join

[Figure: the grid split among n1, n2, n3, n4, and n5.]

CAN Example: 2D Space

- Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
- Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)
- Each item is stored by the node that owns its mapping in the space

[Figure: the grid showing where each item falls within the nodes' zones.]

CAN: Query Example

- Each node knows its neighbors in the d-dimensional space
- Forward the query to the neighbor that is closest to the query id
- Example: assume n1 queries f4 (see the routing sketch below)
- Can route around some failures; some failures require local flooding

[Figure: the query path from n1 toward f4:(7, 5) across neighboring zones.]
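A greedy-forwarding sketch for the example query (n1 looking up f4 at (7, 5)); the zone centres and the neighbor lists are illustrative guesses at the partition shown in the figures, not values defined by CAN itself.

```python
import math

def can_route(start, target, neighbors, centre):
    """Forward the query to the neighbor whose zone centre is closest to the target
    point, stopping when no neighbor is closer than the current node."""
    path, current = [start], start
    while True:
        dist = lambda node: math.dist(centre[node], target)
        closer = [n for n in neighbors[current] if dist(n) < dist(current)]
        if not closer:
            return path          # current node's zone holds (or is nearest to) the target
        current = min(closer, key=dist)
        path.append(current)

# Nodes from the slides; the adjacency below is an assumed reading of the 2-D partition.
centre = {"n1": (1, 2), "n2": (4, 2), "n3": (3, 5), "n4": (5, 5), "n5": (6, 6)}
neighbors = {"n1": ["n2", "n3"], "n2": ["n1", "n4"], "n3": ["n1", "n4"],
             "n4": ["n2", "n3", "n5"], "n5": ["n4"]}
print(can_route("n1", (7, 5), neighbors, centre))   # query for f4 -> ['n1', 'n3', 'n4', 'n5']
```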


DHT Reading List

- Stoica, et al. Chord: A scalable peer-to-peer lookup service for Internet applications. Proc. of ACM SIGCOMM 2001.
- Ratnasamy, et al. A scalable content-addressable network. Proc. of ACM SIGCOMM 2001.
- Shen, et al. Cycloid: A constant-degree lookup-efficient P2P network. Proc. of IPDPS'04. (Performance Evaluation, 2006)
- H. Shen and C. Xu. Locality-aware and churn-resilient load balancing algorithms in structured peer-to-peer networks. IEEE TPDS, 18(6):849-862, June 2007.
- H. Shen and C. Xu. Hash-based proximity clustering for efficient load balancing in heterogeneous DHT networks. JPDC, 68(5):686-702, May 2008.
- H. Shen and C. Xu. Elastic routing table with provable performance for congestion control in DHT networks. IEEE TPDS, 21(2):242-256, February 2010.
- H. Shen and C. Xu. Leveraging a compound graph based DHT for multi-attribute range queries with performance analysis. IEEE Transactions on Computers, 2011 (accepted).

Examples of Network Services

- E-mail
- Web and DNS
- Instant messaging
- Remote login
- P2P file sharing
- Multi-user network games
- Streaming stored video clips
- Internet telephony
- Real-time video conferencing
- Cloud computing and storage