Peer-to-Peer Systems: Theory & Practice

Ashwin R. Bharambe 15744 Lecture

Overview

Internet Indirection Infrastructure (i3)
Freenet
BitTorrent
  Content distribution
Effect of P2P networks on the Internet
  What does the new traffic matrix look like?

4/30/2020 Ashwin R. Bharambe 2

i3: Motivation

Today’s Internet is based on a point-to-point abstraction
Applications need more:
  Multicast
  Mobility
  Anycast
Existing solutions:
  Change the IP layer
  Overlays

So, what’s the problem?

A different solution for each service


The i3 solution

Indirection
  Every problem in CS …
  Only primitive needed

Solution: add an indirection layer on top of IP
  Implement using overlay networks
Solution components:
  Naming using “identifiers”
  Subscriptions using “triggers”
  DHT as the gluing substrate


i3: Rendezvous Communication

Packets are addressed to identifiers (“names”)
Trigger = (Identifier, IP address): inserted by the receiver
Senders are decoupled from receivers

[Figure: the Sender issues send(ID, data); the trigger (ID, R) turns it into send(R, data), which reaches the Receiver (R)]


i3: Service Model

API

sendPacket(id, p);
insertTrigger(id, addr);
removeTrigger(id, addr); // optional

Best-effort service model (like IP)
Triggers periodically refreshed by end-hosts
Reliability, congestion control, and flow control implemented at end-hosts
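The API above can be sketched as a single in-memory i3 server. This is a minimal sketch, not the i3 implementation: the class name `I3Server`, the `ttl` lifetime, the explicit `now` clock argument, and the `deliver` callback are all assumptions made for illustration. Triggers are treated as soft state, so a trigger that is not refreshed within `ttl` seconds expires, and a packet with no live trigger is simply dropped (best effort).

```python
class I3Server:
    """Toy single-node i3 server holding soft-state triggers (illustrative only)."""

    def __init__(self, deliver, ttl=30.0):
        self.deliver = deliver          # hypothetical stand-in for IP delivery: deliver(addr, p)
        self.ttl = ttl                  # assumed trigger lifetime in seconds
        self.triggers = {}              # id -> {addr: expiry_time}

    def insert_trigger(self, id_, addr, now):
        # Inserting an existing trigger again acts as a refresh.
        self.triggers.setdefault(id_, {})[addr] = now + self.ttl

    def remove_trigger(self, id_, addr):
        # Optional: an unrefreshed trigger would expire anyway.
        self.triggers.get(id_, {}).pop(addr, None)

    def send_packet(self, id_, p, now):
        # Best effort: forward to every live trigger, drop silently if none match.
        live = {a: t for a, t in self.triggers.get(id_, {}).items() if t > now}
        self.triggers[id_] = live
        for addr in live:
            self.deliver(addr, p)
        return len(live)


received = []
server = I3Server(deliver=lambda addr, p: received.append((addr, p)))
server.insert_trigger("id42", "10.0.0.7", now=0.0)
server.send_packet("id42", "hello", now=1.0)     # delivered to 10.0.0.7
server.send_packet("id42", "late", now=100.0)    # trigger expired, packet dropped
```

Passing the clock in as `now` keeps the sketch deterministic; a real end-host would re-insert its triggers periodically instead.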


i3: Implementation

Use a Distributed Hash Table
  Scalable, self-organizing, robust
  Suitable as a substrate for the Internet

[Figure: Sender, trigger (ID, R), and Receiver (R); arrow labels: send(ID, data), DHT.put(id) (twice), send(R, data), IP.route(R)]
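The slide names only "a Distributed Hash Table", so the following is a sketch under assumptions: it uses a consistent-hashing ring with a successor rule, a 32-bit identifier space, and SHA-1 to place servers on the ring. The names `Ring`, `ring_hash`, and `server_for`, and the server host names, are hypothetical. The point it illustrates is that the receiver's trigger insertion and the sender's packet both hash to the same identifier, so they meet at the same server without knowing about each other.

```python
import bisect
import hashlib

RING_BITS = 32  # assumed identifier width for this sketch


def ring_hash(name: str) -> int:
    """Map a string onto the ring [0, 2**RING_BITS) using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class Ring:
    def __init__(self, servers):
        # Sort servers by their position on the ring.
        self.points = sorted((ring_hash(s), s) for s in servers)
        self.keys = [k for k, _ in self.points]

    def server_for(self, ident: int) -> str:
        # Successor rule: first server at or after ident, wrapping around.
        i = bisect.bisect_left(self.keys, ident % (2 ** RING_BITS))
        return self.points[i % len(self.points)][1]


ring = Ring(["s1.example", "s2.example", "s3.example"])
trigger_id = ring_hash("some-identifier")
home = ring.server_for(trigger_id)   # where the receiver's trigger is stored
```

A sender that calls `ring.server_for(trigger_id)` with the same identifier reaches `home` as well, which is all the rendezvous needs.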

Mobility and Multicast

Send to many

Mobility supported naturally
  End-host inserts a trigger with its new IP address, and everything is transparent to the sender
  Robust, and supports location privacy
Multicast
  All receivers insert triggers under the same ID
  Sender uses that ID for sending
  Can optimize tree construction to balance load
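Both behaviours fall out of the same trigger table, which the following toy example tries to show. It is only a sketch: the module-level `triggers` table, the helper names, and the addresses are made up for illustration, and expiry and tree optimization are ignored.

```python
from collections import defaultdict

triggers = defaultdict(set)   # identifier -> set of receiver addresses


def insert_trigger(ident, addr):
    triggers[ident].add(addr)


def remove_trigger(ident, addr):
    triggers[ident].discard(addr)


def send(ident, data):
    """Return the (address, data) deliveries i3 would make for this packet."""
    return sorted((addr, data) for addr in triggers[ident])


# Multicast: every group member inserts a trigger under the same identifier.
for member in ("R1", "R2", "R3"):
    insert_trigger("group-id", member)
multicast = send("group-id", "frame-1")

# Mobility: the host replaces its trigger; the sender keeps using the same identifier.
insert_trigger("alice-id", "128.2.0.1")
before = send("alice-id", "hi")
remove_trigger("alice-id", "128.2.0.1")
insert_trigger("alice-id", "192.0.2.9")
after = send("alice-id", "hi")
```

In both cases the sender's call is unchanged; only the receivers' triggers differ.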


Anycast

Send to any one

Generalized matching
  First k bits have to match; longest prefix match among the rest
Related triggers must be on the same server
Server selection (randomize last bits)

[Figure: the Sender sends to identifier (a | b); triggers (a | b1), (a | b2), (a | b3) point to R1, R2, R3]
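The matching rule can be sketched on short bit strings. This is an illustration under assumptions: identifiers are 8-character strings of "0" and "1", k is 4, and `match_trigger` and `common_prefix_len` are hypothetical helper names. A trigger qualifies only if its first k bits equal the packet's; among qualifying triggers, the one sharing the longest prefix with the packet identifier wins.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading positions at which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def match_trigger(packet_id: str, triggers: dict, k: int):
    """Pick the trigger whose id shares the first k bits and the longest prefix overall."""
    candidates = [t for t in triggers if t[:k] == packet_id[:k]]
    if not candidates:
        return None
    best = max(candidates, key=lambda t: common_prefix_len(t, packet_id))
    return triggers[best]


# 8-bit ids with k = 4: the prefix "1010" names the anycast group.
triggers = {"10100001": "R1", "10101100": "R2", "10101111": "R3", "01101111": "R9"}
chosen = match_trigger("10101110", triggers, k=4)   # R3 shares 7 leading bits
```

A sender that randomizes the last bits of the packet identifier spreads its packets over the group members, which is one way to read "server selection" on the slide.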


Generalization: Identifier Stack

Stack of identifiers
  i3 routes the packet through these identifiers
Receiver’s trigger maps id to a stack of identifiers
Sender can also specify an id-stack in the packet
Mechanism:
  first id used to match a trigger
  rest added to the RHS of the trigger
  recursively continued
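The mechanism can be sketched as a small loop. The following is a simplified illustration, not i3's forwarding code: the function name `route`, the `"addr:"` prefix used to tell addresses from identifiers, the `max_hops` guard, and the transcoder example are all assumptions.

```python
def route(stack, data, triggers, max_hops=16):
    """Follow an identifier stack until its head is an address; return (addr, rest, data)."""
    stack = list(stack)
    for _ in range(max_hops):
        if not stack:
            return None                      # nothing left to route on
        head, rest = stack[0], stack[1:]
        if head.startswith("addr:"):         # assumed convention marking an IP address
            return head, rest, data
        if head not in triggers:
            return None                      # best effort: no trigger, drop
        # Replace the matched id by the trigger's right-hand side, keep the rest.
        stack = list(triggers[head]) + rest
    return None


triggers = {
    "id_gif_jpg": ["addr:transcoder"],
    "id_R": ["addr:R"],
}
# Sender-specified stack: visit the transcoder first, then the receiver's id.
hop1 = route(["id_gif_jpg", "id_R"], "image.gif", triggers)
# The transcoder forwards what remains of the stack.
hop2 = route(hop1[1], "image.jpg", triggers)
```

Here `hop1` delivers to the transcoder with `["id_R"]` still on the stack, and `hop2` resolves that remaining identifier to the receiver.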


Service Composition

Receiver-mediated: R sets up the chain and passes id_gif/jpg to the sender; sender oblivious
Sender-mediated: S can include (id_gif/jpg, ID) in his packet; receiver oblivious

[Figure: Sender (GIF) issues send((ID_GIF/JPG, ID), data); trigger (ID_GIF/JPG, S_GIF/JPG) hands it to the service S_GIF/JPG, which issues send(ID, data); trigger (ID, R) forwards it as send(R, data) to Receiver R (JPG)]

Public, Private Triggers

Servers publish their public ids, e.g., via DNS
Clients contact the server using public ids, and negotiate private ids used thereafter
Useful:
  Efficiency -- private ids chosen on “close-by” i3 servers
  Security -- private ids are shared secrets


Scalable Multicast

Replication possible at any i3 server in the infrastructure.
Tree construction can be done internally

[Figure: a packet (g, data) matches triggers (g, R1), (g, R2), and (g, x); triggers (x, R3) and (x, R4) then deliver it to R3 and R4]


Evaluation

Efficiency
  Metric: latency stretch
  Sender → i3 takes many hops
  Sender → i3 → Receiver triangle routing
Decoupling of senders and receivers
One framework for various new abstractions
Scalable, incrementally deployable
Performance overheads
  What speeds can this support?


Switch tracks…

I don’t understand any DHT stuff; it’s all unreal
All I understand is… FILE SHARING


P2P Applications

Centralized model
  e.g., Napster
  global index held by a central authority
  direct contact between requestors and providers

[Figure: NAPSTER, with a central index server]

P2P Applications

Decentralized model
  e.g., Freenet, Gnutella
  no global index – local knowledge only (approximate answers)
  contact mediated by a chain of intermediaries

[Figure: KAZAA, with index servers; FREENET or GNUTELLA]


What is Freenet and Why?

Distributed, peer-to-peer file sharing system
Completely anonymous, for producers or consumers of information
Resistance to attempts by third parties to deny access to information


Freenet: How it works

Data structure
Key management
Problems
  How can one node know about others?
  How can it get data from remote nodes?
  How to add new nodes to Freenet?
  How does Freenet manage its data?


Data structure

Each document is associated with a “key”
Routing Table pairs
Data structure should be able to:
  rapidly find the document given a certain key
  rapidly find the closest key to a given key
  keep track of the popularity of documents and know which document to delete when under pressure
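The three requirements above can be illustrated with a small store. This is a minimal sketch under assumptions, not Freenet's data structure: keys are integers, "closest" means smallest numeric distance, recency of use stands in for popularity, and `DataStore` with its methods is a hypothetical name. The closest-key search is a linear scan for clarity; a sorted index would be needed to make it fast.

```python
from collections import OrderedDict


class DataStore:
    """Toy node store: exact lookup, closest-key lookup, least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.docs = OrderedDict()            # key (int) -> document, oldest use first

    def put(self, key, doc):
        self.docs[key] = doc
        self.docs.move_to_end(key)
        while len(self.docs) > self.capacity:
            self.docs.popitem(last=False)    # under pressure, drop the least recently used

    def get(self, key):
        if key not in self.docs:
            return None
        self.docs.move_to_end(key)           # a hit counts as recent use
        return self.docs[key]

    def closest_key(self, key):
        # Linear scan for clarity; closeness is assumed to be numeric distance.
        return min(self.docs, key=lambda k: abs(k - key), default=None)


store = DataStore(capacity=2)
store.put(10, "a")
store.put(50, "b")
store.get(10)          # touch key 10 so that key 50 becomes the eviction candidate
store.put(90, "c")     # evicts key 50
```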


Key Management (1)

A way to locate a document anywhere
Keys are used to form a URI
Keyword-signed Key (KSK)
  Based on a short descriptive string, usually a set of keywords that can describe the document
  Example: University/cmu/cs/ashu
  Uniquely identify a document
  Potential problem – global namespace


Key Management (2)

Signed-subspace Key (SSK)
  Add sender information to avoid namespace conflict
  Private key to sign / public key to verify
Content-hash Key (CHK)
  Hash of the document
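Of the key types listed, the content-hash key is the simplest to illustrate. The sketch below assumes SHA-1 as the hash function and uses hypothetical helper names (`content_hash_key`, `verify`); KSK and SSK additionally need public-key signatures and are not shown.

```python
import hashlib


def content_hash_key(document: bytes) -> str:
    """CHK sketch: the key is a hash of the document bytes (SHA-1 assumed here)."""
    return hashlib.sha1(document).hexdigest()


def verify(document: bytes, key: str) -> bool:
    # Anyone holding the key can check that the returned data was not altered.
    return content_hash_key(document) == key


doc = b"peer-to-peer lecture notes"
chk = content_hash_key(doc)
```

Because the key is derived from the content, two different documents are very unlikely to collide on a key, and a node that returns modified data can be detected with `verify`.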


Perform a depth-first search
  Forward to the nearest “untried” key
On success, return data to the upstream requestor
  Cache the data source

[Figure: nodes A, B, C, D, and I; request “A, Help me!” and reply “Sorry, No”]
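The search described above can be sketched as a recursive depth-first walk. This is an illustration under assumptions rather than Freenet's routing code: keys are integers compared by numeric distance, `Node`, `request`, and the `htl` hop limit are names chosen for the sketch, and the small four-node topology is invented.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}      # key -> data held or cached locally
        self.table = {}      # known key -> neighbouring Node believed to hold it

    def request(self, key, htl, visited=None):
        """Depth-first search; returns the data or None."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if key in self.store:
            return self.store[key]
        if htl <= 0:
            return None
        # Try neighbours in order of how close their known key is to the request.
        for known in sorted(self.table, key=lambda k: abs(k - key)):
            nxt = self.table[known]
            if nxt.name in visited:
                continue                      # already tried on this search
            data = nxt.request(key, htl - 1, visited)
            if data is not None:
                self.store[key] = data        # cache on the way back upstream
                self.table[key] = nxt         # remember where the data came from
                return data
        return None                           # "Sorry, No": caller backtracks


a, b, c, d = (Node(n) for n in "ABCD")
d.store[42] = "document"
a.table = {40: b, 70: c}     # B looks closest to key 42, but is a dead end
c.table = {45: d}
result = a.request(42, htl=5)
```

In this run A first tries B, backtracks when B has nothing, then succeeds through C; both C and A end up caching the document and a pointer toward its source.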

Routing algorithm characteristics

Key clustering
  Data partitioning
  Nodes know about keys “similar” to theirs
  Store clusters of files with same keys
Popular data gets cached more
  Seamless replication to avoid hot-spots
As time progresses, connectivity increases


File insertion

Query the file key
  A response ⇒ key collision
  Re-send with a different key
On success, nodes cache the file with a pointer to the data source
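The insertion steps can be sketched with the path reduced to a list of per-node dictionaries. This is a deliberately simplified illustration: the function name `insert`, the record layout, and the `"inserter"` source label are assumptions, and routing of the insert message is not modelled.

```python
def insert(path, key, data):
    """Try to insert data under key along a path of node stores (list of dicts).

    Returns False on a key collision, True once every node on the path holds a copy.
    """
    # Step 1: query for the key; any response means the key is already taken.
    if any(key in store for store in path):
        return False
    # Step 2: no collision, so each node on the path caches the file
    # together with a pointer to the data source (here: a placeholder name).
    for store in path:
        store[key] = {"data": data, "source": "inserter"}
    return True


path = [{}, {}, {7: {"data": "old", "source": "someone"}}]
first = insert(path, 7, "new file")        # collision with the existing key 7
second = insert(path, 8, "new file")       # re-sent under a different key
```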


Node join

Need to assign a “key” to the node
Two options:
  Existing node chooses the key
  Joining node chooses its key
What’s the problem?

Uses a bit commitment protocol
  hash(a) → hash(b ^ hash(a)) → hash(c ^ hash(b ^ hash(a)))
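The chain of hashes above can be computed directly. The sketch below assumes SHA-256 as the hash, 32-byte random seeds a, b, c (one per participating node), and `^` meaning bytewise XOR; the helper names are hypothetical. How the final node key is derived from the revealed seeds is not stated on the slide, so that step is left out.

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def fold_seeds(seeds):
    """Compute hash(a), hash(b ^ hash(a)), hash(c ^ hash(b ^ hash(a))), ..."""
    acc = h(seeds[0])
    chain = [acc]
    for seed in seeds[1:]:
        acc = h(xor(seed, acc))
        chain.append(acc)
    return chain


seeds = [secrets.token_bytes(32) for _ in range(3)]   # a, b, c: one per node
chain = fold_seeds(seeds)                             # commitments passed along the path

# After the seeds are revealed, anyone can recompute the chain and detect cheating.
honest = fold_seeds(seeds) == chain
tampered = fold_seeds([seeds[0], secrets.token_bytes(32), seeds[2]]) == chain
```

Each commitment depends on every earlier seed, so no single node can change its contribution after seeing the others without the recomputed chain failing to match.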


Anonymity

Sender remains anonymous
  Data sources are randomly modified as the packet traverses
  Use “pre-routing” with “mix-nets” to enhance
Receiver (or key) anonymity
  “mix-nets”


Scalability

The relation between network size and average path length.

[Figure: X-axis: number of nodes; Y-axis: path length]

Initially, 20 nodes. Add nodes regularly.


Small world Model

[Figure: X-axis: number of links; Y-axis: fraction of nodes (log scale)]

Most nodes have only a few connections, while a small number of nodes have a large set of connections.

WHY?

Power law


What’s good?

Distributed storage and retrieval
Anonymity
Adaptive replication based on usage patterns

Anything else?


Is it perfect?

Query path-length
  Not bounded
Difficult to know the cause of search failures
  Document did not exist?
  Could not find it?

Anything else?


Switch tracks…

How does file sharing change the Internet?


Users are patient


batch mode delivery!


Audio-Video

Small objects: audio
Large objects: video

Object Dynamics

Fetch-at-most-once
Short-lived popularity
Recently born objects most popular
Most requests are for “old” objects


File sharing not Zipf!
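One way to see why fetch-at-most-once behaviour pulls the observed popularity curve away from Zipf is a small simulation. This is a toy sketch, not the measurement behind the slide: the function `simulate`, the Zipf exponent of 1, and the client, object, and request counts are all assumptions. Every client draws requests from the same Zipf-like law, but in the at-most-once case a client never fetches the same object twice.

```python
import random
from collections import Counter


def simulate(n_clients, n_objects, requests_per_client, at_most_once, seed=0):
    """Return per-object request counts when every client samples from a Zipf(1) law."""
    rng = random.Random(seed)
    objects = list(range(n_objects))
    weights = [1.0 / (rank + 1) for rank in objects]     # Zipf with exponent 1
    counts = Counter()
    for _ in range(n_clients):
        seen = set()
        for _ in range(requests_per_client):
            obj = rng.choices(objects, weights=weights)[0]
            if at_most_once and obj in seen:
                continue          # already downloaded: no second fetch
            seen.add(obj)
            counts[obj] += 1
    return counts


repeat = simulate(50, 200, 100, at_most_once=False)
once = simulate(50, 200, 100, at_most_once=True)
```

With at-most-once fetching, no object can be requested more often than there are clients, so the head of the distribution is capped and flattened while the tail is unchanged.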


Conclusions

Many other interesting aspects
  Some obvious, some not
Contribution
  Fetch-at-most-once
  significant locality
  substantial opportunity for caching
