CMPT 880: P2P Systems

School of Computing Science
Simon Fraser University
CMPT 765/408: P2P Systems
Instructor: Dr. Mohamed Hefeeda
P2P Computing: Definitions

Peers cooperate to achieve desired functions
- Peers:
• End-systems (typically, user machines)
• Interconnected through an overlay network
• Peer ≡ like the others (has similar resources, or behaves in a similar manner)
- Cooperate:
• Share resources, e.g., data, CPU cycles, storage, bandwidth
• Participate in protocols, e.g., routing, replication, …
- Functions:
• File-sharing, distributed computing, communications,
content distribution, …

Note: the P2P concept is much wider than file
sharing
When Did P2P Start?

Napster (Late 1990’s)
- Court shut Napster down in 2001

Gnutella (2000)

Then the killer app: FastTrack (Kazaa, ...)

BitTorrent, and many others

Accompanied by significant research interest

Claim
- P2P is much older than Napster!

Proof
- The original Internet!
- Remember UUCP (unix-to-unix copy)?
What IS and IS NOT New in P2P?

What is not new
- Concepts!

What is new
- The term P2P (maybe!)
- New characteristics of
• the nodes that constitute the system, and
• the system that we build
What IS NOT New in P2P?

Distributed architectures

Distributed resource sharing

Node management (join/leave/fail)

Group communications

Distributed state management

….
What IS New in P2P?

Nodes (Peers)
- Quite heterogeneous
• Several orders of magnitude difference in resources
• Compare the bandwidth of a dial-up peer versus a
high-speed LAN peer
- Unreliable
• Failure is the norm!
- Offer limited capacity
• Load sharing and balancing are critical
- Autonomous
• Rational, i.e., maximize their own benefits!
• Incentives should be provided so that peers cooperate
in a way that optimizes overall system performance
What IS New in P2P? (cont’d)

System
- Scale
• Huge numbers of peers (millions)
- Structure and topology
• Ad-hoc: No control over peer joining/leaving
• Highly dynamic
- Membership/participation
• Typically open 
- More security concerns
• Trust, privacy, data integrity, …
- Cost of building and running
• A small fraction of the cost of centralized systems of the same scale
• How much would it cost to build and run a supercomputer
with the processing power of SETI@home's 3 million PCs?
What IS New in P2P? (cont’d)

So what?

We need to design new, lighter-weight
algorithms and protocols to scale to
millions (or billions!) of nodes given the
new characteristics

Question: why now, not two decades ago?
- We did not have such abundant (and
underutilized) computing resources back then!
- And, network connectivity was very limited
Why is it Important to Study P2P?

P2P traffic is a major portion of Internet
traffic (50+%); it is the current killer app

P2P traffic has exceeded web traffic
(the former killer app)!

Direct implications on the design,
administration, and use of computer
networks and network resources
- Think of ISP designers or campus network
administrators

Many potential distributed applications
Sample P2P Applications

File sharing
- Gnutella, Kazaa, BitTorrent, …

Distributed cycle sharing
- SETI@home, Genome@home, …

File and storage systems
- OceanStore, CFS, Freenet, Farsite, …

Media streaming and content distribution
- PROMISE
- SplitStream, CoopNet, PeerCast, Bullet,
Zigzag, NICE, …
P2P vs. its Cousin (Grid Computing)

Common Goal:
- Aggregate resources (e.g., storage, CPU
cycles, and data) into a common pool and
provide efficient access to them

Differences along five axes [Foster & Iamnitchi 03]
- Target communities and applications
- Type of shared resources
- Scalability of the system
- Services provided
- Software required
P2P vs Grid Computing (cont'd)

Communities
- Grid: established communities, e.g., scientific institutions
- P2P: grass-root communities (anonymous)

Applications
- Grid: computationally-intensive problems
- P2P: mostly file swapping

Shared resources
- Grid: powerful and reliable machines, clusters; high-speed
connectivity; specialized instruments
- P2P: PCs with limited capacity and connectivity; unreliable;
very diverse
P2P vs Grid Computing (cont'd)

System scalability
- Grid: hundreds to thousands of nodes
- P2P: hundreds of thousands to millions of nodes

Services provided
- Grid: sophisticated services: authentication, resource
discovery, scheduling, access control, and membership
control; members usually trust each other
- P2P: limited services: resource discovery; limited trust
among peers

Software required
- Grid: sophisticated suite, e.g., Globus, Condor
- P2P: simple (screen saver), e.g., Kazaa, SETI@home
P2P vs Grid Computing: Discussion

The differences mentioned are based on the
traditional view of each paradigm
- It is envisioned that the two paradigms will converge
and complement each other [e.g., Butt et al. 03]

Target communities and applications
- Grid: is opening up to wider communities

Type of shared resources
- P2P: will include more varied and more powerful resources

Scalability of the system
- Grid: will grow to larger numbers of nodes

Services provided
- P2P: will add authentication, data integrity, trust
management, …
P2P Systems: Simple Model

System architecture: peers form an overlay
according to the P2P substrate

Software architecture model on a peer (layers,
top to bottom):
- P2P Application
- Middleware
- P2P Substrate
- Operating System
- Hardware
Overlay Network

An abstract layer built on top of the physical network
(a toy sketch follows this slide)

Neighbors in the overlay can be several hops apart in
the physical network

Why do we need overlays?
- Flexibility in
• Choosing neighbors
• Forming and customizing topology to fit application’s
needs (e.g., short delay, reliability, high BW, …)
• Designing communication protocols among nodes
- Get around limitations in legacy networks
- Enable new (and old!) network services
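To make the overlay idea concrete, here is a minimal Python sketch (not from the slides; the node IDs, IPs, and topology are invented) of peers keeping application-level neighbor lists that are independent of physical proximity:

class OverlayNode:
    """A peer in the overlay: an end-system holding application-level links."""
    def __init__(self, node_id, ip):
        self.node_id = node_id
        self.ip = ip
        self.neighbors = []           # overlay links, chosen by the application

    def connect(self, other):
        # An overlay edge is just state at two end-systems; the underlying
        # IP path between them may traverse many physical hops.
        self.neighbors.append(other)
        other.neighbors.append(self)

# Toy example: a and c become overlay neighbors even if they are far
# apart in the physical network.
a = OverlayNode(1, "10.0.0.1")
b = OverlayNode(2, "10.0.1.7")
c = OverlayNode(3, "10.0.2.9")
a.connect(b)
a.connect(c)
print([n.ip for n in a.neighbors])    # ['10.0.1.7', '10.0.2.9']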
Overlay Network (cont'd)
[Figure: an overlay network built on top of the physical (IP)
network; each overlay link maps to a multi-hop physical path]
Overlay Network (cont’d)

Some applications that use overlays
- Application level multicast, e.g., ESM, Zigzag, NICE, …
- Reliable inter-domain routing, e.g., RON
- Content Distribution Networks (CDN)
- Peer-to-peer file sharing

Overlay design issues
- Select neighbors
- Handle node arrivals, departures
- Detect and handle failures (nodes, links)
- Monitor and adapt to network dynamics
- Match with the underlying physical network
Overlay Network (cont'd)
Recall: IP Multicast
[Figure: an IP multicast delivery tree rooted at the source]
Overlay Network (cont'd)
Application Level Multicast (ALM)
[Figure: an application-level multicast tree rooted at the
source, built over the overlay]
Peer Software Model

A software client installed on each peer

Three components:
- P2P Substrate
- Middleware
- P2P Application

[Figure: software model on a peer: P2P Application over
Middleware over P2P Substrate, above the Operating System
and Hardware]
Peer Software Model (cont’d)

P2P Substrate (key component)
- Overlay management
• Construction
• Maintenance (peer join/leave/fail and network
dynamics)
- Resource management
• Allocation (storage)
• Discovery (routing and lookup)

Ex: Pastry, CAN, Chord, …

More on this later
Peer Software Model (cont’d)

Middleware
- Provides auxiliary services to P2P applications:
• Peer selection
• Trust management
• Data integrity validation
• Authentication and authorization
• Membership management
• Accounting (Economics and rationality)
• …
- Ex: CollectCast, EigenTrust, micropayment schemes
Peer Software Model (cont’d)

P2P Application
- Potentially, there could be multiple applications
running on top of a single P2P substrate
- Applications include
• File sharing
• File and storage systems
• Distributed cycle sharing
• Content distribution
- This layer provides functions and bookkeeping
relevant to the target application
• File assembly (file sharing)
• Buffering and rate smoothing (streaming)

Ex: Promise, Bullet, CFS
P2P Substrate

Key component, which
- Manages the Overlay
- Allocates and discovers objects

P2P Substrates can be
- Structured
- Unstructured
(classified based on the flexibility of placing
objects at peers)
P2P Substrates: Classification

Structured (or tightly controlled, DHT)
− Objects are rigidly assigned to specific peers
− Behaves like a Distributed Hash Table (DHT)
− Efficient search & guarantee of finding
− Lack of partial name and keyword queries
− Maintenance overhead
− Ex: Chord, CAN, Pastry, Tapestry, Kademlia (Overnet)

Unstructured (or loosely controlled)
− Objects can be anywhere
− Support partial name and keyword queries
− Inefficient search & no guarantee of finding
− Some heuristics exist to enhance performance
− Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
Structured P2P Substrates

Objects are rigidly assigned to peers
− Objects and peers have IDs (usually by
hashing some attributes)
− Objects are assigned to peers based on IDs

Peers in the overlay form a specific geometric
shape, e.g.,
- tree, ring, hypercube, butterfly network

Shape (to some extent) determines
− How neighbors are chosen, and
− How messages are routed
Structured P2P Substrates (cont’d)

Substrate provides a Distributed Hash
Table (DHT)-like interface
− insertObject(key, value), findObject(key), … (toy sketch below)
− In the literature, many authors refer to
structured P2P substrates as DHTs

It also provides peer management (join,
leave, fail) operations

Most of these operations are done in O(log n)
steps, where n is the number of peers
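As a toy illustration of the DHT-style interface named above, here is a hedged Python sketch; the single in-memory dict stands in for the table that a real substrate spreads across peers, and the method names mirror the slide's insertObject/findObject:

import hashlib

class TinyDHT:
    """Toy DHT: one dict stands in for the table spread across all peers."""
    def __init__(self):
        self.table = {}

    def _key_id(self, key):
        # Hash the key into the ID space, as structured substrates do.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def insert_object(self, key, value):
        # A real substrate routes to the peer owning this ID in O(log n) hops.
        self.table[self._key_id(key)] = value

    def find_object(self, key):
        # A real substrate routes the lookup the same way.
        return self.table.get(self._key_id(key))

dht = TinyDHT()
dht.insert_object("song.mp3", "held by peer 10.0.0.5")
print(dht.find_object("song.mp3"))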
Structured P2P Substrates (cont’d)

DHTs: Efficient search & guarantee of
finding

However,
− Lack of partial name and keyword queries
− Maintenance overhead, even O(log n) may be too
much in very dynamic environments

Ex: Chord, CAN, Pastry, Tapestry, Kademlia
(Overnet)
Example: Content Addressable Network (CAN)
[Ratnasamy 01]

− Nodes form an overlay in d-dimensional space (a toy
sketch of key placement follows this slide)
− Node IDs are chosen randomly from the d-space
− Object IDs (keys) are chosen from the same d-space
− Space is dynamically partitioned into zones
− Each node owns a zone
− Zones are split and merged as nodes join and leave
− Each node stores
− The portion of the hash table that belongs to its zone
− Information about its immediate neighbors in the d-space
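The following Python sketch (the zone layout and key names are invented for illustration) shows the two ingredients above: hashing a key to a point in the d-space, and finding the node whose zone contains that point:

import hashlib

D, SIDE = 2, 8                 # 2-d space, coordinates in [0, 8) per axis

def key_to_point(key):
    """Hash an object key to a point in the d-space (one byte per axis)."""
    digest = hashlib.sha1(key.encode()).digest()
    return tuple(digest[i] % SIDE for i in range(D))

# Invented zone layout: (owner, (x_lo, x_hi), (y_lo, y_hi)), covering the space.
zones = [
    ("n1", (0, 4), (0, 4)),
    ("n2", (0, 4), (4, 8)),
    ("n3", (4, 8), (0, 8)),
]

def owner_of(point):
    """The node that owns the zone containing the point stores the key."""
    for owner, (xl, xh), (yl, yh) in zones:
        if xl <= point[0] < xh and yl <= point[1] < yh:
            return owner

p = key_to_point("K4")
print("K4 maps to", p, "and is stored at", owner_of(p))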
2-d CAN: Dynamic Space Division
[Figure: a 2-d coordinate space (0-7 on each axis) dynamically
partitioned into zones owned by nodes n1-n5]
2-d CAN: Key Assignment
[Figure: the same space with keys K1-K4 hashed to points; each
key is stored at the node that owns the enclosing zone]
2-d CAN: Routing (Lookup)
[Figure: a lookup "K4?" forwarded greedily from zone to zone
until it reaches the node that owns K4]
CAN: Routing

− Nodes keep 2d = O(d) state information
(neighbor coordinates, IPs)
− Constant; it does not depend on the number of
nodes n
− Greedy routing (see the sketch below)
- Route to the neighbor that is closest to the
destination
- On average, a lookup takes O(n^(1/d)) hops, which
is O(log n) when d = (log n)/2
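Here is a minimal sketch of the greedy rule on an invented 3-node overlay; for simplicity it measures distance to each neighbor's zone center, whereas the real CAN forwards toward the neighbor zone closest to the destination point:

import math

# Invented 3-node overlay: zone centers and neighbor lists.
centers = {"n1": (2, 2), "n2": (2, 6), "n3": (6, 4)}
neighbors = {"n1": ["n2", "n3"], "n2": ["n1", "n3"], "n3": ["n1", "n2"]}

def route(start, target):
    """Greedy routing: forward to the neighbor closest to the target point."""
    node, path = start, [start]
    while True:
        closest = min([node] + neighbors[node],
                      key=lambda m: math.dist(centers[m], target))
        if closest == node:        # no neighbor is closer: this node owns it
            return path
        node = closest
        path.append(node)

print(route("n1", (6, 5)))         # ['n1', 'n3']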
CAN: Node Join

− New node finds a node already in the CAN
− (bootstrap: one (or a few) dedicated nodes outside the
CAN maintain a partial list of active nodes)
− It finds a node whose zone will be split
− Choose a random point P (which will be its ID)
− Forward a JOIN request to P through the existing node
− The node that owns P splits its zone and sends
half of its routing table to the new node
− Neighbors of the split zone are notified
CAN: Node Leave, Fail

− Graceful departure
− The leaving node hands over its zone to one of its
neighbors
− Failure
− Detected by the absence of heartbeat messages sent
periodically during regular operation
− Neighbors initiate takeover timers, proportional to the
volume of their zones (sketched below)
− The neighbor with the smallest timer takes over the
zone of the dead node
− It notifies the other neighbors so they cancel their timers
(some negotiation between neighbors may occur)
− Note: the (key, value) entries stored at the failed node
are lost
− Nodes that insert (key, value) pairs periodically refresh (or
re-insert) them
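A tiny sketch of the takeover rule, assuming timers simply proportional to zone volume (the constant C and the zones are invented):

def zone_volume(zone):
    (xl, xh), (yl, yh) = zone
    return (xh - xl) * (yh - yl)

# Invented zones of the dead node's neighbors.
neighbor_zones = {
    "n2": ((0, 4), (4, 8)),        # volume 16
    "n4": ((4, 6), (4, 8)),        # volume 8
}

# Each neighbor arms a timer proportional to its own zone volume; the one
# with the smallest timer fires first and takes over the vacated zone.
C = 1.0                            # arbitrary time constant
timers = {n: C * zone_volume(z) for n, z in neighbor_zones.items()}
print(min(timers, key=timers.get), "takes over")    # n4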
CAN: Discussion

− Scalable
− O(log n) steps for operations
− State information is O(d) at each node
− Locality
− Nodes are neighbors in the overlay, not necessarily in
the physical network
− Suggestion (for better routing):
− Each node measures the RTT between itself and its neighbors
− Forward the request to the neighbor with the maximum ratio
of progress to RTT
− Maintenance cost
− Logarithmic
− But may still be too much for very dynamic P2P
systems
Unstructured P2P Substrates

− Objects can be anywhere → loosely controlled
overlays
− The loose control
− Makes the overlay tolerate the transient behavior of nodes
− For example, when a peer leaves, nothing needs to be done
because there is no structure to restore
− Enables the system to support flexible search queries
− Queries are sent in plain text and every node runs a mini
database engine
− But, we lose on searching
− Usually done by flooding, which is inefficient
− Some heuristics exist to enhance performance
− No guarantee of locating a requested object (e.g., rarely
requested objects)
− Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
Example: Gnutella

− Peers are called servents
− All peers form an unstructured overlay
− Peer join
− Find an active peer already in Gnutella (e.g., contact
known Gnutella hosts)
− Send a Ping message through the active peer
− Peers willing to accept new neighbors reply with Pong
− Peer leave, fail
− Just drop out of the network!
− To search for a file (sketched below)
− Send a Query message to all neighbors with a TTL (= 7)
− Upon receiving a Query message
− Check the local database and reply with a QueryHit to
the requester
− Decrement the TTL and forward to all neighbors if it is
nonzero
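A minimal Python sketch of the search procedure above, on an invented 4-peer topology; the shared `seen` set stands in for Gnutella's duplicate suppression via message IDs:

# Invented overlay and file placement.
peers = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
files = {"D": {"song.mp3"}}

def query(node, filename, ttl, seen=None):
    """Flood a Query; return the set of peers that would send a QueryHit."""
    seen = set() if seen is None else seen
    if node in seen:               # duplicate suppression (message IDs in
        return set()               # real Gnutella)
    seen.add(node)
    hits = {node} if filename in files.get(node, set()) else set()
    if ttl > 0:                    # decrement TTL, forward to all neighbors
        for nbr in peers[node]:
            hits |= query(nbr, filename, ttl - 1, seen)
    return hits

print(query("A", "song.mp3", ttl=7))    # {'D'}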
Flooding in Gnutella
[Figure: a query flooding the overlay, every node forwarding to
all of its neighbors: the scalability problem]
Heuristics for Searching [Yang and Garcia-Molina 02]

− Iterative deepening (see the sketch after this list)
− Multiple BFS floods with increasing TTLs
− Reduces traffic but increases response time
− Directed BFS
− Send the query to "good" neighbors (the subset of
neighbors that returned many results in the past) →
need to keep history
− Local indices
− Keep a small index over the files stored on neighbors
(within some number of hops)
− A node may answer queries on their behalf
− Saves the cost of sending queries over the network
− How are the indices kept current?
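A sketch of iterative deepening, reusing the query() toy and topology from the Gnutella sketch above; it re-floods with growing TTLs and stops at the first hit:

def iterative_deepening(start, filename, ttls=(1, 3, 5, 7)):
    """Re-issue the flood with growing TTLs; stop at the first hit."""
    for ttl in ttls:
        hits = query(start, filename, ttl)   # query() from the sketch above
        if hits:
            return ttl, hits                 # found with a cheap, small flood
    return None, set()

print(iterative_deepening("A", "song.mp3"))  # (3, {'D'})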
Heuristics for Searching: Super Node

− Used in Kazaa (signaling protocols are encrypted)
− Studied in [Chawathe et al. 03]
− Relatively powerful nodes play a special role
− They maintain indexes over other peers
Unstructured Substrates with Super Nodes
[Figure: a two-tier overlay of interconnected Super Nodes (SNs),
each serving a set of attached Ordinary Nodes (ONs)]
Example: FastTrack Networks (Kazaa)

− Most of the info/plots in the following slides are from
Understanding Kazaa by Liang et al.
− The most popular network (~3 million active users in a
typical day) sharing 5,000 terabytes
− Kazaa traffic exceeds Web traffic
− Two-tier architecture (with Super Nodes and
Ordinary Nodes)
− Each SN maintains an index of the files stored at the ONs
attached to it
− An ON reports to its SN the following metadata on each file:
− File name, file size, ContentHash, file descriptors (artist
name, album name, …)
FastTrack Networks (cont'd)

− Mainly two types of traffic
− Signaling
− Handshaking, connection establishment, uploading
metadata, …
− Encrypted! (some reverse-engineering efforts exist)
− Over TCP connections between SN-SN and SN-ON
− Analyzed in [Liang et al. 04]
− Content traffic
− The files exchanged; not encrypted
− All over HTTP between ON-ON
− Detailed analysis in [Gummadi et al. 03]
Kazaa (cont'd)

− File search (sketched below)
− An ON sends a query to its SN
− The SN replies with a list of IPs of ONs that have
the file
− The SN may also forward the query to other SNs
− Parallel downloads then take place between the
supplying ONs and the receiving ON
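A toy sketch of the two-tier lookup: ONs upload per-file metadata (the fields listed on the earlier slide) to their SN, and a query returns the IPs of ONs holding the file; all names and values below are invented:

sn_index = {}      # file name -> list of (on_ip, content_hash)

def on_report(on_ip, file_name, size, content_hash, descriptors):
    """An ON uploads metadata for one of its files to its SN."""
    sn_index.setdefault(file_name, []).append((on_ip, content_hash))

def sn_query(file_name):
    """The SN answers with IPs of ONs holding the file (forwarding the
    query to other SNs is omitted here)."""
    return [ip for ip, _ in sn_index.get(file_name, [])]

on_report("10.0.0.5", "album.mp3", 4_200_000, "c0ffee", {"artist": "..."})
on_report("10.0.1.9", "album.mp3", 4_200_000, "c0ffee", {"artist": "..."})
print(sn_query("album.mp3"))   # the ON downloads from both in parallel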
FastTrack Networks (cont'd)

− Measurement study of Liang et al.
− Hook three machines into Kazaa and wait until one
of them is promoted to an SN
− Connect the other two (ONs) to that SN
− Study several properties
− Topology structure and dynamics
− Neighbor selection
− Super node lifetime
− …
Kazaa: Topology Structure [Liang et al. 04]

ON-to-SN: each SN maintains 100-160 ON connections → since
there are ~3M nodes, there are ~30,000 SNs
SN-to-SN: each SN maintains 30-50 connections to other SNs →
so each SN connects to ~0.1% of the total number of SNs
Kazaa: Topology Dynamics [Liang et al. 04]

− Average ON-SN connection duration
− ~1 hour, after removing the very short-lived
connections (~30 sec) used for "shopping" for
SNs
− Average SN-SN connection duration
− ~23 min, which is short because of
− Connection shuffling between SNs, to allow ONs to
reach a larger set of objects
− SNs searching for other SNs with smaller loads
− SNs connecting to each other from time to time to
exchange SN lists (each SN stores 200 other SNs in
its cache)
Kazaa: Neighbor Selection [Liang et al. 04]

− When an ON first joins, it gets a list of 200 SNs
− The ON considers locality and SN workload in selecting its
future SN
− Locality
− 40% of ON-SN connections have RTT < 5 msec
− 60% of ON-SN connections have RTT < 50 msec
− (for comparison, RTT from the eastern US to Europe is
~100 msec)
Kazaa: Lifetime and Signaling Overhead [Liang et al. 04]

− Super node average lifetime is ~2.5 hours
− Overhead:
− 161 kb/s upstream
− 191 kb/s downstream
− → most SNs are on high-speed connections (campus
networks or cable)
Kazaa vs. Firewalls, NAT [Liang et al. 04]

− The default port WAS 1214
− Easy for firewalls to filter out Kazaa traffic
− Now, Kazaa uses dynamic ports
− Each peer chooses its own random port
− An ON reports its port to its SN
− Ports of SNs are part of the SN refresh lists exchanged
among peers
− Too bad for firewalls!
− Network Address Translator (NAT)
− A requesting peer cannot establish a direct connection to a
serving peer behind a NAT
− Solution: connection reversal (sketched below)
− Send to the SN of the NATed peer, which already has a
connection with it
− The SN tells the NATed peer to establish a connection with
the requesting peer!
− The transfer then proceeds happily through the NAT
− What if both peers are behind NATs?
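A sketch of connection reversal under the description above (class and message names are invented): the SN uses its standing connection to the NATed ON to ask it to dial out to the requester:

class SuperNode:
    def __init__(self):
        self.attached = {}                     # on_id -> ON (standing conns)

    def relay_reversal(self, on_id, requester_addr):
        # The SN already holds a connection to the NATed ON, so it can
        # ask the ON to open an outbound connection to the requester.
        self.attached[on_id].connect_out(requester_addr)

class NatedON:
    def __init__(self, on_id, sn):
        self.on_id = on_id
        sn.attached[on_id] = self              # ON keeps a link to its SN

    def connect_out(self, addr):
        # Outbound connections pass through the NAT, so the transfer works.
        print(f"{self.on_id} dials out to {addr}; file transfer proceeds")

sn = SuperNode()
on = NatedON("on42", sn)
sn.relay_reversal("on42", "142.58.0.7:3851")   # requester's IP:port (invented)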
Kazaa: Lessons [Liang et al. 04]

− Distributed design
− Exploit heterogeneity
− Load balancing
− Locality in neighbor selection
− Connection shuffling
− If a peer searches for a file and does not find it, it may try
again later and get it!
− Efficient gossiping algorithms
− To learn about other SNs and perform shuffling
− Kazaa uses a "freshness" field in the SN refresh list → a
peer ignores stale data
− Consider peers behind NATs and firewalls
− They are everywhere!
Summary

P2P is an active research area with many
potential applications in industry and academia

In P2P computing paradigm:
- Peers cooperate to achieve desired functions

New characteristics
- heterogeneity, unreliability, rationality, scale, ad hoc structure
- → new, lighter-weight algorithms are needed

Simple model for P2P systems:
- Peers form an abstract layer called overlay
- A peer software client may have three components
• P2P substrate, middleware, and P2P application
• Borders between components may be blurred
Summary (cont’d)

P2P substrate: A key component, which
- Manages the Overlay
- Allocates and discovers objects

P2P Substrates can be
- Structured (DHT)
• Example: CAN
- Unstructured
• Example 1: Gnutella
• Example 2: Kazaa