Peer-to-Peer Networking

Peer-to-Peer Networking Systems: Napster, Gnutella, Freenet, et al.
By Trystan Upstill
Napster: How does it work?
• Centralized repository
– Stores details of online users and songs
– Server broker provides song / user information to Napster clients
– Napster clients use peer-to-peer file transfer
21 July 2015
Peer-to-Peer: Systems
Napster: Initiating a connection
• Client connects to Napster with login and password
– Transmits current listing of shared files
– Napster registers username, maps username to IP address and records song list (+statistics)
[Diagram: client sends username + shared song list to the Napster main server]
Napster: Requesting a song
• Client sends song request to Napster server
– Napster checks song database
– Returns matched songs with usernames and IP addresses (plus extra stats)
[Diagram: client sends desired song name + artist to the Napster main server; server returns list of matching songs with usernames and addresses]
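The login and search steps can be sketched as a small in-memory index. This is only an illustration of the idea on the slides, not Napster's actual server code; the class name `NapsterIndex`, its methods, and the sample users and addresses are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NapsterIndex:
    """Toy central index (hypothetical): username -> IP address and shared song list."""
    addresses: dict = field(default_factory=dict)  # username -> IP address
    songs: dict = field(default_factory=dict)      # username -> list of song titles

    def register(self, username, ip, shared_songs):
        # Login step: map the username to its current IP and record its song list.
        self.addresses[username] = ip
        self.songs[username] = list(shared_songs)

    def logout(self, username):
        # Users come and go, so their entries are dropped when they disconnect.
        self.addresses.pop(username, None)
        self.songs.pop(username, None)

    def search(self, text):
        # Return (song, username, ip) for every shared title containing the text.
        needle = text.lower()
        return [
            (song, user, self.addresses[user])
            for user, titles in self.songs.items()
            for song in titles
            if needle in song.lower()
        ]


index = NapsterIndex()
index.register("alice", "10.0.0.5", ["Blue Monday", "True Faith"])
index.register("bob", "10.0.0.9", ["Blue Moon"])
matches = index.search("blue")
```

The client would then contact one of the returned IP addresses directly; the index itself never carries the song.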
Napster: Downloading a song
• User selects a song in their client, download request sent straight to the user sharing it.
– Machine contacted if available
– More complex if destination behind firewall
[Diagram: Napster main server; request for song and song pass directly between the two clients]
Napster: Why does it work?
• Anti-censorship
– Ignores copyright…at least it tries to
– These copyright laws prevent servers from serving copyrighted MP3s. (RIAA gets mad)
• Popular songs propagate
– Downloads are shared by default
– Not everyone will share, but not everyone will be a piker
• Not hard to search for songs
– … generally:
• Try searching on Napster for solo appearances by a violinist, or music featuring xylophones
Napster: Clever Design
• Centralized user and song database
– Quick searching
• Try Gnutella
– Users come and go
• User/Search database continually updated
– But…
• This centralization is ultimately its downfall.
• Automatic file sharing
– Easy to use file server
– Uses the “commons”.
Napster: Future
• Napster now has to restrict access to copyrighted MP3s
– Will Napster get over its current legal problems?
• Will users still be able to use Napster to download copyrighted MP3s?
– People have started to bypass Napster naming to allow copyrighted files to be downloaded
• Napster is threatening to sue the companies creating these applications
Gnutella
• Completely Decentralized
• Originally Written by the Winamp Guys
– In 14 hours
– Intended to be released as a GPL product at v1.0
• The first version released was 0.56
• This was the last version of Gnutella
– AOL squashed it several hours after release
– Designed to “share recipes”
• No control on content
• So: Gnutella was re-engineered
– By the open source community
– Now: Public open protocol
• Anything that can produce/interpret the protocol is Gnutella-compatible.
Why is Gnutella Interesting?
• Small, simple and robust
– Written quickly – only meant for hundreds or thousands of users (not tens of thousands)
• No real way to halt or monitor Gnutella
– Avoids the Napster problem
• The network is hidden from view
– Type in a keyword and get files
• Uses virtual infrastructure based on a physical infrastructure
– Software Internet built on the Hardware Internet
– Expands as users connect
• Ceases to exist if no users exist
Gnutella: Communication
• Message broadcast
– Message-based routing and Broadcasting
• Unicast TCP broadcast
– Contact as many hosts as possible to get the best replies
• No persistent connections
– Move away from IP addresses to Message UUIDs
– To avoid repetition
• Messages are assigned UUIDs
– To not swamp the network
• Set a horizon with message decay with TTL of 7
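A minimal sketch of the broadcast idea, assuming the network is given as a plain adjacency dictionary: each message gets a UUID, nodes drop copies they have already handled, and the TTL sets the horizon. The function name `flood` and the sample graph are hypothetical; this simulates the behaviour in one process rather than implementing the wire protocol.

```python
import uuid
from collections import deque


def flood(graph, origin, ttl=7):
    """Return the set of nodes that receive one broadcast message.

    graph maps each node to the list of neighbours it holds connections to.
    """
    message_id = uuid.uuid4()        # every message carries a unique ID
    seen = {origin: {message_id}}    # per-node record of message IDs already handled
    reached = set()
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue                 # horizon reached: the message decays here
        for neighbour in graph[node]:
            handled = seen.setdefault(neighbour, set())
            if message_id in handled:
                continue             # duplicate: drop it instead of re-broadcasting
            handled.add(message_id)
            reached.add(neighbour)
            queue.append((neighbour, remaining - 1))
    return reached


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
nearby = flood(chain, "a", ttl=2)    # only nodes within two hops of "a"
```

Without the UUID check, any loop in the graph would keep re-sending the same message until its TTL ran out.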
Gnutella: Communication (cont)
• Dynamic routing
– Uses message UUIDs to route rather than IP address
– Routed using UUID history
• Lossy transmission over reliable TCP
– If a node gets overloaded it just drops packets
• Serving up the files
– Clients generally have mini HTTP servers in them that serve the files; Gnutella does not define a file sharing agent.
• File discovery NOT transfer
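Routing by UUID history can be sketched as follows, assuming each node only remembers which neighbour a given message ID first arrived from. A reply then retraces those hops without any node needing the querying node's IP address. The names `broadcast_query`, `route_reply` and `history` are illustrative, not taken from any Gnutella client.

```python
import uuid
from collections import deque


def broadcast_query(graph, origin, message_id, history, ttl=7):
    """Flood a query; history[node][message_id] records the previous hop."""
    history.setdefault(origin, {})[message_id] = None   # the origin has no previous hop
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue
        for neighbour in graph[node]:
            table = history.setdefault(neighbour, {})
            if message_id in table:
                continue                     # already seen: keep the first route only
            table[message_id] = node
            queue.append((neighbour, remaining - 1))


def route_reply(history, responder, message_id):
    """Walk the per-node history backwards from the responder to the origin."""
    path = [responder]
    while history[path[-1]][message_id] is not None:
        path.append(history[path[-1]][message_id])
    return path


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
history = {}
query_id = uuid.uuid4()
broadcast_query(graph, "a", query_id, history)
reply_path = route_reply(history, "d", query_id)
```

Here the reply from "d" travels d, c, b, a: each node consults only its own table entry for that message ID.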
Gnutella: Communication Types
[Figures: Gnutella Routed Replies; Gnutella Broadcast]
Gnutella: Initiating a connection
• Plug in to a host and send a broadcast ping
– Can be any host; hosts transmitted through word-of-mouth or host caches
– Ping then broadcast through network with TTL of 7
Gnutella: Initiating a connection 2
• Hosts that are not overwhelmed respond with a routed pong
– Your Gnutella client caches the IP addresses of the replying nodes
– In the example the grey nodes do not respond within a certain amount of time.
• They are overloaded
[Diagram: example network with pongs routed back to the pinging node; grey nodes do not respond]
Gnutella: Searching / Querying
• You broadcast query to all the nodes you know, they broadcast to all the nodes they know etc.
– Up to 7 layers deep (TTL 7)
– Estimated at around 10000 nodes
• More with Reflector
[Diagram: example network subsection; query broadcast (at TTL 2)]
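How far a TTL-limited broadcast can reach depends on how many connections each node holds. The arithmetic below is a rough upper bound under two stated assumptions that are not from the slides: every node has the same number of connections, and no two paths lead to the same node. Real networks overlap heavily, so the slide's estimate of around 10000 nodes should not be expected to fall out of this formula.

```python
def horizon(connections_per_node, ttl):
    """Upper bound on nodes reached when no two paths lead to the same node."""
    reached = 0
    frontier = connections_per_node             # hop 1: every direct neighbour
    for _ in range(ttl):
        reached += frontier
        frontier *= connections_per_node - 1    # each node forwards to all but the sender
    return reached


bound = horizon(4, 7)   # 4 + 12 + 36 + 108 + 324 + 972 + 2916 = 4372
```

The growth is geometric, which is why a small change in TTL or in connections per node moves the horizon so much.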
Gnutella: Search/Query Responses
• Querying node is sent routed responses from Gnutella clients containing interesting files.
– The user receives a list containing all the files that matched their query and the corresponding IP addresses.
[Diagram: example network subsection; query and response labels on each hop]
Gnutella: File Transfer
• When a user selects a file from the retrieved search results the Gnutella node establishes a direct connection to the node with the file
– Peers run some kind of file exchange client
• Not Gnutella Controlled
– Lack of anonymity at this stage
[Diagram: example network subsection; download request and requested file pass directly between the two nodes]
Gnutella: Interesting Aspects
• Assumption: enough nodes exist so that one slow node can drop replies
– Sub-assumption: can drop some correct answers as they are probably replicated
– This builds a natural network topology
• No standard searching protocol
– Every node can interpret requests differently
• Like the real world?
• No guarantee that you will be able to reach every other node on the network
– Horizon typically around 10000 nodes
– Not always the same 10000 nodes
Gnutella: InfraSearch
• Web based Gnutella search engine
– Set up a Gnutella network of a few nodes
• Ex:
– Online photo labs image database
– A calculator
– A Yahoo proxy
– Search request sent out
– Responses collated and converted to HTML
• Demonstrated access of differing types of data across heterogeneous systems
• Recently acquired by Sun for JXTA project
Gnutella: Network Organization
• Dynamic backbone creation through nodes with high bandwidth
– They don’t tend to drop many packets
– Therefore: They stay connected to many nodes
– Newer Gnutella clients drop connections with slow nodes
• Otherwise they act as black holes
Gnutella: Performance
• Node optimization
– Every time you optimize traffic you increase node bandwidth
– Example: Push Request, Ping
– Introduce new Gnutella nodes to improve bandwidth
• Clip2’s Reflector
• The great Gnutella crush
– Result of Napster injunction
– Host caches created a problem
• This problem was hard to fix due to variety of clients
Gnutella: Problems
• Free-riding issues
– 24 hour survey showed:
• 70% of people shared no files
– 50% of search responses from top 1% of hosts
– Reverting to client/server
• Suddenly not so hard to shut down!
– People argue that this is not the case
• Cornucopia of the commons
• Non-standard implementation
– People implement their own Gnutella clients
• Some are dodgier than others
• Potential way to manipulate system operation
• “Good” clients now monitor traffic
Freenet: What is it?
• Cooperative file distribution, Goals:
– Improve document distribution efficiency
• Bandwidth and disk space sharing
– Remove document censorship
• Provide plausible deniability for node operators
– Provide anonymity
– Remove single points of failure
• For free speech
• Large geographically distributed HDD with anonymous access.
Freenet: How does it work?
• Users designate size of HDD used in distribution of Freenet files.
• All users share bandwidth to improve file transport.
– Network optimized for computerized access to files
– Each file has a unique id, obscured in its interpretation
– Lots of optimizations throughout the network
• Replies are cached along the way (when transferred to improve next access)
Freenet: How it works
• Much like Gnutella
– Chain rather than cluster
• Requests routed using last-known location or “best guess”
– Using file keys: each file has a unique key and these keys have localities
– Files are not directly transferred between peers; rather, requests and file fragments are sent through all intermediate nodes
• Improves anonymity
• Propagates files
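The chain-style routing can be sketched as a recursive request that always tries the neighbour whose known key is closest to the wanted key, then caches the reply at every node on the way back. Everything concrete here is an assumption made for illustration: integer keys, absolute difference as the measure of key locality, the `Node` class, and the `hops_to_live` limit.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}    # key -> data held locally
        self.routes = {}   # known key -> neighbouring Node believed to be near that key

    def request(self, key, hops_to_live=10, visited=None):
        """Return the data for key, or None if the chain runs out."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if key in self.store:
            return self.store[key]
        if hops_to_live == 0:
            return None
        # Best guess first: try neighbours in order of how close their known key is.
        for known_key in sorted(self.routes, key=lambda k: abs(k - key)):
            neighbour = self.routes[known_key]
            if neighbour.name in visited:
                continue                      # never loop back along the chain
            data = neighbour.request(key, hops_to_live - 1, visited)
            if data is not None:
                self.store[key] = data        # cache the reply on the way back
                self.routes[key] = neighbour  # remember the last-known location
                return data
        return None


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
c.store[42] = b"document"
a.routes = {10: d, 40: b}
b.routes = {45: c}
found = a.request(42)
```

After the request, both "a" and "b" hold a copy of the file, which is how popular files spread; "d" was never asked and holds nothing.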
Freenet: File Caching
• Files are mapped in a “stack”
• The stack contains the filename, size, and the last seen node containing that file
– Last seen node could be itself
– Stack has most recently used files at the top of the stack; files are removed off the bottom of the stack as required (this is how files leave Freenet).
• Files are “weighed down” by their size.
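The stack behaves much like a least-recently-used cache with a byte budget. The sketch below assumes that reading; the class name `FileStack` and its methods are hypothetical, and the way files are “weighed down” by size is not modelled beyond counting bytes against the capacity.

```python
from collections import OrderedDict


class FileStack:
    """Most recently used entries at the top, evictions from the bottom."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        # filename -> (size, last_seen_node); the last item is the top of the stack
        self.entries = OrderedDict()

    def used_bytes(self):
        return sum(size for size, _ in self.entries.values())

    def touch(self, filename, size, last_seen_node):
        # Insert or refresh an entry and move it to the top of the stack.
        self.entries[filename] = (size, last_seen_node)
        self.entries.move_to_end(filename)
        # Remove entries off the bottom until the stack fits the donated space.
        while self.used_bytes() > self.capacity_bytes:
            self.entries.popitem(last=False)


stack = FileStack(capacity_bytes=100)
stack.touch("a", 40, "self")
stack.touch("b", 40, "node7")
stack.touch("a", 40, "self")     # "a" is requested again and moves to the top
stack.touch("c", 40, "node3")    # over budget: "b" falls off the bottom
remaining = list(stack.entries)
```

Files that keep being requested stay near the top, while files nobody asks for drift down and eventually leave.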
Freenet: File Identification
• Files are identified by keys
– Keys prevent bogus files “infecting” the network like cancer.
– Each link in the chain checks files to see whether their key is correct
– Freenet:keytype@data
• Example key types: content, keyword, signed
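For the content key type, the per-link check can be sketched as recomputing a digest of the file and comparing it with the key. The choice of SHA-256 and the exact string built by `content_key` are assumptions for illustration, following the slide's keytype@data shape; the functions `content_key` and `relay` are hypothetical names.

```python
import hashlib


def content_key(data: bytes) -> str:
    # Assumption: a content key is the keytype "content" plus a SHA-256 digest of the bytes.
    return "Freenet:content@" + hashlib.sha256(data).hexdigest()


def relay(key: str, data: bytes) -> bytes:
    """What one link in the chain might do before passing a file on."""
    if content_key(data) != key:
        raise ValueError("file does not match its key; refusing to forward")
    return data


key = content_key(b"original file")
forwarded = relay(key, b"original file")   # passes the check and is handed on
```

A node handed different bytes under the same key would raise instead of forwarding, so a bogus file stops at the first honest link.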
Freenet: Example Search
[Diagram: example search across the network; request hops labelled R1 to R9, replies labelled FOUND1 to FOUND3]
Freenet: Interesting Aspects
• Prevention of specialized nodes
– Popular files propagate across all nodes in network
• Cannot censor one central location
• More popular information becomes easier to access
– Opposite of the Web
• Unpopular vs. Unwanted
– In Freenet it is important to make the distinction
• Popular files are never removed
• Files that are never downloaded are removed
Instant Messaging: ICQ and AIM
• Centrally administered repository of current users and their IP addresses
– Peer-to-peer information exchange
– Buffer messages if users offline
– Handle tricky packet routing through firewalls
• Similar to Napster
– Users log on, IP address is mapped to username
– Removes machine dependent messaging
• “[email protected]” becomes “Trystan”
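The central repository with offline buffering can be sketched in a few lines. This is not how ICQ or AIM are implemented; `PresenceServer`, its methods and the sample address are made up to show the username-to-IP mapping and the message buffer described above.

```python
class PresenceServer:
    """Toy central directory: username -> current IP, plus an offline message buffer."""

    def __init__(self):
        self.online = {}     # username -> IP address
        self.buffered = {}   # username -> messages waiting for the next logon

    def log_on(self, username, ip):
        self.online[username] = ip
        return self.buffered.pop(username, [])   # deliver anything held while offline

    def log_off(self, username):
        self.online.pop(username, None)

    def send(self, recipient, message):
        ip = self.online.get(recipient)
        if ip is None:
            # Recipient is offline: hold the message on the server.
            self.buffered.setdefault(recipient, []).append(message)
            return None
        return ip   # sender can now contact this address peer-to-peer


server = PresenceServer()
held = server.send("Trystan", "hello")             # Trystan is offline, so this is buffered
waiting = server.log_on("Trystan", "10.1.1.1")     # buffered messages are handed over
address = server.send("Trystan", "are you there?")
```

Senders only ever use the name "Trystan"; the server supplies whichever machine that name is currently logged on from.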
Beyond First Generation P2P systems
• Having to build a new infrastructure on top of the Internet
– Sun working on JXTA
• Open source P2P infrastructure
– Intel wants to define standards
• Arguments that it is too early for standards
• The next generation might just use a Virtual Internet API