Lecture 7 CS 728 - University of Cincinnati

Lecture 7
CS 728
Searchable Networks
Errata: Differences between Copying and
Preferential Attachment
• In a generative model, let p_k be the fraction of nodes
with (in)degree k.
• Consider the degree distribution seen when attaching a
new node to the target of a randomly chosen edge.
– The answer is not p_k but proportional to k p_k. Why?
A node of indegree k is the target of k edges.
• But in the copying model we take the target of a
random edge out of a random vertex!
– In this case the probability of connecting to a node with
k parents is (1/n) * sum of (1/outdegree) over those k parents.
– So attachment prefers nodes of high indegree
whose parents have low outdegree.
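The difference between the two attachment rules can be checked on a small example. The sketch below is illustrative only: the toy graph and the function names edge_target_probs and copying_probs are assumptions made for this example, and every node in the toy graph is given at least one out-edge so that the copying rule is well defined.

```python
from fractions import Fraction

def edge_target_probs(out_edges):
    """Probability of each node being the target of a uniformly random edge."""
    edges = [(u, v) for u, targets in out_edges.items() for v in targets]
    probs = {}
    for _, v in edges:
        probs[v] = probs.get(v, 0) + Fraction(1, len(edges))
    return probs

def copying_probs(out_edges):
    """Pick a uniformly random vertex, then a uniformly random out-edge of it."""
    n = len(out_edges)
    probs = {}
    for u, targets in out_edges.items():
        for v in targets:
            # Each parent u contributes (1/n) * (1/outdegree(u)).
            probs[v] = probs.get(v, 0) + Fraction(1, n) * Fraction(1, len(targets))
    return probs

# Hypothetical toy graph (adjacency lists of out-edges).
# x and y both have indegree 2, but x's parents have outdegree 1
# while y's parents have outdegree 2.
graph = {
    "a": ["x"],
    "b": ["x"],
    "c": ["y", "a"],
    "d": ["y", "b"],
    "x": ["a"],
    "y": ["b"],
}

edge_rule = edge_target_probs(graph)
copy_rule = copying_probs(graph)
```

Under the random-edge rule, x and y are equally likely (each is hit with probability 2/8, proportional to indegree). Under the copying rule, x is chosen with probability 1/3 and y with probability 1/6, since x's parents have lower outdegree.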
Searchable Networks
• Questions:
• Social: How does a person in a small world find
their soul mate?
• Comp Sci: How does the notion of long and
short edges in a “random” network impact ability
to find key nodes?
• Just because a short path exists, doesn’t mean
you can easily find it (using only local info).
• You don’t know all of the people whom your
friends know.
• Under what conditions is a network searchable?
Searchable Networks
Kleinberg (2000)
Variation of Watts’s β model
and Waxman’s model:
– Lattice is d-dimensional (d = 2).
– One random link per node.
– Parameter r controls the probability of a random link:
greater for closer nodes.
– Node u is connected to node v with probability
proportional to d(u,v)^(-r), where d(u,v) is the lattice distance.
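As a rough illustration of the link rule, the following sketch computes the long-range contact distribution for one node on an n x n grid and samples from it. The function names, the tuple representation of nodes, and the use of Manhattan distance as the lattice distance are assumptions made for this example.

```python
import random

def lattice_distance(u, v):
    """Manhattan distance between two grid points (assumed lattice metric)."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_distribution(u, n, r):
    """Return all other nodes of the n x n grid and their link probabilities,
    proportional to d(u, v)^(-r)."""
    nodes = [(i, j) for i in range(n) for j in range(n) if (i, j) != u]
    weights = [lattice_distance(u, v) ** (-r) for v in nodes]
    total = sum(weights)
    return nodes, [w / total for w in weights]

def sample_long_range_contact(u, n, r, rng=random):
    """Draw the single long-range contact of node u."""
    nodes, probs = long_range_distribution(u, n, r)
    return rng.choices(nodes, weights=probs, k=1)[0]
```

With r = 0 every other node is equally likely; with r = 2 a node at distance 1 is four times as likely as a node at distance 2.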
• Lower bound
Fundamental consequences of model
• When long-range contacts are formed
independently of the geometry of the grid, short
chains will exist but the nodes, operating at a
local level, will not be able to find them.
• When long-range contacts are formed by a
process that is related to the geometry of the
grid in a specific way, however, then short chains
will still form and nodes operating with local
knowledge will be able to construct them.
• Theorem 1: Effective routing is impossible in uniformly
random graphs.
When r = 0, the expected delivery time of any
decentralized algorithm is at least proportional to n^(2/3),
i.e. Ω(n^(2/3)), and hence exponential in the expected
minimum path length.
• Theorem 2: Greedy routing is effective in certain
random graphs.
When r = 2, there is a decentralized (greedy) algorithm
whose expected delivery time is at most O((log n)^2),
hence quadratic in the expected minimum path length.
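The greedy algorithm of Theorem 2 can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not Kleinberg's own code: the grid is n x n with Manhattan distance, each node gets one directed long-range contact, and the names build_long_range and greedy_route are hypothetical.

```python
import random

def dist(u, v):
    """Manhattan (lattice) distance on the grid."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def build_long_range(n, r, rng):
    """Give every node one long-range contact, chosen with
    probability proportional to dist(u, v)^(-r)."""
    nodes = [(i, j) for i in range(n) for j in range(n)]
    contact = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** (-r) for v in others]
        contact[u] = rng.choices(others, weights=weights, k=1)[0]
    return contact

def greedy_route(s, t, n, contact):
    """Forward the message to whichever known contact is closest to t.
    Only local information is used: the grid neighbours and the
    current node's own long-range contact."""
    steps = 0
    cur = s
    while cur != t:
        i, j = cur
        candidates = [
            (i + di, j + dj)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= i + di < n and 0 <= j + dj < n
        ]
        candidates.append(contact[cur])
        cur = min(candidates, key=lambda v: dist(v, t))
        steps += 1
    return steps

if __name__ == "__main__":
    rng = random.Random(0)
    n = 20
    for r in (0, 2):
        contact = build_long_range(n, r, rng)
        print(r, greedy_route((0, 0), (n - 1, n - 1), n, contact))
```

Because some grid neighbour is always one step closer to t, every move strictly reduces the distance, so the route terminates in at most dist(s, t) steps. The separation between r = 0 and r = 2 in the theorems is asymptotic, so on a grid this small the two step counts may look similar.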
Proof Sketch for Lower Bound
The impossibility result is based on the fact that the uniform
distribution prevents a decentralized algorithm from using
any “clues” provided by the geometry of the grid.
Consider the set U of all nodes within lattice distance n^(2/3) of
destination t.
With high probability, the source s will lie outside of U, and if
the message is never passed from a node to a long-range
contact in U, the number of steps needed to reach t will be
at least proportional to n^(2/3).
But the probability that any message holder has a long-range
contact in U is roughly n^(4/3)/n^2 = n^(-2/3), so the
expected number of steps before a long-range contact in U
is found is at least proportional to n^(2/3) as well.
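The arithmetic in this sketch can be checked numerically. The snippet below assumes Manhattan distance on an n x n grid and ignores boundary clipping, so the ball of radius R around t contains 2R^2 + 2R + 1 nodes; the function names are hypothetical.

```python
def ball_size(radius):
    """Number of lattice points within Manhattan distance `radius`
    of a point, ignoring grid boundaries."""
    return 2 * radius * radius + 2 * radius + 1

def lower_bound_estimate(n):
    """Return (radius of U, probability that a uniform long-range
    contact lands in U, expected number of tries before one does)."""
    radius = round(n ** (2 / 3))
    p_hit = ball_size(radius) / (n * n)
    return radius, p_hit, 1 / p_hit

radius, p_hit, expected_tries = lower_bound_estimate(1000)
```

For n = 1000 the radius is 100, U holds about 2 n^(4/3) nodes, and the expected number of tries is about n^(2/3)/2, which is proportional to n^(2/3) as the sketch claims.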
Proof Sketch for Upper Bound (Theorem 2)
• The greedy algorithm always moves us closer.
Consider phases, each of which moves the message
half the remaining distance to the destination.
(Recall Zeno’s paradox.)
• The probability of connecting to a given node at
distance d is ~ 1/(d^2 lg n), and there are
~ d^2 nodes within distance d/2 of the destination,
so each step ends the phase with probability ~ 1/lg n.
Thus ~ lg n steps will end the phase.
• So with lg n phases we are done in ~ lg^2 n time.
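The 1/lg n factor comes from the normalizing constant of the r = 2 distribution. The sketch below, with hypothetical function names and Manhattan distance assumed, computes that constant directly and compares it with the bound 4 ln(6n), which follows from there being at most 4j nodes at distance j. It also counts the halving phases.

```python
import math

def normalizer(u, n):
    """Sum of d(u, v)^(-2) over all other nodes v of the n x n grid."""
    return sum(
        (abs(u[0] - i) + abs(u[1] - j)) ** -2
        for i in range(n)
        for j in range(n)
        if (i, j) != u
    )

def halving_phases(distance):
    """Number of times the distance is halved (rounding down)
    before it reaches zero."""
    phases = 0
    while distance >= 1:
        distance //= 2
        phases += 1
    return phases

n = 50
z = normalizer((n // 2, n // 2), n)
bound = 4 * math.log(6 * n)
phases = halving_phases(2 * n - 2)
```

Since the constant grows only logarithmically in n, a specific node at distance d is chosen with probability at least 1/(bound * d^2), which is the ~ 1/(d^2 lg n) used in the sketch above.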
Searchable Networks
Kleinberg (2000)
• Watts, Dodds, Newman (2002) show that for d = 2 or 3
dimensions, real networks are quite searchable.
• Killworth and Bernard (1978) found that people tended
to search their networks by d = 2 dimensions:
geography and profession.
Figure: The Watts-Dodds-Newman model closely fits
a real-world experiment.