Small World Social Networks
With slides from Jon Kleinberg,
David Liben-Nowell, and Daniel Bilar
Social Network Analysis
[Social network analysis] is grounded in the observation that
social actors are interdependent
and that the links among them
have important consequences for every individual
[and for all of the individuals together]. ...
[Relationships] provide individuals with opportunities and,
at the same time, potential constraints on their behavior. ...
Social network analysis involves theorizing, model building
and empirical research focused on uncovering the
patterning of links among actors. It is concerned also with
uncovering the antecedents and consequences of recurrent
patterns. [Linton C. Freeman, UC-Irvine]
Representing a Social Network
a set V of n nodes or vertices,
usually denoted {v1, …, vn}
[Figure: eight nodes numbered 1 to 8; node v2 is labeled]
a set E of m edges between nodes,
usually denoted {ei,j}
[Figure: edge e8,3, joining node 8 and node 3, is labeled]
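A minimal sketch of how such a network could be stored, assuming an undirected graph kept as an adjacency list. The edge list is made up for illustration (the figure's full edge set is not given in the slides); only the edge (8, 3) is taken from the slide.

```python
from collections import defaultdict

# Hypothetical edge list; only (8, 3) is named on the slide.
EDGES = [(1, 2), (2, 8), (8, 3), (3, 7), (2, 3),
         (3, 4), (8, 6), (6, 5), (5, 4), (1, 5)]

def build_adjacency(edges, directed=False):
    """Map each node to the set of nodes it has an edge to."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v]  # make sure v appears as a key even with no outgoing edge
        if not directed:
            adj[v].add(u)
    return dict(adj)

adjacency = build_adjacency(EDGES)
n = len(adjacency)  # number of nodes
m = len(EDGES)      # number of edges
```

Passing directed=True keeps each edge one-way, which is one way to model the "who names whom as a friend" case.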
Examples of Social Networks
Nodes are high-school
students.
Boys are red,
Girls are blue…
What is the meaning of a
bidirectional edge?
Paths
[Figure: eight-node graph with the path (v1,v2,v8,v3,v7) highlighted]
Definition:
A path is a sequence of nodes (v1, …, vk) such that
for any adjacent pair vi and vi+1,
there’s a directed edge ei,i+1 between them.
Path length
[Figure: eight-node graph with the path (v1,v2,v8,v3,v7) highlighted]
Path (v1,v2,v8,v3,v7) has length 4.
Definition: The length of a path is
the number of edges it contains.
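The two definitions (path, path length) can be sketched in a few lines. The edge set here is an assumption: the first four edges are implied by the path on the slide, and (2, 3) is a made-up extra. The function names is_path and path_length are illustrative.

```python
# Directed edges stored as ordered pairs (hypothetical data, see above).
EDGES = {(1, 2), (2, 8), (8, 3), (3, 7), (2, 3)}

def is_path(nodes, edges, directed=True):
    """True if every consecutive pair of nodes is joined by an edge."""
    for u, v in zip(nodes, nodes[1:]):
        if (u, v) in edges:
            continue
        if not directed and (v, u) in edges:
            continue
        return False
    return True

def path_length(nodes):
    """Length of a path = number of edges = number of nodes minus one."""
    return len(nodes) - 1
```

With this edge set, is_path([1, 2, 8, 3, 7], EDGES) holds and path_length([1, 2, 8, 3, 7]) is 4, matching the slide.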
Distance
[Figure: eight-node graph, nodes numbered 1 to 8]
The distance between
v1 and v7 is 3.
Definition:
The distance between nodes vi and vj is the
length of the shortest path connecting them.
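In an unweighted graph, one standard way to compute this distance is breadth-first search. A minimal sketch, assuming an undirected graph stored as a dict of neighbor sets; the graph ADJ is hypothetical, chosen so that the distance from 1 to 7 is 3 as on the slide.

```python
from collections import deque

# Hypothetical undirected graph (not the figure's actual edge set).
ADJ = {
    1: {2, 5}, 2: {1, 3, 8}, 3: {2, 4, 7, 8}, 4: {3, 5},
    5: {1, 4, 6}, 6: {5, 8}, 7: {3}, 8: {2, 3, 6},
}

def distance(adj, source, target):
    """Shortest-path length by BFS; None if target is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:  # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return None
```

BFS visits nodes in order of increasing distance, so the first time the target is reached its recorded distance is the shortest one.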
Famous distances
Kevin Bacon number
nodes = {actors}
edges = pairs of actors who appear in the same film
Kevin Bacon number =
distance between actor and Bacon
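A sketch of how Bacon numbers could be computed from cast lists: build the co-star graph, then run one breadth-first search from Bacon. The casts below are made up for illustration (they are not real filmographies), and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Made-up casts for illustration only.
CASTS = {
    "Film X": ["Kevin Bacon", "Actor A"],
    "Film Y": ["Actor A", "Actor B", "Actor C"],
    "Film Z": ["Actor C", "Actor D"],
    "Film W": ["Actor E"],
}

def costar_graph(casts):
    """Link every pair of actors who share a film."""
    adj = {}
    for cast in casts.values():
        for actor in cast:
            adj.setdefault(actor, set())
        for a, b in combinations(cast, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def bacon_numbers(casts, center="Kevin Bacon"):
    """BFS distances from the center; unreachable actors are omitted."""
    adj = costar_graph(casts)
    numbers = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in numbers:
                numbers[v] = numbers[u] + 1
                queue.append(v)
    return numbers
```

Changing center to another name gives that person's numbers instead; the same code works for Erdős numbers if "films" are papers and "casts" are author lists.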
The Kevin Bacon Game
Invented by Albright College
students in 1994: Craig Fass,
Brian Turtle, Mike Ginelly
Goal: Connect any actor to
Kevin Bacon, by linking actors
who have acted in the same
movie.
Oracle of Bacon website uses
Internet Movie Database
(IMDB.com) to find shortest
link between any two actors:
http://oracleofbacon.org/
Famous distances
Math PhD
genealogies
Famous distances
Paul Erdős number
nodes = {mathematicians}
edges = pairs of mathematicians who co-authored a
paper
Erdős number = distance between
mathematician and Erdős
Erdős Numbers
Erdős wrote 1500+ papers
with 507 co-authors.
Number of links required to
connect scholars to Erdős, via coauthorship of papers
What type of graph do you
expect?
Jerry Grossman (Oakland Univ.)
website allows mathematicians to
compute their Erdős numbers:
http://www.oakland.edu/enp/
Connecting path lengths,
among mathematicians only:
avg = 4.65
max = 13
Diameter
[Figure: eight-node graph, nodes numbered 1 to 8]
The diameter is 3.
Definition: The diameter of a graph is the
maximum shortest-path distance between any two
nodes.
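For a small connected graph, the diameter can be computed by running breadth-first search from every node and taking the largest distance found. A minimal sketch; the graph ADJ is hypothetical, chosen so that its diameter is 3 as on the slide.

```python
from collections import deque

# Hypothetical undirected graph (not the figure's actual edge set).
ADJ = {
    1: {2, 5}, 2: {1, 3, 8}, 3: {2, 4, 7, 8}, 4: {3, 5},
    5: {1, 4, 6}, 6: {5, 8}, 7: {3}, 8: {2, 3, 6},
}

def eccentricity(adj, source):
    """Largest BFS distance from source to any other node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) < len(adj):
        raise ValueError("graph is not connected")
    return max(dist.values())

def diameter(adj):
    """Maximum shortest-path distance over all pairs of nodes."""
    return max(eccentricity(adj, s) for s in adj)
```

This costs one BFS per node, which is fine for toy graphs but too slow for networks with millions of nodes.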
Six degrees of separation
The diameter of a social network
is typically small.
Milgram: Six Degrees of Separation
296 people in Omaha, NE, were
each given a letter and
asked to try to reach a
stockbroker in Sharon, MA,
via personal acquaintances
20% reached target
average number of “hops” in the
completed chains = 6.5
Why are chains so short?
“Random Graphs have small
diameter”
Do they?
Why are Chains so Short?
Maybe exponential growth of
acquaintances…
@d=1: Most people know at least
100 others
@d=2: Through their friends: 10K
@d=3: Through their friends’
friends: 1M
@d=4: Through their friends’
friends’ friends: 100M
@d=k: 100^k = 10^(2k)
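The arithmetic behind this argument, as a small sketch. It assumes every person knows exactly 100 others and that no two acquaintance circles overlap, so depth k reaches 100^k people. The population figure in the usage note is an assumed round number, not from the slides.

```python
def reach(depth, branching=100):
    """People reachable at a given depth if no acquaintances overlap."""
    return branching ** depth

def hops_to_cover(population, branching=100):
    """Smallest depth whose reach is at least the given population."""
    depth = 0
    while reach(depth, branching) < population:
        depth += 1
    return depth
```

Under these idealized assumptions, hops_to_cover(8_000_000_000) is 5, since 100^4 is only 100M while 100^5 is 10 billion.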
Not so fast…
Your friends mostly know each other…
In high school self-reported friendships,
clusters based on race (left-right)
and age (top-bottom)
Homophily: Your friends are similar to you!
Pr [two of your friends are friends] is high
Social networks have vertices with
high clustering coefficient
(how closely a vertex's neighborhood
resembles a clique)
So, exponential growth does not explain it
We want a model with
small diameter and
large clustering coefficient
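The clustering coefficient of a vertex can be sketched as the fraction of pairs of its neighbors that are themselves linked (1.0 means the neighborhood is a clique). This uses the common local definition for undirected graphs; the toy graph is made up for illustration.

```python
from itertools import combinations

# Toy graph: triangle a-b-c plus a pendant node d attached to a.
ADJ = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are linked to each other."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention: undefined cases are reported as 0
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)
```

Here "b" scores 1.0 (both of its neighbors know each other), while "a" scores 1/3 because only one of its three neighbor pairs is linked.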
Watts/Strogatz: Rewire Ring Lattice
Proposed a model (ring lattice)
with small diameter and large clustering coefficient:
Put people on a circle;
connect each to its x closest neighbors;
with prob. p, rewire each connection randomly
Result: Yes, short chains exist for p>0.1!
[Figure panels: p = 0.0, p = 0.1, p = 1.0]
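A minimal sketch of the ring-lattice-plus-rewiring construction, assuming one common variant: each lattice edge (u, v) is, with probability p, replaced by an edge from u to a uniformly chosen node that u is not yet linked to. Here k plays the role of the slide's x (assumed even and smaller than n); the function names are illustrative.

```python
import random

def ring_lattice(n, k):
    """n nodes on a circle, each linked to its k nearest neighbors."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice with each edge rewired with probability p."""
    rng = random.Random(seed)
    adj = ring_lattice(n, k)
    for step in range(1, k // 2 + 1):
        for u in range(n):
            v = (u + step) % n
            if rng.random() >= p:
                continue  # keep this lattice edge
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if not candidates:
                continue  # u is already linked to everyone
            w = rng.choice(candidates)
            adj[u].discard(v)
            adj[v].discard(u)
            adj[u].add(w)
            adj[w].add(u)
    return adj
```

With p = 0 the result is the untouched lattice, and with p = 1 every edge is rewired; the number of edges stays n*k/2 in all cases.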
Ok, short chains exist, but…
Will people be able to find the short chains?
Milgram showed that people were able to find them.
Kleinberg [2000]:
No search strategy in a Watts/Strogatz network,
based only on local information,
can find short chains…
Kleinberg’s Rewire Grid
Now you can find short paths!
The effect of distance
Searching with local
information gets more
efficient as the distance exponent increases up
to 2, then gets worse
again!
In fact, it finds short paths
in logarithmic time!
Theory and practice agree!
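A sketch of greedy search with local information on a grid, under stated assumptions: nodes sit on an n x n grid, each knows its grid neighbors plus one long-range contact drawn with probability proportional to (lattice distance)^(-r), and each holder forwards to the contact closest to the target. The symbol r and all function names are illustrative. Long-range contacts are sampled lazily when a node is visited, which is harmless here because greedy search never revisits a node.

```python
import random

def lattice_distance(a, b):
    """Manhattan distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(u, n, r, rng):
    """Pick one contact v with probability proportional to d(u, v) ** -r."""
    others = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    weights = [lattice_distance(u, v) ** -r for v in others]
    return rng.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, n, r, seed=None):
    """Number of hops greedy forwarding needs to reach target."""
    rng = random.Random(seed)
    current, hops = source, 0
    while current != target:
        x, y = current
        options = [(x + dx, y + dy)
                   for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= x + dx < n and 0 <= y + dy < n]
        options.append(long_range_contact(current, n, r, rng))
        # Forward to whichever known contact is closest to the target.
        current = min(options, key=lambda v: lattice_distance(v, target))
        hops += 1
    return hops
```

Some grid neighbor is always one step closer to the target, so the search terminates in at most lattice_distance(source, target) hops; the long-range contacts are what can make it much shorter.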
Translated into English?
“Distance scales”
Count friends
within log distances:
1 - 10
10 - 100
100 - 1000
1000 - 10000 …
When the distance exponent = 2,
nodes have the same
volume of links
to each distance scale
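A rough worked version of the "distance scales" claim, as a sketch. Assumptions: nodes sit on a two-dimensional grid, d(u,v) is lattice distance, r names the exponent (the symbol is ours), Z is the normalizing constant, and boundary effects are ignored.

```latex
% Probability that u's long-range link goes to a particular node v:
\Pr[u \to v] = \frac{d(u,v)^{-r}}{Z}, \qquad Z = \sum_{w \neq u} d(u,w)^{-r}

% Away from the boundary, exactly 4d grid nodes lie at lattice distance d.
% With r = 2, the probability that the link lands in the scale [10^j, 10^{j+1}) is
\sum_{d=10^{j}}^{10^{j+1}-1} 4d \cdot \frac{d^{-2}}{Z}
  = \frac{4}{Z} \sum_{d=10^{j}}^{10^{j+1}-1} \frac{1}{d}
  \approx \frac{4 \ln 10}{Z}
```

The result does not depend on j: the number of nodes in a scale grows like d^2 while the per-node probability shrinks like d^(-2), so each scale (1 - 10, 10 - 100, ...) gets about the same share of links.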
The Power of Long Distance Relations
Probability of friendship is falling off
like the square of the distance!
Geographic location is
a primary reason for selecting next person in chain
We have finally understood Milgram’s experiment
But does this explain what happens on the internet?
(It depends on how
you define distance:
see Liben-Nowell’s paper)