Is there a Science of Networks?

What kinds of networks are there?

From Bacon numbers to random graphs to the Internet
• From FOAF to Selfish Routing: apparent similarities between many human and technological systems and organization
• Modeling, simulation, and hypotheses
• Compelling concepts
  • Metaphor of viral spread
  • Properties of connectivity have qualitative and quantitative effects

Computer Science?
From the facebook to tomogravity
• How do we model networks, measure them, and reason about them?
• What mathematics is necessary?
• Will the real world intrude?
CompSci 100e
10.1
From subsets to graphs with bits

We’ll consider SequenceSync APT
• What is a “vertex” in the graph? Where are arcs?

[Figure: state diagram for the example, with numbered states and arcs]

For state-0, we have {1,5,4,2} for transitions
We’ll consider a graph in which vertices are sets of states
• Start with every possible state in our initial vertex
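One way to read "subsets ... with bits" is to store a set of states in a single int, with bit k set when state k is in the set. The sketch below assumes fewer than 31 states; the class and method names are hypothetical and are not taken from the APT itself.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names are hypothetical, and fewer than 31 states are assumed.
class StateSetSketch {
    // Encode a set of state numbers as one int: bit k is 1 iff state k is in the set.
    static int encode(int... states) {
        int mask = 0;
        for (int s : states) {
            mask |= 1 << s;
        }
        return mask;
    }

    // Test whether a state belongs to the encoded set.
    static boolean contains(int mask, int state) {
        return (mask & (1 << state)) != 0;
    }

    // The initial vertex of the subset graph: every one of the n states is possible.
    static int allStates(int n) {
        return (1 << n) - 1;
    }

    // Decode back to a sorted list of state numbers.
    static List<Integer> decode(int mask) {
        List<Integer> states = new ArrayList<>();
        for (int s = 0; s < 31; s++) {
            if (contains(mask, s)) {
                states.add(s);
            }
        }
        return states;
    }
}
```

With this encoding, each vertex of the subset graph is just an int, so a visited-set or a distance table can be indexed directly by the mask.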
Review: Vocabulary

Graphs are collections of vertices and edges (a vertex is also called a node)
• An edge connects two vertices
  • Direction can be important: directed edge, directed graph
  • An edge may have an associated weight/cost

A vertex sequence v0, v1, …, vn-1 is a path if each vk and vk+1 are connected by an edge
• If some vertex is repeated, the path contains a cycle
• A graph is connected if there is a path between any pair of vertices

What vertices are reachable from a given vertex?
• Traverse the graph…
[Figure: example weighted graphs over cities (NYC, Phil, Boston, Wash DC; edge weights 78, 190, 204, 268, 394) and airports (LGA, LAX, DCA, ORD; fares $186, $186, $412, $441, $1701)]
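The vocabulary above (vertex, directed or undirected edge, weight, path) maps naturally onto an adjacency map. The following is a minimal sketch under that assumption; the class name, method names, and the vertices "A", "B", "C" in the usage note are hypothetical, not taken from the figure.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; class and method names are hypothetical.
class WeightedGraphSketch {
    // Adjacency map: vertex -> (neighbor -> weight of the edge to that neighbor).
    private final Map<String, Map<String, Integer>> adj = new HashMap<>();

    // Directed edge from -> to with the given weight.
    void addDirectedEdge(String from, String to, int weight) {
        adj.computeIfAbsent(from, k -> new HashMap<>()).put(to, weight);
        adj.computeIfAbsent(to, k -> new HashMap<>());
    }

    // Undirected edge, stored as two directed edges.
    void addEdge(String a, String b, int weight) {
        addDirectedEdge(a, b, weight);
        addDirectedEdge(b, a, weight);
    }

    // Weight of the edge from -> to, or null if there is no such edge.
    Integer weight(String from, String to) {
        return adj.getOrDefault(from, Collections.emptyMap()).get(to);
    }

    // True if every consecutive pair in the sequence is joined by an edge.
    boolean isPath(List<String> vertices) {
        for (int k = 0; k + 1 < vertices.size(); k++) {
            if (weight(vertices.get(k), vertices.get(k + 1)) == null) {
                return false;
            }
        }
        return true;
    }
}
```

For example, after `addEdge("A", "B", 3)` and `addDirectedEdge("B", "C", 5)`, the sequence A, B, C is a path but C, B, A is not, because the B to C edge is directed.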
Word ladder
Vocabulary/Traversals

Connected?
• Connected components?
  • Weakly connected (directionless)
• Degree: number of edges incident to a vertex
  • indegree (enter), outdegree (exit)

Starting at 7, where can we get?
• Depth-first search: envision each vertex as a room, with doors leading out
  • Go into a room, mark the room, choose an unused door, exit
    – Don’t go into a room you’ve already been in (see mark)
  • Backtrack if all doors used (to room with unused door)
• Rooms are stacked up; backtracking is really recursion
• One alternative uses a queue: breadth-first search
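The room-and-door description can be sketched as a recursive method, where the call stack plays the role of the stacked-up rooms and returning from a call is the backtracking step. This is a minimal sketch; the adjacency-map representation and all names are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch; names and graph representation are assumptions.
class DfsSketch {
    // Returns the vertices reachable from start, in the order they were first marked.
    static Set<Integer> reachable(Map<Integer, List<Integer>> graph, int start) {
        Set<Integer> marked = new LinkedHashSet<>();
        visit(graph, start, marked);
        return marked;
    }

    private static void visit(Map<Integer, List<Integer>> graph, int room, Set<Integer> marked) {
        marked.add(room); // mark the room on entry
        for (int next : graph.getOrDefault(room, Collections.emptyList())) {
            if (!marked.contains(next)) { // skip rooms we have already been in
                visit(graph, next, marked);
            }
        }
        // All doors used: returning from this call backtracks to the previous room.
    }
}
```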
Six Degrees of Bacon

Background
• Stanley Milgram’s Six Degrees of Separation?
• Craig Fass, Mike Ginelli, and Brian Turtle invented it as a drinking game at Albright College
• Brett Tjaden, Glenn Wasson, and Patrick Reynolds have run the online website from UVa and beyond
• Instance of Small-World phenomenon

http://oracleofbacon.org handles 2 kinds of requests
1. Find the links from Actor A to Actor B.
2. How good a center is a given actor?
• How does it answer these requests?
How does the Oracle work?


Not using Oracle™
Queries require traversal of the graph
[Figure: actor-movie graph around Kevin Bacon (BN = 0). Movies: Mystic River, Apollo 13, Footloose. Actors at BN = 1: Sean Penn, Tim Robbins, Tom Hanks, Bill Paxton, Sarah Jessica Parker, John Lithgow]
How does the Oracle Work?


BN = Bacon Number
Queries require traversal of the graph
[Figure: the actor-movie graph extended to BN = 2. BN = 0: Kevin Bacon. BN = 1: Sean Penn, Tim Robbins, Tom Hanks, Bill Paxton, Sarah Jessica Parker, John Lithgow. BN = 2: Woody Allen, Judge Reinhold, Miranda Otto, Morgan Freeman, Helen Hunt, Sally Field, Val Kilmer, Billy Bob Thornton. Movies: Mystic River, Apollo 13, Footloose, Sweet and Lowdown, Fast Times at Ridgemont High, War of the Worlds, The Shawshank Redemption, Cast Away, Forrest Gump, Tombstone, A Simple Plan]
How does the Oracle work?


How do we choose which movie or actor to explore next?
Queries require traversal of the graph
[Figure: the same actor-movie graph, BN = 0 through BN = 2, as on the previous slide]
Actor-Actor Graph
Movie-Movie Graph
Actor-Movie Graph
Traversals


Connected?
• Connected components?
• Degree: number of edges incident to a vertex

Starting at Bacon, where can we get?
• Random search: choose a neighboring vertex at random
  • Can we move in circles?
• Depth-first search: envision each vertex as a room, with doors leading out
  • Go into a room, mark the room, choose an unused door, exit
    – Don’t go into a room you’ve already been in (see mark)
  • Backtrack if all doors used (to room with unused door)
• Rooms are stacked up; backtracking is really recursion
• One alternative uses a queue: breadth-first search
Breadth first search

In an unweighted graph this finds the shortest path between a start vertex and every vertex reachable from it
• Visit every node one away from start
• Visit every node two away from start
  • This is every node one away from a node one away
• Visit every node three away from start, …

Put vertex on queue to start (initially just one)
• Repeat: take vertex off queue, put all adjacent vertices on
• Don’t put a vertex on that’s already been visited (why?)
• When are 1-away vertices enqueued? 2-away? 3-away?
• How many vertices on queue?
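The level-by-level idea can be sketched as a method that records each vertex's distance from the start the first time the vertex is enqueued. This is a minimal sketch, assuming an adjacency map keyed by name; unlike the deck's pseudocode it marks vertices when they are enqueued rather than when they are dequeued, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; names and graph representation are assumptions.
class BfsDistanceSketch {
    // Maps each vertex reachable from start to its shortest-path length in edges.
    static Map<String, Integer> distances(Map<String, List<String>> graph, String start) {
        Map<String, Integer> dist = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String v = queue.remove();
            for (String w : graph.getOrDefault(v, Collections.emptyList())) {
                if (!dist.containsKey(w)) { // first time we reach w, so this is the shortest distance
                    dist.put(w, dist.get(v) + 1);
                    queue.add(w);
                }
            }
        }
        return dist;
    }
}
```

In an actor-movie graph, where edges join an actor to a movie, an actor's Bacon number is half the BFS distance from Kevin Bacon. For the chain Kevin Bacon, Apollo 13, Tom Hanks, Cast Away, Helen Hunt, the distance to Helen Hunt is 4, giving a Bacon number of 2 within that small graph.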
General graph traversal
COLLECTION_OF_VERTICES fringe;
fringe = INITIAL_COLLECTION;
while (!fringe.isEmpty()) {
    Vertex v = fringe.removeItem(QUEUE_FN);
    if (!MARKED(v)) {
        MARK(v);
        VISIT(v);
        for each edge (v,w) {
            if (NEEDS_PROCESSING(w))
                Add w to fringe according to QUEUE_FN;
        }
    }
}
Breadth-first search
Visit each vertex reachable from some source in breadth-first order

Like level-order traversal
Queue fringe;
fringe = {source};
while (!fringe.isEmpty()) {
    Vertex v = fringe.dequeue();
    if (!getMark(v)) {
        setMark(v);
        VISIT(v);
        for each edge (v,w) {
            if (!getMark(w))    // enqueue only vertices not yet visited
                fringe.enqueue(w);
        }
    }
}

How do we change to make depth-first search?
• How does the order visited change?

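One way to explore the question is to keep the fringe in a double-ended queue and change only which end vertices are removed from: the front gives breadth-first order, the back gives a depth-first order. This is a minimal sketch of that idea; the `depthFirst` flag and all names are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch; names and graph representation are assumptions.
class FringeTraversalSketch {
    // Returns vertices in visit order; the fringe acts as a queue or a stack.
    static List<Integer> traverse(Map<Integer, List<Integer>> graph, int source, boolean depthFirst) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> fringe = new ArrayDeque<>();
        fringe.addLast(source);
        while (!fringe.isEmpty()) {
            // The only difference between the two searches is which end we remove from.
            int v = depthFirst ? fringe.removeLast() : fringe.removeFirst();
            if (marked.add(v)) { // add returns false if v was already marked
                order.add(v);    // VISIT(v)
                for (int w : graph.getOrDefault(v, Collections.emptyList())) {
                    if (!marked.contains(w)) {
                        fringe.addLast(w);
                    }
                }
            }
        }
        return order;
    }
}
```

On a graph with edges 0 to 1, 0 to 2, 1 to 3, and 2 to 3, the queue version visits 0, 1, 2, 3 (level by level), while the stack version visits 0, 2, 3, 1 (following one branch to its end before backing up).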
Center of the Hollywood Universe?



1,018,678 people can be connected to Bacon
Is he the center of the Hollywood Universe?
• Who is?
• Who are other good centers?
• What makes them good centers?

Centrality
• Closeness: the inverse average distance of a node to all other nodes
  • Geodesic: shortest path between two vertices
  • Closeness centrality: number of other vertices divided by the sum of all distances between the vertex and all others
• Degree: the degree of a node
• Betweenness: a measure of how much a vertex is between other nodes
Closeness Centrality
An actor is important if she is relatively close to all other actors
1


Cc (v i )   d(v i ,v j )


 j


where d(x,y) is the length of shortest
path between x and y
Jordan Centrality uses the inverse of the length of the longest geodesic from the vertex to any other vertex
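Both measures can be computed from one breadth-first search per vertex. The sketch below assumes an unweighted, connected, undirected graph with vertices numbered 0 to n-1 and stored as adjacency lists; the class and method names are hypothetical. It also includes the normalized form (number of other vertices divided by the sum of distances).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch; assumes a connected, unweighted graph with at least two vertices.
class ClosenessSketch {
    // Shortest-path lengths from source to every vertex (-1 if unreachable).
    static int[] bfsDistances(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        Deque<Integer> queue = new ArrayDeque<>();
        dist[source] = 0;
        queue.add(source);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj.get(v)) {
                if (dist[w] < 0) {
                    dist[w] = dist[v] + 1;
                    queue.add(w);
                }
            }
        }
        return dist;
    }

    // Sum of d(v, j) over all j; rejects disconnected graphs.
    static int distanceSum(List<List<Integer>> adj, int v) {
        int sum = 0;
        for (int d : bfsDistances(adj, v)) {
            if (d < 0) throw new IllegalArgumentException("graph must be connected");
            sum += d;
        }
        return sum;
    }

    // C_c(v) = 1 / sum_j d(v, j)
    static double closeness(List<List<Integer>> adj, int v) {
        return 1.0 / distanceSum(adj, v);
    }

    // Normalized variant: (n - 1) / sum_j d(v, j)
    static double normalizedCloseness(List<List<Integer>> adj, int v) {
        return (adj.size() - 1.0) / distanceSum(adj, v);
    }

    // Jordan centrality: 1 / (longest geodesic from v to any other vertex)
    static double jordan(List<List<Integer>> adj, int v) {
        int longest = 0;
        for (int d : bfsDistances(adj, v)) {
            if (d < 0) throw new IllegalArgumentException("graph must be connected");
            longest = Math.max(longest, d);
        }
        return 1.0 / longest;
    }
}
```

On the three-vertex path 0, 1, 2, the middle vertex has closeness 1/2 and Jordan centrality 1, while each endpoint has closeness 1/3 and Jordan centrality 1/2.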
Degree Centrality
Actor with the most ties is the most important

$$C_D(v_i) = d(v_i) = \sum_{j} e_{ij}$$

where e_ij is 1 if v_i and v_j are adjacent, 0 otherwise

A purely local measure
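Because the measure is local, it can be read off one row of a 0/1 adjacency matrix. A minimal sketch, assuming the matrix e is square with e[i][j] = 1 for adjacent vertices; the names are hypothetical.

```java
// Illustrative sketch; assumes a square 0/1 adjacency matrix.
class DegreeSketch {
    // C_D(v_i) = sum over j of e[i][j]
    static int degree(int[][] e, int i) {
        int sum = 0;
        for (int j = 0; j < e[i].length; j++) {
            sum += e[i][j];
        }
        return sum;
    }

    // Index of a vertex with the largest degree (the first one on ties).
    static int mostCentral(int[][] e) {
        int best = 0;
        for (int i = 1; i < e.length; i++) {
            if (degree(e, i) > degree(e, best)) {
                best = i;
            }
        }
        return best;
    }
}
```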
Betweenness Centrality
An actor who lies on communication paths can control
communication flow, and is thus important

$$C_B(v_i) = \sum_{j<k} \frac{g_{jk}(v_i)}{g_{jk}}$$

where g_jk is the number of geodesics connecting v_j and v_k, and g_jk(v_i) is the number of those geodesics that pass through v_i
Information Centrality uses all paths in the network, and
weights them based on their length.
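The betweenness formula can be evaluated directly on small graphs by counting geodesics with breadth-first search. The sketch below assumes an unweighted, undirected graph with vertices 0 to n-1; it uses the fact that v_i lies on a geodesic from v_j to v_k exactly when d(j,i) + d(i,k) = d(j,k), in which case the number of such geodesics is the product of the geodesic counts for the two halves. It is a straightforward cubic-time illustration rather than an efficient algorithm, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch; assumes an unweighted, undirected graph.
class BetweennessSketch {
    static double[] betweenness(List<List<Integer>> adj) {
        int n = adj.size();
        int[][] dist = new int[n][n];    // dist[s][v]: geodesic length, -1 if unreachable
        long[][] count = new long[n][n]; // count[s][v]: number of geodesics from s to v
        for (int s = 0; s < n; s++) {
            Arrays.fill(dist[s], -1);
            dist[s][s] = 0;
            count[s][s] = 1;
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {
                int v = queue.remove();
                for (int w : adj.get(v)) {
                    if (dist[s][w] < 0) {
                        dist[s][w] = dist[s][v] + 1;
                        queue.add(w);
                    }
                    if (dist[s][w] == dist[s][v] + 1) {
                        count[s][w] += count[s][v]; // every geodesic to v extends to w
                    }
                }
            }
        }
        double[] cb = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = j + 1; k < n; k++) {
                    if (j == i || k == i || dist[j][k] < 0) continue;
                    boolean onGeodesic = dist[j][i] >= 0 && dist[i][k] >= 0
                            && dist[j][i] + dist[i][k] == dist[j][k];
                    if (onGeodesic) {
                        // g_jk(v_i) / g_jk
                        cb[i] += (double) (count[j][i] * count[i][k]) / count[j][k];
                    }
                }
            }
        }
        return cb;
    }
}
```

On the three-vertex path 0, 1, 2, only the middle vertex lies between a pair, so its betweenness is 1 and the endpoints score 0. On a four-vertex cycle every opposite pair has two geodesics, so each vertex scores 1/2.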