Transcript networks

Self-Organization in Networks
Ron Eglash, RPI
Network theory origins in graph theory
Network theory was first introduced as “graph theory” by Euler. Rather than “nodes” and
“links,” graph theory speaks of “vertices” and “edges.”
The city of Königsberg (then in Prussia; now Kaliningrad, Russia) had seven bridges. A popular
pastime in the city was attempting to find a walk that crosses each bridge once and only once,
without re-tracing steps. Euler proved that it is impossible. He was a wet blanket at bridge parties.
First, he showed that you can treat the landmasses as vertices (nodes) and the
bridges as edges (links). This abstraction is also regarded as the origin of topology.
a. The start and end are “terminal nodes”; all others are “non-terminal nodes.”
b. Since you never re-trace your steps, you have to enter each non-terminal
node by one bridge, and then leave by a different bridge.
c. Each non-terminal node must therefore have an even number of bridges.
d. The two terminal nodes either both have an odd number of bridges or both have an even number.
both odd works
none odd works
The number of edges (links) attached to a node is the “degree” of that node.
Euler showed that a necessary condition for the walk is that the graph have exactly
zero or two nodes of odd degree. Since the graph of Königsberg has four nodes of
odd degree, it cannot contain an Eulerian path.
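The degree argument can be checked with a few lines of code. This is a minimal sketch, not Euler's own notation: the landmass labels A to D are illustrative (A is the central island, B and C the two river banks, D the eastern landmass), and the function names are made up for this example. It tests only the necessary condition stated above and assumes the graph is connected.

```python
from collections import Counter

# The seven bridges, each written as a link between two landmasses.
# Labels are illustrative: A = central island, B and C = river banks, D = eastern landmass.
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)

def satisfies_euler_condition(edges):
    """Necessary condition for the walk: exactly zero or two nodes of odd degree."""
    return len(odd_degree_nodes(edges)) in (0, 2)

print(odd_degree_nodes(bridges))           # all four landmasses have odd degree
print(satisfies_euler_condition(bridges))  # False
```

All four landmasses come out with odd degree (5, 3, 3, 3), so the condition fails.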
Network theory origins in Lusona?
Recall that lusona, the African design practice of drawing graphs in the sand,
had exactly the same constraints: a connected path that never re-traces.
Could we also credit the Tchokwe with the origins of network theory? What
could you ask a Lusona expert to help make that determination?
And if we do it for Lusona, what about Celtic knots and Kolam in India?
Sequence of Lusona, “trees of the ancestors”
Milgram’s small world experiment
a. Travers and Milgram’s work on the small world is responsible for the folklore
that “everyone is connected by a chain of about six steps.”
b. Their experiment: send a packet from sets of randomly selected people to a
stockbroker in Boston.
c. Experimental setup: arbitrarily select people from 3 pools:
   • People in Boston
   • Randomly selected people in Nebraska
   • Stockholders in Nebraska
d. Most letters never actually made it, but the ones that did accomplished the
task in surprisingly few steps, on average only 5.
Network vocabulary
• Translating from Euler: graph = network, vertices = nodes, edges = links.
• A network is composed of nodes and links between them.
• A “connected graph” is one in which every node can be reached from every other node.
• The “degree” of a node is the number of links connecting to that node.
• The “path length” is the number of links on the shortest route from a given node to
another given node.
• An “undirected” network ignores the difference between incoming and
outgoing links. This may be an important difference (e.g., websites typically hope to
gain large numbers of inbound links).
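The vocabulary above maps directly onto a small data structure. The sketch below is illustrative only: the five-node network and the function names are hypothetical, and path length is taken to mean the fewest links between two nodes, found by breadth-first search.

```python
from collections import deque

# Hypothetical undirected network, stored as node -> set of linked nodes.
network = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}

def degree(net, node):
    """Number of links connecting to the node."""
    return len(net[node])

def path_length(net, start, goal):
    """Fewest links from start to goal (breadth-first search); None if unreachable."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return distance[node]
        for neighbor in net[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return None

def is_connected(net):
    """True if every node can be reached from the first node."""
    first = next(iter(net))
    return all(path_length(net, first, other) is not None for other in net)

print(degree(network, "b"))            # 3
print(path_length(network, "a", "e"))  # 3 (a to b to d to e)
print(is_connected(network))           # True
```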
Network degree distribution
In Melanie Mitchell’s example, her node has the highest degree (10). In the degree
distribution, she is at the upper extreme.
A node with much higher degree than most is a “hub.” Various measures of the
“centrality” of a node exist.
“Degree centrality”: simply the number of links a node has, so the node with the most links
is the most central. Google’s website is currently the highest-degree hub online.
Clustering of nodes
A “cluster” is a group of nodes that are more likely to link to each other than to
the other nodes in the graph. Clusters can have a fractal
topology, with clusters of clusters and so on down.
Clustering coefficient
The clustering coefficient of a selected node is defined as the probability that two
randomly selected neighbors are connected to each other.
For example, how likely is it that two of your friends are also friends with each other?
cc1(v) = (number of pairs of neighbors actually connected by edges) / (number of pairs of neighbors)
Vertex v has 5 neighbors, which means 10 possible pairs. 3 of those pairs are actually
connected to each other. CC = 3/10 = 0.3
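The friends question can be computed directly from the definition. This is a minimal sketch on a made-up friendship network (the names and the `clustering_coefficient` helper are hypothetical). Returning 0 for a node with fewer than two neighbors is a convention assumed here, since such a node has no pairs to count.

```python
from itertools import combinations

# Hypothetical friendship network, stored as person -> set of friends.
friends = {
    "you": {"ann", "bob", "cat", "dan"},
    "ann": {"you", "bob"},
    "bob": {"you", "ann", "cat"},
    "cat": {"you", "bob"},
    "dan": {"you"},
}

def clustering_coefficient(net, v):
    """Fraction of pairs of v's neighbors that are linked to each other."""
    pairs = list(combinations(sorted(net[v]), 2))
    if not pairs:
        return 0.0  # assumed convention: fewer than two neighbors means no pairs
    linked = sum(1 for a, b in pairs if b in net[a])
    return linked / len(pairs)

# "you" has 4 friends, so 6 possible pairs; ann-bob and bob-cat are friends.
print(clustering_coefficient(friends, "you"))  # 2 of 6 pairs, about 0.33
```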
Small World Network: original hypothesis
Regular network of 60 nodes:
average path length is 15
Randomly add links to only 5% of nodes:
average path length drops to 9
A network has the “small world” property if it has a small average path length relative to
the total number of nodes.
Specifically, the typical path length L between two randomly chosen nodes grows as the
logarithm of the number of nodes N in the network (L ∝ log N).
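The effect of a few random links can be reproduced in a short experiment. The sketch below makes two assumptions not stated on the slide: the regular network is a plain ring in which each node links to its two nearest neighbors, and "5% of nodes" is read as 3 extra links for 60 nodes. The function names and the random seed are arbitrary, so the second number will not necessarily match the figure quoted above.

```python
import random
from collections import deque

def ring_network(n):
    """Regular network: each node linked to its two nearest neighbors on a ring."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def average_path_length(net):
    """Mean shortest-path length over all reachable ordered pairs of nodes."""
    total = 0
    pairs = 0
    for start in net:
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in net[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
        pairs += len(distance) - 1
    return total / pairs

def add_random_shortcuts(net, count, rng):
    """Return a copy of net with `count` new links between randomly chosen unlinked nodes."""
    shortcut_net = {node: set(neighbors) for node, neighbors in net.items()}
    nodes = sorted(shortcut_net)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in shortcut_net[a]:
            shortcut_net[a].add(b)
            shortcut_net[b].add(a)
            added += 1
    return shortcut_net

ring = ring_network(60)
small_world = add_random_shortcuts(ring, 3, random.Random(0))
print(round(average_path_length(ring), 2))         # 15.25 for the plain ring
print(round(average_path_length(small_world), 2))  # lower; the exact value depends on the seed
```

Because every shortcut joins two nodes that were at least two links apart, the average can only go down; how far it drops depends on where the shortcuts happen to land.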
Better model than adding random links:
scale-free version
Regular network has high average path length (“number of hops”), high clustering.
Small world has low average path length, high clustering
Random network has low average path length, low clustering
Scale-free networks similar to fractals
Recall from complexity theory: the most complex is between ordered and random
Small World Social Networks
Actor network: example of a small world
Other social network examples:
Any two documents on the Web are around 19 clicks away (directed) on average
Mathematicians and papers (Erdős number)
Musicians and rock groups
Baseball players and teams
Friends
Memes?
Small World Biological Networks
C. elegans (nematode) neurons
Disease transmission (virus links organisms)
Metabolite processing networks (catalyst chains)
DNA transcription (nodes are genes)
Small World Technological Networks
Airline hubs
Power grid
Small World Network
Empirical test: measure the CC of a real-world small-world network, then
measure the CC for the same nodes after reconnecting them randomly. The real network should show the higher CC.
Examples:
• Kevin Bacon Graph (KBG)
• Power Grid (Western US)
• C. elegans Worm
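The comparison can be sketched in code. Since the real data sets listed above are not included here, the example uses a stand-in: a ring lattice in which each node links to its two nearest neighbors on each side, chosen only because it is highly clustered. All function names are hypothetical, and "reconnecting randomly" is interpreted as placing the same number of links between randomly chosen pairs of the same nodes.

```python
import random
from itertools import combinations

def ring_lattice(n, reach):
    """Stand-in for a clustered network: each node links to `reach` neighbors on each side."""
    net = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, reach + 1):
            j = (i + step) % n
            net[i].add(j)
            net[j].add(i)
    return net

def randomly_reconnected(net, rng):
    """Same nodes and same number of links, but the links join randomly chosen pairs."""
    nodes = sorted(net)
    link_count = sum(len(neighbors) for neighbors in net.values()) // 2
    random_net = {node: set() for node in nodes}
    for a, b in rng.sample(list(combinations(nodes, 2)), link_count):
        random_net[a].add(b)
        random_net[b].add(a)
    return random_net

def average_clustering(net):
    """Mean clustering coefficient; nodes with fewer than two neighbors count as 0."""
    total = 0.0
    for neighbors in net.values():
        pairs = list(combinations(sorted(neighbors), 2))
        if pairs:
            total += sum(1 for a, b in pairs if b in net[a]) / len(pairs)
    return total / len(net)

lattice = ring_lattice(60, 2)
shuffled = randomly_reconnected(lattice, random.Random(0))
print(average_clustering(lattice))   # 0.5 for this lattice
print(average_clustering(shuffled))  # much lower after random reconnection
```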
Google page rank as measure
Directed graph, so you only measure “in-links”: the “importance” of your webpage is who
links to you, not who you link to. Measure the in-degree of each node.
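Counting in-links on a directed graph takes only a few lines. The page names and links below are made up for illustration, and `in_degrees` is a hypothetical helper. The sketch measures in-degree only, as described above; it is not Google's actual ranking computation.

```python
from collections import Counter

# Hypothetical directed links, written as (linking page, linked-to page).
links = [
    ("blog", "news"), ("shop", "news"), ("wiki", "news"),
    ("news", "wiki"), ("blog", "wiki"), ("news", "shop"),
]

def in_degrees(directed_links):
    """Count inbound links per page; pages that only link out get 0."""
    counts = Counter(target for _source, target in directed_links)
    for source, _target in directed_links:
        counts.setdefault(source, 0)
    return dict(counts)

print(in_degrees(links))  # "news" has the most in-links (3); "blog" has none
```

Note that "blog" links out twice but scores 0, which is the point of counting in-links rather than out-links.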
Number of Web pages with in-degree k is proportional to 1/k^2
No matter how far we “zoom” out,
you see the same relation!
Typical of scale-free networks
Scale-free networks follow power laws
Number of Web pages with in-degree k is proportional to 1/k^2,
i.e., written as Number ∝ k^(−2)
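One way to see why this relation looks the same at every zoom level is to compare the count at in-degree 2k with the count at in-degree k. The sketch below assumes the idealized relation holds exactly; `expected_pages` and its `constant` parameter are hypothetical names standing in for the unknown proportionality constant.

```python
def expected_pages(k, constant=1.0):
    """Relative number of pages with in-degree k under the idealized 1/k^2 relation."""
    return constant / k ** 2

# Doubling the in-degree divides the count by 4, whatever scale we start from.
for k in (10, 100, 1000):
    ratio = expected_pages(2 * k) / expected_pages(k)
    print(k, round(ratio, 6))
```

The ratio comes out at one quarter for every starting value of k, so the curve has no characteristic scale.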
Power law networks are:
• Resilient to random deletion
• Vulnerable to hub deletion
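The contrast between the two kinds of deletion can be illustrated on a toy network. The hub-and-spoke layout below is a hypothetical stand-in for a hub-heavy network, not a true power-law graph, and all names are made up for this sketch. The measure used, the size of the largest surviving connected piece, is one reasonable choice among several.

```python
import random
from collections import deque

def hub_and_spoke(hubs, spokes_per_hub):
    """Hypothetical hub-heavy network: hubs form a chain, each with its own leaf nodes."""
    net = {}

    def link(a, b):
        net.setdefault(a, set()).add(b)
        net.setdefault(b, set()).add(a)

    for h in range(hubs):
        if h > 0:
            link(f"hub{h - 1}", f"hub{h}")
        for s in range(spokes_per_hub):
            link(f"hub{h}", f"leaf{h}-{s}")
    return net

def largest_component(net, removed):
    """Size of the largest connected piece left after deleting the removed nodes."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in net:
        if start in removed or start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in net[node]:
                if neighbor not in removed and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best

net = hub_and_spoke(hubs=4, spokes_per_hub=15)
top_hubs = sorted(net, key=lambda node: len(net[node]), reverse=True)[:2]
random_pick = random.Random(1).sample(sorted(net), 2)
print("after removing 2 hubs:", largest_component(net, top_hubs))
print("after removing 2 random nodes:", largest_component(net, random_pick))
```

Of the 64 nodes, 60 are leaves, so a random deletion usually removes a leaf and barely changes the network, while deleting the two best-connected hubs leaves no piece larger than 16 nodes.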
We need some skepticism about the ubiquity
of power laws
Cosma Shalizi: “Our tendency to hallucinate power laws is a
disgrace.”
Nonetheless, scale-free networks are at least
common. Why?
Neural connectivity: resilience, local-global
compromise (too bulky to connect all)
Genetic connectivity: resistance to viruses
Scientific papers: preferential attachment
Why are scale-free networks common?
Because self-organization is common!
Toss a handful of particles in
the air: “self-organized” but
without order. Trivial case.
Sand waves from wind
action: a quasi-ordered
emergent pattern.
Significant case.
Salt crystal forms from
evaporating water.
Completely ordered. Trivial
case.
Self-organization tends to produce scale-free structures in
networks too!
Branching structures are both geometrically and
topologically scale-free
Lungs
fern
algae
Tree variation as variation in (geometric) fractal
dimension
Df = 1.05: sparse conversation
Df = 1.50: conversation at the "edge of chaos"
Df = 1.70: dense conversation
Fractal structures are common not just in nature but in
artificial and social systems as well
Tree representation of threaded
online conversation
Network Theory example:
Measuring the fractal dimension of
conversation trees
• Visually intuitive: Sparse trees = low
fractal dimension, lush trees = high fractal
dimension.
• The dimension number can take over
when our intuition fails.
• There is a powerful connection to
complexity theory…
How to counter the reductive tendency in network modeling?
Reductive example:
• Linked: The New Science of Networks by
Albert-László Barabási.
• The Internet’s “highly connected hubs”
(due to fractal structure) greatly increase
vulnerability to planned attack.
• Used data showing that networks of
human sexual contact have a fractal
structure.
• Concluded that HIV infection rates could
be greatly reduced by targeting the same
“highly connected hubs” – sexually
promiscuous individuals.
Targeting “highly connected hubs” in
sexual networks can increase HIV rates
• In Africa, for example, people connected
with AIDS risk have been subject to
harassment, violence, and even murder.
• As a result, communication about HIV is
very poor
• Lack of communication greatly increases
transmission rates.
• Fixations on sexual promiscuity in Africa
have been closely linked to right-wing
religious opposition to condom use.
Conclusion
• Thus the reflexive or recursive engagements of
STS are not merely negative barriers to truth
claims. They can be a positive force in generating
representations that are better accounts of the
worlds we inhabit.
• Complexity theory can open up representations to
the “trielectic” between natural, social, and
technological worlds