Fat Tail Distributions and Efficiency of Flow Processing on Complex Networks

Zoltán Toroczkai
Center for Nonlinear Studies, and Complex Systems Group,
Theoretical Division, Los Alamos National Laboratory
LA-UR-03-5542
LANL LDRD-DR S.P.I.N. Project, 2003-06
What are Networks?
Interacting many “particle” systems where the interactions are
propagated through a discrete structure, a graph (not a continuum).
Node (the “particle”)
Link (edge)
The links [edges] represent interactions or associations between the nodes.
Graph:
-- undirected
-- directed
Where are Networks?
• Infrastructures: transportation nw-s (airports, highways, roads,
rail, water), energy transport nw-s (electric power, petroleum,
natural gas)
• Communications: telephone, microwave backbone, internet,
email, www, etc.
• Biology: protein-gene interactions, protein-protein interactions,
metabolic nw-s, cell-signaling nw-s, the food web, etc.
• Social Systems: acquaintance (friendship) nw-s, terrorist nw-s,
collaboration networks, epidemic networks, the sex-web
• Geology: river networks
Communication Networks
Skitter data depicting a macroscopic snapshot of Internet connectivity, with selected backbone ISPs (Internet Service
Providers) colored separately, by K. C. Claffy. http://www.caida.org/Papers/Nae/
Networks in Biology
The metabolic pathway
Chemicals
Bio-Chemical reactions
Biochemical Pathways - Metabolic Pathways, Source: ExPASy
The protein network
proteins
H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature
411, 41-42 (2001)
Binding
P. Uetz, et al. Nature 403, 623-7 (2000).
Structural properties: degree distributions and the scale-free
character
Node degree: number of neighbors (in the illustrated example, node i has $k_i = 5$).
Degree distribution, P(k): fraction of nodes whose degree is k (a histogram over
the $k_i$-s).
Observation: networks found in Nature, and human-made ones, are in many cases
“scale-free” (power-law) networks:
$P(k) \sim k^{-\gamma}$
For the sake of definitions:
The Erdős-Rényi Random Graph (also called the binomial random
graph)
$G_{N,p}(V, E)$
• Consider N nodes (dots).
• Take every pair (i,j) of nodes and connect them with an edge with
probability p.
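As a minimal sketch of the construction above, the following Python builds $G_{N,p}$ by visiting every pair of nodes once and then tabulates P(k). The function names `erdos_renyi` and `degree_histogram` are illustrative and not from the talk; the graph is stored as a list of neighbor sets, which is one convenient choice among many.

```python
import random
from collections import Counter

def erdos_renyi(n, p, seed=None):
    """Return G_{N,p} as a list of neighbor sets.

    Every pair (i, j) with i < j is linked independently with probability p.
    """
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def degree_histogram(neighbors):
    """Return P(k): the fraction of nodes whose degree is k."""
    n = len(neighbors)
    counts = Counter(len(s) for s in neighbors)
    return {k: c / n for k, c in sorted(counts.items())}

if __name__ == "__main__":
    graph = erdos_renyi(1000, 0.005, seed=1)
    for k, fraction in degree_histogram(graph).items():
        print(k, round(fraction, 4))
```

For large N and small p the printed histogram should be close to a Poisson distribution with mean Np, which is the non-scale-free reference case.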
What is scale-free?
[Figure: degree distributions compared. Erdős-Rényi graph (non-scale-free network): Poisson distribution with mean $\langle k \rangle$. Scale-free network: power-law distribution.]
Capacity-achieving degree distribution of the Tornado code, with decay exponent -2.02.
M. Luby, M. Mitzenmacher, M.A. Shokrollahi, D. Spielman and V. Stemann, in Proc. 29th
ACM Symp. Theor. Comp. pg. 150 (1997).
[Figure panels, scale-free networks: science citations; www, out- and in-link distributions; sex-web; Internet, router level; metabolic network (Archaea, Bacteria, Eukaryotes).]
Scale-free Networks: Coincidence or Universality?
• No obvious universal mechanism identified
• As a matter of fact, we claim that there is none (no universal mechanism, that is).
• Instead, our statement is that at least for a large class of networks (to be specified)
network structural evolution is governed by a selection principle which is closely tied to
the global efficiency of transport and flow processing by these structures, and
• Whatever the specific mechanism, it is such as to obey this selection principle.
We first need to define a flow process on these networks.
Z. Toroczkai and K.E. Bassler, “Jamming is Limited in Scale-free Networks”,
Nature, 428, 716 (2004)
Z. Toroczkai, B. Kozma, K.E. Bassler, N.W. Hengartner and G. Korniss
“Gradient Networks”, http://www.arxiv.org/cond-mat/0408262
Gradient Flow Networks
Gradients of a scalar (temperature, concentration, potential, etc.) induce flows (heat,
particles, currents, etc.).
Naturally, gradients will induce flows on networks as well.
Ex.:
Load balancing in parallel computation and packet routing on the internet
Y. Rabani, A. Sinclair and R. Wanka, Proc. 39th Symp. On Foundations of Computer
Science (FOCS), 1998: “Local Divergence of Markov Chains and the Analysis of
Iterative Load-balancing Schemes”
Setup:
Let G=G(V,E) be an undirected graph, which we call the substrate network.
The vertex set:
$V = \{x_0, x_1, \dots, x_{N-1}\} \equiv \{0, 1, 2, \dots, N-1\}$
The edge set:
$E \subseteq V \times V$, $e \in E$, $e = x_i x_j \equiv (i, j)$, $x x \notin E$ (no self-loops)
A simple representation of E is via the N×N adjacency (or incidence) matrix A:
$A(x_i, x_j) = a_{ij} = \begin{cases} 1 & \text{if } (i,j) \in E \\ 0 & \text{if } (i,j) \notin E \end{cases}$
Let us consider a scalar field
$\{h\} : V \to \mathbb{R}$   (1)
Set of nearest neighbor nodes on G of i: $S_i^{(1)}$
Definition 1
The gradient $\nabla h(i)$ of the field {h} in node i is a directed edge:
$\nabla h(i) = (i, \mu(i))$   (2)
which points from i to that nearest neighbor $\mu \in S_i^{(1)} \cup \{i\}$ on G for which the increase in the
scalar is the largest, i.e.:
$\mu(i) = \arg\max_{j \in S_i^{(1)} \cup \{i\}} (h_j)$   (3)
The weight associated with edge $(i, \mu)$ is given by:
$|\nabla h(i)| = h_\mu - h_i$
If $\mu(i) = i$ then $\nabla h(i) = (i, i) \equiv 0(i)$.
The self-loop $0(i)$ is a loop through i with zero weight.
Definition 2
The set F of directed gradient edges on G together with the vertex set V forms
the gradient network:
$\nabla G = \nabla G(V, F)$
If (3) admits more than one solution, then the gradient in i is degenerate.
In the following we will only consider scalar fields with non-degenerate gradients. This means:
$\mathrm{Prob}\{h_i = h_j \text{ if } (i,j) \in E\} = 0$
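Definitions 1 and 2 translate almost line by line into code. The sketch below is only an illustration: the names `gradient_network` and `in_degrees` are hypothetical, the substrate is assumed to be given as a list of neighbor sets, and the scalar field is assumed to be non-degenerate (no ties between linked nodes). Self-loops are left out of the in-degree count, which is an assumption made here so that the in-degree matches the number of followers.

```python
def gradient_network(neighbors, h):
    """For each node i return mu(i), the node with the largest scalar in S_i U {i}.

    mu[i] == i encodes the zero-weight self-loop 0(i).
    Assumes a non-degenerate field h (no ties between linked nodes).
    """
    mu = []
    for i, nbrs in enumerate(neighbors):
        best = i
        for j in nbrs:
            if h[j] > h[best]:
                best = j
        mu.append(best)
    return mu

def in_degrees(mu):
    """In-degree of every node in the gradient network (self-loops not counted)."""
    k_in = [0] * len(mu)
    for i, target in enumerate(mu):
        if target != i:
            k_in[target] += 1
    return k_in

if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 with a hand-picked scalar field.
    path = [{1}, {0, 2}, {1, 3}, {2}]
    h = [0.1, 0.7, 0.3, 0.9]
    mu = gradient_network(path, h)
    print(mu)              # gradient edge i -> mu[i]
    print(in_degrees(mu))  # followers of each node
```

In the path example nodes 1 and 3 are local maxima, so they carry self-loops, and each of them collects one follower.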
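Theorem 1
Non-degenerate gradient networks form forests.
Proof (sketch): every node has exactly one outgoing gradient edge, so a cycle through n > 1 nodes would have to use the out-edge of each of its nodes and be consistently oriented; the scalar would then increase strictly all the way around the cycle, which is impossible.
Theorem 2
The number of trees in this forest = number of local maxima of {h} on G.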
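Both theorems can be checked numerically on small examples. The following is a sketch under stated assumptions: the helper names are hypothetical, the substrate is a list of neighbor sets, and the scalars are i.i.d. uniform random numbers so that ties have probability zero. Following the gradient edges uphill from every node must terminate (no cycles), and the set of end points must coincide with the set of local maxima.

```python
import random

def local_maxima(neighbors, h):
    """Nodes whose scalar exceeds that of all their substrate neighbors."""
    return [i for i, nbrs in enumerate(neighbors) if all(h[i] > h[j] for j in nbrs)]

def gradient_targets(neighbors, h):
    """mu(i): argmax of h over node i and its nearest neighbors."""
    return [max([i, *nbrs], key=lambda j: h[j]) for i, nbrs in enumerate(neighbors)]

def tree_roots(mu):
    """Follow gradient edges uphill from every node and collect the end points."""
    roots = set()
    for start in range(len(mu)):
        node, steps = start, 0
        while mu[node] != node:
            node = mu[node]
            steps += 1
            assert steps <= len(mu), "cycle found: the gradient would be degenerate"
        roots.add(node)
    return roots

if __name__ == "__main__":
    rng = random.Random(3)
    n, p = 300, 0.02
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    h = [rng.random() for _ in range(n)]
    mu = gradient_targets(neighbors, h)
    print(len(tree_roots(mu)), len(local_maxima(neighbors, h)))
```

The two printed numbers should agree: every tree of the forest is rooted at exactly one local maximum.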
In-degree distribution of the Gradient Network when $G = G_{N,p}$. A
combinatorial derivation
Version: Balazs Kozma (RPI)
Assume that the scalar values at the nodes are i.i.d.,
according to some distribution $\eta(h)$.
First, distribute the scalars on the node set V, then
find those link configurations which contribute to
R(l), the in-degree distribution of the gradient network, when building the $G_{N,p}$ graph.
Without loss of generality, calculate R(l) for
node 0.
Consider the set of nodes with the property
$h_j > h_0$
Let the number of elements in this set be n, and the set be
denoted by [n].
The complementary set of [n] in V\{0} is: $[n]^C$
In order to have exactly l nodes pointing their gradient edges into 0:
• they have to be connected to node 0 on the substrate
• they must NOT be connected to the set [n]
For l nodes:
$[p(1-p)^n]^l$
We also need to require that no other nodes will be pointing their gradient directions into node 0
(obviously none of the [n] will):
$[1 - p(1-p)^n]^{N-1-n-l}$
So, for a fixed $h_0$ and a specific set [n]:
$\binom{N-1-n}{l} [p(1-p)^n]^l [1 - p(1-p)^n]^{N-1-n-l}$
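The probability $Q_n$ for such an event for a given n while letting the h-s vary according to their
distribution:
For one node to have its scalar larger than $h_0$:
$\gamma(h_0) = \int_{h_0}^{\infty} dh\, \eta(h)$
For exactly n nodes:
$[\gamma(h_0)]^n [1 - \gamma(h_0)]^{N-1-n}$
Thus:
$Q_n = \binom{N-1}{n} \int dh_0\, \eta(h_0)\, [\gamma(h_0)]^n [1 - \gamma(h_0)]^{N-1-n} = \frac{1}{N}$
(Substituting $u = \gamma(h_0)$, $du = -\eta(h_0)\, dh_0$ turns the integral into $\int_0^1 du\, u^n (1-u)^{N-1-n} = \frac{n!\,(N-1-n)!}{N!}$, so $Q_n = 1/N$ for any distribution $\eta$.)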
Combining:
$R_N(l) = \sum_{n=0}^{N-1} Q_n \binom{N-1-n}{l} [p(1-p)^n]^l [1 - p(1-p)^n]^{N-1-n-l}$
Finally:
$R_N(l) = \frac{1}{N} \sum_{n=0}^{N-1} \binom{N-1-n}{l} [1 - p(1-p)^n]^{N-1-n-l} [p(1-p)^n]^l$
In the limit $p \to 0$, $N \to \infty$, $z = Np = \text{const.}$, $z \gg 1$:
$R_N(l) \simeq \frac{1}{zl}$, $1 \le l \le z = Np$
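The exact sum for $R_N(l)$ is cheap to evaluate, which makes it easy to compare with the 1/(zl) behavior. The sketch below is illustrative only; the function name `in_degree_distribution` and the parameter values are assumptions, and the comparison is expected to hold only approximately at finite N and within the quoted range of l.

```python
from math import comb

def in_degree_distribution(n_nodes, p, l):
    """R_N(l) = (1/N) sum_{n=0}^{N-1} C(N-1-n, l) [p(1-p)^n]^l [1 - p(1-p)^n]^(N-1-n-l)."""
    total = 0.0
    for n in range(n_nodes):
        m = n_nodes - 1 - n       # nodes whose scalar is lower than h_0
        if l > m:
            continue
        q = p * (1.0 - p) ** n    # chance that one such node points into node 0
        total += comb(m, l) * q ** l * (1.0 - q) ** (m - l)
    return total / n_nodes

if __name__ == "__main__":
    N, z = 1000, 20
    p = z / N
    for l in (1, 2, 5, 10, 20):
        print(l, in_degree_distribution(N, p, l), 1.0 / (z * l))
```

Because the inner expression is a binomial distribution in l for every n, the values of $R_N(l)$ sum to one over l = 0, ..., N-1, which is a convenient sanity check on any implementation.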
What happens when the substrate is a scale-free
network?
Gradient Networks and Transport Efficiency
- every node has exactly one out-link (one gradient direction) but it can have more
than one in-link (the followers)
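- the gradient network has N nodes and N out-links, so the number of “out-streams”
is $N_{send} = N$
- the number of RECEIVERS is
$N_{receive} = \sum_{l \ge 1} N_l^{(in)}$
$J = 1 - \left\langle \left\langle \frac{N_{receive}}{N_{send}} \right\rangle_h \right\rangle_G = 1 - \left\langle \left\langle \frac{\sum_{l \ge 1} N_l^{(in)}}{N} \right\rangle_h \right\rangle_G = \left\langle \left\langle \frac{N_0^{(in)}}{N} \right\rangle_h \right\rangle_G = R_N(0)$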
- J is a congestion (pressure) characteristic.
- $0 \le J \le 1$. J=0: minimum congestion, J=1: maximum congestion
$J^{G_{N,p}}(N, p) = \frac{1}{N} \sum_{n=0}^{N-1} [1 - p(1-p)^n]^{N-1-n}$
In the scaling limit $N \to \infty$, $p = \text{const.}$:
$J^{G_{N,p}}(N, p) = 1 - \frac{\ln N}{N \ln\left(\frac{1}{1-p}\right)} \left[1 + O\left(\frac{1}{N}\right)\right] \to 1$
- for large networks we get maximal congestion!
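A short numerical check makes the approach to maximal congestion visible. This is a sketch, not the author's code: `congestion_factor` is a hypothetical name, and it evaluates J as $R_N(0)$, i.e. the l = 0 case of the exact in-degree sum with the sum running over n = 0, ..., N-1.

```python
from math import log

def congestion_factor(n_nodes, p):
    """J(N, p) = R_N(0) = (1/N) sum_{n=0}^{N-1} [1 - p(1-p)^n]^(N-1-n)."""
    return sum((1.0 - p * (1.0 - p) ** n) ** (n_nodes - 1 - n)
               for n in range(n_nodes)) / n_nodes

def congestion_large_n(n_nodes, p):
    """Leading large-N form at fixed p: 1 - ln N / (N ln(1/(1-p)))."""
    return 1.0 - log(n_nodes) / (n_nodes * log(1.0 / (1.0 - p)))

if __name__ == "__main__":
    for n_nodes in (10, 100, 1000, 10000):
        print(n_nodes, congestion_factor(n_nodes, 0.5), congestion_large_n(n_nodes, 0.5))
```

At fixed p the first column of values should climb toward 1 as N grows, with the second column tracking it more closely for larger N.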
In the scaling limit $p \to 0$, $N \to \infty$, $pN = z$:
$J^{G_{N,p}}(N, p) = \int_0^1 dx\, e^{-z e^{-zx}} = \frac{1}{z} \left[ \mathrm{Ei}(-z) - \mathrm{Ei}(-z e^{-z}) \right]$
$J^{G_{N,p}}(N, p) = 1 - \frac{\ln z + C}{z} + \dots$, $z \gg 1$ (C is Euler's constant)
- becomes congested for large average degree.
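The integral form and its large-z expansion can be compared with a few lines of standard-library Python. This is a rough sketch: the function names are hypothetical, the integral is estimated with a plain midpoint rule, and C is taken to be Euler's constant.

```python
from math import exp, log

EULER_C = 0.5772156649015329  # Euler's constant C

def congestion_sparse_limit(z, steps=20000):
    """Midpoint-rule estimate of J(z) = integral_0^1 exp(-z * exp(-z * x)) dx."""
    dx = 1.0 / steps
    return dx * sum(exp(-z * exp(-z * (k + 0.5) * dx)) for k in range(steps))

def congestion_large_z(z):
    """Leading large-z behavior: 1 - (ln z + C) / z."""
    return 1.0 - (log(z) + EULER_C) / z

if __name__ == "__main__":
    for z in (2, 5, 10, 20, 50):
        print(z, congestion_sparse_limit(z), congestion_large_z(z))
```

The two columns should approach each other as z increases, and both grow toward 1, in line with the statement that the network becomes congested for large average degree.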

- For scale-free structures, the congestion factor becomes independent of the
system (network) size!!
For LARGE and growing networks, where the conductance of edges is the same, and
the flow is generated by gradients, scale-free networks are more likely to be
selected during network evolution than scaled structures.
The Configuration model
A. Clauset, C. Moore, Z.T., E. Lopez, to be published.
Generating functions:
$g(z) = \sum_k P(k)\, z^k$
$R(z) = \int_0^1 dx\, g\!\left( 1 - (1 - z)\, \frac{x\, g'(x)}{g'(1)} \right)$
K-th Power of a Ring
Degree distribution of the gradient network for the K-th power of a ring:
$R_N(l) = \left\langle \delta_{l,\, k_0^{(in)}} \right\rangle_h$
Let $b_{ij} = \delta_{ij} + a_{ij}$.
If $i, j \in V$, and $i \in S_0^{(1)}$, then let:
$H_i(j) = 1 - b_{ij} + b_{ij}\, \theta(h_0 - h_j)$
So:
$k_0^{(in)} = \sum_{i=1}^{N-1} a_{0i} \prod_{j=1}^{N-1} H_i(j) = \sum_{i=1}^{N-1} a_{0i} \prod_{j=1}^{N-1} \left[ 1 - b_{ij} + b_{ij}\, \theta(h_0 - h_j) \right]$
$R(l) = \frac{1}{N} \sum_{n=0}^{N-1} \binom{N-1}{n}^{-1} \sum_{[\sigma]_n \in P_n(N-1)} \delta_{l,\ \sum_i a_{0i} \prod_{j=1}^{n} (1 - b_{i\sigma(j)})}$
where $[\sigma]_n = \{\sigma(1), \dots, \sigma(n)\}$ is an n-subset of the set {1,2,…,N-1}, and
$P_n(N-1)$ denotes the set of all possible n-subsets of {1,2,…,N-1}, with $|P_n(N-1)| = \binom{N-1}{n}$.
The product $\prod_{j=1}^{n} (1 - b_{i\sigma(j)})$ is always zero if there is a node from the n-subset connected to i,
or if i belongs to the n-subset.
Let
$T_n = [\sigma]_n \cup \bigcup_{j=1}^{n} S_{\sigma(j)}^{(1)}$
which is the union of the disks of all nodes from the n-subset.
Thus, one needs to find the number of coverings of the ring with n disks, each of radius K,
that miss exactly l nearest neighbors of the origin.
$R^{(2K)}(l) = \begin{cases}
\dfrac{4\,[3 + 9K + 4K^2 + 2Kl]}{(2K+l)(2K+l+1)(2K+l+2)(2K+l+3)}, & 1 \le l \le K-1 \\
\dfrac{6\,[2 + 7K + 7K^2]}{3K(3K+1)(3K+2)(3K+3)}, & l = K \\
\dfrac{4(2K+1)}{(2K+l+1)(2K+l+2)(2K+l+3)}, & K+1 \le l \le 2K-1 \\
\dfrac{1}{4K+1}, & l = 2K
\end{cases}$
Power law with exponent -3 in the variable 2K+l.
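The closed form for the K-th power of a ring can be checked against a simple sum rule. Every node has one outgoing gradient edge unless it is a local maximum, and on this substrate a node is a local maximum with probability 1/(2K+1), so the mean in-degree must equal 2K/(2K+1). The sketch below evaluates the piecewise formula exactly with rational arithmetic, assuming all signs in its numerators and denominators are positive; the function names are hypothetical.

```python
from fractions import Fraction

def ring_in_degree_distribution(K, l):
    """R^(2K)(l) for 1 <= l <= 2K on the K-th power of a (large) ring."""
    if not 1 <= l <= 2 * K:
        raise ValueError("formula covers 1 <= l <= 2K only")
    s = 2 * K + l
    if l < K:
        return Fraction(4 * (3 + 9 * K + 4 * K * K + 2 * K * l),
                        s * (s + 1) * (s + 2) * (s + 3))
    if l == K:
        return Fraction(6 * (2 + 7 * K + 7 * K * K),
                        3 * K * (3 * K + 1) * (3 * K + 2) * (3 * K + 3))
    if l < 2 * K:
        return Fraction(4 * (2 * K + 1), (s + 1) * (s + 2) * (s + 3))
    return Fraction(1, 4 * K + 1)

def mean_in_degree(K):
    """Sum of l * R(l) over the in-degrees covered by the formula."""
    return sum(l * ring_in_degree_distribution(K, l) for l in range(1, 2 * K + 1))

if __name__ == "__main__":
    for K in (1, 2, 3):
        print(K, mean_in_degree(K), Fraction(2 * K, 2 * K + 1))
```

Using `Fraction` keeps the comparison exact, so any sign error in a transcription of the formula shows up as a mismatch rather than as rounding noise.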
Competition Games on Networks
Collaboration with:
• Marian Anghel (CCS-3)
• Kevin E. Bassler (U. Houston)
• György Korniss (Rensselaer)
References:
M. Anghel, Z. Toroczkai, K.E. Bassler and G. Korniss, Competition-driven Network
Dynamics: Emergence of a Scale-free Leadership Structure and Collective
Efficiency, Phys.Rev.Lett. 92, 058701 (2004)
Z. Toroczkai, M. Anghel, G. Korniss and K.E. Bassler, Effects of Inter-agent
Communications on the Collective, in Collectives and the Design of Complex
Systems, eds. K. Tumer and D.H. Wolpert, Springer, 2004.
In human and most biological populations, resource limitations lead to
competitive dynamics.
The more severe the limitations, the fiercer the competition.
Amid competitive conditions certain agents may have better avenues or
strategies to reach the resources, which puts them into a distinguished
class of the “few”, or elites.
Elites form a minority group.
In spite of the minority character, the elites can considerably shape the
structure of the whole society:
since they are the most successful (in the given situation), the rest of the
agents will tend to follow (imitate, interact with) the elites creating a
social structure of leadership in the agent society.
Definition: a leader is an agent that has at least one follower at that moment.
The influence of a leader is measured by the number of followers it has.
Leaders can be following other leaders or themselves.
The non-leaders are termed “followers”.
The El Farol bar problem
[W. B. Arthur (1994)]
Options: A, B, …
A binary (computer friendly) version of the El Farol bar problem:
The Minority Game (MG)
[Challet and Zhang (1997)]
A = “0” (bar ok, go to the bar)
B = “1” (bar crowded, stay home)
World utility (history): the last m bits, e.g. (011..101), with the latest bit marked in the figure, read as an integer $l \in \{0, 1, \dots, 2^m - 1\}$
(Strategies)$^{(i)}$ = $\{S_1^{(i)}(l), S_2^{(i)}(l), \dots, S_S^{(i)}(l)\}$
(Scores)$^{(i)}$ = $C^{(i)}(k)$, k = 1,2,..,S.
$k^* = \arg\max_k \{C^{(i)}(k)\}$
(Prediction)$^{(i)}$ = $P^{(i)} = S_{k^*}^{(i)}(l) \in \{0,1\}$
3-bit history         000  001  010  011  100  101  110  111
associated integer      0    1    2    3    4    5    6    7
Strategy #1             0    0    0    1    1    0    0    1
Strategy #2             1    1    0    0    1    0    0    0
Strategy #3             1    1    1    0    0    0    1    0
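The strategy table can be used directly as a lookup structure. The sketch below shows how one agent would form its prediction from the three example strategies; the function names are hypothetical, and the score update (one point for every strategy that would have predicted the realized bit) is an assumed, commonly used rule rather than something stated on the slide. Ties between scores are broken in favor of the first strategy.

```python
# The three example strategies: entry l is the predicted bit for history integer l.
STRATEGIES = [
    [0, 0, 0, 1, 1, 0, 0, 1],  # Strategy #1
    [1, 1, 0, 0, 1, 0, 0, 0],  # Strategy #2
    [1, 1, 1, 0, 0, 0, 1, 0],  # Strategy #3
]

def history_index(bits):
    """Map an m-bit history string such as '011' to its associated integer l."""
    return int(bits, 2)

def predict(strategies, scores, bits):
    """Use the best-scoring strategy k* to predict the next bit for this history."""
    k_star = max(range(len(strategies)), key=lambda k: scores[k])
    return strategies[k_star][history_index(bits)]

def update_scores(strategies, scores, bits, outcome):
    """Assumed rule: reward every strategy that would have predicted the outcome bit."""
    l = history_index(bits)
    return [c + (1 if s[l] == outcome else 0) for s, c in zip(strategies, scores)]

if __name__ == "__main__":
    scores = [0, 2, 1]                          # C(k) for k = 1, 2, 3
    print(predict(STRATEGIES, scores, "000"))   # strategy #2 is currently best
    print(update_scores(STRATEGIES, scores, "110", 1))
```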
Attendance time-series for the MG: A(t) versus t [figure].
World Utility Function:
$\sigma^2 = \langle (A - N/2)^2 \rangle$
Agents cooperate if they manage
to produce fluctuations below
$\sigma = \sqrt{N}/2$, the value for the random choice game (RCG).
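The benchmark value follows from treating attendance as a sum of N fair coin flips, whose variance is N/4. A minimal simulation of that random choice game, with a hypothetical function name and arbitrary parameter values, can confirm the number:

```python
import random

def random_choice_variance(n_agents, rounds, seed=0):
    """Estimate sigma^2 = <(A - N/2)^2> when every agent picks a side by a fair coin flip."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        # One random bit per agent; the number of 1-bits is the attendance A(t).
        attendance = bin(rng.getrandbits(n_agents)).count("1")
        total += (attendance - n_agents / 2) ** 2
    return total / rounds

if __name__ == "__main__":
    N = 101
    print(random_choice_variance(N, 20000), N / 4)
```

Any collective whose measured variance falls below N/4 is doing better than uncoordinated guessing.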
Scaling variable:
$\alpha = \frac{P}{N} = \frac{2^m}{N}$
The El Farol bar game on a social network
[Figure: agents choosing between options A, B, …, now connected by a social network.]
The Minority Game on Networks (MGoN)
Agents communicate among themselves.
Social network, 2 components:
1) Acquaintance (substrate) network: G (non-directed, less dynamic)
2) Action network: A (directed and dynamic), $A \subseteq G$
Communication types (more bounded rationality):
• Majority rule (not rational)
• Minority rule (not rational)
• Critic’s rule: an agent listens to the OPINION/PREDICTION of
all neighboring agents on G, scores them (self included) based on
their past predictions, and ACTS on the prediction with the best score
(more rational, uses reinforcement learning)
(Links)$^{(i)}$ = $\{L_1^{(i)}, L_2^{(i)}, \dots, L_{K_i}^{(i)}\}$
(Scores)$^{(i)}$ = $F^{(i)}(j)$, j = 1,2,..,$K_i$.
$j^* = \arg\max_j \{F^{(i)}(j)\}$
(Prediction)$^{(i)}$ = $P^{(i)} = S_{k^*}^{(j^*)}(l) \in \{0,1\}$
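The critic's rule defines the action network: each agent draws one directed link to whichever agent (itself included) it currently scores highest. The sketch below illustrates that bookkeeping only. All names are hypothetical, `scores[i][j]` stands for $F^{(i)}(j)$, the substrate G is a list of neighbor sets, and ties are broken in favor of the agent itself and then the lowest index, which is an assumption.

```python
def critic_choice(agent, neighbors, scores):
    """Return j*: the best-scoring agent among `agent` and its neighbors on G."""
    candidates = [agent, *sorted(neighbors[agent])]
    return max(candidates, key=lambda j: scores[agent][j])

def action_network(neighbors, scores):
    """Directed action network A: one out-link i -> j* per agent (i -> i means it follows itself)."""
    return {i: critic_choice(i, neighbors, scores) for i in range(len(neighbors))}

def follower_counts(action):
    """Number of followers of each leader, not counting self-links."""
    counts = {}
    for i, leader in action.items():
        if leader != i:
            counts[leader] = counts.get(leader, 0) + 1
    return counts

if __name__ == "__main__":
    neighbors = [{1, 2}, {0}, {0}]   # star-shaped acquaintance network G
    scores = {0: {0: 1, 1: 3, 2: 2}, 1: {1: 0, 0: 5}, 2: {2: 4, 0: 1}}
    action = action_network(neighbors, scores)
    print(action)
    print(follower_counts(action))
```

The follower counts are the in-degrees of the action network, which is the quantity whose distribution is discussed as the leadership structure.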
Emergence of scale-free
leadership structure:
• Robust leadership hierarchy
$0 \le k_i^{out} \le k_i$
$N_k(N, m; p) \sim k^{-\beta} N_1(N, m; p)$
$N_1(N, m; p) \simeq a(p)$
$N_k(N, m; p) \simeq a(p)\, k^{-\beta} f_k(N, m; p)$
$f_k(N, m; p) \simeq 1$, for $m \gg 1$
• RCG on the ER network
produces the scale-free
backbone of the leadership
structure
• The influence is evenly distributed
among all levels of the leadership
hierarchy. [figure panel: m=6]
• The followers (“sheep”) make up
most of the population (over 90%)
and their number scales linearly with
the total number of agents.
• Structural unevenness appears in
the leadership structure for low trait
diversity.
Network Effects: Improved Market Efficiency
• A networked, low trait diversity system
is more effective as a collective
than a sophisticated group!
• Can we find/evolve networks/strategies
that achieve almost perfect volatility
given a group and their strategies
(or the social network on the group)?
Conclusions :
• We defined Gradient Networks as directed sub-graphs formed by local gradients of a
scalar distributed on a substrate graph G.
• When the gradient direction is unique these Gradient Networks form forests.
• Gradient Networks typically arise when there is a local extremizing dynamics at the
node level (agent-based systems such as markets, routers, parallel computers, etc.).
• Gradient Networks can be scale-free graphs even on substrate networks that are NOT
scale-free networks (such as E-R graphs)!!
• Gradient Networks can be highly dynamic, their evolution driven by the dynamics of
the scalar field on G; unlike, for example, preferential attachment, they are not solely defined
through the topological properties of G!!
• Gradient Networks give a natural explanation for why large scale-free networks might emerge if
the edges have the same conductance and the flows are generated by gradients.