Gradient Networks
(a random tutorial)
With:
M. Anghel (LANL), K.E. Bassler (Houston), G. Korniss (RPI), B.
Kozma (Paris-Sud), E. Ravasz-Reagan (Harvard), A. Clauset (SFI),
E. Lopez (LANL), C. Moore (UNM/SFI).
Physics Department, University of Notre Dame
What are Agent-based Systems?
We are rather familiar with:
Classical physical, chemical, and certain biological systems:
• Elementary particles, nuclei, atoms, molecules, proteins, polymers, fluids, solids, etc.
• They are single- or many-particle systems with well defined physical interactions.
• Their properties and behavior are well described by the known laws of physics and
chemistry.
• These properties (including the statistical ones) are reproducible.
There are, however, other types of ubiquitous systems surrounding
us: Agent-based Systems.
Social
Insects
Collective behavior from simple individuals.
High level of organization forming “social
structures” (hierarchies).
The individual usually cannot exist/survive on its own.
For efficient foraging, memory of
locations is needed.
Memory is introduced via pheromone
trails.
This is a “collective memory” !
Humans
As a collective, they too can
form low-entropy formations:
Or high-entropy formations, or crowds:
while having fun …
…or just plain panicked
… markets…
in New York
… and economies:
… or the Middle East
How do we even begin to think
about such systems?
Let us attempt a unifying representation:
ABS-s are systems of interacting entities called agents / players / individuals.
An agent is an entity with the following set of qualities:
• There is a set of variables x describing the state of the agent (position, speed, health
state, etc.). The corresponding state space is X.
• There is a set of variables z describing the perceived state of the environment, Z. The
environment includes other agents if there are any.
• There is a set of allowable actions (output space), A (swerve, brake, accelerate, etc.).
• There is a set of strategies, which are functions $s: (Z \times X)^t \to A$ that assign an action to a
given external perception, state of the agent and history up to time t. These are “ways of
thinking” for the agent (behavioral input space).
• There is a set of utility variables, $u \in U$ (time to destination, profits, risk).
• There is a multivariate objective function $F: U \to \mathbb{R}^m$, which might include constraints
(“rules”). The physics version is called action.
• There is a drive to optimize the objective function.
The topology of the interactions is usually a dynamical graph, or network.
Agent-based systems are really nothing more than a set of coupled optimizers.
Problem Classes
• The “Forward” or Analysis problem: mapping out collective behavior from the study
of interactions on the individual level (from micro to macro approach).
• The “Backward” or Design problem: there is an additional set of global variables that
form the utility space of the designer. Define individual traits and response functions such
that a global optimal performance is induced.
Would Statistical Physics-like methods work?
Deductive Game Theory
(von Neumann and Morgenstern )
- rational behavior
- algorithmic choice tree
evaluation
Classical Statistical Mechanics
- single response function
(Hamiltonian)
- non-adaptive
- large particle limit $N \sim 10^{23}$
- agent-planning
Agent-based Systems
- multiple response fcts.
explosion of state space
- adaptive
- individual goal-driven (coupled set of optimizers)
- mesoscopic size $N \sim 10^{8}$
- bounded rationality behavior (“good news”)
(W. Brian Arthur, 1994)
- broad distribution of interaction scales
Approaches of study:
Stylized (theoretical): build models from ingredients that
qualitatively match observations. After running the model see if the
output qualitatively matches the corresponding observations of the
real system. Gives a general understanding only, no quantitative
predictive capability.
Bottom-up (simulation and data heavy): insert as much
quantitative detail as possible along with real-world data. Run the
model over and over with different data. Perform statistics and
compare results with statistics measured on the real system.
Some predictive capability.
Industry, government.
Icosystems, Eric Bonabeau
The following slides present an example of a stylized model of a market. This is an agent-based
system where we study the qualitative behavior of a collective of interacting agents under
certain conditions, in particular that of limited resources. It led us to the introduction of the
notion of gradient networks.
Competition Games on Networks
Collaboration with:
• Marian Anghel (LANL)
• Kevin E. Bassler (U. Houston)
• György Korniss (Rensselaer)
References:
M. Anghel, Z. Toroczkai, K.E. Bassler and G. Korniss, Competition-driven Network
Dynamics: Emergence of a Scale-free Leadership Structure and Collective
Efficiency, Phys.Rev.Lett. 92, 058701 (2004)
Z. Toroczkai, M. Anghel, G. Korniss and K.E. Bassler, Effects of Inter-agent
Communications on the Collective, in Collectives and the Design of Complex
Systems, eds. K. Tumer and D.H. Wolpert, Springer, 2004.
In human and most biological populations, resource limitations lead to
competitive dynamics.
The more severe the limitations, the more fierce the competition.
Amid competitive conditions certain agents may have better avenues or
strategies to reach the resources, which puts them into a distinguished
class of the “few”, the gurus (elites).
They form a minority group.
In spite of the minority character, they can considerably shape the
structure of the whole society:
since they are the most successful (in the given situation), the rest of the
agents will tend to follow (imitate, interact with) the gurus creating a
social structure of leadership in the agent society.
Definition: a leader is an agent that has at least one follower at that moment.
The influence of a leader is measured by the number of followers it has.
Leaders can be following other leaders or themselves.
The non-leaders are coined “followers”.
The El Farol bar problem
[W. B. Arthur (1994)]
[Figure: the two choices, A and B]
A binary (computer friendly) version of the El Farol bar problem:
The Minority Game (MG)
[Challet and Zhang (1997)]
A = “0” (bar ok, go to the bar)
B = “1” (bar crowded, stay home)
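World utility (history): the latest m bits, e.g. (011..101), read as an integer $l \in \{0, 1, \ldots, 2^m - 1\}$
(Strategies)$^{(i)}$ = $\{S^{(i)}_1(l),\ S^{(i)}_2(l),\ \ldots,\ S^{(i)}_S(l)\}$
(Scores)$^{(i)}$ = $C^{(i)}(k)$, $k = 1, 2, \ldots, S$.
$k^* = \arg\max_k \{C^{(i)}(k)\}$
(Prediction)$^{(i)}$ = $P^{(i)} = S^{(i)}_{k^*}(l) \in \{0, 1\}$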
3-bit history   associated integ.   Strategy #1   Strategy #2   Strategy #3
000             0                   0             1             1
001             1                   0             1             1
010             2                   0             0             1
011             3                   1             0             0
100             4                   1             1             0
101             5                   0             0             0
110             6                   0             0             1
111             7                   1             0             0
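The strategy table and the score-based choice of $k^*$ can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the slides. The update rule (every strategy that predicted the minority side gains one point), the random tie-breaking, and the names `STRATEGY_1` and `play_minority_game` are assumptions made for the example.

```python
import random

# Strategy #1 from the table: index = integer value of the 3-bit history.
STRATEGY_1 = [0, 0, 0, 1, 1, 0, 0, 1]

def play_minority_game(n_agents=101, m=3, n_strategies=3, steps=200, seed=0):
    """Return the time series A(t) = number of agents choosing "1"."""
    rng = random.Random(seed)
    n_hist = 2 ** m                      # number of possible m-bit histories
    # strategies[i][k][l] is the prediction of agent i's k-th strategy
    strategies = [[[rng.randint(0, 1) for _ in range(n_hist)]
                   for _ in range(n_strategies)]
                  for _ in range(n_agents)]
    scores = [[0] * n_strategies for _ in range(n_agents)]
    history = rng.randrange(n_hist)      # random initial history l
    attendance = []
    for _ in range(steps):
        actions = []
        for i in range(n_agents):
            best = max(scores[i])
            # k* = arg max of the scores; ties broken at random (assumption)
            k_star = rng.choice([k for k, c in enumerate(scores[i]) if c == best])
            actions.append(strategies[i][k_star][history])
        a = sum(actions)
        minority = 1 if a < n_agents / 2 else 0   # n_agents odd: no ties
        # Assumed scoring: reward every strategy that predicted the minority.
        for i in range(n_agents):
            for k in range(n_strategies):
                if strategies[i][k][history] == minority:
                    scores[i][k] += 1
        # Shift the winning bit into the m-bit history.
        history = ((history << 1) | minority) % n_hist
        attendance.append(a)
    return attendance
```

For instance, `STRATEGY_1[int("011", 2)]` looks up the prediction of Strategy #1 for the history 011, and `play_minority_game()` returns an attendance series that can be plotted against time.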
Attendance time-series for the MG:
[Figure: attendance A(t) versus time t]
World Utility Function:
$\sigma^2 = \langle (A - N/2)^2 \rangle$
Agents cooperate if they manage
to produce fluctuations below
$N^{1/2}/2$ (RCG).
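The reference value $N^{1/2}/2$ can be checked directly, assuming that RCG denotes a random choice game in which every agent picks 0 or 1 independently with probability 1/2 (this reading of the abbreviation is an assumption). The attendance A is then binomial, and the sketch below computes its variance exactly; the function name `rcg_variance` is made up for the example.

```python
from fractions import Fraction
from math import comb

def rcg_variance(n_agents):
    """Exact <(A - N/2)^2> when each of N agents picks 0/1 with probability 1/2."""
    half = Fraction(n_agents, 2)
    total = Fraction(0)
    for a in range(n_agents + 1):
        prob = Fraction(comb(n_agents, a), 2 ** n_agents)   # binomial weight
        total += prob * (a - half) ** 2
    return total
```

The result is N/4, so the standard deviation is $N^{1/2}/2$, the threshold quoted above.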

The El Farol bar game on a social network
The Minority Game on Networks (MGoN)
Agents communicate among themselves.
Social network:
2 components:
1) Acquaintance (substrate) network: G (non-directed, less dynamic)
2) Action network: A (directed and dynamic), with $A \subseteq G$
Emergence of scale-free
leadership structure:
• Robust leadership hierarchy
$0 \le k_i^{out} \le k_i$
$N_k(N, m; p) \sim k^{-\beta} N_1(N, m; p)$
$N_1(N, m; p) \approx a(p)$
$N_k(N, m; p) \approx a(p)\, k^{-\beta} f_k(N, m; p)$
$f_k(N, m; p) \approx 1$, for $m \gg 1$
• RCG on the ER network
produces the scale-free
backbone of the leadership
structure
• The influence is evenly distributed
among all levels of the leadership
hierarchy.
[Figure panel: m=6]
• The followers make up most of the
population (over 90%) and their
number scales linearly with the total
number of agents.
• Structural unevenness appears in
the leadership structure for low trait
diversity.
Network Effects: Improved Market Efficiency
• A networked, low trait diversity system
is more effective as a collective
than a sophisticated group!
• Can we find/evolve networks/strategies
that achieve almost perfect volatility
given a group and their strategies
(or the social network on the group)?
In the limit $p \to 0$, $N \to \infty$, $z = Np = \text{const.}$, $z \gg 1$:
$R_N(l) \approx \frac{1}{zl}$, $1 \le l \le z = Np$
What are networks ?
Collection of discrete entities [nodes], which might be connected via links [edges]
representing interactions or associations between the connected elements.
Mathematical term for these objects: Graph
Typical notation: G(V, E), where V={1,2,…,N} is the set of nodes (vertices, sites)
and E is the set of edges.
An edge typically connects a pair of vertices x and y. It can also connect
more than two vertices, in which case it is called a hyperedge and the resulting graph
a hypergraph. For now we deal exclusively with simple graphs, where $E \subseteq V \times V$.
Typical notations for an edge:
$e = \{x, y\} = (x, y) = xy$
If there are several edges between two nodes, the graph is called a multigraph.
If the interaction or association is unidirectional, then this fact is expressed by
making $xy \ne yx$.
Such an edge $e = \overrightarrow{xy}$
is called a directed edge and the corresponding graph
a directed graph, or digraph for short.
Note:
$xy \in E$ does not imply $yx \in E$.
Both nodes and edges can have a number of associated properties or parameters,
called weights.
Graphs and weights can be time dependent.
Typical real-world graphs are the result of complex processes with stochastic
components, so it
makes sense to talk about Graph Ensembles and probabilistic
descriptions.
Representations:
Visual, geometric:
Abstract:
- e.g. with the adjacency matrix:
$A = \{a_{ij}\}_{N \times N}$, where $a_{ij} = 1$ if $ij \in E$ and $a_{ij} = 0$ if $ij \notin E$
- “expensive” representation,
requires $O(N^2)$ resources
- it is hard to simply recover
patterns/clusters from.
- sometimes advantageous for
analytical calculations
Finding clusters in networks: “community” detection.
More economical representations: adjacency lists.
- standard representation used in
algorithmic computations.
[Figure: adjacency list, with list heads and their neighbors]
Reading:
1) R. Sedgewick, “Algorithms in C++, Part 5: Graph Algorithms”, Addison-Wesley (2002).
2) Cormen et al., “Introduction to Algorithms”, The MIT Press (2001).
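The two abstract representations can be compared on a toy graph. The sketch below is only an illustration for undirected simple graphs; the function names and the example edge list are made up for the purpose.

```python
def adjacency_matrix(n, edges):
    """N x N matrix with a[i][j] = 1 if ij is an edge: O(N^2) storage."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        a[j][i] = 1          # undirected graph: the matrix is symmetric
    return a

def adjacency_list(n, edges):
    """One neighbor list per node (the "list heads"): O(N + |E|) storage."""
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    return neighbors

# Hypothetical 4-node example graph.
EDGES = [(0, 1), (0, 2), (2, 3)]
```

With the list form, visiting all neighbors of a node costs time proportional to its degree, while the matrix form needs a scan over a full row of length N.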
Where are Networks?
• Infrastructures: transportation nw-s (airports, highways, roads,
rail, water) energy transport nw-s (electric power, petroleum,
natural gas)
• Communications: telephone, microwave backbone, internet,
email, www, etc.
• Biology: protein-gene interactions, protein-protein interactions,
metabolic nw-s, cell-signaling nw-s, the food web, etc.
• Social Systems: acquaintance (friendship) nw-s, terrorist nw-s,
collaboration networks, epidemic networks, the sex-web
• Geology: river networks
Skitter data depicting a macroscopic snapshot of Internet connectivity, with selected backbone ISPs (Internet Service
Providers) colored separately. By K. C. Claffy, http://www.caida.org/Papers/Nae/
Biological Networks
R.J. Williams, N.D. Martinez Nature (2000)
Food Webs
trophic species
trophic interactions
Cellular Networks: The Bio-Map
Source: Barabasi et.al.
GENOME
Protein-gene
interactions
PROTEOME
Protein-Protein
interactions
METABOLISM
Bio-chemical
reactions
Citrate Cycle
Metabolic Networks
Chemicals
Bio-Chemical reactions
Biochemical Pathways - Metabolic Pathways, Source: ExPASy
The protein network
proteins
H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature
411, 41 (2001)
Binding
P. Uetz, et al. Nature 403, 623-7 (2000).
Social Networks
person
Social interaction, relation (friendship, etc.)
Acquaintance networks
The sex-web
Actor Networks
Collaboration Networks
person
common paper
More on social networks later…
(Newman, 2000, H. Jeong et al 2001)
How do we describe and study networks?
The party problem
What is the minimum number of people R(k) one should invite to a party so that there are
surely either k people who all know each other or k people none of whom know each
other?
For k=3:
R(3) = 6
[Figure: edge colors: know each other / do not know each other]
For k=4:
R(4) = 18 (hard proof)
For k=5:
R(5) = … NOT KNOWN!
Only the bounds are known:
$43 \le R(5) \le 49$.
Come on, use a computer!
We are looking for complete graphs with n nodes that have a monochromatic
complete subgraph of k nodes (k-clique). (Here k=5.)
There are $n(n-1)/2$ edges in a complete graph, so there are
$2^{n(n-1)/2}$ such graphs whose edges are either blue or red.
Since for k=3, R(3)=6, every n=6 node complete graph has a monochromatic
triangle.
n=6: $2^{n(n-1)/2} = 2^{15} = 32{,}768$
n=18: $2^{n(n-1)/2} = 2^{153} \approx 1.14 \times 10^{46}$
$43 \le n \le 49$: between $2^{903}$ and $2^{1176}$ graphs.
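For k=3 the exhaustive check is still feasible: the $2^{15}$ two-colorings of the complete graph on 6 nodes can be enumerated in about a second. The sketch below does exactly that; it is a brute-force illustration with made-up function names, and it does not scale to k=5.

```python
from itertools import combinations

def has_mono_clique(color, n, k):
    """True if some k nodes have all their pairwise edges in one color."""
    for nodes in combinations(range(n), k):
        pair_colors = {color[pair] for pair in combinations(nodes, 2)}
        if len(pair_colors) == 1:
            return True
    return False

def every_coloring_has_mono_clique(n, k):
    """Enumerate all 2^(n(n-1)/2) red/blue colorings of the complete graph."""
    edges = list(combinations(range(n), 2))
    for mask in range(2 ** len(edges)):
        # bit idx of mask is the color (0 or 1) of edge number idx
        color = {e: (mask >> idx) & 1 for idx, e in enumerate(edges)}
        if not has_mono_clique(color, n, k):
            return False      # found a coloring with no monochromatic k-clique
    return True
```

Calling `every_coloring_has_mono_clique(6, 3)` and `every_coloring_has_mono_clique(5, 3)` tests the two halves of the statement R(3)=6: six guests always suffice, five do not.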
Operating at the physical limits of computation (as
determined by the Planck constant, the speed of
light and the gravitational constant), the 1 kg laptop
of Seth Lloyd performs
$f = 5.4218 \times 10^{50}$ operations per second.
To check all graphs for monochromatic complete
subgraphs takes at least
$2^{n(n-1)/2} / f$ seconds $= 2^{n(n-1)/2 - 193.44}$ years.
Or, for k=5 it would take at least $2.693 \times 10^{213}$ years!
The age of the universe is estimated to be:
$(1.1\text{-}2) \times 10^{10}$ yrs!
S. Lloyd, “Ultimate Physical Limits to
Computation”, Nature, 406, 1047
(2000).
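The exponent 193.44 follows from converting operations per second into operations per year. The sketch below redoes that arithmetic in logarithms, since the numbers themselves overflow ordinary floats. The year length of 365.25 days and the function names are assumptions of the example.

```python
import math

OPS_PER_SECOND = 5.4218e50               # f quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # assumption: Julian year

def log2_ops_per_year():
    """log2 of the number of operations performed in one year."""
    return math.log2(OPS_PER_SECOND * SECONDS_PER_YEAR)

def log10_years_to_enumerate(n):
    """log10 of the years needed for one operation per 2-colored graph on n nodes."""
    n_edges = n * (n - 1) // 2
    return (n_edges - log2_ops_per_year()) * math.log10(2)
```

For n=43 this gives a time between $10^{213}$ and $10^{214}$ years. The quoted $2.693 \times 10^{213}$ equals $2^{709}$, that is, the exponent 903 - 193.44 rounded down to an integer, which is consistent with the "at least" in the text.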
Probabilistic ensemble approach.
Structural properties: degree distributions and the
scale-free character
Node degree: number of neighbors
[Figure: node i with degree $k_i = 5$]
Degree distribution, P(k): fraction of nodes whose degree is k (a histogram over
the $k_i$-s).
Observation: networks found in Nature and human-made networks are in many cases
“scale-free” (power-law) networks:
$P(k) \sim k^{-\gamma}$
For the sake of definitions:
The Erdős-Rényi Random Graph (also called the binomial random
graph)
$G_{N,p}(V, E)$
• Consider N nodes (dots).
• Take every pair (i,j) of nodes and connect them with an edge with
probability p.
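The two construction steps translate directly into code. The sketch below is a plain $O(N^2)$ generator meant only as an illustration; the function names and the neighbor-set representation are choices made for the example.

```python
import random

def erdos_renyi(n, p, seed=None):
    """Return G(N,p) as a list of neighbor sets."""
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):        # every unordered pair exactly once
            if rng.random() < p:         # connect with probability p
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def mean_degree(neighbors):
    """Average number of links incident on a node."""
    return sum(len(s) for s in neighbors) / len(neighbors)
```

On a sample such as `erdos_renyi(400, 0.05, seed=1)` the mean degree should come out close to p(N-1), up to statistical fluctuations.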
The Erdős-Rényi random graph (continued)
GN,p is a graph with N vertices and link-probability p (the probability that two arbitrarily
chosen vertices are connected by an edge).
Average nr. of links incident on a node: $\lambda = p(N-1)$. Clustering coefficient: $C = p \approx \lambda / N$.
The probability of a node having exactly k incident edges is:
$P(k) = \binom{N-1}{k} p^k (1-p)^{N-1-k}$
If $X_k$ denotes the number of nodes in an instance of $G_{N,p}$ with degree k, its distribution is not
given exactly by P(k)! There are correlations induced by the fact that an edge is shared by two nodes.
It is however asymptotically correct (Bollobás).
In the limit of $N \to \infty$ and $p \to 0$ such that
$\lambda = pN = \text{const.}$:
$P(k) = e^{-\lambda} \frac{\lambda^k}{k!}$ (Poisson)
Since
$\langle k \rangle = \sum_k k P(k) = \lambda$
and
the width is: $\sigma = \sqrt{\lambda}$
the Binomial Random Graph has a
characteristic scale given by $\lambda = \langle k \rangle$.
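The passage from the binomial to the Poisson form is the standard limit argument. A short sketch, writing $\lambda$ for the fixed average degree and setting $p = \lambda/N$ (the symbol $\lambda$ is an assumption where the transcript lost the original glyph):

```latex
\begin{align*}
P(k) &= \binom{N-1}{k}\, p^{k} (1-p)^{N-1-k}, \qquad p = \frac{\lambda}{N} \\
     &= \frac{(N-1)(N-2)\cdots(N-k)}{N^{k}} \;\frac{\lambda^{k}}{k!}\;
        \left(1 - \frac{\lambda}{N}\right)^{N-1-k} \\
     &\xrightarrow[N \to \infty]{} \; \frac{\lambda^{k}}{k!}\, e^{-\lambda}
\end{align*}
```

For fixed k the first factor tends to 1 and the last one to $e^{-\lambda}$, which leaves the Poisson distribution with mean and variance both equal to $\lambda$.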

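Can graphs
with the same P(k) be very different?
Other graph measures: Clustering or transitivity
[Figure: triangle of nodes A, B, C, annotated “Very likely!”]
$C_i = \frac{n_i}{k_i (k_i - 1)/2}$
where $n_i$ is the number of links among the $k_i$ neighbors of node i.
Clustering distribution:
$C(k) = \frac{1}{N(k)} \sum_{i=1}^{N} C_i\, \delta_{k_i, k}$
Average clustering coefficient:
$C = \langle C_i \rangle$
[Figure: node i with $k_i = 5$, $n_i = 3$, so $C_i = 0.3$]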
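The local clustering coefficient is easy to compute from neighbor sets. The sketch below reproduces the example from the figure ($k_i = 5$, $n_i = 3$); the graph `EXAMPLE` is made up to match those numbers, and returning 0 for nodes with fewer than two neighbors is a convention chosen here, not something stated on the slide.

```python
from itertools import combinations

def local_clustering(neighbors, i):
    """C_i = n_i / [k_i (k_i - 1) / 2] for node i of an undirected graph."""
    k = len(neighbors[i])
    if k < 2:
        return 0.0                       # convention assumed for k_i < 2
    # n_i: number of links among the neighbors of i
    n_i = sum(1 for a, b in combinations(neighbors[i], 2) if b in neighbors[a])
    return n_i / (k * (k - 1) / 2)

# Hypothetical graph: node 0 has neighbors 1..5, with links 1-2, 2-3 and 4-5.
EXAMPLE = {
    0: {1, 2, 3, 4, 5},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
    4: {0, 5},
    5: {0, 4},
}
```

Here `local_clustering(EXAMPLE, 0)` evaluates 3 / 10.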
Random Geometric Graphs
$0 < R \le 1$
$d(\vec{r}_1, \vec{r}_2) \le R$
Continuum percolation
Average degree: $\alpha(d) = \dfrac{R^d \pi^{d/2} N}{\Gamma(1 + d/2)}$
$\alpha_c(2) = 4.52 \pm 0.01$
$\alpha_c(d) = \alpha_c(\infty) + A d^{-\gamma}$
$\alpha_c(\infty) = 1$,
$\gamma = 1.74(2)$,
$A = 11.78(5)$
Degree distribution is Poisson
Clustering coefficient:
$C_d = 1 - H_d(1)$ for even d, $C_d = 3/2 - H_d(1/2)$ for odd d
$H_d(x) = \frac{1}{\sqrt{\pi}} \sum_{i=x}^{d/2} \frac{\Gamma(i)}{\Gamma(i + 1/2)} \left(\frac{3}{4}\right)^{i + 1/2}$
$C_2 = 1 - \frac{3\sqrt{3}}{4\pi} = 0.5865\ldots$
J.Dall, M. Christensen, PRE 66, 016121 (2002)
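The clustering formula can be evaluated numerically as a check on the reconstruction of the sum. The sketch below assumes that the index i runs from x to d/2 in unit steps (so over half-integers for odd d); the function names are made up for the example.

```python
import math

def h_sum(d, x):
    """H_d(x) = (1/sqrt(pi)) * sum_{i=x}^{d/2} Gamma(i)/Gamma(i+1/2) * (3/4)^(i+1/2)."""
    total = 0.0
    i = x
    while i <= d / 2 + 1e-9:             # unit steps: x, x+1, ..., d/2
        total += math.gamma(i) / math.gamma(i + 0.5) * 0.75 ** (i + 0.5)
        i += 1
    return total / math.sqrt(math.pi)

def rgg_clustering(d):
    """Clustering coefficient C_d of a d-dimensional random geometric graph."""
    if d % 2 == 0:
        return 1.0 - h_sum(d, 1)
    return 1.5 - h_sum(d, 0.5)
```

For d=2 the sum has a single term and reduces to $1 - 3\sqrt{3}/(4\pi)$, in agreement with the value 0.5865... quoted above.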
What is scale-free?
Poisson distribution
Power-law distribution
$\lambda = \langle k \rangle$
Erdős-Rényi Graph
Non-Scale-free Network
Capacity achieving degree distribution of Tornado
code. The decay exponent -2.02.
M. Luby, M. Mitzenmacher, M.A. Shokrollahi,
D. Spielman and V. Stemann, in Proc. 29th
ACM Symp. Theor. Comp. pg. 150 (1997).
Scale-free Network
Science citations
www, out- and in- link distributions
Archaea
Bacteria
Sex-web
Internet, router level
Bacteria
Eukaryotes
Metabolic network
Eukaryotes
Scale-free Networks: Coincidence or Universality?
• No obvious universal mechanism identified
• As a matter of fact, we claim that there is none (no universal mechanism, that is).
• Instead, our statement is that at least for a large class of networks (to be specified)
network structural evolution is governed by a selection principle which is closely tied to
the global efficiency of transport and flow processing by these structures, and
• Whatever the specific mechanism, it is such as to obey this selection principle.
Need to define first a flow process on these networks.
Z. Toroczkai and K.E. Bassler, “Jamming is Limited in Scale-free Networks”,
Nature, 428, 716 (2004)
Z. Toroczkai, B. Kozma, K.E. Bassler, N.W. Hengartner and G. Korniss
“Gradient Networks”, http://www.arxiv.org/cond-mat/0408262
Gradient Networks
Gradients of a scalar (temperature, concentration, potential, etc.) induce flows (heat,
particles, currents, etc.).
Naturally, gradients will induce flows on networks as well.
Ex.:
Load balancing in parallel computation and packet routing on the internet
Y. Rabani, A. Sinclair and R. Wanka, Proc. 39th Symp. On Foundations of Computer
Science (FOCS), 1998: “Local Divergence of Markov Chains and the Analysis of
Iterative Load-balancing Schemes”
Setup:
Let G=G(V,E) be an undirected graph, which we call the substrate network.
The vertex set:
$V = \{x_0, x_1, \ldots, x_{N-1}\} \equiv \{0, 1, 2, \ldots, N-1\}$
The edge set:
$E \subseteq V \times V$, $e \in E$, $e = x_i x_j \equiv (i, j)$, $xx \notin E$ (no self-loops)
A simple representation of E is via the N × N adjacency (or incidence) matrix A:
$A(x_i, x_j) = a_{ij} = 1$ if $(i, j) \in E$, and $0$ if $(i, j) \notin E$   (1)
Let us consider a scalar field
$\{h\}: V \to \mathbb{R}$
Set of nearest neighbor nodes on G of i:
$S_i^{(1)}$
Definition 1
The gradient $\nabla h(i)$ of the field {h} in node i is a directed edge:
$\nabla h(i) = (i, \mu(i))$   (2)
which points from i to that nearest neighbor $\mu \in S_i^{(1)} \cup \{i\}$ on G for which the increase in the
scalar is the largest, i.e.:
$\mu(i) = \arg\max_{j \in S_i^{(1)} \cup \{i\}} (h_j)$   (3)
The weight associated with edge $(i, \mu)$ is given by:
$|\nabla h(i)| = h_\mu - h_i$
If $\mu(i) = i$ then $\nabla h(i) = (i, i) \equiv 0(i)$.
The self-loop $0(i)$ is a loop through i
with zero weight.
Definition 2
The set F of directed gradient edges on G together with the vertex set V forms
the gradient network:
$\nabla G = \nabla G(V, F)$
If (3) admits more than one solution, then the gradient in i is degenerate.
In the following we will only consider scalar fields with non-degenerate gradients. This means:
$\text{Prob}\{h_i = h_j \text{ if } (i, j) \in E\} = 0$
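Theorem 1
Non-degenerate gradient networks form forests.
Proof (sketch): every node has exactly one outgoing gradient edge, so any cycle would have to be a directed cycle. The scalar strictly increases along every gradient edge that is not a self-loop, so no directed cycle can close. Apart from the zero-weight self-loops, the gradient network is therefore a forest.
Theorem 2
The number of trees in this forest = number of local maxima of {h} on G.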
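Definitions 1 and 2 and the two theorems can be tried out on a small random instance. The sketch below builds a substrate, assigns a non-degenerate scalar field and follows the gradient edges to their roots. It is an illustration only: the function names are invented, and the field is taken to be a random permutation of ranks, which is an assumption that guarantees distinct values (only the ordering of the scalars matters for the gradient network).

```python
import random

def random_substrate(n, p, rng):
    """Erdos-Renyi substrate G(N,p) as a list of neighbor sets."""
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def random_field(n, rng):
    """Non-degenerate scalar field: a random permutation of n distinct values."""
    return [rank / n for rank in rng.sample(range(n), n)]

def gradient_edges(neighbors, h):
    """mu[i] = arg max of h over the neighbors of i together with i itself."""
    return [max(neighbors[i] | {i}, key=lambda j: h[j])
            for i in range(len(neighbors))]

def root_of(mu, i):
    """Follow gradient edges from i until a self-loop (local maximum) is reached."""
    while mu[i] != i:
        i = mu[i]
    return i
```

The walk in `root_of` always terminates because h strictly increases along every non-loop gradient edge. Collecting the roots of all nodes gives one root per tree, which can be compared with the set of local maxima of h on the substrate.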
[Figure: example network with a scalar value at each node]
In-degree distribution of the Gradient Network when G=GN,p . A
combinatorial derivation
Version: Balazs Kozma (RPI)
Assume that the scalar values at the nodes are i.i.d.,
according to some distribution $\eta(h)$.
First, distribute the scalars on the node set V, then
find those link configurations which contribute to
R(l) when building the GN,p graph.
Without restricting the generality, calculate R(l) for
node 0.
Consider the set of nodes with the property
$h_j > h_0$
Let the number of elements in this set be n, and the set be
denoted by [n].
The complementary set of [n] in
V\{0} is:
$C[n]$
In order to have exactly l nodes pointing their gradient edges into 0:
• they have to be connected to node 0 on the substrate AND
• they must NOT be connected to the set [n]
For l nodes:
$\left[p(1-p)^n\right]^l$
Also need to require that no other nodes will be pointing their gradient directions into node 0
(obviously none of the [n] will):
$\left[1 - p(1-p)^n\right]^{N-1-l-n}$
So, for a fixed $h_0$ and a specific set [n]:
$\binom{N-1-n}{l} \left[p(1-p)^n\right]^l \left[1 - p(1-p)^n\right]^{N-1-l-n}$
Denote by Qn the probability for such an event for a given n while letting h-s vary according
to their distribution.
For one node to have its scalar larger than $h_0$:
$\gamma(h_0) = \int_{h_0} dh\, \eta(h)$
For exactly n nodes:
$\left[\gamma(h_0)\right]^n \left[1 - \gamma(h_0)\right]^{N-1-n}$
Thus:
$Q_n = \binom{N-1}{n} \int dh_0\, \eta(h_0) \left[\gamma(h_0)\right]^n \left[1 - \gamma(h_0)\right]^{N-1-n} = \frac{1}{N}$
Combining:
$R_N(l) = \sum_{n=0}^{N-1} Q_n \binom{N-1-n}{l} \left[p(1-p)^n\right]^l \left[1 - p(1-p)^n\right]^{N-1-n-l}$
Finally:
$R_N(l) = \frac{1}{N} \sum_{n=0}^{N-1} \binom{N-1-n}{l} \left[p(1-p)^n\right]^l \left[1 - p(1-p)^n\right]^{N-1-n-l}$
Independent of $\eta$.
In the limit $p \to 0$, $N \to \infty$, $z = Np = \text{const.}$, $z \gg 1$:
$R_N(l) \approx \frac{1}{zl}$, $1 \le l \le z = Np$
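The finite-N sum for the in-degree distribution is cheap to evaluate numerically. The sketch below implements it with n running from 0 to N-1 and a binomial coefficient $\binom{N-1-n}{l}$, as in the derivation; the function name is made up, and the direct float arithmetic is only meant for moderate N (up to roughly a thousand nodes).

```python
from math import comb

def in_degree_distribution(n_nodes, p, l):
    """R_N(l): probability that a node of the gradient network on G(N,p) has in-degree l."""
    total = 0.0
    for n in range(n_nodes):             # n = number of nodes with larger scalar
        m = n_nodes - 1 - n              # nodes that could point to node 0
        if l > m:
            continue                     # not enough candidate followers
        q = p * (1 - p) ** n             # linked to node 0 and to none of [n]
        total += comb(m, l) * q ** l * (1 - q) ** (m - l)
    return total / n_nodes
```

Since each n contributes a complete binomial distribution over l, the values sum to one over l = 0, ..., N-1. Tabulating `in_degree_distribution(N, z / N, l)` for large N and comparing with 1/(zl) is a direct way to explore the scaling limit.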
What happens when the substrate is a scale-free
network?
Gradient Networks and Transport Efficiency
- every node has exactly one out-link (one gradient direction) but it can have more
than one in-link (the followers)
- the gradient network has N nodes and N out-links. So the number of “out-streams”
is $N_{send} = N$
- the number of RECEIVERS is $N_{receive} = \sum_{l \ge 1} N_l^{(in)}$
$J = 1 - \left\langle \left\langle \frac{N_{receive}}{N_{send}} \right\rangle_h \right\rangle_G = 1 - \left\langle \left\langle \frac{\sum_{l \ge 1} N_l^{(in)}}{N} \right\rangle_h \right\rangle_G = \left\langle \left\langle \frac{N_0^{(in)}}{N} \right\rangle_h \right\rangle_G = R_N(0)$
- J is a congestion (pressure) characteristic.
- $0 \le J \le 1$. J=0: minimum congestion, J=1: maximum congestion
$J_{G_{N,p}}(N, p) = \frac{1}{N} \sum_{n=0}^{N-1} \left[1 - p(1-p)^n\right]^{N-1-n}$
In the scaling limit $p = \text{const.}$, $N \to \infty$:
$J_{G_{N,p}}(N, p) = 1 - \frac{\ln N}{N \ln\left(\frac{1}{1-p}\right)} \left[1 + O\left(\frac{1}{N}\right)\right] \to 1$
- for large networks we get maximal congestion!
In the scaling limit $p \to 0$, $N \to \infty$, $pN = z$:
$J_{G_{N,p}}(N, p) = \int_0^1 dx\, e^{-z e^{-zx}} = \frac{1}{z} \left[\mathrm{Ei}(-z) - \mathrm{Ei}(-z e^{-z})\right]$
$J_{G_{N,p}}(N, p) = 1 - \frac{\ln z + C}{z} + \ldots \to 1$ for $z \gg 1$
- becomes congested for large average degree.
- For scale-free structures, the congestion factor becomes independent of the
system (network) size!!
For LARGE and growing networks, where the conductance of edges is the same, and
the flow is generated by gradients, scale-free networks are more likely to be
selected during network evolution than scaled structures.
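The congestion factor for the random-graph substrate can be evaluated from the finite sum, taking $J = R_N(0)$ with n running from 0 to N-1. The sketch below does this and also codes the large-z expansion. The function names are invented, and reading the constant C as the Euler-Mascheroni constant is an assumption (it is what the expansion of the exponential integral yields).

```python
import math

EULER_GAMMA = 0.5772156649015329   # assumed meaning of the constant C

def congestion_factor(n_nodes, p):
    """J = R_N(0) = (1/N) * sum_{n=0}^{N-1} [1 - p(1-p)^n]^(N-1-n)."""
    total = sum((1 - p * (1 - p) ** n) ** (n_nodes - 1 - n)
                for n in range(n_nodes))
    return total / n_nodes

def large_z_asymptote(z):
    """Leading terms 1 - (ln z + C)/z of the scaling-limit expression."""
    return 1 - (math.log(z) + EULER_GAMMA) / z
```

Evaluating `congestion_factor(N, p)` at fixed p for growing N shows the approach to maximal congestion, while `congestion_factor(N, z / N)` at fixed z can be set against `large_z_asymptote(z)`.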
Gradient Networks Tend to be Power-Law
The Configuration model
A. Clauset, C. Moore, E. Lopez, E. Ravasz, Z.T., to be published.
Generating functions:
$g(z) = \sum_k P(k)\, z^k$
$R(z) = \int_0^1 dx\; g\left(1 - \frac{x\, g'(x)}{g'(1)} (1 - z)\right)$
K-th Power of a Ring
$R^{(2K)}(l) = \dfrac{4\,(3 + 9K + 4K^2 + 2Kl)}{(2K+l)(2K+l+1)(2K+l+2)(2K+l+3)}$, for $1 \le l \le K-1$
$R^{(2K)}(l) = \dfrac{6\,(2 + 7K + 7K^2)}{3K(3K+1)(3K+2)(3K+3)}$, for $l = K$
$R^{(2K)}(l) = \dfrac{4\,(2K+1)}{(2K+l+1)(2K+l+2)(2K+l+3)}$, for $K+1 \le l \le 2K-1$
$R^{(2K)}(l) = \dfrac{1}{4K+1}$, for $l = 2K$
Power law with exponent $-3$ in the variable $2K+l$.
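The piecewise formula can be sanity-checked with exact rational arithmetic. The signs inside the expressions were lost in the transcript; the sketch below assumes they are all "+" and tests that choice against a simple identity: every node that is not a local maximum sends exactly one gradient edge to another node, and a node is the maximum of its 2K+1 neighborhood with probability 1/(2K+1), so the mean in-degree (self-loops not counted, which is an assumption) must equal 1 - 1/(2K+1). Function names are made up for the example.

```python
from fractions import Fraction

def ring_in_degree(K, l):
    """R^(2K)(l) for 1 <= l <= 2K, with all lost signs taken as '+' (assumption)."""
    s = 2 * K + l
    if 1 <= l <= K - 1:
        return Fraction(4 * (3 + 9 * K + 4 * K * K + 2 * K * l),
                        s * (s + 1) * (s + 2) * (s + 3))
    if l == K:
        return Fraction(6 * (2 + 7 * K + 7 * K * K),
                        3 * K * (3 * K + 1) * (3 * K + 2) * (3 * K + 3))
    if K + 1 <= l <= 2 * K - 1:
        return Fraction(4 * (2 * K + 1), (s + 1) * (s + 2) * (s + 3))
    if l == 2 * K:
        return Fraction(1, 4 * K + 1)
    raise ValueError("l must satisfy 1 <= l <= 2K")

def mean_in_degree(K):
    """Sum over l of l * R(l); R(0) does not contribute."""
    return sum(l * ring_in_degree(K, l) for l in range(1, 2 * K + 1))
```

With this sign choice the identity holds exactly for small K, for example `mean_in_degree(2)` equals 4/5.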
So far we have looked at uncorrelated scalar fields.
What happens if the numbers (scalars) sitting at the nodes are correlated,
and in particular if they are correlated to the local network neighborhood
properties of the node?
Typically still scale-free behavior (large system limit) but with a different
exponent.
Coming up as an example for correlated gradient networks:
Protein Folding Pathways , see Erzsebet Ravasz’s talk !
Take home message:
The scale-free character observed so widely in diverse systems might be
due to a global tendency of distributed systems to improve their
performance.