Biological networks and statistical physics
Diego Garlaschelli
Dipartimento di Fisica, Università di Siena, ITALY
Said Business School, University of Oxford, UK
BioPhys09, Arcidosso, ITALY
Biological networks:
from cells to ecosystems
Metabolic networks
Vertices = cellular substrates (products or educts)
Links = biochemical reactions (enzyme-mediated)
[Figure: part of E. coli’s metabolic network, with educts, products, enzymes and complexes labelled]
Protein-protein interaction networks
Vertices = proteins
Links = interactions within the cell
Neural networks
Vertices = neurons
Links = synapses
[Figures: a single neuron; a web of synaptic connections]
Vascular networks
Vertices = tissues
Links = blood vessels
Ecological networks (food webs)
Vertices = coexisting species
Links = predator-prey interactions
Real networks versus regular graphs
Protein-protein interaction network
(Saccharomyces cerevisiae)
Regular graphs
Two problems:
1) characterization of network structure (and complexity)
2) network modelling
Graph Theory
Undirected Graph
Directed Graph
“Graph”≡ G(V,E)
V: N vertices
E: L links
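Adjacency matrix:
Undirected graph: $a_{ij} = a_{ji} = 1$ if vertices i and j are linked, $0$ otherwise
Directed graph: $a_{ij} = 1$ if a directed link joins i and j, $0$ otherwise (in general $a_{ij} \neq a_{ji}$)
Degree (number of links) of vertex i:
$k_i = \sum_{j=1}^{N} a_{ij}$, with average $\langle k \rangle = 2L/N$
In-degree and out-degree in directed graphs:
$k_i^{in} = \sum_{j=1}^{N} a_{ij}$, $k_i^{out} = \sum_{j=1}^{N} a_{ji}$, $k_i = k_i^{in} + k_i^{out}$
Statistical distributions: $P(k)$, $P(k^{in})$, $P(k^{out})$
Average vertex-vertex distance:
$D = \langle d_{ij} \rangle$, where $d_{ij}$ = minimum distance between i and j
Clustering coefficient:
$C = \langle c_i \rangle$, where $c_i$ = (pairs connected to i and to each other) / (pairs connected to i)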
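The three quantities defined above (degree, clustering coefficient, average distance) can be computed directly from an adjacency matrix. The following is a minimal sketch for an undirected graph; the function names and the toy matrix are illustrative, and setting $c_i = 0$ for vertices with fewer than two neighbours is an assumption, not something the slides specify.

```python
from collections import deque

def degrees(a):
    """Degree k_i = sum_j a_ij of every vertex of an undirected graph."""
    return [sum(row) for row in a]

def clustering(a, i):
    """c_i = linked pairs of neighbours of i / all pairs of neighbours of i."""
    nb = [j for j, x in enumerate(a[i]) if x]
    pairs = len(nb) * (len(nb) - 1) // 2
    if pairs == 0:
        return 0.0  # assumption: vertices with fewer than 2 neighbours get c_i = 0
    linked = sum(a[u][v] for n, u in enumerate(nb) for v in nb[n + 1:])
    return linked / pairs

def distances_from(a, s):
    """Breadth-first search: minimum distance from s to every reachable vertex."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, x in enumerate(a[u]):
            if x and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_distance(a):
    """D = average of d_ij over all connected ordered pairs i != j."""
    total, count = 0, 0
    for s in range(len(a)):
        for t, d in distances_from(a, s).items():
            if t != s:
                total += d
                count += 1
    return total / count

# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
k = degrees(A)
C = sum(clustering(A, i) for i in range(len(A))) / len(A)
D = mean_distance(A)
```

For this toy graph the average degree equals 2L/N = 2, as expected from L = 4 links on N = 4 vertices.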
Small-world character of (most) real networks:
Short mean distance D:
“it’s a small world, after all!”
Efficient information transport
(and fast disease spreading too!)
Large clustering coefficient C:
“my friends are friends of each other”
High robustness
under vertex removal
Degree distribution in (most) real networks:
Power-law distribution
$P(k) \propto k^{-\gamma}$, $2 < \gamma < 3$
No characteristic scale (scale-free)!
Many poorly connected vertices
Few highly connected vertices
(a) Archaeoglobus fulgidus (archaeon);
(b) E. coli (bacterium);
(c) Caenorhabditis elegans (eukaryote);
(d) 43 different organisms together.
Finite-scale versus scale-free networks
Finite-scale networks:
P(k) decays exponentially
Scale-free networks:
P(k) decays as a power law
No vertex has a degree much
larger than the average value
Few vertices have a degree much
larger than the average value
[Figure: a random and a scale-free network, both with N=130 and L=215 (same average degree). Legend: the 5 vertices with largest degree (red); vertices connected to the red ones (27% in the random network, 60% in the scale-free one); other vertices]
RANDOM GRAPH model (Erdős, Rényi 1959)
● Start with a set of N isolated vertices;
● For each pair of vertices draw a link
with uniform probability p.
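$L \approx p \, N(N-1)/2$
$\langle k \rangle = 2L/N = p(N-1) \approx pN$
[Figure: realizations for p=0, p=0.1, p=0.5, p=1]
Degree distribution P(k):
$P(k) = e^{-pN} (pN)^k / k!$ (Poisson)
Average vertex-vertex distance:
$\langle k \rangle^{D} \approx N \;\Rightarrow\; D \approx \log N / \log \langle k \rangle$
Clustering coefficient:
$C = p \approx \langle k \rangle / N$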
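The construction rule and the relation between p and the average degree can be checked numerically. This is a minimal sketch using only the standard library; the values of N and p, the fixed seed and the helper names are illustrative choices.

```python
import math
import random

def random_graph(n, p, rng):
    """Link list of a random graph: each pair of vertices is linked with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def poisson(k, mean):
    """Poisson approximation of the degree distribution P(k), with mean pN."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

rng = random.Random(0)            # fixed seed, only for reproducibility
N, p = 400, 0.02
links = random_graph(N, p, rng)
L = len(links)
mean_k = 2 * L / N                # measured average degree, <k> = 2L/N
expected_k = p * (N - 1)          # predicted average degree, <k> = p(N-1)
```

With these parameters the measured and predicted average degrees should agree up to sampling fluctuations, which shrink as N grows.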
Connected components in random graphs
The interesting feature of the random graph model is the presence of a
critical probability $p_c$ marking the appearance of a giant cluster:
Percolation threshold: $p_c \approx 1/N$
When $p < p_c$ the network is made of many small clusters
and the cluster size distribution P(s) decays exponentially;
when $p > p_c$ there are few very small clusters and one giant one;
at $p = p_c$ the cluster size distribution has a power-law form: $P(s) \propto s^{-\tau}$
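The appearance of the giant cluster around $p \approx 1/N$ can be illustrated by measuring the largest connected component below and above the threshold. The sketch below uses a union-find structure; the values c = 0.3 and c = 3 (with p = c/N), the size N and the seed are arbitrary illustrative choices.

```python
import random
from collections import Counter

def cluster_sizes(n, links):
    """Sizes of the connected components, largest first, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i, j in links:
        parent[find(i)] = find(j)
    return sorted(Counter(find(x) for x in range(n)).values(), reverse=True)

def largest_fraction(n, c, rng):
    """Fraction of vertices in the largest cluster of a random graph with p = c/n."""
    p = c / n
    links = [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]
    return cluster_sizes(n, links)[0] / n

rng = random.Random(0)
N = 1000
below = largest_fraction(N, 0.3, rng)   # p well below 1/N: only small clusters
above = largest_fraction(N, 3.0, rng)   # p well above 1/N: one giant cluster
```

Below the threshold the largest cluster contains a vanishing fraction of the vertices, while above it the giant cluster spans most of the network.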
SMALL-WORLD model (Watts, Strogatz, Nature 1998)
● Start with a regular d-dimensional lattice, connected up to q nearest neighbours;
● With probability p, an end of each link is rewired to a new randomly chosen vertex.
p=0: Regular
0<p<1: Small-world
p=1: Random
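The two steps of the small-world construction can be sketched as follows. The sketch assumes a one-dimensional ring (d = 1) in which each vertex is linked to its q/2 nearest neighbours on each side, and it forbids self-loops and duplicate links when rewiring; these details are assumptions made for the illustration.

```python
import random

def small_world(n, q, p, rng):
    """Ring of n vertices, each linked to its q nearest neighbours (q even, n > q);
    one end of each link is then rewired with probability p."""
    links = {(i, (i + s) % n) for i in range(n) for s in range(1, q // 2 + 1)}
    adj = {i: set() for i in range(n)}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    for i, j in sorted(links):
        if rng.random() < p:
            # Candidate new endpoints: not i itself, not already linked to i.
            free = [v for v in range(n) if v != i and v not in adj[i]]
            if free:
                new = rng.choice(free)
                adj[i].remove(j)
                adj[j].remove(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj

rng = random.Random(0)
regular = small_world(100, 4, 0.0, rng)      # p = 0: regular lattice
intermediate = small_world(100, 4, 0.1, rng) # 0 < p < 1: small-world regime
random_like = small_world(100, 4, 1.0, rng)  # p = 1: random
```

Rewiring conserves the number of links, so all three networks have the same average degree q.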
Average distance and clustering coefficient
[Figure: C(p)/C(0) and D(p)/D(0) versus p, with the small-world regime marked]
Degree distribution
[Figure: P(k) versus k]
SCALE-FREE model (Barabási, Albert, Science 1999)
● Start with m0 vertices and no link;
● at each timestep add a new vertex with m links, connected to preexisting vertices chosen
randomly with probability proportional to their degree k (preferential attachment).
After a certain number of iterations, the degree distribution approaches a power-law distribution:
$P(k) \propto k^{-\gamma}$, $\gamma = 3$
Growth and preferential attachment are both necessary!
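Growth with preferential attachment can be sketched in a few lines by keeping a list in which each vertex appears once per link end, so that a uniform draw from the list is a draw proportional to degree. Since the rule is undefined while all degrees are zero, the sketch picks the very first targets uniformly; that choice, the parameter values and the function name are assumptions made for the illustration.

```python
import random

def scale_free(n, m0, m, rng):
    """Grow a network to n vertices; each new vertex brings m links (requires m <= m0)."""
    degree = [0] * m0
    stubs = []                        # vertex i appears degree[i] times
    links = []
    for new in range(m0, n):
        chosen = set()
        while len(chosen) < m:
            # Assumption: while no links exist yet, targets are picked uniformly.
            target = rng.choice(stubs) if stubs else rng.randrange(m0)
            chosen.add(target)
        degree.append(0)
        for t in chosen:
            links.append((new, t))
            degree[new] += 1
            degree[t] += 1
            stubs.extend((new, t))
    return degree, links

rng = random.Random(0)
degree, links = scale_free(2000, 3, 3, rng)
mean_k = sum(degree) / len(degree)
k_max = max(degree)                   # hubs: a few vertices far above the mean
```

The presence of a maximum degree far above the mean is the signature of the heavy-tailed distribution described above.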
FITNESS model (Caldarelli et al., Phys. Rev. Lett. 2002)
● Each vertex i is assigned a fitness value $x_i$ drawn from a given distribution r(x);
● A link is drawn between each pair of vertices i and j with probability $f(x_i, x_j)$
depending on $x_i$ and $x_j$.
Power-law degree distributions are obtained by choosing
$r(x) \propto x^{-\alpha}$, $f(x_i, x_j) \propto x_i x_j$
or
$r(x) = e^{-x}$, $f(x_i, x_j) = \theta(x_i + x_j - z)$ (step function)
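The fitness model is straightforward to simulate once r(x) and f are chosen. The sketch below uses the exponential fitness distribution with a threshold connection rule, i.e. two vertices are linked when the sum of their fitnesses reaches z; the values of N and z and the helper names are illustrative.

```python
import random

def fitness_graph(x, f, rng):
    """Link each pair (i, j) with probability f(x_i, x_j)."""
    n = len(x)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < f(x[i], x[j])]

def threshold_rule(z):
    """Step-function connection probability: 1 if x_i + x_j >= z, else 0."""
    return lambda a, b: 1.0 if a + b >= z else 0.0

rng = random.Random(1)
N, z = 300, 5.0
x = [rng.expovariate(1.0) for _ in range(N)]     # fitnesses drawn from r(x) = e^{-x}
links = fitness_graph(x, threshold_rule(z), rng)

degree = [0] * N
for i, j in links:
    degree[i] += 1
    degree[j] += 1
```

With the threshold rule the degree of a vertex is a non-decreasing function of its fitness, so the fittest vertices become the hubs without any growth or preferential attachment.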
Exponential random graphs
Reciprocity of directed networks
Link reciprocity: the problem
Do reciprocated links (pairs of mutual links between two vertices) occur more or
less often than expected by chance in a directed network?
Adjacency matrix (N×N):
[Figure: example directed network with 6 vertices and its adjacency matrix]
Important aspect of many networks:
Mutuality of relationships (friendship, acquaintance, etc.) in social networks
Reversibility of biochemical reactions in cellular networks
Symbiosis in food webs
Synonymy in word association networks
Economic/financial interdependence in trade/shareholding networks
…
Standard definition of reciprocity
Reciprocity = fraction of reciprocated links in the network: $r = L^{\leftrightarrow} / L$
$L$ = total number of directed links
$L^{\leftrightarrow}$ = number of reciprocated links
(Email and WWW)
(WTW)
A new definition of reciprocity
Conceptual problems with the standard definition:
- r is not an absolute quantity: it has to be compared to its value in a random network with the same link density
- as a consequence, networks with different density cannot be compared
- self-loops should be excluded when computing r and the link density
New definition of reciprocity: the correlation coefficient $\rho$ between reciprocal links
($\rho > 0$: reciprocal; $\rho = 0$: areciprocal; $\rho < 0$: antireciprocal),
avoiding the aforementioned problems.
D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
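Both definitions can be computed from a directed adjacency matrix. The sketch below assumes the closed form of the correlation coefficient between $a_{ij}$ and $a_{ji}$ taken over pairs $i \neq j$, which reduces to $(r - \bar{a})/(1 - \bar{a})$ with $\bar{a} = L/[N(N-1)]$ the link density; the symbol names and the two toy networks are illustrative.

```python
def reciprocity(a):
    """Return (r, rho) for a directed adjacency matrix a, ignoring self-loops."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    L = sum(a[i][j] for i, j in pairs)                 # directed links
    L_rec = sum(a[i][j] * a[j][i] for i, j in pairs)   # reciprocated links
    r = L_rec / L                                      # standard definition
    density = L / (n * (n - 1))                        # mean of a_ij over i != j
    rho = (r - density) / (1 - density)                # correlation coefficient
    return r, rho

# Two mutual pairs: 0 <-> 1 and 2 <-> 3 (every link is reciprocated).
mutual = [[0, 1, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 0]]
# Directed cycle 0 -> 1 -> 2 -> 3 -> 0 (no link is reciprocated).
cycle = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [1, 0, 0, 0]]
r_mutual, rho_mutual = reciprocity(mutual)
r_cycle, rho_cycle = reciprocity(cycle)
```

The two toy networks have the same density but opposite behaviour: the first is perfectly reciprocal, the second is antireciprocal with a negative correlation coefficient.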
Results: reciprocity classifies real networks
[Figure: reciprocity of real networks, grouped by class: WTW, WWW, Neural, Email, Words, Metabolic, Financial, Food Webs]
Size dependence of the reciprocity
Metabolic networks
Food Webs
World Trade Web
A general model of reciprocity
We introduce a multi-species formalism where reciprocated and non-reciprocated
links are regarded as two different ‘chemical species’,
each governed by the corresponding chemical potential.
Each species consists of ‘particles’ (links) distributed among the available ‘states’ (pairs of vertices).
Decomposition of the adjacency matrix and graph Hamiltonian:
see Garlaschelli and Loffredo, Physical Review E 73, 015101(R) (2006)
A general model of reciprocity
Grand Partition Function:
Grand Potential:
Occupation probabilities:
Conditional connection probability:
Models of weighted networks
Structural correlations in complex networks
In order to detect patterns in networks,
one needs (one or more) null model(s) as a reference.
A null model is obtained by fixing some topological constraint(s),
and generating a maximally random network consistent with them.
Examples of null models for unweighted networks:
-the random graph (Erdős-Rényi) model (number of links fixed),
-the configuration model (degree sequence fixed),
-etc.
Problem of structural correlations:
When a low-level constraint is fixed,
patterns may be generated at a higher level,
even if they do not signal ‘true’ high-level correlations.
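One common way to generate a maximally random network with a fixed degree sequence is repeated degree-preserving link swapping. The sketch below is one such procedure, offered as an illustration of the idea rather than as the method used in the talk; the function name and the rejection of swaps that would create self-loops or multiple links are assumptions.

```python
import random

def rewire(links, swaps, rng):
    """Randomize a simple undirected network while keeping every degree fixed."""
    links = [tuple(sorted(l)) for l in links]
    present = set(links)
    for _ in range(swaps):
        a, b = rng.sample(range(len(links)), 2)
        (i, j), (u, v) = links[a], links[b]
        if rng.random() < 0.5:
            u, v = v, u                       # pick one of the two possible swaps
        new1, new2 = tuple(sorted((i, v))), tuple(sorted((u, j)))
        # Reject swaps that would create self-loops or multiple links.
        if i == v or u == j or new1 in present or new2 in present:
            continue
        present -= {links[a], links[b]}
        present |= {new1, new2}
        links[a], links[b] = new1, new2
    return links

def degree_sequence(n, links):
    deg = [0] * n
    for i, j in links:
        deg[i] += 1
        deg[j] += 1
    return deg

rng = random.Random(0)
N = 60
original = [(i, j) for i in range(N) for j in range(i + 1, N) if rng.random() < 0.1]
randomized = rewire(original, 2000, rng)
```

Any pattern that survives in the randomized network, such as a correlation between the degrees of neighbouring vertices, is then a structural consequence of the degree sequence alone.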
The (solved) problem for unweighted networks
Problem: specifying the degree sequence alone
generates anticorrelations between the average nearest-neighbour degree $k^{nn}_i$ and $k_i$ (disassortativity)
and between $c_i$ and $k_i$ (hierarchy) (Maslov et al.).
Solution: in unweighted networks, structural correlations can be fully
characterized analytically in terms of exponential random graphs,
which give the correct prediction (Park & Newman).
Some null models for weighted networks
Model 1: Global weight reshuffling (fixed topology)
Model 2: Global weight & tie reshuffling (fixed degrees)
Model 3: Local weighted rewiring (fixed strengths)
Model 4: Local weighted rewiring (fixed strengths and degrees)
Is it possible to characterize these models analytically?
Exponential formulation of the four null models
Model 1: Global weight reshuffling (fixed topology)
Model 2: Global weight & tie reshuffling (fixed degrees)
Model 3: Local weighted rewiring (fixed strengths)
Model 4: Local weighted rewiring (fixed strengths and degrees)
Note: H1, H2, H3 and H4 are particular cases of:
Analytic solution of the general null model:
Solution: the probability of a link of weight w between i and j is
Models 1 and 2 (global weight reshuffling):
Fermionic correlations
The expectations
are confirmed, however
implies
This means that weighted measures (except the disparity)
display a satisfactory behaviour under these null models
(but they inherit purely topological correlations!)
Model 3 (fixed strength): Bosonic correlations
Now all weighted measures are uninformative!
Model 4 (fixed strength+degree):
mixed Bose-Fermi statistics
We still have
as in model 3:
All weighted measures are uninformative in this case too!
Particular case:
the Weighted Random Graph (WRG) model
See a Mathematica demonstration of the model (by T. Squartini) at:
http://demonstrations.wolfram.com/WeightedRandomGraph/
Largest connected component in the WRG
after weak (+) and strong (-) edge removal
Clustering coefficient in the WRG
after weak (+) and strong (-) edge removal
Food webs
Networks of predation relationships among N biological species
i → j: i is eaten by j
Peculiar (problematic?) aspects of food webs
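[Figure: cumulative degree distribution $P_>(k')$ versus the rescaled degree $k' = k/\langle k \rangle$] Not scale-free!
[Figure: clustering ratio $C/C_{random}$ versus N, with reference lines $C/C_{random} \propto N$ and $C/C_{random} = 1$] Not small-world!
The connectance $c = L/N^2$ varies across different webs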
(fraction of directed links out of the total possible ones)
Only property similar to other networks: small distance D
Dunne, Williams, Martinez Proc. Natl. Acad. Sci. USA 2002
A modest proposal: food webs as transportation networks
Resource transfer along each food chain:
Flux of matter and energy from prey to predators,
in more and more complex forms: directionality
Species ultimately feed on the abiotic resources
(light, water, chemicals): connectedness
Almost 10% of the resources are transferred
from the prey to the predator: energy dispersion
Minimum-energy subgraphs: minimum spanning trees
Minimum spanning trees can be obtained as zero-temperature ensembles,
where $\ell_i$ is the trophic level (shortest distance to abiotic resources) of species i
Spanning trees and allometric scaling
Structure minimizing each species’ distance from the “environment vertex”
[Figure: example food web and its spanning tree, with vertices arranged by trophic level ℓ and labelled by $A_i$ and $C_i$; plot of C(A) versus A]
Trophic level ℓ of a species i: minimum distance from the environment to i.
Spanning tree: all links from a species at level ℓ to species at levels ℓ’ ≤ ℓ are removed.
Allometric relations: $C_i(A_i) \to C(A)$
Power-law scaling: $C(A) \propto A^{\eta}$
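The trophic levels and the link-removal rule can be sketched with a breadth-first search from the environment vertex. In the sketch below links point from prey to predator; the toy web, the species names and the function names are invented for the illustration.

```python
from collections import deque

def trophic_levels(prey_to_predators, source):
    """Breadth-first search from the environment: level = minimum distance."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in prey_to_predators.get(u, ()):
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level

def forward_links(prey_to_predators, level):
    """Drop every link from level l to a level l' <= l; keep the others."""
    return [(u, v) for u, vs in prey_to_predators.items() for v in vs
            if level[v] > level[u]]

# Hypothetical toy web (links: prey -> predator).
web = {
    "env": ["plant"],
    "plant": ["herbivore", "omnivore"],
    "herbivore": ["omnivore", "carnivore"],
    "omnivore": ["carnivore"],
    "carnivore": ["carnivore"],          # cannibalism: a loop within one level
}
level = trophic_levels(web, "env")
kept = forward_links(web, level)
```

The surviving links may still leave a species with several prey one level below it (here the carnivore); choosing one incoming link per species then gives a spanning tree.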
Allometric scaling in river networks
$C(A) \propto A^{\eta}$
η = 3/2
Ai = drainage area of site i
Ci = water in the basin of i
Banavar, Maritan, Rinaldo Nature 1999
Allometric scaling in vascular systems
$C(A) \propto A^{\eta}$
$A_0$ = metabolic rate (B)
$C_0$ = nutrient volume (M)
Kleiber’s law of metabolism: $B(M) \propto M^{3/4}$, i.e. $\eta = 4/3$
General case (dimension d): $\eta = (d+1)/d$ (maximum efficiency)
West, Brown, Enquist Science 1999; Banavar, Maritan, Rinaldo Nature 1999
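As a short worked check of how the allometric exponent relates to Kleiber’s law, under the identification of A with the metabolic rate B and of C with the nutrient volume M stated above:

```latex
C \propto A^{\eta}, \qquad \eta = \frac{d+1}{d}
\;\Longrightarrow\;
M \propto B^{(d+1)/d}
\;\Longrightarrow\;
B \propto M^{d/(d+1)} .
% d = 3: \eta = 4/3 and B \propto M^{3/4} (Kleiber's law)
% d = 2: \eta = 3/2
```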
Allometric scaling in food webs
The resource transfer is universal and efficient (common organising principle?)
$C(A) \propto A^{\eta}$
η = 1.13-1.16
Garlaschelli, Caldarelli, Pietronero Nature 423, 165-168 (2003)
Transport efficiency in food webs
The constraint limiting the efficiency is not the geometry, but the competition!
chain: $C(A) \propto A^{2}$ (inefficient)
competition: $C(A) \propto A^{\eta}$, $1 < \eta < 2$
star: $C(A) \propto A$ (efficient)
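The two limiting cases can be verified on small trees. The sketch below assumes, by analogy with the river-network quantities, that $A_i$ is the number of vertices in the branch rooted at i (including i) and that $C_i$ is the sum of $A_j$ over that branch; these definitions and the function name are assumptions of the illustration.

```python
def allometry(children, root):
    """Return dicts (A, C): A_i = size of the branch rooted at i,
    C_i = sum of A_j over that branch."""
    A, C = {}, {}
    order, stack = [], [root]
    while stack:                          # depth-first visit, parents before children
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, ()))
    for u in reversed(order):             # children are processed before parents
        kids = children.get(u, ())
        A[u] = 1 + sum(A[v] for v in kids)
        C[u] = A[u] + sum(C[v] for v in kids)
    return A, C

n = 50
chain = {i: [i + 1] for i in range(n)}    # 0 -> 1 -> ... -> n
star = {0: list(range(1, n + 1))}         # 0 -> every other vertex

A_chain, C_chain = allometry(chain, 0)
A_star, C_star = allometry(star, 0)
```

At the root both trees have A = n + 1, but C grows like A²/2 for the chain and like 2A for the star, which are the two bounds between which the food-web exponent falls.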
Summary: food web structure decomposition
Spanning trees and loops: complementary properties and roles
Tree-forming links:
1) Determine the degree of transportation EFFICIENCY
2) Measured by the allometric exponent η
3) η is universal! (Common evolutionary principle?)
Loop-forming links:
1) Determine the STABILITY under species removal
2) Measured by the directed connectance c
3) c varies! (Web-specific organization?)
[Figure labels: Source, Species]
Out-of-equilibrium statistical
mechanics of networks
Restoring the feedback
We focus on the case when topology and dynamics
evolve over comparable timescales:
Dynamical process
Topological evolution
As a result, the process is self-organized
and a non-equilibrium stationary state is reached,
independently of (otherwise arbitrary) initial conditions.
We choose the simplest possible dynamical rule: Bak-Sneppen model
and the simplest possible network formation mechanism: Fitness model
Coupling the Bak-Sneppen and the fitness model
Bak-Sneppen model on fixed graphs
(Bak, Sneppen PRL 1993 – Flyvbjerg, Sneppen, Bak PRL 1993 –
Kulkarni, Almaas, Stroud cond-mat/9905066 – Moreno, Vazquez EPL 2002 – Lee, Kim PRE 2005 – Masuda, Goh, Kahng PRE 2005)
1) Specify graph, and keep it fixed;
2) assign each vertex i a fitness xi drawn uniformly in (0,1);
3) draw anew fitnesses of least fit vertex and its neighbours;
4) evolve fitnesses iterating 3).
Fitness network model with quenched fitnesses
(Caldarelli et al. PRL 2002 – Boguna, Pastor-Satorras PRE 2003)
1) Specify fitness distribution r(x);
2) assign each vertex i a fitness xi drawn from r(x), and keep it fixed;
3) draw network by joining i and j with probability f(xi, xj);
4) repeat realizations and perform ensemble average.
Coupled (Self-organized) model:
1) Assign each vertex i a fitness xi drawn from what you like;
2) draw network by joining i and j with probability f(xi, xj);
3) draw anew fitnesses of least fit vertex and its neighbours, uniformly in (0,1);
4) draw anew links of least fit vertex and its neighbours with probability f(xi, xj);
5) repeat from 3).
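The five steps of the coupled model can be sketched as a short simulation. The connection probability used below, $f(x,y) = zxy/(1+zxy)$, is only an illustrative choice, and the values of N, z and the number of iterations are arbitrary; the function names are likewise invented for the sketch.

```python
import random

def f(x, y, z=0.2):
    """Illustrative connection probability (an assumption, any f(x, y) in [0, 1] works)."""
    return z * x * y / (1.0 + z * x * y)

def draw_links(i, others, x, adj, rng):
    """Link i to each vertex j in `others` with probability f(x_i, x_j)."""
    for j in others:
        if rng.random() < f(x[i], x[j]):
            adj[i].add(j)
            adj[j].add(i)

def iterate(x, adj, rng):
    """Steps 3) and 4): update the least fit vertex and its neighbours."""
    n = len(x)
    least = min(range(n), key=x.__getitem__)
    x_min = x[least]
    updated = sorted({least} | adj[least])
    for i in updated:                     # 3) new fitnesses, uniform in (0,1)
        x[i] = rng.random()
    for i in updated:                     # remove the old links of updated vertices
        for j in adj[i]:
            adj[j].discard(i)
        adj[i] = set()
    done = set()
    for i in updated:                     # 4) redraw their links, each pair once
        done.add(i)
        draw_links(i, [j for j in range(n) if j not in done], x, adj, rng)
    return x_min

rng = random.Random(0)
N, steps = 100, 2000
x = [rng.random() for _ in range(N)]      # 1) initial fitnesses
adj = {i: set() for i in range(N)}
for i in range(N):                        # 2) initial network
    draw_links(i, range(i + 1, N), x, adj, rng)
minima = [iterate(x, adj, rng) for _ in range(steps)]   # 5) repeat
```

Recording the minimum fitness at each iteration, as done here in `minima`, is one way to estimate the threshold below which the stationary fitness distribution is depleted.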
Typical iteration of the model:
Analytical solution for arbitrary f(x,y)
Stationary fitness distribution: uniform in one regime, as in standard BS;
novel result in the other regime: depends on x (not uniform)
Distribution of minimum fitness: uniform
Critical threshold obtained from the normalization condition
D. Garlaschelli, A. Capocci, G. Caldarelli, Nature Physics 3, 813-817 (2007)
Analytical solution for arbitrary f(x,y)
Degree versus fitness:
Stationary degree distribution:
Similarly, all other topological properties are derived
as in the static fitness model
Particular choices of f(x,y)
Null case: random graph
(“grandcanonically” equivalent to random-neighbor BS model)
Stationary fitness distribution: step-like, as in the random-neighbor BS model (if sparse)
Critical threshold: three regimes (subcritical, sparse, dense)
Dynamical regimes rooted in an underlying percolation transition
Particular choices of f(x,y)
Simplest nontrivial (and unbiased) case: configuration model
see Garlaschelli and Loffredo, Phys. Rev. E 78, 015101(R) (2008).
Stationary fitness distribution: Zipf (but normalizable!)
Critical threshold: three regimes (subcritical, sparse, dense)
Conjecture (verified later): underlying percolation transition
Stationary fitness distribution
In the self-organized model, it is no longer step-like
(as in the BS model on fitness-independent networks) but power-law:
Theoretical results against simulations
Power-law fitness distribution (above the threshold)
Check the percolation transition conjecture: power-law cluster size distribution at the transition
Degree versus fitness: the “saturation” reflects repulsion between large degrees, which implies disassortativity and hierarchy (not shown)
Cumulative degree distribution: scale-free degree distribution (above the threshold)
Average fitness versus threshold
References
Reciprocity
D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
D. Garlaschelli, M. I. Loffredo, Phys. Rev. E 73, 015101(R) (2006)
Weighted networks
D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 102, 038701 (2009)
D. Garlaschelli, New Journal of Physics 11, 073005 (2009)
Food web scaling
D. Garlaschelli, G. Caldarelli, L. Pietronero, Nature 423, 165-168 (2003)
D. Garlaschelli, Eur. Phys. J. B 38(2), 277 (2004)
Out-of-equilibrium model
D. Garlaschelli, A. Capocci, G. Caldarelli, Nature Physics 3, 813-817 (2007)
G. Caldarelli, A. Capocci, D. Garlaschelli, Eur. Phys. J. B 64, 585-591 (2008)