Scale Free Networks in
Biology and Social Systems
Guido Caldarelli, Andrea Capocci, Cecile Caretta, Fabrizio Coccetti,
Francesca Colaiori, Ramon Ferrer i Cancho, Diego Garlaschelli,
Luciano Pietronero, Vito Servedio, Federico Squartini
University of Rome “La Sapienza”
Coevolution and Self-Organization
in Dynamical Networks
•Introduction
There is a growing interest in science in the analysis of
scale-free networks
These ubiquitous structures are characterized by two typical
aspects
Small World effect
That is, a small diameter
Scale-free structure
There is no typical number of links
Many sites have few
Few have many
•Contents

Network topological properties (degree distribution etc.)
1) Give a new description of phenomena, allowing us
· to detect new universal behaviour
· to validate models
2) Can sometimes help in explaining the evolution of the system

As examples of this use of graphs I will present
1) Food Webs
2) Linnean Trees
3) Protein Interaction Networks
4) Financial Systems

Scale-free networks arise naturally in RANDOM environments
I will present our interpretation of this fact
•The Milgram Experiment (1967)
Is it possible to deliver a message to a stock dealer in Boston
starting from unrelated people in Nebraska?
•The Small World Effect
On average fewer than
6 steps!!
SIX DEGREES OF SEPARATION
•Basic Graph Theory
Is it possible to travel from one part of
the city of Königsberg to any other
CROSSING EACH BRIDGE ON THE
PREGEL EXACTLY ONCE?
NO!
Euler (1736) pointed out that to be a “passage”
point a vertex must have an even number of links.
Only starting and ending points can have an odd
number of links.
THIS IS NOT THE CASE FOR KÖNIGSBERG
•Is the problem time dependent?
1736
All vertices have odd
degree! → No way
2004
Only B and C have odd
degree! → we can do it!
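Euler's degree argument can be checked mechanically. The sketch below counts odd-degree vertices for the seven bridges of 1736; the labels A to D for the four land masses are hypothetical and need not match the labels in the slide's figure.

```python
from collections import Counter

def odd_degree_vertices(edges):
    """Return the sorted list of vertices with odd degree in a multigraph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, k in degree.items() if k % 2 == 1)

def has_euler_walk(edges):
    """Euler's condition for a connected multigraph: 0 or 2 odd-degree vertices."""
    return len(odd_degree_vertices(edges)) in (0, 2)

# Seven bridges of 1736 (hypothetical labels: A is the island, B-D the other land masses).
KONIGSBERG_1736 = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]

print(odd_degree_vertices(KONIGSBERG_1736))
print(has_euler_walk(KONIGSBERG_1736))
```

The check assumes the graph is connected, which holds for the bridge layout used here.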
•Graph Topology
A graph G(v,e) is an object
composed of v vertices and e edges
· Degree k (in-degree kin and out-degree kout) = number of edges (oriented) per vertex
· Distance d = minimum number of edges between two vertices (in the connected region!)
· Diameter D = maximum of the distances (in the connected region!)
· Clustering = clique distribution, or clustering coefficient
Usually many quantities are needed
in order to “classify” a network
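The quantities listed above can be computed directly from an adjacency structure. This is a minimal sketch for an undirected graph; the example graph and the function names are illustrative assumptions, not taken from the slides.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Distance (number of edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Largest distance between mutually reachable vertices."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

def clustering(adj, v):
    """Fraction of pairs of neighbours of v that are themselves linked."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

# Hypothetical example: a triangle 0-1-2 with a tail 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
degree = {v: len(neighbours) for v, neighbours in adj.items()}
print(degree, diameter(adj), clustering(adj, 2))
```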
•Graph Topology (2)
·1 Degree frequency density P(k) = how many times you find a vertex whose degree is k
Random graph (Poisson): $P(k) = e^{-pN} \frac{(pN)^k}{k!}$
Scale-free network: $P(k) \propto k^{-\gamma}$
·2 Degree Correlation Knn (k) = average degree of a neighbour of a vertex with degree k
·3 Clustering Coefficient c(k) = the average value of c for a vertex whose degree is k
Assortative networks: social networks
Disassortative networks: technological, biological networks
• Real networks always display one of these two tendencies,
• “similar” networks display “similar” behaviours.
Assortativity coefficient
$r \propto \langle ij \rangle - \langle i \rangle \langle j \rangle$
$r > 0$: assortative
$r = 0$: non-assortative
$r < 0$: disassortative
Consequences of assortativity:
- Resistance to attacks
- Percolation
- Epidemic spreading
M.E.J. Newman, Physical Review E 67, 026126 (2003).
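As an illustration of the assortativity coefficient, the sketch below computes the Pearson correlation between the degrees i and j found at the two ends of each edge, normalising the difference between the average of ij and the product of the averages by the variance. Counting every undirected edge in both directions is a convention assumed here so that the two ends are treated symmetrically.

```python
def assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every undirected edge in both directions (symmetric treatment).
    pairs = [(degree[u], degree[v]) for u, v in edges]
    pairs += [(j, i) for i, j in pairs]
    n = len(pairs)
    mean_i = sum(i for i, _ in pairs) / n          # equals the mean of j by symmetry
    mean_ij = sum(i * j for i, j in pairs) / n
    var_i = sum(i * i for i, _ in pairs) / n - mean_i ** 2
    if var_i == 0:
        return 0.0  # all end-point degrees equal: correlation undefined, report 0
    return (mean_ij - mean_i ** 2) / var_i

star = [(0, leaf) for leaf in range(1, 6)]   # a hub linked only to leaves
path = [(0, 1), (1, 2), (2, 3)]
print(assortativity(star), assortativity(path))
```

A star is the extreme disassortative case: the high-degree hub is linked only to degree-1 leaves.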
•Graph Topology (3)
·4 Betweenness centrality b(k) = the probability that a vertex whose degree is k
has betweenness b
The betweenness of vertex i is the number of shortest paths
between any pair of vertices passing
through i
·5 TREES ONLY!!! P(A) = Probability Density for subbranches of size A
[Figure: example tree labelled with the sizes of its subbranches. Size distribution: plot of P(A) versus A. Allometric relations: plot of C(A) versus A.]
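The betweenness defined in item ·4 can be computed with Brandes' algorithm, a standard technique that the slides do not name. In this sketch, when several shortest paths tie between a pair of vertices, each path counts fractionally; that convention is an assumption that goes beyond the slide's definition.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # vertices in order of increasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each of its end points.
    return {v: b / 2.0 for v, b in bc.items()}

# Hypothetical example: a path 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(betweenness(path))
```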
•“Food Web” (ecological network):
Set of interconnected food chains resulting in a much more complex topology:
•Degree Distribution P(k) in real Food Webs
Unaggregated versions of real webs:
irregular
or scale-free?
$P(k) \propto k^{-\gamma}$
R.V. Solé, J.M. Montoya, Proc. Royal Society Series B 268, 2039 (2001)
J.M. Montoya, R.V. Solé, Journal of Theor. Biology 214, 405 (2002)
•Spanning Trees of a Directed Graph
A spanning tree of a connected directed graph is any of its connected directed subtrees
with the same number of vertices.
In general, the same graph can have several spanning trees with different
topologies.
Given the peculiarities of the system (FOOD WEBS), some are more sensible
than others.
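One way to pick a spanning tree is sketched below. The slide does not say which tree is the sensible one; as an assumption, this example keeps for every species the shortest chain from a root vertex (here called "environment", a hypothetical name), which a breadth-first search delivers. The toy web and its species names are invented for illustration.

```python
from collections import deque

def bfs_spanning_tree(out_links, root):
    """Return {vertex: parent} for a spanning tree made of shortest chains from root."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in out_links.get(u, ()):
            if v not in parent:          # first visit = shortest chain from the root
                parent[v] = u
                queue.append(v)
    return parent

# Hypothetical toy web; links point from resource to consumer.
web = {
    "environment": ["plant", "algae"],
    "plant": ["herbivore", "omnivore"],
    "algae": ["omnivore"],
    "herbivore": ["predator"],
    "omnivore": ["predator"],
}
tree = bfs_spanning_tree(web, "environment")
print(tree)
```

The web has seven links but the tree keeps only five, one per non-root vertex; the links algae to omnivore and omnivore to predator are dropped.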
•How to characterize a tree?
Out-component size:
$A_X = \sum_{Y \in nn(X)} w_{XY} A_Y + 1$
Out-component size distribution: $P(A)$
Sum of the sizes:
$C_X = \sum_{Y} A_Y$, summed over the vertices Y in the out-component of X
Allometric relations:
$C_X = C_X(A_X) \rightarrow C = C(A)$
[Figure: example tree with the values of A and C at each vertex; plots of P(A) versus A and of C(A) versus A.]
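The two tree quantities can be computed in one bottom-up pass. This sketch assumes that every tree link has weight 1 and that the sum defining C includes the vertex X itself; both conventions are assumptions, since the formulas on the slide are only partly recoverable. The example tree is hypothetical.

```python
from collections import Counter

def tree_sizes(children, root):
    """Return (A, C): out-component size A_X and sum of sizes C_X for every vertex."""
    A, C = {}, {}
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        kids = children.get(node, [])
        if not expanded:
            # Revisit the node only after all of its children are done.
            stack.append((node, True))
            stack.extend((kid, False) for kid in kids)
        else:
            A[node] = 1 + sum(A[k] for k in kids)
            C[node] = A[node] + sum(C[k] for k in kids)
    return A, C

# Hypothetical 6-vertex tree: 0 -> 1, 2 ; 1 -> 3, 4 ; 4 -> 5
children = {0: [1, 2], 1: [3, 4], 4: [5]}
A, C = tree_sizes(children, 0)
P = Counter(A.values())   # histogram of out-component sizes
print(A, C, P)
```

With these conventions, C at the root equals the sum of A over all vertices of the tree.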
•Area Distribution in Real Food Webs
•Allometric Relations in Real Food Webs
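(D. Garlaschelli, G. Caldarelli, L. Pietronero, Nature 423, 165 (2003))
$\eta = 1$: $C(A) \propto A$ (efficient), $P(A) \propto \delta_{A,1}$ (stable)
$1 < \eta < 2$: $C(A) \propto A^{\eta}$, $P(A) \propto A^{-\tau}$ with $0 < \tau < \infty$
$\eta = 2$: $C(A) \propto A^{2}$ (inefficient), $P(A) = \text{const}$, i.e. $\tau = 0$ (unstable)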
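As a worked check of the two limiting cases, consider a chain and a star. Identifying the extremes with these two shapes is an assumption made for illustration, and C is taken to include the vertex itself, as in the definition of the sum of the sizes.

```latex
% Chain of N vertices (each vertex feeds exactly one other):
% the out-component sizes are A = 1, 2, ..., N, each occurring once.
P(A) = \frac{1}{N} = \text{const}, \qquad
C(A) = \sum_{a=1}^{A} a = \frac{A(A+1)}{2} \propto A^{2} \quad (\eta = 2)

% Star with one root and N - 1 leaves:
% every leaf has A = 1 and C = 1; the root has A = N.
P(A) \simeq \delta_{A,1}, \qquad
C_{\text{root}} = N + (N - 1) = 2A - 1 \propto A \quad (\eta = 1)
```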
•Ecosystems around the world
[Map: ecosystems studied: Lazio, Utah, Amazonia, Peruvian and Atacama Desert, Iran, Argentina]
Ecosystem = set of all living organisms and environmental properties of
a restricted geographic area
We focus our attention on plants
In order to obtain good universality of the results we have
chosen a great variety of climatic environments
•From Linnean trees to graph theory
Linnean Tree = hierarchical structure organized on different
levels, called taxonomic levels, representing:
• classification and identification of different plants
• history of the evolution of different species
A Linnean tree already has
the topological structure of a tree graph
phylum
subphylum
class
subclass
order
family
genus
species
• each node in the graph represents a different taxon
(species, genus, family, and so on). All nodes are
organized on levels corresponding to the taxonomic levels
• all links are directed top-down, and each one
represents the membership of a taxon in its
upper-level taxon
Connected graph without loops or
double-linked nodes
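A Linnean tree can be assembled from per-species lineages, as in this minimal sketch. The records are illustrative and truncated to family, genus and species; the function names are assumptions, and the sketch supposes that taxon names are unique across levels.

```python
from collections import Counter

def build_taxonomy_tree(lineages):
    """Return {taxon: set of direct sub-taxa} from top-down lineages."""
    children = {}
    for lineage in lineages:
        for parent, child in zip(lineage, lineage[1:]):
            children.setdefault(parent, set()).add(child)
            children.setdefault(child, set())
    return children

def out_degree_histogram(children):
    """Count how many taxa have k direct sub-taxa."""
    return Counter(len(kids) for kids in children.values())

# Illustrative records: (family, genus, species).
lineages = [
    ("Fagaceae", "Quercus", "Quercus ilex"),
    ("Fagaceae", "Quercus", "Quercus robur"),
    ("Fagaceae", "Quercus", "Quercus cerris"),
    ("Fagaceae", "Fagus", "Fagus sylvatica"),
]
tree = build_taxonomy_tree(lineages)
print(out_degree_histogram(tree))
```

The histogram of direct sub-taxa is the raw material for the degree distribution P(k) of the tree.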
•Scale-free properties
Degree distribution:
$P(k) \propto k^{-\gamma}$
$\gamma \sim 2.5 \pm 0.2$
The best results for the exponent value are given by ecosystems with
a greater number of species. For smaller networks its value can increase,
reaching $\gamma$ = 2.8 - 2.9.
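The slides do not say how the exponent was fitted. As one possible approach, the sketch below uses the standard maximum-likelihood estimator for a continuous power law and tests it on synthetic samples. Real degrees are integers, so for degree data this estimator is only an approximation.

```python
import math
import random

def sample_power_law(n, gamma, x_min=1.0, seed=0):
    """Draw n continuous samples with density proportional to x**(-gamma), x >= x_min."""
    rng = random.Random(seed)
    # Inverse-transform sampling; 1 - rng.random() lies in (0, 1].
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]

def fit_exponent(samples, x_min=1.0):
    """Maximum-likelihood estimate of gamma from the samples >= x_min."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

samples = sample_power_law(20000, gamma=2.5)
print(fit_exponent(samples))
```

The statistical error of this estimator shrinks as the inverse square root of the sample size, which is consistent with larger ecosystems giving better exponent estimates.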
•Geographical flora subsets
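[Figure: degree distributions P(k) versus k for flora subsets: Lazio, City of Rome, Colli Prenestini, Tiber, Mte Testaccio, Aniene. Fitted exponents shown in the panels: $\gamma = 2.52 \pm 0.08$, $\gamma = 2.58 \pm 0.08$, and $2.6 \le \gamma \le 2.8$.]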
•What about random subsets?
In spite of some slight differences in the exponent value, a subset which represents on its own
a geographical unit of living organisms still shows a power law in the connectivity distribution.
Random extraction of 100, 200 and 400 species from those belonging
to the big ecosystems, and reconstruction of the phylogenetic tree
[Figure: P(k) versus k for the random subsets of LAZIO and ROME]
• Simulation:
$P(k) = k^{-2.6}$
A comparison
[Figure: P(k) versus k for the correlated and the not correlated case]
•Protein Interaction Network of Yeast
(Saccharomyces cerevisiae)
•Protein Interaction Network of Fruitfly
(Drosophila melanogaster)
•Economics and Finance
Probably the most complex system is
human behaviour!
Even by considering only the trading
between individuals, the situation seems to
be incredibly complicated.
Econophysics tries to understand the
basic “active ingredients” behind
some peculiar behaviours.
For example, the statistical properties of prices
can be described through a simple
model of agents trading the same stock.
“A Prototype Model of Stock Exchange”
Europhysics Letters, 40 479 (1997), G. C., M. Marsili, Y.-C. Zhang.
•Why networks?
Some of the phenomena in finance can be described by means of graphs
• Stock price correlations
•J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertész, A. Kanto
http://xxx.lanl.gov/abs/cond-mat/0303579 and http://xxx.lanl.gov/abs/cond-mat/0302546
•G. Bonanno, G. Caldarelli, F. Lillo and R. N. Mantegna http://xxx.lanl.gov/abs/cond-mat/0211546
•Portfolio composition
•Next slides…
•Board of Directors
•M. E. J. Newman, S. H. Strogatz and D. J. Watts, Phys. Rev. E 64, 026118 (2001).
•S. Battiston, E. Bonabeau and G. Weisbuch http://xxx.lanl.gov/abs/cond-mat/0209590 (2002)
Through this new description we can
•Discover new features
•Validate Models
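•Stock Price Correlations
Logarithmic return of stock i:
$r_i(\tau) = \ln P_i(\tau) - \ln P_i(\tau - 1)$
Correlation between returns (averaged on trading days):
$\rho_{i,j} = \frac{\langle r_i r_j \rangle - \langle r_i \rangle \langle r_j \rangle}{\sqrt{(\langle r_i^2 \rangle - \langle r_i \rangle^2)(\langle r_j^2 \rangle - \langle r_j \rangle^2)}}$
Distance between stocks i, j:
$d_{i,j} = \sqrt{2(1 - \rho_{i,j})}$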
A tree (a graph with no cycles) can be constructed by imposing that the
sum of the (N-1) distances is the minimum one.
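The construction above can be sketched end to end: log returns, correlation, distance, and then a minimal spanning tree. Kruskal's algorithm is used here as one standard way to minimise the sum of the N-1 distances; the slides do not name an algorithm. The tickers and prices are hypothetical.

```python
import math

def log_returns(prices):
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def correlation(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    vx = sum(a * a for a in x) / n - mx * mx
    vy = sum(b * b for b in y) / n - my * my
    return cov / math.sqrt(vx * vy)

def distance(rho):
    # max() guards against tiny negative values caused by rounding when rho ~ 1.
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))

def minimal_spanning_tree(returns):
    """Kruskal's algorithm on the complete graph of distances d_ij."""
    names = sorted(returns)
    edges = sorted(
        (distance(correlation(returns[a], returns[b])), a, b)
        for i, a in enumerate(names) for b in names[i + 1:]
    )
    root = {name: name for name in names}   # union-find forest
    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x
    tree = []
    for d, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:                        # the edge joins two separate components
            root[ra] = rb
            tree.append((a, b, d))
    return tree

# Hypothetical daily prices; BBB is exactly half of AAA, so their returns coincide.
prices = {
    "AAA": [100, 102, 101, 105, 107],
    "BBB": [50, 51, 50.5, 52.5, 53.5],
    "CCC": [20, 19, 21, 20, 22],
}
returns = {name: log_returns(p) for name, p in prices.items()}
print(minimal_spanning_tree(returns))
```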
Real Data from NYSE
Correlation-based minimal spanning tree of real data from the daily returns of 1071 stocks for the 12-year period
1987-1998 (3030 trading days). The node colour is based on the Standard Industrial Classification system.
The correspondence is:
red for mining
green for transportation, communications, electric, gas and sanitary services
black for retail trade
cyan for construction
light blue for public administration
purple for finance and insurance
yellow for manufacturing
magenta for wholesale trade
orange for service industries
“Topology of correlation based..” http://xxx.lanl.gov/abs/cond-mat/0211546
G. Bonanno, G. C. , F. Lillo, R. Mantegna.
Data from Capital Asset Pricing Model
In the model it is assumed that returns follow
$r_i(t) = \alpha_i + \beta_i r_M(t) + \epsilon_i(t)$
$r_i(t)$ = return of stock i
$r_M(t)$ = return of the market (Standard & Poor’s)
$\alpha_i, \beta_i$ = real parameters
$\epsilon_i(t)$ = noise term with zero mean
Correlation-based minimal spanning tree of an artificial market composed of 1071 stocks according to
the one-factor model.
The node colour is based on the Standard Industrial Classification system, with the same colour correspondence as for the real NYSE data.
Without going into much detail about the degree distribution or clustering of the two graphs,
we can conclude that
the topologies of the MST for the real and the artificial market are greatly different.
Real market properties are not reproduced by simple random models
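An artificial market of the one-factor type can be generated as in this sketch. The slide only requires the noise term to have zero mean; the Gaussian noise, the parameter values and the function name are assumptions made for illustration.

```python
import random

def simulate_one_factor(market_returns, alphas, betas, noise_sd, seed=0):
    """Return {i: [r_i(t)]} with r_i(t) = alpha_i + beta_i * r_M(t) + eps_i(t)."""
    rng = random.Random(seed)
    return {
        i: [alphas[i] + betas[i] * r_m + rng.gauss(0.0, noise_sd)
            for r_m in market_returns]
        for i in range(len(betas))
    }

# Hypothetical market returns and stock parameters.
market = [0.01, -0.02, 0.005, 0.0, 0.015]
alphas = [0.0, 0.001, -0.001]
betas = [1.0, 0.5, 1.5]
artificial = simulate_one_factor(market, alphas, betas, noise_sd=0.01)
print(artificial[1])
```

Feeding such synthetic returns into the same correlation and spanning-tree procedure is what allows the comparison with the real market.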
• Portfolio Composition
Investors or Companies not traded at Borsa di Milano (Italy)
Companies traded at Borsa di Milano (Italy)
•Models (1)
Standard Theory of Random Graphs
(Erdős and Rényi 1960)
Random graphs are built by starting with N vertices.
With probability p two vertices are connected by an edge
Degrees are Poisson distributed
$P(k) = e^{-pN} \frac{(pN)^k}{k!}$
Small World
(D.J. Watts and S.H. Strogatz 1998)
Small-world graphs are built by adding
shortcuts to regular lattices
Degrees are peaked around the mean value
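The Poisson form of the random-graph degree distribution can be checked numerically. This sketch links each pair of vertices with probability p and compares the measured degree frequencies with the Poisson formula of mean pN; the values of N, p and the seed are arbitrary choices.

```python
import math
import random
from collections import Counter

def random_graph_degrees(n, p, seed=0):
    """Degrees of a random graph in which each pair is linked with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

def poisson(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)

n, p = 1000, 0.01
degree = random_graph_degrees(n, p)
histogram = Counter(degree)
for k in range(5, 16):
    # columns: degree, measured frequency, Poisson prediction
    print(k, histogram[k] / n, round(poisson(k, p * n), 4))
```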
• Models (2)
Model of Growing Networks
(A.-L. Barabási 1999)
1) Growth
At every time step new nodes enter the system
2) Preferential Attachment
The probability of being connected is proportional to the
degree k
Degrees are power-law distributed
$P(k) \propto k^{-\gamma}$
Intrinsic Fitness Model
(G. Caldarelli, A. Capocci, P. De Los Rios, M.A. Muñoz 2002)
1) Growth or not
Nodes can be fixed at the beginning or be added
2) Attachment is related to intrinsic properties
The probability of being connected depends on the
sites
Degrees are power-law distributed
$P(k) \propto k^{-\gamma}$
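Growth with preferential attachment can be simulated in a few lines. In this sketch each new vertex brings m links, and the seed graph is a complete graph on m + 1 vertices; both choices, and the function name, are assumptions made for illustration rather than details given in the slides.

```python
import random

def barabasi_albert_degrees(n, m, seed=0):
    """Grow a network to n vertices; each new vertex links to m distinct old ones."""
    rng = random.Random(seed)
    degree = [m] * (m + 1)               # seed: complete graph on m + 1 vertices
    # Each vertex appears in `ends` once per link end, so a uniform draw
    # from `ends` selects a vertex with probability proportional to its degree.
    ends = [v for v in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        degree.append(m)
        for t in targets:
            degree[t] += 1
            ends.extend((t, new))
    return degree

degree = barabasi_albert_degrees(2000, 2)
print(max(degree), sum(degree) / len(degree))
```

The oldest vertices accumulate far more links than the average, which is the signature of the broad degree distribution.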
•Intrinsic Fitness Model
Without introducing growth or preferential attachment we can have power laws
We consider “disorder” in the Random Graph model
(i.e. vertices differ from one another).
This mechanism is responsible for self-similarity in Laplacian Fractals
•Dielectric Breakdown
•In a perfect dielectric
•In reality
•Intrinsic Fitness Model
Different realizations of the model
a), b), c) have a power-law $\rho(x)$ with exponent 2.5, 3, 4 respectively.
d) has $\rho(x) = \exp(-x)$ and a threshold rule.
•Intrinsic Fitness Model
Degree distribution for cases
a), b), c), with a power-law $\rho(x)$ with
exponent 2.5, 3, 4 respectively.
Degree distribution for case
d), with $\rho(x) = \exp(-x)$ and a threshold rule.
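Case d) can be sketched as follows. The slides mention a threshold rule without spelling it out; the rule assumed here links two vertices when the sum of their fitnesses reaches a threshold z. The values of n, z and the seed are arbitrary.

```python
import random

def threshold_fitness_degrees(n, z, seed=0):
    """Link i and j when x_i + x_j >= z, with fitnesses x drawn from exp(-x)."""
    rng = random.Random(seed)
    fitness = [rng.expovariate(1.0) for _ in range(n)]
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if fitness[i] + fitness[j] >= z:   # assumed form of the threshold rule
                degree[i] += 1
                degree[j] += 1
    return fitness, degree

fitness, degree = threshold_fitness_degrees(500, z=5.0)
ranked = sorted(zip(fitness, degree))
print(ranked[-5:])   # the fittest vertices and their degrees
```

No growth and no preferential attachment enter this construction: the network is static, and the degree of a vertex is fixed entirely by its own fitness and the fitnesses of the others.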
•Conclusions
Results:
• Networks (SCALE-FREE OR NOT) allow us to detect universality
(same statistical properties) for FOOD WEBS, TAXONOMY and
SOCIAL SYSTEMS,
regardless of the different numbers of species, environments, markets
• STATIC AND DYNAMICAL NETWORK PROPERTIES other than
the degree distribution allow us to validate models.
NEITHER RANDOM GRAPH NOR BARABASI-ALBERT WORK
IT IS POSSIBLE THAT THE PROPERTIES OBSERVED ARE REALLY
RANDOM, BECAUSE RANDOM GRAPHS CAN GIVE POWER LAWS!
Future:
• new data
• suitable models that also take into account
environment and natural selection
AT LEAST FOR FOOD WEBS, TAXONOMY
AND FINANCIAL MARKETS
COSIN
COevolution and Self-organisation In
dynamical Networks
RTD Shared Cost Contract IST-2001-33555
http://www.cosin.org
• Nodes: 6 in 5 countries
• Period of Activity: April 2002-April 2005
• Budget: 1.256 M€
• Persons financed: 8-10 researchers
• Human resources: 371.5 person-months
[Map legend: EU countries, Non EU countries, EU COSIN participant, Non EU COSIN participant]