The architecture of complexity
From the diameter of the www
to the structure of the cell
Albert László Barabási
(Univ. of Notre Dame)
Zoltán Néda, Hawoong Jeong, Réka Albert,
Ginestra Bianconi, Soonhyung Yook,
Erzsébet Ravasz, Zoltán Dezsö
www.nd.edu/~networks
[Figure: actor network example linking actors through shared films. Actors: Robert Wagner, Barry Norton. Films: Austin Powers: The Spy Who Shagged Me; Let's Make It Legal; Wild Things; What Price Glory; A Few Good Men; Monsieur Verdoux]
What is Complexity?
A popular paradigm: Simple systems display complex behavior
 non-linear systems
 chaos
 fractals
3 Body Problem
Earth, Jupiter, Sun
Main Entry: 1com·plex
Function: noun
Etymology: Late Latin complexus totality, from Latin,
embrace, from complecti
Date: 1643
1 : a whole made up of complicated or interrelated parts
Society
Nodes: individuals
Links: social relationship
(family/work/friendship/etc.)
S. Milgram (1967)
Six Degrees of Separation
John Guare
Social networks: Many individuals with diverse
social interactions between them.
Communication networks
The Earth is developing an electronic nervous system,
a network with diverse nodes and links:
-computers
-phone lines
-routers
-TV cables
-satellites
-EM waves
Communication
networks: Many
non-identical
components
with diverse
connections
between them.
Complex systems
Made of
many non-identical elements
connected by diverse interactions.
NETWORK
Erdös-Rényi model
(1960)
Connect with
probability p
p=1/6
N=10
k ~ 1.5
- Democratic
- Random
Pál Erdös
(1913-1996)
Poisson distribution
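A minimal sketch of the random-graph construction above, assuming plain Python and a hypothetical network size chosen only for illustration. It links every pair of nodes with probability p and compares the measured degree histogram with a Poisson distribution of mean <k> = p(N-1). For the slide's own example, p = 1/6 and N = 10 give <k> = 1.5.

```python
import math
import random
from collections import Counter

def erdos_renyi(n, p, rng):
    """Adjacency dict for a random graph: each pair is linked with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def mean_degree(adj):
    """Average number of links per node, <k>."""
    return sum(len(nb) for nb in adj.values()) / len(adj)

def poisson_pmf(k, lam):
    """Poisson probability of degree k for mean degree lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

rng = random.Random(0)
n = 1000                 # hypothetical size, not taken from the slides
p = 6 / (n - 1)          # chosen so that <k> = p(N-1) = 6
adj = erdos_renyi(n, p, rng)
hist = Counter(len(nb) for nb in adj.values())

print("measured <k>:", mean_degree(adj))
for k in range(0, 15):
    # columns: degree, measured fraction of nodes, Poisson prediction
    print(k, hist.get(k, 0) / n, round(poisson_pmf(k, 6.0), 4))
```

The measured fractions should track the Poisson column closely, which is the "democratic" behaviour the slide refers to: no node ends up with a degree far from the mean.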
ARE COMPLEX NETWORKS
REALLY RANDOM?
Cluster Coefficient
Clustering: My friends will likely know each other!
Probability to be connected: $C \gg p$
$C = \dfrac{\#\text{ of links between the } 1, 2, \dots, n \text{ neighbors}}{n(n-1)/2}$
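The definition above can be sketched in a few lines. This is an illustrative example on a hypothetical four-node friendship graph (the node names are made up): it counts the links among a node's n neighbors and divides by the n(n-1)/2 possible ones.

```python
def clustering_coefficient(adj, node):
    """C = (# of links among the n neighbours of `node`) / (n(n-1)/2)."""
    neighbours = list(adj[node])
    n = len(neighbours)
    if n < 2:
        return 0.0  # convention used here: C is 0 when fewer than two neighbours exist
    links = sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if neighbours[b] in adj[neighbours[a]]
    )
    return links / (n * (n - 1) / 2)

# Hypothetical toy graph: "me" has three friends, of whom only a and b know each other.
adj = {
    "me": {"a", "b", "c"},
    "a": {"me", "b"},
    "b": {"me", "a"},
    "c": {"me"},
}
print(clustering_coefficient(adj, "me"))
```

Here one of the three possible links among the neighbors exists, so C for "me" is 1/3.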
Networks are clustered
[large C(p)]
but have a small
characteristic path length
[small L(p)].
Network        C          Crand     L          N
WWW            0.1078     0.00023   3.1        153127
Internet       0.18-0.3   0.001     3.7-3.76   3015-6209
Actor          0.79       0.00027   3.65       225226
Coauthorship   0.43       0.00018   5.9        52909
Metabolic      0.32       0.026     2.9        282
Foodweb        0.22       0.06      2.43       134
C. elegans     0.28       0.05      2.65       282
Watts-Strogatz Model
C(p) : clustering coeff.
L(p) : average path length
(Watts and Strogatz, Nature 393, 440 (1998))
World Wide Web
Nodes: WWW documents
Links: URL links
800 million documents
(S. Lawrence, 1999)
ROBOT:
collects all
URL’s found in a
document and follows
them recursively
R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999)
What did we expect?
k ~ 6
P(k=500) ~ 10-99
NWWW ~ 109
 N(k=500)~10-90
We find:
out= 2.45
 in = 2.1
P(k=500) ~ 10-6
NWWW ~ 109
 N(k=500) ~ 103
Pout(k) ~ k-out
Pin(k) ~ k- in
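As a rough arithmetic check of the power-law numbers above, the sketch below evaluates k^(-γ) at k = 500 and multiplies by the number of documents. It is an order-of-magnitude estimate only: the normalisation constant of P(k) is ignored, which is an assumption of this example rather than something stated on the slide.

```python
import math

def expected_count(n_nodes, k, gamma):
    """Order-of-magnitude N(k) = N * k**(-gamma); normalisation constant ignored."""
    return n_nodes * k ** (-gamma)

n_www = 1e9  # N_WWW ~ 10^9, as quoted on the slide
for label, gamma in (("out", 2.45), ("in", 2.1)):
    p500 = 500 ** (-gamma)
    n500 = expected_count(n_www, 500, gamma)
    print(
        label,
        f"P(k=500) ~ 10^{math.log10(p500):.1f}",
        f"N(k=500) ~ 10^{math.log10(n500):.1f}",
    )
```

With either exponent the probability comes out near 10^-6, so nodes with hundreds of links are expected in large numbers on a web of 10^9 documents.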
19 degrees of separation
[Figure: example graph with nodes 1 to 7]
$l_{15} = 2$ [1→2→5]
$l_{17} = 4$ [1→3→4→6→7]
… $\langle l \rangle$ = ??
• Finite size scaling: create a network with N nodes with $P_{in}(k)$ and $P_{out}(k)$
$\langle l \rangle = 0.35 + 2.06 \log(N)$
19 degrees of separation
R. Albert et al Nature (99)
nd.edu
<l>
based on 800 million webpages
[S. Lawrence et al Nature (99)]
IBM
A. Broder et al WWW9 (00)
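The "19 degrees" figure follows from plugging the quoted web size into the fitted formula. A small sketch, assuming the logarithm is base 10 (the slide does not state the base; base 10 is the choice that reproduces 19):

```python
import math

def mean_path_length(n_docs):
    """<l> = 0.35 + 2.06 log10(N), the fit quoted on the slide (base 10 assumed)."""
    return 0.35 + 2.06 * math.log10(n_docs)

print(round(mean_path_length(8e8), 2))   # N = 800 million documents
print(round(mean_path_length(8e9), 2))   # a hypothetical tenfold larger web
```

Because the dependence is logarithmic, a tenfold increase in N adds only about two clicks to the average separation.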
What does it mean?
Poisson distribution
Exponential Network
Power-law distribution
Scale-free Network
INTERNET BACKBONE
Nodes: computers, routers
Links: physical lines
(Faloutsos, Faloutsos and Faloutsos, 1999)
ACTOR CONNECTIVITIES
Nodes: actors
Links: cast jointly
Days of Thunder (1990)
Far and Away
(1992)
Eyes Wide Shut (1999)
N = 212,250 actors
$\langle k \rangle = 28.78$
$P(k) \sim k^{-\gamma}$
$\gamma = 2.3$
SCIENCE CITATION INDEX
Nodes: papers
Links: citations
25
Witten-Sander
PRL 1981
1736 PRL papers (1988)
2212
$P(k) \sim k^{-\gamma}$
($\gamma = 3$)
(S. Redner, 1998)
SCIENCE COAUTHORSHIP
Nodes: scientists (authors)
Links: papers written together
(Newman, 2000, H. Jeong et al 2001)
Food Web
Nodes: trophic species
Links: trophic interactions
R.J. Williams, N.D. Martinez Nature (2000)
R. Sole (cond-mat/0011195)
Sex-web
Nodes: people (Females; Males)
Links: sexual relationships
4781 Swedes; 18-74;
59% response rate.
Liljeros et al. Nature 2001
Most real world networks have
the same internal structure:
Scale-free networks
Why?
What does it mean?
SCALE-FREE NETWORKS
(1) The number of nodes (N) is NOT fixed.
Networks continuously expand by
the addition of new nodes
Examples:
WWW : addition of new documents
Citation : publication of new papers
(2) The attachment is NOT uniform.
A node is linked with higher probability to a node
that already has a large number of links.
Examples :
WWW : new documents link to well known sites
(CNN, YAHOO, New York Times, etc.)
Citation : well cited papers are more likely to be cited again
Scale-free model
(1) GROWTH:
At every timestep we add a new node with m edges
(connected to the nodes already present in the system).
(2) PREFERENTIAL ATTACHMENT:
The probability Π that a new node will be connected to
node i depends on the connectivity $k_i$ of that node:
$\Pi(k_i) = \dfrac{k_i}{\sum_j k_j}$
$P(k) \sim k^{-3}$
A.-L.Barabási, R. Albert, Science 286, 509 (1999)
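The two ingredients above, growth and preferential attachment, can be sketched as a short simulation. This is an illustrative version, not the authors' code: the seed (m + 1 fully connected nodes) and the rejection step that keeps the m targets distinct are assumptions of this example.

```python
import random
from collections import Counter

def barabasi_albert(n, m, rng):
    """Grow a network to n nodes; each new node attaches m edges preferentially."""
    # Assumed seed: m + 1 fully connected nodes (the slide leaves the seed open).
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Every node appears in `stubs` once per link end, so a uniform draw from
    # `stubs` selects node i with probability k_i / sum_j k_j.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # redraw until m distinct targets are found
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degrees(edges):
    """Map each node to its number of links."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

rng = random.Random(42)
n, m = 5000, 2                            # hypothetical parameters
edges = barabasi_albert(n, m, rng)
deg = degrees(edges)
print("mean degree:", sum(deg.values()) / n)
print("largest degrees:", sorted(deg.values(), reverse=True)[:5])
```

The mean degree stays close to 2m while a few early nodes collect far more links, which is the heavy tail that P(k) ~ k^-3 describes.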
Mean Field Theory
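$\dfrac{\partial k_i}{\partial t} \propto \Pi(k_i) = A \dfrac{k_i}{\sum_j k_j} = \dfrac{k_i}{2t}$, with initial condition $k_i(t_i) = m$
$k_i(t) = m \sqrt{t / t_i}$
$P(k_i(t) < k) = P_t\!\left(t_i > \dfrac{m^2 t}{k^2}\right) = 1 - P_t\!\left(t_i \le \dfrac{m^2 t}{k^2}\right) = 1 - \dfrac{m^2 t}{k^2 (m_0 + t)}$
$\therefore\ P(k) = \dfrac{\partial P(k_i(t) < k)}{\partial k} = \dfrac{2 m^2 t}{m_0 + t} \dfrac{1}{k^3} \sim k^{-3}$
$\gamma = 3$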
A.-L.Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)
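A quick numerical check of the mean-field result: the sketch below integrates dk/dt = k/(2t) from the initial condition k(t_i) = m and compares the outcome with the closed form m·sqrt(t/t_i). The integrator (a midpoint scheme), the step count, and the parameter values are choices made for this example only.

```python
import math

def integrate_degree(m, t_i, t_end, steps=100_000):
    """Integrate dk/dt = k/(2t) from k(t_i) = m using midpoint (RK2) steps."""
    k, t = float(m), float(t_i)
    h = (t_end - t_i) / steps
    for _ in range(steps):
        k_mid = k + 0.5 * h * k / (2 * t)        # half-step estimate of k
        k += h * k_mid / (2 * (t + 0.5 * h))     # full step using the midpoint slope
        t += h
    return k

m, t_i, t_end = 3, 10, 1000                      # hypothetical values
numeric = integrate_degree(m, t_i, t_end)
closed = m * math.sqrt(t_end / t_i)              # mean-field prediction
print(numeric, closed)
```

Both values agree closely, and the closed form makes the first-mover advantage explicit: the smaller t_i, the larger k_i(t) at any later time.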
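Model A
growth: kept
preferential attachment: removed
$\Pi(k_i)$: uniform
$\dfrac{\partial k_i}{\partial t} = A \Pi(k_i) = \dfrac{m}{m_0 + t - 1}$
$k_i(t) = m \left[ \ln\!\left( \dfrac{m_0 + t - 1}{m_0 + t_i - 1} \right) + 1 \right]$
$P(k) = \dfrac{e}{m} \exp\!\left(-\dfrac{k}{m}\right) \sim e^{-k/m}$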
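Model B
growth: removed (N fixed)
preferential attachment: kept
$\dfrac{\partial k_i}{\partial t} = A \Pi(k_i) + \dfrac{1}{N} = \dfrac{N}{N-1} \dfrac{k_i}{2t} + \dfrac{1}{N}$
$k_i(t) = \dfrac{2(N-1)}{N(N-2)}\, t + C\, t^{N / (2(N-1))} \sim \dfrac{2}{N}\, t$
P(k): power law (initially) → Gaussian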
Preferential Attachment
$\dfrac{\partial k_i}{\partial t} \propto \Pi(k_i) \sim \dfrac{\Delta k_i}{\Delta t}$
For given Δt: Δk ∝ Π(k)
k vs. Δk: increase in the No. of links in a unit time
Citation
network
Internet
(cond-mat/0104131)
Universality?
Network              γ
WWW (in)             2.1
Internet             2.5
Actor                2.3
Citation index       3
Sex web              3.5
Cellular network     2.1
Phone call network   2.1
Linguistics          2.8
Extended Model
• prob. p : internal links
• prob. q : link deletion
• prob. 1-p-q : add node
$P(k) \sim (k + \kappa(p,q,m))^{-\gamma(p,q,m)}$
$\gamma \in [1, \infty)$
p = 0.937
m = 1
κ = 31.68
γ = 3.07
Actor
network
• Predict the network topology
from microscopic processes
with parameters (p,q,m)
• Scaling but no universality
Other Models
• Non-linear preferential attachment:
$\Pi(k) \sim k^{\alpha}$ → P(k): no scaling for α ≠ 1
α < 1: stretched exponential
α > 1: no scaling (α > 2: “gelation”)
(Krapivsky et al (2000).)
• Initial attractiveness: $\Pi(k) \sim A + k$
→ $P(k) \sim k^{-\gamma}$ where γ = 2 + A/m
(Dorogovtsev et al (2000).)
• Aging: each node has a lifetime
→ node cannot get links after retirement (actor).
→ P(k): power law with exponential cutoff
(Amaral et al (2000).)
Other Models (continued)
• Saturation: each node has a maximum number of links.
→ node cannot get links after a finite # of links
→ P(k): power law with exponential cutoff
(Amaral et al (2000).)
Can Latecomers Make It? Fitness Model
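SF model:
$k(t) \sim t^{1/2}$
(first mover advantage)
Real systems:
nodes compete for links -- fitness
Fitness Model:
fitness (η)
$\Pi(k_i) = \dfrac{\eta_i k_i}{\sum_j \eta_j k_j}$
$k(\eta, t) \sim t^{\beta(\eta)}$
where $\beta(\eta) = \eta / C$ and $\int d\eta\, \rho(\eta)\, \dfrac{1}{C/\eta - 1} = 1$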
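To make the condition on C concrete, the sketch below solves it for one specific case: fitness distributed uniformly on [0, 1]. That distribution is an assumption of this example, not something the slide fixes. For uniform ρ(η) the integral has the closed form C ln(C/(C-1)) - 1, so the root can be found by bisection.

```python
import math

def condition(c):
    """Integral of 1/(C/eta - 1) over eta in [0, 1], minus 1, for uniform rho(eta).

    The integral evaluates to C ln(C/(C-1)) - 1; a root of this function
    satisfies the normalisation condition of the fitness model.
    """
    return (c * math.log(c / (c - 1.0)) - 1.0) - 1.0

def solve_c(lo=1.0001, hi=10.0, tol=1e-12):
    """Bisection; condition() decreases monotonically in C on (1, infinity)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if condition(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

C = solve_c()
print("C =", round(C, 4))
for eta in (0.25, 0.5, 1.0):
    # dynamic exponent beta(eta) = eta / C: fitter nodes grow faster
    print("eta =", eta, "beta =", round(eta / C, 4))
```

Since β(η) = η/C grows with η, a late but fit node can overtake older nodes, which answers the "can latecomers make it?" question in this setting.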
Bose-Einstein Condensation in Evolving Networks
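Network: $\Pi_i = \dfrac{\eta_i k_i}{\sum_j \eta_j k_j}$ (fit-gets-rich)
Bose gas: $n(\epsilon) = \dfrac{1}{e^{\beta \epsilon} - 1}$ (Bose-Einstein condensation)
Mapping (network ↔ Bose gas):
$\eta \leftrightarrow e^{-\beta \epsilon}$
$k_{in}(\eta) \leftrightarrow n(\epsilon)$
$\rho(\eta) \leftrightarrow g(\epsilon)$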
G. Bianconi and A.-L. Barabási, Physical Review Letters 2001; cond-mat/0011029
Next Lecture:
(9 am tomorrow morning)
-The Web of Life: Networks in biological systems
(metabolic and protein interaction networks)
-Modeling the Internet
-Achilles’ Heel: robustness, error and attack tolerance
-Why Bacon?
http://www.nd.edu/~networks
Note:
There are two postdoctoral positions open in my research group.
For more details see www.nd.edu/~networks
INTERNET BACKBONE
Nodes: computers, routers
Links: physical lines
(Faloutsos, Faloutsos and Faloutsos, 1999)
Spatial Distributions
Router
density
Population
density
Spatial Distribution of Routers
Fractal set
Box counting: N(l) ≡ No. of boxes
of size l that contain routers
$N(l) \sim l^{-D_f}$
$D_f = 1.5$
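A minimal box-counting sketch, assuming points in the unit square rather than real router coordinates (which are not available here). It counts occupied boxes N(l) at several box sizes and fits the slope of log N(l) against log(1/l) to estimate D_f. The two test sets, a straight line and a uniformly filled square, are hypothetical and chosen because their dimensions (1 and 2) are known.

```python
import math
import random

def box_count(points, size):
    """N(l): number of boxes of side `size` that contain at least one point."""
    return len({(math.floor(x / size), math.floor(y / size)) for x, y in points})

def fractal_dimension(points, sizes):
    """Least-squares slope of log N(l) against log(1/l)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

rng = random.Random(7)
line = [(i / 10000, i / 10000) for i in range(10000)]            # 1-D set
square = [(rng.random(), rng.random()) for _ in range(20000)]    # 2-D set
sizes = [1 / 4, 1 / 8, 1 / 16, 1 / 32]

d_line = fractal_dimension(line, sizes)
d_square = fractal_dimension(square, sizes)
print("line:", round(d_line, 2), "square:", round(d_square, 2))
```

A set with D_f = 1.5, as quoted for the routers, would fall between these two reference cases.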
Preferential Attachment
• Compare maps taken at different times (Δt = 6 months)
• Measure Δk(k), the increase in the No. of links for a node
with k links
Preferential Attachment:
Δk(k) ~ k
INTERNET
$N(l) \sim l^{-D_f}$, $D_f = 1.5$
$\Delta k(k) \sim k^{\alpha}$, α = 1
$P(d) \sim d^{-\sigma}$, σ = 1
Parameter (σ, α, $D_f$) dependence of P(k) and P(d)
What is the topology of cellular networks?
Argument 1:
Cellular networks are
scale-free!
Argument 2:
Cellular networks are
exponential!
Reason:
They formed one node
at a time…
Reason:
They have been streamlined
by evolution...
GENOME
protein-gene
interactions
PROTEOME
protein-protein
interactions
METABOLISM
Bio-chemical
reactions
Citrate Cycle
Metabolic Network
Nodes: chemicals (substrates)
Links: bio-chemical reactions
Metabolic network
Archaea
Bacteria
Eukaryotes
Organisms from all three domains of life are
scale-free networks!
H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)
Node-node distance in metabolic networks
[Figure: example graph with nodes 1 to 7]
$D_{15} = 2$ [1→2→5]
$D_{17} = 4$ [1→3→4→6→7]
… D = ??
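Node-node distances like the two above are shortest path lengths, which breadth-first search finds directly. A small sketch follows; the edge list is an assumption, chosen only so that it contains the two paths quoted on the slide (the figure's actual edges are not recoverable from the transcript).

```python
from collections import deque

def distances_from(adj, source):
    """Shortest path length from `source` to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_distance(adj):
    """Mean shortest path length over all ordered pairs of connected nodes."""
    total = pairs = 0
    for s in adj:
        for t, d in distances_from(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

# Assumed edges, consistent with the paths 1-2-5 and 1-3-4-6-7.
edges = [(1, 2), (2, 5), (1, 3), (3, 4), (4, 6), (6, 7)]
adj = {node: set() for node in range(1, 8)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

d = distances_from(adj, 1)
print("D15 =", d[5], "D17 =", d[7])
print("average D =", round(average_distance(adj), 3))
```

Averaging these distances over all pairs gives the quantity D whose dependence on network size is discussed next.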
Scale-free networks:
D~log(N)
Larger organisms are expected
to have a larger diameter!
PROTEOME
protein-protein
interactions
Yeast protein network
Nodes: proteins
Links: physical interactions (binding)
P. Uetz, et al. Nature 403, 623-7 (2000).
Topology of the protein network
$P(k) \sim (k + k_0)^{-\gamma} \exp\!\left(-\dfrac{k + k_0}{k_\tau}\right)$
H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001)
Nature 408 307 (2000)
…
“One way to understand the p53 network
is to compare it to the Internet.
The cell, like the Internet, appears to
be a ‘scale-free network’.”
p53 network (mammals)
Robustness
Complex systems maintain their basic functions
even under errors and failures
(cell ↔ mutations; Internet ↔ router breakdowns)
[Figure: S vs. fraction of removed nodes f under node failure, both axes from 0 to 1, with the critical fraction fc marked]
Robustness of scale-free networks
Failures
Topological error tolerance
γ ≤ 3: fc = 1
Attacks
(R. Cohen et al PRL, 2000)
[Figure: S vs. f, both axes from 0 to 1, with fc marked]
Achilles’ Heel of complex networks
failure
attack
Internet
R. Albert, H. Jeong, A.L. Barabasi, Nature 406 378 (2000)
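The difference between failure and attack can be illustrated on a toy hub-and-spoke network. This is a sketch under stated assumptions: the star graph is hypothetical, S is taken to be the largest connected cluster as a fraction of all original nodes, and the attack ranks nodes by their initial degree without recomputing after each removal.

```python
import random

def largest_cluster_fraction(adj, removed):
    """S: largest connected cluster among surviving nodes, as a fraction of all nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                      # depth-first walk of one cluster
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(adj)

def remove_nodes(adj, fraction, attack, rng):
    """Choose nodes to remove: highest degree first (attack) or at random (failure)."""
    count = round(fraction * len(adj))
    if attack:
        order = sorted(adj, key=lambda node: len(adj[node]), reverse=True)
    else:
        order = list(adj)
        rng.shuffle(order)
    return set(order[:count])

# Hypothetical star: node 0 is the hub, nodes 1..99 each link only to the hub.
adj = {0: set(range(1, 100))}
for leaf in range(1, 100):
    adj[leaf] = {0}

rng = random.Random(1)
s_attack = largest_cluster_fraction(adj, remove_nodes(adj, 0.01, True, rng))
s_failure = largest_cluster_fraction(adj, remove_nodes(adj, 0.01, False, rng))
print("attack S:", s_attack, "failure S:", s_failure)
```

Removing the single hub shatters the network into isolated nodes, whereas a randomly failed node is almost always a leaf and leaves the rest connected. The same functions can be applied to any adjacency dict, for instance a grown scale-free network.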
Yeast protein network
- lethality and topological position -
Highly connected proteins are more essential (lethal)...
H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001)
Complexity
Network
Science collaboration
WWW
Food Web
Scale-free network
Citation pattern
Internet
Cell
UNCOVERING ORDER HIDDEN WITHIN COMPLEX SYSTEMS
Traditional modeling:
Network as a static graph
Given a network with N nodes and L links
→
Create a graph with statistically identical topology
RESULT: model the static network topology
PROBLEM: Real networks are dynamical systems!
Evolving networks
OBJECTIVE: capture the network dynamics
METHOD :
• identify the processes that contribute to the network topology
• develop dynamical models that capture these processes
→
BONUS: get the topology right.
Bonus: Why Kevin Bacon?
Measure the average distance between Kevin Bacon and all other actors.
Kevin Bacon
Is Kevin Bacon
the most
connected actor?
NO!
No. of movies : 46
No. of actors : 1811
Average separation: 2.79
Rank  Name               Average distance  # of movies  # of links
1     Rod Steiger        2.537527          112          2562
2     Donald Pleasence   2.542376          180          2874
3     Martin Sheen       2.551210          136          3501
4     Christopher Lee    2.552497          201          2993
5     Robert Mitchum     2.557181          136          2905
6     Charlton Heston    2.566284          104          2552
7     Eddie Albert       2.567036          112          3333
8     Robert Vaughn      2.570193          126          2761
9     Donald Sutherland  2.577880          107          2865
10    John Gielgud       2.578980          122          2942
11    Anthony Quinn      2.579750          146          2978
12    James Earl Jones   2.584440          112          3787
…
876   Kevin Bacon        2.786981          46           1811
…
References
• R. Albert, H. Jeong, A.-L. Barabási, Nature 401, 130 (1999).
• A.-L. Barabási, R. Albert, Science 286, 509 (1999).
• A.-L. Barabási, R. Albert, H. Jeong, Physica A 272, 173 (1999).
• R. Albert, H. Jeong, A.-L. Barabási, Nature 406, 378 (2000).
• H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, A.-L. Barabási, Nature 407, 651 (2000).
• H. Jeong, S.P. Mason, A.-L. Barabási, Z.N. Oltvai, Nature 411, 41-42 (2001).
URL: http://www.nd.edu/~networks
Whole cellular network
Properties of the protein network
$P(k) \sim (k + k_0)^{-\gamma} \exp\!\left(-\dfrac{k + k_0}{k_\tau}\right)$
Highly connected proteins are
more essential (lethal) than
less connected proteins.
Whole cellular network
Properties of metabolic networks
Average distances are independent of organisms!
→ by making more links between nodes.
→ based on “design principles” of the cell through evolution.
cf. other scale-free networks: D ~ log(N)
Taxonomy using networks
A: Archaea
B: Bacteria
E: Eukaryotes
Achilles’ Heel of complex network
failure
attack
Internet
Protein network
R. Albert, H. Jeong, A.L. Barabasi, Nature 406 378 (2000)