An Introduction of Complex networks - ZHAO JING

An Introduction of Complex
networks
Zhao Jing
2006.11.22
Outline
I. Network metrics and topological features
II. Modularity and network decomposition
III. Topological diversity of networks with a given degree sequence
I. Network metrics and topological features
Zhao J, Yu H, Luo J, Cao Z, Li Y: Complex networks theory for analyzing metabolic
networks. Chinese Science Bulletin 2006, 51(13):1529-1537.
1.1 Degree distribution vs. scale-free networks
Degree distribution p(k): the occurrence frequency of nodes with degree k (k = 1, 2, …).
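As a rough illustration, p(k) can be tabulated directly from an edge list. This is a minimal sketch and not part of the original slides; the function name `degree_distribution` and the toy edge list are assumptions made for the example.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: p(k)} for an undirected edge list (nodes of degree 0 are not listed)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())  # how many nodes have each degree
    return {k: c / n for k, c in sorted(counts.items())}

# Toy graph (hypothetical): node "a" has degree 3, "b" and "c" degree 2, "d" degree 1.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
p = degree_distribution(edges)
```

Here `p` maps each degree to the fraction of nodes that have it.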
Barabasi, A.L., Albert, R., Emergence of scaling in random networks, Science, 1999,
286:509-512
BA model for network evolution:
(1) Growth: the continuous addition of new nodes.
(2) Preferential attachment: the “rich get richer” principle.
=> High-degree nodes tend to be the ones that appeared at an early stage of network formation.
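The two ingredients above can be sketched in a few lines of Python. This is an illustrative toy, not the authors' code: the seed graph (a complete graph on m + 1 nodes), the function name `barabasi_albert` and the parameters `n`, `m`, `seed` are assumptions.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Grow an undirected graph to n nodes; each new node links to m existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    # Assumed seed graph: a complete graph on the first m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.append((u, v))
    # 'stubs' lists every node once per incident edge, so a uniform draw
    # from it is a degree-proportional draw (preferential attachment).
    stubs = [x for e in edges for x in e]
    for new in range(m + 1, n):          # growth: one new node per step
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = barabasi_albert(200, 2)
```

In a run like this, the nodes added first usually end up with the highest degrees, which is the point made on the slide.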
Thirteen hub metabolites in the E. coli metabolic network
Wagner, A., Fell, D.A., The small world inside large metabolic networks, Proc R Soc Lond B,
2001, 268:1803-1810.
Performance of scale-free networks:
Error tolerance: high resistance to random perturbations.
Attack vulnerability: the removal of a few hub nodes can fragment the whole network.
=> The most highly connected proteins in the cell are the most important for its survival.
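The contrast between random errors and targeted attacks can be tried on a small synthetic network. The sketch below is an assumption-laden toy, not the experiment from the cited papers: it grows a preferential-attachment graph, removes 10% of the nodes either at random or by highest degree, and compares the size of the largest connected component. All names (`ba_graph`, `largest_component`) and parameter values are hypothetical.

```python
import random

def ba_graph(n, m, rng):
    """Preferential-attachment graph as an adjacency dict (assumed seed: clique on m + 1 nodes)."""
    adj = {i: set() for i in range(n)}
    stubs = []
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v); adj[v].add(u); stubs += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t); adj[t].add(new); stubs += [new, t]
    return adj

def largest_component(adj, removed):
    """Size of the largest connected component after deleting the nodes in 'removed'."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

rng = random.Random(1)
g = ba_graph(1000, 2, rng)
n_remove = 100
random_removed = set(rng.sample(sorted(g), n_remove))                       # "errors"
hubs_removed = set(sorted(g, key=lambda v: len(g[v]), reverse=True)[:n_remove])  # "attack"
size_after_errors = largest_component(g, random_removed)
size_after_attack = largest_component(g, hubs_removed)
```

Comparing `size_after_attack` with `size_after_errors` shows how much more damage the hub removal does than the same number of random failures.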
Albert, R., Jeong, H., Barabasi, A.-L., Error and attack tolerance of complex networks, Nature,
2000, 406:378-382.
Jeong, H., Mason, S.P., Barabasi, A.L., Oltvai, Z.N., Lethality and centrality in protein networks,
Nature, 2001, 411:41-42.
Notice: Computation of the exponent
Cumulative distribution: $P(x \ge k) = \sum_{i \ge k} p(i)$
$p(k) \sim k^{-\gamma}$
$P(x \ge k) \sim k^{-(\gamma - 1)}$
Log-log plot of the degree distribution (A) and cumulative degree distribution (B)
for a network of 20000 nodes constructed by Barabasi-Albert preferential
attachment model.
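A possible way to apply this in practice is to build the cumulative distribution and fit a straight line on the log-log plot; the exponent of p(k) is then one minus the fitted slope. The sketch below assumes a plain least-squares fit, which is only one of several fitting choices, and the helper names are hypothetical.

```python
import math

def cumulative_distribution(p):
    """P(x >= k) = sum of p(i) over i >= k, given a dict {k: p(k)}."""
    total, cum = 0.0, {}
    for k in sorted(p, reverse=True):
        total += p[k]
        cum[k] = total
    return dict(sorted(cum.items()))

def loglog_slope(points):
    """Least-squares slope of log(y) against log(x) for (x, y) pairs."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

P = cumulative_distribution({1: 0.5, 2: 0.3, 4: 0.2})
# Sanity check on an exact power law y = x^-2: the fitted slope should be -2.
slope = loglog_slope([(k, k ** -2.0) for k in range(1, 101)])
# For a cumulative degree distribution, the exponent estimate would be 1 - slope.
```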
A.-L. Barabási
R. Albert
Notre Dame Univ.
Barabási is the 2006 recipient of the John von Neumann Medal.
The award has been presented since 1976 to a maximum of three
individuals who have gained distinction in the dissemination of
computer culture. Previous recipients of the award include
Microsoft founder Bill Gates, former IBM chairman Louis Gerstner
and Intel Corporation board chair Andrew Grove.
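1.2 Clustering coefficient vs. hierarchical modular networks
$CC(v) = \dfrac{2N(v)}{d(v)\,(d(v) - 1)}$
where d(v) is the degree of node v and N(v) is the number of links among the neighbors of v.
$C(k) \sim k^{-1}$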
Ravasz E, Somera A L, Mongru D A, Oltvai Z N, Barabasi A L, Hierarchical organization of modularity in metabolic networks, Science, 2002, 297:1551-1555
Complex systems usually have a hierarchical structure, the entities of one level being compounded into new entities at the next higher level, as cells into tissues, tissues into organs, and organs into functional systems.
The whole is greater than the sum of its parts!
Life’s Complexity Pyramid: from the particular to the universal
At each new level of complexity in biology new and unexpected qualities appear, qualities which apparently cannot be reduced to the properties of the component parts.
Oltvai, Z.N., Barabási, A.-L., Life’s Complexity Pyramid, Science, 2002, 298:763-764.
Mayr, E., “How biology differs from the physical sciences”, in Evolution at a Crossroads: The New Biology and the New Philosophy of Science, MIT Press, Cambridge, 1985.
Davies, P., The Cosmic Blueprint, Simon and Schuster, 1988.
1.3 Mean path length vs. small-world networks
Small-world network: small mean path length and high clustering coefficient.
Small-world cell networks => the cell may react quickly to changes in its surroundings.
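The mean path length can be obtained with one breadth-first search per node. The sketch below is illustrative only: it averages over reachable ordered pairs, and the ring lattice with one added shortcut is a hypothetical example of how a single long-range link shortens paths.

```python
from collections import deque

def mean_path_length(adj):
    """Average shortest-path length over all reachable ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # breadth-first search from s
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def ring(n):
    """Ring lattice: each node is linked to its two nearest neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

lattice = ring(6)
shortcut = ring(6)
shortcut[0].add(3)
shortcut[3].add(0)                        # one long-range link across the ring
l_lattice = mean_path_length(lattice)
l_shortcut = mean_path_length(shortcut)
```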
Watts, D.J., Strogatz, S.H., Collective dynamics of `small-world' networks, Nature, 1998,
393:440-442.
1.4 Assortativity coefficient vs. degree-degree correlation
Newman, M.E.J., Assortative mixing in networks, Phys Rev Lett, 2002, 89:208701.
The average connectivity <knn> of the nearest neighbors of a node as a function of its connectivity k for the 1998 snapshot of the Internet, the generalized BA model and the fitness model.
Romualdo Pastor-Satorras, Alexei Vázquez, and Alessandro Vespignani, Dynamical and Correlation Properties of the Internet, Physical Review Letters, Volume 87, Number 25 (2002)
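Both views of degree-degree correlation can be sketched in Python: the assortativity coefficient as the Pearson correlation between the degrees at the two ends of an edge, and <knn> averaged over nodes of the same degree k. The code is an illustration for simple undirected graphs; the function names and the star-graph example are assumptions.

```python
def assortativity(edges):
    """Pearson correlation of the degrees found at the two ends of each edge.
    Undefined (division by zero) when all nodes have the same degree."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so the measure is symmetric.
    xs = [degree[u] for u, v in edges] + [degree[v] for u, v in edges]
    ys = [degree[v] for u, v in edges] + [degree[u] for u, v in edges]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var

def knn_by_degree(edges):
    """Return {k: <knn>(k)}: mean neighbor degree, averaged over nodes of degree k."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    sums, counts = {}, {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        knn = sum(len(adj[w]) for w in nbrs) / k
        sums[k] = sums.get(k, 0.0) + knn
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}

# A star is maximally disassortative: the hub only meets leaves and vice versa.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
r = assortativity(star)
knn = knn_by_degree(star)
```

A decreasing <knn>(k), as in the star, indicates disassortative mixing; an increasing one indicates assortative mixing.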
Correlation profiles of the protein interaction network in yeast. Z-scores for connectivity correlations:
$Z(K_0, K_1) = \dfrac{P(K_0, K_1) - P_r(K_0, K_1)}{\sigma_r(K_0, K_1)}$
where $\sigma_r(K_0, K_1)$ is the standard deviation of $P_r(K_0, K_1)$ in 1000 realizations of a randomized network.
Maslov, S., Sneppen, K., Specificity and Stability in Topology of Protein Networks, Science,
2002, 296:910-913.
1.5 Rich-club coefficient and rich-club phenomenon
Notice: Rich-club ordering is not the same as assortative mixing.
Colizza V, Flammini A, Serrano MA, Vespignani A: Detecting rich-club ordering in
complex networks. Nat Phys 2006, 2(2):110-115.
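The slide's formula for the rich-club coefficient did not survive extraction. The sketch below uses the commonly quoted form, the density of links among nodes of degree larger than k, i.e. 2E/(N(N - 1)) for the N such nodes and the E links among them. It is shown without the normalization against randomized networks that the cited paper argues is needed; names and the toy graph are hypothetical.

```python
def rich_club_coefficient(adj, k):
    """Link density among the nodes whose degree exceeds k (None if fewer than 2 such nodes)."""
    rich = {v for v in adj if len(adj[v]) > k}
    n = len(rich)
    if n < 2:
        return None
    e = sum(1 for v in rich for w in adj[v] if w in rich) // 2  # each link seen twice
    return 2 * e / (n * (n - 1))

# Toy graph (hypothetical): a triangle of hubs 0, 1, 2, each carrying two leaves.
edge_list = [(0, 1), (1, 2), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (2, 7), (2, 8)]
adj = {}
for u, v in edge_list:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

phi_0 = rich_club_coefficient(adj, 0)   # all 9 nodes
phi_1 = rich_club_coefficient(adj, 1)   # only the three hubs
phi_4 = rich_club_coefficient(adj, 4)   # nobody has degree > 4
```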
1.6 k-core
1-, 2- and 3-core. Two basic properties of cores: first, cores may be disconnected subgraphs; second, cores are nested: for i>j, an i-core is a subgraph of a j-core of the same graph.
=> The probability of a node being both essential and evolutionarily conserved increases successively toward the innermost cores.
Wuchty, S., Almaas, E., Peeling the yeast protein network, Proteomics, 2005, 5:444-449.
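A k-core can be found by repeatedly peeling off nodes with fewer than k remaining neighbors. The following is a minimal sketch of that peeling idea on a hypothetical toy graph; it also illustrates the nesting property mentioned above.

```python
def k_core(adj, k):
    """Return the node set of the k-core: the largest subgraph with all degrees >= k."""
    core = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    low = [v for v in core if len(core[v]) < k]
    while low:
        v = low.pop()
        if v not in core:
            continue                                    # already peeled
        for w in core.pop(v):
            core[w].discard(v)
            if len(core[w]) < k:
                low.append(w)                           # w may now fall below k
    return set(core)

# Toy graph (hypothetical): triangle a-b-c with a tail a-d-e.
adj = {
    "a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"},
    "d": {"a", "e"}, "e": {"d"},
}
core1, core2, core3 = k_core(adj, 1), k_core(adj, 2), k_core(adj, 3)
```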
3-core of the E. coli metabolic network
Zhao J, Tao L, Yu H, Luo J-H, Cao ZW, Li Y: Bow-tie topological features of metabolic networks and the functional significance. e-print q-bio.MN/0611013, 2006.
1.7 Betweenness centrality
Betweenness centrality is based on the assumption that information is transmitted along shortest paths.
Node betweenness: the number of shortest paths between pairs of nodes that pass through this node.
Edge betweenness: the number of shortest paths between pairs of nodes that run along this edge.
=> Nodes and edges of high betweenness centrality could be bottlenecks of the network, and thus could be important enzymes or metabolites.
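Node betweenness is commonly computed with Brandes' accumulation scheme, sketched below for unweighted, undirected graphs. Note one assumption that differs slightly from the wording above: when a pair of nodes has several shortest paths, each path contributes a fraction, so a node receives the share of shortest paths through it rather than a raw count.

```python
from collections import deque

def node_betweenness(adj):
    """Brandes-style betweenness for an unweighted, undirected adjacency dict."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                       # back-propagate path dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}   # each unordered pair was counted twice

path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
bc_path = node_betweenness(path)
bc_cycle = node_betweenness(cycle)
```

In the path, only the middle node lies between a pair; in the 4-cycle, opposite nodes are joined by two shortest paths, so each intermediate node gets half a path.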
Rahman, S.A., Schomburg, D., Observing local and global properties of metabolic pathways: 'load
points' and 'choke points' in the metabolic networks, Bioinformatics, 2006, 22:1767-1774.
1.8 Null Model and Z-score
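$Z = \dfrac{P - P_r}{\sigma_r}$
where P is the value of a topological property in the real network, $P_r$ is its value in the randomized (null-model) networks, and $\sigma_r$ is the standard deviation of $P_r$ over the randomized networks.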
Maslov, S., Sneppen, K., Specificity and Stability in Topology of Protein Networks, Science,
2002, 296:910-913.
Maslov S, Sneppen K, Zaliznyak A: Detection of topological patterns in complex networks: correlation
profile of the internet. Physica A: Statistical and Theoretical Physics 2004, 333:529-540.
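A common null model in this context keeps every node's degree fixed and shuffles who is connected to whom by swapping the ends of randomly chosen edge pairs. The sketch below is one simple way to do that and to turn the results into a Z-score; the function names, the number of swaps and the use of the mean of the randomized values for $P_r$ are assumptions.

```python
import random
import statistics

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: swap the ends of two random edges,
    rejecting swaps that would create self-loops or duplicate edges."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c                   # try both orientations of the second edge
        if len({a, b, c, d}) < 4:
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def z_score(observed, random_values):
    """Z = (P - mean(P_r)) / std(P_r); fails if all randomized values are identical."""
    return (observed - statistics.mean(random_values)) / statistics.pstdev(random_values)

rng = random.Random(0)
original = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
shuffled = rewire(original, 20, rng)
example_z = z_score(5, [1, 3])
```

In practice one would measure the same property on many rewired copies and pass those values to `z_score`.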
II. Modularity and network decomposition
Zhao J, Yu H, Luo J, Cao Z, Li Y: Complex networks theory for analyzing metabolic
networks. Chinese Science Bulletin 2006, 51(13):1529-1537.
2.1 Modularity:
From the functional view:
Modularity: the system can be decomposed into parts (modules), such that the function of each part is more complex than a basic combination of the input into a new output.
From the topological view:
Assumption:
A densely connected subnetwork corresponds to a "part with complex function."
Modularity: a network can be divided into groups of vertices that have a high density of edges within them, with a lower density of edges between groups.
Hartwell LH, Hopfield JJ, Leibler S, Murray AW: From molecular to modular cell biology. Nature
1999, 402:C47-C52.
Papin JA, Reed JL, Palsson BO: Hierarchical thinking in network biology: the unbiased modularization of biochemical networks. Trends in Biochemical Sciences 2004, 29:641-647.
For a given decomposition of a network, the modularity metric is defined as:
$M = \sum_{i=1}^{r} \left[ e_{ii} - \left( \sum_{j} e_{ij} \right)^2 \right]$
The modularity metric of a network is defined as the largest modularity
metric of all possible partitions of the network.
The modularity of networks must always be compared to the null case
of a random graph.
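The modularity of a given partition can be evaluated directly from an edge list. The sketch below assumes the usual reading of the symbols, which the slide does not spell out: r is the number of modules, e_ii is the fraction of edges that fall inside module i, and the sum of e_ij over j is the fraction of edge ends attached to module i. The function name and the toy graph are hypothetical.

```python
def modularity(edges, module_of):
    """M = sum over modules of [fraction of edges inside - (fraction of edge ends)^2]."""
    m = len(edges)
    within, ends = {}, {}
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        ends[mu] = ends.get(mu, 0) + 1
        ends[mv] = ends.get(mv, 0) + 1
        if mu == mv:
            within[mu] = within.get(mu, 0) + 1
    return sum(within.get(s, 0) / m - (ends[s] / (2 * m)) ** 2 for s in ends)

# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_module = {v: "A" for v in range(6)}
m_two = modularity(edges, two_modules)
m_one = modularity(edges, one_module)
```

Putting everything in one module always gives M = 0, which is why a good partition is judged by how far above zero, and above the random-graph value, it scores.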
Newman M: Detecting community structure in networks. Eur Phys J B 2004, 38:321-330.
Guimera R, Sales-Pardo M, Amaral LAN: Modularity from fluctuations in random graphs and
complex networks. Physical Review E 2004, 70:025101.
2.2 Simulated annealing method:
$\max M = \max \sum_{i=1}^{r} \left[ e_{ii} - \left( \sum_{j} e_{ij} \right)^2 \right]$
Guimera R, Nunes Amaral LA: Functional cartography of complex metabolic networks. Nature 2005,
433(7028):895-900.
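The idea of maximizing M by simulated annealing can be illustrated with a deliberately simplified sketch. It uses only single-node moves and a geometric cooling schedule; the cited method also uses collective merge and split moves, which are omitted here. The parameter values (`steps`, `t0`, `cooling`) and function names are assumptions.

```python
import math
import random

def modularity(edges, module_of):
    """M = sum over modules of [fraction of edges inside - (fraction of edge ends)^2]."""
    m = len(edges)
    within, ends = {}, {}
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        ends[mu] = ends.get(mu, 0) + 1
        ends[mv] = ends.get(mv, 0) + 1
        if mu == mv:
            within[mu] = within.get(mu, 0) + 1
    return sum(within.get(s, 0) / m - (ends[s] / (2 * m)) ** 2 for s in ends)

def anneal(edges, steps=2000, t0=0.1, cooling=0.995, seed=0):
    """Search for a high-modularity partition; returns (best partition, best M)."""
    rng = random.Random(seed)
    nodes = sorted({x for e in edges for x in e})
    current = {v: i for i, v in enumerate(nodes)}     # start: every node alone
    m_cur = modularity(edges, current)
    best, m_best = dict(current), m_cur
    t = t0
    for _ in range(steps):
        v = rng.choice(nodes)
        old = current[v]
        current[v] = rng.randrange(len(nodes))        # propose a new module label
        m_new = modularity(edges, current)
        # Always accept improvements; accept losses with Boltzmann probability.
        if m_new >= m_cur or rng.random() < math.exp((m_new - m_cur) / t):
            m_cur = m_new
            if m_cur > m_best:
                best, m_best = dict(current), m_cur
        else:
            current[v] = old                          # reject the move
        t *= cooling                                  # cool down
    return best, m_best

edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best, best_m = anneal(edges)
```

Recomputing M from scratch at every step keeps the sketch short; a practical implementation would update it incrementally.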
2.3 Hierarchical clustering method:
Similarity index (or dissimilarity index): signifies the extent to which two nodes belong in the same cluster.
Agglomerative method: start with each node as its own cluster. At each step, combine the two most similar clusters to form a new, larger cluster, until all nodes have been combined into one cluster.
Divisive method: begin with one cluster that includes all the nodes, and attempt to find the splitting point at which the two resulting clusters are as dissimilar as possible.
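The agglomerative procedure can be sketched generically for any dissimilarity function. The example below assumes average linkage between clusters, which is one of several possible choices, and requires the items to be distinct; the function name and the toy data are hypothetical.

```python
def agglomerate(items, dissimilarity):
    """Average-linkage agglomerative clustering; returns the merges in order."""
    clusters = [(x,) for x in items]      # every item starts as its own cluster
    merges = []

    def linkage(a, b):
        # Mean pairwise dissimilarity between the members of two clusters.
        return sum(dissimilarity(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(
            ((p, q) for i, p in enumerate(clusters) for q in clusters[i + 1:]),
            key=lambda pair: linkage(*pair),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges

# Toy data (hypothetical): four points on a line, dissimilarity = distance.
merges = agglomerate([0, 1, 10, 11], lambda x, y: abs(x - y))
```

The sequence of merges is exactly what a dendrogram draws from the leaves upward.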
Topological overlap algorithm: substrate graph
$O_T(i, j) = \dfrac{J_n(i, j)}{\min(k_i, k_j)}$
$J_n(i, j)$ denotes the number of nodes to which both i and j are linked (plus 1 if there is a direct link between i and j); $k_i$ and $k_j$ are the degrees of i and j, respectively.
Agglomerative method.
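The topological overlap of two nodes follows directly from the definition above. This is a minimal sketch on a hypothetical path graph, with the graph stored as a dict of neighbor sets.

```python
def topological_overlap(adj, i, j):
    """O_T(i, j) = J_n(i, j) / min(k_i, k_j)."""
    common = len(adj[i] & adj[j])         # nodes linked to both i and j
    direct = 1 if j in adj[i] else 0      # plus 1 for a direct link
    return (common + direct) / min(len(adj[i]), len(adj[j]))

# Toy graph (hypothetical): the path a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
o_ab = topological_overlap(adj, "a", "b")
o_ac = topological_overlap(adj, "a", "c")
o_bc = topological_overlap(adj, "b", "c")
o_ad = topological_overlap(adj, "a", "d")
```

The resulting values can serve as the similarity index for an agglomerative clustering.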
Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL: Hierarchical Organization of
Modularity in Metabolic Networks. Science 2002, 297(5586):1551-1555
Shortest path algorithm: enzyme graph
$\mathrm{dissimilarity}(i, j) = \min(d(i, j), d(j, i))$
d(i, j) is the number of arcs in the shortest directed path from i to j.
Agglomerative method.
Ma H-W, Zhao X-M, Yuan Y-J, Zeng A-P: Decomposition of metabolic network into functional
modules based on the global connectivity structure of reaction graph. Bioinformatics 2004,
20(12):1870-1876.
Betweenness method: substrate-enzyme bipartite graph
$C_B(r) = \dfrac{1}{k_{in}(r)} \sum_{s \ne t} \dfrac{\sigma_r(s, t)}{\sigma(s, t)}$
$\sigma_r(s, t)$ is the number of shortest paths between s and t that pass through r, $\sigma(s, t)$ is the total number of shortest paths between s and t, and $k_{in}(r)$ is the in-degree of node r.
Divisive method.
Holme P, Huss M, Jeong H: Subnetwork hierarchies of biochemical pathways. Bioinformatics
2003, 19(4):532-538.
Corrected Euclidean-like dissimilarity algorithm: substrate graph
$D(i, j) = (d_{ij} - d_{ji})^2 + \sum_{k=1,\; k \ne i, j}^{N} \left[ (d_{ki} - d_{kj})^2 + (d_{ik} - d_{jk})^2 \right]$
where $d_{ij} = d(i, j)$ is the number of arcs in the shortest directed path from i to j.
Agglomerative method.
Zhao J, Yu H, Luo J, Cao Z, Li Y: Hierarchical modularity of nested bow-ties in metabolic networks. BMC Bioinformatics 2006, 7:386.
2.4 Relationship between topological modules and functional modules
Case 1: Some modules are dominated by one major category of metabolism.
Zhao J, Yu H, Luo J, Cao Z, Li Y: Hierarchical modularity of nested bow-ties in metabolic networks. BMC Bioinformatics 2006, 7:386.
Case 2: A standard textbook pathway can break into several modules.
Case 3: Some modules are mixtures of pieces of several conventional biochemical pathways.
III. Topological diversity of networks with a given degree sequence
-- The degree sequence alone tells us little
Graphs with the same degree sequence can show significant topological diversity.
Zhao J, Tao L, Yu H, Luo J-H, Cao Z-W, Li Y-X: The spectrum of degree correlations: topological diversity of networks with a given degree sequence. e-print physics/0611078, 2006.
Holme P, Zhao J: Exploring the assortativity-clustering space of a network's degree sequence. e-print q-bio.OT/0611020, 2006.
Thanks!