Network theory III
David Lusseau
BIOL4062/5062
Outline
16 March: community structure
Suggested readings:
Newman M.E.J. 2003. The structure and function of complex networks. SIAM Review 45, 167-256.
What is a community?
A cluster of individuals that are more linked to
one another than to others
Traditional techniques
Cluster analysis (hierarchical)
Multi-Dimensional Scaling
Principal Coordinate Analysis
Traditional techniques
How representative is the result?
Loss of information measure: Stress in MDS
What is the best division?
Cluster analysis
Peripheral individuals are lumped together
Girvan-Newman algorithm
Divisive clustering algorithm
Find the boundaries of communities
Divide a population of n vertices into 1 to n communities
Weakest link between communities: edge betweenness
Standardise betweenness at each step
Re-calculate edge betweenness at each step
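The steps above can be sketched in Python. This is a minimal illustration, not the published implementation: it assumes an unweighted, undirected graph without self-loops, stored as a dict of neighbour sets, and the function names (`edge_betweenness`, `components`, `girvan_newman_step`) are illustrative. Standardising betweenness is omitted because it does not change which edge ranks highest within a step.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness by breadth-first search from every vertex."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Forward pass: count shortest paths (sigma) from s to every vertex.
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: share path credit among the edges that carry it.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    # Every path was counted once from each of its two end vertices.
    return {e: b / 2.0 for e, b in score.items()}

def components(adj):
    """Connected components, returned as a list of vertex sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v in part:
                continue
            part.add(v)
            stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts

def girvan_newman_step(adj):
    """Remove highest-betweenness edges until one more community appears."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start and any(adj.values()):
        bet = edge_betweenness(adj)  # recalculated after every removal
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Toy example: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
parts = girvan_newman_step(adj)
```

Calling `girvan_newman_step` repeatedly on the resulting graph would give the successive divisions from 1 to n communities; here a single step removes the bridge 2-3.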
Zachary karate club
Girvan & Newman 2002 PNAS
Finding the best division
For each step calculate a modularity coefficient
Best division will have the most edges within
communities and the least between
Take community size into consideration
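$Q = \sum_i \left( e_{ii} - a_i^2 \right)$
$a_i = \sum_j e_{ij}$
Worked example: links within and between three communities (total 108)

      1    2    3
1    30    2    5
2     2   10    2
3     5    2   50

$Q = \left( \frac{30}{108} - \left( \frac{37}{108} \right)^2 \right) + \left( \frac{10}{108} - \left( \frac{14}{108} \right)^2 \right) + \left( \frac{50}{108} - \left( \frac{57}{108} \right)^2 \right)$
Q = 0.42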
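The worked example (Q = sum over i of e_ii - a_i^2, with a_i the row sum of e) can be checked numerically. A minimal Python sketch, assuming the 3 x 3 table holds counts that give e_ij once divided by their total of 108; the function name `modularity_from_counts` is illustrative.

```python
def modularity_from_counts(counts):
    """Q = sum_i (e_ii - a_i^2) for a symmetric community-by-community count matrix."""
    total = sum(sum(row) for row in counts)
    q = 0.0
    for i, row in enumerate(counts):
        e_ii = row[i] / total    # fraction falling within community i
        a_i = sum(row) / total   # fraction attached to community i
        q += e_ii - a_i ** 2
    return q

counts = [[30, 2, 5],
          [2, 10, 2],
          [5, 2, 50]]
q = modularity_from_counts(counts)
print(round(q, 2))  # 0.42
```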
Zachary karate club
Newman & Girvan 2003 Physical Review E
Modularity coefficient
The principle of modularity coefficient
optimisation can be applied to any community
structure algorithm
Extension to weighted matrices
Edge betweenness
Transform similarity matrix into dissimilarity matrix
Calculate geodesic paths using Dijkstra's algorithm
Problem: more likely to remove edges
between strongly connected pairs
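A minimal Python sketch of the two steps above. The slides do not say which transformation is used, so the inverse (distance = 1 / similarity, with zero similarity meaning no link) is an assumption made here for illustration; the function names are also illustrative.

```python
import heapq

def to_dissimilarity(sim):
    """Assumed transform: distance = 1 / similarity; zero similarity means no link."""
    graph = {}
    for i, row in enumerate(sim):
        graph[i] = {j: 1.0 / w for j, w in enumerate(row) if j != i and w > 0}
    return graph

def dijkstra(graph, source):
    """Shortest (geodesic) distances from source on a weighted graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry
        for w, length in graph[v].items():
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

# Individuals 0 and 2 are weakly associated directly (0.1),
# but both are strongly associated with individual 1 (0.5).
sim = [[0.0, 0.5, 0.1],
       [0.5, 0.0, 0.5],
       [0.1, 0.5, 0.0]]
dist = dijkstra(to_dissimilarity(sim), 0)
```

In this toy matrix the geodesic from 0 to 2 runs through the two strong links (length 2 + 2 = 4) rather than the weak direct link (length 10). Strong links therefore collect more geodesics and higher betweenness, which is the problem noted above.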
Alternative: Modularity optimisation
Forget edge betweenness
Optimise for high Q!
Computer intensive
Prone to local optima (false maxima of Q)
Difficult to detect
Repeat the optimisation to detect them
Not always successful
Modularity- Greedy algorithm
Start with n communities (agglomerative
clustering method)
At each step, join the pair of communities that provides
the greatest increase (or the smallest decrease)
in Q
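The agglomerative procedure above can be sketched in Python. This is a deliberately naive version for small graphs: it recomputes Q for every candidate merge instead of updating it incrementally, and it assumes an unweighted, undirected graph stored as a dict of neighbour sets. The function names are illustrative.

```python
def partition_modularity(adj, communities):
    """Q = sum over communities of (fraction of links inside - expected fraction)."""
    two_m = sum(len(nb) for nb in adj.values())  # each link counted from both ends
    q = 0.0
    for c in communities:
        inside = sum(1 for v in c for w in adj[v] if w in c)  # 2 x internal links
        degree = sum(len(adj[v]) for v in c)
        q += inside / two_m - (degree / two_m) ** 2
    return q

def greedy_modularity(adj):
    """Start from n singletons, merge the best pair each step, keep the best Q seen."""
    communities = [frozenset([v]) for v in adj]
    best_q = partition_modularity(adj, communities)
    best = list(communities)
    while len(communities) > 1:
        candidates = []
        for i in range(len(communities)):
            for j in range(i + 1, len(communities)):
                merged = [c for k, c in enumerate(communities) if k not in (i, j)]
                merged.append(communities[i] | communities[j])
                candidates.append((partition_modularity(adj, merged), merged))
        # Greatest increase, or smallest decrease, in Q.
        q, communities = max(candidates, key=lambda t: t[0])
        if q > best_q:
            best_q, best = q, list(communities)
    return best_q, best

# Toy example: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
best_q, best = greedy_modularity(adj)
```

The merging continues down to a single community so that the whole sequence of divisions is explored; the division with the highest Q along the way is the one returned.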
Modularity- Greedy algorithm
Q optimisation
Girvan-Newman
Overlapping communities
Recognise that some individuals sit on the fence
Do not force them in one community or the
other but identify them as overlapping
Palla et al. 2005 Nature
Palla algorithm
Based on the k-clique principle: a community is
composed of a number of k-cliques
k-cliques: fully connected subgraphs of k vertices
Adjacent k-cliques share k-1 vertices
Community: series of adjacent cliques
Palla et al. 2005 Nature
Palla algorithm
Find all k-cliques
Calculate the clique-clique overlap matrix
Define adjacent cliques
Issues (and advantages):
k is user-defined, find ‘best’ k by trial and error
Works only on binary networks
(weighted network transformation)
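The three steps above can be sketched in Python for a small binary network. This is a brute-force illustration, not the published implementation: it tests every combination of k vertices, so it is only practical for small graphs, and the function name `k_clique_communities` is illustrative.

```python
from itertools import combinations

def k_clique_communities(adj, k):
    """Communities as unions of adjacent k-cliques (adjacent = sharing k - 1 vertices)."""
    # Step 1: find all k-cliques (fully connected subgraphs of k vertices).
    cliques = [frozenset(c) for c in combinations(sorted(adj), k)
               if all(v in adj[u] for u, v in combinations(c, 2))]
    # Steps 2 and 3: clique-clique overlap, then adjacency when overlap is k - 1.
    neighbours = {c: [] for c in cliques}
    for a, b in combinations(cliques, 2):
        if len(a & b) == k - 1:
            neighbours[a].append(b)
            neighbours[b].append(a)
    # A community is the union of a connected series of adjacent cliques.
    communities, seen = [], set()
    for c in cliques:
        if c in seen:
            continue
        members, stack = set(), [c]
        while stack:
            x = stack.pop()
            if x in seen:
                continue
            seen.add(x)
            members |= x
            stack.extend(neighbours[x])
        communities.append(members)
    return communities

# Toy example: vertex 3 belongs to two communities.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
communities = k_clique_communities(adj, 3)
```

With k = 3 the triangles 0-1-2 and 1-2-3 share two vertices and merge into one community, while triangle 3-4-5 shares only vertex 3 with them and forms a second community, so vertex 3 is identified as overlapping.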
Palla et al. 2005 Nature
Simply the best method
Modularity matrix
A matrix? Let’s eigenanalyse!
Let’s rewrite the modularity coefficient:
$Q = \frac{1}{4m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s_i s_j$
$s_i s_j$: community identification
$\frac{k_i k_j}{2m}$: links distributed at random
Newman 2006 PNAS
Modularity matrix
$B_{ij} = A_{ij} - \frac{k_i k_j}{2m}$
Each row and each column sums to 0
One eigenvector (1,1,1….) with eigenvalue 0
Same property as the graph Laplacian
Eigenvector of the dominant eigenvalue gives
the best community division into 2 communities
(negative and positive elements)
Magnitude of eigenvector elements
Tells us how well a vertex is classified (whether
it belongs to the core or the periphery of the
community)
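A minimal Python sketch of the division into 2 communities, using only the standard library. It builds B_ij = A_ij - k_i k_j / 2m and finds the eigenvector of the most positive eigenvalue by shifted power iteration. The shift, the starting vector and the iteration count are choices made for this illustration, and the function names are illustrative.

```python
def modularity_matrix(A):
    """B_ij = A_ij - k_i k_j / 2m for a symmetric adjacency matrix A."""
    k = [sum(row) for row in A]
    two_m = sum(k)
    n = len(A)
    return [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]

def leading_eigenvector(B, iterations=1000):
    """Power iteration on B + shift*I, so the most positive eigenvalue dominates."""
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)  # bound on |eigenvalues|
    v = [1.0 + 0.01 * i for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def split_in_two(A):
    """Assign vertices to two groups by the sign of their eigenvector element."""
    v = leading_eigenvector(modularity_matrix(A))
    positive = [i for i, x in enumerate(v) if x > 0]
    negative = [i for i, x in enumerate(v) if x <= 0]
    return positive, negative, v

# Toy example: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1
positive, negative, vec = split_in_two(A)
```

In this toy graph the two triangles receive opposite signs, and vertices 2 and 3, which carry the bridge, have smaller magnitudes than the other four, placing them nearer the periphery of their communities. Which triangle gets the positive sign is arbitrary.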
Zachary karate club
Finding the best division
Repeat the process on each subgraph
Recalculate the modularity coefficient for the whole
graph
If a new division makes a zero or negative contribution to
modularity, do not make it
Otherwise continue
Power of modularity matrix method
Different types of null models can be tested
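$Q = \frac{1}{4m} \sum_{ij} \left( A_{ij} - P_{ij} \right) s_i s_j$
$P_{ij}$: expected link between i and j under the null model
As long as we have
One eigenvector (1,1,1,…) with eigenvalue 0
To do so, subtract the row sums from the diagonal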
Uncertainty
Bootstrapped algorithm
m results from community algorithm
Matrix: likelihood that 2 individuals belong to the
same community
Coarse-grain community identity
Provides uncertainty overlap
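The co-membership matrix above can be sketched in Python. The slides do not say how the bootstrap replicates are generated, so this illustration starts from the m results themselves, each given as a list of community labels (one label per individual); the function name `co_membership` is illustrative.

```python
def co_membership(partitions):
    """Proportion of replicates in which individuals i and j share a community."""
    runs = len(partitions)
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]

# Four replicate results for four individuals (made-up labels for illustration).
partitions = [[0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 1, 1, 1],
              [0, 0, 1, 1]]
P = co_membership(partitions)
```

Labels only need to be consistent within a replicate, since the matrix records whether two individuals share a label, not which label it is. Intermediate values, such as individuals 1 and 2 here, point to uncertain or overlapping membership.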
Girvan-Newman in Netdraw
Modularity matrix in Socprog