Transcript Slide 1

Graph spectral analysis/Graph
spectral clustering and its
application to metabolic networks
Graph spectral analysis/
Graph spectral clustering
PROTEIN STRUCTURE: INSIGHTS FROM
GRAPH THEORY
SARASWATHI VISHVESHWARA, K. V. BRINDA and N. KANNAN
Molecular Biophysics Unit, Indian Institute of Science
Bangalore 560012, India
Adjacency Matrix
Laplacian matrix L=D-A
Degree Matrix
Eigenvalues and eigenvectors
The eigenvalues of a matrix A are the roots of the following equation:
|A-λI|=0, where I is the identity matrix.
Let λ be an eigenvalue of A and x a non-zero vector such that
Ax = λx -----(1)
where A is N×N and both x and λx are N×1;
then x is an eigenvector of A corresponding to λ.
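The relation Ax = λx can be checked numerically. The following is a minimal sketch using NumPy; the 2×2 matrix is an arbitrary example chosen for illustration and does not come from the slides.

```python
import numpy as np

# Arbitrary symmetric example matrix (an assumption, not from the slides).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is intended for symmetric matrices and returns the eigenvalues in
# ascending order; column k of `eigenvectors` belongs to eigenvalues[k].
eigenvalues, eigenvectors = np.linalg.eigh(A)

for lam, x in zip(eigenvalues, eigenvectors.T):
    # Each eigenpair satisfies A x = lambda x.
    assert np.allclose(A @ x, lam * x)
    # Each eigenvalue is a root of |A - lambda I| = 0.
    assert abs(np.linalg.det(A - lam * np.eye(2))) < 1e-9

print(eigenvalues)
```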
Node 1 has 3 edges, nodes 2, 3 and 4 have 2 edges each and node 5
has only one edge. The magnitudes of the components of the eigenvector
corresponding to the largest eigenvalue of the adjacency matrix reflect
this observation.
Node 1 has 3 edges, nodes 2, 3 and 4 have 2 edges each and node 5
has only one edge. The magnitudes of the components of the eigenvector
corresponding to the largest eigenvalue of the Laplacian matrix reflect
this observation.
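The observation about node degrees can be reproduced on a small example. The sketch below is illustrative only: the slide figure is not available in this transcript, so the edge list is a hypothetical one chosen to match the stated degrees (3, 2, 2, 2, 1). It builds the adjacency, degree and Laplacian matrices and inspects the eigenvector of the largest eigenvalue of each.

```python
import numpy as np

# Hypothetical edge list (0-based indices), chosen only so that node 1 has
# 3 edges, nodes 2-4 have 2 edges each and node 5 has 1 edge.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
n = 5

# Adjacency matrix A: A[i, j] = 1 if nodes i and j share an edge.
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

degrees = A.sum(axis=1)   # number of edges at each node
D = np.diag(degrees)      # degree matrix
L = D - A                 # Laplacian matrix


def leading_eigenvector(M):
    """Return the largest eigenvalue of symmetric M and the absolute
    values of the components of its eigenvector."""
    vals, vecs = np.linalg.eigh(M)   # ascending eigenvalues
    return vals[-1], np.abs(vecs[:, -1])


lev_A, vec_A = leading_eigenvector(A)
lev_L, vec_L = leading_eigenvector(L)

print("degrees:", degrees)
print("adjacency lev:", lev_A, "components:", vec_A)
print("Laplacian lev:", lev_L, "components:", vec_L)
```

For this particular edge list, the largest component in both eigenvectors belongs to node 1, the node with the most edges.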
The largest eigenvalue (lev) depends upon the highest degree in
the graph.
For any k-regular graph G (a graph in which every vertex has
degree k), the eigenvalue with the largest absolute value is k.
A corollary to this theorem is that the lev of a clique of n vertices
is n − 1.
In a general connected graph, the lev is always less than or equal to
the largest degree in the graph.
In a graph with n vertices, the absolute value of lev decreases
as the degree of vertices decreases.
The lev of a clique with 11 vertices is 10 and that of a linear
chain with 11 vertices is 1.932.
a linear chain with 11 vertices
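The two values quoted above can be reproduced directly from the adjacency matrices. This is a minimal NumPy sketch; the helper names clique, linear_chain and largest_eigenvalue are introduced here for illustration.

```python
import numpy as np


def largest_eigenvalue(A):
    """Largest eigenvalue (lev) of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(A)[-1])   # eigvalsh sorts ascending


def clique(n):
    """Adjacency matrix of a clique: every pair of distinct vertices is joined."""
    return np.ones((n, n)) - np.eye(n)


def linear_chain(n):
    """Adjacency matrix of a linear chain: vertex i is joined to vertex i + 1."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A


lev_clique = largest_eigenvalue(clique(11))
lev_chain = largest_eigenvalue(linear_chain(11))

print(round(lev_clique, 3), round(lev_chain, 3))   # 10.0 1.932
```

Both values respect the bound stated earlier: the lev of the clique equals its common degree 10, and the lev of the chain stays below its highest degree 2.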
In graphs 5(a)-5(e), the highest degree is 6. In graphs 5(f)-5(i), the highest
degree is 5, 4, 3 and 2 respectively.
It can be noticed that the lev is generally higher if the graph contains vertices of
high degree. The lev decreases gradually from the graph with highest degree 6
to the one with highest degree 2. In the case of graphs 5(a)-5(e), where there is one
common vertex with degree 6 (highest degree) and the degrees on the other
vertices are different (less than 6 in all cases), the lev also depends on the
degree of the vertices adjoining the highest degree vertex.
We combine graph 4(a) and graph 4(b) and construct a Laplacian matrix
with edge weights (1/dij), where dij is the distance between vertices i and
j. The distance between a vertex of graph 4(a) and a vertex of graph 4(b) is
taken to be very large (say 100), so each matrix element that pairs
a vertex from graph 4(a) with a vertex from graph 4(b)
has a very small value of 0.01. The resulting Laplacian matrix of 8
vertices is diagonalized, and its eigenvalues and
corresponding vector components are given in Table 3.
The vector components corresponding to
the second smallest eigenvalue contain
the desired information about clustering:
the residues that form a cluster have
identical values. In Fig. 4, nodes 1-5 form a
cluster (cluster 1) and 6-8 form another
cluster (cluster 2).
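The clustering procedure can be sketched as follows. The distances inside graphs 4(a) and 4(b) are not given in this transcript, so the sketch assumes a distance of 1 between every pair of vertices within a cluster; only the inter-cluster distance of 100 (weight 0.01) is taken from the text. Under that assumption the eigenvector of the second smallest eigenvalue is constant within each cluster.

```python
import numpy as np

n = 8
cluster_1 = [0, 1, 2, 3, 4]   # nodes 1-5 (0-based indices)
cluster_2 = [5, 6, 7]         # nodes 6-8

# Pairwise distances: 100 between clusters (from the text) and
# 1 within a cluster (an assumption made for this sketch).
d = np.full((n, n), 100.0)
for group in (cluster_1, cluster_2):
    d[np.ix_(group, group)] = 1.0

# Edge weights 1/dij, with no self-edges.
W = 1.0 / d
np.fill_diagonal(W, 0.0)

# Weighted Laplacian: weighted degree on the diagonal minus the weights.
L = np.diag(W.sum(axis=1)) - W

vals, vecs = np.linalg.eigh(L)    # eigenvalues in ascending order
second_smallest = vals[1]
cluster_vector = vecs[:, 1]       # components carry the cluster information

# Nodes whose components share a sign (here, a value) belong together.
labels = (cluster_vector > 0).astype(int)

print("second smallest eigenvalue:", second_smallest)
print("vector components:", np.round(cluster_vector, 3))
print("cluster labels:", labels)
```

With real distances the components within a cluster would be close rather than exactly equal, so grouping by sign or by nearly equal values is the more robust reading.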
Metabolome Based Reaction Graphs of M. tuberculosis
and M. leprae: A Comparative Network Analysis
Ketki D. Verkhedkar1, Karthik Raman2, Nagasuma R. Chandra2, Saraswathi
Vishveshwara1*
1 Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India, 2
Bioinformatics Centre, Supercomputer Education and Research Centre, Indian
Institute of Science, Bangalore, India
PLoS ONE | www.plosone.org
September 2007 | Issue 9 | e881
Construction of network
[Figure: network construction diagram with nodes R1, R2, R3 and R4]
Analysis of network parameters
Analyses of sub-clusters in the giant component
To detect sub-clusters of reactions in the giant component, graph
spectral analysis was performed.
To obtain the eigenvalue spectra of the graph, the adjacency matrix of
the graph is converted to a Laplacian matrix (L), by the equation:
L=D-A
where D, the degree matrix of the graph, is a diagonal matrix in which
the ith element on the diagonal is equal to the number of connections
that the ith node makes in the graph.
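A minimal sketch of this conversion and of the resulting eigenvalue spectrum is given below. The edge list is a hypothetical toy graph, not the reaction network from the paper, and the helper name laplacian_from_edges is introduced for illustration. The toy graph has two disconnected pieces to show a standard property of the Laplacian spectrum: the number of zero eigenvalues equals the number of connected components.

```python
import numpy as np


def laplacian_from_edges(n, edges):
    """Build L = D - A for an undirected graph with n nodes."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    # The ith diagonal element of D is the number of connections of node i.
    D = np.diag(A.sum(axis=1))
    return D - A


# Hypothetical toy graph: a triangle (0, 1, 2) plus a separate edge (3, 4).
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
L = laplacian_from_edges(5, edges)

spectrum = np.linalg.eigvalsh(L)   # eigenvalue spectrum, ascending

# Count eigenvalues that are zero up to numerical tolerance.
n_components = int(np.sum(np.abs(spectrum) < 1e-9))

print("spectrum:", np.round(spectrum, 6))
print("connected components:", n_components)
```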
It is observed that reactions belonging to fatty acid biosynthesis and
the FAS-II cycle of the mycolic acid pathway in M. tuberculosis form
distinct, tightly connected sub-clusters.
Identification of hubs in the reaction networks
In biological networks, the hubs are thought to be functionally important and
phylogenetically oldest.
The largest component of the eigenvector corresponding to the highest
eigenvalue of the Laplacian matrix of the graph identifies the node with
high degree as well as low eccentricity. Two parameters, degree and
eccentricity, are thus involved in the identification of graph spectral
(GS) hubs.
Alternatively, hubs can be ranked based on their connectivity alone (degree
hubs).
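The two ways of ranking hubs can be sketched side by side. This is an illustrative sketch, not the procedure used in the paper: the function name hub_rankings and the star-shaped toy graph are assumptions made here. The GS score of a node is taken as the magnitude of its component in the eigenvector of the highest Laplacian eigenvalue.

```python
import numpy as np


def hub_rankings(A):
    """Return node indices ranked as degree hubs and as GS hubs.

    A is a symmetric 0/1 adjacency matrix. Nodes are sorted from the
    strongest hub to the weakest in both rankings.
    """
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A
    vals, vecs = np.linalg.eigh(L)      # ascending eigenvalues
    gs_score = np.abs(vecs[:, -1])      # eigenvector of the highest eigenvalue
    degree_rank = np.argsort(-degrees, kind="stable")
    gs_rank = np.argsort(-gs_score, kind="stable")
    return degree_rank, gs_rank


# Toy graph (hypothetical): a star with centre 0 and four leaves.
n = 5
A = np.zeros((n, n))
A[0, 1:] = 1.0
A[1:, 0] = 1.0

degree_rank, gs_rank = hub_rankings(A)
print("top degree hub:", degree_rank[0])
print("top GS hub:", gs_rank[0])
```

On the star graph both rankings place the centre first. On larger networks the two rankings need not agree, since the GS score also reflects how central a node is.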
It was observed that the top 50 degree hubs in the reaction networks of the
three organisms comprised reactions involving the metabolite L-glutamate
as well as reactions involving pyruvate. However, the top 50 GS hubs of M.
tuberculosis and M. leprae exclusively comprised reactions involving
L-glutamate, while the top GS hubs in E. coli only consisted of reactions
involving pyruvate.
The difference in the degree and GS hubs suggests that the most highly
connected reactions are not necessarily the most central reactions in the
metabolome of the organism.