Microarray Gene Expression Data Analysis using Boolean Networks Mentor: Dr. Shahadat H. Kowuser Team: Patrick Bailey Simon Bartholomew Loodwing Murillo.


Microarray Gene Expression Data Analysis using Boolean Networks
Mentor: Dr. Shahadat H. Kowuser
Team: Patrick Bailey, Simon Bartholomew, Loodwing Murillo
Outline
•DNA Sequencing
•What is a Microarray?
•Microarray Technology
•Boolean Networks
•BN Gene Application
DNA Sequencing
DNA sequence {A,T,C,G}
ATCGAATCGA
Protein sequence {20 amino-acid letters: A to Z except B, J, O, U, X, Z}
KMLSLLMARTYW
Central Dogma of Protein Synthesis
Genome (DNA sequence {A,T,C,G}) → Transcription → Transcriptome → Translation → Proteome (protein synthesis, e.g. KMLSLLMARTYW)
Microarray: Gene Expression Analysis
What is a Microarray?
•Device used to analyze gene function/expression patterns.
•Allows these patterns to be studied in parallel.
Example:
At each location, a known probe (a known strand of cDNA) is placed with cDNA from a certain sample, for example cDNA from cancerous and healthy cells with different probes.
Color indicates the relative abundance of a labeled cDNA.
Microarray Technology
Application of BN
Understanding experimental data
Powerful computational tools are needed to extract functional information from the experimental data.
What class of models
should be chosen?
The selection should be made in view of:
• data requirements
• goals of modeling and analysis
Boolean Algebra
0 = F, 1 = T

AND: All High = High, else Low
In A  In B  Out
0     0     0
0     1     0
1     0     0
1     1     1

OR: Any High = High, else Low
In A  In B  Out
0     0     0
0     1     1
1     0     1
1     1     1

NOR: Any High = Low, else High
In A  In B  Out
0     0     1
0     1     0
1     0     0
1     1     0
Boolean Algebra
XOR: Diff = High, Same = Low
In A  In B  Out
0     0     0
0     1     1
1     0     1
1     1     0

NAND: All High = Low, else High
In A  In B  Out
0     0     1
0     1     1
1     0     1
1     1     0

NOT: (Inversion)
In A  Out
0     1
1     0
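The six gates on these two slides can be written as small functions over 0 and 1. The following is a minimal sketch, not part of the original slides; the names `GATES`, `NOT`, and `truth_table` are chosen here for illustration.

```python
from itertools import product

# Two-input gates from the slides, on integers 0 (F, Low) and 1 (T, High).
GATES = {
    "AND":  lambda a, b: a & b,        # all High = High, else Low
    "OR":   lambda a, b: a | b,        # any High = High, else Low
    "NOR":  lambda a, b: 1 - (a | b),  # any High = Low, else High
    "XOR":  lambda a, b: a ^ b,        # different = High, same = Low
    "NAND": lambda a, b: 1 - (a & b),  # all High = Low, else High
}

def NOT(a):
    """Inversion of a single input."""
    return 1 - a

def truth_table(gate):
    """List (in_a, in_b, out) for every input combination."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]

for name, gate in GATES.items():
    print(name)
    for a, b, out in truth_table(gate):
        print(f"  {a} {b} -> {out}")
```

Each printed table should agree with the one-line rule given for the gate on the slide.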
Boolean Networks
• A Boolean network is defined by a set of nodes, V = {x1, x2, . . . , xn}, and a list of Boolean functions, F = {f1, f2, . . . , fn}.
• Each xk represents the state (expression) of a gene gk: xk = 1 means the gene is expressed, and xk = 0 means the gene is not expressed.
[Figure: diagram with nodes labeled x1 to x6 and y1 to y5]
State Transition Diagram
        t        |       t+1
x1 x2 x3 x4 x5   | x1 x2 x3 x4 x5
0  0  0  0  0    | 0  0  0  0  0
0  0  0  0  1    | 0  1  1  0  0
0  0  0  1  0    | 1  1  0  1  0
0  0  0  1  1    | 1  0  1  1  0
0  0  1  0  0    | 0  0  1  0  0
0  0  1  0  1    | 0  1  1  0  0
0  0  1  1  0    | 1  1  1  1  0
0  0  1  1  1    | 1  1  1  1  0
…
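A state transition table lists, for every state at time t, the state the network moves to at time t+1. The sketch below builds such a table for a small network by applying every Boolean function to every possible state. The three update rules are hypothetical and chosen only for illustration; they are not the rules behind the slide's table.

```python
from itertools import product

def transition_table(functions):
    """Map every state at time t to its successor at t+1 (synchronous update)."""
    n = len(functions)
    return {
        state: tuple(f(state) for f in functions)
        for state in product((0, 1), repeat=n)
    }

# Hypothetical three-gene rules, for illustration only.
functions = [
    lambda s: s[1] & s[2],  # x1(t+1) = x2(t) AND x3(t)
    lambda s: s[0] | s[2],  # x2(t+1) = x1(t) OR x3(t)
    lambda s: 1 - s[0],     # x3(t+1) = NOT x1(t)
]

table = transition_table(functions)
for state, successor in table.items():
    print(state, "->", successor)
```

With n genes the table has 2^n rows, so exhaustive enumeration of this kind is only practical for small networks.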
Gene Networks
     t1  t2  t3  t4
g1   0   1   2   1
g2   1   2   1   0
g3   0   1   1   1
g4   1   2   1   0

[Figure: gene network linking g1, g2, g3, and g4, with edges marked +, -, or ?]
Boolean Networks
[Figure: network diagram in which the states of x1, x2, x3 at time t pass through OR, NOR, and NAND gates to give x1, x2, x3 at time t+1]

t    0  1  2  3  4
x1   1  1  0  1  1
x2   1  0  0  0  0
x3   1  0  1  1  0
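The slide names three gates (OR, NOR, NAND) for x1, x2, and x3, but the transcript does not preserve which inputs feed each gate. The sketch below therefore assumes a wiring: x1 = x2 OR x3, x2 = x1 NOR x3, x3 = x1 NAND x3. This wiring was picked because it reproduces the slide's values when they are read gene by gene (x1: 1 1 0 1 1, x2: 1 0 0 0 0, x3: 1 0 1 1 0); it should be treated as an illustration rather than the authors' network.

```python
def step(state):
    """One synchronous update of the three-gene network (assumed wiring)."""
    x1, x2, x3 = state
    return (
        x2 | x3,        # x1(t+1) = x2 OR x3    (assumption)
        1 - (x1 | x3),  # x2(t+1) = x1 NOR x3   (assumption)
        1 - (x1 & x3),  # x3(t+1) = x1 NAND x3  (assumption)
    )

def simulate(initial, steps):
    """Return the list of states visited from `initial` over `steps` updates."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory

trajectory = simulate((1, 1, 1), 4)
for t, state in enumerate(trajectory):
    print(t, state)
```

Starting from (1, 1, 1), the state at t = 4 equals the state at t = 1, so under this wiring the network has entered a cycle of length three.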
Boolean Network
t    0  1  2  3  4
x1   1  1  0  1  1
x2   1  0  0  0  1
x3   1  0  1  1  1

At any given time, combining the gene states gives a Gene Activity Pattern (GAP).
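A GAP is simply the column of gene states at one time point. As a small sketch, the per-gene series below are zipped into one bit string per time point. The values are illustrative: they follow the slide's numbers under the assumption that each gene's values run across t = 0 to 4, and the function name `gene_activity_patterns` is introduced here.

```python
# Per-gene bit series for t = 0..4 (illustrative; row-per-gene reading assumed).
series = {
    "x1": [1, 1, 0, 1, 1],
    "x2": [1, 0, 0, 0, 1],
    "x3": [1, 0, 1, 1, 1],
}

def gene_activity_patterns(series):
    """Return one GAP per time point as a bit string, genes in dict order."""
    return [
        "".join(str(bit) for bit in column)
        for column in zip(*series.values())
    ]

gaps = gene_activity_patterns(series)
for t, gap in enumerate(gaps):
    print(f"t={t}: GAP={gap}")
```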
Boolean Networks
Inputs A and B control genes g1 to g5:

A/B     g1 (AND)  g2 (OR)  g3 (NAND)  g4 (XOR)  g5 (EQ)
lo/lo   OFF       OFF      ON         OFF       ON
lo/hi   OFF       ON       ON         ON        OFF
hi/lo   OFF       ON       ON         ON        OFF
hi/hi   ON        ON       OFF        OFF       ON

[Figure: promoter layouts with operator sites OA and OB implementing AND for g1, OR for g2, and NAND for g3]
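The ON/OFF table for g1 to g5 follows directly from the five logic functions applied to the levels of A and B. Here is a minimal sketch, assuming lo maps to 0 and hi maps to 1, and that g1 to g5 implement AND, OR, NAND, XOR, and EQ in that order (which matches the table's values); `GENE_LOGIC` and `response` are names introduced for illustration.

```python
LEVEL = {"lo": 0, "hi": 1}

# Assumed gene-to-gate assignment, consistent with the ON/OFF table.
GENE_LOGIC = {
    "g1": lambda a, b: a & b,        # AND
    "g2": lambda a, b: a | b,        # OR
    "g3": lambda a, b: 1 - (a & b),  # NAND
    "g4": lambda a, b: a ^ b,        # XOR
    "g5": lambda a, b: 1 - (a ^ b),  # EQ (equivalence)
}

def response(a_level, b_level):
    """Return the ON/OFF state of each gene for the given A and B levels."""
    a, b = LEVEL[a_level], LEVEL[b_level]
    return {gene: "ON" if f(a, b) else "OFF" for gene, f in GENE_LOGIC.items()}

for a_level in ("lo", "hi"):
    for b_level in ("lo", "hi"):
        print(f"{a_level}/{b_level}", response(a_level, b_level))
```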
More complex control functions, e.g., XOR ?
A/B     XOR   OR    NAND
lo/lo   OFF   OFF   ON
lo/hi   ON    ON    ON
hi/lo   ON    ON    ON
hi/hi   OFF   ON    OFF

XOR(A,B) = (A OR B) AND NOT(A AND B)

Regulated recruitment
➡ Integrates OR and NAND into a single regulatory region.

[Figure: XOR regulatory region for g4, combining operator sites OA1 and OB1 (OR) with OA2 and OB2 (NAND) around one promoter]
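The identity XOR(A,B) = (A OR B) AND NOT(A AND B) can be checked exhaustively, since there are only four input combinations. A short sketch follows; the function name `xor_from_or_nand` is introduced here for illustration.

```python
from itertools import product

def xor_from_or_nand(a, b):
    """XOR built from an OR part and a NAND part, as in the slide's formula."""
    or_part = a or b
    nand_part = not (a and b)
    return or_part and nand_part

# Compare against Python's inequality on booleans, which is XOR.
for a, b in product([False, True], repeat=2):
    assert xor_from_or_nand(a, b) == (a != b)
    print(a, b, xor_from_or_nand(a, b))
```

The output is true only where both the OR column and the NAND column of the table are ON, which is the lo/hi and hi/lo rows.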
Boolean Network modeling:
Discretization
Expression data is discretized for time series analysis:

time     0  5  10 15 20 25 30 35 40 45 50 55
gene 1   0  0  0  0  0  0  1  1  1  1  1  1
gene 2   0  0  0  0  0  0  0  1  1  0  0  0
gene 3   1  1  1  1  1  1  1  0  0  0  0  0

Gene expression data in bit stream format.
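The slides do not say how the expression values are turned into bits. One simple option, shown here as an assumption rather than the authors' method, is to threshold each gene's series at the midpoint between its minimum and maximum. The raw values below are made up for illustration and were chosen so that the result matches the gene 1 bit stream.

```python
def discretize(series, threshold=None):
    """Turn expression values into bits: 1 if above the threshold, else 0.

    If no threshold is given, the midpoint between the series minimum and
    maximum is used (an assumed rule, not taken from the slides).
    """
    if threshold is None:
        threshold = (min(series) + max(series)) / 2
    return [1 if value > threshold else 0 for value in series]

times = list(range(0, 60, 5))  # 0, 5, ..., 55
# Made-up expression values for one gene, one per time point.
gene1_raw = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 1.8, 2.1, 2.0, 1.9, 2.2, 2.0]

bits = discretize(gene1_raw)
bit_stream = "".join(str(b) for b in bits)
print(bit_stream)  # 000000111111
```

A midpoint threshold is sensitive to outliers; a median or a fixed fold-change cutoff are common alternatives, and the choice affects which Boolean functions can later be inferred.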
Boolean Network modeling:
Discretization
[Figure: on/off states of gene 1, gene 2, and gene 3 plotted against time (min) from 0 to 60, with time points t1, t2, and t3 marked]
Project Contribution
Personal and Collective Contribution:
•Analyzed Boolean algebra and logical functions.
•Developed a Boolean network model for microarray gene expression data.
•Explored microarray analysis technology to develop new methods and concepts for gene expression data analysis.
•Collectively reviewed nearly 100 publications and public resource materials on the following topics: DNA composition, microarrays, Boolean networks, clustering methods, genetic sorting algorithms.
•Developed close interpersonal communication skills through extended cooperation with coworkers and mentors.
•Developed public communication and presentation skills and etiquette through first-hand experience and observation of project progress presentations.
Project Outcome:
Prepared two papers for publication:
•Microarray Gene Expression Data Analysis using Boolean Networks
•Microarray Gene Expression Data Analysis using Clustering Networks
Summary
• cDNA is prepared from each sample and fluorescently labeled.
• Microarrays are used to analyze gene function and expression.
• Fluorescence intensity data is then read from the array.
• Boolean algebra's logical operators and functions are used to build networks that model and analyze the data.
• The analyzed data can be processed further using other models and techniques.
Acknowledgments
•Dr. Godwin Mbamalu
•Dr. Samir Raychowdhury
•Dr. Samuel Darko
•DOD/NSSA/SURI
Questions?
Thank You