Transcript Slide 1
Current Subject
Viral Identification
Using Microarray
Introduction to Bioinformatics
Dudu Burstein
Short Biology Introduction
DNA Microarrays
Viruses
The SARS Case
Round 1: Viral Identification
Using DNA Microarrays
Identification using microarray
Previous Identification Techniques
Similar gene amplification (degenerate PCR)
Antibody recognition (immunoscreening of cDNA libraries)
Drawbacks:
Limited candidates
Biased
Time consuming
The DeRisi Lab Viral Microarray
Approx. 1,000 viruses
Probes 70 nucleotides long
The 10 most conserved 70-nucleotide sequences of each virus
Amplification and hybridization
Objective: “create a microarray with the
capability of detecting the widest possible
range of both known and unknown viruses”
The SARS Epidemic
SARS – Severe acute respiratory syndrome
Flu-like symptoms
Nov. 2002: first case in Guangdong, China
15 Feb. 2003: spreads to Hong Kong
21 Feb.: 12 infections that will spread to
Hong Kong, Vietnam, Singapore, Ireland,
Germany and Canada
The SARS Epidemic
Cases in:
China, Hong Kong, Canada, Taiwan, Singapore,
Vietnam, USA, Philippines, Germany, Mongolia, Thailand, France,
Malaysia, Sweden, Italy, UK, India, Korea, Indonesia, South Africa,
Kuwait, Ireland, Romania, Russia, Spain, Switzerland.
Total 8,096 known cases
774 deaths
Mortality rate of 9.6%
April 2004 –
last reported case
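The 9.6% mortality figure follows directly from the two totals quoted above. A quick arithmetic check (variable names are illustrative, not from the slides):

```python
# Case fatality rate from the totals quoted on the slide.
known_cases = 8096
deaths = 774

case_fatality_rate = deaths / known_cases * 100
print(f"{case_fatality_rate:.1f}%")  # prints "9.6%"
```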
The SARS Identification
March 15th – WHO issues a global alert
March 22nd – samples obtained
Amplified and hybridized with the microarray
(1,000 viruses, 10 probes of 70 nucleotides each)
Results followed in less than 24 hours
SARS Identification
Family   Virus
Corona   IBV               A  A
Corona   IBV               A  A
Corona   Bovine corona     A  A
Corona   Human 229E        A  A
Astro    Turkey astro      A  A
Astro    Ovine astro       A  A
Astro    Avian nephritis   A  A
Astro    Human astro       A  A
Summary (round 1)
Microarray of conserved sequences from thousands
of viruses
Hybridization enables identification
Rapid procedure
Limited homology suffices
Sequencing based on DNA recovered from
microarray
The SARS proof
The E-Predict Algorithm
Round 2: The E-Predict
Algorithms
E-Predict Algorithm Challenges
Complex hybridization pattern, still time
consuming
Human interpretation might be biased
Separate closely related species
Unanticipated cross hybridization
Statistical significance
Signal from dozens or hundreds of species when
pure samples are impossible to obtain (metagenomics)
E-Predict Algorithm Outline
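A minimal sketch of the ranking idea, assuming each virus has a predicted per-probe profile and the observed intensities are compared to every profile. Cosine similarity, the function names and the toy data are all assumptions for illustration; the slides do not specify the metric or normalization that E-Predict uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def rank_profiles(observed, profiles):
    """Return (virus, score) pairs sorted from most to least similar."""
    scores = {name: cosine(observed, p) for name, p in profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data: 6 probes, two hypothetical viruses.
profiles = {
    "virus_X": [1.0, 0.8, 0.9, 0.0, 0.0, 0.1],
    "virus_Y": [0.0, 0.1, 0.0, 1.0, 0.9, 0.7],
}
observed = [0.9, 0.7, 1.0, 0.1, 0.0, 0.2]

ranking = rank_profiles(observed, profiles)
print(ranking[0][0])  # prints "virus_X"
```

The top-ranked profile is only the best match among the candidates; on its own the score says nothing about how likely that match is to be real.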
Significance Estimation
Similarity ranking ≠ Probability that best
profile corresponds to virus in sample
1,009 independent, diverse microarray data sets
For every virus, most of the data are false positives
Used as null (H0) Distribution
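One way to turn a similarity score into a significance estimate is to compare it against the null scores collected for the same virus profile. The sketch below uses a plain empirical fraction; this is an assumption, since the slides do not say how the null distribution was modeled, and the simulated null scores are a stand-in for real data.

```python
import random

def empirical_p_value(score, null_scores):
    """Fraction of null scores at least as large as `score`.

    The +1 terms keep the estimate above zero, a common convention
    for empirical p-values (not taken from the slides).
    """
    exceed = sum(1 for s in null_scores if s >= score)
    return (exceed + 1) / (len(null_scores) + 1)

random.seed(0)
# Stand-in null: 1,009 simulated scores for one virus profile.
null_scores = [random.betavariate(2, 8) for _ in range(1009)]

p_low = empirical_p_value(0.2, null_scores)    # a typical null-like score
p_high = empirical_p_value(0.95, null_scores)  # far above every null score
```

With 1,009 null scores the smallest p-value this estimator can report is 1/1010, which is what `p_high` equals here.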
Significance Estimation
E-Predict Results – HPV18
E-Predict Results – FluA
Serotype Discrimination
HRV – species of the Rhinovirus genus, part
of the picornavirus family
HRV can be divided into:
HRV group A
HRV group B
HRV87 (closely related to enteroviruses)
Energy profiles of HRV89 (group A) and
HRV14 (group B)
Serotype Discrimination
Summary
Results achieved very rapidly
Minimal human interpretation: no bias
Not sensitive to noise
Handles complex hybridization pattern
Valid Interfamily and intrafamily separation
Serotype separation
Possible Applications
Pathogen detection:
clinical specimens
field isolates
Monitoring food/water contamination
Characterization of microbial communities
from soil/water
Thank You