Transcript Document

English okay?
Masters studies offer tracks:
This is part of:
Lecture (VL): Microarray data analysis
Tuesday, 8:30 – 10:00
Exercises (Ü)
Thursday 10:15-11:45 (start: Oct. 23)
Next semester: lab course (Praktikum) + seminar
Thereafter, possibility of a Master's thesis.
Attendance is mandatory in lecture and exercises (sign-in list!)
Literature:
See course web page.
No.  Date     Topic                                         Lecturer
1    Oct. 21  Microarray technologies                       Martin Vingron
2    Oct. 28  Fundamentals of data analysis                 Christine Steinhoff
3    Nov. 4   Analysis of variance I                        Christine Steinhoff
4    Nov. 11  Analysis of variance II                       Christine Steinhoff
5    Nov. 18  LOWESS, variance stabilization                Anja von Heydebreck
6    Nov. 25  Statistical testing                           Anja von Heydebreck
7    Dec. 2   Clustering methods                            Anja von Heydebreck
8    Dec. 9   Classification, linear discriminant analysis  Rainer Spang
9    Dec. 16  Applications in cancer research               Rainer Spang
10   Jan. 6   Principal component analysis                  Martin Vingron
11   Jan. 13  Statistical learning theory                   Rainer Spang
12   Jan. 20  Sequence annotation                           Rainer Spang
13   Jan. 27  Bayesian networks                             Rainer Spang
14   Feb. 3   Regulation                                    Martin Vingron
15   Feb. 10  Summary, review, outlook
Genome Sequencing:
• Determination of DNA sequence
• Derivation of amino acid sequences
• Analysis, comparison, classification
Functional Genomics:
• Study of gene function
• gene expression studies
• proteomics
• metabolic networks
[Figure: DNA (gene) -> transcription -> messenger RNA (mRNA) -> translation -> protein (sequence, structure)]
A cell and its population of genes:
What is the problem?
Determine the amount of mRNA for each
gene that is present in a cell/tissue.
DNA forms double strands by a process called
hybridization:
Labeling
Hybridization
Expression Arrays
cDNA Arrays
Glass Arrays
Oligonucleotide Arrays
Membrane-based Arrays
Glass Slide Microarrays
… were first produced at Stanford University (Schena et al, 1995).
Whole cDNA:
500-1500 bp
Filter “Macro”arrays
… were first published by Lennon and Lehrach, 1991
7.5 x 2.5 cm
ca. 21 cm
Oligonucleotide Arrays
… were first published by Lockhart et al., 1996
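Reference sequence:       ... TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAGGCC
Perfect match (PM) probe: TTACCCAGTCTTCCTGAGGATACAC (ca. 25 bp)
Mismatch (MM) probe:      TTACCCAGTCTTGCTGAGGATACAC
[Figure: a probe set consists of probe pairs 1, 2, 3, 4, ..., 17, 18, 19, 20; each probe pair consists of a PM and an MM probe cell.]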
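The relation between the reference sequence and the two probes on this slide can be checked with a few lines of Python. This is only a sketch: the sequences are copied from the slide, while the helper names `complement` and `make_mismatch` are hypothetical. It assumes, as the sequences suggest, that the probe is written aligned under the reference (so no reversal is applied) and that the mismatch probe differs from the perfect match probe only by a complemented central base.

```python
# Sequences copied from the slide; helper names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

REFERENCE = "TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAGGCC"
PM_PROBE = "TTACCCAGTCTTCCTGAGGATACAC"
MM_PROBE = "TTACCCAGTCTTGCTGAGGATACAC"


def complement(seq):
    """Base-by-base Watson-Crick complement, without reversing the string."""
    return "".join(COMPLEMENT[base] for base in seq)


def make_mismatch(pm):
    """Replace the central base of a probe by its complement."""
    mid = len(pm) // 2
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


# The PM probe can hybridize where its complement occurs in the reference.
hybridizes = complement(PM_PROBE) in REFERENCE

# The MM probe is the PM probe with base 13 of 25 exchanged (C -> G).
same_as_slide = make_mismatch(PM_PROBE) == MM_PROBE

print(len(PM_PROBE), hybridizes, same_as_slide)
```

Both checks succeed for the sequences shown here, which is consistent with the MM probe serving as a control for unspecific hybridization at the same position.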
[Figure: Probe - Reference]
There are other technologies, too,
to estimate expression levels:
• EST sequencing – „electronic northern“
• SAGE: tags of mRNAs are concatenated
and sequenced
• Reliability of results depends on depth of
probing (number of ESTs, number of tags)
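Why depth matters can be illustrated with a simple counting model. The sketch below assumes that each sequenced EST or tag is an independent draw, so the count for one gene is binomial; this model and the function name `tag_fraction_and_error` are assumptions made for illustration, not part of the lecture.

```python
import math


def tag_fraction_and_error(tag_count, total_tags):
    """Estimated fraction of tags from one gene and its binomial standard error."""
    p = tag_count / total_tags
    se = math.sqrt(p * (1.0 - p) / total_tags)
    return p, se


# Hypothetical example: the same abundance (1 tag in 1000) at two depths.
for total in (1_000, 100_000):
    count = total // 1000
    p, se = tag_fraction_and_error(count, total)
    print(f"N={total:>7}  fraction={p:.4f}  relative error={se / p:.2f}")
```

With 1,000 tags the relative error of such a rare transcript is about as large as the estimate itself; with 100,000 tags it drops to roughly a tenth of it.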
Why do we want to know?
• „tissue profiling“: which genes are
expressed in a tissue
• Comparing healthy and diseased (e.g.,
tumor) tissue
• Studying dynamic processes: E.g., cell
cycle (time series)
Example:
Renal clear cell carcinoma
Comparison of kidney cancer cells to normal
tissue. Which genes are altered in their
expression?
N98-8880
T98-8880
Molecular Genome
Dr. Judith Boer
Example: Cell cycle time course
[Figure: cell cycle phases G1, S, G2, M]
Spellman et al. took several samples per time point and
hybridized the RNA to glass chips containing all yeast genes
Data processing
• Image collection
• Image analysis, intensity determination
• Within slide normalization
Hess et al., Trends in Biotech 19(11), 2001
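One very simple form of within-slide normalization for two-color data is to shift all log ratios so that their median is zero. The sketch below shows only this global median centering; it is an assumption chosen for illustration (the function name and the intensities are hypothetical), and intensity-dependent methods such as LOWESS are a separate topic.

```python
import math
from statistics import median


def median_center_log_ratios(green, red):
    """Return log2(green/red) per spot, shifted so that the slide median is 0."""
    log_ratios = [math.log2(g / r) for g, r in zip(green, red)]
    shift = median(log_ratios)
    return [m - shift for m in log_ratios]


# Hypothetical background-corrected intensities for five spots; the green
# channel is systematically about twice as bright as the red channel.
green = [2000.0, 4100.0, 800.0, 16000.0, 1000.0]
red = [1000.0, 2000.0, 400.0, 2000.0, 1000.0]

normalized = median_center_log_ratios(green, red)
print([round(m, 2) for m in normalized])
```

After centering, most spots have a log ratio near zero, and only the spots that deviate from the bulk of the slide stand out.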
OUTPUT: scanner + scanner software
Clone Index  Finger #  Row  Column  S#1 Mean  S#1 S.Dev  S#1 Area  S#1 BkMean  S#1 BkS.Dev
1            1         1    1       1964.028  682.7736   113       208.7262    152.5326
2            3         1    1       2149.386  769.6178   91        206.2414    162.2653
3            5         1    1       906.1724  420.9323   74        186.1003    148.7088
4            7         1    1       3588.557  1168.349   89        174.1211    140.9021
5            9         1    1       60317.82  11562      153       195.0722    157.5111
6            11        1    1       54301.75  20957.93   135       206.8428    178.7954
7            1         1    2       771.2751  409.6172   73        236.801     173.5277
8            3         1    2       662.4827  309.9964   73        203.6449    156.0123
9            5         1    2       1245.646  923.4761   52        190.193     150.2779
10           7         1    2       488.5027  297.9345   31        196.4706    163.1341
11           9         1    2       5783.04   1924.275   125       184.9516    152.659
12           11        1    2       1961.644  1296.955   76        191.4465    159.6247
13           2         1    1       2838.966  964.7534   82        230.2468    158.729
14           4         1    1       55542.37  20307.24   131       212.3941    148.894
15           6         1    1       3338.375  2077.73    65        173.0246    131.4185
16           8         1    1       2955.312  984.0138   117       196.8208    164.9264
17           10        1    1       61398.14  11946.8    152       172.9454    151.5878
18           12        1    1       58695.57  15767.88   147       135.3904    139.0316
19           2         1    2       5746.229  2064.34    83        222.6505    156.1099
20           4         1    2       2045.466  682.7502   102       229.2686    168.0402
21           6         1    2       1097.905  435.5942   67        167.7662    130.3175
22           8         1    2       858.4306  387.9267   101       200.3975    166.7363
23           10        1    2       6882.719  3266.915   129       170.8719    152.7547
24           12        1    2       2503.947  770.0308   128       136.53      139.233
25           1         1    3       740.4891  400.7088   71        226.4033    163.7236
26           3         1    3       8222.649  3462.559   118       185.4035    134.3752
27           5         1    3       2244.806  734.1207   92        195.8065    153.7403
28           7         1    3       1655.791  711.0076   81        186.3808    150.0175
...
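A first processing step on such scanner output is to subtract the local background from the spot intensity. The sketch below takes S#1 Mean, S#1 BkMean and S#1 BkS.Dev for clone indices 1 to 4 from the scanner output above. The quality flag (signal at least two background standard deviations above background) is a hypothetical rule added for illustration, as are the function names.

```python
# Clone index -> (S#1 Mean, S#1 BkMean, S#1 BkS.Dev), first four spots.
spots = {
    1: (1964.028, 208.7262, 152.5326),
    2: (2149.386, 206.2414, 162.2653),
    3: (906.1724, 186.1003, 148.7088),
    4: (3588.557, 174.1211, 140.9021),
}


def background_corrected(mean, bk_mean):
    """Spot intensity minus local background intensity."""
    return mean - bk_mean


def is_above_background(mean, bk_mean, bk_sdev, k=2.0):
    """Hypothetical flag: corrected signal exceeds k background SDs."""
    return background_corrected(mean, bk_mean) > k * bk_sdev


for index, (mean, bk_mean, bk_sdev) in spots.items():
    signal = background_corrected(mean, bk_mean)
    flag = is_above_background(mean, bk_mean, bk_sdev)
    print(index, round(signal, 4), flag)
```

The threshold `k` is a free parameter of this sketch; a stricter or looser cut-off changes which weak spots are kept.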
Different technologies
• Support: membrane or glass slide
• Spotted material: PCR product or oligo
(short/long)
• Labeling:
– 1-channel (radioactive, Affy): absolute values
– 2-channel (2-color fluorescent labeling): relative values
Quality issues
[Figure: subpopulations: PCR (43 a73-u02400vene.txt). x-axis: log(fg.green/fg.red); y-axis: F^, from 0.0 to 1.0; legend: Kidney1 to Kidney6, Onco1 to Onco5.]
Remedies: improve PCR protocols;
model “random effect” through plate-wise calibration
[Figure: subpopulations: pin. 41 (a42-u07639vene.txt) by spotting pin. x-axis: log(fg.green/fg.red); y-axis: F^, from 0.0 to 1.0; legend: 1:1 to 4:4.]
Remedies: handling of pins; pin-wise calibration
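Pin-wise (or plate-wise) calibration can be sketched as estimating one offset per group and removing it from the log ratios of that group. The version below uses the group median as the offset, which is an assumption; the lecture does not specify the estimator, and the function name and data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def groupwise_calibrate(log_ratios, groups):
    """Subtract from each log ratio the median of its group (pin or plate)."""
    by_group = defaultdict(list)
    for value, group in zip(log_ratios, groups):
        by_group[group].append(value)
    offsets = {group: median(values) for group, values in by_group.items()}
    return [value - offsets[group] for value, group in zip(log_ratios, groups)]


# Hypothetical log(fg.green/fg.red) values for spots printed by two pins;
# pin "1:2" shows a systematic offset of about -0.5.
log_ratios = [0.1, 0.0, -0.1, -0.4, -0.5, -0.6]
pins = ["1:1", "1:1", "1:1", "1:2", "1:2", "1:2"]

calibrated = groupwise_calibrate(log_ratios, pins)
print([round(v, 2) for v in calibrated])
```

After calibration the two pins have the same median, so remaining differences between spots are no longer dominated by which pin printed them.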
Distribution of intensities: log-normal?
[Figure: histogram and QQ plot of intensities and of log intensities]
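The question of log-normality can be examined numerically by comparing sorted values with normal quantiles, which is what a QQ plot shows graphically. The sketch below summarizes the QQ plot by a correlation coefficient, once on the raw scale and once on the log scale. The intensities are simulated, not data from the lecture, and the function name `qq_correlation` is hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def qq_correlation(values):
    """Pearson correlation between sorted values and standard normal quantiles."""
    n = len(values)
    ordered = sorted(values)
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, my = mean(quantiles), mean(ordered)
    cov = sum((x - mx) * (y - my) for x, y in zip(quantiles, ordered))
    var_x = sum((x - mx) ** 2 for x in quantiles)
    var_y = sum((y - my) ** 2 for y in ordered)
    return cov / math.sqrt(var_x * var_y)


# Simulated log-normal intensities (an assumption for illustration only).
rng = random.Random(0)
intensities = [rng.lognormvariate(7.0, 1.0) for _ in range(500)]

raw_corr = qq_correlation(intensities)
log_corr = qq_correlation([math.log(v) for v in intensities])
print(round(raw_corr, 3), round(log_corr, 3))
```

A correlation close to 1 on the log scale, and a clearly lower one on the raw scale, is what one would expect if the intensities are approximately log-normal.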
Chip design
• Type of chip:
– Global „whole genome“ (yeast, drosophila,
mouse, man)
– Domain specific, e.g. cancer, infection
• Spots:
– PCR products: e.g., 3´ UTR (avoid cross-hybridization)
– Oligos: uniqueness, stability
Databases
• Stanford
• TIGR
• Gene expression atlas
• GEO
• ArrayExpress
• MIAME standard: Minimum Information About a
Microarray Experiment
Software
• R + Bioconductor
• J-Express
• GeneSpring
• Rosetta Resolver
Affymetrix technology
• Per gene, spot 20 perfectly matching oligos
and 20 oligos with 1 mismatch
• Intensity: weighted average of pixel
intensities in perfect and mismatch oligos
(More on this next week)
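As a rough illustration of how the 20 probe pairs of one gene can be combined into a single value, the sketch below computes the mean of PM minus MM over the probe pairs. Equal weights are an assumption: the weighting mentioned above is not specified here, and the function name and intensities are hypothetical.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of PM - MM over the probe pairs of one probe set (equal weights)."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM need one value per probe pair")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for four probe pairs of one probe set.
pm = [1200.0, 900.0, 1500.0, 1100.0]
mm = [300.0, 400.0, 500.0, 200.0]

print(average_difference(pm, mm))
```

Subtracting MM from PM is meant to remove the part of the signal caused by unspecific hybridization before averaging over the probe pairs.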