Computational methods for genomics-guided immunotherapy Sahar Al Seesi and Ion Măndoiu Computer Science & Engineering Department University of Connecticut.

Class I endogenous antigen presentation
Somatic rearrangement of T-cell receptor genes
Potential TCR repertoire diversity: 10^15
T-cell selection in thymus
Estimated TCR repertoire diversity after selection: ~2x10^7
T-cell activation and proliferation
The immune system and cancer
Cutting the brakes: PD1 and CTLA-4 blockade
Stepping on the gas: vaccination with neoepitopes
Combined approach
Ton N. Schumacher and Robert D. Schreiber, Science 2015;348:69-74
Pipeline stages: Sequencing -> QC and Mapping -> Calling SNVs -> Epitope Prediction -> Clonality Analysis -> Vaccine Design -> TCR Sequencing
Sequencing: Illumina
• Tumor DNA and Normal DNA: Whole Genome or Nextera Rapid Capture Exome library prep
• Tumor RNA: Whole Transcriptome library prep
• Illumina HiSeq sequencing
Sequencing: Ion Torrent
• Tumor DNA and Normal DNA: Whole Genome or Exome library prep (AmpliSeq), Ion Proton sequencing
• Tumor RNA: Whole Transcriptome library prep, Ion PGM sequencing
ION-Torrent Proton Runs: read statistics

Sample group      Sample      # of reads    # of bases       Mean read length   Std of read lengths
Melanoma Patient  PB          25,031,340    3,272,787,408    130.74             66.88
Melanoma Patient  1T          19,252,932    2,589,624,915    134.5              68.67
Melanoma Patient  2T          28,400,728    4,147,914,801    146.04             66.91
Melanoma Patient  3T          26,039,006    3,800,446,471    145.95             67.02
Synthetic Tumor   Normal      20,726,352    3,353,732,704    161.81             63.35
Synthetic Tumor   TumorAF10   20,726,352    3,360,840,809    162.15             63.14
Synthetic Tumor   TumorAF20   20,726,352    3,367,827,877    162.49             62.92
[Diagram: mapping inputs. Tumor exome reads and normal exome reads are each mapped to the human reference; tumor RNA-Seq reads are the third input.]
• fastq QC Tools
Tools to analyze and preprocess fastq files
– FASTX (http://hannonlab.cshl.edu/fastx_toolkit/)
• Charts quality statistics
• Filters sequences based on quality
• Trims sequences based on quality
• Collapses identical sequences into a single sequence
– PRINSEQ (http://prinseq.sourceforge.net/)
• Generates read length and quality statistics
• Filters reads based on length, quality, GC content and other criteria
• Trims reads based on length/position or quality scores
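The quality-based trimming, filtering and collapsing steps listed for these tools can be illustrated with a minimal Python sketch. It is not the FASTX or PRINSEQ implementation; the function names, the Phred+33 offset and the thresholds are assumptions chosen for illustration.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def trim_and_filter(records, min_quality=20, min_length=30):
    """Trim low-quality 3' ends, then drop reads that end up too short.

    records: iterable of (name, sequence, quality_string) tuples.
    The thresholds are illustrative defaults, not values from the slides.
    """
    kept = []
    for name, seq, qual in records:
        scores = phred_scores(qual)
        end = len(scores)
        # Walk back from the 3' end while the base quality is below threshold.
        while end > 0 and scores[end - 1] < min_quality:
            end -= 1
        if end >= min_length:
            kept.append((name, seq[:end], qual[:end]))
    return kept


def collapse_identical(records):
    """Count identical sequences, similar in spirit to a 'collapse' step."""
    counts = {}
    for _, seq, _ in records:
        counts[seq] = counts.get(seq, 0) + 1
    return counts
```

In practice the dedicated tools would be run on the fastq files directly; the sketch only shows what "trim, then filter by length" means for a single read.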
• Mapping decisions
– What is the best mapper for your data?
– End-to-end unspliced alignments vs. spliced or local alignments
– Unique vs. non-unique alignments
ION-Torrent Proton read mapping comparison
For each mapper: % of aligned bases, then mean length of aligned reads

                              Bowtie2            TMAP               Segemehl
Sample group      Sample      % bases   mean     % bases   mean     % bases   mean
Melanoma Patient  PB          89%       138.8    100%      130.7    99%       135.4
Melanoma Patient  1T          90%       143.3    100%      134.5    99%       140.4
Melanoma Patient  2T          90%       153.3    100%      146      99%       150.3
Melanoma Patient  3T          91%       153.4    100%      146      99%       150.4
Synthetic Tumor   Normal      89%       172.89   100%      161.81   98%       167.66
Synthetic Tumor   Tumor_AF10  89%       173.03   100%      162.15   99%       167.87
Synthetic Tumor   Tumor_AF20  89%       173.17   100%      162.49   99%       168.08
[Diagram: tumor exome reads and normal exome reads aligned to the human reference; asterisks (*) mark positions on the aligned reads.]
Somatic Variant Callers
• Mutect (Broad Inst.)
• VarScan2 (Wash. U.)
• SomaticSniper (Wash. U.)
• Strelka (Illumina)
• SNVQ w/ subtraction (UConn)
Coverage distribution of exome vs. SNV calls
[Figure: x-axis read depth (0 to 200); y-axis % of variants called (0% to 40%); series: CCDS HCC1954BL, CCDS HCC1954, Mutect, SNVQ, Sniper, Strelka, Varscan2.]
Comparing Somatic Variant Callers
• Synthetic Tumors
– Ion Torrent Proton exome sequencing of two 1000 Genomes individuals (mutations known)
– Downloaded from the public Torrent server
– Both exomes were sequenced on the same Proton chip
– A subset of the NA19240 sample was used as the normal sample
– Mixtures of NA19240 and NA12878 samples were used as the tumor samples
– Reads were mixed in different proportions to simulate allelic fractions of 0.1, 0.2, 0.3, 0.4 and 0.5
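The read-mixing step can be sketched in Python as below. This is an illustration under stated assumptions, not the procedure actually used for the talk: the function name `mix_reads`, sampling without replacement, and equating the donor read fraction with the target fraction are all hypothetical choices. The slides do not say how read proportions were translated into allelic fractions.

```python
import random


def mix_reads(normal_reads, donor_reads, donor_fraction, total, seed=0):
    """Draw `total` reads, taking round(total * donor_fraction) from the donor sample.

    normal_reads / donor_reads: lists of reads (any objects, e.g. read names).
    donor_fraction: assumed share of reads taken from the second individual.
    """
    if not 0.0 <= donor_fraction <= 1.0:
        raise ValueError("donor_fraction must be in [0, 1]")
    n_donor = round(total * donor_fraction)
    n_normal = total - n_donor
    if n_donor > len(donor_reads) or n_normal > len(normal_reads):
        raise ValueError("not enough reads to sample without replacement")
    rng = random.Random(seed)  # fixed seed keeps the mixture reproducible
    mixture = rng.sample(donor_reads, n_donor) + rng.sample(normal_reads, n_normal)
    rng.shuffle(mixture)
    return mixture
```

Keeping `total` fixed across mixtures gives tumor samples of equal size, which makes callers comparable across fractions.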
Comparing Somatic Variant Callers: TP, FP and FN counts per caller and mapper on the synthetic mixtures (mixture_AF_10 to mixture_AF_50)
The slide header lists Bowtie2, TMAP and Segemehl under every caller, but only two Strelka columns carry values in this transcript, so the mapper of each Strelka column is not recoverable.

TP
Caller    Mapper      AF_10    AF_20    AF_30    AF_40    AF_50
SNVQ      Bowtie2       464    2,506    4,608    6,138    7,018
SNVQ      TMAP          476    2,605    4,842    6,404    7,309
SNVQ      Segemehl      470    2,668    4,910    6,465    7,359
Strelka   (col. 1)    1,573    4,220    5,776    6,660    7,232
Strelka   (col. 2)    1,738    4,475    6,089    7,005    7,592
Mutect    Bowtie2       402    1,094    1,526    1,797    1,937
Mutect    TMAP          440    1,183    1,637    1,907    2,037
Mutect    Segemehl      360      994    1,379    1,614    1,732
Sniper    Bowtie2        22       35       61       89      163
Sniper    TMAP           10       15       31       48      104
Sniper    Segemehl       17       20       29       44       41
Varscan2  Bowtie2        92      797    1,922    3,198    4,284
Varscan2  TMAP           91      836    1,994    3,298    4,466
Varscan2  Segemehl        7      837    1,994    3,312    4,489

FP
Caller    Mapper      AF_10    AF_20    AF_30    AF_40    AF_50
SNVQ      Bowtie2     1,232    1,224    1,219    1,259    1,322
SNVQ      TMAP          341      341      364      378      394
SNVQ      Segemehl      295      286      307      333      357
Strelka   (col. 1)      331      302      317      362      388
Strelka   (col. 2)      218      191      199      241      286
Mutect    Bowtie2        85       55       43       40       41
Mutect    TMAP           80       54       39       40       49
Mutect    Segemehl       80       54       46       46       49
Sniper    Bowtie2     3,050    3,178    3,235    3,338    3,375
Sniper    TMAP        1,469    1,548    1,618    1,661    1,763
Sniper    Segemehl    1,344    1,422    1,471    1,528      558
Varscan2  Bowtie2     1,971    1,978    1,952    2,018    2,162
Varscan2  TMAP        1,056    1,015    1,013    1,071    1,161
Varscan2  Segemehl       20      447      476      505      590

FN
Caller    Mapper      AF_10    AF_20    AF_30    AF_40    AF_50
SNVQ      Bowtie2    11,823    9,781    7,679    6,149    5,269
SNVQ      TMAP       11,811    9,682    7,445    5,883    4,978
SNVQ      Segemehl   11,817    9,619    7,377    5,822    4,928
Strelka   (col. 1)   10,714    8,067    6,511    5,627    5,055
Strelka   (col. 2)   10,549    7,812    6,198    5,282    4,695
Mutect    Bowtie2    11,885   11,193   10,761   10,490   10,350
Mutect    TMAP       11,847   11,104   10,650   10,380   10,250
Mutect    Segemehl   11,927   11,293   10,908   10,673   10,555
Sniper    Bowtie2    12,265   12,252   12,226   12,198   12,124
Sniper    TMAP       12,277   12,272   12,256   12,239   12,183
Sniper    Segemehl   12,270   12,267   12,258   12,243   12,246
Varscan2  Bowtie2    12,195   11,490   10,365    9,089    8,003
Varscan2  TMAP       12,196   11,451   10,293    8,989    7,821
Varscan2  Segemehl   12,280   11,450   10,293    8,975    7,798
Comparing Somatic Variant Callers: PPV vs. sensitivity
[Figure: y-axis PPV (0 to 120); x-axis Sensitivity (0 to 70); series: Strelka, SNVQ, Mutect, Sniper, Varscan2.]
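Sensitivity and PPV follow directly from TP, FP and FN counts. The sketch below uses the standard definitions; the function names and the example counts in the usage note are illustrative and not taken from the slides.

```python
def sensitivity(tp, fn):
    """Fraction of known mutations that were called: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def ppv(tp, fp):
    """Fraction of calls that are true mutations: TP / (TP + FP)."""
    total = tp + fp
    return tp / total if total else 0.0
```

For example, a caller with 80 true positives, 20 false positives and 120 false negatives (made-up numbers) has sensitivity 80/200 and PPV 80/100.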
The ICGC-TCGA DREAM Somatic Mutation Calling Challenge
• Initial goal: find the best WGS analysis methods
• Challenge 1 data: 10 real tumor/normal pairs
– 5 from pancreatic tumors and 5 from prostate tumors
– Sequenced to ~50x/30x
• Up to 10K candidates will be validated
– Re-sequencing to ~300x coverage using AmpliSeq primers on an Ion Torrent
• Criteria for selecting candidate epitopes
1) Gene harboring the SNV must be expressed (FPKM estimation)
• IsoEM (Nicolae et al., Algorithms for Molecular Biology, 2011)
http://dna.engr.uconn.edu/?page_id=105
• RSEM (Li and Dewey, BMC Bioinformatics, 2011)
http://deweylab.biostat.wisc.edu/rsem/
[Diagram: an SNV in a gene that is not expressed is rejected (marked X).]
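The expression criterion can be illustrated with the standard FPKM normalization. This sketch is not how IsoEM or RSEM work internally (both estimate fragment assignments with expectation-maximization before normalizing); the function names and the threshold of 1.0 are assumptions for illustration only.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_expressed(fpkm_value, threshold=1.0):
    """Hypothetical cut-off: keep SNVs only in genes at or above the threshold."""
    return fpkm_value >= threshold
```

A gene of 2,000 bp with 500 fragments in a library of 25 million mapped fragments would get an FPKM of 10 under this formula.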
2) Peptide will be generated inside the cell when the protein is cleaved by the proteasome
3) Peptide will bind to an MHC molecule that will chaperone it to the cell surface
• NetChop
Predicts cleavage sites of the human proteasome
http://www.cbs.dtu.dk/services/NetChop/
• SYFPEITHI
Predicts MHC I, MHC II binding
http://www.syfpeithi.de/
• NetMHC
Predicts MHC I binding
http://www.cbs.dtu.dk/services/NetMHC/
• NetCTL
Combined cleavage and MHC binding predictions
http://www.cbs.dtu.dk/services/NetCTL/
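Before any of these predictors can be run, candidate peptides spanning the mutated residue have to be enumerated. The sketch below shows one plausible way to do that; the function name, the 0-based position convention and the default length of 9 are assumptions, and the resulting wild-type/mutant pairs would still need to be scored by an external tool.

```python
def mutant_peptides(protein, position, mutant_residue, length=9):
    """Return (wild_type, mutant) peptide pairs for every window covering the mutation.

    protein: wild-type amino acid sequence.
    position: 0-based index of the substituted residue (assumed convention).
    """
    if not 0 <= position < len(protein):
        raise ValueError("position outside the protein")
    mutated = protein[:position] + mutant_residue + protein[position + 1:]
    first = max(0, position - length + 1)         # leftmost window containing the site
    last = min(position, len(protein) - length)   # rightmost window that still fits
    pairs = []
    for start in range(first, last + 1):
        pairs.append((protein[start:start + length], mutated[start:start + length]))
    return pairs
```

Keeping the wild-type counterpart next to each mutant peptide makes later comparisons between the two straightforward.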
Candidate neo-epitope statistics for two mouse cell lines (Duan et al., JEM 2014)
Epi-Seq pipeline for neo-epitope prediction on local Galaxy server
• Rational vaccine design requires info on the clonal structure of the tumor
– Not all cells harbor all candidate epitopes
• Approaches to clonality analysis
1) Computational inference from sequencing depth
• SNV allelic fractions only
2) Targeted amplicon sequencing of selected mutations at single cell level
• Noisier data, potentially biased by capture protocols
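As a rough illustration of the first approach, an SNV's allelic fraction can be converted into an estimated fraction of tumor cells carrying it. The sketch below assumes a heterozygous mutation in a diploid region with no copy-number change; that simplification, the function names and the `purity` parameter are not from the slides.

```python
def allelic_fraction(alt_reads, total_reads):
    """Fraction of reads at a site that support the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def cell_fraction(vaf, purity=1.0):
    """Estimated fraction of tumor cells carrying the SNV.

    Assumes one mutated copy out of two in every carrying cell, so a clonal
    mutation in a pure sample is expected at an allelic fraction of 0.5.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return min(1.0, 2.0 * vaf / purity)
```

Under these assumptions, an SNV seen in 25 of 100 reads would be attributed to about half of the tumor cells.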
Cell capture & pre-amp
PCR on Access Array
Captured cells in pilot run
Entries read <number of captured cells>_<capture site>; well E1 holds the bulk sample.

     1      2      3      4      5      6      7      8      9      10     11     12
A    1_C03  1_C02  1_C01  1_C49  1_C50  1_C51  1_C06  3_C05  1_C04  1_C52  1_C53  1_C54
B    2_C09  1_C08  1_C07  1_C55  1_C56  1_C57  1_C12  1_C11  1_C10  1_C58  1_C59  1_C60
C    1_C15  2_C14  2_C13  1_C61  1_C62  1_C63  1_C18  2_C17  1_C16  1_C64  1_C65  1_C66
D    1_C21  2_C20  1_C19  1_C67  1_C68  1_C69  2_C24  2_C23  4_C22  1_C70  1_C71  1_C72
E    bulk   2_C26  1_C27  1_C75  1_C74  1_C73  0_C28  0_C29  0_C30  1_C78  1_C77  2_C76
F    1_C31  0_C32  1_C33  1_C81  1_C80  1_C79  0_C34  1_C35  0_C36  1_C84  0_C83  1_C82
G    0_C37  1_C38  0_C39  1_C87  1_C86  1_C85  1_C40  1_C41  1_C42  1_C90  1_C89  1_C88
H    1_C43  1_C44  1_C45  1_C93  1_C92  1_C91  1_C46  1_C47  1_C48  1_C96  1_C95  1_C94
Analysis Pipeline
1) Fastx Barcode Splitter: pooled fastq file + barcode list -> 96 fastq files, one per well
2) Generate Reference: mm9 BALBc genome + list of SNV locations -> fasta with +/-300 bases around each SNV
3) tmap: reads of each well mapped to the generated reference -> 96 sam files, one per well
4) Compute coverage: 96x48 matrix with total and fwd/rev variant coverage for each well/SNV
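The final coverage step can be sketched as a tally of reference and alternate allele observations per well, SNV and strand. This is an illustrative sketch only: the input format (one tuple per read covering an SNV), the function names and the per-strand threshold are assumptions, and parsing the sam files is left out.

```python
from collections import Counter


def tally_support(observations):
    """Build a {(well, snv): Counter} matrix of strand-specific allele counts.

    observations: iterable of (well, snv, allele, strand) tuples, where allele
    is "ref" or "alt" and strand is "fwd" or "rev" (hypothetical input format).
    """
    matrix = {}
    for well, snv, allele, strand in observations:
        cell = matrix.setdefault((well, snv), Counter())
        cell[allele + "_" + strand] += 1
    return matrix


def has_variant(cell, min_alt_per_strand=2):
    """Call the SNV present only if both strands support the alternate allele."""
    return (cell["alt_fwd"] >= min_alt_per_strand
            and cell["alt_rev"] >= min_alt_per_strand)
```

Requiring support on both strands is one common guard against strand-specific artifacts; the threshold would have to be tuned to the much higher depths of amplicon data.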
Target Aligned Reads
[Figure: stacked bars of Forward, Reverse and Unaligned read counts (y-axis 0 to 180,000) for each well A1 to H12 plus unmatched reads. Well labels carry the captured cell count as a suffix where it differs from one, e.g. A8_3, B1_2, D9_4, E1_bulk, E7_0.]
Per SNV Coverage
[Figure: stacked bars of ref_fwd, ref_rev, alt_fwd and alt_rev read counts (y-axis 0 to 250,000) for each SNV location (chr:position).]
SNV Support Matrix
[Table: for each cell/well (columns A1 to H12) and each SNV (rows), the number of reads supporting the alternate allele in forward and in reverse orientation ("Alt in forward", "Alt in reverse"), shaded from Low to High; "-" marks entries without a value. Individual cell values are not reproduced here.]
SNV rows: chr10:20059582, chr10:57242972, chr11:101613424, chr11:35513793, chr11:6296832, chr11:78096592, chr1:191675045, chr12:111888098, chr12:114383174, chr12:17287415, chr12:73884772, chr13:42278980, chr13:55554430, chr15:25925906, chr1:57463849, chr15:81535979, chr15:85222168, chr16:17018488, chr17:27717774, chr17:46814692, chr17:53645675, chr1:78467742, chr19:24175196, chr19:8946517, chr2:104271486, chr2:165875637, chr2:166776842, chr2:91612587, chr3:146082306, chr4:115980807, chr4:62083556, chr5:28098614, chr6:17257915, chr6:29349553, chr6:71927367, chr7:133941041, chr7:30151735, chr7:31097709, chr8:26879828, chr9:110453160
• Which epitopes go into the vaccine?
1) Diversify across HLA alleles
2) Select mutations with balanced allele-specific expression
3) Maximize (expected) clone coverage
4) Include epitopes with high MHC binding affinity
5) Include epitopes with high Differential Agretopic Index (DAI): difference in MHC affinity between mutant epitope and its wild-type counterpart
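Two of these criteria lend themselves to a short sketch: the DAI as a score difference, and clone coverage as a greedy selection. Everything here is illustrative. The scores are assumed to be on a scale where higher means stronger predicted binding, and the greedy rule, function names and data layout are assumptions rather than the selection procedure used in the talk.

```python
def differential_agretopic_index(mutant_score, wild_type_score):
    """DAI as the mutant score minus the wild-type score (higher = stronger binding assumed)."""
    return mutant_score - wild_type_score


def greedy_clone_coverage(epitope_clones, clone_fractions, max_epitopes):
    """Greedily pick epitopes that add the most not-yet-covered tumor cell mass.

    epitope_clones: dict mapping epitope name -> set of clone ids carrying it.
    clone_fractions: dict mapping clone id -> fraction of tumor cells in that clone.
    Returns (chosen epitope names in pick order, total covered fraction).
    """
    chosen, covered = [], set()
    candidates = dict(epitope_clones)

    def gain(name):
        # Cell fraction newly covered if this epitope is added.
        return sum(clone_fractions[c] for c in candidates[name] - covered)

    while candidates and len(chosen) < max_epitopes:
        best = max(sorted(candidates), key=gain)  # sorted() makes ties deterministic
        if gain(best) <= 0:
            break
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, sum(clone_fractions[c] for c in covered)
```

A fuller selection would combine coverage with the other criteria (HLA diversity, binding affinity, DAI), for instance by weighting each epitope's gain; the sketch keeps them separate for clarity.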
[Figure from Duan et al., JEM 2014]
• Expitope (Haase et al., Bioinformatics 2014)
– Checks for cross-reactivity based on ENCODE RNA-Seq from normal tissues
– http://webclu.bio.wzw.tum.de/expitope/
• OptiTope (Toussaint and Kohlbacher, Nucleic Acids Research 2009)
– Optimizes allele coverage
– http://etk.informatik.uni-tuebingen.de/optitope
[Figure from Toussaint and Kohlbacher, Nucleic Acids Research 2009]
• T Cell Receptor sequencing
– Compare the TCR repertoire before and after immunization to determine the response against the epitope(s) used
• Primary analysis of TCR sequencing data
– IMSEQ (Kuchenbecker et al., Bioinformatics 2015)
http://www.imtools.org/
– HTJoinSolver (Russ et al., BMC Bioinformatics 2015)
https://dcb.cit.nih.gov/HTJoinSolver/
• tcR (Nazarov et al., BMC Bioinformatics, 2015)
– R package for downstream analysis, including diversity measures and identification of shared T cell receptor sequences
http://imminfo.github.io/tcr/
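The kind of downstream analysis mentioned here (diversity measures, shared receptor sequences) can be illustrated with a small Python sketch. It does not use or mirror the tcR package's R API; the function names and the use of CDR3 strings as clonotype identifiers are assumptions for illustration.

```python
import math
from collections import Counter


def clonotype_frequencies(cdr3_sequences):
    """Relative frequency of each clonotype in a list of CDR3 sequences."""
    counts = Counter(cdr3_sequences)
    total = sum(counts.values())
    return {clonotype: n / total for clonotype, n in counts.items()}


def shannon_diversity(frequencies):
    """Shannon entropy (natural log) of a clonotype frequency table."""
    return -sum(p * math.log(p) for p in frequencies.values() if p > 0)


def shared_clonotypes(repertoire_a, repertoire_b):
    """Clonotypes present in both repertoires (e.g. before and after immunization)."""
    return set(repertoire_a) & set(repertoire_b)
```

Comparing the frequency of each shared clonotype before and after immunization is one simple way to look for expansion in response to the vaccine epitopes.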
THANK YOU!
• QC
– FASTX: http://hannonlab.cshl.edu/fastx_toolkit/
– PRINSEQ: http://prinseq.sourceforge.net/
• Epitope prediction
– NetChop: http://www.cbs.dtu.dk/services/NetChop/
– SYFPEITHI: http://www.syfpeithi.de/
– NetMHC: http://www.cbs.dtu.dk/services/NetMHC/
– NetCTL: http://www.cbs.dtu.dk/services/NetCTL/
• Tumor-specific epitope prediction pipeline
– Epi-Seq: http://dna.engr.uconn.edu/?page_id=470
Also available on our Galaxy server: http://mhc1.engr.uconn.edu:8080/
• Vaccine design
– Expitope: http://webclu.bio.wzw.tum.de/expitope/
– OptiTope: http://etk.informatik.uni-tuebingen.de/optitope
• TCR sequencing analysis
– tcR: http://imminfo.github.io/tcr/