Oncotator: Transcript selection and its importance for variant annotation

Overview
• Background
  – Transcript selection has a large effect on annotation results.
    • Oncotator has two selection modes: CANONICAL and EFFECT
  – Existing selection modes often fail to capture expected variant annotations
    • Crucial for clinical applications!
• Approach
  – Construct a list of transcripts that should be used for annotation
  – Override the selection mode when the list is provided on the command line
• Results
  – New list of optimal transcripts for each gene
    • Composed of transcripts with a 100% sequence identity match to the UniProt record
      – Plus some additional tweaks
  – We now “correctly” annotate...
    • ...all clinically actionable variants described in MyCancerGenome
    • ...all genes captured by MGH’s SNAPSHOT assay

Oncotator Transcript Selection modes
Criteria are applied in priority order; each later criterion breaks ties left by the earlier ones.

Priority  CANONICAL (default)           EFFECT
1         GENCODE level of curation     variant classification score
2         variant classification score  GENCODE level of curation
3         APPRIS                        APPRIS
4         transcript sequence length    transcript sequence length
5         alphabetical                  alphabetical

Gene   Variant           Canonical Classification  Canonical Annotation  Effect Classification  Effect Annotation
CRLF2  chrX:1314966A>C   5'UTR                     -                     Missense_Mutation      p.F232C
EGFR   chr7:55259515T>G  Missense_Mutation         p.L813R               Missense_Mutation      p.L813R

• Manual transcript selection is necessary to get the EGFR p.L858R annotation

Approach
1. Compile a list of well-known variants and their expected annotations
   – mycancergenome.org
     • 212 variants
   – MGH SNAPSHOT assay
     • Additional 30 variants
2. “Reverse oncotate” to get the expected genomic variants
   – e.g. BRAF p.V600E → g.chr7:140453136A>T
3. Compare the annotations of the reverse-oncotated variants with the expected results from Step 1
   – e.g. g.chr7:140453136A>T → ???
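
The two selection modes differ only in which of the first two criteria is applied first, and a preferred-transcript list takes precedence over either mode. A minimal Python sketch of that logic follows. It is not Oncotator's implementation: the Transcript record, its field names, the score directions (lower is better), the preference for longer transcripts, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical transcript record; not Oncotator's own data model."""
    tx_id: str
    gencode_level: int   # assumption: lower = better curated
    effect_score: int    # assumption: lower = more severe classification
    appris_rank: int     # assumption: lower = better APPRIS tag
    length: int          # transcript sequence length


def sort_key(tx, mode):
    """Build the five-level ranking tuple for one transcript."""
    if mode == "CANONICAL":
        lead = (tx.gencode_level, tx.effect_score)
    elif mode == "EFFECT":
        lead = (tx.effect_score, tx.gencode_level)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: longer transcripts win; remaining ties break alphabetically.
    return lead + (tx.appris_rank, -tx.length, tx.tx_id)


def select_transcript(candidates, mode="CANONICAL", preferred_ids=frozenset()):
    """Pick one transcript; listed transcripts, if any match, restrict the pool."""
    listed = [tx for tx in candidates if tx.tx_id in preferred_ids]
    pool = listed or list(candidates)
    # The mode ranking only decides among whatever remains in the pool.
    return min(pool, key=lambda tx: sort_key(tx, mode))


# Toy candidates with made-up IDs and scores.
tx_a = Transcript("TX_A", gencode_level=1, effect_score=5, appris_rank=1, length=1000)
tx_b = Transcript("TX_B", gencode_level=2, effect_score=1, appris_rank=1, length=1200)
candidates = [tx_a, tx_b]

print(select_transcript(candidates, "CANONICAL").tx_id)
print(select_transcript(candidates, "EFFECT").tx_id)
print(select_transcript(candidates, "CANONICAL", {"TX_B"}).tx_id)
```

With these toy values, CANONICAL picks the better-curated TX_A, EFFECT picks TX_B because of its more severe classification, and listing TX_B as preferred selects it regardless of mode.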
Results

Tx Selection Approach    Concordance with Expected Annotation
Oncotator Modes
  Canonical              86% (243/283)
  Effect                 91% (257/283)
Transcript Lists
  UniProt Exact          98% (279/283)
  Clinical               100% (283/283)

• Transcript Lists
  – “UniProt Exact” (tx_exact_uniprot_matches.txt)
    • 24,000 transcripts with perfect sequence identity with UniProt record sequence
      – UniProt record == gene’s canonical transcript
  – “Clinical”
    • “UniProt Exact” list + 3 additional transcripts

Conclusions
• We now have a list of preferred transcripts that we recommend using in most settings
  – “-c” option on command line
  – 100% concordance with variant annotations in MyCancerGenome
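
The concordance figures reported in the results are match counts over the set of expected annotations. A minimal Python sketch of that comparison follows. The function name and the dictionary layout are assumptions, and the toy data reuses the two variants mentioned in the slides; the observed BRAF annotation is a placeholder, since the slides do not report it.

```python
def concordance(expected, observed):
    """Count variants whose observed annotation equals the expected one."""
    matches = sum(
        1 for variant, annotation in expected.items()
        if observed.get(variant) == annotation
    )
    return matches, len(expected)


# Expected protein changes keyed by genomic variant (from the slides).
expected = {
    "chr7:140453136A>T": "p.V600E",  # BRAF
    "chr7:55259515T>G": "p.L858R",   # EGFR
}
# Annotations from one selection mode; the BRAF value is a placeholder.
observed = {
    "chr7:140453136A>T": "p.V600E",
    "chr7:55259515T>G": "p.L813R",   # EGFR under CANONICAL, per the slides
}

matches, total = concordance(expected, observed)
print(f"{matches}/{total} concordant")
```

Here the EGFR mismatch (p.L813R instead of p.L858R) is the kind of discordance that a preferred-transcript list is meant to remove.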