Nuts and Bolts of Clinical Genomic Sequencing
Thomas Stricker MD PhD
Vanderbilt University

Next Generation Sequencing

Illumina Sequencing Technology

DNA – the genetic code
•DNA is a double-stranded polymer of 4 bases (A, T, C, G)
•The order (sequence) of A, T, C, G is the genetic code
•A always pairs with T on the opposite strand, and C always pairs with G
•Enzymes called polymerases make copies of DNA by taking a single strand of DNA and then adding A, T, C, G according to the base-pairing rules

Sanger sequencing (modified by Lee Hood)
•Sequencing by synthesis
•Mix many copies of the same DNA molecule, polymerase, A, T, C, G bases, and a small amount of fluorescently labeled, terminated A, T, C, G
•Terminated bases stop extension
•Separate the fragments by size

Illumina Sequencing Technology

What Happened?
1. In vitro amplification, ‘cloning’
2. Flow cell based sequencing by synthesis
3. A draft of the human genome

Illumina Sequencing Technology

Single platform – 4 mutation types

Rare Mutations – Implications for Therapy

Broad spectrum of mutations gives physicians some information…

877 Lung Specimens (843 patients), 7/1/2010-2/28/2013 *

Gene                    Specimens   Frequency
MEK1                    8           0.9%
NRAS                    5           0.5%
PIK3CA                  20          2.2%
PTEN                    2           0.2%
AKT1                    2           0.2%
BRAF                    20          2.2%
ERBB2                   10          1.1%
KRAS                    198         21.7%
EGFR                    135         14.8%
No mutation detected    512         56.1%

* Data courtesy of Dr. William Pao and Dr. Mia Levy

…but without well-annotated sequencing reports, physicians struggle to find best therapy

Oncogene       Frequency (%)   Treatment
EGFR           10-35           Gefitinib, erlotinib, afatinib
ALK fusion     3-7             Crizotinib
MET amp        2-4             Crizotinib
DDR2           ~4              Dasatinib
HER2           2-4             Afatinib
ROS1 fusion    1               Crizotinib
BRAF Y472C     rare            Dasatinib
BRAF V600E     1               Vemurafenib, dabrafenib
RET fusion     1               Cabozantinib
NRAS           1               Trametinib (preclinical)
KRAS           15-25           Selumetinib (with chemo)
FGFR1/2 amp    ~20             AZD4547

The Long Tail of Cancer mutations
45 Recurrently mutated genes in the TCGA breast cancer data set
Range from over 30% to 2% of cases
[Bar chart, y-axis scale 0 to 300; genes on the x-axis, in order:] PIK3CA, TP53, CDH1, GATA3, MAP3K1, KMT2C, NCOR1, PTEN, MAP2K4, RUNX1, ARID1A, TBX3, CBFB, FOXA1, ERBB2, RB1, CTCF, RPGR, SF3B1, FBXW7, PIK3R1, WSCD2, MYB, HIST1H3B, ACTL6B, CASP8, CDKN1B, TBL1XR1, GPS2, AARS, ASB10, FAM86B2, TCP10, ZFP36L1, FAM86B1, ZFP36L2, AQP12A, C1QTNF5, HIST1H2BC, HLA-DRB5, KRAS, EPDR1, FAM20C, PTHLH, THEM5

Human Genome
3 billion base pairs in the human genome
Roughly 1% is in coding sequence

Target Enrichment = Amplicon-based approach

Target Enrichment = Hybrid capture approach

Fusion Detection = Hybrid Capture

Tumor-Normal Contamination

Clinical Utility of NGS

Analysis Schematic
•Order Entry
•Alignment
•SNVs (GATK)
•Indels (Pindel, SLOPE)
•Translocations (Discordant Pairs)
•CNVs (VarScan2)
•Annotation (MyCancerGenome, Others)
•Reporting

Somatic mutation frequencies observed in exomes from 3,083 tumour–normal pairs.
MS Lawrence et al. Nature 000, 1-5 (2013) doi:10.1038/nature12213

Inherited Variants > Somatic
Synonymous SNVs
Non-Synonymous SNVs

Inherited Variants > Somatic

Somatic Coding mutations – 3 to 300

IGV – Genotyping
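
Two figures from the slides can be combined into a rough per-megabase rate: the genome size with its coding fraction (3 billion base pairs, roughly 1% coding) and the range of somatic coding mutations (3 to 300). The Python sketch below is only a back-of-the-envelope illustration. It assumes the 3 to 300 count applies to the entire coding sequence of one tumour, and the function and variable names are hypothetical, not taken from the talk.

```python
# Back-of-the-envelope arithmetic using only numbers quoted in the slides.
# Assumption (not stated in the talk): the 3-300 somatic coding mutation
# count is spread over the whole coding sequence of a single tumour.

GENOME_BP = 3_000_000_000   # "3 billion base pairs in the human genome"
CODING_PERCENT = 1          # "Roughly 1% is in coding sequence"


def coding_base_pairs(genome_bp: int, coding_percent: int) -> int:
    """Approximate number of coding base pairs (integer arithmetic)."""
    return genome_bp * coding_percent // 100


def mutations_per_megabase(mutation_count: int, target_bp: int) -> float:
    """Mutation count divided by the target size in megabases."""
    return mutation_count / (target_bp / 1_000_000)


coding_bp = coding_base_pairs(GENOME_BP, CODING_PERCENT)
low = mutations_per_megabase(3, coding_bp)     # low end of the slide's range
high = mutations_per_megabase(300, coding_bp)  # high end of the slide's range

print(f"Coding sequence: about {coding_bp / 1_000_000:.0f} Mb")
print(f"Somatic coding mutation rate: about {low:.1f} to {high:.1f} per Mb")
```

Integer arithmetic is used for the coding size so that the 1% figure does not introduce floating-point rounding. The result is an order-of-magnitude estimate only.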