Transcript Whole genome sequencing - Center for Biological Sequence Analysis
Course on Introduction to microbial whole genome sequencing and analysis
Mette Voldby Larsen
DTU – Center for Biological Sequence Analysis (CBS)
Henrik Hasman
DTU – National Food Institute
Presentation
• Henrik Hasman
• Ph.D. in molecular microbiology (1999)
• Has been working at DTU – National Food Institute since 2000
• Main topics are antimicrobial resistance, genetic engineering of microorganisms and practical applications of NGS in clinical microbiology.
What do we do
• Applied research in evolution and spread of pathogenic bacteria with focus on antimicrobial resistance and bacterial typing.
• Drug development for control of infections.
• Development of bioinformatic solutions, especially for clinical microbiology.
• WHO Collaborating Center and EU Reference Laboratory for antimicrobial resistance (EURL AR).
• Coordinator of COMPARE (Horizon 2020).
Mette Voldby Larsen
2002: Cand. scient. in Biology from University of Copenhagen
2007: PhD in Immunological Bioinformatics from Center for Biological Sequence Analysis (CBS), DTU
2007-2012: Assistant professor at CBS, DTU
2012: Associate professor at CBS, DTU
> Primary research fields: Developing methods for whole-genome based prediction of microorganisms' type, phenotype, phylogeny etc. Recently also phages.
> Teaching, study leader for Human Life Science Engineering
More than 150 employees = one of the largest bioinformatics groups within academia in Europe.
Web services run a total of more than 1 million jobs per month.
The flagship is “SignalP”, which predicts protein localization
The course
Learning objectives:
• Understand the most common NGS technologies and terminology.
• Learn how to prepare raw data from the sequencer for further bioinformatic analysis.
• Be able to use tools for in silico detection of plasmid, resistance and virulence genes.
• Be able to perform global and local WGS analysis to determine clonal relationship of bacteria (SNP, ND, MLST).
• Cases and discussion of relevant literature.
• Learn about metagenomics in clinical microbiology.
Introduction to NGS
Today
Welcome
Introduction to Next Generation Sequencing
Illumina presentation
Intro to sequencing, raw data and assembly
Lunch (Sandwiches)
Journal club
Introduction to CGE single isolate, single services
Computer work w. single isolates and single services
Coffee
Computer work w. single isolates and single services
Wrap-up of computer work
Introduction to NGS
Tomorrow
Welcome back
Case - VTEC diagnostics
Coffee
Introduction to the SNP/ND concept
Computer work w. VTEC
Lunch (Sandwiches)
Wrap-up of computer work
Computer work w. CSIPhylogeny and NDtree
Coffee
Batch upload and the pipeline
The map
Computer work w. batch upload and the map
Sponsored dinner in Lyngby at 18.30
Introduction to NGS
Friday
Welcome back
Wrap-up of computer work
Metagenomics
Coffee
Case - Urine infections
CLCBio presentation
Computer work w. MGMapper/your own data
Lunch (Sandwiches)
Computer work w. MGMapper/your own data
Wrap-up of computer work
Implementing NGS in a clinical laboratory
Future perspectives and GMI/COMPARE
Course evaluation and goodbye
Coffee
And now to you..?
• Who are YOU?
• Where do you come from (country/institution)?
• Your daily work?
• Experience with NGS/WGS?
• Your motivation for joining the course?
Introduction to NGS
Next Generation Sequencing
One method to rule them all…
1981 £35000 2006 £2600
Ray Kurzweil
100 200£ + 2-3£ for App…
Workflow today at the clinical laboratory
Identification: Family, Genus, Species, (Subspecies)
Typing: Serovar, Phage type, Ribotype, Resistograms, PFGE type, MLVA type, MLST type, DNA microarray analysis, Full genomic DNA sequence
Selecting an appropriate typing method can depend on initial (less discriminatory) pre-typing, and going directly for the most discriminatory method can sometimes be misleading.
Typing methods
• Phenotypic
– Serotyping (antibodies)
– Phage typing (virus susceptibility)
– Biotyping (ability to grow on different substrates)
– Antimicrobial resistance
– Protein profiles
• Genotypic
– DNA fingerprint (RAPD, AFLP, ERIC, MLVA)
– DNA sequencing (MLST, spa, dru, full genome)
Workflow with WGS at the clinical laboratory
Didelot et al, 2012.
DNA sequencing
Applied Biosystems (ABI) Genetic Analyser
“First Generation” sequencing machine (capillary Sanger sequencing)
Limitations
• Limitation
The size of DNA fragments that can be read in this way is about 700 bp, and it takes a long time to run even a few genes.
• Problem
Most genomes are enormous (e.g. roughly 3 × 10^9 base pairs in the case of human), so they cannot be sequenced directly. Sequencing them is called Large Scale Sequencing.
Solution
• Solution
Break the DNA into small fragments randomly
Sequence the readable fragments directly
Assemble the fragments together to reconstruct the original DNA
(Figure labels: Scaffolder, gaps)
Solving a one-dimensional jigsaw puzzle with millions of pieces (without the box)!
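The break / sequence / assemble idea can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions, not how production assemblers work: the function names (`shotgun_fragments`, `overlap`, `greedy_assemble`) and the 30-base sequence are made up for this example, fragments are cut at a fixed step instead of randomly so the result is reproducible, reads are error-free, and the toy sequence contains no repeats. Real genomes have repeats and real reads have errors, which is why gaps remain and scaffolding is needed.

```python
def shotgun_fragments(genome, read_len, step):
    """Cut overlapping 'reads' out of a sequence.

    Real shearing is random; a fixed step keeps this toy deterministic.
    """
    last_start = len(genome) - read_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the end of the sequence is covered
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the pair of reads with the longest overlap.

    Returns the list of contigs that are left when no overlaps remain.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: the rest are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical repeat-free toy sequence (an assumption, not real data).
genome = "ATGCGTACCTGAAGTCTTAGGCATCCAGTT"
reads = shotgun_fragments(genome, read_len=10, step=4)
contigs = greedy_assemble(list(reversed(reads)), min_len=4)  # read order should not matter here
print(len(reads), "reads ->", len(contigs), "contig(s)")
print("reconstructed original:", contigs == [genome])
```

The all-against-all comparison is quadratic in the number of reads, which is fine for six reads but not for millions; that scale is what makes the jigsaw puzzle hard in practice.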
NGS output
Huge numbers of small fragments (35-500 bp)
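To get a feeling for what "huge numbers" means, one can estimate how many reads are needed to cover a genome a given number of times. The sketch below uses the common definition of average coverage (number of reads × read length / genome size). The 5 Mb genome size and the 30× coverage target are assumptions chosen for illustration, not values from the slides; only the 35-500 bp read length range comes from the slide above.

```python
import math


def reads_for_coverage(genome_size_bp, read_length_bp, coverage):
    """Smallest number of reads with reads * read_length / genome_size >= coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


genome_size = 5_000_000  # assumed size of a bacterial genome, in bp
target_coverage = 30     # assumed average coverage

for read_length in (35, 150, 500):
    n_reads = reads_for_coverage(genome_size, read_length, target_coverage)
    print(f"{read_length} bp reads: about {n_reads:,} reads for {target_coverage}x coverage")
```

Shorter reads mean more pieces for the same coverage, and more pieces mean a harder assembly puzzle.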
Second generation sequencing
Loman et al, 2012
Platforms
Loman et al, 2012
Platforms
Next generation sequencing machines
454 Life Sciences (Roche)
First Next Generation Sequencing machine
Illumina HiSeq/GAII systems
High throughput systems
Ion Torrent PGM system
Low/medium throughput system
Illumina MiSeq system
Medium throughput system
Oxford Nanopore (MinION)
Single-molecule sequencing
Raw DNA sequences
Rough assembly and compression
Fine assembly
Identification
Gene finding
Comparison
Summary of:
What is it?
Has it been seen before?
How can we fight/treat it?
What is new/unusual?
Google Maps-like view
• Reports
• Outbreaks
What is already known?
Pathogenicity islands
Virulence genes
Resistance genes
MLST type
What is novel?
Vaccine targets
Virulence genes
Resistance genes
SNPs
Workflow with WGS at the clinical laboratory
4-6 hours
Modified from Didelot et al., 2012.
Wet-Lab Workflow
Analysis tools
Library DNA purification DNA barcoding