High-throughput genotyping



Genotyping & Haplotyping

Monday, 27 April 2020

Finnish Genome Center


Genotyping

• Analysis of DNA sequence variation
• Human DNA sequence is 99.9% identical between individuals → about 3,000,000 varying nucleotides
• Polymorphism: normal variation between individuals (frequency > 1% of the population)
• Genetic variation
  • May cause or predispose to inheritable diseases
  • Determines e.g. individual drug response
  • Is used as markers to identify disease genes


Important terms

• Allele
  • Alternative form of a gene or DNA sequence at a specific chromosomal location (locus)
  • At each locus an individual possesses two alleles, one inherited from each parent
• Genotype
  • Genetic constitution of an individual; the combination of alleles
• Genetic marker
  • Polymorphisms that are highly variable between individuals: microsatellites and single nucleotide polymorphisms (SNPs)
  • A marker may be inherited together with the disease-predisposing gene because of linkage disequilibrium (LD)


Linkage disequilibrium, LD

• Alleles are in LD if they are inherited together more often than would be expected from their allele frequencies
• Two loci are inherited together because recombination during meiosis rarely separates them
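The definition above can be made concrete with the standard LD coefficients. The sketch below is illustrative only: it assumes two biallelic loci with known haplotype and allele frequencies, and the function name and example frequencies are hypothetical, not taken from the slides. D is the excess of the observed haplotype frequency over the product of the allele frequencies; D' and r² are the usual normalisations of D.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # D is zero when the alleles are inherited independently
    d = p_ab - p_a * p_b
    # Largest magnitude D can reach given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical frequencies: the A-B haplotype is seen in 40% of chromosomes,
# although independent inheritance would predict 0.5 * 0.5 = 25%.
d, d_prime, r_squared = ld_measures(p_ab=0.4, p_a=0.5, p_b=0.5)
print(d, d_prime, r_squared)
```

In this sketch D' keeps the sign of D; many tools report its absolute value instead.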


Microsatellite markers

Di-, tri-, tetranucleotide repeats

GAACGTACT CACACACACACACA TTTGAC TTCGATGATA GATAGATAGATAGATA CGT

• The number of repeats varies (up to about 30)
• Highly polymorphic
• Distributed evenly throughout the genome
• Easy to detect by PCR
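As a rough illustration of what a microsatellite looks like in sequence data, the sketch below scans the example sequence from this slide for di- to tetranucleotide motifs repeated in tandem. The function name and the thresholds (`min_copies=3`) are assumptions chosen for the example; real repeat finders are more tolerant of imperfect repeats.

```python
import re


def find_tandem_repeats(seq, min_unit=2, max_unit=4, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq."""
    # ([ACGT]{2,4}?) captures the shortest possible motif; \1{n,} requires
    # at least n further copies directly after it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    repeats = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        repeats.append((match.start(), unit, copies))
    return repeats


# Example sequence from the slide, with the spacing removed
slide_sequence = (
    "GAACGTACT" "CACACACACACACA" "TTTGAC" "TTCGATGATA" "GATAGATAGATAGATA" "CGT"
)
for start, unit, copies in find_tandem_repeats(slide_sequence):
    print(start, unit, copies)
```

Because the flanking sequence before the tetranucleotide repeat happens to end in the same motif, a scan of the joined string can report one more copy than the slide highlights.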


SNP markers

Single Nucleotide Polymorphisms (SNPs)

GTGGACGTGCTT [G/C] TCGATTTACCTAG

• The simplest and most common type of polymorphism
• Highly abundant: about one SNP every 1000 bp along the human genome
• Most SNPs do not affect cell function
• Some SNPs could predispose people to disease or influence an individual's response to a drug
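The bracket notation used in the example, with the two alleles written as [G/C] between the flanking sequences, can be expanded into the two allele-specific sequences. This is a minimal sketch; the function name is hypothetical and only single-base, two-allele SNPs are handled.

```python
import re


def expand_snp(notation):
    """Expand 'LEFT [X/Y] RIGHT' into (position, {allele: full sequence}).

    position is the 0-based index of the variable base in the full sequence.
    """
    compact = notation.replace(" ", "")
    match = re.fullmatch(r"([ACGT]*)\[([ACGT])/([ACGT])\]([ACGT]*)", compact)
    if match is None:
        raise ValueError("expected a sequence like ACGT[A/G]ACGT")
    left, allele1, allele2, right = match.groups()
    sequences = {
        allele1: left + allele1 + right,
        allele2: left + allele2 + right,
    }
    return len(left), sequences


position, sequences = expand_snp("GTGGACGTGCTT [G/C] TCGATTTACCTAG")
print(position)
for allele, sequence in sequences.items():
    print(allele, sequence)
```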


SNP genotyping techniques

• Over 100 different approaches
• Ideal SNP genotyping platform:
  • high-throughput capacity
  • simple assay design
  • robust
  • affordable price
  • automated genotype calling
  • accurate and reliable results


...SNP genotyping techniques

• PCR
• Discrimination between alleles:
  • allele-specific hybridization
  • allele-specific primer extension
  • allele-specific oligonucleotide ligation
  • allele-specific enzymatic cleavage
• Detection of the allelic discrimination:
  • light emitted by the products
  • mass
  • change in electrical properties


High-throughput genotyping; Finnish Genome Center as an example

• Independent department of the University of Helsinki since 1998
• National core facility for genetic research on multifactorial diseases
• Provides collaboration and genotyping services to scientists and research groups in Finland and abroad


Goals of the Finnish Genome Center

• Help design genetic studies
• Perform high-throughput genotyping
• Perform data analysis
• Train scientists
• Adopt and develop new strategies & technologies


Research strategies

• Genome-wide scan
  • ~400 microsatellite markers at 10 cM intervals
  • Family data
• Fine mapping
  • Candidate regions identified by a genome scan
  • Project-specific microsatellite or SNP markers
• SNP genotyping
  • Candidate genes
  • Fine mapping
  • Sequenom MassARRAY (MALDI-TOF)


Setting up PCR-reactions


Electrophoresis run for microsatellites

[Figure: electropherogram traces for three samples]

Well  Run               Sample     Q score  Allele 1     Allele 2
C04   HDT1.PA3.020902A  HDT.111    1.5      248.6 (19)   250.5 (20)
G02   HDT1.PA3.020902A  OA.20015   3.3      98.7 (19)    104.7 (22)
E08   HDT1.PA3.020902A  HDT.402    2.4      232.8 (15)   254.7 (26)


Microsatellite data

Marker  Well ID  SampleID  Allele1  Allele2  Size1   Size2
D7S513  H01      OA.11616  26       28       190.93  195.02
D7S517  C07      DYS.5020  26       26       262.19  262.19
D7S640  B02      DYS.3819  26       29       133.41  139.41
D7S640  G12      OA.1528   26       29       133.59  139.46
D7S669  E05      OA.11615  26       29       190.37  196.61
D8S258  B06      DYS.5001  26       27       159.38  161.38
D8S260  C02      DYS.3931  26       26       215.57  215.57
D8S264  H01      OA.11616  26       26       158.86  158.86


SNP genotyping with MassARRAY (MALDI-TOF)

• Primer extension reactions designed to generate different-sized products
• Analysis by mass spectrometry

[Figure: primer extension across a C/T (G/A) SNP; nucleotide mix dGTP, dATP, dTTP, ddCTP]

Species            Sequence                 Mass in Daltons
Extendable primer  GGACCTGGAGCCCCCACC       5430.5
C analyte          GGACCTGGAGCCCCCACC C     5703.7
T analyte          GGACCTGGAGCCCCCACC T C   5976.9
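Since each allele yields an extension product of a different mass, genotype calling reduces to checking which expected masses appear among the detected peaks. The sketch below uses the masses listed above, with the third value read as 5976.9 Da. The function name, the peak lists and the 2 Da matching tolerance are assumptions made for illustration, not the calling algorithm of the MassARRAY software.

```python
# Expected analyte masses in Daltons, taken from the slide
# (the T analyte mass is read as 5976.9; this reading is an assumption).
EXPECTED_MASSES = {"C": 5703.7, "T": 5976.9}


def call_genotype(peak_masses, expected=EXPECTED_MASSES, tolerance=2.0):
    """Return a genotype string such as 'CC', 'CT' or 'TT', or None if no call.

    peak_masses: masses (Da) of the peaks detected in the spectrum
    tolerance:   maximum allowed deviation in Da (hypothetical value)
    """
    # An allele is present if any peak lies close to its expected mass
    alleles = sorted(
        allele
        for allele, mass in expected.items()
        if any(abs(peak - mass) <= tolerance for peak in peak_masses)
    )
    if not alleles:
        return None  # only unextended primer or noise was detected
    if len(alleles) == 1:
        return alleles[0] * 2  # one analyte peak: homozygote
    return "".join(alleles)  # both analyte peaks: heterozygote


# Hypothetical spectra: residual primer plus one or two analyte peaks
print(call_genotype([5430.4, 5703.9]))
print(call_genotype([5430.6, 5703.5, 5977.2]))
```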


Mass spectrometry multiplexing


SNP data

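[Table: SNP genotyping output; SAMPLE_ID and GENOTYPE values are not preserved in the transcript. Values are listed per column in the order they appear.]

ASSAY_ID:     rs10563, rs10563, rs3527, rs6779, rs135627, rs42778, rs755555, rs45167, rs47890
CHIP_ID:      1, 5, 6, 2, 2, 3, 4, 1, 1
WELL_ID:      A01, A02, B05, A01, B02, C04, D12, E10, F01
DESCRIPTION:  A.Conservative, A.Conservative, A.Conservative, B.Moderate, A.Conservative, A.Conservative, A.Conservative, A.Conservative, A.Conservative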


SNP genotyping workflow at FGC

[Flowchart; boxes listed in the order they appear] DNA samples; PCR; Digestion; Pooling of PCR products; Purification (SAP + Exo I); Primer extension; Sephadex purification; Purification (SAP); Primer extension; Cation resin purification; Gel electrophoresis; Capillary electrophoresis; MALDI-TOF mass spectrometry; LIMS database; Allele calling


Haplotype

• Multiple loci on the same chromosome that are inherited together
• Usually a string of SNPs that are linked

[Figure labels: locus, alleles, haplotypes]


Haplotype construction

• No good molecular methods are available to identify haplotypes

Genotypes: SNP1 A/T, SNP2 G/C
Haplotypes, two alternatives: A-G with T-C, or A-C with T-G

→ Computational methods are needed to create haplotypes from genotype data
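The ambiguity in the example above can be enumerated directly: each heterozygous SNP can be phased two ways, so the genotypes are compatible with several haplotype pairs. The sketch below lists them for unphased genotypes; the function name and input format are hypothetical.

```python
from itertools import product


def haplotype_pairs(genotypes):
    """List the haplotype pairs compatible with unphased genotypes.

    genotypes: one (allele, allele) tuple per SNP, in chromosomal order.
    Returns a sorted list of (haplotype, haplotype) tuples.
    """
    pairs = set()
    # Try both orientations of every SNP; mirrored solutions collapse
    # because each pair is stored in sorted order.
    for flips in product([False, True], repeat=len(genotypes)):
        hap1, hap2 = [], []
        for (a, b), flip in zip(genotypes, flips):
            if flip:
                a, b = b, a
            hap1.append(a)
            hap2.append(b)
        pairs.add(tuple(sorted(("".join(hap1), "".join(hap2)))))
    return sorted(pairs)


# The example from the slide: SNP1 is A/T and SNP2 is G/C
for pair in haplotype_pairs([("A", "T"), ("G", "C")]):
    print(pair)
```

With n heterozygous SNPs there are 2^(n-1) candidate pairs, which is why statistical methods are used to pick the most likely one.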


...Haplotype construction

• Family-based haplotype construction
  • Linkage analysis software: Simwalk, Merlin, Genehunter, Allegro...
• Population-based haplotype construction
  • Not as reliable as family-based
  • EM algorithm (expectation-maximization algorithm), described at http://www-gene.cimr.cam.ac.uk/clayton/software/
  • SnpHap
  • PHASE
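To show the idea behind the EM algorithm, here is a minimal sketch for the simplest case of two biallelic SNPs. It is not the implementation used by SnpHap or PHASE; the function name, the 0/1/2 genotype coding and the fixed iteration count are assumptions. Only individuals heterozygous at both SNPs are ambiguous, and the E-step splits them between the two possible phasings in proportion to the current haplotype frequency estimates.

```python
from collections import Counter

# Haplotypes over two SNPs, alleles coded 0 and 1
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate haplotype frequencies from unphased two-SNP genotypes.

    genotypes: list of (g1, g2), each the number of copies (0, 1 or 2)
               of allele 1 at SNP1 and SNP2.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    n_chromosomes = 2 * len(genotypes)
    known = Counter()  # haplotypes whose phase is unambiguous
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 00/11 or 01/10
            continue
        # With at most one heterozygous SNP both haplotypes are determined
        known[(1 if g1 == 2 else 0, 1 if g2 == 2 else 0)] += 1
        known[(1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)] += 1

    freqs = {h: 0.25 for h in HAPLOTYPES}  # uninformative starting point
    for _ in range(iterations):
        # E-step: probability that a double heterozygote is 00/11
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        share_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = Counter(known)
        expected[(0, 0)] += n_double_het * share_cis
        expected[(1, 1)] += n_double_het * share_cis
        expected[(0, 1)] += n_double_het * (1 - share_cis)
        expected[(1, 0)] += n_double_het * (1 - share_cis)
        # M-step: re-estimate frequencies from the expected counts
        freqs = {h: expected[h] / n_chromosomes for h in HAPLOTYPES}
    return freqs


# Hypothetical sample: two homozygotes and one double heterozygote
print(em_haplotype_frequencies([(0, 0), (2, 2), (1, 1)]))
```

In this toy sample the unambiguous individuals carry only 0-0 and 1-1 haplotypes, so the estimate moves the double heterozygote towards the 00/11 phasing.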


Haplotype blocks

• Low recombination rate in the region
• Strong LD
• Low haplotype diversity
• A small number of SNPs in the block is enough to identify the common haplotypes (tag SNPs)
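The tag SNP idea can be sketched as a small search: find the fewest SNP positions whose alleles still tell all common haplotypes in a block apart. The haplotypes below are invented for illustration and the function name is hypothetical; the exhaustive search is only practical for small blocks, and published tagging methods use other criteria such as r² thresholds.

```python
from itertools import combinations


def minimal_tag_snps(haplotypes):
    """Return the smallest tuple of SNP positions that distinguishes all haplotypes.

    haplotypes: equal-length strings, one character per SNP.
    """
    n_snps = len(haplotypes[0])
    distinct = set(haplotypes)
    for size in range(1, n_snps + 1):
        for positions in combinations(range(n_snps), size):
            # Alleles seen at the candidate positions, one signature per haplotype
            signatures = {tuple(h[i] for i in positions) for h in distinct}
            if len(signatures) == len(distinct):
                return positions
    return tuple(range(n_snps))


# Four hypothetical common haplotypes over a six-SNP block
block = ["AGCTAC", "AGTTGC", "CATTAC", "CACAGT"]
print(minimal_tag_snps(block))
```

Here two of the six SNPs are enough to identify each of the four haplotypes, so the other four would not need to be genotyped.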


Formation of haplotype blocks

[Figure: formation of haplotype blocks. Panel labels: chromosomes, meiosis, recombination, few generations, hundreds of generations]


• Block size ranges from 1 to 150 kb
• Average block size
  • African populations: 11 kb
  • Non-African populations: 22 kb
• 60%-80% of the genome is in blocks of > 10 kb


Block frequencies

Typically, only 3-5 common haplotypes account for >90% of the observed haplotypes
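The statement above is a cumulative-frequency check, which can be sketched as follows. The haplotype frequencies are made up for the example and the function name is hypothetical.

```python
def haplotypes_needed(frequencies, coverage=0.9):
    """Return how many of the most common haplotypes reach the given coverage."""
    total = 0.0
    for count, freq in enumerate(sorted(frequencies, reverse=True), start=1):
        total += freq
        if total >= coverage:
            return count
    return None  # the listed haplotypes never reach the requested coverage


# Hypothetical haplotype frequencies within one block
block_frequencies = [0.45, 0.30, 0.17, 0.05, 0.03]
print(haplotypes_needed(block_frequencies))
```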


Benefits of haplotypes instead of individual SNPs

• Information content is higher
• Gene function may depend on more than one SNP
• Smaller number of markers required
• The number of false-positive associations is reduced
• Missing genotypes can be replaced by computational methods
• Genotyping errors can be eliminated
• Challenges:
  • Haplotypes are difficult to determine directly in the lab; computational methods are needed
  • Defining block borders is ambiguous; several different algorithms exist


The HapMap project

• International collaboration to create a map of human genetic variation
• The map is based on common haplotype patterns
• Includes information on
  • SNPs (location, frequency, sequence)
  • Haplotype block structure
  • Distribution of haplotypes in different populations
