Whole Genome Association Study: What are the next steps?
Sven Bergmann
University of Lausanne & Swiss Institute of Bioinformatics
http://serverdgm.unil.ch/bergmann
TC Nestle, 10 Sep. 2009

What is association?
[Figure: SNPs along a chromosome, with a variant linked to a trait]
Genetic variation yields phenotypic variation
[Figure: distributions of the "trait" (phenotype) for the populations carrying each of the two alleles]

Association using regression
[Figure: phenotype plotted against genotype and against coded genotype]

Regression formalism
[Equation with the following annotated terms]
• (monotonic) transformation
• phenotype (response variable) of individual i
• effect size (regression coefficient)
• coded genotype (feature) of individual i
• error (residual)
• p(β=0)
Goal: Find the effect size that best explains all (potentially transformed) phenotypes as a linear function of the genotypes, and estimate the probability (p-value) of the data being consistent with the null hypothesis (i.e. no effect).

Whole Genome Association

Current insights from GWAS:
• Well-powered (meta-)studies with (tens of) thousands of samples have identified a few (dozen) candidate loci with highly significant associations
• Many of these associations have been replicated in independent studies

Current insights from GWAS:
• Each locus explains only a tiny fraction (<1%) of the phenotypic variance
• All significant loci together explain only a small fraction (<10%) of the variance

David Goldstein: “~93,000 SNPs would be required to explain 80% of the population variation in height.”
Common Genetic Variation and Human Traits, NEJM 360;17

So what do we miss?
1. Other variants, such as Copy Number Variations, or epigenetics may play an important role
2. Interactions between genetic variants (GxG) or with the environment (GxE)
3. Many causal variants may be rare and/or poorly tagged by the measured SNPs
4. Many causal variants may have very small effect sizes
5. Overestimation of heritabilities from twin studies?
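The "Regression formalism" slide can be made concrete with a small sketch. It assumes the model has the form T(y_i) = mu + beta * x_i + eps_i, where the intercept mu is our addition and T is the (monotonic) transformation. The function names `associate` and `simulate`, the toy parameters, and the normal approximation used for the p-value are illustrative assumptions, not the method used in the talk.

```python
import math
import random


def associate(genotypes, phenotypes, transform=lambda y: y):
    """Least-squares fit of transform(y_i) = mu + beta * x_i + eps_i.

    Returns (beta, standard error, p-value for beta = 0, r squared).
    The two-sided p-value uses a normal approximation to the t statistic,
    an assumption that is only reasonable for large sample sizes.
    """
    y = [transform(v) for v in phenotypes]
    n = len(y)
    mean_x = sum(genotypes) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(genotypes, y))
    beta = sxy / sxx                      # effect size (regression coefficient)
    mu = mean_y - beta * mean_x           # intercept (assumed, not on the slide)
    rss = sum((v - mu - beta * x) ** 2 for x, v in zip(genotypes, y))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of beta
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    r_squared = 1.0 - rss / syy           # fraction of variance explained
    return beta, se, p_value, r_squared


def simulate(n, maf, beta, seed):
    """Hypothetical toy data: genotypes coded 0/1/2, unit-variance noise."""
    rng = random.Random(seed)
    xs = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate(n=5000, maf=0.3, beta=0.1, seed=1)
    beta_hat, se, p, r2 = associate(xs, ys)
    print(f"beta = {beta_hat:.3f} +/- {se:.3f}, p = {p:.2e}, r^2 = {r2:.4f}")
```

With a true effect of 0.1 standard deviations per allele and an allele frequency of 0.3, the expected variance explained is roughly 0.4%, which is in line with the "<1% per locus" observation above. A real analysis would use a proper t distribution and correct for covariates.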
CNVs can be called from SNP probe intensities
[Figure: intensity of allele A plotted against intensity of allele G]

Copy Number Variations are called with varying uncertainties
[Figure panels: well-separated, reasonably separated, badly separated]

We propose a mixture model for both phenotypes and uncertain genotypes
[Figure: phenotype mixture components and genotype mixture components (the mixture MLE/LRT model)]

Covariates & Interactions
• For which parameters do we have to correct the phenotypes? Age, sex, other SNPs …
• Interactions: Can we test 10^6 x 10^6 interactions, and does it make sense?
$R = a + b_s G_s + c_{s'} G_{s'} + d_{s,s'} G_s G_{s'}$

Network Approaches for Integrative Association Analysis
Using knowledge of physical gene interactions or pathways to prioritize the search for functional interactions

Can we reduce the complexity of the phenotypic data?
[Figure: data matrix of hundreds of samples by many measurements]
New Analysis and Visualization Tools are needed!

New Tools: Module Visualization
http://serverdgm.unil.ch/bergmann/Fibroblasts/visualiser.html

Data Integration: Example NCI60
60 cancer cell lines (9 tissue types)
Drug Response Data: ~5,000 drugs
Gene Expression Data: ~23,000 gene probes

How to identify Co-modules?
Iteratively refine genes, cell lines and drugs to get co-modules
Z Kutalik, J Beckmann & SB, Nature Biotechnology (2008)

Modular Approach for Integrative Analysis of Genotypes and Phenotypes
[Figure: genotypes (individuals by SNPs/haplotypes) and phenotypes (individuals by measurements) connected by modular links]

Acknowledgements
Jacqui Beckmann
People: Zoltán Kutalik, Micha Hersch, Aitana Morton, Diana Marek, Barbara Piasecka, Bastian Peter, Karen Kapur, Alain Sewer, Toby Johnson, Armand Valsessia, Gabor Csardi
Funding: SNSF, SIB, Cavaglieri, Leenaards, SystemsX.ch, European FP
http://serverdgm.unil.ch/bergmann
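The question raised on the "Covariates & Interactions" slide can be sketched numerically. The code below assumes the slide's model reads R = a + b*G_s + c*G_s' + d*G_s*G_s' for a single SNP pair, fits it by ordinary least squares, and counts how many pairs a scan of 10^6 SNPs would involve. The function names, the toy coefficients, and the Bonferroni-style threshold are illustrative assumptions, not part of the talk.

```python
def design_row(g_s, g_t):
    """Features of one individual: intercept, G_s, G_s', and their product."""
    return [1.0, float(g_s), float(g_t), float(g_s * g_t)]


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_interaction(g_s, g_t, response):
    """Least-squares estimates (a, b, c, d) via the normal equations."""
    rows = [design_row(x, z) for x, z in zip(g_s, g_t)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve(xtx, xty)


def pair_count(n_snps):
    """Number of distinct unordered SNP pairs."""
    return n_snps * (n_snps - 1) // 2


if __name__ == "__main__":
    # Hypothetical noise-free toy data covering all 0/1/2 genotype combinations.
    g_s = [x for x in range(3) for _ in range(3)]
    g_t = [z for _ in range(3) for z in range(3)]
    resp = [1.0 + 0.5 * x - 0.2 * z + 0.3 * x * z for x, z in zip(g_s, g_t)]
    print("a, b, c, d =", [round(v, 6) for v in fit_interaction(g_s, g_t, resp)])

    pairs = pair_count(10**6)
    print("pairs to test:", pairs)
    print("Bonferroni-style threshold at 0.05:", 0.05 / pairs)
```

A scan over 10^6 SNPs involves about 5 x 10^11 distinct pairs, so a naive multiple-testing correction would push the per-test significance threshold to around 10^-13. That is one way to see why prioritizing pairs with prior knowledge, as the network slide suggests, is attractive.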