Introduction to molecular networks
Sushmita Roy
BMI/CS 576
www.biostat.wisc.edu/bmi576
Nov 6th, 2014

Understanding a cell as a system
• Measure: identify the parts of a system
  – Parts: different types of bio-molecules
    • genes, proteins, metabolites
  – High-throughput assays to measure these molecules
• Model: how these parts are put together
  – Clustering
  – Network inference and analysis

Omics data provide comprehensive description of nearly all components of the cell
Joyce & Palsson, Nature Mol cell biol. 2006

Key concepts in networks
• What are molecular networks?
  – Different types of networks
  – Graph-theoretic representation
• Computational problems in network biology
  – Network structure and parameter learning
  – Analysis of network properties
  – Inference on networks: for data integration, interpretation and hypothesis generation
• Classes of methods for expression-based network inference
• Probabilistic graphical models for networks
  – Bayesian networks and dependency networks
  – Evaluation of inferred networks

A network
• Describes connectivity patterns between the parts of a system
  – Vertex/Nodes: parts
  – Edges/Interactions: connections
• Edges can have sign and/or weight
• Connectivity is represented as a graph
  – Node and vertex are used interchangeably
  – Edge and interaction are used interchangeably
[Figure: example graph on vertices A to F, with one vertex/node and one edge labeled]

Why are networks important?
• Genes do not function independently
• Identifying these networks is a central challenge in systems biology
• Motivating applications
  – Protein networks for cancer prognosis
  – Networks for interpretation of genetic mutants
  – Networks for gene prioritization

Protein networks for predicting cancer prognosis
[Figure: protein subnetworks (genes shown include BRCA1, TP53, ESR1, ERBB2, JUN and SMAD3), with nodes colored by expression level from downregulated to upregulated]
• Diamonds are differentially expressed genes
• Circles are not differentially expressed but important for outcome
Chuang & Ideker, Annual Reviews 2010

Different types of molecular networks
• Depends on what
  – the vertices represent
  – the edges represent
  – whether edges are directed or undirected
• Molecular networks
  – Vertices are bio-molecules
    • Genes, proteins, metabolites
  – Edges represent interactions between molecules

Transcriptional regulatory networks
Nodes: regulatory protein like a TF, or target gene
Edges: TF A regulates C
Directed, signed, weighted
[Figure: transcription factors (TFs) A and B regulating gene C]
E. coli: 153 TFs and 1319 target genes
S. cerevisiae: 157 TFs and 4410 target genes
Vargas and Santillan, 2008
Figure 2: Representation of the S. cerevisiae transcriptional regulatory network. a) Representation of the transcriptional gene regulatory network of S. cerevisiae.
Green circles represent transcription factors, brown circles denote …

Detecting protein-DNA interactions
• ChIP-chip and ChIP-seq binding profiles for transcription factors
• Determine the (approximate) locations in the genome where a protein binds
Peter Park, Nature Reviews Genetics, 2009

Protein-protein interaction networks
Vertices: proteins
Edges: Protein U physically interacts with protein X
Undirected
[Figure: a protein complex of U, X, Y and Z, and the corresponding interaction graph]
Yeast protein interaction network
Barabasi et al. 2004

Metabolic networks
Vertices: enzymes
Edges: Enzymes M and N share a metabolite
Undirected, weighted
[Figure: enzymes M, N and O acting on metabolites a, b, c and d, and the corresponding enzyme graph]
Figure from KEGG database

Signaling networks
Vertices: Enzymes and other proteins
Edges: Enzyme P modifies protein Q
Directed
[Figure: receptors, proteins P, Q and A, and a TF]
Sachs et al., 2005, Science

Genetic interaction networks
Genetic interaction: the phenotype of the double mutant is significantly different from that of each mutant alone
Vertices: Genes
Edges: Genetic interaction between query (Q) and gene G
Undirected
Dixon et al., 2009, Annu. Rev.
Genet.

Yeast genetic interaction network
Costanzo et al, 2011, Science

Summary of different types of molecular networks
• Physical networks
  – Transcriptional regulatory networks: interactions between regulatory proteins (transcription factors) and genes
  – Protein-protein: interactions among proteins
  – Signaling networks: interactions between proteins and small molecules, and among proteins that relay signals from outside the cell to the nucleus
• Functional networks
  – Metabolic: describe reactions through which enzymes convert substrates to products
  – Genetic: describe interactions among genes which, when genetically perturbed together, produce a significantly different phenotype than when perturbed individually

Computational problems in networks
• Network reconstruction
  – Infer the structure and parameters of networks
  – We will examine this problem in the context of “expression-based network inference”
• Network evaluation
  – Properties of networks
• Network applications
  – Interpretation of gene sets
  – Using networks to infer function of a gene

Network reconstruction
• Given
  – A set of attributes associated with network nodes
  – Typically attributes are mRNA levels
• Do
  – Infer what nodes interact with each other
• Algorithms for network reconstruction can vary based on their meaning of interaction
  – Similarity
  – Mutual information
  – Predictive ability

Computational methods to infer networks
• We will focus on transcriptional regulatory networks
• These networks are inferred from gene expression data
• Many methods to do network inference
  – We will focus on probabilistic graphical models

Modeling a regulatory network
• Structure: Who are the regulators?
  – Hot1 regulates HSP12; HSP12 is a target of Hot1
  [Figure: regulators Hot1 and Sko1 with edges to target HSP12]
• Function: How do they determine expression levels?
  – Inputs X1, X2; output Y; function ψ(X1, X2)
  – Boolean, linear, differential equations, probabilistic, ….
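
The split between structure and function on the regulatory-network slide can be sketched in a few lines of Python. This is only an illustration, not the lecture's model: the names `regulators`, `psi_boolean`, `psi_linear` and `predict` are hypothetical, the Boolean OR logic is an assumed choice of ψ, and the linear weights are made up.

```python
from typing import Callable, Dict, List

# Structure: each target gene maps to its regulators (edges taken from the slide).
regulators: Dict[str, List[str]] = {"HSP12": ["Hot1", "Sko1"]}


# Function: two hypothetical choices for psi(X1, X2). Neither is from the lecture.
def psi_boolean(x1: float, x2: float) -> float:
    """Assumed Boolean OR: the target is on if either regulator is on."""
    return float(bool(x1) or bool(x2))


def psi_linear(x1: float, x2: float, w1: float = 0.7, w2: float = 0.3) -> float:
    """Weighted sum of regulator levels; the weights are illustrative only."""
    return w1 * x1 + w2 * x2


def predict(target: str, state: Dict[str, float],
            psi: Callable[[float, float], float]) -> float:
    """Look up the target's two regulators (structure), then apply psi (function)."""
    x1, x2 = (state[r] for r in regulators[target])
    return psi(x1, x2)


state = {"Hot1": 1.0, "Sko1": 0.0}
print(predict("HSP12", state, psi_boolean))
print(predict("HSP12", state, psi_linear))
```

Swapping `psi_boolean` for `psi_linear` changes only the function while the structure stays fixed, which is the distinction the slide draws between "who are the regulators" and "how they determine expression levels".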
Mathematical representations of networks
Models differ in the function that maps input system state to output state
[Figure: X1 and X2 feed a function f whose output is X3. Input: expression of neighbors. Output: expression of node.]
• Boolean networks: truth table
    Input     Output
    X1  X2    X3
    0   0     0
    0   1     1
    1   0     1
    1   1     1
• Differential equations: rate equations
• Probabilistic graphical models: probability distributions

Network evaluation
• How accurate is the network?
  – In silico validation
  – Agreement with known interactions
  – Agreement with experiments
• Do topological properties of the network represent important biological functions?
  – Degree distributions
  – Network motifs
  – Highly connected nodes and relationship to lethality

Network applications
• Interpretation of gene sets (for example from clustering)
• Integration of two or more different types of datasets
• Graph clustering
• Classification/Inference on graphs

Plan for next lectures
• Overview of expression-based network inference
  – Classes of methods
  – Strengths and weaknesses of different methods
• Representing networks as probabilistic graphical models
  – Bayesian networks
  – Module networks
  – Dependency networks
• Evaluation of inferred networks
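
As a preview of expression-based network inference and its evaluation, here is a minimal sketch of the similarity-based class of methods: connect two genes when their expression profiles are strongly correlated, then score the inferred edges against known interactions. The toy expression values, the gene names, the 0.8 threshold and the function names are all assumptions made for illustration. Pearson correlation stands in for "similarity" and is not necessarily the measure used in the course.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def infer_edges(expr, threshold=0.8):
    """Undirected edges between gene pairs whose |correlation| meets the threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add(frozenset((g1, g2)))
    return edges


def precision_recall(predicted, known):
    """Agreement with known interactions: (precision, recall) over edge sets."""
    tp = len(predicted & known)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(known) if known else 0.0
    return precision, recall


# Toy data (made up): geneB tracks geneA, geneC does not.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
known = {frozenset(("geneA", "geneB"))}

predicted = infer_edges(expr)
print(sorted(tuple(sorted(e)) for e in predicted))
print(precision_recall(predicted, known))
```

Correlation yields undirected edges and cannot tell a regulator from its target, which is one reason the plan above moves on to probabilistic graphical models.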