Discovering Cellular Machinery

Understanding Science Through
the Lens of Computation
Richard Karp
Visit Day 2008
The Computational Lens
• In many sciences, the natural processes
being studied are computational in nature.
Viewing natural or engineered systems
through the lens of their computational
requirements or capabilities, made rigorous
through the theory of algorithms and
computational complexity, provides
important new insights and ways of
thinking.
Computational Processes in the
Sciences
• Regulation of protein production,
metabolism and embryonic development
• Phase transitions of physical systems
• Mechanisms of learning
• Molecular self-assembly
• Strategic behavior of companies
• Evolution of Web-based social networks
The Computational Lens at
Berkeley
• The Web, the Internet and Computational Game
Theory (Christos Papadimitriou)
• Quantum Computing (Umesh Vazirani)
• Statistical Physics (Alistair Sinclair, Elchanan
Mossel)
• Computational Molecular Biology (Michael
Jordan, Richard Karp, Elchanan Mossel, Christos
Papadimitriou, Satish Rao, Yun Song)
A Computational View of
Quantum Physics
• Quantum physics is the right setting for
studying computation at subatomic levels.
• Theory of Computation (ToC) is
fundamental for understanding the power of
quantum computation.
• Quantum computation, and hence ToC, will
test the foundations of quantum physics.
Testing the Foundations
“Quantum computing is as much about
testing quantum physics as it is about
building powerful computers.”
Umesh Vazirani
Highlights
• Construction of a universal quantum Turing
machine (Bernstein, Vazirani)
• Definition of BQP, the class of problems
efficiently solvable on a quantum Turing
machine (Bernstein, Vazirani)
• Quantum Fourier Transform algorithm, a
tool for Shor’s polynomial-time factoring
algorithm (Hales, Hallgren, Vazirani)
Links Between Statistical Physics
and Computer Science
• Both fields study how macroscopic
properties of large systems arise from local
interactions.
– Statistical physics: properties of water and
magnetic materials
– Computer science: global properties of World
Wide Web, structure of complex combinatorial
problems
Similarities of Models and
Methods
• Probabilistic models capture statistical
behavior of large, complex, heterogeneous
and incompletely known systems.
• Phase transitions in statistical physics have
close parallels with sharp thresholds in
computer science.
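One standard illustration of a sharp threshold is the emergence of a giant connected component in a random graph. The sketch below is not taken from the talk: it assumes the Erdős–Rényi model G(n, p) with p = c/n, and the function name and parameters are illustrative. Below c = 1 the largest component stays tiny; above it a constant fraction of the vertices join one component.

```python
import random


def largest_component_fraction(n, c, seed=0):
    """Sample G(n, p) with p = c / n and return the fraction of vertices
    that lie in the largest connected component (union-find based)."""
    rng = random.Random(seed)
    p = c / n
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    # Include each of the n*(n-1)/2 possible edges independently with probability p.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                union(u, v)

    return max(size[find(v)] for v in range(n)) / n


if __name__ == "__main__":
    for c in (0.5, 0.9, 1.1, 2.0):
        print(c, largest_component_fraction(600, c))
```

Sweeping c across 1 shows the abrupt change that the statistical-physics analogy calls a phase transition.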
Areas of Convergence
• Constraint satisfaction problems
• Belief propagation and error-correcting
codes
• Markov Chain Monte Carlo
• Percolation and sensor networks
Highlights
• Randomized polynomial-time algorithm for
computing the permanent of a nonnegative
matrix (Jerrum, Sinclair, Vigoda)
• Survey propagation, the best known method
for solving random satisfiability problems,
combines ideas from statistical physics and
computational learning theory.
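To make the first highlight concrete, the permanent of an n × n matrix is the sum, over all permutations, of the products of the selected entries (the determinant without the signs). The sketch below computes it directly from that definition. It is a brute-force illustration that takes time proportional to n!, not the Jerrum–Sinclair–Vigoda algorithm, which is a randomized approximation scheme based on Markov chain sampling.

```python
from itertools import permutations
from math import prod


def permanent(matrix):
    """Permanent of a square matrix, straight from the definition.

    Sums, over every permutation sigma, the product of matrix[i][sigma[i]].
    Exponential time: only usable for very small matrices.
    """
    n = len(matrix)
    return sum(
        prod(matrix[i][sigma[i]] for i in range(n))
        for sigma in permutations(range(n))
    )


if __name__ == "__main__":
    # For a 0/1 matrix, the permanent counts perfect matchings of the
    # bipartite graph whose biadjacency matrix it is.
    print(permanent([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))
```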
Computational Models of the
Web and the Internet
• “For the first time, we had to approach an
artifact with the same puzzlement with
which the pioneers of other sciences had to
approach the universe, the cell, the brain,
the market” Christos Papadimitriou
Computational Models of the
Web
• The Internet and the Web are simultaneously
computational, social and economic. They support
new modes of interaction.
• Novel algorithmic problems: ranking methods of
search engines, reputation systems,
recommendation systems, design of auctions and
other economic mechanisms, optimal placement of
on-line advertisements.
Highlight
• “Computing a Nash Equilibrium is
PPAD-Complete” (Daskalakis, Goldberg,
Papadimitriou)
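The hardness result concerns finding an equilibrium; checking a proposed one is easy. The sketch below, with illustrative names that do not come from the talk, verifies whether a pair of mixed strategies is a Nash equilibrium of a two-player game given by payoff matrices A (row player) and B (column player). It relies on the fact that a mixed strategy is a best response exactly when no pure strategy earns more against the opponent's strategy.

```python
def is_nash_equilibrium(A, B, x, y, tol=1e-9):
    """Return True if mixed strategies (x, y) form a Nash equilibrium.

    A[i][j] and B[i][j] are the payoffs to the row and column player when
    the row player picks i and the column player picks j.
    """
    rows, cols = len(A), len(A[0])

    # Expected payoff of each pure strategy against the opponent's mix.
    row_payoffs = [sum(A[i][j] * y[j] for j in range(cols)) for i in range(rows)]
    col_payoffs = [sum(B[i][j] * x[i] for i in range(rows)) for j in range(cols)]

    # Expected payoff each player actually receives under (x, y).
    row_value = sum(x[i] * row_payoffs[i] for i in range(rows))
    col_value = sum(y[j] * col_payoffs[j] for j in range(cols))

    # Equilibrium: neither player can gain by deviating to a pure strategy.
    return max(row_payoffs) <= row_value + tol and max(col_payoffs) <= col_value + tol


if __name__ == "__main__":
    # Matching pennies: the only equilibrium is for both players to mix 50/50.
    A = [[1, -1], [-1, 1]]
    B = [[-1, 1], [1, -1]]
    print(is_nash_equilibrium(A, B, [0.5, 0.5], [0.5, 0.5]))
    print(is_nash_equilibrium(A, B, [1.0, 0.0], [1.0, 0.0]))
```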
Social Sciences and the Web
• The Web is a powerful laboratory for
studying social and economic systems as
computational processes.
• Insights from algorithmic game theory are
indispensable for understanding the new
markets and economic mechanisms that the
Internet has spawned.
Computational Processes in
Biology
• Learning in neural networks
• Response of immune system to an invading
microbe
• Specialization of cells during embryonic
development
• Collective behavior of animal communities:
flocking of birds, self-organization of ant colonies
• Design of sensor-actuator control systems for
regulation of biological processes
• Evolution of species
Highlights
• “Optimal Phylogenetic Reconstruction”
(Daskalakis, Mossel, Roch) determines the
minimum length of DNA sequences needed to
reconstruct the evolutionary history of a set of
species.
• “Identification of Protein Complexes Conserved in
Yeast, Worm and Fly” (Karp et al.) infers
molecular machines using cross-species analysis
of protein interaction data.
A Challenge for the Future
“We can approach understanding how the whole
genome works by breaking it down into groups of
genes that interact strongly with each other. Once
researchers identify and understand these network
modules, the next step will be to figure out the
interactions within networks of networks, and so
on until we eventually understand how the whole
genome works, many years from now.”
Gary Odell
And so …
• The algorithmic worldview is changing the
sciences: mathematical, natural, life, social.
• CS is placing itself at the center of scientific
discourse and exchange of ideas.
• And this is only the beginning …
The Power of the Computational
Perspective
• Exposes the computational nature of natural
processes and provides a language for their
description.
• Brings to bear fundamental algorithmic concepts:
adversarial and probabilistic models, asymptotic
analysis, intractability, computational learning
theory, threshold behavior, fault tolerance, …
• Alters the worldviews of many scientific fields.
Algorithmic Challenges in
Computational Molecular
Biology
Revolution in Biology
• Advances in computation and instrumentation
enable a quantitative characterization of biological
systems.
• Opportunity to advance understanding of
molecular processes of life and change the ways
we diagnose and treat disease.
• Multidisciplinary field: involves the biological,
physical, engineering and mathematical sciences.
Biological background
The eukaryotic cell
Goals of Computational
Molecular Biology
• Sequence and compare the genomes of
many organisms.
• Identify the genes and determine the
functions of the proteins they encode.
• Understand how genes, proteins and other
molecules act in concert to control cellular
processes.
Goals of Computational
Molecular Biology (Cont.)
• Trace the evolutionary history and
evolutionary relationships of existing
species.
• Understand the structure, function and
evolutionary history of proteins.
• Identify the associations between genetic
mutations and disease.
Regulation of Gene Expression
• Animals can be viewed as highly complex,
precisely regulated spatial and temporal
arrays of differential gene expression.
• Gene expression is regulated by a complex
network of interactions among proteins,
genomic DNA, RNA and chemicals within
the cell.
Levels of Regulation
• Genome: spells the names of the proteins.
• Transcription of genes to mRNA: regulated by
binding of transcription factors to DNA in control
regions of genes.
• Translation of mRNA into functioning proteins:
regulated by complex networks of protein-protein
and protein-RNA interactions, and by
post-translational modifications of proteins.
Levels of Regulation (Cont.)
• Regulation of metabolic processes: complex
network of chemical reactions catalyzed by
enzymes.
• Global phenotype such as disease: regulated
by interaction of many metabolic processes.
Key Research Areas
• Analysis of protein-DNA interactions: breaking
the cis-regulatory code.
“Regulatory interactions mandated by circuitry
encoded in the genome determine whether each
gene is expressed in each cell, throughout
developmental space and time, and, if so, at what
amplitude.” Eric Davidson
• Analysis of protein-protein interactions:
identification of molecular machines and signal
transduction cascades.
Tools for Analysis
• Measurement of protein-DNA and protein-protein
interactions, and of mRNA production under
perturbed conditions.
• DNA sequence analysis to identify genes, their
regulatory regions and the transcription factor
binding sites within them.
• Phylogenetic analysis to identify regulatory
structures conserved across species.
• Classification of proteins according to structure
and function.
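As an illustration of the sequence-analysis tool listed above, one common way to look for transcription factor binding sites is to score every window of a DNA sequence against a position weight matrix built from known sites. The sketch below is a minimal version under stated assumptions: the example sites and sequence are made up, the background is uniform over A, C, G and T, and the function names are illustrative rather than taken from any particular toolkit.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    pwm = []
    for pos in range(len(sites[0])):
        # Pseudocounts avoid log(0) for bases never seen at this position.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return pwm


def best_site(sequence, pwm):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][window[i]] for i in range(width))
        scored.append((start, score))
    return max(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical aligned binding sites, for illustration only.
    example_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
    pwm = build_pwm(example_sites)
    print(best_site("GGCCTATAATGCGC", pwm))
```

Real analyses typically use background frequencies estimated from the genome and a score cutoff rather than a single best window.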
The Ultimate Goal
• “Portions of the endo16 cis-regulatory system of
Strongylocentrotus are to date the most
extensively explored of any, with respect to the
functional meaning of each interaction that takes
place within them. What emerges is almost
astounding: a network of logic interactions
programmed into the DNA sequence that amounts
essentially to a hardwired biological
computational device.”