Curing Cancer with Your Cell Phone: Why all Sciences are Becoming Computing Sciences

David Evans http://www.cs.virginia.edu/evans

Computer Science =

Doing Cool Stuff with Computers?

College Science Scholars 2

Toaster Science =

Doing Cool Stuff with Toasters?

Computer Science

• Mathematics is about declarative (“what is”) knowledge; Computer Science is about imperative (“how to”) knowledge
• The Study of Information Processes
  – How to describe them: Language
  – How to predict their properties: Logic
  – How to implement them quickly, cheaply, and reliably: Engineering

Most Science is About Information Processes

Which came first, the chicken or the egg?

How can a (relatively) simple, single cell turn into a chicken?

Agenda

• Three Big Ideas:
  – All Computers are Equally Powerful
  – Programs are Data, Data are Programs
  – Many Surprisingly Different Problems are Equally Difficult
• One Open Question
  – Is a machine that can always guess correctly able to solve problems a normal machine can’t?

“Computers” before WWII

Mechanical Computing

Modeling Pencil and Paper

... # C S S A 7 2 3 ...

How long should the tape be?

“Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child’s arithmetic book.”
Alan Turing, On computable numbers, with an application to the Entscheidungsproblem, 1936

Modeling Brains

• Rules for steps
• Remember a little

“For the present I shall only say that the justification lies in the fact that the human memory is necessarily limited.”
Alan Turing

Turing’s Model

... # 1 0 1 1 0 1 1 1 0 1 1 0 1 1 1 # ...

[Figure: state-transition diagram with states Start, 1, 2, and 3. Each arrow is labeled Input / Write / Move; the labels include Input: # Write: #, Input: 1 Write: 1, Input: 0 Write: 0, Input: 1 Write: 0, and Input: 0 Write: #.]

Universal Machine

A Universal Turing Machine can simulate any Turing Machine running on any input!
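One way to get a feel for this idea is a small interpreter that receives a machine's rule table as ordinary data. The sketch below is written in Python rather than as a Turing machine, and the rule format, the `run` function, and the two example machines are all assumptions made for illustration; none of them come from the slides.

```python
from typing import Dict, Tuple

# Hypothetical rule format:
# (state, symbol read) -> (symbol to write, head move (-1 or +1), next state)
Rules = Dict[Tuple[str, str], Tuple[str, int, str]]


def run(rules: Rules, tape: str, state: str = "start",
        blank: str = "#", max_steps: int = 10_000) -> str:
    """Interpret any rule table on any input tape.

    The machine halts when no rule matches the current (state, symbol).
    max_steps is only a safety limit for machines that never halt.
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, blank)
        if (state, symbol) not in rules:
            break  # no applicable rule: halt
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += move
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Two different "programs", both passed to the same interpreter as data.
FLIP_BITS: Rules = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
}
UNARY_INCREMENT: Rules = {
    ("start", "1"): ("1", +1, "start"),
    ("start", "#"): ("1", +1, "halt"),
}

print(run(FLIP_BITS, "1011"))
print(run(UNARY_INCREMENT, "111"))
```

The same `run` function executes both machines; only the data it is handed changes, which is the sense in which programs are data.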

Church-Turing Thesis

• All mechanical computers are equally powerful*
  *Except for practical limits like memory size, time, energy, etc.
• There exists a Turing machine that can simulate any mechanical computer
• Any computer that is powerful enough to simulate a Turing machine can simulate any mechanical computer

What This Means

• Your cell phone, watch, iPod, etc. has a processor powerful enough to simulate a Turing machine
• A Turing machine can simulate the world’s most powerful supercomputer
• Thus, your cell phone can simulate the world’s most powerful supercomputer (it’ll just take a lot longer and will run out of memory)

Recap

• All Computers are Equally Powerful
• Programs are Data, Data are Programs
• Many Problems are Equally Difficult
  – But no one knows how difficult!

A “Hard” Problem?

Generalized Pegboard Puzzle

• Input: a configuration of n pegs on a Cracker Barrel style pegboard (of any size)
• Output: if there is a sequence of jumps that leaves a single peg, output that sequence of jumps. Otherwise, output false.

Is this a “hard” problem?

Solving Problems

• A solution to a problem instance: given a pegboard configuration, here’s the sequence of jumps
• A solution to a problem: a procedure that (1) always finds the correct answer, and (2) always finishes.

“Brute Force” Solvers

• Enumerate all possible answers
  – Every possible sequence of jumps
• Try them all until you find one that works
  – Simulate the jumps
• This works for almost all problems!
• Problem: how long does it take?
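A brute-force pegboard solver can be sketched in a few lines. This is a minimal illustration, not the solver from the talk: the board is described by a hypothetical list of `(from, over, to)` jump triples, the helper `row_jumps` builds a single row of holes instead of a triangular board to keep the example short, and the function returns `None` where the slides say "output false".

```python
from typing import FrozenSet, List, Optional, Tuple

Jump = Tuple[int, int, int]  # (from_hole, jumped_hole, to_hole)


def solve(pegs: FrozenSet[int], jumps: List[Jump]) -> Optional[List[Jump]]:
    """Try every possible sequence of jumps until one leaves a single peg.

    Returns the sequence of jumps, or None if no sequence works.
    """
    if len(pegs) == 1:
        return []
    for src, over, dst in jumps:
        # A jump is possible if both pegs are present and the target is empty.
        if src in pegs and over in pegs and dst not in pegs:
            rest = solve(pegs - {src, over} | {dst}, jumps)
            if rest is not None:
                return [(src, over, dst)] + rest
    return None  # every sequence was tried and none worked


def row_jumps(holes: int) -> List[Jump]:
    """All jumps on a hypothetical board that is one row of holes."""
    result: List[Jump] = []
    for i in range(holes - 2):
        result.append((i, i + 1, i + 2))  # jump to the right
        result.append((i + 2, i + 1, i))  # jump to the left
    return result


print(solve(frozenset({0, 1, 3}), row_jumps(4)))
print(solve(frozenset({0, 3}), row_jumps(4)))
```

The procedure always finishes and always finds an answer if one exists, but the number of jump sequences it may have to try grows very quickly with the number of pegs.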

Problem Solving Time

[Figure: solving time (vertical axis, 0 to 1200) against Problem Input Size (1 to 10) for three growth rates: ~n, ~n^3, and ~2^n]
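The three growth rates in the chart can be tabulated directly. The snippet below is only an illustration; the function name `growth_table` is made up for this example.

```python
def growth_table(sizes):
    """Return (n, n, n**3, 2**n) for each input size n."""
    return [(n, n, n ** 3, 2 ** n) for n in sizes]


print(f"{'n':>3} {'~n':>4} {'~n^3':>6} {'~2^n':>8}")
for n, linear, cubic, exponential in growth_table(range(1, 21)):
    print(f"{n:>3} {linear:>4} {cubic:>6} {exponential:>8}")
```

For small inputs the cubic curve is actually the larger of the two; from n = 10 onward the exponential curve overtakes it and pulls away.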

Increasing Problem Size

[Figure: solving time (vertical axis, 0 to 70000) against problem input size (1 to 16) for ~n^3 and ~2^n]

Tractable and Intractable Problems

[Figure: solving time (vertical axis, 0 to 1200000) for input sizes 1 to 20, with one curve labeled “intractable” and another labeled “tractable”]

I do nothing that a man of unlimited funds, superb physical endurance, and maximum scientific knowledge could not do.

– Batman (may be able to solve intractable problems, but computer scientists can only solve tractable ones for large n)

This makes a huge difference!

[Figure: log-log plot of ~2^n and ~n^3 for input sizes 2 to 128 (vertical axis 1 to 1E+30), with marks for “today”, “2032”, and “time since ‘Big Bang’”]
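To put rough numbers on the difference, the sketch below converts step counts into running time. Both constants are assumptions chosen for illustration: a machine that performs one billion steps per second, and an age of the universe of about 13.8 billion years (roughly 4.35e17 seconds).

```python
STEPS_PER_SECOND = 1e9             # assumed machine speed
AGE_OF_UNIVERSE_SECONDS = 4.35e17  # about 13.8 billion years


def seconds_needed(steps: int) -> float:
    """Running time for a given number of steps on the assumed machine."""
    return steps / STEPS_PER_SECOND


def describe(seconds: float) -> str:
    """Format a running time, flagging anything beyond the age of the universe."""
    if seconds > AGE_OF_UNIVERSE_SECONDS:
        return "longer than the age of the universe"
    return f"{seconds:.3g} s"


for n in (2, 4, 8, 16, 32, 64, 128):
    cubic = describe(seconds_needed(n ** 3))
    exponential = describe(seconds_needed(2 ** n))
    print(f"n = {n:>3}:  ~n^3 takes {cubic:<12}  ~2^n takes {exponential}")
```

Under these assumptions the cubic procedure finishes an input of size 128 in a few milliseconds, while the exponential one would not finish within the lifetime of the universe.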

Back to the Pegboard...

• A brute force solution is easy...but on the pink (~2^n) line
• Is there a tractable solution?

[Figure: log-log plot, vertical axis 1 to 1E+30, input sizes 2 to 128]

Deciding a Problem Is Hard

• “I tried really hard and still couldn’t solve it.”
  – Maybe the speaker isn’t smart enough
  – Maybe a few days more effort will find it
• “Lots of really smart people tried really hard and no one could solve it.”
• “It seems sort of like this other problem that we think is hard...”

Reduction

[Figure: a Hard Problem Solver built from three parts: an Input Transformer, a Pegboard Solver (which outputs jumps), and an Output Transformer]
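The shape of a reduction can be sketched as function composition. The example is deliberately a toy and is not the pegboard reduction itself: it builds a "does this list contain a duplicate?" solver out of a sorting solver. The names `reduce_to`, `adjacent_equal`, and `has_duplicate` are hypothetical.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
D = TypeVar("D")


def reduce_to(transform_input: Callable[[A], B],
              solver: Callable[[B], C],
              transform_output: Callable[[C], D]) -> Callable[[A], D]:
    """Build a solver for one problem out of a solver for another."""
    def solve(instance: A) -> D:
        return transform_output(solver(transform_input(instance)))
    return solve


def adjacent_equal(items: List[int]) -> bool:
    """Output transformer: in a sorted list, duplicates sit next to each other."""
    return any(a == b for a, b in zip(items, items[1:]))


# Toy reduction: duplicate detection, solved with a sorting solver.
has_duplicate = reduce_to(list, sorted, adjacent_equal)

print(has_duplicate([3, 1, 3]))
print(has_duplicate([1, 2, 3]))
```

If both transformers are fast, the new solver is about as fast as the solver it wraps. Read in the other direction, if the outer problem is known to be hard, the wrapped problem cannot be easy.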

Reading the Genome

Whitehead Institute, MIT

Gene Reading Machines

• One read: about 700 base pairs
• But…don’t know where they are on the chromosome

Read 1:        ACCAGAATACC
Read 3:               TACCCGTGATCCA
Read 2:                        TCCAGAATAA
Actual Genome: ACCAGAATACCCGTGATCCAGAATAA

Genome Assembly

Read 1: ACCAGAATACC
Read 2: TCCAGAATAA
Read 3: TACCCGTGATCCA

Input: Genome fragments (but without knowing where they are from)
Output: The full genome

Genome Assembly

Read 1: ACCAGAATACC
Read 2: TCCAGAATAA
Read 3: TACCCGTGATCCA

Input: Genome fragments (but without knowing where they are from)
Output: The smallest genome sequence such that all the fragments are substrings.
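For a handful of reads, this problem can be solved by brute force: try every ordering of the reads and merge neighbours with the largest possible overlap. The sketch below is an illustration only, with made-up function names. It assumes no read is contained in another, and because it tries every ordering its running time grows factorially with the number of reads, so it is nowhere near practical for real genome data.

```python
from itertools import permutations
from typing import Sequence


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge_in_order(reads: Sequence[str]) -> str:
    """Merge reads left to right, overlapping each with what came before."""
    result = reads[0]
    for read in reads[1:]:
        if read in result:
            continue  # already covered by the merged string
        result += read[overlap(result, read):]
    return result


def shortest_superstring(reads: Sequence[str]) -> str:
    """Try every ordering of the reads and keep the shortest merge."""
    if not reads:
        return ""
    return min((merge_in_order(p) for p in permutations(reads)), key=len)


reads = ["ACCAGAATACC", "TCCAGAATAA", "TACCCGTGATCCA"]
print(shortest_superstring(reads))
```

On the three example reads, the best ordering is Read 1, Read 3, Read 2, which reproduces the 26-letter sequence shown as the actual genome on the earlier slide.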

Genome Assembly Solver

[Figure: a Genome Assembly Solver built from three parts: an Input Transformer, a Pegboard Solver (which outputs jumps), and an Output Transformer. Example input: ACCAGAATACC, TCCAGAATAA, TACCCGTGATCCA (~30M reads, ~900 bp)]

What This Means

• We already know the shortest common superstring (genome assembly) problem is “hard”
• The pegboard problem must also be hard, since we could use a solver for it to solve the genome assembly problem
  – Requires: we can build fast transformers that don’t increase the problem size exponentially

Non-Deterministic Machines

... # 1 0 1 1 0 1 1 1 0 1 1 0 1 1 1 # ...

[Figure: state-transition diagram with states Start, 1, 2, 3, and 4. Each arrow is labeled Input / Write / Move; the labels include Input: # Write: #, Input: 1 Write: 1, Input: 0 Write: 0, Input: # Write: 0, and Input: 1 Write: 0.]

Non-Deterministic Machine

• Every time there is a choice, it can guess the correct choice without looking ahead
• If we had such a machine, solving the Pegboard (or Genome Assembly, etc.) problem would be easy:
  – It can guess the solution one step (alignment) at a time

Big Open Question

Is a non-deterministic machine able to solve problems that are intractable on a deterministic machine?

[Figure: log-log plot, vertical axis 1 to 1E+30, input sizes 2 to 128]

It seems obvious that the magic guess-correctly ability should be useful...but no one knows for sure!

Recap

• “P vs NP” problem (one of the Millennium Prize Problems)
• Solving the pegboard puzzle is equivalent to solving genome assembly
• With a non-deterministic machine, we could solve both
• With a mechanical computer, we don’t know if a tractable solution exists (but can’t prove it doesn’t): We don’t know if checking a solution is really easier than finding it
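Checking a proposed pegboard solution is simple: replay the jumps once and see whether a single peg remains. The sketch below is an illustration with assumed names; `legal_jumps` is a hypothetical set of `(from, over, to)` triples describing the board, here a single row of four holes.

```python
from typing import Iterable, Sequence, Set, Tuple

Jump = Tuple[int, int, int]  # (from_hole, jumped_hole, to_hole)


def check(pegs: Iterable[int], sequence: Sequence[Jump],
          legal_jumps: Set[Jump]) -> bool:
    """Replay a proposed sequence of jumps; one pass, one step per jump."""
    remaining = set(pegs)
    for src, over, dst in sequence:
        if (src, over, dst) not in legal_jumps:
            return False  # not a jump this board allows
        if src not in remaining or over not in remaining or dst in remaining:
            return False  # pegs are not where the jump needs them
        remaining -= {src, over}
        remaining.add(dst)
    return len(remaining) == 1


# Hypothetical board: one row of four holes, numbered 0 to 3.
ROW_OF_FOUR = {(0, 1, 2), (2, 1, 0), (1, 2, 3), (3, 2, 1)}

print(check({0, 1, 3}, [(0, 1, 2), (3, 2, 1)], ROW_OF_FOUR))
print(check({0, 1, 3}, [(3, 2, 1)], ROW_OF_FOUR))
```

The checker does work proportional to the length of the proposed sequence. Whether finding such a sequence can always be done with comparably little work is exactly what nobody knows.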

Summary

• Computer Science is the study of information processes: all about problem solving
• Many seemingly paradoxical results:
  – All Computers are Equally Powerful!
  – Many Surprisingly Different Problems are Equivalent!
• And seemingly obvious open problems:
  – Is checking a solution really easier than finding it?

Computer Science at UVa

• New Interdisciplinary Major in Computer Science for A&S students (approved last year)
• Take CS150 this Spring
  – Every scientist needs to understand computing, not just as a tool but as a way of thinking
• Lots of opportunities to get involved in research groups

My Research Group

• Computer Security: computing in the presence of adversaries
• Recent student projects:
  – Proof that the Pegboard puzzle is hard (Mike Peck and Chris Frost)
  – Disk-level virus detection (Adrienne Felt)
  – Web Application Security (Sam Guarnieri)
  – N-Variant Systems: run variants of a program simultaneously (Sean Talts)

Questions

http://www.cs.virginia.edu/evans