Transcript [Slides]

On The Learning Power
of Evolution
Vitaly Feldman
1
Fundamental Question

How can complex and adaptive mechanisms result from evolution?

Fundamental principle: random variation guided by natural selection [Darwin, Wallace 1859]
There is no quantitative theory

TCS:
Established notions of complexity
Computational learning theory
2
Model Outline
Complex behavior: multi-argument function





Function representation
Fitness estimation
Random variation
Natural selection
Success criteria
3
Representation


Domain of conditions X and a distribution D over X
Representation class R of functions over X

Space of available behaviors
Efficiently evaluatable
4
Fitness


Optimal function f: X → {-1,1}
Performance: correlation with f relative to D
Perf_f(r, D) = E_D[f(x)·r(x)]
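In the model, performance is only accessible through samples from D. A minimal Python sketch of the empirical estimate (the toy domain and the functions f and r below are illustrative assumptions, not from the talk):

```python
import random

def empirical_perf(f, r, xs):
    """Empirical estimate of Perf_f(r, D) = E_D[f(x)*r(x)]
    from a sample xs drawn from D."""
    return sum(f(x) * r(x) for x in xs) / len(xs)

# Toy domain X = {-1,1}^3 under the uniform distribution.
random.seed(0)
xs = [tuple(random.choice((-1, 1)) for _ in range(3)) for _ in range(10000)]

f = lambda x: x[0]           # optimal function (illustrative)
r = lambda x: x[0] * x[1]    # candidate representation (illustrative)

print(empirical_perf(f, f, xs))  # 1.0: perfect correlation
print(empirical_perf(f, r, xs))  # near 0: uncorrelated with f
```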
5
Random Variation

Mutation algorithm M:
Given r ∈ R, produces a random mutation of r
Efficient
Neigh_M(r) is the set of all possible outputs of M on r
6
Natural Selection
If beneficial mutations are available, then output one of them; otherwise output one of the neutral mutations.
Bene(r) = {r' ∈ Neigh_M(r) | Perf_f(r',D) > Perf_f(r,D) + t}
Neut(r) = {r' ∈ Neigh_M(r) | |Perf_f(r',D) − Perf_f(r,D)| ≤ t}
t is the tolerance

If Bene(r) ≠ ∅, a mutation is chosen from Bene(r) according to Pr_M
If Bene(r) = ∅, a mutation is chosen from Neut(r) according to Pr_M
Neigh_M(r) and Perf_f are estimated via polynomial-size samples, and t is inverse-polynomial

Step(R, M, r)
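A sketch of one selection round in Python. The mutator, the toy performance function, and the uniform choice standing in for Pr_M are all assumptions made for illustration:

```python
import random

def step(r, mutate, perf, t, num_candidates=20):
    """One round of selection: sample candidate mutations via the
    mutation algorithm, return a beneficial one if any exists,
    otherwise a neutral one, otherwise keep r unchanged."""
    base = perf(r)
    candidates = [mutate(r) for _ in range(num_candidates)]
    bene = [c for c in candidates if perf(c) > base + t]
    neut = [c for c in candidates if abs(perf(c) - base) <= t]
    if bene:
        return random.choice(bene)   # uniform choice stands in for Pr_M
    if neut:
        return random.choice(neut)
    return r

# Toy setting: representations are integers, performance peaks at 10.
random.seed(1)
perf = lambda r: 1 - abs(10 - r) / 10
mutate = lambda r: r + random.choice((-1, 1))
r = 0
for _ in range(30):
    r = step(r, mutate, perf, t=0.01)
print(r)  # climbs to 10, then no beneficial or neutral mutations remain
```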
7
Evolvability

Class of functions C is evolvable over D if there exists an evolutionary algorithm (R, M) and a polynomial g(·,·) s.t. for every f ∈ C, r ∈ R, and ε > 0, the sequence r0 = r, r1, r2, … with r_{i+1} ← Step(R, M, r_i) satisfies, w.h.p., Perf_f(r_{g(n,1/ε)}, D) ≥ 1 − ε

Evolvable (distribution-independently): evolvable for all D by the same R and M

C represents the complexity of structures that can evolve in a single phase of evolution driven by a single optimal function from C
8
Evolvability of Conjunctions

ANDs of Boolean variables and their negations over {-1,1}^n
e.g. x3 ∧ ¬x5 ∧ x8

Evolutionary algorithm:
R is all conjunctions
M adds or removes a variable or its negation

This does not work in general
Works for monotone conjunctions over the uniform distribution [L. Valiant 06]
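The monotone case can be simulated directly. The sketch below is an illustration, not Valiant's exact algorithm: it computes performance values exactly by enumeration (so there is no sampling noise) and applies the beneficial-else-neutral selection rule with tolerance t:

```python
import itertools, random

def conj(S):
    """Monotone conjunction over {-1,1}^n on variable set S
    ({-1,1}-valued; the empty conjunction is the constant 1)."""
    return lambda x: 1 if all(x[i] == 1 for i in S) else -1

def perf(S, T, n):
    """Exact Perf of conjunction S against target T under the
    uniform distribution, computed by enumeration (fine for small n)."""
    f, r = conj(T), conj(S)
    pts = list(itertools.product((-1, 1), repeat=n))
    return sum(f(x) * r(x) for x in pts) / len(pts)

def neighbors(S, n):
    """Mutations: add or remove one variable."""
    return [S | {i} for i in range(n) if i not in S] + [S - {i} for i in S]

def evolve(T, n, t=0.01, rounds=50, seed=0):
    """Beneficial-else-neutral selection from the empty conjunction."""
    rng = random.Random(seed)
    S = set()
    for _ in range(rounds):
        base = perf(S, T, n)
        bene = [N for N in neighbors(S, n) if perf(N, T, n) > base + t]
        neut = [N for N in neighbors(S, n)
                if abs(perf(N, T, n) - base) <= t]
        S = rng.choice(bene) if bene else (rng.choice(neut) if neut else S)
    return S

print(evolve({0, 1}, n=4))  # reaches the target conjunction {0, 1}
```

Each beneficial step raises the exact performance by at least 2^(-n), so for small n the process provably reaches the target and then has neither beneficial nor neutral mutations left.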
9
What is Evolvable in This Model?




EV ⊆ PAC
EV ⊆ SQ ⊊ PAC [L. Valiant 06]
Statistical Query learning [Kearns 93]: estimates of E_D[ψ(x, f(x))] for an efficiently evaluatable ψ
EV ⊆ CSQ [F 08]
Learnability by correlational statistical queries
CSQ: E_D[ψ(x)·f(x)]
CSQ ⊆ EV [F 08]
Fixed D: CSQ = SQ [Bshouty, F 01]
10
Distribution-independent Evolvability

Algorithms
Singletons [F 09]
R is all conjunctions of a logarithmic number of functions from a set of pairwise independent functions
M chooses a random such conjunction

Lower bounds [F 08]
C ∈ EV ⇒ each function in C is expressible as a "low"-weight integer threshold function over a poly-sized basis B
EV ⊊ SQ
Linear threshold functions and decision lists are not evolvable (even weakly) [GHR 92, Sherstov 07, BVW 07]

Open: Conjunctions? Low-weight integer linear thresholds?
11
Robustness of the Model

How is the set of evolvable function classes influenced by various aspects of the definition?
Selection rule
Mutation algorithm
Fitness function
…
The model is robust to a variety of modifications, and its power is essentially determined by the performance function [F 09]
12
Original Selection Rule
If beneficial mutations are available, then output one of them; otherwise output one of the neutral mutations.
Bene(r) = {r' ∈ Neigh_M(r) | Perf_f(r',D) > Perf_f(r,D) + t}
Neut(r) = {r' ∈ Neigh_M(r) | |Perf_f(r',D) − Perf_f(r,D)| ≤ t}
t is the tolerance

If Bene(r) ≠ ∅, a mutation is chosen from Bene(r) according to Pr_M
If Bene(r) = ∅, a mutation is chosen from Neut(r) according to Pr_M
Neigh_M(r) and Perf_f are estimated via polynomial-size samples, and t is inverse-polynomial

Step(R, M, r)
13
Other Selection Rules

Sufficient condition: ∀ r1, r2 ∈ Neigh_M(r), if Perf_f(r1,D) ≥ Perf_f(r2,D) + t, then r1 is "observably" favored over r2

The selection rule can be "smooth" and need not be fixed in time
14
Performance Function

For real-valued representations, other measures of performance can be used
e.g. expected quadratic loss: LQ-Perf_f(r, D) = 1 − E_D[(f(x) − r(x))²]/2
Decision lists are evolvable w.r.t. the uniform distribution with LQ-Perf [Michael 07]

The resulting model is equivalent to learning from the corresponding type of statistical queries:
CSQ if the loss function is linear
SQ otherwise
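For {-1,1}-valued hypotheses the two measures coincide, since (f(x) − r(x))² = 2 − 2·f(x)·r(x); they differ once r takes real values. A quick numerical check (the toy functions below are illustrative assumptions):

```python
import random

def perf(f, r, xs):
    """Correlation performance E_D[f(x)*r(x)] on a sample."""
    return sum(f(x) * r(x) for x in xs) / len(xs)

def lq_perf(f, r, xs):
    """Quadratic-loss performance 1 - E_D[(f(x)-r(x))^2]/2 on a sample."""
    return 1 - sum((f(x) - r(x)) ** 2 for x in xs) / (2 * len(xs))

random.seed(0)
xs = [tuple(random.choice((-1, 1)) for _ in range(3)) for _ in range(1000)]
f = lambda x: x[0]

# {-1,1}-valued r: (f-r)^2 = 2 - 2*f*r, so the two measures agree.
r_bool = lambda x: x[1]
print(abs(perf(f, r_bool, xs) - lq_perf(f, r_bool, xs)) < 1e-9)  # True

# Real-valued r: LQ-Perf also penalizes the magnitude of r.
r_real = lambda x: 0.5 * x[0]
print(perf(f, r_real, xs), lq_perf(f, r_real, xs))  # 0.5 0.875
```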
15
What About New Algorithms?

Conjunctions are evolvable distribution-independently with LQ-Perf [F 09]
Mutation algorithm: add or subtract ε·x_i and project back onto [-1,1] (for X = {-1,1}^n)
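A sketch of such a mutation for a hypothesis stored as a value table over {-1,1}^n (the table representation and the step size ε = 0.25 are illustrative assumptions): add or subtract ε·x_i pointwise, then clip each value back into [-1,1].

```python
import itertools, random

def mutate(r, n, eps=0.25, rng=random):
    """One mutation: pick a coordinate i and a sign, add sign*eps*x_i
    to the hypothesis pointwise, then project each value back to [-1,1]."""
    i = rng.randrange(n)
    s = rng.choice((-1, 1))
    return {x: max(-1.0, min(1.0, v + s * eps * x[i]))
            for x, v in r.items()}

n = 3
points = list(itertools.product((-1, 1), repeat=n))
r0 = {x: 0.0 for x in points}  # start from the zero hypothesis

random.seed(2)
r1 = mutate(r0, n)
print(sorted(set(r1.values())))  # [-0.25, 0.25]: still within [-1, 1]
```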

16
Further Directions




Limits of distribution-independent evolvability
"Natural" algorithms for "interesting" function classes and distributions
Evolvability without performance decreases
Applications

Direct connections to evolutionary biology
17
References






L. Valiant. Evolvability. ECCC 2006; JACM 2009.
L. Michael. Evolvability via the Fourier Transform. 2007.
V. Feldman. Evolvability from Learning Algorithms. STOC 2008.
V. Feldman and L. Valiant. The Learning Power of Evolution (open problems). COLT 2008.
V. Feldman. Robustness of Evolvability. COLT 2009 (to appear).
V. Feldman. A complete characterization of SQ learning with applications to evolvability. 2009 (to appear).
18