Upper Bounds on the Time and Space
Complexity of Optimizing Additively
Separable Functions
Matthew J. Streeter
Carnegie Mellon University
Pittsburgh, PA
[email protected]
Outline
• Introduction
• Definitions & notation
• Detecting linkage
• Algorithm, analysis & performance
• Conclusions
Introduction
• An additively separable function f of order k is one
that can be expressed as:
$f(s) = \sum_i f_i(s)$
where each $f_i$ depends on at most k characters of s,
and each character contributes to at most one $f_i$

• Studied extensively in EC literature, particularly in
relation to competent GAs.
Introduction
“[W]e would like a procedure that scales
polynomially, as O(j^b) with b as small a number as
possible (current estimates of b suggest that
subquadratic—b≤2—solutions are possible).”
David Goldberg,
The Design of Innovation (p. 51-2)
Introduction
• Previous bound: time O(j^2); space O(1)
(Munetomo & Goldberg 1999)
• New bound: time O(j*ln(j)); space O(1)
Definitions & notation
• $s_i$ = ith character of binary string s
• $s[i \leftarrow c]$ = a copy of s with $s_i$ set to c
• $\Delta f_i(s) = f(s[i \leftarrow \neg s_i]) - f(s)$
= effect on fitness of flipping the ith bit
(Munetomo & Goldberg 1999)
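The two operations defined above can be sketched in a few lines of Python. This is only an illustration: the names `set_bit`, `delta_f`, and the toy fitness function `onemax` are assumptions introduced here, and strings are represented as tuples of 0/1 with 1-based positions to match the slides.

```python
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def set_bit(s: Bits, i: int, c: int) -> Bits:
    """Return a copy of s with the character at 1-based position i set to c."""
    return s[:i - 1] + (c,) + s[i:]


def delta_f(f: Callable[[Bits], float], s: Bits, i: int) -> float:
    """Return the change in fitness caused by flipping the i-th bit of s."""
    flipped = set_bit(s, i, 1 - s[i - 1])
    return f(flipped) - f(s)


def onemax(s: Bits) -> int:
    """Hypothetical fitness function used only for this illustration."""
    return sum(s)


print(delta_f(onemax, (0, 1, 0), 2))  # flipping a 1 to 0 lowers onemax by 1
```

Each call to `delta_f` costs two function evaluations, which is the unit the later evaluation counts are based on.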
Definitions & notation
• Linkage: positions i and j are linked if there is some
string s such that:
$\Delta f_i(s[j \leftarrow 0]) \neq \Delta f_i(s[j \leftarrow 1])$
• Grouping: i and j are grouped if i=j or if there is some
sequence $i_0, i_1, \ldots, i_n$ such that:
$i_0 = i$
$i_n = j$
$i_m$ and $i_{m+1}$ are linked for $0 \le m < n$
Definitions & notation
• Linkage group: a non-empty set g such that if $i \in g$
then $j \in g$ iff i and j are grouped
• Linkage group partition: the unique set $\pi_f = \{g_1, g_2, \ldots, g_n\}$
of linkage groups of f.
Example
$f(s) = s_1 s_2 + s_2 s_3 + s_4 s_5$
• Linkage (pairs of linked positions):
(1, 2), (2, 3), (4, 5)
(2, 1), (3, 2), (5, 4)
• Linkage groups: {1,2,3} and {4,5}, so
$\pi_f = \{\{1,2,3\}, \{4,5\}\}$
• f is an additively separable function of order 3
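The example can be checked by brute force. The sketch below applies the linkage definition literally over all 2^5 strings and then merges linked positions into groups with a small union-find. It is meant only to verify the example, not as the algorithm presented in these slides; the helper names are assumptions.

```python
from itertools import product


def f(s):
    """The example function: s1*s2 + s2*s3 + s4*s5."""
    s1, s2, s3, s4, s5 = s
    return s1 * s2 + s2 * s3 + s4 * s5


def set_bit(s, i, c):
    return s[:i - 1] + (c,) + s[i:]


def delta_f(f, s, i):
    return f(set_bit(s, i, 1 - s[i - 1])) - f(s)


def linked(f, n, i, j):
    """Brute-force check of the linkage definition over all 2^n strings."""
    return any(
        delta_f(f, set_bit(s, j, 0), i) != delta_f(f, set_bit(s, j, 1), i)
        for s in product((0, 1), repeat=n)
    )


def linkage_groups(f, n):
    """Merge linked positions into groups using a small union-find."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if linked(f, n, i, j):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(1, n + 1):
        groups.setdefault(find(i), set()).add(i)
    return sorted(sorted(g) for g in groups.values())


print(linkage_groups(f, 5))  # prints [[1, 2, 3], [4, 5]]
```

The brute-force check is exponential in the string length, which is exactly the cost the randomized tests in the following slides avoid.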
Relationship to additively separable
functions
• If f = {g1, g2, ..., gn} then f can be written as:
f (s)   fi (s)
i
where each fi depends only on the positions in gi
So f is additively separable of order

k = max1≤i≤n |gi|
• So once we know f, we can find the global optimum
of f in time O(2k*j) by local search
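As a rough illustration of why knowing the partition suffices, the sketch below optimizes one group at a time by trying all settings of that group while holding the other positions fixed. It assumes maximization and assumes the partition is given as lists of 1-based positions; the function name `optimize_given_partition` is introduced here for illustration only.

```python
from itertools import product


def optimize_given_partition(f, n, partition, start=None):
    """Exhaustively optimize each group in turn (maximization assumed)."""
    s = list(start) if start is not None else [0] * n
    for group in partition:
        best_value, best_setting = None, None
        # Try all 2^|group| settings of this group, other positions fixed.
        for setting in product((0, 1), repeat=len(group)):
            for pos, bit in zip(group, setting):
                s[pos - 1] = bit
            value = f(tuple(s))
            if best_value is None or value > best_value:
                best_value, best_setting = value, setting
        # Commit the best setting found for this group.
        for pos, bit in zip(group, best_setting):
            s[pos - 1] = bit
    return tuple(s)


def f(s):
    s1, s2, s3, s4, s5 = s
    return s1 * s2 + s2 * s3 + s4 * s5


print(optimize_given_partition(f, 5, [[1, 2, 3], [4, 5]]))  # prints (1, 1, 1, 1, 1)
```

Because the groups do not interact, the best setting for one group does not depend on the others, so each group needs at most 2^k evaluations and there are at most j groups.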
Algorithm overview
• Start with a random string and the trivial linkage
group partition $\pi$ = {{1}, {2}, ..., {j}}.
• Repeatedly perform a randomized test to detect pairs
of positions that are linked
• Every time we find a new link between i and j, merge i’s and j’s
subsets to form a new subset g’, and use local search
to make the string optimal w.r.t. g’
• Once $\pi = \pi_f$, we will have found a globally optimal
string
Detecting linkage: O(j^2) approach
• For fixed i and j, pick a random string s and check if
$\Delta f_i(s[j \leftarrow 0]) \neq \Delta f_i(s[j \leftarrow 1])$.
• Test requires 4 function evaluations, and is
conclusive with probability at least 2^(-k).
• Leads to an algorithm that requires O(2^k*j^2) function
evaluations. (Munetomo & Goldberg 1999)
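A single pairwise test can be sketched as follows. This is a minimal illustration assuming i differs from j, 1-based positions, and strings stored as tuples of 0/1. The name `pair_test` is hypothetical and is not taken from Munetomo & Goldberg.

```python
import random


def pair_test(f, n, i, j, rng):
    """One randomized linkage test for positions i != j, using 4 evaluations."""
    s = [rng.randint(0, 1) for _ in range(n)]
    values = {}
    for cj in (0, 1):          # value forced at position j
        for flip in (0, 1):    # whether position i is flipped
            t = list(s)
            t[j - 1] = cj
            if flip:
                t[i - 1] = 1 - t[i - 1]
            values[cj, flip] = f(tuple(t))
    # Compare the effect of flipping bit i when s_j = 0 and when s_j = 1.
    return values[0, 1] - values[0, 0] != values[1, 1] - values[1, 0]


def f(s):
    s1, s2, s3, s4, s5 = s
    return s1 * s2 + s2 * s3 + s4 * s5


rng = random.Random(1)
print(pair_test(f, 5, 1, 2, rng))  # positions 1 and 2 are linked in this f
print(pair_test(f, 5, 1, 4, rng))  # positions 1 and 4 are never linked
```

A `True` result proves linkage; a `False` result is inconclusive, so the test has to be repeated for each of the O(j^2) pairs.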
Detecting linkage: O(j*ln(j)) approach
• For fixed i, generate two random strings s and t that
have the same character at i, and check whether
$\Delta f_i(s) \neq \Delta f_i(t)$.
• Suppose $\Delta f_i(s) \neq \Delta f_i(t)$. Let d be the Hamming
distance from s to t.
– If d=1, call the position that differs j; i and j are linked by
definition
– Otherwise create a string s’ that agrees with s and t wherever they
agree and differs from each of them in about d/2 positions. We must
have either $\Delta f_i(s') \neq \Delta f_i(s)$ or $\Delta f_i(s') \neq \Delta f_i(t)$,
so just recurse on that pair until we get d=1.
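Example
$f(s) = s_1 s_2 + s_2 s_3 + s_4 s_5$, i=2

Iteration 1
string   s       $\Delta f_2(s)$
s(1)     00000   0
t(1)     10111   2
s(1)’    00011   0

Iteration 2
string   s       $\Delta f_2(s)$
s(2)     00011   0
t(2)     10111   2
s(2)’    00111   1

Iteration 3
string   s       $\Delta f_2(s)$
s(3)     00111   1
t(3)     10111   2

Conclusion: s(3) and t(3) differ only at position 1, so
positions 1 and 2 are linked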
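The halving procedure traced in this example can be sketched as a loop. This is one possible reading of the slides, not necessarily the author's implementation: the midpoint string copies the last half of the differing positions from t, and the half next to t is preferred when both halves would work. Those two choices are assumptions made so that the trace matches the example.

```python
def set_bit(s, i, c):
    return s[:i - 1] + (c,) + s[i:]


def delta_f(f, s, i):
    return f(set_bit(s, i, 1 - s[i - 1])) - f(s)


def find_link(f, i, s, t):
    """Given delta_f(s) != delta_f(t), return a position j linked to i."""
    s, t = tuple(s), tuple(t)
    ds, dt = delta_f(f, s, i), delta_f(f, t, i)
    if ds == dt:
        raise ValueError("test is inconclusive for this pair of strings")
    while True:
        diff = [p for p in range(1, len(s) + 1) if s[p - 1] != t[p - 1]]
        if len(diff) == 1:
            return diff[0]
        # Midpoint: copy t's values into s on the last floor(d/2) differing positions.
        mid = s
        for p in diff[len(diff) - len(diff) // 2:]:
            mid = set_bit(mid, p, t[p - 1])
        dm = delta_f(f, mid, i)
        if dm != dt:
            s, ds = mid, dm   # keep searching between mid and t
        else:
            t, dt = mid, dm   # dm == dt, so dm != ds: search between s and mid


def f(s):
    s1, s2, s3, s4, s5 = s
    return s1 * s2 + s2 * s3 + s4 * s5


print(find_link(f, 2, (0, 0, 0, 0, 0), (1, 0, 1, 1, 1)))  # prints 1
```

Each pass through the loop roughly halves the Hamming distance and costs two function evaluations, so a conclusive test needs O(ln(j)) evaluations.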
How to use this?
• Some links are more “obvious” than others
• Ideally we would like to only discover novel links
(those that let us update $\pi$)
• We would like the test to be conclusive with probability at
least 2^(-k)
Discovering novel links
• Binary search starting at s and t will always return a
position j where s and t disagree ($s_j \neq t_j$)
• So, when looking for a link from i, just make sure s
and t agree on all the positions in whichever subset in
$\pi$ contains i
Probability that test is conclusive
• Let g be the subset in $\pi$ that contains i, and let $g_f$ be
i’s true linkage group ($g \subseteq g_f$)
• Can show that with probability at least $2^{-|g|}$, we will
choose an s such that for some t, $\Delta f_i(t) \neq \Delta f_i(s)$
• Because there are only $|g_f| - |g|$ positions left in t that
affect $\Delta f_i(t)$, the test will be conclusive with probability at
least:
$2^{-|g|} \cdot 2^{-(|g_f| - |g|)} = 2^{-|g_f|} \ge 2^{-k}$
Finding f
• On each iteration, let i run from 1 to j and perform test
for link out of i
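Putting the pieces together, the overall loop might look like the sketch below. It is an illustration under several assumptions rather than the author's implementation: fitness is maximized, the number of rounds is passed in as a hypothetical `rounds` parameter, the starting string is first made optimal with respect to every singleton group, and the partition is kept in a plain dictionary.

```python
import random
from itertools import product


def set_bit(s, i, c):
    return s[:i - 1] + (c,) + s[i:]


def delta_f(f, s, i):
    return f(set_bit(s, i, 1 - s[i - 1])) - f(s)


def find_link(f, i, s, t):
    """Halving search; assumes delta_f(f, s, i) != delta_f(f, t, i)."""
    dt = delta_f(f, t, i)
    while True:
        diff = [p for p in range(1, len(s) + 1) if s[p - 1] != t[p - 1]]
        if len(diff) == 1:
            return diff[0]
        mid = s
        for p in diff[len(diff) - len(diff) // 2:]:
            mid = set_bit(mid, p, t[p - 1])
        dm = delta_f(f, mid, i)
        if dm != dt:
            s = mid
        else:
            t, dt = mid, dm


def local_search(f, best, group):
    """Try every setting of `group`, holding the rest of `best` fixed."""
    def with_setting(setting):
        cand = list(best)
        for p, c in zip(group, setting):
            cand[p - 1] = c
        return tuple(cand)
    return max((with_setting(x) for x in product((0, 1), repeat=len(group))), key=f)


def optimize(f, n, rounds, rng):
    group_of = {i: frozenset([i]) for i in range(1, n + 1)}  # trivial partition
    best = tuple(rng.randint(0, 1) for _ in range(n))
    for i in range(1, n + 1):  # assumption: start optimal w.r.t. singletons
        best = local_search(f, best, [i])
    for _ in range(rounds):
        for i in range(1, n + 1):
            g = group_of[i]
            s = tuple(rng.randint(0, 1) for _ in range(n))
            # t agrees with s on i's current group, so any link found is novel.
            t = tuple(s[p - 1] if p in g else rng.randint(0, 1) for p in range(1, n + 1))
            if delta_f(f, s, i) != delta_f(f, t, i):
                j = find_link(f, i, s, t)
                merged = g | group_of[j]
                for p in merged:
                    group_of[p] = merged
                best = local_search(f, best, sorted(merged))
    return best, sorted(sorted(g) for g in set(group_of.values()))


def f(s):
    s1, s2, s3, s4, s5 = s
    return s1 * s2 + s2 * s3 + s4 * s5


best, partition = optimize(f, 5, rounds=200, rng=random.Random(0))
print(best, partition)
```

On the running example, a few rounds are typically enough to recover both linkage groups; 200 rounds is used here only to make failure vanishingly unlikely.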
Analysis
• Each conclusive test requires time O(ln(j)) and we do
at most j-1 of them, so total time is O(j*ln(j))
• Time for local search is O(2^k*j)
• Only thing left is time for inconclusive tests, each of
which is O(1).
• Need to know the number t of rounds needed to
discover the correct partition $\pi_f$ with probability ≥ p
Calculating number of rounds
• To discover f it is sufficient that each position
participate in 1 conclusive test
• If this hasn’t happened yet, it happens with probability
at least 2-k
• This means we get a lower bound by analyzing the
following algorithm:
for n from 1 to t do:
for i from 1 to j do:
with probability 2-k, mark i
if all positions are marked, return ‘success’
return ‘failure’
Calculating number of rounds
• The probability that this algorithm succeeds is
$p = \left(1 - (1 - 2^{-k})^t\right)^j$
• To succeed with probability at least p, we must set
$t \ge \frac{\ln\left(1 - p^{1/j}\right)}{\ln\left(1 - 2^{-k}\right)}$
• Some calculus shows that this is O(2^k*ln(j))
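To make the bound concrete, the sketch below evaluates the number of rounds numerically. It assumes the success probability of the marking algorithm is (1 - (1 - 2^-k)^t)^j, that is, each of the j positions is marked independently with probability 2^-k per round. The function names are introduced here for illustration.

```python
import math


def success_probability(t: int, j: int, k: int) -> float:
    """Probability that all j positions are marked within t rounds."""
    per_position = 1.0 - (1.0 - 2.0 ** -k) ** t
    return per_position ** j


def rounds_needed(p: float, j: int, k: int) -> int:
    """Smallest t with success probability at least p (requires 0 < p < 1)."""
    numerator = math.log(-math.expm1(math.log(p) / j))  # ln(1 - p^(1/j))
    denominator = math.log1p(-(2.0 ** -k))              # ln(1 - 2^-k)
    return math.ceil(numerator / denominator)


for j in (10, 100, 1000, 10000):
    print(j, rounds_needed(0.99, j, 5))
```

Printing the values for increasing j shows the number of rounds growing by a roughly constant amount each time j is multiplied by 10, which is the logarithmic dependence on j stated above.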

Performance
• Ran on folded trap functions with k=5, 5 ≤ j ≤ 1000
• Near-linear scaling as expected
• Have solved up to 100,000 bit problems with this algorithm
Limitations
• Function must be strictly additively separable
• On any real problem, this algorithm will become an
exhaustive search
• Can start to address this using averaging and
thresholding (Munetomo & Goldberg 1999)
• I believe these limitations can be overcome
Conclusions
• New upper bounds on time complexity (O(2^k*j*ln(j)))
and space complexity (O(1)) of optimizing additively
separable functions
• Algorithm not practical as-is
• Linkage detection algorithm presented here could be
used to construct more powerful competent GAs