Lecture 6
Some Applications. I: The Traveling Salesman Problem.
This is a “typical” NP-Complete problem, with no known (expected?)
polynomial-time solution. By 1990, problems in VLSI fabrication
were asking for good solutions in the case of 1.2 million “cities”.
Since the only exact solution known is of the type “generate all
permutations on N elements, compute the cost of the tour
corresponding to the permutation, update the known smallest tour and
repeat”, the computational cost is O(N!). It would appear that, at the
moment, exact solutions are feasible for problems of around 100 (or
slightly more) cities; it would also appear that problems involving
10,000+ cities are still on the “far side” for evolutionary-based
optimization approaches.
Our discussion is based on Michalewicz and Fogel’s How to Solve It,
Ch. 8.
Note: one of the reasons for choosing this problem is that it is well-known and non-trivial. Much work has been done on it, with many
clever variants: it should be a good problem to stimulate ideas in other
areas.
Note: A survey article on combinatorial optimization approaches to the
problem is [JohnsonMcGeoch1997]. A follow-up
([Walshaw2001]) provides some recent variants. Essentially, the best
algorithm for obtaining approximate solutions is based on one
invented by Lin and Kernighan in 1973. The original Lin-Kernighan
algorithm (the 1973 paper does not appear to be available on-line) was
the best algorithm until 1989, and has been superseded by some more
efficient variants. No genetic algorithm approaches seem to come
close to it, at this point.
Test Cases: in trying to determine the efficiency (and accuracy) of a
proposed approximation algorithm, it is necessary to set up
benchmarks. There are two generally accepted ways of setting up test
cases:
1. The cities are distributed at random in the Euclidean plane,
according to a uniform distribution, and Euclidean distance is
assumed. One of the reasons for this is that an empirical formula for
the expected length of the minimal tour exists: L* = k•sqrt(N•R),
where N is the number of cities, R is the area of the square box
containing them, and k is an empirical constant - its suggested (from
various considerations) value is k = 0.749. (A short numerical
illustration follows this list.)
2. Publicly available test cases (TSPLIB), with documented optimal or
best-known solutions.
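To make the empirical formula in case 1 concrete, here is a minimal Python sketch; the function name and the unit-square example are ours, not from the slides.

import math

def expected_optimal_tour_length(n_cities, area, k=0.749):
    # Empirical estimate L* = k * sqrt(N * R) for N cities placed uniformly
    # at random in a square region of area R.
    return k * math.sqrt(n_cities * area)

print(expected_optimal_tour_length(1000, 1.0))   # about 23.7 for 1000 cities in a unit square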
Variation Operators. For TSP, the evaluation function is trivial: add
up the lengths of the edges in a given tour. Fitness is thus easy to
compute, and fitness proportionate reproduction is easy to set up.
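As a concrete illustration (a sketch of ours, assuming a precomputed distance matrix dist indexed by city), the evaluation function is just:

def tour_length(tour, dist):
    # Sum of the edge lengths along the closed tour; dist[i][j] is the
    # distance between cities i and j, and the last edge closes the cycle.
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))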
Several problems arise in the decision of which operators to use to
provide the next generation.
1. Do we use binary or integer representation for the “names” of the
cities? Binary representation requires N•ceil(log2(N)) bits and both
mutation and crossover applied at the bit level are likely to lead to
non-tours or, even worse, to indices that do not correspond to any city.
A reasonable choice is to pick an integer representation. Even
in this case, an array-of-integers representation for the chromosomes is
very likely to introduce non-tours under both mutation and crossover,
requiring either re-tries or “patching up” operators.
2. Do we attempt to invent variants of whatever operators we want to
introduce so that they repair errors (after all, error repair is a fairly
common event in real DNA), or do we look for different
representations for our chromosomes, with operators that do not
introduce errors?
3. How can we guarantee that our changes in representation and
operators will still provide a search over the whole search space (or
that our repair techniques will not introduce unacceptable biases in our
“search populations”)?
A reasonable choice (it has been the historical choice) attempts to
introduce appropriate data structures and appropriate operators, trying
to “repair” as little as possible. The operative mantra is:
lots of repair = wrong data type.
We will use the examples in [M&F], with 9 cities, numbered from 1 to
9.
We start with a series of node-based operators; we will look at edge-based ones later.
Adjacency Representation. Encode a tour as a list of cities. City j is
listed in position i iff the tour leads from city i to city j. The vector
(2 4 8 3 9 7 1 5 6) represents the tour 1 → 2 → 4 → 3 → 8 → 5 → 9 → 6 → 7.
This may be slightly counterintuitive, but a bit of practice will
convince you that it works as an unambiguous way of representing
tours in terms of ordered lists or arrays.
It should be clear that not all such vectors can represent tours:
(2 4 8 1 9 3 5 7 6) has the disjoint cycles 1 → 2 → 4 → 1 and
3 → 8 → 7 → 5 → 9 → 6 → 3.
This representation does not support simple “cut-and-splice” crossover
operators (as well as not supporting point mutations). We can choose
to “repair” or we can choose to modify the operators so they leave us
with legal members of the population (= tours).
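A small decoding sketch (our own illustration, 1-based city numbers) that also detects the disjoint-cycle case:

def adjacency_to_tour(adj, start=1):
    # adj is 1-based in content: adj[i-1] is the city visited right after city i.
    # Returns the tour as a list of cities, or None if the vector decomposes
    # into disjoint cycles (i.e. it is not a legal tour).
    tour, seen, city = [start], {start}, start
    for _ in range(len(adj) - 1):
        city = adj[city - 1]
        if city in seen:
            return None
        tour.append(city)
        seen.add(city)
    return tour

print(adjacency_to_tour([2, 4, 8, 3, 9, 7, 1, 5, 6]))   # [1, 2, 4, 3, 8, 5, 9, 6, 7]
print(adjacency_to_tour([2, 4, 8, 1, 9, 3, 5, 7, 6]))   # None: disjoint cycles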
J. Grefenstette published several potential operators in 1985 (the paper does not appear to be available on-line) and ran some experiments.
Alternating Edges Crossover. Randomly choose an edge from the
first parent; select an appropriate edge from the second parent; then
select from first parent; etc. If the new edge from a parent introduces
a cycle into the partial tour being constructed, select a random edge
from the remaining ones that does not introduce a cycle.
p1 = (2 3 8 7 9 1 4 5 6)
p2 = (7 5 1 6 9 2 8 4 3)
might lead to the partial offspring:
s1 = (2 5 8 7 9 1 ? ? ?)
since up to this point the picking of alternate edges gives no cycle.
At this point, the choice of 8 from the second parent would introduce
the edge 7 → 8, but the edge endpoint 8 has already been picked. The
still missing vertices are 3, 4 and 6. Choosing the edges 7 → 3 and
7 → 6 does not introduce any immediate cycles, while 7 → 4 introduces
the cycle 7 → 4 → 7.
A potential partial offspring is: s1 = (2 5 8 7 9 1 3 ? ?) with the
parents (p1 as potential donor)
p1 = (2 3 8 7 9 1 4 5 6)
p2 = (7 5 1 6 9 2 8 4 3)
5 has already been chosen; 4 and 6 remain. The choice 8 → 4
introduces the cycle 8 → 4 → 7 → 3 → 8, so choose 8 → 6, which
forces 9 → 4: s1 = (2 5 8 7 9 1 3 6 4). The offspring (2 5 8 7 9 1 6 4 3)
is also possible - as well as others.
Question: what is the worst cost of this crossover scheme? The
expected cost? It is clear that we may have to perform several checks
on long paths before we can determine whether a proposed increment
of a partial tour is still a useful partial tour.
Subtour-chunks Crossover. Choose a random-length subtour from
one parent, another random-length subtour from the other. Extend the
tour by choosing edges alternating between parents. Pick random
non-cycle-producing edges if the next choice would introduce a cycle.
This is clearly quite similar to the previous crossover scheme.
Heuristic Crossover. Choose a random city to start the tour. For each
of the parents, check which edge emanating from that city is shorter.
Take it. That will give you the next city. Repeat with the parents. If
both parents provide edges introducing a cycle, pick a random city
(and edge) not introducing a cycle.
Modification: 1) if the shorter edge from a parent introduces a cycle,
check the longer before “going random”; 2) if you need to “go
random” select the shortest edge from a pool of randomly selected q
(parameter) edges (similar to tournament selection).
To improve local optimization, another operator was introduced (this
has a long history in its own right): randomly select two edges (i, j)
and (k, m) and check if |(i, j)| + |(k, m)| > |(i, m)| + |(k, j)|. If true,
replace (i, j) and (k, m) by (i, m) and (k, j).
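A direct transcription of this check (a sketch, assuming a symmetric distance matrix dist indexed by city):

def improving_exchange(dist, i, j, k, m):
    # True when replacing edges (i, j) and (k, m) by (i, m) and (k, j)
    # strictly shortens the total length, as in the test above.
    return dist[i][j] + dist[k][m] > dist[i][m] + dist[k][j]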
None of the three crossover operators developed in this context has
shown outstanding performance. They all suffer from the creation of
too much disruption: it would seem that the Adjacency List
representation, although good for thinking in terms of schemata that
“fix” good edges and attempt to construct around them, is somehow
not very good for the problem, because it does not really maintain
good schemata.
Ordinal Representation. Assume the existence of a “reference list”
that will be used as a starting point for all other lists. This list could be
C = (1 2 3 4 5 6 7 8 9), although it does not have to be. A tour such as
1 2 4 3 8 5 9 6 7 can be represented by the list
l = (1 1 2 1 4 1 3 1 1)… How?
The first number on l is 1, so take the first city on C as the first city of
the tour, and remove it from C. The resulting partial tour is 1. The
second number on l is also a 1, so pick the first element on the current
version of C, which is 2. Remove it from C; partial tour 1 → 2. The
next item on l is 2, corresponding to the 4 on C. Remove the 4 from
C. Partial tour 1 → 2 → 4. Continue until all elements of l have been
accounted for.
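A compact decoding sketch for the ordinal representation (our illustration; each entry is a 1-based index into the shrinking reference list):

def decode_ordinal(ordinal, reference):
    # Each entry picks (and removes) a city from the current reference list.
    ref = list(reference)
    return [ref.pop(i - 1) for i in ordinal]

print(decode_ordinal([1, 1, 2, 1, 4, 1, 3, 1, 1], [1, 2, 3, 4, 5, 6, 7, 8, 9]))
# [1, 2, 4, 3, 8, 5, 9, 6, 7]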
Advantage: splicing works! Example: the splice point is |.
Parents:
p1 = (1 1 2 1 | 4 1 3 1 1) [1 2 4 3 8 5 9 6 7]
p2 = (5 1 5 5 | 5 3 3 2 1) [5 1 7 8 9 4 6 3 2]
Offspring:
s1 = (1 1 2 1 | 5 3 3 2 1) [1 2 4 3 9 7 8 6 5]
s2 = (5 1 5 5 | 4 1 3 1 1) [5 1 7 8 6 2 9 3 4]
Disadvantage: only the first part of the tour survives, the second
becoming, essentially, random. There is too little inheritance, and the
experimental results bear this out.
Path Representation. This is what one would expect: the path
5 1 7 8 9 4 6 2 3 is represented as the list
(5 1 7 8 9 4 6 2 3). It should be clear that this particular tour can be
represented via 9 equivalent lists (rotate left or right).
Crossover Operators.
1) Partially Mapped (PMX) crossover: choose a subsequence of a
tour from one parent and preserve the order and position of as many
cities as possible from the other parent. As an example, consider the
parents with cut-points indicated by |:
p1 = (1 2 3 | 4 5 6 7 | 8 9)
p2 = (4 5 2 | 1 8 7 6 | 9 3)
The offspring would be generated as follows:
s1 = (x x x | 1 8 7 6 | x x)
s2 = (x x x | 4 5 6 7 | x x)
This swap also defines a mapping: 1 ↔ 4, 8 ↔ 5, 7 ↔ 6, 6 ↔ 7.
Next, fill in additional cities from the parents that don’t lead to conflict.
p1 = (1 2 3 | 4 5 6 7 | 8 9)
p2 = (4 5 2 | 1 8 7 6 | 9 3)
The easy ones:
s1 = (x 2 3 | 1 8 7 6 | x 9)
s2 = (x x 2 | 4 5 6 7 | 9 3)
For the harder ones, use the mapping:
s1 = (4 2 3 | 1 8 7 6 | 5 9)
s2 = (1 8 2 | 4 5 6 7 | 9 3).
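A possible PMX sketch producing one offspring (the other offspring is obtained by swapping the roles of the parents; the cut-point convention is ours):

def pmx(p1, p2, cut1, cut2):
    # Copy p2's segment [cut1:cut2] into the child, take every other position
    # from p1, and resolve conflicts by following the position-wise mapping
    # between the two segments (e.g. 1 -> 4, 8 -> 5, 7 -> 6, 6 -> 7 above).
    child = list(p1)
    child[cut1:cut2] = p2[cut1:cut2]
    mapping = dict(zip(p2[cut1:cut2], p1[cut1:cut2]))
    for i in list(range(cut1)) + list(range(cut2, len(p1))):
        city = p1[i]
        while city in mapping:            # city already used by the copied segment
            city = mapping[city]
        child[i] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
print(pmx(p1, p2, 3, 7))   # [4, 2, 3, 1, 8, 7, 6, 5, 9]
print(pmx(p2, p1, 3, 7))   # [1, 8, 2, 4, 5, 6, 7, 9, 3]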
2) Order (OX) crossover: choose a subsequence of a tour from one
parent and preserve the relative order of the cities from the other.
Example:
p1 = (1 2 3 | 4 5 6 7 | 8 9)
p2 = (4 5 2 | 1 8 7 6 | 9 3)
s1 = (x x x | 4 5 6 7 | x x)
s2 = (x x x | 1 8 7 6 | x x)
The tour in p2, starting from its second cut point, is
9 3 4 5 2 1 8 7 6. Remove the cities already
in s1, obtaining the partial tour 9 3 2 1 8. Insert this partial
tour after the second cut point of s1, obtaining the tour
s1 = (2 1 8 | 4 5 6 7 | 9 3). Similarly s2 = (3 4 5 | 1 8 7 6 | 9 2).
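A corresponding OX sketch (same caveats as for PMX above):

def ox(p1, p2, cut1, cut2):
    # Keep p1's segment; fill the remaining positions, starting after the second
    # cut point and wrapping around, with p2's cities read in order from p2's
    # second cut point, skipping cities already present in the kept segment.
    n = len(p1)
    child = [None] * n
    child[cut1:cut2] = p1[cut1:cut2]
    kept = set(p1[cut1:cut2])
    fill = [p2[(cut2 + k) % n] for k in range(n)]
    fill = [c for c in fill if c not in kept]
    positions = list(range(cut2, n)) + list(range(cut1))
    for pos, c in zip(positions, fill):
        child[pos] = c
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
print(ox(p1, p2, 3, 7))   # [2, 1, 8, 4, 5, 6, 7, 9, 3]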
3) Cycle (CX) crossover: each city and its position comes from one of
the parents. Example:
p1 = (1 2 3 4 5 6 7 8 9)
p2 = (4 1 2 8 7 6 9 3 5)
Start by taking the first city from p1:
s1 = (1 x x x x x x x x)
The next city must be from p2, and from the same position. This gives
city 4, which is in position 4 on p1: s1 = (1 x x 4 x x x x x). In p2, in
the same position as 4 in p1, we have city 8: s1 = (1 x x 4 x x x 8 x).
We continue with s1 = (1 x 3 4 x x x 8 x), s1 = (1 2 3 4 x x x 8 x).
Note that the selection of 2 now forces the selection of 1 and we have
a cycle in our scheme. We now use the second parent to fill in: s1 = (1
2 3 4 7 6 9 8 5).
If we now start from p2,
p1 = (1 2 3 4 5 6 7 8 9)
p2 = (4 1 2 8 7 6 9 3 5)
s2 = (4 1 2 8 x x x 3 x) completes the first “cycle”. Filling in from p1:
s2 = (4 1 2 8 5 6 7 3 9).
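A CX sketch along the same lines (our illustration):

def cx(p1, p2):
    # Identify the cycle of positions starting at position 0; those positions are
    # copied from p1, every remaining position is filled from p2.
    cycle, pos = set(), 0
    while pos not in cycle:
        cycle.add(pos)
        pos = p1.index(p2[pos])        # position in p1 of the city p2 has here
    return [p1[i] if i in cycle else p2[i] for i in range(len(p1))]

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 1, 2, 8, 7, 6, 9, 3, 5]
print(cx(p1, p2))   # [1, 2, 3, 4, 7, 6, 9, 8, 5]
print(cx(p2, p1))   # [4, 1, 2, 8, 5, 6, 7, 3, 9]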
Several other path-based operators have been tried, and one may wish
to look in the literature for more.
Other Reordering Operators.
1. Inversion. Select two points along the permutation, cut it at these
points and re-insert the reversed string (a short sketch follows this list):
(1 2 | 3 4 5 6 | 7 8 9) → (1 2 | 6 5 4 3 | 7 8 9).
2. Insertion. Select a city and insert it in a random place.
3. Displacement. Select a subtour and insert it in a random place.
4. Reciprocal Exchange. Swap two cities.
5. Heuristic Crossover with multiple (> 2) parents. Just as
Heuristic Crossover but with as many choices for the next city as there
are parents - pick the shortest edge out…
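Sketches of the first and fourth operators in the list above (inversion and reciprocal exchange), as plain Python list manipulations of ours:

import random

def inversion(tour):
    # Reverse the segment between two randomly chosen cut points.
    i, j = sorted(random.sample(range(len(tour) + 1), 2))
    return tour[:i] + tour[i:j][::-1] + tour[j:]

def reciprocal_exchange(tour):
    # Swap the cities at two randomly chosen positions.
    i, j = random.sample(range(len(tour)), 2)
    tour = list(tour)
    tour[i], tour[j] = tour[j], tour[i]
    return tour

print(inversion([1, 2, 3, 4, 5, 6, 7, 8, 9]))   # e.g. [1, 2, 6, 5, 4, 3, 7, 8, 9]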
Edge-based Operators. Various people tried to introduce operators
that would make better use of edge information. Grefenstette
introduced a class of heuristic operators along the following lines:
1. Randomly select a city to be the current city c of the offspring.
2. Select 4 edges (2 from each parent) incident to c.
3. Define a probability distribution over the selected edges based on
their cost. Edges incident on a previously visited city have probability
0.
4. If at least one edge has positive probability, select probabilistically;
otherwise select at random to reach an unvisited city.
5. City at other end of edge is new c.
6. If tour is complete, stop; if not goto 2.
A number of experiments indicate that only about 60% of the edges
are transferred from the parents; 40% are random. Too much
randomness to be effective in building more efficient solutions.
Edge Recombination: require that the number of edges inherited
from the parents be as large as possible.
Start from the idea that, in a tour (3 1 2 8 7 4 6 9 5) the edges are (3 1),
(1 2), (2 8), (8 7), (7 4), (4 6), (6 9), (9 5), (5 3). Recall that the edges
are undirected: (5 3) = (3 5); direction is not important. Position of a
city on a tour is not important: tours are circular.
Evaluation Function: minimize the total cost of the edges that
constitute a legal tour.
Example. Start with two parents:
p1 = (1 2 3 4 5 6 7 8 9),
p2 = (4 1 2 8 7 6 9 3 5).
Using both parents, collect the edges available:
City 1: (1 9), (1 2), (1 4);
City 2: (2 1), (2 3), (2 8);
City 3: (3 2), (3 4), (3 9), (3 5);
City 4: (4 3), (4 5), (4 1);
City 5: (5 4), (5 6), (5 3);
City 6: (6 5), (6 7), (6 9);
City 7: (7 6), (7 8);
City 8: (8 7), (8 9), (8 2);
City 9: (9 8), (9 1), (9 6), (9 3).
The algorithm. Start with either one of the “start cities” in the edge
lists of the parents or with a city with the smallest number of edges -
this latter criterion maximizes the probability that you will finish the
tour using the parental set of edges. Once you have decided on the
first city, add an edge to a city with the smallest number of edges.
Continue.
If we start with City 1: we can reach 2, 4, 9. 2 and 4 have 3 edges; 9
has 4. Pick, randomly, between 2 and 4. Say you picked 4. You now
have (1 4 x x x x x x x). 4 has edges to 1, 3, 5. The edge to 1 has
already been used; 5 has fewer edges than 3: (1 4 5 x x x x x x).
Continuing in this fashion, we arrive at the offspring (1 4 5 6 7 8 2 3
9) - without needing to introduce a new edge to complete the tour. It
appears (experimentally) that failure occurs in less than 1.5% of the
cases.
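A sketch of edge recombination along the lines just described (with simplifications of ours: the offspring always starts from the first parent's first city, and ties are broken at random):

import random

def edge_recombination(p1, p2):
    # Build the combined (undirected) edge lists of both parents, then grow the
    # offspring by always moving to the neighbour with the fewest remaining
    # edges; a random unvisited city (a "foreign" edge) is used only when the
    # current city has no unused neighbours left.
    n = len(p1)
    edges = {c: set() for c in p1}
    for p in (p1, p2):
        for i, c in enumerate(p):
            edges[c].update((p[i - 1], p[(i + 1) % n]))
    current = p1[0]
    tour, unvisited = [current], set(p1) - {current}
    while unvisited:
        for neighbours in edges.values():
            neighbours.discard(current)            # current city no longer available
        candidates = edges[current] & unvisited
        if candidates:
            fewest = min(len(edges[c]) for c in candidates)
            nxt = random.choice([c for c in candidates if len(edges[c]) == fewest])
        else:
            nxt = random.choice(list(unvisited))   # failure case (< 1.5% of the time)
        tour.append(nxt)
        unvisited.discard(nxt)
        current = nxt
    return tour

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 1, 2, 8, 7, 6, 9, 3, 5]
print(edge_recombination(p1, p2))   # e.g. [1, 4, 5, 6, 7, 8, 2, 3, 9]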
A variant attempts to maintain subtours common to both parents: note
that, if a city has only two or three edges associated, one (or both) of
the edges must be common to both parents. The algorithm “prefers”
to choose edges common to both, before looking at any others.
This seems to have led to better results.
Finding further operators that improve on the known edge-based ones is an
open question (or was as of 2000).
Matrix Representations and Operators. There have been at least 3
attempts.
1. Precedence Matrix. A tour (3 1 2 8 7 4 6 9 5) is represented by the
matrix in which the element mij contains a 1 iff city i occurs before
city j on the tour. Properties of the matrix:
1. The number of 1s is exactly
n•(n - 1)/2.
2. mii = 0 for all 1 ≤ i ≤ n.
3. If mij = 1 and mjk = 1 then mik = 1.
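A small sketch (ours) that builds the precedence matrix of a tour and checks the three properties:

def precedence_matrix(tour):
    # m[i][j] = 1 iff city i+1 occurs before city j+1 on the tour (0-based matrix).
    n = len(tour)
    pos = {city: k for k, city in enumerate(tour)}
    return [[1 if pos[i + 1] < pos[j + 1] else 0 for j in range(n)] for i in range(n)]

m = precedence_matrix([3, 1, 2, 8, 7, 4, 6, 9, 5])
n = len(m)
assert sum(map(sum, m)) == n * (n - 1) // 2                 # property 1
assert all(m[i][i] == 0 for i in range(n))                   # property 2
assert all(m[i][k] for i in range(n) for j in range(n)       # property 3 (transitivity)
           for k in range(n) if m[i][j] and m[j][k])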
Claim: If the number of 1s is less than n•(n - 1)/2, with the other
conditions still satisfied, the cities are partially ordered, which is
another way of saying that the matrix can be completed in at least one
way to obtain a legal tour.
The operators devised in [Fox & McMahon, Genetic Operators for
Sequencing Problems, in Foundations of Genetic Algorithms, G. J. E.
Rawlins, ed., Morgan Kaufmann, 1991] were the operators of
intersection and union.
Intersection. The intersection operator is based on the observation that
the (bitwise) intersection of two tour matrices results in a matrix that
satisfies conditions 2 and 3, with a number of 1s no greater than
n•(n - 1)/2, and thus extendible to a tour.
The two parents
p1 = (1 2 3 4 5 6 7 8 9)
p2 = (4 1 2 8 7 6 9 3 5)
correspond to the matrices below:
The intersection is given by the matrix:
And we have a partial order. For example, city 1 must precede cities
2, 3, 5, 6, 7, 8 and 9; city 6 is only required to precede city 9; etc.
How do we complete the tour?
Select one of the parents; add some 1s unique to this parent; complete
the matrix into a tour sequence through an analysis of the sums of the
rows and columns. A possible completion is given by the matrix on
the right below, which gives the tour (1 2 4 8 7 6 3 5 9).
The second operator is the union operator. It is based on the
observation that subsets from two matrices can be safely combined
provided the two subsets have empty intersection. It thus partitions the
set of cities into two disjoint groups and copies the bits of one matrix
for the first group, and the bits of the other for the second group.
For example, p1 can lead to the set
{1, 2, 3, 4}, p2 to the set {5, 6, 7, 8, 9},
with the “union” matrix at the right.
It needs to be completed, using the
same kind of techniques used for the
completion of the intersection.
2. Binary Tours. The matrix element mij contains a 1 iff the tour goes
directly from city i to city j. There is only one nonzero entry in each
row and column. Matrix (a) below represents the tour ( 1 2 4 3 8 6 5 7
9) (or any “rotation” left or right). Matrix (b) represents another tour
(or does it?).
The requirement that each row and column contain a single 1 allows
for non-tours to be represented - matrix (b) on the previous slide
represents the two subtours (1 2 4 5 7) and (3 8 6 9). An example of
subtours that can then be connected to make up a full tour occurs
below. The genetic algorithm will be restricted to subtours of length at
least 3 - fixed into full tours after it terminates.
Operators. Two were defined. The first took a matrix, randomly
selected several rows and columns, removed the set bits at the
intersections of those rows and columns and replaced them randomly.
Ex.: matrix (a) corresponds to a tour. Assume rows 4, 6, 7, 9 and
columns 1, 3, 5, 8, 9 are selected. The marginal sums are calculated
and stored; the bits at the intersections are removed and replaced
randomly, agreeing with the marginal sums. We have an example
below:
Replacing the old rows and column fragments with the new ones, we
have a matrix that represents not a single tour but two subtours: (1 2 4
5 7) and (3 8 6 9) - the full matrix (8.6(b)) introduced earlier.
The second operator starts with two parents and a matrix of 0s. It
fills the new matrix with the result of the “intersection” of the parents:
1 bits in both result in a 1 bit in the offspring. After this initial phase,
the operator copies alternately one set bit from each parent until no
bits exist in either parent that can be copied without violating the
matrix restrictions. If, at this point, the matrix still has some rows
without a 1, they receive a 1 at random - again satisfying the matrix
constraints.
We start with matrix (a) below, representing the subtours (1 5 3 7 8)
and (2 4 9 6). The second parent matrix (b) represents the full tour (1
5 6 2 7 8 3 4 9).
The result of the first phase of the operator application appears in
(a) below; the second phase leads to (b). Notice that a number of rows
(columns) in (b) have no 1: it would not be possible to fill them from a
parent without violating the conditions.
Random filling (with the constraint of having just one 1 in each row
and column) leads to the matrix below. It does represent a tour, (1 5 6
2 3 4 9 7 8).
Another solution would lead to the two subtours (1 5 3 4 9) and (2 7 8
6).
Binary Matrices with Crossover (XO) Operators. We define
crossover operators by first choosing one or more “between column”
positions, swapping the values in the blocks so identified - see the
picture below. The resulting matrices are (generally) illegal, although
they have the correct total number of 1s.
We see the result in the picture below: some rows (columns) have no
1s, some have two. We need to repair these intermediate matrices. For
example, the 1 in position m1,4 in (a) could be moved to position m1,8.
After this type of “correction” we may end up with the first offspring
providing the legal tour (1 2 8 4 3 6 5 7 9), while the second offspring
provides the two subtours (1 6 5 7 2 8 9) and (3 4).
The second step of the repair algorithm needs to be applied only to the
second matrix: cut and connect subtours to produce a legal tour. You
can use information about which edges exist in the parents to choose
where to splice the subtours: for example, (2 4) is present in one of the
parents, so we splice (3 4) and (4 3) between 2 and 8.
A heuristic “inversion operator” was also introduced. The claim - based on a number of experiments - was that this was reasonable, with
one 318-city experiment resulting in a tour within 0.6% of optimal.
When the splicing of subtours is required, local considerations can be
useful - and some systems incorporate them.
A modified GA scheme would involve some local optimizations
applied to all elements of the population.
We will introduce some notions from [Papadimitriou & Steiglitz;
Combinatorial Optimization] and then we’ll continue. They refer to
the TSP.
Definition. Let f and g denote tours. A k-change (also known as k-opt)
neighborhood of f is defined as
Nk(f) = {g : g is a tour obtained from f as follows: remove k edges
from f and replace them with k edges}. Example for 2-change:
[Figure: a tour on seven cities before and after a 2-change - two edges are removed and replaced by two others.]
It has been found that 2-change and 3-change (especially 3-change)
lead to very effective heuristics for TSP. 4-change does not seem to be
sufficiently effective to bother: the extra computational cost is not paid
back in extra improvements leading to faster convergence to a better
approximation [results mostly due to Lin].
The local optimization algorithm can be stated as follows. Let t be a tour and define
k-improve(t) = any s ∈ Nk(t) with c(s) < c(t), if such an s exists; "no" otherwise.
procedure local search
begin
  t ← some initial tour;
  while k-improve(t) ≠ "no" do
    t ← k-improve(t);
  return t
end
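A Python sketch of this local search for k = 2, using the standard 2-change reconnection (reverse the segment between the two removed edges); dist is an assumed distance matrix:

def two_opt_local_search(tour, dist):
    # Repeatedly apply improving 2-changes until the tour is 2-opt.
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                if a == 0 and b == n - 1:
                    continue                      # same pair of edges, skip
                i, j = tour[a], tour[a + 1]
                k, m = tour[b], tour[(b + 1) % n]
                if dist[i][j] + dist[k][m] > dist[i][k] + dist[j][m]:
                    tour[a + 1:b + 1] = tour[a + 1:b + 1][::-1]
                    improved = True
    return tour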
An algorithm was proposed:
1. Use a 2-opt procedure to replace each tour in the current population
with a locally optimal tour.
2. Allow higher quality solutions to generate more offspring.
3. Use recombination and mutation.
4. Search for the minimum by using local search within each
individual.
5. Repeat steps 2-4 until a termination condition is met.
For reproduction, a variant of the order crossover (OX) was used.
Two parents are given - with the cut points marked by |:
p1 = (1 2 3 | 4 5 6 7 | 8 9)
p2 = (4 5 2 | 1 8 7 6 | 9 3),
and they produce offspring as follows. First, the segments
between cut points are copied into the offspring:
s1 = (x x x | 4 5 6 7 | x x)
s2 = (x x x | 1 8 7 6 | x x).
Next, instead of starting from the second cut point of one parent - as
was the case for OX - the cities from the other parent are copied in the
same order from the beginning of the string, omitting those symbols
that are already present. This leads to the descendants:
s1 = (2 1 8 | 4 5 6 7 | 9 3)
s2 = (2 3 4 | 1 8 7 6 | 5 9).
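A sketch of this OX variant (our names and cut-point convention):

def ox_variant(p1, p2, cut1, cut2):
    # Keep p1's segment; fill the remaining positions from left to right with
    # p2's cities taken in order from the *beginning* of p2 (the modification
    # described above), skipping cities already in the kept segment.
    kept = set(p1[cut1:cut2])
    fill = iter(c for c in p2 if c not in kept)
    return [p1[i] if cut1 <= i < cut2 else next(fill) for i in range(len(p1))]

print(ox_variant([1, 2, 3, 4, 5, 6, 7, 8, 9], [4, 5, 2, 1, 8, 7, 6, 9, 3], 3, 7))
# [2, 1, 8, 4, 5, 6, 7, 9, 3]
print(ox_variant([4, 5, 2, 1, 8, 7, 6, 9, 3], [1, 2, 3, 4, 5, 6, 7, 8, 9], 3, 7))
# [2, 3, 4, 1, 8, 7, 6, 5, 9]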
Edge Assembly Crossover (EAX). Assume two parents have been
selected: [Y. Nagata & S. Kobayashi, Edge Assembly Crossover: a
High-Power Genetic Algorithm for the TSP, in Proceedings of the
Seventh International Conference on GAs, Morgan Kaufmann, 1997]
Construct a graph G that contains the edges of both parents (A and B).
Then construct a set of AB-cycles: an AB-cycle is an even-length
subcycle of G with edges that come alternately from A and B. An
AB-cycle can repeat cities but not edges.
Construction: assume we start at city 6 and that the first edge of the
AB-cycle is (6 1). This edge comes from parent A.
We pick the next edge from parent B, originating from city 1: select
edge (1 3). You can further select edges (in order) (3 4) from A, (4 5)
from B, (5 6) from A (which returns to the start city but would close an
odd-length cycle) and (6 3) from B. The latter choice gives us an
even-length AB-cycle, after dropping the first two edges. Remove the
edges of the AB-cycle from G, and repeat. This may well lead to
“ineffective” AB-cycles containing just two edges.
The next step involves selecting a subset of AB-cycles (an E-set) to
extend into a suitable tour. Two selection mechanisms were
introduced: 1) a deterministic heuristic method; 2) a random method
(each AB-cycle has probability 0.5). Once the E-set is constructed, we
construct an intermediate offspring C:
set C ← A
for each edge e ∈ E-set do
  if e ∈ A then C ← C - {e}
  if e ∈ B then C ← C ∪ {e}
C is a set of disjoint subtours that cover all cities.
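The same construction as a small Python sketch (ours), with edges stored as undirected frozensets:

def intermediate_offspring(a_edges, b_edges, e_set):
    # a_edges and b_edges are the parents' edge sets; C starts as A's edge set,
    # every E-set edge is removed from C if it came from A and added to C if it
    # came from B. The result is a union of disjoint subtours covering all cities.
    c = set(a_edges)
    for e in e_set:
        if e in a_edges:
            c.discard(e)
        elif e in b_edges:
            c.add(e)
    return c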
We now introduce a greedy procedure that incorporates local search to
obtain a single legal tour.
The construction:
1) Start with a subtour T with the smallest number of edges.
2) Select two edges, one from T and another from all edges in the
remaining subtours. Assume that (vq, vq+1) ∈ T and (v’r, v’r+1) ∉ T,
where the indices are taken modulo the number of nodes in the
corresponding subtours.
3) Take the two edges out of the subtours, reducing the cost by
cut(q, r) = L(vq, vq+1) + L(v’r, v’r+1),
where L denotes the length of the edge.
4) Define link(q, r) = min{L(vq, v’r) + L(vq+1, v’r+1), L(vq, v’r+1) + L(vq+1, v’r)},
and replace the original edges if the new ones provide a shorter
connection - we seek min over q, r of link(q, r) - cut(q, r).
5) After making a connection, the number of subtours is decreased by
one and we repeat the process until we have a full tour, which is the
offspring of parents A and B.
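A sketch (ours) of the edge-pair selection in steps 2)-4): given one subtour T and the remaining subtours, find the pair of edges minimizing link(q, r) - cut(q, r); dist is an assumed distance matrix and subtours are lists of cities:

def best_reconnection(dist, t, other_subtours):
    # For every edge (vq, vq+1) of T and every edge (wr, wr+1) of another subtour,
    # compute link(q, r) - cut(q, r) and keep the pair that minimises it.
    best = None
    for q in range(len(t)):
        vq, vq1 = t[q], t[(q + 1) % len(t)]
        for s in other_subtours:
            for r in range(len(s)):
                wr, wr1 = s[r], s[(r + 1) % len(s)]
                cut = dist[vq][vq1] + dist[wr][wr1]
                link = min(dist[vq][wr] + dist[vq1][wr1],
                           dist[vq][wr1] + dist[vq1][wr])
                if best is None or link - cut < best[0]:
                    best = (link - cut, q, s, r)
    return best   # (gain, q, subtour, r): the cheapest way to merge T with another subtour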
Observations and conclusions:
a) EAX produced results that are a small fraction of 1% above
optimum for many test cases from 100 to over 3000 cities.
b) A very high selective pressure is used. An offspring replaces a
parent only if it is better than both parents. If the offspring isn’t better,
the EAX operator is applied for up to N = 100 times in an attempt to
improve it.
c) EAX incorporates an implicit mutation by introducing new edges
into an offspring during the process of connecting subtours.
The inver-over operator. How about a “pure” evolutionary
algorithm? No local search; local optima avoidance; strong selection
pressure. [Tao & Michalewicz, Evolutionary algorithms for the TSP,
Proceedings of the 5th Parallel Problem Solving from Nature
Conference, Lecture Notes in CS, #1498, Springer-Verlag, 1998].
1) Each individual competes only with its offspring;
2) There is only one (adaptive) variation operator;
3) The number of times the operator is applied to an individual during
a generation is variable.
The algorithm uses a single parameter p, a small number in (0, 1).
Example. Assume the current S’ = (2 3 9 4 1 5 8 6 7) and c = 3.
If rand() < p, then select a (random) city c’ from S’, say 8, and invert
the segment that starts after c and ends at c’:
(2 3 9 4 1 5 8 6 7) → (2 3 8 5 1 4 9 6 7).
If not, select another (random) individual from the population, say (1 6
4 3 5 7 9 2 8). Search this individual for the city c’ which is “next” to
c in it: you find city 5. Thus the segment for inversion in S’ starts after
3 and ends at 5. The new offspring is
S’ ← (2 3 5 1 4 9 8 6 7).
Note that the edge (3 5) came from the second parent. Also note that
the repeat loop can execute many times before it terminates.
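A Python sketch (ours) of one application of the operator, following the description above; population is a list of tours:

import random

def inver_over_step(s, population, p=0.02):
    # Repeatedly pick c' (randomly with probability p, otherwise as the successor
    # of c in another individual) and invert the segment of s that starts after c
    # and ends at c'; stop as soon as the chosen c' is already adjacent to c in s.
    s = list(s)
    n = len(s)
    c = random.choice(s)
    while True:
        if random.random() < p:
            c2 = random.choice([x for x in s if x != c])
        else:
            other = random.choice(population)
            c2 = other[(other.index(c) + 1) % n]
        i = s.index(c)
        if s[(i + 1) % n] == c2 or s[i - 1] == c2:
            break                          # c' already next to c: no inversion
        j = s.index(c2)
        if i < j:
            s[i + 1:j + 1] = s[i + 1:j + 1][::-1]
        else:                              # segment wraps around the end of the list
            seg = (s[i + 1:] + s[:j + 1])[::-1]
            s[i + 1:], s[:j + 1] = seg[:n - i - 1], seg[n - i - 1:]
        c = c2
    return s

In the full algorithm this step is applied to every individual in each generation, and the offspring then competes only with its own parent, as stated earlier.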
Some claims (from Michalewicz & Fogel - or, beating your own drum,
maybe with justification):
1. Probably the quickest evolutionary algorithm developed up to 1998.
2. Three parameters: p (the probability of generating random
inversion), population size, and number of iterations in termination
condition.
3. Precision and stability appear (empirically) good for small instances (≤ 105
cities; almost 100% accuracy for the tour), with acceptable computational
time (≤ 4s).
4. New operator introduces a mix of inversion (mutation) and
crossover. Better than algorithms that rely on random inversion only.
5. p (= 0.02 for all tests) determines the proportion of blind and guided
inversions.