Sunday 17 November 2013

representations, permutations, visualisations

One of the things I’m interested in is evolutionary algorithms (EAs), and how to make them better.  An EA takes a population of “genomes”, “mutates” (changes a little) and selects (based on “fitness”), and mutates and selects, and ... until a suitably fit answer is found.

A recent advance has been the introduction of “evo-devo” algorithms.  (I’m putting all this biological terminology in scare quotes, because by the time the relevant process has been translated into a computer algorithm, it is so far removed from its biological inspiration as to make a biologist wince, or even exclaim in outrage.)  Evo-devo puts a distance between the genome (the representation that gets mutated) and the “phenotype” (the representation that gets selected based on fitness).  This can help the algorithm’s performance, by allowing simple easily mutatable genomes develop into complex structured phenotypes.

A colleague of mine at York, Julian Miller, is the inventor of such an algorithm, Cartesian Genetic Programming (CGP).  The genome is a string of numbers (numbers are easy to mutate a little bit).  The string is then interpreted as a network phenotype.  The network itself has inputs and outputs, so is a form of program.

Now, let’s consider permutations.  A permutation of the numbers 1 to N is these numbers in some specific order.  So the permutations of 1 to 3 are: (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1).  A string of length N has N! (N factorial) permutations.  N! grows very fast; while 5! = 120, 10! = 3,628,800, and 100! > 10157.

Permutations are common in computer science.  One classic use is in the Travelling Salesman Problem: given a bunch of N cities, find the shortest path through all of them.  That is, find the permutation of 1 to N that gives the shortest path.  Given there are N! such permutations, clearly we don’t want to try them all.  Although exact algorithms that are essentially more efficient than trying all possibilities aren’t known (and it is strongly suspected that there aren’t any), there are algorithms that come up with very good approximate (nearly shortest path) answers most of the time.  EAs are one such class of algorithms: breed for fitter (shorter) paths.

Permutations as genomes are a bit tricky, though.  A permutation has structure: it must contain all the numbers from 1 to N, and each only once.  So you can’t mutate a single entry: you have to swap two entries, or do something else that maintains the permutation structure. “Crossover” is even harder: how do you take half of one permutation, half of another, and combine them into a valid permutation?  There are various techniques, but they are not very pretty.

Using an evo-devo approach to generate a permutation seems even harder: how do you ensure that your developed system is a valid permutation?  So, for example, with CGP we can have a list of outputs, but how do we ensure that this list is a valid permutation?  (Having a single output that is already a permutation merely moves the problem back inside the network somewhere.)

We need a further representation and development step that is guaranteed to produce a permutation.  Rather than try to get the network to produce a permutation immediately, let’s break it down into two steps: the network produces a list of numbers, then that list has to go through a further interpretation step to form a permutation.  Julian came up with an idea of how to do this: given a list of (say) real numbers (easy to produce with CGP), just sort them into ascending order.  The correspondingly sorted list of the indexes gives the required permutation.  Voilà!

A string of numbers is interpreted as a network, which outputs a real vector, which when sorted yields a permutation

What is happening here is easy to visualise using a technique called parallel coordinates.  A list of N real numbers can be thought of as a vector in N-D space.  But N-D space is hard to visualise if N > 3.  (I find it pretty hard to visualise even when N = 3.)

It’s hard to visualise a lot of dimensions this way

Parallel coordinates do what it says in the name: instead of drawing the N dimensions orthogonal to each other (rapidly running out of ways to do this in our 3D physical space), draw them parallel to each other.  It’s easy to draw lots of parallel lines.  Now plot the N-D point (x1,x2,...xN) as follows: plot the point x1 on axis 1, the point x2 on axis 2, and so on, then joint these points together with a line.  The line in the parallel coordinate plot represents the point in N-D space.

parallel coordinates view of a single N-D point

We can use these parallel coordinated to visualise how a vector of real numbers can represent a permutation by its components being sorted into ascending order.

(top) a vector of 20 real numbers, a 20-D point, drawn in parallel coordinates; (bottom) the same vector, with the parallel axes ordered so that the components are in increasing order: the sorted axis indexes are the permutation represented by the N-D point. 

The Python/numpy code that generated these plots is:
N = 20
P = range(N) # indexes
V = rand(N)  # random vector

# plot unsorted vector
for dim in range(N):
    ax1.plot([dim, dim], [0, 1], '0.5', linewidth=0.25)
    ax1.text(dim, -0.2, str(P[dim]), ha='center', fontsize=32)
ax1.plot(range(N), V, '.k', markersize=20)
ax1.plot(range(N), V, 'k')

# sort, and plot sorted vector
P = argsort(V) # sort indexes
V = sort(V)    # sort vector (for plotting)
for dim in range(N):
    ax2.plot([dim, dim], [0, 1], '0.5', linewidth=0.25)
    ax2.text(dim, -0.2, str(P[dim]), ha='center', fontsize=32)
ax2.plot(range(N), V, '.k', markersize=20)
ax2.plot(range(N), V, 'k')
Note that the code that generates the permutation is the single line P = argsort(V): the rest is just plotting code.

Here I started from a random vector, rather than the non-random output of some CGP network.  Sorting a random vector is one way to construct a random permutation, but as far as Julian and I can tell from the literature, this CGP use for representing evolved, non-random permutations isn’t standard.  Julian has been using it for several years in his module on evolutionary algorithms, and will be publishing a paper on some results next year.


No comments:

Post a Comment