Computational Evolution
All the intelligent things in the universe that we know of were made by evolution.
So can we copy evolution to make AI programs?
There is an entire field of modelling evolution as an "algorithm".
First, let us take a look at natural evolution, with our "computer science" hat on.
Living things contain a diversity of DNA,
a diversity of ways of persisting.
Some things persist, some things don't.
-
The Victorians saw it as a struggle of the fit, of the strong,
where what would persist would be more intelligent, more admirable
in some way.
-
Now we see it as more complex:
-
You can survive by being sneaky, by being a cheater,
by being a parasite,
by being an unnoticed harmless freeloader, by being a byproduct of something else.
-
You can survive in symbiosis with other species, other genes.
You can survive as part of a massive social
grouping where the individual is irrelevant.
-
You can get locked into spirals of sexual mate selection
which have little to do with "fitness".
-
You can lose talents as well as gain them.
You can find yourself stuck in an evolutionary local maximum,
where "You can't get from here to there."
-
You can be wiped out by utterly random, and completely unfair,
selection (the dinosaurs).
You can survive unfairly because your competition was wiped out (the mammals).
-
You can even have mass extinctions that have basically no cause
(this is related to chaos theory).
Certain systems can have periodically
unstable dynamics which cause mass extinction for no reason.
The question we always ask is:
"If we come back in a million years,
will these creatures be long extinct,
or will they actually still be here?"
e.g. Take the Earth now. Go away and come back in 5 million years.
What will be here?
Go away again and come back in 50 million years.
Same question.
What will persist?
Even bodies cannot be taken as a given.
The chicken is an egg's ludicrous, elaborate, and yet successful,
way of making another egg.
Death cannot be taken as a given. Nor sex.
Nor even a population of individuals.
Nor even a mechanism of inheritance (DNA).
You have to explain why these things evolved.
Why they persist.
This view of evolution, exemplified by the writings of Richard Dawkins,
is very much how Darwin saw it.
-
Genetic Algorithms
use evolution as a heuristic to find good solutions.
This tends to be a return to the simple Victorian
idea of a "fair test":
we want an algorithm to solve a problem,
and the best at solving it get to survive.
e.g. Fitness = Score out of 100 in a task.
-
Artificial Life
is more like the modern view
of evolution.
Sets up dynamic systems (ecologies, economies)
and lets them run.
See what emerges.
Instead of an explicit fitness score, where the fittest reproduce,
you have rules
or "laws of physics" like:
"You need energy to reproduce.
You get energy from food.
You can eat plants, dead animals, or hunt for yourself.
You can steal other creatures' food.
To reproduce, you must find a mate.
You must stay alive until you reproduce.
Every action uses up energy.
If you use up too much energy you die."
Then you run the environment and see what persists.
Often you are surprised at the strategies creatures evolve
in order to persist.
Q.
In fact, the above is a good example.
There is a problem with the rules above.
A successful creature might simply do nothing.
Why?
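A toy version of these rules can be prototyped to probe the question above. Everything concrete here (the strategies, the energy numbers, the food probability) is invented for illustration, and reproduction and mating are omitted to keep the sketch short.

```python
import random

# Toy sketch of the energy rules above (illustrative numbers only).
START_ENERGY = 10
MOVE_COST = 1      # every action uses up energy
FOOD_ENERGY = 3    # you get energy from food

class Creature:
    def __init__(self, strategy):
        self.strategy = strategy   # "forage" or "do_nothing"
        self.energy = START_ENERGY
        self.alive = True

    def step(self):
        if self.strategy == "do_nothing":
            return                 # no action: no energy spent
        self.energy -= MOVE_COST
        if random.random() < 0.5:  # found food half the time
            self.energy += FOOD_ENERGY
        if self.energy <= 0:
            self.alive = False     # used up too much energy

def run(creatures, steps=100):
    for _ in range(steps):
        for c in creatures:
            if c.alive:
                c.step()
    return [c for c in creatures if c.alive]
```

Run a mixed population and see which strategies persist under the rules as stated.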
Evolution as an algorithm:
We have a population of "creatures".
Their
"DNA" is the instructions on how to build a creature to solve some problem.
We will be searching the space of possible DNA.
The algorithm will look like:
-
Start with creatures with random DNA.
-
Loop forever:
-
Each creature tries to solve some problem.
-
The creature's "fitness" is rated according to this test.
-
The fittest get to reproduce.
-
Reproduction could involve many things:
- Pass on copy of yourself unchanged into the next generation (cloning).
- Pass on copy with small random change (mutation).
- Construct a child as a mix of two (or more) parents.
This is called crossover in sexual reproduction.
- The less fit don't reproduce.
-
The new population is more like the fitter end of the
previous population.
-
The old population is deleted.
-
The new population replaces it. And loop.
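The loop above can be sketched in Python. The genome encoding (a fixed-length list of numbers), the placeholder fitness function, and all the sizes and rates below are assumptions of the sketch, not part of the algorithm itself.

```python
import random

GENOME_LEN = 10
POP_SIZE = 50
MUTATION_RATE = 0.1   # chance of each gene being randomised in a child

def fitness(dna):
    # Placeholder problem: the closer the genes sum to the maximum, the fitter.
    return sum(dna)

def mutate(dna):
    # Pass on a copy with small random changes (mutation).
    return [random.random() if random.random() < MUTATION_RATE else g
            for g in dna]

def crossover(mum, dad):
    # Construct a child as a mix of two parents (one-point crossover).
    point = random.randrange(1, len(mum))
    return mum[:point] + dad[point:]

def evolve(generations=100):
    # Start with creatures with random DNA.
    pop = [[random.random() for _ in range(GENOME_LEN)]
           for _ in range(POP_SIZE)]
    for _ in range(generations):
        # Each creature is rated; the fittest get to reproduce.
        pop.sort(key=fitness, reverse=True)
        parents = pop[:POP_SIZE // 2]   # the less fit don't reproduce
        # The old population is deleted and replaced by the children.
        pop = [mutate(crossover(random.choice(parents),
                                random.choice(parents)))
               for _ in range(POP_SIZE)]
    return max(pop, key=fitness)
```

Here a hard cut keeps the top half; many GAs instead use softer schemes (e.g. fitness-proportionate selection) so less fit individuals still get some chance to reproduce.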
- Genotype
- The "DNA".
The instructions for how to build a new solution.
- Phenotype
- The "body". What the genotype constructs.
The machine that actually solves the problem.
e.g. The genotype is a list of "weights" (parameters) that define a neural network.
The phenotype is a neural network with these weights built-in.
- Gene
- A subsection of the genotype that can be isolated
and identified to have some function.
This is hard in nature, but easy in engineering.
In engineering, we normally set it up so that
each segment of the genotype maps to some known parameter in the built machine.
Certain combinations of parameters/genes may be better than others.
Evolution is a search in this multi-dimensional space.
- Allele
- The value that a gene has.
e.g. The gene for eye colour can take values (blue, brown, green).
The term is used in the
GA sample code.
- Chromosome
-
Genes are often grouped together on structures
called chromosomes, so combinations of them can be easily
reshuffled, especially by crossover.
- Lamarckian inheritance
- The idea that what you
learn in life can be passed on through your genes.
Does not happen in nature.
- Mendelian inheritance
- What you learn can only
affect whether you pass on your genes, not what is in them.
This is what actually happens.
But we can experiment with Lamarckism, even if it does not occur in nature.
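Returning to the weights example above, the genotype-to-phenotype mapping might look like this sketch, where the network shape (2 inputs, 2 hidden units, 1 output, no biases) is an assumption.

```python
import math

def build_phenotype(genotype):
    # Genotype: a flat list of 6 weights.
    # Phenotype: a function that runs the network they define.
    w1, w2, w3, w4, w5, w6 = genotype
    def network(x1, x2):
        h1 = math.tanh(w1 * x1 + w2 * x2)   # hidden unit 1
        h2 = math.tanh(w3 * x1 + w4 * x2)   # hidden unit 2
        return w5 * h1 + w6 * h2            # output unit
    return network

net = build_phenotype([0.5, -0.2, 0.1, 0.9, 1.0, -1.0])
```

Evolution then searches the 6-dimensional space of weight lists; only the phenotype is ever tested against the problem.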
Certain gene combinations are fitter than others,
and evolution is a search in the multi-dimensional space
of gene combinations to find fitter solutions.
This was recognised by evolutionary biologists
long before hill-climbing computer programs
(or indeed any computer programs).
This is from
Sewall Wright
in 1932:
You will recognise from
hill-climbing/descending
that this is a form of "blind" search on a landscape
where we only get feedback on particular points
that we try out.
We never get to see a picture of the landscape as above.
In some searches (e.g. Neural Networks)
we get a measure of the slope
but here we do not get the slope.
We are only told the fitness-value.
We do not know the direction in which we should move the genes
to improve matters.
No method that works like this can
guarantee to find the global maximum
without infinite exploration.
The global maximum could be almost completely hidden,
with almost no landscape leading towards it
(it must have some landscape leading towards it
if the landscape is continuous - it can't be a point).
How do we know in advance
that the landscape is continuous?
If we know nothing about it, it may not be.
The main feature of evolution by hill-climbing
is that we use a population of hill-climbers.
We start the experiment by "dropping" our hill-climbers
at random points scattered all over the landscape
(hopefully giving good coverage).
Many may be starting in useless places, but if we know nothing
about the landscape what else can we do except drop them randomly?
They then begin to move.
(Of course, as before, they move by discrete jumps,
not continuously.)
For instance, let the genotype encode a real-valued number x
in the interval [1, 10],
and the "fitness" of x is:
f(x) = sin(x) + sin(2x) + sin(5x) + cos(x)
Then all we are doing is maximising this function over this interval.
We start with a population of random x.
Generation 1 drops them all over the interval:
Now, the fittest get to reproduce.
Not the single fittest solution,
but a number of the better solutions.
This means that instead of following a single path up the landscape,
we keep multiple solutions alive.
Generation 7:
Why do we let less fit individuals reproduce?
Why is keeping multiple solutions alive a good idea?
Because there is no guarantee that the best solutions
right now
are on the slopes of the
global maximum.
We give many different areas of the landscape a chance to prove themselves.
Also, even if the less fit individuals
are not on the slopes of a better peak:
- They are solving the problem in a different way,
and (imagine the genotype consists of 100 genes)
they will probably have some good genes to contribute to the fittest.
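The landscape example can be run end-to-end as a minimal population search. The population size, the mutation step, and the use of cloning for survivors are choices made for this sketch.

```python
import math
import random

def f(x):
    # The fitness landscape from the example.
    return math.sin(x) + math.sin(2 * x) + math.sin(5 * x) + math.cos(x)

POP_SIZE = 30
MUTATION_STD = 0.1   # size of the discrete jumps

# Generation 1: drop the hill-climbers at random points on [1, 10].
pop = [random.uniform(1, 10) for _ in range(POP_SIZE)]
initial_best = max(pop, key=f)

for generation in range(50):
    pop.sort(key=f, reverse=True)
    survivors = pop[:POP_SIZE // 2]   # a number of the better solutions
    # Each survivor passes on an unchanged clone plus a mutated copy,
    # so multiple regions of the landscape stay alive.
    pop = survivors + [min(10.0, max(1.0, s + random.gauss(0.0, MUTATION_STD)))
                       for s in survivors]

best = max(pop, key=f)
```

Because clones are carried over unchanged, the best fitness never decreases from one generation to the next, though nothing guarantees it reaches the global maximum.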