Dr. Mark Humphrys

School of Computing. Dublin City University.



Encoding the Genotype

Recall our rough Genetic Algorithm and our definitions. We need to fill in some items.

First, how should the genotype be encoded so we can subject it to mutation and crossover? Binary bitstrings? Program code?

What is the definition of a random genotype? A random program?




Binary Chromosome

In nature, DNA is a "string" written in an alphabet of 4 "characters" A, T, G and C (the 4 nucleobases). Other mechanisms of inheritance may have existed before DNA, and other mechanisms may exist elsewhere in the universe.

Simplest chromosome for machine evolution: String of bits 0 or 1.
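
For this simplest encoding, a random genotype is just a random string of bits. A minimal sketch in C (the length 90 is an arbitrary choice):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define LENGTH 90                         /* arbitrary chromosome length */

int main(void)
{
    char chromosome[LENGTH + 1];

    srand((unsigned) time(NULL));

    /* A random genotype: each position is an independent random bit. */
    for (int i = 0; i < LENGTH; i++)
        chromosome[i] = (rand() % 2) ? '1' : '0';
    chromosome[LENGTH] = '\0';

    printf("%s\n", chromosome);
    return 0;
}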

  


Decoding the chromosome is one-way

Decoding the chromosome is a one-way process.
We decode a chromosome to a number (or something else).
We never encode a number as a chromosome.



10010011001100100101111000010100100100001010100110101010101010100101001001010011010111001

This genotype can be interpreted as representing an integer in range INT_MIN to INT_MAX.
Or it can be interpreted as representing a real number in range REAL_MIN to REAL_MAX.
Or it can be interpreted as representing text (program code).
etc.
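
For example, a rough sketch of decoding in C. The function names, the choice of reading just the first 32 bits, and the example real range (standing in for REAL_MIN to REAL_MAX) are all arbitrary, not fixed by anything above:

#include <limits.h>
#include <math.h>
#include <stdio.h>

/* Read the first n bits (characters '0'/'1') of the chromosome as an unsigned integer. */
unsigned long long decode_to_unsigned(const char *bits, int n)
{
    unsigned long long u = 0;
    for (int i = 0; i < n; i++)
        u = (u << 1) | (unsigned long long)(bits[i] - '0');
    return u;
}

/* Interpret the same bits as a real number in [lo, hi] by scaling. */
double decode_to_real(const char *bits, int n, double lo, double hi)
{
    double max = pow(2.0, n) - 1.0;        /* largest value n bits can hold */
    return lo + (hi - lo) * (double)decode_to_unsigned(bits, n) / max;
}

int main(void)
{
    /* First 32 bits of the example chromosome above. */
    const char *chromosome = "10010011001100100101111000010100";

    /* Interpreted as an integer in [INT_MIN, INT_MAX]: shift the unsigned value down. */
    long long as_int = (long long)decode_to_unsigned(chromosome, 32) + INT_MIN;

    /* Interpreted as a real number in an arbitrary range, here [-1, 1]. */
    double as_real = decode_to_real(chromosome, 32, -1.0, 1.0);

    printf("as integer: %lld\nas real: %f\n", as_int, as_real);
    return 0;
}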




Decoding to minus infinity to plus infinity

Given a real number y from 0 to 1, we can map this to a real number x anywhere from minus infinity to plus infinity by taking the inverse of a (non-linear) sigmoid function, which squashes the range minus infinity to plus infinity into 0 to 1.

For example setting:

  y = 1 / (1 + exp(-x/5))
and solving for x.
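
Solving for x gives x = 5 ln( y / (1-y) ). A minimal sketch of the decode step (the constant 5 is just the one chosen in the example above):

#include <math.h>
#include <stdio.h>

/* Decode y in (0,1) to x in (-infinity, +infinity) by inverting
   y = 1 / (1 + exp(-x/5)), i.e. x = 5 * ln( y / (1-y) ).          */
double decode_unbounded(double y)
{
    return 5.0 * log(y / (1.0 - y));
}

int main(void)
{
    printf("%f %f %f\n",
           decode_unbounded(0.1),     /* about -10.99 */
           decode_unbounded(0.5),     /* exactly 0    */
           decode_unbounded(0.9));    /* about +10.99 */
    return 0;
}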


Q. An actual straight line is no good for this. Why?




Gray codes

Say we encode numbers as binary:

0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111
Note that to get from 3 to 4 we need a massive mutation (all 3 bits must flip). This can create a "barrier" that is difficult for the GA to cross.

Gray codes are an attempt to encode numbers such that to go to the next number always entails only 1 bit change. This rule works: "Start with all 0's for 0. For each successive number, successively flip the rightmost bit that produces a new string."

0 000
1 001
2 011
3 010
4 110
5 111
6 101
7 100
You could say that now a single bit change takes us all the way from 7 to 0. But the ordinary binary coding already had single bit changes from 4 to 0, from 7 to 3, and (with more bits) from 8 to 0, etc.

In any case, a small mutation causing a large (and probably bad) jump is not as much of a problem as a small, desirable, continuous change requiring a large mutation.
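
A standard way of generating this binary-reflected Gray code, not used in the rule above but reproducing the same table, is to XOR each number with itself shifted right by one bit. A minimal sketch:

#include <stdio.h>

/* Binary-reflected Gray code: XOR the number with itself shifted right by 1. */
unsigned int to_gray(unsigned int n)
{
    return n ^ (n >> 1);
}

int main(void)
{
    /* Reproduces the 3-bit table above: 000 001 011 010 110 111 101 100. */
    for (unsigned int n = 0; n < 8; n++) {
        unsigned int g = to_gray(n);
        printf("%u %u%u%u\n", n, (g >> 2) & 1, (g >> 1) & 1, g & 1);
    }
    return 0;
}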


Jumping barriers

Evolution has ways of "jumping" barriers anyway. e.g. In the original binary coding, 3 (011) cannot reach 4 (100) in a single bit flip, but it can jump to 7 (111) and then decline through 6 (110) or 5 (101) down to 4 (100).

3 can get to 4 via 1-bit mutations in 3 generations.

Or can it? What if 7 is really unfit? i.e. Only 3 and 4 are fit. Then the mutants at 7 die immediately. The population could get stuck on a local optimum, though often you would have to be quite unlucky not to be able to jump the barrier somehow.




Bitstrings v. Regular language implementation

Evolving bitstrings is popular, but not compulsory.

Consider a system that is defined by 100 parameters:

You can have two approaches:


  1. Store the parameters in regular language data structures.
    • 20 real numbers. Need to define randomise method so they start in 0 to 1. Mutation means (say) plus/minus a real number between 0 and 0.1. Have to check for out of bounds.
    • 20 real numbers. Need to define randomise method so they start in 0 to 1000. Mutation means (say) plus/minus a real number between 0 and 100. Have to check for out of bounds.
    • 20 integers. Need to define randomise method so they start in 0 to 1000. Mutation means (say) plus/minus an integer between 0 and 100. Have to check for out of bounds.
    • and so on
    Mutation is flexible: Different amounts for different parameters.
    Crossover probably will not generate new parameter values: the easiest way to define crossover is to take some parameters from one parent and the rest from the other parent. We probably will not define it to cut in the middle of a parameter (that would take some coding work), so we will rely on mutation for new parameter values. (Both approaches are sketched in code after this list.)

  2. Store bitstrings.
    All parameters, say, stored back-to-back on one bitstring. Need method to decode a bitstring to the 100 parameters (not easy). Will involve all the bounds-checking above.
    Once we have defined that method, other methods are easy:
    • Randomise method - bitstring is random 0s and 1s.
    • Mutation method - flip some percentage of the bits.
    • Crossover method - start of one bitstring, end of another. (Maybe multiple cuts.)
    Mutation is not flexible: Same rate for whole genotype. Bit flip might have more impact on one parameter than on another.
    Crossover will often generate new parameter values: the crossover cut might land in the middle of a parameter, producing a quite different value in the child.
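
As a concrete contrast, a rough sketch of both approaches in C. The struct layout, parameter counts, mutation sizes and rates, and bitstring length are arbitrary choices, and the hard part of approach 2 (decoding the bitstring back into parameters, with all the bounds checking) is omitted:

#include <stdlib.h>

#define N_REALS 20
#define N_INTS  20
#define N_BITS  1000                               /* arbitrary bitstring length */

double rand01(void) { return (double)rand() / RAND_MAX; }

/* Approach 1: parameters stored in a regular data structure. */
typedef struct {
    double small[N_REALS];                         /* reals in [0, 1]       */
    double large[N_REALS];                         /* reals in [0, 1000]    */
    int    counts[N_INTS];                         /* integers in [0, 1000] */
} Genotype1;

void randomise1(Genotype1 *g)
{
    for (int i = 0; i < N_REALS; i++) g->small[i]  = rand01();
    for (int i = 0; i < N_REALS; i++) g->large[i]  = rand01() * 1000.0;
    for (int i = 0; i < N_INTS;  i++) g->counts[i] = rand() % 1001;
}

void mutate1(Genotype1 *g)
{
    /* Different mutation sizes for different parameters, with bounds checks. */
    int i = rand() % N_REALS;
    g->small[i] += (rand01() - 0.5) * 0.2;         /* plus/minus up to 0.1 */
    if (g->small[i] < 0.0) g->small[i] = 0.0;
    if (g->small[i] > 1.0) g->small[i] = 1.0;

    i = rand() % N_REALS;
    g->large[i] += (rand01() - 0.5) * 200.0;       /* plus/minus up to 100 */
    if (g->large[i] < 0.0)    g->large[i] = 0.0;
    if (g->large[i] > 1000.0) g->large[i] = 1000.0;
}

void crossover1(const Genotype1 *a, const Genotype1 *b, Genotype1 *child)
{
    /* Whole parameters come from one parent or the other; no new values appear. */
    for (int i = 0; i < N_REALS; i++) child->small[i]  = (rand() % 2) ? a->small[i]  : b->small[i];
    for (int i = 0; i < N_REALS; i++) child->large[i]  = (rand() % 2) ? a->large[i]  : b->large[i];
    for (int i = 0; i < N_INTS;  i++) child->counts[i] = (rand() % 2) ? a->counts[i] : b->counts[i];
}

/* Approach 2: all parameters packed back-to-back on one bitstring. */
typedef struct { char bits[N_BITS]; } Genotype2;   /* each entry is 0 or 1 */

void randomise2(Genotype2 *g)
{
    for (int i = 0; i < N_BITS; i++) g->bits[i] = rand() % 2;
}

void mutate2(Genotype2 *g)
{
    /* Same flip rate across the whole genotype, whatever parameter a bit belongs to. */
    for (int i = 0; i < N_BITS; i++)
        if (rand01() < 0.01) g->bits[i] = 1 - g->bits[i];
}

void crossover2(const Genotype2 *a, const Genotype2 *b, Genotype2 *child)
{
    /* A single cut point, which may fall in the middle of a parameter's bits. */
    int cut = rand() % N_BITS;
    for (int i = 0; i < N_BITS; i++)
        child->bits[i] = (i < cut) ? a->bits[i] : b->bits[i];
}

int main(void)
{
    Genotype1 a, b, child;
    randomise1(&a); randomise1(&b);
    mutate1(&a);
    crossover1(&a, &b, &child);

    Genotype2 p, q, kid;
    randomise2(&p); randomise2(&q);
    mutate2(&p);
    crossover2(&p, &q, &kid);
    return 0;
}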


Either approach has something to recommend it.
People evolve many different data structures.


