Genetic Algorithms: 4.1 Quick Overview
4. Genetic Algorithms
4.1 Quick Overview
In 1960 the first serious investigation into Genetic algorithms (GAs) was undertaken by
John Holland [6]. Genetic algorithms attempt to mimic the properties of natural selection
and natural genetics. They consider a population of potential solutions and then use a
defined survival of the fittest technique to produce a new generation of solutions which are
hopefully better approximations to the ideal solution. This process of breeding fitter
generations comprises the essence of the genetic algorithm approach.
Genetic Algorithms have become popular due to several factors. They have been
successfully applied to self-adaptive control systems and to function optimisation problems.
They are a powerful search technique, yet are computationally simple. The search method
they use is robust because, unlike other search methods, it is not limited by restrictive
assumptions about the search space. These factors and the relative benefits of GAs will be
investigated in the following sections. The theory and functionality of GAs are explained in
this report, while the methods of implementation can be found in the well-outlined Genetic
Algorithm Toolbox [7].
4.1.1 Why Genetic Algorithms?
It is helpful to briefly consider the other methods currently known for control optimisation
so as to compare the respective advantages and disadvantages of the GA search method.
Search methods can be defined as being calculus, enumerative or random based.
The calculus-based methods can either be direct or indirect. They are local in scope, that is,
they both seek local optima by solving the derivative of a set of non-linear equations.
Enumerative search methods look at objective function values for every point of the search
space, one at a time. Although this is a simple approach it is highly inefficient for a large
search space. Random search methods tend to explore regions of the search space more
quickly yet tend to perform no more efficiently than enumerative schemes. All of these
search techniques discussed lack a robustness and reliability that GAs can provide. It is
informative to look at the properties of GAs as well as simultaneously comparing how they
differ from traditional search methods [6].
Populations
GAs operate in cycles called generations, and operate on a population of potential solutions.
A population will typically have about 30-100 individuals, and each individual, or
chromosome, is an encoded potential solution.
There are a few ways of encoding these chromosomes such as integer, real-valued and ternary
but one of the most popular ways is binary encoding (bit string), because it is a simpler string
to operate on. To create an initial population at random consisting of NIND individuals whose
chromosomes are LIND bits long the Matlab function crtbp.m will be used. The number of
random numbers that are generated is NIND x LIND and each number will be from the binary set
{0,1}.
That is, for binary encoding each chromosome is constructed by stringing binary
representations of vector components end to end (see Figure 4.2 below). The length of each
chromosome depends on the vector dimension and the desired accuracy. [5]
1011010110101110
Figure 4.2 A sample binary encoded chromosome consisting of a string of 16 bits
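The random initialisation described above can be sketched in a few lines of Python. This is
only an illustrative analogue, not the Toolbox routine crtbp.m: the function name
create_binary_population and its parameters are assumptions made for this example, and
chromosomes are stored as plain lists of bits.

```python
import random


def create_binary_population(nind, lind, rng=random):
    """Return nind chromosomes, each a list of lind random bits from {0, 1}.

    Hypothetical helper for illustration; nind * lind random bits are drawn.
    """
    return [[rng.randint(0, 1) for _ in range(lind)] for _ in range(nind)]


# Example: 4 individuals with 16-bit chromosomes (seeded for repeatability).
population = create_binary_population(nind=4, lind=16, rng=random.Random(1))
for chromosome in population:
    print("".join(str(bit) for bit in chromosome))
```

Each printed line is one chromosome of the same form as the sample in Figure 4.2.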
Merely observing the above encoding is not helpful in gaining information about the
problem to be solved. It is only when the chromosome values are mapped to the decision
variable domain that helpful information is obtained. The search process operates on the
chromosome encoding and not the decision variables.
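As an illustration of mapping a chromosome to the decision variable domain, the sketch
below decodes a bit string into a real value. It assumes standard (non-Gray) binary coding
and a linear mapping onto an interval [lower, upper]; the function name decode and the
interval are assumptions for this example, not details given in the report.

```python
def decode(bits, lower, upper):
    """Map a bit string linearly onto the interval [lower, upper].

    Assumes plain binary coding: all zeros maps to lower, all ones to upper.
    """
    value = int(bits, 2)              # unsigned integer value of the bit string
    max_value = 2 ** len(bits) - 1    # largest integer representable
    return lower + (upper - lower) * value / max_value


chromosome = "1011010110101110"       # the sample string from Figure 4.2
print(int(chromosome, 2))             # integer value of the chromosome
print(decode(chromosome, -1.0, 1.0))  # decision variable in [-1, 1]
```

Under these assumptions the resolution of the decoded variable is
(upper - lower) / (2^n - 1) for an n-bit string, which is why the chromosome length depends
on the desired accuracy.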
F (x) = g( f (x))
To obtain an expression for the individual fitness F(x_i), the performance of each
individual, given by its objective function value f(x_i), is expressed relative to the whole
population of NIND individuals:
$$F(x_i) = \frac{f(x_i)}{\sum_{i=1}^{N_{IND}} f(x_i)}$$
Fitness values can be normalised, scaled, shared or left unchanged. It is often necessary to
scale and offset the objective function values before determining fitness, so that negative
fitness values are avoided. The following linear transformation can do this, where
a is a positive scaling value, and b is an offset to make fitness values non-negative.
F (x) = af (x) + b
Further mathematical descriptions of the fitness function are given in Section 4.2.7.
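The two fitness expressions above can be checked numerically with a short Python sketch.
The function names and the sample objective values are assumptions chosen for illustration
only.

```python
def proportional_fitness(objective_values):
    """F(x_i) = f(x_i) / sum of f over the whole population."""
    total = sum(objective_values)
    return [f / total for f in objective_values]


def linear_scaling(objective_values, a, b):
    """F(x) = a * f(x) + b, with b chosen so that no fitness is negative."""
    return [a * f + b for f in objective_values]


print(proportional_fitness([1, 1, 2, 4]))       # relative fitness values sum to 1
print(linear_scaling([-2, 0, 3], a=1.0, b=2.0)) # offset removes the negative value
```

Proportional fitness only makes sense when the objective values are non-negative and not
all zero, which is one reason the offset b is applied first when negative values can occur.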
Before reproduction can take place, a process of selection must be used to decide the
number of times a particular individual will be chosen for reproduction. While these
selection algorithms are detailed elsewhere ([6], [7]), it will suffice to mention here that the
selection algorithm should possess the following properties:
i. Zero bias (an indication of accuracy): individuals should be selected entirely on their
position in the population.
ii. Minimum spread (an indication of consistency): individual i should be selected a
minimum of floor(et(i)) times and a maximum of ceiling(et(i)) times, where et(i) is the
expected number of trials of individual i.
The stochastic universal sampling selection fulfils these conditions of zero bias and
minimum spread. The Matlab stochastic universal sampling function sus.m will be used in
later examples. Once a string has been selected using the criterion of the fitness function an
exact replica is made using the Matlab replication function rep.m. This is repeated for all
the selected strings and then these copies or replicas undergo crossover operations at
random. The crossover operator is explained in the next section.
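A minimal Python sketch of stochastic universal sampling is given here to make the
minimum-spread property concrete. It is not the Toolbox function sus.m; the function name,
its arguments and the sample fitness values are assumptions for illustration. A single
random offset is drawn, and equally spaced pointers are then laid over the cumulative
fitness.

```python
import random


def stochastic_universal_sampling(fitness, n_select, rng=random):
    """Return n_select indices chosen with equally spaced pointers.

    fitness must hold non-negative values with a positive sum.
    """
    step = sum(fitness) / n_select       # distance between pointers
    pointer = rng.random() * step        # single random start in [0, step)
    chosen = []
    i = 0
    cumulative = fitness[0]
    for _ in range(n_select):
        # Advance to the individual whose fitness segment contains the pointer.
        while cumulative <= pointer and i < len(fitness) - 1:
            i += 1
            cumulative += fitness[i]
        chosen.append(i)
        pointer += step
    return chosen


fitness = [1, 1, 2, 4]                   # expected trials: 0.5, 0.5, 1, 2
picks = stochastic_universal_sampling(fitness, 4, random.Random(0))
print(picks)
```

With these values the individual with fitness 4 is expected to receive exactly two trials
and the individual with fitness 2 exactly one, and the sketch always selects them that many
times whatever the random offset. The returned indices are in population order, so they
would normally be shuffled before pairing for crossover.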
Crossover
Crossover, also known as recombination, is the operator that produces new chromosomes.
Crossover produces new individuals that have parts of both parents' genetic material, akin to
nature. Different types of crossover are possible, such as multi-point, uniform and shuffle,
but the simplest of all is single-point crossover. An example of single-point crossover can
be seen in Figure 4.3. The cross site is chosen at random and each pair of strings
undergoes crossover as follows:
[Figure 4.3: single-point crossover, showing a pair of strings before crossover and after crossover]
Through this process of reproduction and crossover the GA can exploit the knowledge of the
gene pool by allowing good chromosomes to combine with not so good chromosomes. Note
that the crossover operator is applied with a chosen probability, which means that not every
chromosome in the population necessarily undergoes crossover. Single-point crossover will be
implemented using the Matlab function xovsp.m.
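Single-point crossover applied with a chosen probability can be sketched in Python as
follows. This is an illustrative analogue rather than the Toolbox function xovsp.m; the
function names, the pairing of consecutive individuals and the parameter px (crossover
probability) are assumptions made for this example.

```python
import random


def single_point_crossover(parent_a, parent_b, site):
    """Swap the tails of two equal-length strings after position site."""
    child_a = parent_a[:site] + parent_b[site:]
    child_b = parent_b[:site] + parent_a[site:]
    return child_a, child_b


def crossover_population(parents, px, rng=random):
    """Cross consecutive pairs with probability px; others are copied unchanged."""
    offspring = list(parents)
    for k in range(0, len(parents) - 1, 2):
        if rng.random() < px:
            # The cross site lies strictly inside the string.
            site = rng.randint(1, len(parents[k]) - 1)
            offspring[k], offspring[k + 1] = single_point_crossover(
                parents[k], parents[k + 1], site
            )
    return offspring


print(single_point_crossover("11111111", "00000000", 3))
```

With a cross site of 3, the first three bits of each child come from one parent and the
remaining five from the other.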
Mutation
It is helpful to consider the phenomenon of mutation in the natural world before seeing how
it is helpful as an operator in GAs.
The difference with GAs is that a mutation only ever causes a single bit to change its state.
Therefore a 1 chosen at random will be changed to a 0 and vice versa. Bits are
neither added nor removed, just flipped. An example is given in Figure 4.4.
Mutations are needed in GAs because, while reproduction and crossover explore the search
space well, occasionally they lose information that is essential to the solution. Mutation
guards against this loss by occasionally reintroducing bit values that have disappeared from
the population. Mutation is applied randomly across the entire population and with low
probability (typically between 0.001 and 0.01). It can be thought of as a secondary
background operator. [7]
Original string (before mutation): 10110101
Mutated string (after mutation):   10100101
Figure 4.4 A sample mutation; the mutation point is the fourth bit from the left
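The bit flip in this example can be reproduced with a short Python sketch. The function
names are assumptions for illustration, and applying the mutation probability pm
independently to every bit is one common convention that is assumed here rather than taken
from the report.

```python
import random


def flip_bit(bits, position):
    """Return bits with the bit at the given 0-based position inverted."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


def mutate(bits, pm, rng=random):
    """Flip each bit independently with probability pm (assumed convention)."""
    result = bits
    for j in range(len(bits)):
        if rng.random() < pm:
            result = flip_bit(result, j)
    return result


print(flip_bit("10110101", 3))   # fourth bit from the left is inverted
print(mutate("10110101", pm=0.01, rng=random.Random(0)))
```

The string length never changes, since bits are only switched and never added or removed.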
To keep the size of successive populations the same as the original population in the process
outlined above, the new individuals need to be reinserted into the original population. This
leads to the problem of deciding which individuals should be replaced. In this series of tests
the oldest members will be replaced first. Given a sufficient number of generational cycles,
every individual in the population will therefore eventually be replaced.
Reinsertion is performed using the Matlab function reins.m.
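The replace-the-oldest policy can be sketched in Python as below. This is not the Toolbox
function reins.m; the function name and the explicit list of ages (generations survived)
are assumptions introduced only to illustrate the idea.

```python
def reinsert_oldest_first(population, ages, offspring):
    """Replace the oldest individuals with the offspring, keeping the size fixed.

    ages[i] is the number of generations individual i has survived.
    Returns the new population and the updated ages.
    """
    # Indices of current members, oldest first.
    oldest_first = sorted(range(len(population)), key=lambda i: ages[i], reverse=True)
    new_population = list(population)
    new_ages = [age + 1 for age in ages]   # survivors grow one generation older
    for index, child in zip(oldest_first, offspring):
        new_population[index] = child
        new_ages[index] = 0                # newly inserted individuals start at age 0
    return new_population, new_ages


population, ages = reinsert_oldest_first(["a", "b", "c", "d"], [3, 0, 5, 1], ["x", "y"])
print(population, ages)
```

Here the two offspring take the places of the individuals with ages 5 and 3, and the
population size stays at four.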
In determining how many cycles of the GA should take place, any of the following stopping
criteria can be employed.
A.
B.
C.
acceptable solutions to the problem at hand, the GA may simply be run again, or be continued
with extended design criteria. The GA allows the user to pick from a number of potential
solutions; it does not just yield one solution. It is good at simultaneously identifying these
alternative solutions for a problem that by nature has multiple solutions.
Table 4.1 gives a summary of the terms commonly used in the field of genetic algorithms.
The corresponding terms that are more commonly used by biologists are also included.
Genetic Algorithm                                          Natural
String                                                     Chromosome
Feature, character, or detector                            Gene
Feature value                                              Allele
String position                                            Locus
Structure                                                  Genotype
Parameter set, alternative solution, decoded structure     Phenotype
Non-linearity                                              Epistasis
[Flowchart of the genetic algorithm. Box labels: population is initialised; GENERATIONAL LOOP: crossover takes place; mutation operator used; objective function evaluates new generation; new individuals placed into population; count of generations is increased]
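The generational loop can be sketched end to end in Python. This is not the SGA listed in
Appendix 3. It is a minimal illustration under stated assumptions: a hypothetical objective
that counts 1 bits, stochastic universal sampling for selection, single-point crossover
with probability px, bit-flip mutation with probability pm per bit, replacement of the
whole population each generation, and a fixed number of generations as the stopping rule.

```python
import random


def count_ones(chromosome):
    """Hypothetical objective to maximise: the number of 1 bits."""
    return sum(chromosome)


def sus(fitness, n_select, rng):
    """Stochastic universal sampling with equally spaced pointers."""
    step = sum(fitness) / n_select
    pointer = rng.random() * step
    chosen, i, cumulative = [], 0, fitness[0]
    for _ in range(n_select):
        while cumulative <= pointer and i < len(fitness) - 1:
            i += 1
            cumulative += fitness[i]
        chosen.append(i)
        pointer += step
    return chosen


def run_sga(nind=40, lind=20, maxgen=60, px=0.7, pm=0.01, seed=0):
    rng = random.Random(seed)
    # Population is initialised at random.
    pop = [[rng.randint(0, 1) for _ in range(lind)] for _ in range(nind)]
    best_per_generation = []
    for _ in range(maxgen):                    # count of generations is increased
        objective = [count_ones(c) for c in pop]
        best_per_generation.append(max(objective))
        fitness = [f + 1 for f in objective]   # F = a*f + b with a = 1, b = 1
        parents = [pop[i][:] for i in sus(fitness, nind, rng)]
        rng.shuffle(parents)                   # random pairing for crossover
        for k in range(0, nind - 1, 2):        # crossover takes place
            if rng.random() < px:
                site = rng.randint(1, lind - 1)
                a, b = parents[k], parents[k + 1]
                parents[k], parents[k + 1] = a[:site] + b[site:], b[:site] + a[site:]
        for chromosome in parents:             # mutation operator used
            for j in range(lind):
                if rng.random() < pm:
                    chromosome[j] = 1 - chromosome[j]
        pop = parents                          # new individuals placed into population
    return pop, best_per_generation


final_population, best = run_sga()
print(best[0], best[-1])
```

The list best records the best objective value found in each generation, which is the kind
of quantity that would be plotted against the generation count.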
The plot that results from the SGA in Appendix 3 is shown below in Figure 4.6.
Figure 4.6