SoftComputing Unit4 GeneticAlgorithms
LAMARCKISM
In olden days, people believed that the organisms on the earth had not undergone any change. Jean-Baptiste Lamarck was the first person to propose a theory of evolution. He thought that at some point in history the giraffe was the size of a deer. Due to a shortage of food on the ground, giraffes began stretching their necks to reach the lower branches of trees. Because of this continuous stretching of the neck, after several generations giraffes developed long necks. Characters that develop during the lifetime of an organism are called 'acquired characters'. Lamarck proposed that these acquired characters are passed on to the offspring, i.e. to the next generation, and called this the theory of 'inheritance of acquired characters'. The elongation of the neck and forelimbs in the giraffe is his standard example.
AUGUST WEISMANN
But August Weismann tested this theory with an experiment on rats. He removed the tails of parental rats and observed that their offspring were nevertheless born with normal tails. He repeated this for twenty-two generations, and the offspring were still born with normal tails. He thus showed that bodily changes acquired during life are not inherited, so they are not passed on to the offspring.
Darwinism
Charles Darwin (1809-1882), born in England, proposed 'natural selection', the famous theory of evolution. When he was 22 years old he set out on a five-year voyage aboard the world survey ship HMS Beagle. He visited a number of places, including the Galapagos Islands, keenly observed the flora and fauna of these places, and gathered a great deal of information and evidence. In the Galapagos Islands, Darwin observed a small group of related birds, the finches, exhibiting diversity in structure. Observe fig-12: how do their beaks help them? He was influenced by the book 'Principles of Geology' written by Sir Charles Lyell, who suggested that geological changes occur at a uniform rate. Darwin did not agree with this idea; he felt that large changes occur through the accumulation of small changes.
Darwin was also influenced by the famous theory of Malthus, written in 'An Essay on the Principle of Population'. Malthus observed that population grows in geometric progression (1, 2, 4, 8, ...) whereas food sources increase in arithmetic progression (1, 2, 3, 4, 5, ...). Based on these ideas, Darwin proposed the theory of 'natural selection', which means that nature selects, or decides, which organisms survive or perish. This is the meaning of survival of the fittest: organisms with useful traits survive, while those whose traits are not useful are eliminated from the environment. Alfred Russel Wallace independently came to the same conclusion, that natural selection contributes to the origin of new species.
For example, consider red beetles that are seen and eaten by crows: the population of red beetles gradually gets eliminated from its environment. At the same time, beetles that are green in colour and sit on green leaves are not noticed by the crows, so the green beetles survive and their population gradually increases. This is nothing but 'natural selection': variations that are useful to an individual are retained, while those that are not useful are lost. In a population there is a struggle for existence, and the 'fittest' survive; nature favours only useful variations. Each species tends to produce a large number of offspring, which compete with each other for food, space, mating and other needs. In this struggle for existence only the fittest can survive; this is called 'survival of the fittest'. Over a long period of time this leads to the formation of new species. You may observe in your surroundings that only some seedlings and some young animals survive.
Situation-1:
In this situation a colour variation arises during reproduction, so one beetle appears that is green instead of red. Moreover, this green beetle passes its colour to its offspring (progeny), so all its progeny are green. Crows cannot see the green beetles on the green leaves of the bushes and therefore cannot eat them, but they can see and eat the red beetles. As a result there are more and more green beetles, while the red ones decrease in number. The colour variation 'green' gave a survival advantage to the green beetles over the red beetles; in other words, it was naturally selected, and we can see that the natural selection was exerted by the crows. The more crows there are, the more red beetles are eaten and the more green beetles the population contains. Thus natural selection is directing evolution in the beetle population; it results in the beetle population adapting to fit its environment better. Let us think of another situation.
Situation-2:
In this situation a colour variation again occurs during reproduction, but now it results in blue beetles instead of red ones. The blue beetle passes its colour to its progeny, so all its progeny are blue. Crows, however, can see blue beetles on the green leaves of the bushes just as well as red ones, and therefore can eat both. In this case there is no survival advantage for blue beetles of the kind we saw for green beetles. Initially there are a few blue beetles in the population, but most are red. Imagine that at this point an elephant comes by and stamps on the bushes where the beetles live, killing most of them. By chance, the few surviving beetles are mostly blue. The beetle population slowly increases again, but now most of the beetles are blue. Thus accidents may also result in changes in certain characters of a population. Characters, as we know, are governed by genes, so there is a change in the frequency of genes in small populations. This is known as 'genetic drift', which provides diversity in the population.
In another case, the beetle population is increasing, but suddenly the bushes are affected by a plant disease that destroys leaf material. With the leaves affected, the beetles get less food and are poorly nourished, so the weight of the beetles decreases, but no change takes place in their genetic material (DNA). After a few years the plant disease is eliminated and the bushes are healthy again, with plenty of leaves. What do you think the condition of the beetles will then be?
We inherit our traits from our parents. Let us see how sex is determined in human beings. Each human cell contains 23 pairs (46) of chromosomes. Of these 23 pairs, 22 pairs are autosomes: chromosomes whose number and morphology do not differ between males and females of a species. The remaining pair is called the allosomes, or sex chromosomes. These are of two types, 'X' and 'Y', and they determine the sex of an individual. Females have two X chromosomes in their cells (XX); males have one X and one Y chromosome in their cells (XY). All the gametes (ova) produced by a woman carry an X chromosome. The gametes (sperm) produced by a man are of two types: one with an X chromosome and the other with a Y chromosome. If a sperm carrying a Y chromosome fertilizes the ovum (which carries an X chromosome), the baby has the XY condition and will be a boy.
[Figure: inheritance of sex chromosomes. Father (44+XY) produces two kinds of sperm, 22+X (gynosperm) and 22+Y (androsperm); mother (44+XX) produces eggs that are all 22+X. The offspring are either 44+XX (baby girl) or 44+XY (baby boy).]
What will
happen if the sperm containing X chromosomes fertilizes the ovum? Who decides the sex of the
baby, the mother or the father? Is sex also a character or trait? Does it follow Mendel's law of dominance? Were all your traits similar to those of your parents?
Background:
The Ant Colony Optimization (ACO) technique, first introduced by Marco Dorigo in the 1990s, is inspired by the foraging behaviour of ant colonies. Ants are eusocial insects that favour community survival over individual survival. They communicate with each other using sound, touch and pheromones. Pheromones are organic chemical compounds secreted by ants that trigger a social response in members of the same species; they act like hormones outside the body of the secreting individual, influencing the behaviour of the receiving individuals. Since most ants live on the ground, they use the soil surface to leave pheromone trails that can be followed (smelled) by other ants. Ants live in community nests, and the underlying principle of ACO is to observe the movement of ants from their nest as they search for food along the shortest possible path. Initially, ants move randomly in search of food around their nest. This randomized search opens up multiple routes from the nest to the food source. Then, based on the quality and quantity of the food found, each ant carries a portion of food back, depositing pheromone on its return path. These pheromone trails guide the following ants: the probability that an ant selects a specific path depends on the pheromone concentration along the path as well as on the pheromone's rate of evaporation. Since the evaporation rate is a deciding factor, the length of each path is automatically accounted for.
In the above figure, for simplicity, only two possible paths have been considered between the food
source and the ant nest. The stages can be analyzed as follows:
1. Stage 1: All ants are in their nest. There is no pheromone content in the environment. (For
algorithmic design, residual pheromone amount can be considered without interfering with the
probability)
2. Stage 2: Ants begin their search with equal probability (0.5 each) along each path. Clearly, the curved path is the longer one, and hence the time taken by its ants to reach the food source is greater.
3. Stage 3: The ants taking the shorter path reach the food source earlier. On their way back they face a similar selection dilemma, but because a pheromone trail is already available along the shorter path, its probability of selection is higher.
4. Stage 4: More ants return via the shorter path and subsequently the pheromone concentrations
also increase. Moreover, due to evaporation, the pheromone concentration in the longer path
reduces, decreasing the probability of selection of this path in further stages. Therefore, the whole
colony gradually uses the shorter path in higher probabilities. So, path optimization is attained.
Algorithmic Design:
Pertaining to the above behaviour of the ants, an algorithmic design can now be developed. For
simplicity, a single food source and single ant colony have been considered with just two paths of
possible traversal. The whole scenario can be realized through a weighted graph in which the ant colony and the food source act as vertices (nodes), the paths serve as the edges, and the pheromone values are the weights associated with the edges. Let the graph be G = (V, E), where V is the set of vertices and E the set of edges. The vertices under consideration are Vs (the source vertex, the ant colony) and Vd (the destination vertex, the food source). The two edges are E1 and E2, with lengths L1 and L2 assigned to them. The associated pheromone values (indicative of trail strength) can be assumed to be R1 and R2 for edges E1 and E2 respectively. Thus, for each ant, the starting probability of selecting a path (between E1 and E2) can be expressed as follows:
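The probability expression itself did not survive extraction; in the standard two-path ACO formulation it presumably reads

P_i = R_i / (R_1 + R_2), for i = 1, 2.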
Evidently, if R1 > R2, the probability of choosing E1 is higher, and vice versa. While returning through the chosen shorter path, say Ei, the ant updates the pheromone value of the corresponding path. The update is based on the length of the path as well as on the evaporation rate of the pheromone, and can be realized step-wise as follows:
1. In accordance to path length –
In the above update, i = 1, 2 and 'K' serves as a parameter of the model. The update depends on the length of the path: the shorter the path, the more pheromone is added.
2. In accordance to evaporation rate of pheromone –
The parameter 'v' belongs to the interval (0, 1] and regulates the pheromone evaporation; again i = 1, 2. At each iteration, all ants are placed at the source vertex Vs (ant colony). Subsequently, the ants move from Vs to Vd (food source) following step 1. Next, all ants conduct their return trip and reinforce their chosen path based on step 2.
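The two update formulas did not survive extraction; in the standard ACO model they presumably take the form

R_i <- R_i + K / L_i (deposit, in accordance with path length), and
R_i <- (1 - v) * R_i (evaporation),

for i = 1, 2.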
Pseudocode:
Procedure AntColonyOptimization:
    Initialize necessary parameters and pheromone trails;
    while not termination do:
        Generate ant population;
        Calculate fitness values associated with each ant;
        Find best solution through selection methods;
        Update pheromone trails;
    end while
end procedure
The pheromone update and the fitness calculations in the above pseudocode follow the step-wise formulas given earlier. This establishes the introduction of the ACO technique; its application extends to various problems such as the famous TSP (Travelling Salesman Problem).
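The two-path model above can be sketched in code. The parameter values (path lengths, K, v, number of ants) are illustrative assumptions, and, for reproducibility, this sketch routes the expected fraction of ants down each path instead of sampling each ant at random:

```python
# Toy two-path ACO: pheromone R_i per path, selection probability
# R_i/(R_1+R_2), deposit K/L_i per returning ant, evaporation rate v.
def two_path_aco(L=(1.0, 2.0), K=1.0, v=0.3, n_ants=20, n_iter=50):
    R = [1e-3, 1e-3]                      # small residual pheromone (Stage 1)
    for _ in range(n_iter):
        p0 = R[0] / (R[0] + R[1])         # selection probability for path 0
        flows = [n_ants * p0, n_ants * (1 - p0)]
        for i in range(2):
            R[i] = (1 - v) * R[i]         # evaporation
            R[i] += flows[i] * K / L[i]   # deposit, inversely prop. to length
    return R

R = two_path_aco()
print(R[0] > R[1])  # True: the shorter path accumulates more pheromone
```

After a few iterations the positive feedback concentrates the colony on the shorter path, exactly as in Stage 4 above.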
Introduction to Genetic Algorithm
Debasis Samanta
26.02.2016
Limitations:
Computationally expensive.
For a discontinuous objective function, methods may fail.
Method may not be suitable for parallel computing.
Discrete (integer) variables are difficult to handle.
Methods are not necessarily adaptive.
Biologic behaviors:
Genetics and Evolution –> Genetic Algorithms (GA)
Behavior of ant colony –> Ant Colony Optimization (ACO)
Human nervous system –> Artificial Neural Network (ANN)
GA was first introduced by Prof. John Holland (University of Michigan, USA, 1965).
But the first article on GA was published in 1975.
[Figure: biological background of GA. Chromosomes from two gametes combine in the organism's diploid cell, so information from two different organisms' body cells is combined and diversity in information becomes possible; random crossover points during cell division make (almost) infinite diversities, and mutation adds further variation.]
Definition of GA:
A genetic algorithm is a population-based probabilistic search and optimization technique, which works based on the mechanisms of natural genetics and natural evolution.
[Flowchart: Start -> Initial Population -> Converge? If no: Selection -> Reproduction -> back to the convergence test; if yes: Stop. Note: an individual in the population corresponds to a possible solution.]
Note:
1 GA is an iterative process.
2 It is a searching technique.
3 The working cycle runs with or without convergence.
4 A solution is not necessarily guaranteed; usually the algorithm is terminated with a local optimum.
[Flowchart of the GA working cycle: Start -> define parameters and their representation -> create and initialize the population -> apply the cost function to each member of the population and evaluate fitness -> Converge? If no: selection -> select mates -> reproduction (crossover, mutation, inversion) -> back to fitness evaluation; if yes: Stop.]
A GA formulation involves: objective function(s), constraint(s), input parameters, encoding and decoding.
[Flowchart of the simple GA (SGA): Start -> Convergence criteria met? If yes: return the individual(s) with the best fitness value and stop; if no: select Np individuals (with repetition), perform crossover, mutation and inversion on the offspring, and repeat.]
SGA Parameters
Convergence threshold δ
Mutation probability µ
Inversion probability η
Crossover probability ρ
Simple GA features:
Computationally expensive.
[Flowchart fragment: crossover -> mutation -> inversion -> convergence test -> stop.]
SGA Features:
It is applicable when
Limitations in SSGA:
Debasis Samanta
01.03.2016
1 Encoding
2 Convergence test
3 Mating pool
4 Fitness Evaluation
5 Crossover
6 Mutation
7 Inversion
Different GAs
Encoding Schemes
Binary encoding
Real value encoding
Order encoding
Tree encoding
For example:
Encoding Scheme
Tree encoding
1 Individual
2 Population
Genotype
Phenotype
[Figure: a chromosome is a string of genes whose values may be symbols such as a, b, c, digits 1, 0, 2, 9, 6, 7, or characters $, α, β, ...]
Chromosome
Debasis Samanta (IIT Kharagpur) Soft Computing Applications 01.03.2016 7 / 42
Individual Representation :Phenotype and
Genotype
Note :
A: 0 1 1 0 0 1 0 1 0 1 0 1 0 1 1 1 1 0 Individual 1
B: 0 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 0 Individual 2
There are n items; each item i has its own cost (ci) and weight (wi).
Maximize:
Σi ci xi
Subject to:
Σi wi xi ≤ W
where xi ∈ {0, 1}.
The encoding for the 0-1 knapsack problem, in general for an n-item set, looks as follows.
Genotype:
1 2 3 4 ..... n-1 n
Phenotype:
0 1 0 1 1 0 1 0 1 0 1 0 1 . . . 1 0 1
A binary string of n bits
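Decoding and evaluating such a phenotype can be sketched as follows; the item costs, weights and capacity below are made-up illustrative values, not data from the text:

```python
# Fitness of a binary-encoded 0-1 knapsack chromosome.
costs   = [10, 5, 15, 7, 6]   # c_i (illustrative)
weights = [ 2, 3,  5, 7, 1]   # w_i (illustrative)
W = 10                        # knapsack capacity

def fitness(chromosome):
    """Total cost of the packed items; infeasible packings score 0."""
    total_c = sum(c for c, x in zip(costs, chromosome) if x)
    total_w = sum(w for w, x in zip(weights, chromosome) if x)
    return total_c if total_w <= W else 0

print(fitness([1, 0, 1, 0, 1]))  # weight 2+5+1 = 8 <= 10, cost 10+15+6 = 31
print(fitness([1, 1, 1, 1, 1]))  # weight 18 > 10, so fitness 0
```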
Genotype :
Phenotype :
01101
A binary string of 5-bits
Genotype :
x y
Phenotype :
01101 11001
Two binary string of 5-bits each
Pros and cons of the binary encoding scheme
Limitations:
1 It needs an effort to convert values into binary form.
2 Accuracy depends on the binary representation (the number of bits).
Advantages:
1 Since operations on binary representations are fast, it provides a faster implementation of all GA operators and hence of the execution of GAs.
2 Any optimization problem has a binary-coded GA implementation.
Genotype :
x y
Phenotype :
5.28 -475.36
Real-value representation
where XL ≤ x ≤ XU. Equivalently, with an n-bit representation the obtainable accuracy (precision) is
ε = (XU − XL) / 2^n
Note: if ε = 0.5, then 4.05 or 4.49 ≡ 4.0 and 4.50 or 4.99 ≡ 4.5, and so on.
1 Example 1:
1 ≤ x ≤ 16, n = 6. What is the accuracy?
ε = (16 − 1)/2^6 = 15/64 ≈ 0.234 ≈ 0.25
2 Example 2:
What is the obtainable accuracy of the binary representation of a variable X in the range 20.1 ≤ X ≤ 45.6 with 8 bits?
3 Example 3:
In the above case, what is the binary representation of X = 34.35?
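The accuracy formula and the genotype-to-phenotype decoding can be sketched as below; the helper names are our own, and `decode` uses the common 2^n − 1 scaling so that both range endpoints are reachable:

```python
def precision(xl, xu, n):
    """Obtainable accuracy of an n-bit encoding of the range [xl, xu]."""
    return (xu - xl) / 2**n

def decode(bits, xl, xu):
    """Map an n-bit string to a real value in [xl, xu]."""
    k = int(bits, 2)                       # integer value of the bit string
    return xl + k * (xu - xl) / (2**len(bits) - 1)

print(round(precision(1, 16, 6), 3))       # Example 1: 0.234
print(round(precision(20.1, 45.6, 8), 3))  # Example 2: about 0.1
print(decode("01101", 0, 31))              # '01101' = 13 -> 13.0
```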
Example:
TSP
- Visit all the cities
- Visit each city exactly once
- The starting and ending city are the same
[Figure: a five-city TSP instance on cities A-E with a symmetric distance matrix d (zero diagonal); recoverable entries include d(A,B) = 2, d(A,D) = 6, d(A,E) = 4, d(B,C) = 7, d(C,D) = 3, d(C,E) = 1, d(D,E) = 3.]
Minimize:
cost = Σ_{i=0}^{n−2} d(c_i, c_{i+1}) + d(c_{n−1}, c_0)
Subject to:
P = [c_0, c_1, c_2, ..., c_{n−1}, c_0], where c_i ∈ X and c_i ≠ c_j for all i, j = 0, 1, ..., n − 1 with i ≠ j.
Note: P is an ordered collection of cities, representing a possible tour with starting city c_0,
and
X = {x1, x2, ..., xn} is the set of n cities, and
d(xi, xj) is the distance between any two cities xi and xj.
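The tour cost above can be sketched directly; the small distance matrix here is illustrative rather than the (partially unrecoverable) one from the figure:

```python
# Cost of an order-encoded tour: sum of consecutive distances
# plus the closing edge back to the starting city.
d = [
    [0, 2, 9, 6, 4],
    [2, 0, 7, 5, 5],
    [9, 7, 0, 3, 1],
    [6, 5, 3, 0, 3],
    [4, 5, 1, 3, 0],
]

def tour_cost(tour):
    n = len(tour)
    return sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))

print(tour_cost([0, 1, 2, 3, 4]))  # A-B-C-D-E-A: 2+7+3+3+4 = 19
```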
Tree encoding
[Figure: a binary tree on nodes A-G with its in-order (TL R TR), pre-order (R TL TR) and post-order (TL TR R) traversal sequences.]
[Figure: an example floor plan partitioned into blocks C1-C10.]
Each block bi is described by:
1 The width wi and height hi (which are constant for rigid blocks and variable for flexible blocks).
2 ρi, the desirable aspect ratio, where 1/ρi ≤ hi/wi ≤ ρi, and ρi = 1 if the block bi is rigid.
3 ai = wi × hi, the area of each block bi.
The wire length is Wire = f1(B, N).
4 A desirable floor plan aspect ratio ρ such that 1/ρ ≤ W/H ≤ ρ, where H and W are the height and width of the floor plan, respectively. The area is Area = f2(B, N, ρ).
5 Timing information: Delay = f3(B, N, ρ).
Objectives :
We are to find a floor plan, which would
1 Minimize floor plan area.
2 Minimize wire length.
3 Minimize circuit delay.
Tree encoding for Floor planning problem
[Figure: a floor plan with blocks 1-7 and its slicing tree, whose internal nodes are the operators H and V.]
Note 1:
The operators H and V, expressed in polish notation, carry the following meanings:
ij H → block bj is on top of block bi.
ij V → block bi is on the left of block bj.
Note 4:
Post-order traversal of a binary tree is equivalent to polish notation.
[Figure: two expression trees whose post-order traversals give the polish forms a b c ÷ + and a b + c −.]
Note 5:
There is only one way of performing a post-order traversal of a binary tree.
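Note 4 can be sketched in code: a post-order traversal of an expression tree yields its postfix (polish) form. The tree below encodes a + (b ÷ c), the first of the two example expressions (÷ written as /):

```python
# Post-order traversal (TL, TR, R) of a binary expression tree
# produces the polish (postfix) form of the expression.
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def postorder(node):
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.val]

tree = Node('+', Node('a'), Node('/', Node('b'), Node('c')))
print(''.join(postorder(tree)))  # abc/+
```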
[Figure: the floor plan with blocks 1-7 and its slicing tree.]
Polish notation : 2 1 H 6 7 V 4 5 V H 3 H V
[Figure: another floor plan with blocks 1-5 and its binary (slicing) tree.]
Polish notation : 4 5 V 3 H 2 V 1 H
[Figure (exercise): given the polish notation 2 1 H 6 7 V 4 5 V 3 H H V, reconstruct the corresponding slicing tree and floor plan.]
Problem :
Debasis Samanta
08.03.2016
1 Encoding
2 Fitness Evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test
Selection is the process of creating the population for the next generation from the current generation.
To generate new population: Breeding in GA
Select a pair
Reproduce
Selection involves:
Survival of the fittest
Struggle for existence
. . . etc.
[Figure: a TSP graph on cities A-F, and a population of five order-encoded individuals with their tour costs:
P1: C B A D F E (11)
P2: A B D C E F (19)
P3: A C B F E D (16)
P4: F C D B E A (12)
P5: C F D A B E (10)]
Tournament selection
Steady-state selection
Boltzman selection
In an iteration, we calculate fi / f̄ for all individuals in the current population, where f̄ is the average fitness.
population.
Note :
Here, the size of the mating pool is p% × N, for some p.
Convergence rate depends on p.
The wheel area allotted to individual i is larger than that of individual j whenever fi > fj.
The wheel is rotated Np times (where Np = p% × N, for some p), and each time only one area is identified by the pointer as the winner.
Note :
Here, an individual may be selected more than once.
The convergence rate is fast.
[Figure: a roulette wheel whose slices are proportional to pi: 5%, 9%, 13%, 17%, 20%, 8%, 8%, 20%.]
Individual / Fitness value / pi
1 / 1.01 / 0.05
2 / 2.11 / 0.09
3 / 3.11 / 0.13
4 / 4.01 / 0.17
5 / 4.66 / 0.20
6 / 1.91 / 0.08
7 / 1.93 / 0.08
8 / 4.51 / 0.20
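Roulette-wheel selection over this table can be sketched with cumulative sums; the random seed is fixed only for reproducibility:

```python
import random

# Roulette-wheel selection: each individual owns a wheel slice
# proportional to its fitness; the wheel is spun Np times,
# so an individual may be selected more than once.
fitness = [1.01, 2.11, 3.11, 4.01, 4.66, 1.91, 1.93, 4.51]

def roulette(fitness, n_picks, rng):
    total = sum(fitness)
    picks = []
    for _ in range(n_picks):
        r = rng.random() * total          # spin the wheel
        acc = 0.0
        for i, f in enumerate(fitness):
            acc += f
            if r <= acc:
                picks.append(i)
                break
        else:
            picks.append(len(fitness) - 1)  # guard against float rounding
    return picks

pool = roulette(fitness, 8, random.Random(0))
print(pool)   # indices of the selected individuals (repeats allowed)
```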
Example:
[Figure: a worked example listing individuals with selection probabilities pi, cumulative probabilities Pi and random numbers r, with wheel slices of 80%, 10%, 6% and 4%.]
The observation is that individuals with higher fitness values crowd out the others from being selected for mating. This leads to less diversity, hence less scope for exploring alternative solutions, and also to premature (early) convergence on a locally optimal solution.
Rank-based selection
To overcome the problem with Roulette-Wheel selection, a
rank-based selection scheme has been proposed.
The process of ranking selection consists of two steps.
1 Individuals are arranged in an ascending order of their fitness
values. The individual, which has the lowest value of fitness is
assigned rank 1, and other individuals are ranked accordingly.
2 The proportionate based selection scheme is then followed based
on the assigned rank.
Note:
The % area to be occupied by a particular individual i is given by
(ri / Σ_{j=1}^{N} rj) × 100
where ri indicates the rank of the i-th individual.
Two or more individuals with the same fitness values should have
the same rank.
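The rank-to-area computation can be sketched as follows, using the four fitness values of the example that follows:

```python
# Rank-based selection: wheel areas are proportional to rank,
# where the individual with the lowest fitness gets rank 1.
def rank_areas(fitness):
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    rank = [0] * len(fitness)
    for r, i in enumerate(order, start=1):
        rank[i] = r                       # rank 1 = lowest fitness
    total = sum(rank)
    return [100.0 * r / total for r in rank]

print(rank_areas([0.40, 0.05, 0.03, 0.02]))  # [40.0, 30.0, 20.0, 10.0]
```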
Rank-based selection: Example
Continuing with the population of 4 individuals with fitness values f1 = 0.40, f2 = 0.05, f3 = 0.03 and f4 = 0.02: their proportionate areas under Roulette-Wheel selection are 80%, 10%, 6% and 4%, whereas their ranks (4, 3, 2, 1) give rank-based areas of 40%, 30%, 20% and 10%.
[Figure: the two wheels shown side by side.]
Comparing Rank-based selection with Roulette-Wheel selection
Individual / Roulette-Wheel area / Fitness / Rank / Rank-based area
1 / 80 % / 0.40 / 4 / 40 %
2 / 10 % / 0.05 / 3 / 30 %
3 / 6 % / 0.03 / 2 / 20 %
4 / 4 % / 0.02 / 1 / 10 %
[Figure: the two wheels drawn side by side.]
[Figure: a knock-out tournament bracket among eight teams (India, New Zealand, England, Sri Lanka, S. Africa, Australia, Pakistan, Zimbabwe), illustrating tournament selection.]
N = 8, NU = 2, Np = 8
Input :
Individual 1 2 3 4 5 6 7 8
Fitness 1.0 2.1 3.1 4.0 4.6 1.9 1.8 4.5
Output :
Trial Individuals Selected
1 2, 4 4
2 3, 8 8
3 1, 3 3
4 4, 5 5
5 1, 6 6
6 1, 2 2
7 4, 2 4
8 8, 3 8
If the fitness values of two individuals are the same, then there is a tie in the match! So, what to do?
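Binary tournament selection (NU = 2), as in the trials above, can be sketched as follows; breaking ties in favour of the first contestant, as done here, is one of the possible twists:

```python
import random

# Binary tournament selection: pick NU = 2 individuals at random,
# send the fitter one to the mating pool; repeat Np times.
fitness = [1.0, 2.1, 3.1, 4.0, 4.6, 1.9, 1.8, 4.5]  # individuals 1..8

def tournament(fitness, n_picks, rng):
    pool = []
    for _ in range(n_picks):
        a, b = rng.sample(range(len(fitness)), 2)   # one "match"
        pool.append(a if fitness[a] >= fitness[b] else b)
    return pool

pool = tournament(fitness, 8, random.Random(1))
print([i + 1 for i in pool])   # winners as 1-based individual numbers
```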
Tournament selection
Note :
Different twists can be made to the basic Tournament selection scheme:
Steps :
This completes the selection procedure for one iteration. Repeat the
iteration until the mating pool of desired size is obtained.
Reference:
D. E. Goldberg and K. Deb, "A comparison of selection schemes used in genetic algorithms", Foundations of Genetic Algorithms, Vol. 1, 1991, pp. 69-93.
Web link : K. Deb Website, IIT Kanpur
[Figure: elitism. Elite 1, Elite 2, ..., Elite n move directly to the mating pool; the remaining individuals are selected by any of the schemes discussed earlier.]
Population diversity
Selection pressure
Gp = p/N
Debasis Samanta
11.03.2016
1 Encoding
2 Fitness Evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test
Reproduction:
Crossover
Mutation
Inversion
Real-coded GAs
Tree-coded GAs
Note :
Generally, pc is close to 1.0, so that almost all the parents can participate in reproduction.
Before crossover, a crossover point k is chosen; the two parents exchange the substrings beyond k, and two new offspring are produced:
Offspring 1: 0 1 1 0 1 1 0 0
Offspring 2: 1 0 1 0 0 0 1 0
After Crossover
Before Crossover
Parent 1 : 0 1 1 0 0 0 1 0
Parent 2 : 1 0 1 0 1 1 0 0
Offspring 1: 0 1 1 0 1 0 1 0
Offspring 2: 1 0 1 0 0 1 0 0
After Crossover
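Single-point crossover on bit strings can be sketched as follows; in a real GA the crossover point k is chosen at random, while here it is fixed at 4 for reproducibility:

```python
# Single-point crossover: the offspring swap the parents' tails
# beyond crossover point k.
def single_point_crossover(p1, p2, k):
    return p1[:k] + p2[k:], p2[:k] + p1[k:]

p1 = [0, 1, 1, 0, 0, 0, 1, 0]
p2 = [1, 0, 1, 0, 1, 1, 0, 0]
c1, c2 = single_point_crossover(p1, p2, 4)
print(c1)  # [0, 1, 1, 0, 1, 1, 0, 0]
print(c2)  # [1, 0, 1, 0, 0, 0, 1, 0]
```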
[Figure: multi-point crossover with points k1, k2, k3; alternate segments (Swap 1, Swap 2) are exchanged between Parent 1 and Parent 2 to form Offspring 1 and Offspring 2.]
Before crossover
Parent 1 : 1 1 0 0 0 1 0 1 1 0 0 1
Parent 2 : 0 1 1 0 0 1 1 1 0 1 0 1
Coin tossing: 1 0 0 1 1 1 0 1 1 0 0 1
After crossover
Offspring 1: 1 1 1 0 0 1 1 1 1 1 0 1
Offspring 2:
0 1 0 0 0 1 0 1 0 0 0 1
Where there is a 1 in the mask, the gene is copied from the first parent; where there is a 0, it is copied from the second parent.
Before Crossover
Parent 1 : 1 1 0 0 0 1 0 1
Parent 2 : 0 1 1 0 0 1 1 1
Mask 1 0 0 1 1 1 0 1
After Crossover
Before crossover
Parent 1 : 1 1 0 0 0 0 1 0
Here, Hamming
distance is 4
Parent 2 : 1 0 0 1 1 0 1 1
Tossing: 1 0 1 1
If toss is 1, then swap the
bits else remain as it is
Offspring 1: 1 0 0 0 1 0 1 1
Offspring 2:
1 1 0 1 0 0 1 0
After crossover
P1 : 1 1 0 0 0 1 1 0
P2 : 1 0 0 1 1 0 1 1
K-point
P1' : 0 0 1 0 1 1 0 1
After shuffling bits
P2' : 0 1 1 1 0 1 0 1
Offspring 1: 0 0 1 0 1 1 0 1
Single point
Offspring 2: crossover
0 1 1 1 0 1 0 1
After crossover
[Figure: two-dimensional representation of chromosomes for matrix crossover. P1 and P2 are n × 4 matrices with rows r11 r12 r13 r14, ..., r1n-3 r1n-2 r1n-1 r1n and r21 r22 r23 r24, ..., r2n-3 r2n-2 r2n-1 r2n respectively.]
Each bit of the first parent is compared with the bit of the second
parent.
If both are the same, the bit is taken for the offspring.
Otherwise, the bit from the third parent is taken for the offspring.
P1: 1 1 0 1 0 0 0 1
P2: 0 1 1 0 1 0 0 1
P3: 0 1 1 0 1 1 0 1
C1: 0 1 1 0 1 0 0 1
Note: Sometime, the third parent can be taken as the crossover mask.
1 Non-uniform variation:
It can not combine all possible schemas (i.e. building blocks)
2 Positional bias:
The schemas that can be created or destroyed by a crossover
depends strongly on the location of the bits in the chromosomes.
3 End-point bias:
It is also observed that single-point crossover treats some loci
preferentially, that is, the segments exchanged between the two
parents always contain the end points of the strings.
4 Hamming cliff problem:
A one-bit change can make a large (or a small) jump in the phenotype, while a multi-bit change can make a small (or a large) jump.
For example, 1000 =⇒ 0111 (here the Hamming distance is 4, but the distance between the phenotypes is only 1).
Similarly, 0000 =⇒ 1000 (here the Hamming distance is 1, but the distance between the phenotypes is 8).
Following are the few well known crossover techniques for the
real-coded GAs.
Linear crossover
Blend crossover
Simulated binary crossover
For example:
Suppose P1 and P2 are the values of one parameter in the two parents; then the corresponding offspring values can be obtained as
Ci = αi P1 + βi P2
where i = 1, 2, ..., n (number of children), and αi and βi are some constants.
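This linear combination can be sketched in code. The three coefficient pairs below follow Wright's linear crossover and are an assumption; the text leaves αi and βi as user-chosen constants:

```python
# Linear crossover: each child value is C_i = alpha_i*P1 + beta_i*P2.
# Coefficient pairs follow Wright's linear crossover (an assumption).
COEFFS = [(0.5, 0.5), (1.5, -0.5), (-0.5, 1.5)]

def linear_crossover(p1, p2, coeffs=COEFFS):
    return [a * p1 + b * p2 for a, b in coeffs]

children = linear_crossover(15.65, 18.83)   # parent values from the example
print([round(c, 2) for c in children])      # [17.24, 14.06, 20.42]
```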
Advantages
1 It is simple to calculate and hence fast in computation.
2 It can generate a large set of offspring from two parent values.
3 It allows control over a wide range of variations.
Limitations
1 The values of αi and βi need to be decided beforehand.
2 It is difficult for inexperienced users to decide the right values of αi and βi.
3 If the αi and βi values are not chosen properly, the solution may get stuck in a local optimum.
Example :
P1 = 15.65 and P2 = 18.83
α = 0.5 and γ = 0.6
New offspring
C1=16.60 C2=17.88
α = (C1 − C2) / (P1 − P2)
Here P1 and P2 represent the parent points, and C1 and C2 are the two children solutions.
Probability Distribution:
offspring
C1=15.52 C2=18.95
parent
Limitations
1 Computationally expensive compared to binary crossover.
2 If the parameter values involved in the crossover technique are not chosen judiciously, it may lead to premature convergence with not-necessarily-optimal solutions.
Before crossover
P1: A B C D E F G H
P2: H G F E D C B A
After single-point binary crossover (at K = 5):
Offspring 1: A B C D E C B A
Offspring 2: H G F E D F G H
Note that these offspring are not valid permutations (cities repeat), which is why order encodings need specialized crossover operators.
Example (single crossover point K):
P1 : A C D E B F G H J I
P2 : E D C J I H B A F G
C1 : A C D E J I H B F G
C2 : E D C J A B F G H I
Example (two crossover points K1, K2):
P1 : A C D E B F G H J I
P2 : E D C J I H B A F G
C1 : E D C J B F G I H A
C2 : A C D E I H B F G J
(b) This vector defines the order in which genes are successively drawn from P1 and P2, as follows.
4 If i th value is 1 then
Delete j th gene value from P1 and as well as from P2 and
append it to the offspring (which is initially empty).
5 Else
Delete k th gene value from P2 and as well as from P1 and
append it to the offspring.
6 Repeat Step 2 until both P1 and P2 are empty and the offspring
contains all gene values.
Example :
Random Vector σ 2 1 1 2 1 1 2 2 1 2
P1 : A C D E B F G H J I
P2 : E D C J I H B A F G
C1 : E C D J B F H A I G
C2: ?
Steps :
K1 K2 K3
P1 : A C D E B F G H J I
P2 : E D C J I H B A F G
C1 : E C D J B I A H F G
C2: D E C B I F G A H J
P1 : 1 2 4 6 8 7 5 3
P2 : 4 3 5 7 8 6 2 1
Connectivity graph:
[Figure: the connectivity graph on cities 1-8 built from the edges of P1 and P2.]
City / Connectivity
1 / 2, 4, 3
2 / 1, 4, 7, 6
3 / 1, 4, 5
4 / 1, 2, 3, 6
5 / 3, 7, 8
6 / 2, 4, 8
7 / 2, 5, 8
8 / 5, 6, 7
1 Start the child tour with the starting city of P1 . Let this city be X .
2 Append city X to C.
3 Delete all occurrences of X from the connectivity list of all cities
(right-hand column).
4 From city X choose the next city, say Y, as one with the fewest connectivity links (or any one, if there is a tie).
5 Make X = Y [ i.e. new city Y becomes city X ].
6 Repeat Steps 2-5 until the tour is complete.
7 End
Debasis Samanta
15.03.2016
1 Encoding
2 Fitness Evaluation and Selection
3 Mating pool
4 Reproduction
Crossover
Mutation
Inversion
5 Convergence test
1 Encoding
2 Fitness evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test
8 Fitness scaling
[Figure: mutation helps the evolution escape a local optimum and move towards the global optimum.]
Binary Coded GA :
Flipping
Interchanging
Reversing
Real Coded GA :
Random mutation
Polynomial mutation
Order GA :
Tree-encoded GA :
Flipping :
Offspring           : 1 0 1 1 0 0 1 0
Mutation chromosome : 1 0 0 0 1 0 0 1
Mutated offspring   : 0 0 1 1 1 1 0 0

Interchanging (the two positions marked *) :
Child chromosome    : 1 0 1 1 0 1 0 0
Mutated chromosome  : 1 1 1 1 0 0 0 1

Reversing (from the position marked *) :
Child chromosome    : 0 1 1 0 0 1 0 1
Mutated chromosome  : 0 1 1 0 0 1 1 1
Real-coded GA mutation operators: Random mutation and Polynomial mutation.
Example (random mutation: Pmutated = Poriginal + (r − 0.5) × ∆) :
Poriginal = 15.6
r = 0.7
∆ = 2.5
Then, Pmutated = 15.6 + (0.7 − 0.5) × 2.5 = 16.1
Example (polynomial mutation) :
Poriginal = 15.6, r = 0.7, q = 2, ∆ = 1.2, then Pmutated = ?
δ = 1 − [2(1 − r)]^(1/(q+1)) = 0.1565
Pmutated = 15.6 + 0.1565 × 1.2 = 15.7878
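The two worked examples can be checked with a small Python sketch. The r < 0.5 branch of polynomial mutation is an assumption taken from the standard formulation; the slide only shows the r ≥ 0.5 case.

```python
def random_mutation(p, r, delta):
    # p_mutated = p_original + (r - 0.5) * delta
    return p + (r - 0.5) * delta

def polynomial_mutation(p, r, q, delta):
    # For r >= 0.5: delta_bar = 1 - [2(1 - r)]^(1/(q+1)).
    # For r < 0.5 the usual complementary branch (an assumption) is
    # delta_bar = (2r)^(1/(q+1)) - 1.
    if r >= 0.5:
        d = 1 - (2 * (1 - r)) ** (1 / (q + 1))
    else:
        d = (2 * r) ** (1 / (q + 1)) - 1
    return p + d * delta

print(round(random_mutation(15.6, 0.7, 2.5), 4))        # 16.1
print(round(polynomial_mutation(15.6, 0.7, 2, 1.2), 4)) # 15.7879 (slide rounds d to 0.1565)
```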
(Flowchart: Start, then Fitness evaluation & Selection, then Converge? If Yes, Stop; if No, Select Mate, then Reproduction (Crossover, Mutation, Inversion), and back to Fitness evaluation & Selection.)
(Figures: fitness value plotted over the search space, marking the best individuals and the worst individuals.)
It is observed that:
If fitness values are too far apart, then selection will pick several
copies of the good individuals and many of the worst individuals will
not be selected at all. This tends to fill the entire population with
very similar chromosomes and limits the ability of the GA to explore
a large amount of the search space.
If the fitness values are too close to each other, then the GA will
tend to select one copy of each individual; consequently, it will not
be guided by small fitness variations and the search scope will be
reduced.
As a way out, we can think of crossover or mutation (or both) with
a higher fluctuation in the values of the design parameters.
Fitness scaling is used to scale the raw fitness values so that the GA
sees a reasonable amount of difference in the scaled fitness values of
the best versus worst individuals.
Linear scaling
Sigma scaling
Note:
The fitness scaling is useful to avoid premature convergence, and
slow finishing.
Algorithm
Steps :
1 Calculate the average fitness f̄, the maximum fmax and the minimum fmin of the population.
2 Compute the coefficients
a = f̄ / (fmax − f̄)
b = (f̄ × fmin) / (fmin − f̄)
3 For each fi ∈ F do
f'i = a × fi + b
F' = F' ∪ {f'i}
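A sketch of linear scaling in Python, using the coefficients a = f̄/(fmax − f̄) and b = (f̄ × fmin)/(fmin − f̄) as read from the slide; other formulations (e.g. Goldberg's linear scaling) choose a and b differently.

```python
def linear_scaling(F):
    """Linear fitness scaling f' = a*f + b with the slide's coefficients."""
    fbar = sum(F) / len(F)
    fmax, fmin = max(F), min(F)
    a = fbar / (fmax - fbar)
    b = fbar * fmin / (fmin - fbar)
    return [a * f + b for f in F]

scaled = linear_scaling([1.0, 2.0, 3.0, 4.0])
```

Since a > 0 whenever fmax > f̄, the scaled values preserve the ranking of the raw fitness values while changing the spread between best and worst.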
Note :
Steps :
1 Calculate the average fitness value
f̄ = (Σ_{i=1}^{N} fi) / N
2 Determine the reference worst-case fitness value fw such that
fw = f̄ + S × σ
where σ = STD(F) is the standard deviation of the fitness values of the
population and S is a user-defined factor called the sigma scaling
factor (usually 1 ≤ S ≤ 5).
Sigma scaling
f'i = fi^k
Debasis Samanta
18.03.2016
S : Constraints
Subject to
g_j(x1, x2, · · · , xn) ROP_j C_j, j = 1, 2, · · · , l
V : Design variables
x_k ROP_k d_k, k = 1, 2, · · · , n
(Here ROP denotes a relational operator such as ≤, = or ≥.)
Note :
1 For a multi-objective optimization problem (MOOP), m ≥ 2
2 Objective functions can be either minimization, maximization or
both.
A formal specification of MOOP
Let us consider, without loss of generality, a multi-objective
optimization problem with n decision variables and m objective
functions:
Optimize y = f(x) = [f1(x), f2(x), · · · , fm(x)]
where
x = [x1, x2, · · · , xn] ∈ X
y = [y1, y2, · · · , ym] ∈ Y
Here :
x is called decision vector
y is called an objective vector
X is called a decision space
Y is called an objective space
Illustration: Decision space and objective space
(Figure: solutions in the decision space X mapped to points in the objective space Y.)
In other words, if a single solution x̄∗ is optimal with respect to fi
for all ∀i ∈ [1, 2, · · · , m], then we say that x̄∗ is a desirable solution.
Cardinality of the optimal set is more than one, that is, there are
m ≥ 2 goals of optimization instead of one
There are m ≥ 2 different search points (possibly in different
decision spaces) corresponding to m objectives
(Figures: the objectives f1, f2, f3, f4 plotted over the search space; a two-objective example showing the trade-off between Minimize f1 and Maximize f2; Pareto fronts in the (F1, F2) plane with both objectives minimized.)
(Figure: the ideal multi-objective optimization procedure. A MOOP (Minimize f1, Minimize f2, ..., Minimize fm, subject to constraint S with design variables V) is given to an ideal multi-objective optimizer, which finds multiple Pareto-optimal solutions; one solution is then chosen using higher-level information. In contrast, a GA used as a single-objective solver returns a single optimal solution.)
Here, the effort is to find the set of trade-off solutions by
considering all objectives to be important.
Example : minimize Cost and maximize Comfort.
In the next few slides, we shall discuss the above idea of solving
MOOPs more precisely. Before that, let us become familiar with a few
more basic definitions and terminologies.
1 Concept of domination
2 Properties of dominance relation
3 Pareto-optimization
4 Solutions with multiple-objectives
(Figures: (A) the ideal objective vector Z∗, composed of the individual optima Z1∗ and Z2∗ in the (f1, f2) plane; (B) a good solution vector should be as close as possible to the ideal objective vector. A further figure shows the Utopian objective vector, which lies just beyond Z∗.)
Note :
Like the ideal objective vector, the Utopian objective vector also
represents a non-existent solution.
(Figure: the nadir objective vector Z^nadir between Z1∗ and Z2∗, and the worst objective vector W at (f1max, f2max).)
Note :
z^nadir is the upper bound with respect to the Pareto-optimal set,
whereas the vector of objectives W is found by using the worst feasible
function values fi^max in the entire search space.
Notation : ◁ denotes "better than" and ⊴ denotes "no worse than",
irrespective of whether an objective is minimized or maximized.
Definition 3 : Domination
A solution xi is said to dominate the other solution xj if both condition I
and II are true.
Condition : I
The solution xi is no worse than xj in all objectives. That is,
fk(xi) ⊴ fk(xj) for all k = 1, 2, · · · , M
Condition : II
The solution xi is strictly better than xj in at least one objective. That is,
fk̄(xi) ◁ fk̄(xj) for at least one k̄ ∈ {1, 2, · · · , M}
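Conditions I and II translate directly into code. The sketch below assumes each objective is tagged as minimize (True) or maximize (False); the slide's definition is sense-agnostic.

```python
def dominates(fx, fy, minimize=None):
    """True if the solution with objective vector fx dominates the one with fy."""
    if minimize is None:
        minimize = [True] * len(fx)          # default: all objectives minimized
    better = lambda a, b, mn: a < b if mn else a > b
    # Condition I: fx is no worse than fy in every objective
    no_worse_all = all(not better(b, a, mn) for a, b, mn in zip(fx, fy, minimize))
    # Condition II: fx is strictly better than fy in at least one objective
    strictly_better = any(better(a, b, mn) for a, b, mn in zip(fx, fy, minimize))
    return no_worse_all and strictly_better
```

For example, with both objectives minimized, (1, 2) dominates (2, 3), but (1, 3) and (2, 2) do not dominate each other, and no vector dominates itself.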
(Figure: three solutions x1, x2, x3 in the (f1, f2) plane with Minimize f1 and Minimize f2: x1 dominates x2, while x1 and x3 do not dominate each other, and likewise x2 and x3 do not dominate each other.)
(Figure: the same three solutions with Minimize f1 and Maximize f2. Which relations hold now: does x1 dominate x2 or x2 dominate x1? x1 and x3? x2 and x3?)
Note :
If xi dominates xj, we also say that xj is dominated by xi, that xi is
non-dominated by xj, or that xi is non-inferior to xj.
(Figure: five solutions 1-5 plotted in the objective plane, with f1 (maximize) on the horizontal axis, scaled 2 to 18, and f2 (minimize) on the vertical axis, scaled 1 to 5.)
Reflexive :
The dominance relation is NOT reflexive: any solution x does not
dominate itself.
Antisymmetric :
The dominance relation is NOT symmetric: if xi dominates xj, then xj
does not dominate xi.
Transitive :
The dominance relation IS transitive: if xi dominates xj and xj
dominates xk, then xi dominates xk.
Non-dominated front
In other words, we cannot say which of the two solutions 3 and 5 is
better.
(Figure: the five solutions again, with f1 and f2 both maximized. Here P = {1, 2, 3, 4, 5} and the non-dominated set is P' = {3, 5}.)
Debasis Samanta
22.03.2016
(Flowchart: MOOP, then Initialization of population, then Selection, then Convergence test; if Yes, the solution is reported, if No, Reproduction and back to Selection.)
MOEA Techniques :
Independent sampling
Aggregation selection (ordering) :
Lexicographic ordering
Aggregation (scalarization)
Goal attainment
Game theory
Criterion selection (VEGA)
Pareto selection :
Ranking
Hybrid selection
Classification of MOEA techniques
Note :
1 A priori approaches
Lexicographic ordering
Simple weighted approach (SOEA)
2 A posteriori approaches
Criterion selection (VEGA)
Pareto-based approaches
Reference :
"Compaction of Symbolic Layout using Genetic Algorithms" by M. P.
Fourman, in Proceedings of the 1st International Conference on Genetic
Algorithms, pages 141-153, 1985.
Minimize
f = [f1 , f2 , · · · , fk ]
Subject to
gj (x) ≤ cj , where j = 1, 2, · · · , n
.................................................................
.................................................................
(c) At the i-th step, we have
Minimize fi (x)
Subject to gj (x) ≤ cj , j = 1, 2, · · · , n
fl (x) = fl∗ , l = 1, 2, · · · , i − 1
The solution obtained at the end is x̄k∗, that is, fk∗ = fk(x̄k∗).
This is taken as the desired solution x̄∗ of the given multi-objective
optimization problem.
Remarks :
Note :
It produces a single solution rather than a set of Pareto-optimal
solutions.
fitness = Σ_{i=1}^{n} wi × fi(x)
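A minimal sketch of this weighted-sum scalarization, with two hypothetical objective functions f1 and f2 (both minimized) chosen purely for illustration:

```python
f1 = lambda x: (x - 2) ** 2          # hypothetical objective 1
f2 = lambda x: (x + 1) ** 2          # hypothetical objective 2

def weighted_sum_fitness(x, weights=(0.5, 0.5)):
    # fitness = sum_i w_i * f_i(x)
    return weights[0] * f1(x) + weights[1] * f2(x)

# Minimizing the scalarized fitness over a grid picks out one point of the
# Pareto-optimal set; different weight vectors pick different points.
best = min(range(-10, 11), key=weighted_sum_fitness)
```

With equal weights the scalarized optimum over the integers is x = 0 (tied with x = 1), one compromise between the two objectives; changing the weights moves the solution along the Pareto-front.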
(Figures: in the (Minimize f1, Minimize f2) plane, contour lines of the weighted sum w1 f1 + w2 f2 touch the Pareto-front of the feasible objective space; the single SOEA solution obtained lies at one point of the Pareto-optimal front.)
About VEGA :
(Figure: the VEGA cycle. The population of size N is split into k sub-populations (blocks) according to fitness on each objective; the entire population is then shuffled, and reproduction (crossover and mutation) is applied.)
(Figure: from the current population I1 ... IN of size N, proportional selection with respect to f1 fills sub-population 1, with respect to f2 fills sub-population 2, and so on up to fK for sub-population k, together creating a mating pool of size M.)
VEGA: Shuffle the sub-populations
2 Shuffle the sub-populations
Using some shuffling operation (e.g. generate two random numbers i
and j between 1 and M, both inclusive, and then swap Ii and Ij, which
lie in the i-th and j-th positions of the sub-populations).
(Figure: Ii and Ij exchange places after the shuffle.)
(Figure: the shuffled mating pool I1 ... IM undergoes reproduction (crossover and mutation), producing a new generation I1 ... IN of population size N.)
Comments on VEGA
Advantages:
Disadvantages:
Debasis Samanta
29.03.2016
MOEA Solution Techniques :
Independent sampling
Lexicographic ordering
Aggregate selection
Game theory approach
Criterion selection (VEGA)
Non-linear fitness evaluation
Pareto selection :
SOEA
Min-Max method
Ranking (MOGA)
Demes
Elitist
(Figure: Min f1 versus Min f2; a solution xi and the shaded region Xi of solutions that dominate it.)
Rank(xi) = 1 + |Xi|
where |Xi| = the number of solutions in the shaded region, i.e. the
number of solutions by which xi is dominated.
(Figures: an example with Max f1 and Max f2: solution xi is dominated by 11 solutions, so Rank(xi) = 1 + 11 = 12; a second plot shows the ranks, starting from 1 for the non-dominated solutions, assigned to all solutions.)
Note :
Steps :
(Fitness values are assigned to the ranked solutions by interpolation,
e.g. linearly, from the best rank to the worst.)
Example (linearization) : f̄_j^i = (Σ f_j^i) / k
where f_j^i denotes the j-th objective function value of a solution in
the i-th rank, k is the number of solutions in the i-th rank, and f̄_j^i
denotes the average value of the j-th objective over all the solutions
in the i-th rank.
Illustration of MOGA
Pareto-domination tournament
Let N = the size of the population and K = the number of objective
functions.
Steps :
The basic idea behind sharing is that the more individuals are
located in the neighborhood of a certain individual, the more its
fitness value is degraded.
Procedure do_sharing(C1, C2)
1 j = 1. Let x = C1.
2 Compute a normalized (Euclidean) distance measure to the
individual xj in the current population as follows:
d_xj = sqrt( Σ_{i=1}^{k} ( (fi^x − fi^j) / (fi^U − fi^L) )^2 )
where fi^j denotes the i-th objective function value of the j-th
individual, and fi^U and fi^L denote the upper and lower values of the
i-th objective function.
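The normalized distance can be sketched as follows. The triangular sharing function Sh(d) shown alongside it is the standard choice and is an assumption here, since the slide does not give its exact form.

```python
import math

def normalized_distance(fx, fj, f_upper, f_lower):
    """d_xj = sqrt( sum_i ((f_i^x - f_i^j) / (f_i^U - f_i^L))^2 ),
    the objective-space distance used in the do_sharing procedure."""
    return math.sqrt(sum(((a - b) / (u - l)) ** 2
                         for a, b, u, l in zip(fx, fj, f_upper, f_lower)))

def sharing_value(d, sigma_share):
    """Triangular sharing function (assumed form):
    Sh(d) = 1 - d/sigma_share for d < sigma_share, else 0."""
    return 1 - d / sigma_share if d < sigma_share else 0.0
```

The fitness of an individual is then degraded by dividing it by its niche count, the sum of Sh(d) over all individuals in its neighbourhood, which implements the idea that crowded individuals lose fitness.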
(Figure: the Pareto-domination tournament. Two candidates, Candidate 1 and Candidate 2, are picked from the population list using Random_pop_index, and each is compared against a comparison set of individuals chosen with Comparison_set_index.)
This approach does not apply Pareto selection to the entire population,
but only to a segment of it at each run; the technique is very fast and
produces good non-dominated solutions that can be kept for a large
number of generations.
However, besides requiring a sharing factor, this approach also
requires a good choice of the value of tdom to perform well,
complicating its appropriate use in practice.
Notations :
(Figure: the population partitioned into non-dominated fronts, Front 1 to Front 4, in the plane of Max f1 versus Max f2; a solution xi lies on one of the fronts.)
Steps :
1 For each xi ∈ P, compute its domination count ni (the number of
solutions dominating xi) and the set Si of solutions that xi dominates.
2 The solutions with ni = 0 form the first front Pk (k = 1).
3 While Pk = φ is not yet reached, do: for each xj dominated by some
xi in Pk, update nj = nj − 1; if nj = 0 then Q = Q ∪ {xj}. The set Q
becomes the next front.
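These steps are the fast non-dominated sorting used in NSGA-II; a sketch, assuming all objectives are to be minimized:

```python
def fast_nondominated_sort(P):
    """Partition objective vectors (all minimized, an assumption) into fronts.
    n[i] = domination count of i; S[i] = indices of solutions i dominates."""
    dom = lambda a, b: (all(x <= y for x, y in zip(a, b))
                        and any(x < y for x, y in zip(a, b)))
    n = [0] * len(P)
    S = [[] for _ in P]
    for i, p in enumerate(P):
        for j, q in enumerate(P):
            if dom(p, q):
                S[i].append(j)
            elif dom(q, p):
                n[i] += 1
    fronts = [[i for i in range(len(P)) if n[i] == 0]]   # first front: n_i = 0
    while fronts[-1]:
        Q = []
        for i in fronts[-1]:
            for j in S[i]:
                n[j] -= 1
                if n[j] == 0:        # j is dominated only by earlier fronts
                    Q.append(j)
        fronts.append(Q)
    return fronts[:-1]

fronts = fast_nondominated_sort([(1, 1), (2, 2), (1, 2), (3, 1)])
```

Here (1, 1) dominates everything, so it forms front 1 alone; (1, 2) and (3, 1) form front 2, and (2, 2) is left for front 3.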
(Figure: the five solutions in the plane of f1 (maximize) versus f2 (maximize).)
Here, 1 dominates 2, 5 dominates 1, etc.
The solutions in the front [3, 5] are the best, followed by the
solutions [1, 4].
(Flowchart: NSGA. Start with a MOOP, encode it and create the initial population; set front k = 1. Classify the population: the non-dominated individuals of front k are assigned a dummy fitness value, degraded by fitness sharing; then k = k + 1, until every individual is classified. Then select for the mating pool, apply reproduction (crossover, mutation), evaluate each individual, and repeat until converged.)
Note :
Approach
Since the total population size is 2N, not all fronts may be
accommodated in the N slots available in P'. All fronts which could
not be accommodated are simply rejected. When the last allowed
front is being considered, there may exist more solutions in the last
front than the remaining slots in the new population.
Instead of arbitrarily discarding some members from the last
acceptable front, the solutions which make the diversity of the
selected solutions the highest are chosen.
This is accomplished by calculating the crowding distance of the
solutions in the last acceptable front.
In this way a new generation is obtained, and the steps are repeated
until a termination condition is satisfied.
(Figure: NSGA-II survivor selection. The combined population P is non-dominated sorted into fronts F1, F2, ...; whole fronts are copied into P' until a front Fi no longer fits, and Fi is truncated using crowding distance.)
Note :
If they have the same rank but solution xi has a better (larger)
crowding distance, then xi is preferred over solution xj.
Note :
The second condition resolves the tie of both solutions being on the
same non-dominated front by deciding on their crowding distances.
(Note : for NSGA-II, only the second condition is needed when all
solutions belong to one front.)
(Figures: the crowding distance of a solution Xi is computed from its nearest neighbours Xi−1 and Xi+1 along each objective: (a) solutions in the 2D objective space (f1, f2); (b) solutions in the 3D objective space (f1, f2, f3).)
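The crowding-distance computation illustrated above can be sketched as follows; giving boundary solutions infinite distance, so they are always retained, is the standard NSGA-II convention.

```python
def crowding_distance(front):
    """Crowding distance of each solution in a front (list of objective tuples)."""
    n, m = len(front), len(front[0])
    dist = [0.0] * n
    for k in range(m):                                   # one pass per objective
        order = sorted(range(n), key=lambda i: front[i][k])
        fmin, fmax = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float('inf')  # boundary solutions
        if fmax == fmin:
            continue                                     # degenerate objective
        for a in range(1, n - 1):
            i = order[a]
            # normalized gap between the two neighbours Xi-1 and Xi+1
            dist[i] += (front[order[a + 1]][k] - front[order[a - 1]][k]) / (fmax - fmin)
    return dist

d = crowding_distance([(1, 3), (2, 2), (3, 1)])
```

For the three-point front above, the two extreme solutions get infinite distance and the middle one gets 1 + 1 = 2, one normalized gap per objective.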
(Figure: the l solutions sorted separately along each of the M objectives; an example grid shows the sorted objective values f1,1 ... f1,7 along f1 and f2,1 ... f2,7 along f2.)
Selection
Reproduction
Crossover
Mutation
Applications
Strengths of GA
Important Problems of GA
Why GA works?
Conclusion
References
INTRODUCTION
The Genetic Algorithm was developed by John Holland, his colleagues and his students at the
University of Michigan in 1975. The book "Adaptation in Natural and Artificial Systems" was
the first by them to discuss GA.
Genetic Algorithms are search algorithms based on the mechanics of natural selection and natural
genetics. They combine survival of the fittest among string structures with a structured yet
randomized information exchange to form a search algorithm with some of the innovative flair of
human search. In every new generation, a new set of strings is created using bits and pieces of the
fittest of the old. While randomized, genetic algorithms are no simple random walk: they efficiently
exploit historical information to speculate on new search points with expected improved performance.
The central theme of research on genetic algorithms has been robustness, the balance between
efficiency and efficacy necessary for survival in many different environments. Genetic Algorithms are
theoretically and empirically proven to provide robust search in complex spaces. These algorithms
are computationally simple and yet powerful in their search for improvement.
1. GAs work with a coding of the parameter set and not the parameters themselves.
2. GAs search from a population of points and not from a single point.
3. GAs use objective function information and not derivatives or other auxiliary knowledge.
Stopping Criteria: We need to provide a stopping criterion, such as a minimum final tolerance in the
function value, the total time the GA runs, or the total number of generations.
In our main problem statement, we have provided both a minimum tolerance (= 10^-6) and a
maximum number of generations of 400.
For the purpose of explaining the different parts of the GA, we have considered a problem statement
and written code to explain its different parts. The objective function used is:
x(1)^2 - x(2)^2 - 5.13
INITIAL POPULATION
Initially many individual solutions are randomly generated to form an initial population. The
population size depends on the nature of the problem, but typically contains several hundreds or
thousands of possible solutions. Traditionally, the population is generated randomly, covering the
entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in
areas where optimal solutions are likely to be found.
To generate a population of 10 strings, we build one 8-bit string per row and then take the transpose
so that each column represents a string that we have randomly chosen:
function y=generation()
x1=[];
for i=1:10
    x1=[x1;generate()];   % after the loop, x1 is 10x8: one string per row
end
y=transpose(x1);          % 8x10: each column represents a string
end

function y=generate()
for j=1:8
    if(rand>0.5)
        x1(j)=1;
    else
        x1(j)=0;
    end
end
y=x1;
end
initial_generation =
1 1 0 1 1 0 1 0
0 1 1 1 1 0 1 1
1 1 1 0 1 0 1 0
0 0 0 1 1 0 1 0
0 0 1 1 0 0 0 1
1 1 0 1 1 0 0 0
1 0 1 0 1 0 1 1
1 1 1 0 0 0 1 0
1 0 1 0 0 0 1 0
0 1 1 1 1 0 1 1
SELECTION
During each successive generation, a proportion of the existing population is selected to breed a
new generation. Individual solutions are selected through a fitness based process, where
fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain
selection methods rate the fitness of each solution and preferentially select the best solutions. Other
methods rate only a random sample of the population, as this process may be very time-consuming.
Most functions are STOCHASTIC and designed so that a small proportion of less fit solutions are
selected. This helps keep the diversity of the population large, preventing premature convergence
on poor solutions. Some of the SELECTION methods are as follows:
It works on the concept of the game Roulette. Under this game each individual gets a slice of the
wheel, but more fit ones get larger slices than less fit ones. The wheel is then spun, and whichever
individual "owns" the section on which it lands each time is chosen. A form of fitness-proportionate
selection in which the chance of an individual's being selected is proportional to the amount by
which its fitness is greater or less than its competitors' fitness.
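The roulette-wheel idea described above can be sketched as a small Python helper (a hypothetical illustration, not the MATLAB code used later in this report); fitness values are assumed to be positive.

```python
import random

def roulette_select(population, fitness):
    """Fitness-proportionate (roulette wheel) selection: each individual owns
    a slice of the wheel sized in proportion to its fitness."""
    total = sum(fitness)
    spin = random.uniform(0, total)          # spin the wheel once
    cumulative = 0.0
    for individual, f in zip(population, fitness):
        cumulative += f
        if spin <= cumulative:               # the slice the wheel landed on
            return individual
    return population[-1]                    # guard against round-off

random.seed(0)
pop, fit = ['a', 'b', 'c'], [1, 1, 8]
picks = [roulette_select(pop, fit) for _ in range(1000)]
```

Over many spins, 'c' (fitness 8 out of a total of 10) is selected roughly 80% of the time, while 'a' and 'b' still appear occasionally, which is how diversity is preserved.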
Scaling Selection
As the average fitness of the population increases, the strength of the selective pressure also
increases and the fitness function becomes more discriminating. This method can be helpful in
making the best selection later on when all individuals have relatively high fitness and only small
differences in fitness distinguish one from another.
Tournament Selection
Subgroups of individuals are chosen from the larger population, and members of each subgroup
compete against each other. Only one individual from each subgroup is chosen to reproduce.
Rank Selection
Each individual in the population is assigned a numerical rank based on fitness, and selection is
based on these ranking rather than absolute differences in fitness. The advantage of this method is
that it can prevent very fit individuals from gaining dominance early at the expense of less fit ones,
which would reduce the population's genetic diversity and might hinder attempts to find an
acceptable solution.
Hierarchical Selection
Individuals go through multiple rounds of selection each generation. Lower-level evaluations are
faster and less discriminating, while those that survive to higher levels are evaluated more
rigorously. The advantage of this method is that it reduces overall computation time by using faster,
less selective evaluation to weed out the majority of individuals that show little or no promise, and
only subjecting those who survive this initial test to more rigorous and more computationally
expensive fitness evaluation.
REPRODUCTION
Reproduction is a process in which individual strings are copied according to their objective function
values, f. We can call this function the fitness function.
Put another way, we can think of it as some measure of profit, utility or goodness that we want to
maximize. Strings with a higher fitness value have a higher probability of contributing one or more
offspring to the next generation. This operator is an artificial version of natural selection. The
reproduction operator may be implemented in algorithmic form in a number of ways. We have used
the concept of the roulette wheel here, where each current string in the population has a
roulette-wheel slot sized in proportion to its fitness.
Each time we require another offspring, a simple spin of the weighted roulette wheel yields the
reproductive candidate. In this way, more highly fit strings have a higher number of offspring in the
succeeding generation. Once a string has been selected for replication, it is entered into a
mating pool, a tentative new population for further operator action.
Here x1 is an 8 cross 10 array, where each column is a string that we have randomly generated.
Convert is a function that converts a binary string to decimal by breaking it into two 4-bit strings and
calculating each decimal value. So our range of the solution space is from 0 to 15.
%binary is 2cross1
binary1=convert(x1(1:8));
binary2=convert(x1(9:16));
binary3=convert(x1(17:24));
binary4=convert(x1(25:32));
binary5=convert(x1(33:40));
binary6=convert(x1(41:48));
binary7=convert(x1(49:56));
binary8=convert(x1(57:64));
binary9=convert(x1(65:72));
binary10=convert(x1(73:80));
%main gives scalar
Fitness is calculated by passing the values to the main function, where the values are inserted into
the actual equation, the error is calculated, and so is the fitness value, which is returned.
Roulette selection is used to select the strings which will go for reproduction. So the probability of
each of the 10 strings undergoing reproduction is calculated on the basis of its fitness value. Using a
random value and passing it down through a set of conditions, we find the string which will undergo
reproduction. The strings which are selected for reproduction are stored in the matrix named next.
sum=0;
for i=1:10
sum=sum+fitness1(i);
end
sum
probability=[];%10cross1
probability(1)=fitness1(1)/sum;
for i=2:10
probability(i)=probability(i-1)+fitness1(i)/sum;
end
next=[];% 10cross8 after selection: one selected string per row
for i=1:10
    r=rand;% one spin of the wheel, compared against cumulative probabilities
    if(r<probability(1))
        next=[next;x1(1:8)];
    elseif(r<probability(2))
        next=[next;x1(9:16)];
    elseif(r<probability(3))
        next=[next;x1(17:24)];
    elseif(r<probability(4))
        next=[next;x1(25:32)];
    elseif(r<probability(5))
        next=[next;x1(33:40)];
    elseif(r<probability(6))
        next=[next;x1(41:48)];
    elseif(r<probability(7))
        next=[next;x1(49:56)];
    elseif(r<probability(8))
        next=[next;x1(57:64)];
    elseif(r<probability(9))
        next=[next;x1(65:72)];
    else
        next=[next;x1(73:80)];
    end
end
initial_generation=transpose(x1);% 10 cross 8 : each row represents a string
after_selection=transpose(next);% 8 cross 10 : each column represents a string
after_select =
1 1 1 0 1 0 1 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 0 1 1 0 0 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 0 1 1 0 0 0
1 1 1 0 1 0 1 0
Here we crossover the selected strings using the above formulae, and the rest of the strings are
directly inherited into the matrix.
function g=greatest_integer(A)
g=A-mod(A,1);% floor of A: strips the fractional part
end
after_crossover =
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 1 1 0 0 0 1
0 0 1 0 1 0 1 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 0 1 1 0 0 0
1 1 1 1 0 0 0 1
1 1 1 0 1 0 1 0
1 1 0 1 1 0 0 0
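The crossover formulae themselves are not reproduced above; a common choice on such 8-bit strings is single-point crossover, sketched here in Python with a hypothetical crossover rate pc:

```python
import random

def single_point_crossover(p1, p2, pc=0.8):
    """Single-point crossover on two equal-length bit lists; with probability
    1 - pc the parents are copied unchanged (pc is a hypothetical rate)."""
    if random.random() > pc:
        return p1[:], p2[:]              # directly inherited, no crossover
    k = random.randint(1, len(p1) - 1)   # crossover point
    return p1[:k] + p2[k:], p2[:k] + p1[k:]

random.seed(1)
c1, c2 = single_point_crossover([1] * 8, [0] * 8)
```

Whichever point is chosen, every bit position still holds the same pair of gene values across the two children that it held across the two parents.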
MUTATION
Mutation adds new information in a random way to the genetic search process and ultimately helps
to avoid getting trapped at local optima.
Mutation, in a way, is the process of randomly disturbing genetic information. It operates at the bit
level; when the bits are being copied from the current string to the new string, there is a probability
that each bit may become mutated. This probability is usually quite small and is called the mutation
probability. A coin-toss mechanism is employed: if a random number between zero and one is less
than the mutation probability, then the bit is inverted, so that a zero becomes a one and a one
becomes a zero. This helps introduce a bit of diversity into the population by scattering the
occasional points. This random scattering may result in a better optimum, or even modify a part of
the genetic code that will be beneficial in later operations. On the other hand, it might produce a
weak individual that will never be selected for further operations.
The mutation probability is taken as 0.01, and using a random value we flip a binary digit whenever
the probability allows it.
%MUTATION
changes=0;
for i=1:numel(after_crossover)% numel visits every bit; length() would stop at 10
    if(rand<0.01)
        after_crossover(i)=invert(after_crossover(i));
        changes=changes+1;
    end
end
after_mutation=after_crossover;
initial_generation
after_select
after_crossover
after_mutation
changes%gives number of changes
initial_fitness=sum;
initial_fitness
final_fitness=fitness(after_mutation);
final_fitness
end
after_mutation =
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 1 1 0 0 0 1
0 0 1 0 1 0 1 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 0 1 1 0 0 0
1 1 1 1 0 0 0 1
1 1 1 0 1 0 1 0
1 1 0 1 1 0 0 0
changes = 0
initial_fitness = 0.4753
final_fitness = 1.4484
The main problem statement which we are going to present is taken from the research paper “The
application of Genetic Algorithm (GA) to estimate the rate parameters for solid state reduction of
iron ore in presence of graphite”. GA’s have found their application in many diverse fields like
physics, astronomy, finance, chemistry, etc. Here, we have applied GA to estimate the various rate
parameters using experimental data and compared the results with those values presented in the
literature. The reduction of hematite to iron has been considered to occur in three sequential steps
namely as:
Mass balance equations were assumed to follow first-order kinetics:
dH/dt = -kh*H
dM/dt = 0.97*kh*H - km*M
dW/dt = 0.93*km*M - kw*W
where H, M, W represent the concentrations of hematite, magnetite and wustite at time t,
respectively.
The unknown parameters are three rate constants, kh, km, kw, and activation energies, Eh, Em and
Ew, which we are going to find using GA and experimental data. The parameter measured in the
experiment is the degree of reduction, which is computed from Wh, the total weight of hematite at
time t, the loss in weight of the packed bed, and Z, the exit gas composition, i.e., the CO/CO2 ratio.
The predicted values of the degree of reduction have been obtained from the respective rates of
hematite, magnetite and wustite. Here, S0 gives the total concentration of oxygen consumed in time
t, and 936 kg/m3 is the total removable oxygen. H, M and W can be found from the experimental
graph of concentration versus time. GA has been used to find the unknown parameters by
minimizing the error between the experimental and predicted values of the degree of reduction. We
have used a double vector to generate the populations. The GA parameters used by us are:
Generations: 400
Range at 1000oC: for kh (s-1), km(s-1), kw(s-1), Eh (KJ/mol), Em (KJ/mol), Ew (KJ/mol) is resp:
The total time chosen is 4800 s. and the initial values of H, M and W are 3100, 0 and 0
respectively (from graph):
(Figure: Evolution of the concentration of various iron oxide phases as well as pure iron during
packed-bed reduction of iron ore-graphite composite pellets under an argon atmosphere at 1000 oC.)
We made the following code to write the fitness function using MATLAB:
function z=solve2(y)
% y = [kh, km, kw, Eh, Em, Ew]; 10583.722 = R*T (8.314 J/mol-K x 1273 K)
b1=y(1)*exp(-y(4)/10583.722);
b2=y(2)*exp(-y(5)/10583.722);
b3=y(3)*exp(-y(6)/10583.722);
h=3100*exp(-b1*1300);
m=0.97*3100*b1*(exp(-b1*4800)-exp(-b2*4800))/(b2-b1);
w=0.93*b2*0.97*3100*((exp(-b1*4800)-exp(-b3*4800))/(b3-b1)-(exp(-b2*4800)-exp(-b3*4800))/(b3-b2))/(b2-b1);
ht=-h*b1;
mt=0.97*h*b1-m*b2;
wt=0.93*m*b2-w*b3;
S0=0.033*(h-3100)+0.068*m+0.222*w;
z=1/(0.62-((S0+(0.033*ht+0.068*mt+0.222*wt)*4800)/936));
end
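As a cross-check, the fitness function can be transcribed directly to Python; this is a transliteration of the MATLAB above, and reading 10583.722 as R×T = 8.314 × 1273 at 1000 °C is an inference, not something the report states.

```python
import math

def solve2(y):
    # y = [kh, km, kw, Eh, Em, Ew]; 10583.722 is read here as R*T at 1273 K
    b1 = y[0] * math.exp(-y[3] / 10583.722)
    b2 = y[1] * math.exp(-y[4] / 10583.722)
    b3 = y[2] * math.exp(-y[5] / 10583.722)
    h = 3100 * math.exp(-b1 * 1300)
    m = 0.97 * 3100 * b1 * (math.exp(-b1 * 4800) - math.exp(-b2 * 4800)) / (b2 - b1)
    w = (0.93 * b2 * 0.97 * 3100
         * ((math.exp(-b1 * 4800) - math.exp(-b3 * 4800)) / (b3 - b1)
            - (math.exp(-b2 * 4800) - math.exp(-b3 * 4800)) / (b3 - b2)) / (b2 - b1))
    ht = -h * b1                      # rates, matching dH/dt, dM/dt, dW/dt above
    mt = 0.97 * h * b1 - m * b2
    wt = 0.93 * m * b2 - w * b3
    S0 = 0.033 * (h - 3100) + 0.068 * m + 0.222 * w
    # fitness grows as the predicted degree of reduction approaches 0.62
    return 1 / (0.62 - ((S0 + (0.033 * ht + 0.068 * mt + 0.222 * wt) * 4800) / 936))

z = solve2([1e-3, 2e-3, 3e-3, 100.0, 200.0, 300.0])  # hypothetical parameter values
```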
The code was run using the in-built MATLAB function, ga. The GUI interface was used, the
conditions were inserted, and the code was run for 400 generations. The results found were:
kh (s-1)   km (s-1)   kw (s-1)   Eh (kJ/mol)   Em (kJ/mol)   Ew (kJ/mol)
2. Aerospace Engineering- To design the wing shape of supersonic aircraft, minimizing aerodynamic
drag at supersonic cruising speeds, minimizing drag at subsonic speeds, and minimizing aerodynamic
load.
6. Routing and Scheduling- To find optimal routing paths in telecommunication networks, which are
used to relay data from senders to recipients.
8. Data mining- GA can also be used in data mining and pattern recognition. Given a large set of
data, GA can perform a search for the optimum set of data which satisfies the conditions. Initial
population in this case may be single objective conditions and at the end of the GA, we get a
combined complex condition which when applied on the large data set can lead to finding the
required set of data.
Strengths of GAs
1. The main strength of GA is that it is intrinsically parallel, whereas most other algorithms are
serial and can search the solution space of a problem in only one direction at a time; if the
solution they find turns out to be suboptimal, all previous work must be abandoned and the
search started over. Since a GA has multiple offspring, it can explore the solution space in
multiple directions at once, even if some of those directions turn out to be dead ends.
2. GA is better at solving problems where the space of all potential solutions is truly huge.
Nonlinear problems fall into this category, where changing one component may have ripple
effects on the entire system, and where multiple changes that are individually detrimental
may lead to much greater improvements in the fitness function.
3. GA performs well on problems for which the fitness function is complex, discontinuous, noisy,
changing over time, or riddled with local optima. Whereas other search algorithms can
become trapped by local optima, GA works well to avoid them.
4. One of the main features of GA is its ability to manipulate many parameters simultaneously.
Many problems cannot be stated as a single value to be maximized or minimized, but must
be expressed as multiple objectives.
5. GAs know nothing about the problems they are deployed to solve. The virtue of this
technique is that it allows the GA to start out with an open mind. Since decisions in a GA are
based on randomness, all possible search pathways are theoretically open to it.
Neural networks (NN) fall into the category of supervised models. That is, our data will be a set of
rows, where each row contains an input and a corresponding output for that input. An NN learns by
seeing the difference between the correct output and the one it predicted, and then adjusting its
parameters. So one cannot use an NN without input-output data.
Genetic algorithms (GA) are basically optimizers. Here we will have some set of parameters that we
want to optimize for something. We will need an evaluation function that takes these parameters
and tells us how good these parameters are. So we keep changing these parameters somehow until
we get an acceptable value from our evaluation function, or until we see that things are not
improving any more.
Examples: Scheduling airplanes/shipping. Timetables. Finding the best characteristics for a simple
agent in an artificial environment. Rendering an approximation of a picture with random polygons.
If we have data that is suitable for supervising a model, then we can use a NN. If we want to
optimize some parameters, then use a GA. But most importantly, it is the nature of our data and
what we want out of it that should decide what model to use.
Simulated annealing:
The fitness of the new solution is then compared to the fitness of the previous solution; if it is
higher, the new solution is kept. Otherwise, the algorithm makes a decision whether to keep or
discard it based on temperature. If the temperature is high, as it is initially, even changes that cause
significant decreases in fitness may be kept and used as the basis for the next round of the
algorithm, but as temperature decreases, the algorithm becomes more and more inclined to only
accept fitness-increasing changes. Finally, the temperature reaches zero and the system "freezes";
whatever configuration it is in at that point becomes the solution. Simulated annealing is often used
for engineering design applications such as determining the physical layout of components on a
computer chip.
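The accept-or-reject rule above can be sketched as follows. The Metropolis acceptance probability exp(Δ/T), the geometric cooling schedule, and the toy objective are standard choices assumed here for illustration, not taken from the text.

```python
import math
import random

def anneal(fitness, neighbor, x0, t0=10.0, cooling=0.95, steps=500):
    """Simulated annealing: always keep improvements; keep a worse
    solution with probability exp(delta / T), which shrinks as T falls."""
    x, t = x0, t0
    for _ in range(steps):
        y = neighbor(x)
        delta = fitness(y) - fitness(x)
        if delta >= 0 or random.random() < math.exp(delta / t):
            x = y                      # new solution becomes the basis
        t *= cooling                   # temperature decreases each round
    return x

# assumed toy problem: maximize -(x - 3)^2 over the reals
best = anneal(lambda x: -(x - 3) ** 2,
              lambda x: x + random.uniform(-0.5, 0.5), x0=0.0)
```

After about 150 steps the temperature is effectively zero, so the tail of the run is pure hill-climbing, which is the "freezing" behaviour described above.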
Important Problems of GA:
There are several problems related to using GAs, but we stress four major ones:
1. Difficulty in population selection.
Population selection is a big question, but suitable selections have been found for different problems. Consider, for example, an optimization problem with n independent variables, each bounded between given values. In such problems we first decide how much accuracy is required; if we then choose a bit-string representation, the string length (and hence the population encoding) follows from that required accuracy.
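The link between accuracy and string length can be made concrete. A common rule from standard GA texts (an assumption here, since the original formula is not reproduced) is to pick the smallest m with 2^m − 1 ≥ (b − a)·10^d for d decimal places of precision on [a, b]:

```python
def bits_needed(a, b, decimals):
    """Smallest m such that 2**m - 1 >= (b - a) * 10**decimals,
    i.e. the grid of 2**m codes is at least as fine as the target accuracy."""
    target = (b - a) * 10 ** decimals
    m = 1
    while 2 ** m - 1 < target:
        m += 1
    return m

# e.g. x in [-1, 2] with 6 decimal places of accuracy needs 22 bits
print(bits_needed(-1, 2, 6))  # 22
```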
Generally the GA fitness-function value increases with iteration, so we must define the fitness function so that it increases whether we are searching for a minimum or a maximum. For example, if the problem is to find an optimal minimum value, the fitness function is defined so that the fitness of the optimal solution increases with every iteration; a common choice is a decreasing transformation of the objective, e.g. F(x) = 1/(1 + f(x)). So we can always find a way to define the fitness function so that the problem fits into the GA.
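This minimum-to-maximum transformation is easy to check numerically. The quadratic objective below is an assumed example; the transformation F(x) = 1/(1 + f(x)) is a standard textbook choice for non-negative objectives:

```python
def fitness_from_objective(f):
    """Turn a minimization objective f (assumed >= 0) into a fitness to
    maximize: F(x) = 1 / (1 + f(x)), so smaller f gives larger F."""
    return lambda x: 1.0 / (1.0 + f(x))

F = fitness_from_objective(lambda x: (x - 5) ** 2)
print(F(5), F(4), F(0))   # 1.0 0.5 0.038461538461538464
```

The fitness peaks exactly at the minimizer x = 5, so a fitness-maximizing GA now solves the minimization problem.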
Rapid convergence is a very common and very general problem in GAs. It occurs because the fitness function changes its value rapidly, which results in premature convergence. There are several methods to modify the fitness function so that convergence occurs slowly, allowing the GA enough time to search the whole space and find the global optimum. Two ways are defined below.
1. Linear scaling: F' = a*F + b
Here a is a small number while b is relatively larger; it can be clearly seen that even if F increases rapidly, F' will increase only slowly.
2. Power-law scaling: F' = F^k
Here k is kept low so that even if F increases rapidly it may not create rapid convergence. There is a slightly more advanced way to define the fitness function, in which it is modified in each and every generation:
k = func(t), where t is the generation.
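Both scaling schemes can be demonstrated on a rapidly growing fitness sequence. The constants a = 0.1, b = 5.0 and k = 0.5 are illustrative assumptions:

```python
def linear_scale(F, a=0.1, b=5.0):
    """Linear scaling F' = a*F + b: a small 'a' damps rapid fitness growth."""
    return a * F + b

def power_scale(F, k=0.5):
    """Power-law scaling F' = F**k with k < 1 compresses large fitnesses."""
    return F ** k

raw = [1, 10, 100]                       # rapidly growing raw fitnesses
print([linear_scale(F) for F in raw])    # [5.1, 6.0, 15.0]
print([power_scale(F) for F in raw])     # [1.0, 3.1622776601683795, 10.0]
```

In both cases a 100-fold spread in raw fitness shrinks to roughly a 3- to 10-fold spread, so selection pressure stays moderate and the population keeps exploring.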
Another important problem is that most of the time a GA converges to a local optimum. We may well think that we have found the optimal solution when actually we have not. Note that the GA's ability to find the optimal solution lies in the hands of the user: the optimal solution will be achieved only if the programmer has written the code so that the GA can search the whole space. So in such a case we should modify our crossover and mutation functions, because they are responsible for changing the population in each iteration.
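The two operators in question can be sketched as follows; the one-point crossover and per-bit mutation shown are standard forms, and the parents and mutation rate are assumed for illustration:

```python
import random

def one_point_crossover(p1, p2):
    """Swap tails at a random cut point, recombining the parents' genes."""
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(ind, pm=0.05):
    """Flip each bit with small probability pm, injecting fresh diversity
    so the search is not permanently trapped near a local optimum."""
    return [1 - g if random.random() < pm else g for g in ind]

c1, c2 = one_point_crossover([1, 1, 1, 1], [0, 0, 0, 0])
# together the two children preserve exactly the parents' gene pool
```

Raising pm, or switching to a more disruptive crossover, is the usual lever when a run keeps stalling at the same local optimum.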
*10101100101
Where * is called the don't-care symbol and can be either 1 or 0. So the above schema matches the two strings shown below:
110101100101
010101100101
From this we can easily see that 1010100011 represents only one string, but ********** represents all 2^10 strings of length 10.
Now there are two important schema properties that should be noted here:
• Order of the schema (denoted o(S)): the total number of fixed positions, i.e. 0's and 1's, in the schema.
• Defining length of the schema (denoted Ω(S)): the distance between the first and the last fixed string position.
For Example –
S1 = ***001*110
S2 = 11101**001
So,
o(S1) = 6, o(S2) = 8 ;
Ω(S1) = 10-4 = 6, Ω(S2) = 10-1 = 9
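These two properties can be computed mechanically, which makes the worked example above easy to verify:

```python
def order(schema):
    """o(S): the number of fixed (non-*) positions."""
    return sum(c != '*' for c in schema)

def defining_length(schema):
    """Ω(S): distance between the first and last fixed positions (1-indexed)."""
    fixed = [i for i, c in enumerate(schema, start=1) if c != '*']
    return fixed[-1] - fixed[0]

S1, S2 = "***001*110", "11101**001"
print(order(S1), order(S2))                      # 6 8
print(defining_length(S1), defining_length(S2))  # 6 9
```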
Now, as discussed earlier, the GA program consists of four consecutive repeated steps:
t ⟵ t+1
Select P(t) from P(t-1), recombine P(t), and evaluate P(t). For this we will require some new terms:
£(S,t): the number of strings in the population that match the schema S at time t.
for example
S = ****111***********************
o(S) = 3
Ω(S) = 2
X1 = 11110111000000101010101010100
X2 = 01110011001010010101001010101
X3 = 01010100101011111111111111000
X4 = 00000011111111111100000000011
X5 = 10110110000000011010101001101
X6 = 10101111110100000000111111111
X7 = 00000000011111111101101010010
X8 = 01010111111000000000111111111
X9 = 10101010010100000111111111000
X10 = 10101010100100101001010010101
X11 = 10000111101010101111111100000
X12 = 10101011111111111111111111111
X13 = 10101111000000010101001010101
X14 = 10001000000000000011111111111
X15 = 10101111111111110000000000000
X16 = 10000000010101010010101001001
X17 = 10001110000000010101000010000
X18 = 10000000000000000000000000000
X19 = 11110000111101010010101001000
X20 = 00000000000000000011111111111
Then in the above population X6, X13, X15 and X17 are the four strings that match the schema (each has 111 in positions 5-7).
So £(S,t) = 4
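Schema membership can be checked mechanically, which is a useful guard against miscounting by eye. A sketch over a few of the population strings above:

```python
def matches(schema, s):
    """A string matches a schema when every fixed position agrees."""
    return all(c == '*' or c == b for c, b in zip(schema, s))

S = "****111" + "*" * 22          # fixed 111 at positions 5-7
print(matches(S, "10101111110100000000111111111"))  # X6  -> True
print(matches(S, "10101111000000010101001010101"))  # X13 -> True
print(matches(S, "11110111000000101010101010100"))  # X1  -> False
```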
eval(S,t) is defined as the average fitness of the strings in the population that match the schema, i.e.
eval(S,t) = F(S,t) / £(S,t)
where F(S,t) is the sum of the fitnesses of the matching strings. Under proportional (roulette-wheel) selection, the expected number of matching strings in the next generation is
£(S,t+1) = £(S,t) · eval(S,t) / F̄(t)
where F̄(t) = F(t)/pop_size is the average fitness of the entire population, pop_size is the population size and F(t) is the sum of the fitnesses of all strings.
Now from this relation it is clear that an "above average" schema receives an increasing number of strings in the next generation and a "below average" schema receives a decreasing number.
In particular, if a schema stays above the population average F̄(t) by a constant fraction α, i.e. eval(S,t) = (1 + α)·F̄(t), then
£(S,t) = £(S,0)·(1 + α)^t
so an above-average schema receives an exponentially increasing number of trials.
Consider two schemata over strings of length m = 33:
S1 = ***111***************************
S2 = 111****************************01
It can be clearly seen that there is a relatively high chance that S1 will survive crossover, while S2 will almost surely be destroyed unless it is crossed with itself or with a similar schema.
In general a crossover site is selected uniformly among the m-1 possible sites. This implies that the probability of destruction of a schema S is
pd(S) = Ω(S)/(m-1)
and the probability of survival is
ps(S) = 1 - Ω(S)/(m-1)
Here Ω(S1) = 2 and Ω(S2) = 32, so ps(S1) = 30/32 and ps(S2) = 0.
Crossover is applied only with probability pc, so the survival probability under crossover is actually
ps(S) = 1 - pc·Ω(S)/(m-1)
and the expected number of strings matching the schema after selection and crossover is
£(S,t+1) ≥ £(S,t)·[eval(S,t)/F̄(t)]·[1 - pc·Ω(S)/(m-1)]
Mutation flips each bit independently with probability pm, so a schema survives mutation only if none of its o(S) fixed positions is flipped:
ps(S) = (1 - pm)^o(S)
Since pm << 1,
ps(S) ≈ 1 - o(S)·pm
Combining selection, crossover and mutation gives the final expected number of strings matching the schema.
£(S,t+1) ≥ £(S,t)·[eval(S,t)/F̄(t)]·[1 - pc·Ω(S)/(m-1) - o(S)·pm]
where F̄(t) is the average fitness of the whole population. If [eval(S,t)/F̄(t)]·[1 - pc·Ω(S)/(m-1) - o(S)·pm] > 1, then the number of strings similar to the schema increases exponentially; otherwise it decreases exponentially. This is how the GA discriminates between above-average and below-average schemata.
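The discrimination criterion can be evaluated numerically. The fitness ratio and operator probabilities below are assumed illustrative numbers for a short, low-order, above-average schema with m = 33:

```python
def growth_factor(eval_S, avg_F, omega, o, m, pc, pm):
    """Schema-theorem growth factor:
    (eval(S,t)/avg_F) * (1 - pc*omega/(m-1) - o*pm).
    A value > 1 means the schema is expected to spread exponentially."""
    return (eval_S / avg_F) * (1 - pc * omega / (m - 1) - o * pm)

# assumed: schema 30% above average, omega=2, o=3, pc=0.7, pm=0.01
g = growth_factor(eval_S=1.3, avg_F=1.0, omega=2, o=3, m=33, pc=0.7, pm=0.01)
print(g)        # 1.204125
print(g > 1)    # True: the schema proliferates
```

A long or high-order schema (large omega or o) drags the survival term down, which is exactly why the theorem favours short, low-order, above-average building blocks.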
Conclusion:
We have explored the genetic algorithm as an optimizing technique and studied its advantages over other optimizing techniques. The comparison showed that the GA, due to its stochastic nature, parallelism and natural-selection mechanism, is better than others in many ways. GAs can be applied to complex non-linear functions and to problems involving huge solution-space searches where other optimizing techniques may not work. For simple problems, however, a GA may be quite expensive with respect to time.
We also explained the algorithm with the help of an example showing all the major steps of a GA, including reproduction and selection, crossover and mutation. We found that the total fitness of the population increases with iterations and hence we approach the optimum values. We discussed varied applications of GAs in many fields and, finally, showed the use of a GA in finding the rate parameters of the reduction of iron ore in the presence of graphite.
References:
1. Golap Md. Chowdhury, Gour G. Roy, “Application of Genetic Algorithm (GA) to estimate the rate
parameters for solid state reduction of iron ore in presence of graphite”, Computational
Materials Science 45 (2009) 176–180
2. Dorit Wolf and Ralf Moros, "Estimating rate constants of heterogeneous catalytic reactions without supposition of rate-determining surface steps - an application of a genetic algorithm", Chemical Engineering Science, Vol. 52, No. 7, pp. 1189-1199, 1997
5. www.rennard.org/alife/english/gavintrgb.html
6. www.ai-junkie.com/ga/intro/gat1.html
7. www.genetic-programming.org/
8. en.wikipedia.org/wiki/Genetic_algorithm