SoftComputing Unit4 GeneticAlgorithms

EVOLUTIONARY COMPUTATIONAL THEORY

LAMARCKISM
In earlier times people believed that the organisms on the earth had not undergone any change. Jean-Baptiste Lamarck was the first person to propose a theory of evolution. He thought that at some point in history the giraffe was about the size of a deer. Owing to a shortage of food on the ground, giraffes began stretching their necks to reach the lower branches of trees. Because of this continuous stretching of the neck, after several generations giraffes developed long necks. Characters that develop during the lifetime of an organism are called 'acquired characters'. Lamarck proposed that these acquired characters are passed on to the offspring, i.e. to the next generation, and called this the theory of 'inheritance of acquired characters'. The elongation of the neck and forelimbs in the giraffe is his standard example.

AUGUST WEISMANN
August Weismann tested this theory with an experiment on rats. He removed the tails of parent rats and observed that their offspring were nevertheless born with normal tails. He repeated this for twenty-two generations, and the offspring were still born with normal tails. He thus showed that bodily changes acquired during life are not inherited and are not passed on to the offspring.

Darwinism
Charles Darwin (1809-1882), born in England, proposed 'natural selection', the famous theory of evolution. When he was 22 years old he set out on a five-year voyage aboard the world survey ship HMS Beagle. He visited a number of places, including the Galapagos Islands, and keenly observed the flora and fauna of these places, gathering a great deal of information and evidence. In the Galapagos Islands Darwin observed a small group of related birds, the finches, exhibiting diversity in structure, especially in their beaks (observe fig-12: how do the beaks help them?). He was influenced by the book 'Principles of Geology' by Sir Charles Lyell, who suggested that geological changes occur at a uniform rate. Darwin did not agree with this idea; he felt that large changes occur through the accumulation of small changes.
Darwin was also influenced by the famous Malthus theory, set out in 'An Essay on the Principle of Population'. Malthus observed that population grows in geometrical progression (1, 2, 4, 8, ...) whereas food sources increase in arithmetic progression (1, 2, 3, 4, 5, ...).
Based on these ideas, Darwin proposed the theory of 'natural selection', which means that nature selects or decides which organisms survive or perish. This is the meaning of 'survival of the fittest': organisms with useful traits survive, while organisms whose traits are not useful are gradually eliminated from the environment. Alfred Russel Wallace independently concluded that natural selection contributed to the origin of new species.
For example, consider red beetles that are seen and eaten by crows: the population of red beetles gradually gets eliminated from the environment. At the same time, the green beetles sitting on green leaves are not noticed by the crows, so the green beetles survive and their population gradually increases. This is nothing but natural selection: variations that are useful to an individual are retained, while those that are not useful are lost. In a population there is a struggle for existence, and the 'fittest' survive; nature favours only useful variations. Each species tends to produce a large number of offspring, which compete with one another for food, space, mates and other needs. In this struggle for existence only the fittest can survive; this is called 'survival of the fittest'. Over a long period of time this leads to the formation of new species. You may observe in your surroundings that only some seedlings and some young animals survive.

DARWIN’S THEORY OF EVOLUTION IN A NUTSHELL


1. In any population of an organism variations develop, and the members of the group are not all identical.
2. Variations are passed from parent to offspring through heredity.
3. The overabundance of offspring leads to a constant struggle for survival in any population.
4. Individuals with variations that help them survive and reproduce tend to live longer and have more offspring than organisms with less useful features.
5. The offspring of survivors inherit the useful variations, and the same process happens in every new generation until the variation becomes a common feature.
6. As the environment changes, the organisms within it adapt to the new living conditions.
7. Over a long period of time, each species can accumulate so many changes that it becomes a new species, similar to but distinctly different from the original species. All species on earth arise in this way.
8. Evolution is a slow and continuous process.

There are some limitations of and objections to Darwin's theory, and many newer theories, such as the synthetic theory and the mutation theory, have been put forward.

Speciation

How are new species evolved?

We have seen variations in a population of a species, where some organisms carry traits that helped them adapt to the environment. These organisms survive more efficiently. But in the same population, the organisms that carry non-beneficial traits may fail to adapt to the environment; they slowly perish or get eliminated, like the red beetles in a population of red and green beetles. Such small changes within a species, for example the red and green colour of the beetles, are known as microevolution. Now let us discuss how new species are formed. This is known as speciation, which is also called macroevolution. We have seen that red and green beetles can mate with each other and have offspring. But imagine that the red and green beetles are separated by some cause (for example, while eating beetles, crows accidentally drop some of them in far-away places) for many years. A lot of variation may take place in the red and green beetle populations during these years. Now, even if they meet accidentally, they can no longer mate and produce offspring; each can mate and reproduce only within its own population, either red or green. Thus a new species has been formed.

Observe the diagram below showing variation in a beetle population and its impact. This is only an assumption.

Situation-1:
In this situation a colour variation arises during reproduction, so one beetle appears that is green in colour instead of red.

Moreover, this green beetle passes its colour to its offspring (progeny), so all its progeny are green. Crows cannot see the green beetles on the green leaves of the bushes and therefore cannot eat them, but they can see the red beetles and eat them.
As a result there are more and more green beetles, while the red ones decrease in number. The 'green' colour variation gave the green beetles a survival advantage over the red beetles; in other words, it was naturally selected. We can see that the natural selection was exerted by the crows: the more crows there are, the more red beetles are eaten and the greater the number of green beetles in the population becomes. Thus natural selection directs evolution in the beetle population. It results in the beetle population adapting to fit its environment better. Let us think of another situation.

Situation-2:
In this situation a colour variation again occurs in the progeny during reproduction, but now it results in 'blue' beetles instead of 'red' ones. The blue beetle passes its colour to its progeny, so all its progeny are blue. Crows can see the blue beetles on the green leaves of the bushes just as well as the red ones, and therefore can eat both red and blue beetles. In this case there is no survival advantage for the blue beetles, as there was for the green ones. Initially there are a few blue beetles in the population, but most are red. Imagine at this point an elephant comes by and stamps on the bushes where the beetles live, killing most of them. By chance, the few beetles that survive are mostly blue. The beetle population slowly increases again, but now most of the beetles are blue. Thus accidents can also change certain characters of a population. Characters, as we know, are governed by genes, so this amounts to a change in the frequency of genes in a small population. This is known as 'genetic drift', which provides diversity in the population.
Suppose this beetle population is increasing, but suddenly the bushes are attacked by a plant disease that destroys leaf material. With the leaves affected, the beetles get less food and are poorly nourished, so the weight of the beetles decreases, but no change takes place in their genetic material (DNA). After a few years the plant disease is eliminated, and the bushes are healthy again with plenty of leaves. What do you think the condition of the beetles will be?

We inherit our traits from our parents. Let us see how sex is determined in human beings. Each human cell contains 23 pairs (46) of chromosomes. Of these 23 pairs, 22 pairs are autosomes: chromosomes whose number and morphology do not differ between males and females of a species. The remaining pair is called the allosomes or sex chromosomes. These are of two types, 'X' and 'Y', and these two chromosomes determine the sex of an individual. Females have two X chromosomes in their cells (XX); males have one X and one Y chromosome in their cells (XY). All the gametes (ova) produced by a woman carry only an X chromosome. The gametes (sperm) produced by a man are of two types: one with an X chromosome and the other with a Y chromosome. If a sperm carrying a Y chromosome fertilizes the ovum (X chromosome), the baby will have the XY condition, so the baby will be a boy.

[Diagram: Parents: Father 44+XY, Mother 44+XX. Gametes: sperm 22+X (gyno sperm) or 22+Y (andro sperm); eggs 22+X. Offspring: 44+XX (baby girl) or 44+XY (baby boy).]

What will happen if a sperm containing an X chromosome fertilizes the ovum? Who decides the sex of the baby, the mother or the father? Is sex also a character or trait? Does it follow Mendel's law of dominance? Were all your traits similar to those of your parents?

Ant Colony Optimization


The algorithmic world is rich with strategies and tools being developed round the clock to meet the need for high-performance computing, and when algorithms are inspired by natural laws, interesting results are observed. Evolutionary algorithms belong to such a class: they are designed to mimic certain behaviours and evolutionary traits of living organisms. Such algorithmic design is not constrained to humans but can be inspired by the natural behaviour of certain animals as well. The basic aim of fabricating such methodologies is to provide realistic, relevant and yet low-cost solutions to problems that are otherwise unsolvable by conventional means. Different optimization techniques have thus evolved from such evolutionary algorithms, opening up the domain of metaheuristics. 'Metaheuristic' is derived from two Greek words: meta, meaning one level above, and heuriskein, meaning to find. Algorithms such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are examples of swarm intelligence and metaheuristics. The goal of swarm intelligence is to design intelligent multi-agent systems by taking inspiration from the collective behaviour of social insects such as ants, termites, bees and wasps, and of other animal societies such as flocks of birds or schools of fish.

Background:
The Ant Colony Optimization technique is inspired by the foraging behaviour of ant colonies and was first introduced by Marco Dorigo in the 1990s. Ants are eusocial insects that prefer community survival and sustenance over individual existence. They communicate with each other using sound, touch and pheromones. Pheromones are organic chemical compounds secreted by ants that trigger a social response in members of the same species; they act like hormones outside the body of the secreting individual, influencing the behaviour of the receiving individuals. Since most ants live on the ground, they use the soil surface to leave pheromone trails that may be followed (smelled) by other ants. Ants live in community nests, and the underlying principle of ACO is to observe how ants move from their nests to search for food along the shortest possible path. Initially, ants move randomly around their nests in search of food. This randomized search opens up multiple routes from the nest to the food source. Then, depending on the quality and quantity of the food, ants carry a portion of it back, laying down pheromone at an appropriate concentration on the return path. The pheromone on these trails guides the path-selection probability of the ants that follow. Evidently, this probability depends on both the concentration and the evaporation rate of the pheromone; and since the evaporation rate is a deciding factor, the length of each path is automatically accounted for.

In the figure above, for simplicity, only two possible paths are considered between the food source and the ant nest. The stages can be analyzed as follows:
1. Stage 1: All ants are in their nest. There is no pheromone in the environment. (For algorithmic design, a residual pheromone amount can be assumed without affecting the probabilities.)
2. Stage 2: Ants begin their search with equal probability (0.5 each) along each path. The curved path is longer, so the time taken by ants to reach the food source along it is greater.
3. Stage 3: The ants taking the shorter path reach the food source earlier. They now face a similar selection dilemma on the return trip, but since a pheromone trail is already present along the shorter path, its probability of selection is higher.
4. Stage 4: More ants return via the shorter path, and the pheromone concentration on it increases further. Meanwhile, due to evaporation, the pheromone concentration on the longer path decreases, lowering its probability of selection in later stages. The whole colony therefore gradually converges to the shorter path with higher probability, and path optimization is attained.

Algorithmic Design:
Pertaining to the above behaviour of the ants, an algorithmic design can now be developed. For simplicity, a single food source and a single ant colony are considered, with just two possible paths of traversal. The whole scenario can be realized as a weighted graph in which the ant colony and the food source are the vertices (or nodes), the paths serve as the edges, and the pheromone values are the weights associated with the edges. Let the graph be G = (V, E), where V is the set of vertices and E the set of edges. The vertices in our setting are Vs (source vertex: the ant colony) and Vd (destination vertex: the food source). The two edges are E1 and E2, with lengths L1 and L2 assigned to them. The associated pheromone values (indicative of trail strength) are R1 for edge E1 and R2 for edge E2. Thus, for each ant, the starting probability of selecting a path (between E1 and E2) can be expressed as

    Pi = Ri / (R1 + R2),  i = 1, 2

Evidently, if R1 > R2, the probability of choosing E1 is higher, and vice versa. While returning through its chosen path, say Ei, an ant updates the pheromone value for that path. The update is based on the length of the path as well as the evaporation rate of the pheromone, and can be realized step-wise as follows:

1. In accordance with path length:

    Ri = Ri + K / Li,  i = 1, 2

Here 'K' serves as a parameter of the model. The update depends on the length of the path: the shorter the path, the more pheromone is added.
2. In accordance with the evaporation rate of pheromone:

    Ri = (1 - v) * Ri,  i = 1, 2

The parameter 'v' belongs to the interval (0, 1] and regulates the pheromone evaporation. At each iteration, all ants are placed at the source vertex Vs (the ant colony). Subsequently, the ants move from Vs to Vd (the food source) following step 1. Next, all ants make their return trip and reinforce their chosen path according to step 2.
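The two update rules above, together with the path-selection probability (taken here as Pi = Ri / (R1 + R2)), can be sketched as a minimal simulation of the two-path model. The parameter values for K, v and the path lengths are illustrative assumptions, not values from the text:

```python
import random

L1, L2 = 1.0, 2.0          # shorter and longer path lengths (assumed)
R1, R2 = 1.0, 1.0          # initial pheromone values on E1, E2
K, v = 1.0, 0.1            # deposition constant and evaporation rate

random.seed(42)
for _ in range(100):       # simulate 100 ant trips
    p1 = R1 / (R1 + R2)               # probability of choosing E1
    chose_e1 = random.random() < p1
    # Evaporation on both edges: Ri = (1 - v) * Ri
    R1, R2 = (1 - v) * R1, (1 - v) * R2
    # Deposition on the chosen edge: Ri = Ri + K / Li
    if chose_e1:
        R1 += K / L1
    else:
        R2 += K / L2

print(R1 > R2)  # True: the shorter path accumulates more pheromone
```

Because deposition is proportional to 1/Li, the shorter path E1 receives more pheromone per trip, and the positive feedback gradually concentrates the colony on it.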

Pseudocode:

Procedure AntColonyOptimization:
    Initialize necessary parameters and pheromone trails;
    while not termination do:
        Generate ant population;
        Calculate fitness values associated with each ant;
        Find best solution through selection methods;
        Update pheromone trails;
    end while
end procedure

The pheromone update and the fitness calculations in the above pseudocode follow the step-wise rules given earlier. This establishes the ACO optimization technique. The application of ACO extends to various problems, such as the famous Travelling Salesman Problem (TSP).
Introduction to Genetic Algorithm

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@sit.iitkgp.ernet.in

26.02.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 26.02.2016 1 / 26


Limitations of the traditional optimization approaches

Limitations:

Computationally expensive.
May fail for a discontinuous objective function.
May not be suitable for parallel computing.
Discrete (integer) variables are difficult to handle.
Not necessarily adaptive.

Evolutionary algorithms have evolved to address the above-mentioned limitations of solving optimization problems with traditional approaches.



Evolutionary Algorithms

The algorithms that follow some biological and physical behaviours:

Biological behaviours:
Genetics and Evolution –> Genetic Algorithms (GA)
Behaviour of ant colony –> Ant Colony Optimization (ACO)
Human nervous system –> Artificial Neural Network (ANN)

In addition, there are some algorithms inspired by physical behaviours:

Physical behaviours:
Annealing process –> Simulated Annealing (SA)
Swarming of particles –> Particle Swarm Optimization (PSO)
Learning –> Fuzzy Logic (FL)



Genetic Algorithm

It is a subset of evolutionary algorithms, alongside:

Ant Colony Optimization

Particle Swarm Optimization

It models biological processes:

Genetics
Evolution

to optimize highly complex objective functions that are:
Very difficult to model mathematically
NP-Hard (also called combinatorial optimization) problems, which are computationally very expensive
Involving a large number of parameters (discrete and/or continuous)



Background of Genetic Algorithm

First introduced by Prof. John Holland (University of Michigan, USA, 1965).
But the first article on GA was published in 1975.

Principles of GA are based on two fundamental biological processes:

Genetics: Gregor Johann Mendel (1865)
Evolution: Charles Darwin (1859)



A brief account on genetics
The basic building blocks in living bodies are cells. Each cell carries
the basic unit of heredity, called gene
Nucleus

Chromosome

Other cell bodies

For a particular specie, number of chromosomes is fixed.


Examples
Mosquito: 6
Frogs: 26
Human: 46
Goldfish: 94
etc.
A brief account on genetics
Genetic code

The spiral (double) helix of nucleic acid in the cell is called DNA.

For a species, the DNA code is unique, that is, it varies from one individual to another.
The DNA code (which passes some characteristics from one generation to the next) is therefore used as a biometric trait.
A brief account on genetics

Reproduction

[Diagram] In reproduction, each parent organism's cell undergoes division to produce gametes. A reproductive cell (gamete, x or y) is haploid, carrying half the number of chromosomes of a diploid cell. When two gametes combine, the chromosomes from both haploids are combined to restore the full (diploid) number in the offspring's cell.



A brief account on genetics
Crossing over

[Diagram: crossing over near the kinetochore.] In crossing over, information from two different organisms' body cells is combined, so that diversity in the genetic information becomes possible. Random crossover points make virtually infinite diversity possible.



A brief account on evolution
Evolution: Natural Selection

Four primary premises:

1 Information propagation: An offspring has many of the characteristics of its parents, i.e. information passes from parent to offspring. [Heredity]
2 Population diversity: There is variation in the characteristics of the next generation. [Diversity]
3 Struggle for existence: Only a small percentage of the offspring produced survive to adulthood. [Selection]
4 Survival of the best: Which offspring survive depends on their inherited characteristics. [Ranking]



A brief account on evolution

Mutation:

Mutation makes the process forcefully dynamic when the variation in the population is about to become stable.



Biological process : A quick overview

Genetics



Working of Genetic Algorithm

Definition of GA:
A genetic algorithm is a population-based probabilistic search and optimization technique, which works based on the mechanisms of natural genetics and natural evolution.



Framework of GA

[Flowchart] Start –> Initial population –> Converge? If No: Selection –> Reproduction –> back to the convergence test. If Yes: Stop.

Note: An individual in the population corresponds to a possible solution.



Working of Genetic Algorithm

Note:

1 GA is an iterative process.
2 It is a searching technique.
3 The working cycle runs with or without convergence.
4 A (globally optimal) solution is not necessarily guaranteed; usually the algorithm terminates at a local optimum.



Framework of GA: A detail view

[Flowchart] Start –> Define parameters –> Parameter representation (encoding) –> Create and initialize population –> Apply the cost function to each member of the population –> Evaluate the fitness –> Converge? If Yes: Stop. If No: Selection –> Select mates –> Crossover –> Mutation –> Inversion (together, Reproduction) –> back to fitness evaluation.


Optimization problem solving with GA

For the optimization problem, identify the following:

Objective function(s)

Constraint(s)

Input parameters

Fitness evaluation (it may be algorithm or mathematical formula)

Encoding

Decoding

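For a concrete (hypothetical) illustration of this checklist, consider the toy problem "minimize f(x) = x^2 for integers 0 <= x <= 31" with a 5-bit binary encoding. All function names below are assumptions made for illustration:

```python
# Objective, constraint, encoding, decoding and fitness for a toy problem
# (illustrative sketch; names and problem are assumptions, not from the slides).

def objective(x):               # objective function
    return x * x

def constraint(x):              # constraint(s) on the input parameter
    return 0 <= x <= 31

def encode(x):                  # encoding: integer -> 5-bit chromosome
    return format(x, "05b")

def decode(chromosome):         # decoding: 5-bit chromosome -> integer
    return int(chromosome, 2)

def fitness(chromosome):        # fitness: negate objective so that
    x = decode(chromosome)      # "fitter" means "smaller objective"
    return -objective(x) if constraint(x) else float("-inf")

print(encode(13), decode("01101"), fitness("01101"))  # -> 01101 13 -169
```

Note the design choice: since GA conventionally maximizes fitness, a minimization objective is negated (and infeasible chromosomes get -inf).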


GA Operators

In fact, a GA implementation involves the realization of the following operations.

1 Encoding: How to represent a solution to fit the GA framework.
2 Convergence: How to decide the termination criterion.
3 Mating pool: How to generate the next solutions.
4 Fitness evaluation: How to evaluate a solution.
5 Crossover: How to make a diverse set of next solutions.
6 Mutation: How to explore other solution(s).
7 Inversion: How to move from one optimum to another.



Different GA Strategies

Simple Genetic Algorithm (SGA)

Steady State Genetic Algorithm (SSGA)

Messy Genetic Algorithm (MGA)



Simple GA

[Flowchart]
Start –> Create initial population of size N –> Evaluate each individual.
Convergence criteria met? If Yes: return the individual(s) with the best fitness value and stop.
If No:
  Select Np individuals (with repetition).
  Create the mating pool (randomly): pairs of parents for generating new offspring.
  Perform crossover and create new offspring.
  Mutate the offspring.
  Perform inversion on the offspring.
  Replace all individuals in the last generation with the newly created offspring (Reproduction), then return to evaluation.
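The Simple GA flow above can be sketched as a short program. The toy objective (maximize f(x) = x^2 over 5-bit integers), the population size, and the crossover/mutation settings are illustrative assumptions; inversion is omitted for brevity:

```python
import random

random.seed(1)
N, BITS, GENS = 20, 5, 30           # illustrative parameter choices

def fitness(c):                      # toy objective: maximize x^2
    return int(c, 2) ** 2

def select(pop):                     # fitness-proportionate selection
    return random.choices(pop, weights=[fitness(c) + 1 for c in pop], k=2)

def crossover(a, b):                 # single-point crossover
    p = random.randrange(1, BITS)
    return a[:p] + b[p:], b[:p] + a[p:]

def mutate(c, rate=0.05):            # bit-flip mutation
    return "".join(b if random.random() > rate else "10"[int(b)] for b in c)

# Create initial population, then run generational replacement
pop = ["".join(random.choice("01") for _ in range(BITS)) for _ in range(N)]
for _ in range(GENS):
    nxt = []
    while len(nxt) < N:
        a, b = select(pop)
        for child in crossover(a, b):
            nxt.append(mutate(child))
    pop = nxt[:N]                    # replace all individuals (Simple GA)

best = max(pop, key=fitness)
print(best, fitness(best))
```

With these settings the population typically converges toward the all-ones string (x = 31), illustrating how selection pressure plus crossover drives the average fitness upward.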


Important parameters involved in Simple GA

SGA Parameters

Initial population size: N

Size of the mating pool: Np = p% of N

Convergence threshold: δ

Mutation rate: µ

Inversion rate: η

Crossover rate: ρ



Salient features in SGA

Simple GA features:

Generations are non-overlapping (all individuals of a generation are replaced by the new offspring).

Computationally expensive.

Good when the initial population size is large.

In general, gives better results.

Selection is biased toward more highly fit individuals; hence the average fitness of the overall population is expected to increase from generation to generation.

The best individual may appear in any iteration.


Steady State Genetic Algorithm (SSGA)
[Flowchart]
Start –> Generate initial population of size N –> Evaluate each individual.
Select two individuals without repetition.
Apply crossover, mutation and inversion to produce offspring.
If an offspring duplicates an existing individual, reject it.
Otherwise, evaluate the offspring; if the offspring are better than the worst individuals, replace the worst individuals with the offspring.
Convergence met? If Yes: return the solutions and stop. If No: repeat from selection.
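One steady-state step (two parents, crossover, duplicate rejection, replace-worst) might be sketched as follows. The one-max fitness, the parameter values, and the omission of mutation and inversion are simplifying assumptions:

```python
import random

random.seed(7)
BITS = 8

def fitness(c):                      # toy one-max fitness: count of 1-bits
    return c.count("1")

def ssga_step(pop):
    a, b = random.sample(pop, 2)                   # two distinct parents
    p = random.randrange(1, BITS)                  # single-point crossover
    children = [a[:p] + b[p:], b[:p] + a[p:]]
    for child in children:
        if child in pop:                           # reject duplicate offspring
            continue
        worst = min(pop, key=fitness)
        if fitness(child) > fitness(worst):        # replace worst if better
            pop[pop.index(worst)] = child
    return pop

pop = ["".join(random.choice("01") for _ in range(BITS)) for _ in range(10)]
before = max(fitness(c) for c in pop)
for _ in range(200):
    pop = ssga_step(pop)
after = max(fitness(c) for c in pop)
print(before, after)
```

Because an offspring only ever replaces the current worst individual, the best fitness in the population never decreases: a direct consequence of the small generation gap described above.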


Salient features in Steady-state GA

SSGA features:

The generation gap is small.

Only two offspring are produced in one generation.

It is applicable when:

The population size is small

Chromosomes are of longer length
The evaluation operation is computationally less expensive (compared to duplicate checking)



Salient features in Steady-state GA

Limitations of SSGA:

There is a chance of getting stuck at a local optimum if crossover/mutation/inversion is not strong enough to diversify the population.

Premature convergence may result.

It is susceptible to stagnation: inferior individuals are neglected or removed, and the algorithm keeps making more trials for a very long period of time without any gain (i.e. a long period of localized search).





Encoding Techniques in Genetic Algorithms

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

01.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 01.03.2016 1 / 42


GA Operators

Following are the GA operators in Genetic Algorithms.

1 Encoding
2 Convergence test
3 Mating pool
4 Fitness Evaluation
5 Crossover
6 Mutation
7 Inversion



Encoding Operation



Different Encoding Schemes

Different GAs

Simple Genetic Algorithm (SGA)


Steady State Genetic Algorithm (SSGA)
Messy Genetic Algorithm (MGA)

Encoding Schemes

Binary encoding
Real value encoding
Order encoding
Tree encoding



Different Encoding Schemes

Often, GAs are named according to the encoding scheme they follow. For example:

Encoding Scheme

Binary encoding –> Binary Coded GA or simply Binary GA

Real value encoding –> Real Coded GA or simply Real GA

Order encoding –> Order GA (also called Permuted GA)

Tree encoding



Encoding Schemes in GA

A Genetic Algorithm uses a metaphor consisting of two distinct elements:

1 Individual
2 Population

An individual is a single solution, while a population is the set of individuals at an instant of the searching process.


Individual Representation: Phenotype and Genotype

An individual is defined by a chromosome. A chromosome stores the genetic information (the genotype) for an individual. Here, a chromosome is expressed in terms of the factors defining the problem:

Genotype: Factor 1 | Factor 2 | .... | Factor n, represented as Gene 1 | Gene 2 | .... | Gene n

Phenotype: the actual values taken by the genes, e.g. a b c 1 0 1 2 9 6 7 $ α β ....

The chromosome is the string formed by these genes.
Individual Representation: Phenotype and Genotype

Note:

A gene is the GA's representation of a single factor (i.e. a design parameter), which has a domain of values (continuous, discontinuous, discrete, etc.), symbols, numbering and so on.

In GA there is a mapping from genotype to phenotype. This eventually decides the performance (namely the speed and accuracy) of the problem solving.



Encoding techniques

There are many ways of encoding:

1 Binary encoding: representing a gene in terms of bits (0s and 1s).
2 Real value encoding: representing a gene in terms of values or symbols or strings.
3 Permutation (or order) encoding: representing a sequence of elements.
4 Tree encoding: representing in the form of a tree of objects.



Binary Encoding

In this encoding scheme, a gene or chromosome is represented by a
string (of fixed or variable length) of binary bits (0s and 1s).

A: 0 1 1 0 0 1 0 1 0 1 0 1 0 1 1 1 1 0 Individual 1

B: 0 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 0 Individual 2



Example: 0-1 Knapsack problem

There are n items; each item has its own cost (ci) and weight (wi).

There is a knapsack of total capacity W.

The problem is to take as many items as possible without exceeding
the capacity of the knapsack.

This is an optimization problem and can be better described as follows.

Maximize:
Σi ci xi
Subject to:
Σi wi xi ≤ W
where xi ∈ {0, 1}



Example: 0-1 Knapsack problem
Consider the following instance of the 0-1 Knapsack problem.

[Figure: three items I1, I2, I3 of weights 10, 20 and 30 and costs
$60, $100 and $120, and a knapsack of maximum weight capacity 50.]

A brute-force approach to solve the above can be stated as follows:

Select at least one item:
[10], [20], [30], [10, 20], [10, 30], [20, 30], [10, 20, 30]
So, for n items, there are 2^n − 1 trials.
With binary encoding (0 means the item is not included, 1 means it is
included), these become:
[100], [010], [001], [110], [101], [011], [111]
Example: 0-1 Knapsack problem

The encoding for the 0-1 Knapsack problem, in general, for an n-item
set, would look as follows.

Genotype: the gene positions 1, 2, 3, ..., n − 1, n

Phenotype: a binary string of n bits, e.g.
0 1 0 1 1 0 1 0 1 0 1 0 1 . . . 1 0 1
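The genotype above maps directly to a fitness evaluation. A minimal sketch in Python (the function name `knapsack_fitness` and the zero-fitness penalty for overweight strings are illustrative choices, not from the slides), using the three-item instance above:

```python
def knapsack_fitness(bits, weights, values, capacity):
    """Decode a binary chromosome: bit i = 1 means item i is packed.
    Overweight (infeasible) strings get fitness 0 here, as a simple penalty."""
    total_weight = sum(w for b, w in zip(bits, weights) if b)
    total_value = sum(v for b, v in zip(bits, values) if b)
    return total_value if total_weight <= capacity else 0

weights, values, capacity = [10, 20, 30], [60, 100, 120], 50
print(knapsack_fitness([0, 1, 1], weights, values, capacity))  # 220: items 2 and 3
print(knapsack_fitness([1, 1, 1], weights, values, capacity))  # 0: weight 60 > 50
```

With this instance, the string [0 1 1] (weight 50, value $220) is the best feasible chromosome.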



Few more examples
Example 1 :
Minimize:
f(x) = x² + 125/x
where 0 ≤ x ≤ 15 and x is any discrete integer value.

Genotype / Phenotype: a binary string of 5 bits, e.g. 0 1 1 0 1



Few more examples
Example 2 :
Maximize:
f(x, y) = x³ − x²y + xy² + y³
Subject to:
x + y ≤ 10
and
1 ≤ x ≤ 10
−10 ≤ y ≤ 10

Genotype: x | y

Phenotype: two binary strings of 5 bits each, e.g. 0 1 1 0 1 and
1 1 0 0 1
Pros and cons of Binary encoding scheme

Limitations:
1 Needs an effort to convert the problem variables into binary form.
2 Accuracy depends on the binary representation.

Advantages:
1 Since operations on binary representations are fast, it provides a
faster implementation of all GA operators and hence faster execution
of GAs.
2 Any optimization problem has a binary-coded GA implementation.



Real value encoding

The real-coded GA is most suitable for optimization in a continuous
search space.
It uses the direct representation of the design parameters.
Thus, it avoids any intermediate encoding and decoding steps.

Genotype:
x | y

Phenotype:
5.28 | −475.36
(real-value representation)



Real value encoding with binary codes

Methodology: Step 1 [Deciding the precision]

For any continuous design variable x such that XL ≤ x ≤ XU, if ε is
the precision required, then the string length n should be

n = ⌈ log2((XU − XL) / ε) ⌉

Equivalently,

ε = (XU − XL) / 2^n

In general, ε ∈ [0, 1]. It is also called the obtainable accuracy.

Note: if ε = 0.5, then 4.05 or 4.49 ≡ 4, and 4.50 or 4.99 ≡ 4.5, and
so on.
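The two formulas above can be checked with small helpers — a minimal sketch (the helper names are illustrative; the ceiling in `string_length` ensures the step size does not exceed ε):

```python
import math

def string_length(x_lo, x_hi, eps):
    """n = ceil(log2((XU - XL) / eps)): bits needed for precision eps."""
    return math.ceil(math.log2((x_hi - x_lo) / eps))

def obtainable_accuracy(x_lo, x_hi, n):
    """eps = (XU - XL) / 2**n for an n-bit string."""
    return (x_hi - x_lo) / 2 ** n

print(string_length(1, 16, 0.25))          # 6 bits
print(obtainable_accuracy(1, 16, 6))       # 0.234375
print(obtainable_accuracy(20.1, 45.6, 8))  # 0.099609375
```

The last call answers Example 2 of the following illustration: with 8 bits on [20.1, 45.6], the obtainable accuracy is about 0.0996.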



Real value encoding: Illustration 1

1 Example 1:
1 ≤ x ≤ 16, n = 6. What is the accuracy?

ε = (16 − 1)/2^6 = 15/64 ≈ 0.234

2 Example 2:
What is the obtainable accuracy for the binary representation of a
variable X in the range 20.1 ≤ X ≤ 45.6 with 8 bits?

3 Example 3:
In the above case, what is the binary representation of X = 34.35?



Real value encoding with binary codes

Methodology: Step 2 [Obtaining the binary representation]

Once we know the length of the binary string for an obtainable
accuracy (i.e., precision), we have the following mapping relation
between a real value X and the decoded value XB of its binary
equivalent:

X = XL + ((XU − XL)/(2^n − 1)) × XB

where XB is the decoded value of the binary string, n is the number of
bits in the representation, and
XL → 0 0 0 0 · · · 0 and
XU → 1 1 1 1 · · · 1
are the decoded values of the binary representations of the lower and
upper values of X.



Real value encoding: Illustration 2

Example:

Suppose XL = 2 and XU = 17 are the two extreme decoded values of a
variable x.

n = 4 is the number of binary bits in the representation for x.

XB = 10 (= 1 0 1 0) is the decoded value for a given x.

What is the value of x, and what is its binary representation?

Here, x = 2 + ((17 − 2)/(2^4 − 1)) × 10 = 12
The binary representation of x is 1 1 0 0
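The decoding step of the illustration can be reproduced in a few lines — a sketch (`decode` is an assumed helper name, not from the slides):

```python
def decode(bits, x_lo, x_hi):
    """X = XL + (XU - XL)/(2**n - 1) * XB, where XB is the integer
    value of the n-bit string."""
    n = len(bits)
    xb = int("".join(str(b) for b in bits), 2)  # decoded integer value
    return x_lo + (x_hi - x_lo) / (2 ** n - 1) * xb

print(decode([1, 0, 1, 0], 2, 17))  # 12.0, as in the illustration
print(decode([0, 0, 0, 0], 2, 17))  # 2.0  (lower bound XL)
print(decode([1, 1, 1, 1], 2, 17))  # 17.0 (upper bound XU)
```

Note how the all-zeros and all-ones strings decode exactly to XL and XU, as the mapping relation requires.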



Order Encoding

Let us have a look at the following instance of the Traveling
Salesman Problem (TSP).

[Figure: a set of cities, all to be visited, and one possible tour.]

TSP
- Visit all the cities.
- Each city exactly once.
- The starting and ending city is the same.

How can we formally define the TSP?



Order Encoding for TSP
Understanding the TSP:
There is a cost of visiting a city from another city, and hence a
total cost of visiting all the cities, each exactly once (except the
starting city).
Objective function: to find a tour (i.e., a simple cycle covering all
the cities) with the minimum cost involved.
Constraints:
All cities must be visited.
There will be only one occurrence of each city (except the starting
city).
Design parameters:
The Euclidean distance may be taken as the measure of cost, unless the
cost is specified explicitly.
The above-stated information constitutes the design variables in this
case.
We are to search for the best path out of n! possible paths.
A small instance of the TSP

d    A  B  C  D  E
A    0  2  ∞  6  4
B    2  0  7  ∞  5
C    ∞  7  0  3  1
D    6  ∞  3  0  ∞
E    4  5  1  ∞  0

d = distance matrix (∞ means no direct connection)

[Figure: the connectivity graph among the five cities.]



Defining the TSP

Minimize

cost = Σ(i=0 to n−2) d(ci, ci+1) + d(cn−1, c0)

Subject to
P = [c0, c1, c2, · · · , cn−1], where ci ∈ X and ci ≠ cj for all
i ≠ j, i, j = 0, 1, · · · , n − 1.
Note: P represents a possible tour with the starting city c0.
Here, X = {x1, x2, · · · , xn} is the set of n cities, and d(xi, xj)
is the distance between any two cities xi and xj.
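The cost function is easy to evaluate for a permutation-encoded tour. A minimal sketch, with the distance matrix as read from the small five-city instance above (∞ for missing edges; `tour_cost` is an illustrative name):

```python
INF = float("inf")  # no direct connection between the two cities
D = {
    "A": {"A": 0, "B": 2, "C": INF, "D": 6, "E": 4},
    "B": {"A": 2, "B": 0, "C": 7, "D": INF, "E": 5},
    "C": {"A": INF, "B": 7, "C": 0, "D": 3, "E": 1},
    "D": {"A": 6, "B": INF, "C": 3, "D": 0, "E": INF},
    "E": {"A": 4, "B": 5, "C": 1, "D": INF, "E": 0},
}

def tour_cost(tour):
    """Sum d(c_i, c_{i+1}) for i = 0..n-2, plus the closing edge
    d(c_{n-1}, c_0)."""
    return sum(D[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

print(tour_cost(["A", "B", "E", "C", "D"]))  # 2 + 5 + 1 + 3 + 6 = 17
```

A tour that uses a missing edge gets cost ∞, so it can never win against any feasible tour.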
Tree encoding

In this encoding scheme, a solution is encoded in the form of a
binary tree.

[Figure: a binary tree with root A; A's children are B and C, D is the
left child of B, E and F are the children of C, and G is the right
child of E.]

Its three compact representations:
In-order (TL R TR):   D B A E G C F
Pre-order (R TL TR):  A B D C E G F
Post-order (TL TR R): D B G E F C A



Floor Planning : An example of tree encoding
Floor planning is a standard problem in VLSI design. Here, given n
circuits of different area requirements, we are to arrange them on a
chip floor layout so that all circuits are placed in the minimum
layout area possible.

[Figure: 10 circuits C1–C10 of different sizes, and a floor plan
arranging them.]



Formulation of floor planning problem

A possible formulation of the floor planning problem of VLSI circuits
is as follows.
Given :

1 A set of n rectangular blocks B = {b1, b2, · · · , bi, · · · , bn}.

2 For each bi ∈ B, we have the following specification:

the width wi and height hi (which are constant for rigid blocks and
variable for flexible blocks);
ρi, the desirable aspect ratio, where 1/ρi ≤ hi/wi ≤ ρi, with
ρi = 1 if the block bi is rigid;
ai = wi × hi, the area of each block bi.



Formulation of floor planning problem

3 A set of nets N = {n1, n2, · · · , nk} describing the connectivity
information:
Wire = f1(B, N)

4 A desirable floor plan aspect ratio ρ such that 1/ρ ≤ H/W ≤ ρ,
where H and W are the height and width of the floor plan,
respectively:
Area = f2(B, N, ρ)

5 Timing information:
Delay = f3(B, N, ρ)



Formulation of Floor planning problem
A legal floor plan is a floor plan that satisfies the following
constraints.
Constraints :

1 Each block bi is assigned to a location, say (xi, yi).

2 No two blocks overlap.

3 For each flexible block bi, ai = wi × hi, and the aspect ratio
constraint should be met.

Objectives :
We are to find a floor plan which would
1 Minimize the floor plan area.
2 Minimize the wire length.
3 Minimize the circuit delay.
Tree encoding for Floor planning problem

[Figure: two floor plans (Floor Plan I and Floor Plan II) arranging
the same blocks 1–7.]

1 How many floor plans are possible?

2 Can we find a binary tree representation of a floor plan?


Binary tree representation for floor planning

A floor plan can be modeled by a binary tree with n leaves and n − 1
internal nodes, where

each internal node represents a vertical cut-line or a horizontal
cut-line (the letters V and H refer to the vertical and horizontal
cut operators), and

each leaf node represents a rectangular block.



Example : Floor Plan I

[Figure: Floor Plan I and the binary tree representation of Floor
Plan I, with H and V cut nodes as internal nodes and the blocks 1–7
as leaves.]



Example : Floor Plan I

Note 1:
The operators H and V, expressed in Polish notation, carry the
following meanings:
i j H → block bj is on top of block bi.
i j V → block bi is on the left of block bj.

Note 2: A tree can be represented in a compact form using Polish
notation.

Note 3: Polish (postfix) notation:
a + b ÷ c = a + (b ÷ c) = a b c ÷ +
a + b − c = a b + c −



Example : Floor Plan I

Note 4:
The post-order traversal of a binary tree is equivalent to the Polish
(postfix) notation.

[Figure: the expression trees of a + (b ÷ c) and (a + b) − c, with
post-order sequences a b c ÷ + and a b + c −, respectively.]

Note 5:
There is only one way of performing a post-order traversal of a
binary tree.
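Note 4 can be checked mechanically: a post-order traversal of an expression tree yields its postfix (Polish) string. A sketch, using / in place of ÷ and nested tuples as the tree representation (both are assumptions for illustration, not the slides' notation):

```python
def postfix(tree):
    """Post-order (TL, TR, R): a leaf is a string; an internal node is
    a tuple (operator, left_subtree, right_subtree)."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return postfix(left) + postfix(right) + op

print(postfix(("+", "a", ("/", "b", "c"))))  # "abc/+"  i.e. a + (b / c)
print(postfix(("-", ("+", "a", "b"), "c")))  # "ab+c-"  i.e. (a + b) - c
```

The same traversal applied to a slicing tree of H and V nodes produces the floor plan strings shown on the next slides.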



Example : Floor Plan I (with Polish notation)

[Figure: Floor Plan I and its binary tree representation.]

Polish notation : 2 1 H 6 7 V 4 5 V H 3 H V



Example : H and V operators

[Figure: a floor plan obtained by the successive cuts H1, V2, H3, V4,
and its binary tree.]

Polish notation : 4 5 V 3 H 2 V 1 H



Example : Floor Plan II

[Figure: a binary tree of H and V cut nodes over the blocks 1–7 —
which floor plan does it represent?]

Polish notation : 2 1 H 6 7 V 4 5 V 3 H H V


Example : Floor Plan II

[Figure: Floor Plan II and its binary tree representation.]

Polish notation : 2 1 H 6 7 V 4 5 V 3 H H V



Problem

Problem :

How many solutions are possible with n blocks in a floor planning
problem?



Problem

Problem :

How many solutions are possible with n blocks in a floor planning
problem?

N = (1/(n + 1)) · C(2n, n)

(i.e., the n-th Catalan number)
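The count can be evaluated directly — a sketch (`num_floor_plans` is an illustrative name; `math.comb` requires Python 3.8+):

```python
from math import comb

def num_floor_plans(n):
    """N = C(2n, n) / (n + 1): the n-th Catalan number, per the slide's
    formula for the number of solutions with n blocks."""
    return comb(2 * n, n) // (n + 1)

print([num_floor_plans(n) for n in range(1, 6)])  # [1, 2, 5, 14, 42]
```

The integer division is exact here, because C(2n, n) is always divisible by n + 1.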



Any questions?



Fitness Evaluation and Selection

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

08.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 08.03.2016 1 / 40


Important GA Operations

1 Encoding
2 Fitness Evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test



Important GA Operations

1 Encoding
2 Fitness evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test



GA Selection

After deciding on an encoding scheme, the second important thing is
how to perform selection from a population, that is, how to choose
the individuals that will create offspring for the next generation
and how many offspring each will create.

The purpose of selection is, of course, to emphasize the fittest
individuals in the population, in the hope that their offspring will
in turn have even higher fitness.



Selection operation in GAs

Selection is the process of creating the population for the next
generation from the current generation.
To generate new population: Breeding in GA

Create a mating pool

Select a pair

Reproduce



Fitness evaluation

In GA, there is a need to create the next generation.

The next generation should be such that it moves toward the (global)
optimum solution.
Random population generation may not be a wise strategy.
A better strategy follows the biological process: selection.

Selection involves:
Survival of the fittest
Struggle for existence

Fitness evaluation is to evaluate the survivability of each
individual in the current population.



Fitness evaluation

How to evaluate the fitness of an individual?

The simplest strategy is to take the value(s) of the objective
function(s) as the fitness.

This is simple if there is a single objective function.

But it needs a different treatment if there are two or more objective
functions:
They may be on different scales.
They may not all have the same level of significance in the fitness
calculation.
. . . etc.



An example

[Figure: a weighted graph on the six cities A–F.]

Five candidate tours and their costs:
P1: C B A D F E → 11
P2: A B D C E F → 19
P3: A C B F E D → 16
P4: F C D B E A → 12
P5: C F D A B E → 10



Selection Schemes in GAs

Different strategies are known for the selection:

Canonical selection (also called proportionate-based selection)

Roulette Wheel selection (also called proportionate-based


selection)

Rank-based selection (also called ordinal-based selection)

Tournament selection

Steady-state selection

Boltzmann selection



Canonical selection

In this technique, fitness is defined for the i-th individual as
follows:

fitness(i) = fi / F̄

where fi is the evaluation associated with the i-th individual in the
population, and F̄ is the average evaluation over all individuals in
a population of size N, defined as

F̄ = (Σ(i=1 to N) fi) / N
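In code, the canonical fitness is a one-liner over the population — a minimal sketch (`canonical_fitness` is an illustrative name):

```python
def canonical_fitness(evaluations):
    """fitness(i) = f_i / F_bar, where F_bar is the population
    average of the evaluations."""
    f_bar = sum(evaluations) / len(evaluations)
    return [f / f_bar for f in evaluations]

print(canonical_fitness([2.0, 4.0, 6.0]))  # [0.5, 1.0, 1.5] (average is 4.0)
```

An individual with fitness above 1.0 is better than the population average, below 1.0 worse.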



Canonical selection

In an iteration, we calculate fi/F̄ for all individuals in the
current population.

In canonical selection, the probability that an individual in the
current population is copied and placed in the mating pool is
proportional to its fitness.

Note :
Here, the size of the mating pool is p% × N, for some p.
The convergence rate depends on p.



Roulette-Wheel selection
In this scheme, the probability of an individual being selected into
the mating pool is proportional to its fitness.
It is implemented with the help of a wheel, as shown: the sector for
individual i is larger than that for individual j whenever fi > fj.


Roulette-Wheel selection mechanism

The top surface area of the wheel is divided into N parts in
proportion to the fitness values f1, f2, f3, · · · , fN.

The wheel is rotated in a particular direction (either clockwise or
anticlockwise), and a fixed pointer indicates the winning area when
the wheel stops rotating.

A particular sub-area, representing a GA solution, is thus selected
as the winner probabilistically, and the probability that the i-th
area will be declared the winner is

pi = fi / Σ(i=1 to N) fi

In other words, an individual having a higher fitness value is likely
to be selected more often.



Roulette-Wheel selection mechanism

The wheel is rotated Np times (where Np = p% × N, for some p), and
each time only one area is identified by the pointer as the winner.
Note :
Here, an individual may be selected more than once.
The convergence rate is fast.



Roulette-Wheel selection mechanism: An Example

Individual   Fitness value   pi
1            1.01            0.05
2            2.11            0.09
3            3.11            0.13
4            4.01            0.17
5            4.66            0.20
6            1.91            0.08
7            1.93            0.08
8            4.51            0.20

[Figure: the corresponding wheel with sectors of 5%, 9%, 13%, 17%,
20%, 8%, 8% and 20%.]



Roulette-Wheel selection : Implementation
Input: A population of size N with their fitness values
Output: A mating pool of size Np
Steps:

1 Compute pi = fi / Σ(i=1 to N) fi, for all i = 1, 2, · · · , N.

2 Calculate the cumulative probability of each individual, starting
from the top of the list, that is, Pi = Σ(j=1 to i) pj, for all
i = 1, 2, · · · , N.

3 Generate a random number, say r, between 0 and 1.

4 Select the j-th individual such that Pj−1 < r ≤ Pj.

5 Repeat Steps 3–4 to select Np individuals.

6 End
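The steps above translate directly into Python — a sketch (the final cumulative value is pinned to 1.0 to guard against floating-point round-off; that guard and the function name are my additions, not part of the slide):

```python
import random

def roulette_wheel(fitness, n_pool, rng=random):
    """Spin the wheel n_pool times; each spin selects the index j
    with P_{j-1} < r <= P_j."""
    total = sum(fitness)
    cumulative, acc = [], 0.0
    for f in fitness:                 # cumulative probabilities P_1..P_N
        acc += f / total
        cumulative.append(acc)
    cumulative[-1] = 1.0  # guard against round-off so every r lands somewhere
    pool = []
    for _ in range(n_pool):
        r = rng.random()              # r in [0, 1)
        pool.append(next(j for j, p in enumerate(cumulative) if r <= p))
    return pool

random.seed(1)
fitness = [1.01, 2.11, 3.11, 4.01, 4.66, 1.91, 1.93, 4.51]
print(roulette_wheel(fitness, 8))  # 8 indices; fitter individuals tend to repeat
```

Passing the 8-individual fitness list from the example above, individuals 4, 5 and 8 (the largest sectors) are the most likely to appear repeatedly in the pool.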



Roulette-Wheel selection: Example
The probability that the i-th individual will be pointed to is

pi = fi / Σ(i=1 to N) fi

Example:

Individual   pi     Pi     r      T
1            0.05   0.05   0.26   I
2            0.09   0.14   0.04   I
3            0.13   0.27   0.48   I
4            0.17   0.44   0.43   II
5            0.20   0.64   0.09   II
6            0.08   0.72   0.30
7            0.08   0.80   0.61
8            0.20   1.00   0.89   I

pi = probability of an individual; Pi = cumulative probability;
r = random number between 0 and 1; T = tally count of selections.
(Each r in the list selects the individual j with Pj−1 < r ≤ Pj; for
example, r = 0.26 selects individual 3 and r = 0.30 selects
individual 4.)


Roulette-Wheel selection

The following points are to be noted:

1 The bottom-most individual in the population has the cumulative
probability PN = 1.
2 The cumulative probability of any individual lies between 0 and 1.
3 The i-th individual in the population covers the cumulative
probability range from Pi−1 to Pi.
4 The top-most individual covers the cumulative probability values
between 0 and p1.
5 It may be checked that the selection is consistent with the
expected count Ei = N × pi for the i-th individual.

Is the selection sensitive to ordering, say, to arranging the
individuals in ascending order of their fitness values?



Drawback in Roulette-Wheel selection

Suppose there are only four binary strings in a population, whose
fitness values are f1, f2, f3 and f4.

Their relative values are 80%, 10%, 6% and 4%, respectively.

What is the expected count of selecting f3, f4, f2 or f1?



Problem with Roulette-Wheel selection scheme
The limitations of the Roulette-Wheel selection scheme can be better
illustrated with the following figure.

[Figure: a wheel with sectors of 80%, 10%, 6% and 4%.]

The observation is that the individual with the highest fitness value
crowds the others out of selection for mating. This leads to less
diversity, and hence less scope for exploring alternative solutions,
as well as premature (early) convergence to a local optimal solution.
Rank-based selection
To overcome the problem with Roulette-Wheel selection, a rank-based
selection scheme has been proposed.
The process of ranking selection consists of two steps:
1 Individuals are arranged in ascending order of their fitness
values. The individual with the lowest fitness is assigned rank 1,
and the other individuals are ranked accordingly.
2 The proportionate-based selection scheme is then followed, based on
the assigned ranks.
Note:
The % area to be occupied by a particular individual i is given by

(ri / Σ(i=1 to N) ri) × 100

where ri indicates the rank of the i-th individual.
Two or more individuals with the same fitness value should have the
same rank.
Rank-based selection: Example
Continuing with the population of 4 individuals with fitness values
f1 = 0.40, f2 = 0.05, f3 = 0.03 and f4 = 0.02:
their proportionate areas on the wheel are 80%, 10%, 6% and 4%,
whereas their rank-based areas (from ranks 4, 3, 2, 1) are 40%, 30%,
20% and 10%.

[Figure: the two wheels side by side.]

It is evident that the expected counts have improved compared to
Roulette-Wheel selection.
Rank-based selection: Implementation
Input: A population of size N with their fitness values
Output: A mating pool of size Np
Steps:

1 Arrange all individuals in ascending order of their fitness values.

2 Rank the individuals according to their position in this order;
that is, the worst will have rank 1, the next rank 2, and the best
rank N.

3 Apply Roulette-Wheel selection, but based on the assigned ranks.
For example, the probability pi of the i-th individual would be

pi = ri / Σ(j=1 to N) rj

4 Stop
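The ranking step can be sketched as follows (a minimal version; tie handling — equal fitness, equal rank — is deliberately left out, as noted in the comment):

```python
def rank_probabilities(fitness):
    """Ascending ranks: worst gets rank 1, best gets rank N;
    p_i = r_i / sum of all ranks. Ties are ignored in this sketch."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    rank = [0] * len(fitness)
    for position, i in enumerate(order):
        rank[i] = position + 1
    total = sum(rank)  # = N(N + 1)/2
    return [r / total for r in rank]

# The 4-individual example: proportionate areas 80/10/6/4 become 40/30/20/10
print(rank_probabilities([0.40, 0.05, 0.03, 0.02]))  # [0.4, 0.3, 0.2, 0.1]
```

The resulting probabilities can then be fed to the same wheel-spinning loop as ordinary Roulette-Wheel selection.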
Comparing Rank-based selection with
Roulette-Wheel selection

Individual   % Area (proportionate)   fi     Rank (ri)   % Area (rank-based)
1            80 %                     0.40   4           40 %
2            10 %                     0.05   3           30 %
3            6 %                      0.03   2           20 %
4            4 %                      0.02   1           10 %

[Figure: the Roulette wheel based on proportionate selection vs. the
Roulette wheel based on ordinal (rank-based) selection.]

A rank-based selection is expected to perform better than the
Roulette-Wheel selection, in general.
Basic concept of tournament selection

Who will win the match in this tournament?

[Figure: a single-elimination tournament bracket among eight teams —
India, New Zealand, England, Sri Lanka, S. Africa, Australia,
Pakistan, Zimbabwe — leading to one winner.]



Tournament selection
1 In this scheme, we select the tournament size n (say 2 or 3) at
random.
2 We pick n individuals from the population, at random and
determine the best one in terms of their fitness values.
3 The best individual is copied into the mating pool.
4 Thus, in this scheme only one individual is selected per
tournament, and Np tournaments are to be played to make the size of
the mating pool equal to Np.
Note :
Here, there is a chance for a good individual to be copied into the
mating pool more than once.
This technique is found to be computationally faster than both the
Roulette-Wheel and Rank-based selection schemes.
Tournament selection : Implementation

The tournament selection scheme can be stated as follows.


Input : A Population of size N with their fitness values
Output : A mating pool of size Np (Np ≤ N)
Steps:

1 Select NU individuals at random (NU ≤ N).


2 Out of NU individuals, choose the individual with highest fitness
value as the winner.
3 Add the winner to the mating pool, which is initially empty.
4 Repeat Steps 1-3 until the mating pool contains Np individuals
5 Stop
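A minimal sketch of the procedure (the slide's NU is the tournament size `n_u` here; sampling without replacement within one tournament, and the function name, are my assumptions):

```python
import random

def tournament_selection(fitness, n_pool, n_u=2, rng=random):
    """Play n_pool tournaments; each picks n_u random contestants and
    copies the fittest of them into the mating pool."""
    pool = []
    for _ in range(n_pool):
        contestants = rng.sample(range(len(fitness)), n_u)
        pool.append(max(contestants, key=lambda i: fitness[i]))
    return pool

random.seed(0)
fitness = [1.0, 2.1, 3.1, 4.0, 4.6, 1.9, 1.8, 4.5]
print(tournament_selection(fitness, 8))  # 8 winners; an index may repeat
```

Note that no global normalization of fitness is needed, which is one reason this scheme is computationally cheap.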



Tournament selection : Example

N = 8, NU = 2, Np = 8

Input :
Individual 1 2 3 4 5 6 7 8
Fitness 1.0 2.1 3.1 4.0 4.6 1.9 1.8 4.5

Output :
Trial Individuals Selected
1 2, 4 4
2 3, 8 8
3 1, 3 3
4 4, 5 5
5 1, 6 6
6 1, 2 2
7 4, 2 4
8 8, 3 8

If the fitness values of two individuals are the same, then there is
a tie in the match! So, what to do?
Tournament selection

Note :
Different twists can be made to the basic tournament selection
scheme:

1 Size of NU: small (2, 3), moderate (about 50 % of N), or large
(NU ≈ N).
2 Once an individual is selected for the mating pool, it can be
discarded from the current population, thus disallowing an individual
from being selected more than once.
3 Replace the worst individuals in the mating pool with those that
were not winners in any trial.



Steady-State selection algorithm

Steps :

1 The NU individuals with the highest fitness values are selected.

2 The NU individuals with the worst fitness values are removed, and
the NU individuals selected in Step 1 are added to the mating pool.

This completes the selection procedure for one iteration. Repeat the
iteration until a mating pool of the desired size is obtained.



Survey on GA selection strategies

Reference:
D. E. Goldberg and K. Deb, "A comparative analysis of selection
schemes used in genetic algorithms", Foundations of Genetic
Algorithms, Vol. 1, 1991, pp. 69–93.
Web link: K. Deb's website, IIT Kanpur



Elitism
In this scheme, an elite class (in terms of fitness) is identified
first in a population of strings.
It is then directly copied into the next generation to ensure its
presence.

[Figure: the elite individuals 1 · · · n move directly to the mating
pool; the rest are selected by any of the earlier-discussed schemes.]



Comparing selection schemes

Usually, a selection scheme follows Darwin's principle of "survival
of the fittest".

In other words, a selection strategy in GA is a process that favours
the selection of better individuals in the population for the mating
pool (so that better genes are inherited by the new offspring), and
hence the search leads to the global optimum.

There are two issues that decide the effectiveness of any selection
scheme:

Population diversity

Selection pressure


Analyzing a selection schemes

More population diversity means more exploration.

Higher selection pressure means more exploitation (and hence less
exploration).

[Figure: selection pressure plotted against population diversity —
the two are inversely related.]


Effectiveness of any selection scheme

Population diversity

This is similar to the concept of exploration.

Maintaining population diversity means that, while the genes from the
already discovered good individuals are exploited, new areas of the
search space continue to be explored.

Selection pressure

This is similar to the concept of exploitation.

It is defined as the degree to which the better individuals are
favoured.



Effectiveness of any selection schemes
These two factors are inversely related, in the sense that if the
selection pressure increases, the population diversity decreases, and
vice versa. Thus,
1 If the selection pressure is HIGH:
The search focuses only on the currently good individuals (in terms
of fitness).
It loses population diversity.
Higher rate of convergence; often leads to premature convergence to
a sub-optimal solution.
2 If the selection pressure is LOW:
It may not be able to drive the search properly, and consequently
stagnation may occur.
The convergence rate is low, and the GA takes an unnecessarily long
time to find the optimal solution.
The accuracy of the solution increases (as more genes are usually
explored in the search).
Analysis of different selection strategies

Selection scheme: Roulette-Wheel selection (works fine when fitness
values are uniformly distributed)
Population diversity: low — premature convergence, less accuracy in
the solution.
Selection pressure: high — stagnation of search.

Selection scheme: Rank selection (works fine when fitness values are
not necessarily uniformly distributed)
Population diversity: high — slow rate of convergence.
Selection pressure: low — explores more solutions.

Selection scheme: Tournament selection (works fine when the
population has very diversified fitness values)
Population diversity: moderate — ends up with a moderate rate of
convergence.
Selection pressure: very high — better exploration of the search
space.

Selection scheme: Steady-state selection
Population diversity: decreases gradually as the generations advance.
Selection pressure: too low — the convergence rate is too slow.


Fine tuning a selection operator : Generation Gap

The generation gap is defined as the proportion of individuals in the
population which are replaced in each generation, i.e.,

Gp = p / N

where N is the population size and p is the number of individuals
that will be replaced.
Note that in steady-state selection p = 2, and hence Gp ≈ 0 for a
large population, whereas the other selection schemes have Gp ≈ 1.



Fine tuning a selection operator : Generation Gap

To make Gp a large value, several strategies may be adopted.

1 Selection of individuals according to their fitness and replacement
at random.
2 Selection of individuals at random and replacement according to the
inverse of their fitness values.
3 Selection of both parents and replacement according to fitness or
inverse fitness.



Any questions?



Crossover Techniques in GAs

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

11.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 11.03.2016 1 / 58


Important GA Operations

1 Encoding
2 Fitness Evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test



Important GA Operations

1 Encoding
2 Fitness evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test



Reproduction in Genetic Algorithm

Reproduction:

Crossover

Mutation

Inversion

These genetic operators vary from one encoding scheme to another.

Binary coded GAs

Real-coded GAs

Tree-coded GAs



Mating Pool: Prior to crossover operation

Mating pairs (each pair consisting of two strings) are selected at
random. Thus, if the size of the mating pool is N, then N/2 mating
pairs are formed. [Random mating]

Each pair is checked on whether it will participate in reproduction
by tossing a coin whose probability of heads is pc. If the toss comes
up heads, the pair will participate in reproduction; otherwise, it
will remain intact in the population.

Note :
Generally, pc ≈ 1.0, so that almost all the parents can participate
in reproduction.



Crossover operation

Once a pool of mating pairs is selected, they undergo crossover
operations.

1 In crossover, there is an exchange of properties between two


parents and as a result of which two offspring solutions are
produced.
2 The crossover point(s) (also called k-point(s)) is(are) decided
using a random number generator generating integer(s) in
between 1 and L, where L is the length of the chromosome.
3 Then we perform exchange of gene values with respect to the
k-point(s)

There are many exchange mechanisms and hence crossover


strategies.



Crossover Techniques in Binary Coded GA



Crossover operations in Binary-coded GAs

There exist a large number of crossover schemes; a few important
ones are listed in the following.
1 Single point crossover
2 Two-point crossover
3 Multi-point crossover (also called n-point crossover)
4 Uniform crossover (UX)
5 Half-uniform crossover (HUX)
6 Shuffle crossover
7 Matrix crossover (Two-dimensional crossover)
8 Three parent crossover



Single point crossover

1 Here, we select the k-point lying between 1 and L at random. Let it be k.
2 A single crossover point at k is used on both parents' strings.
3 All data beyond that point in either string is swapped between the
two parents.
4 The resulting strings are the chromosomes of the offsprings
produced.

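The steps above can be sketched as a short Python function (an illustrative sketch; the function name and the list-of-bits representation are ours, not from the slides):

```python
import random

def single_point_crossover(p1, p2, k=None):
    """Swap the tails of two equal-length chromosomes at crossover point k."""
    assert len(p1) == len(p2)
    L = len(p1)
    if k is None:
        k = random.randint(1, L - 1)  # k-point chosen at random in 1..L-1
    c1 = p1[:k] + p2[k:]  # left schema of p1, right schema of p2
    c2 = p2[:k] + p1[k:]
    return c1, c2

# Slide example with k = 4: offspring 01101100 and 10100010
print(single_point_crossover([0,1,1,0,0,0,1,0], [1,0,1,0,1,1,0,0], k=4))
```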


Single point crossover: Illustration

Before crossover (two chromosomes from a mating pair):

Parent 1 : 0 1 1 0 | 0 0 1 0
Parent 2 : 1 0 1 0 | 1 1 0 0

(crossover point k selected at random)

After crossover (two chromosomes for two new offspring):

Offspring 1: 0 1 1 0 1 1 0 0
Offspring 2: 1 0 1 0 0 0 1 0



Two-point crossover

1 In this scheme, we select two different crossover points k1 and k2
lying between 1 and L at random, such that k1 ≠ k2.
2 The middle parts are swapped between the two strings.
3 Alternatively, the left and right parts can also be swapped.



Two-point crossover: Illustration

Before Crossover

Parent 1 : 0 1 1 0 0 0 1 0

Parent 2 : 1 0 1 0 1 1 0 0

Crossover Point k1 Crossover Point k2

Select two crossover points randomly

Offspring 1: 0 1 1 0 1 0 1 0

Offspring 2: 1 0 1 0 0 1 0 0

After Crossover



Multi-point crossover

1 In case of multi-point crossover, a number of crossover points are


selected along the length of the string, at random.
2 The bits lying between alternate pairs of sites are then swapped.

k1 k2 k3

Parent 1 Offspring 1

Parent 2 Offspring 2

Swap 1 Swap 2



Uniform Crossover (UX)

Uniform crossover is a more general version of the multi-point


crossover.

In this scheme, at each bit position of the parent string, we toss a


coin (with a certain probability ps ) to determine whether there will
be swap of the bits or not.

The two bits are then swapped or remain unaltered, accordingly.



Uniform crossover (UX): Illustration

Before crossover

Parent 1 : 1 1 0 0 0 1 0 1 1 0 0 1

Parent 2 : 0 1 1 0 0 1 1 1 0 1 0 1

Coin tossing: 1 0 0 1 1 1 0 1 1 0 0 1

After crossover

Offspring 1: 1 1 1 0 0 1 1 1 1 1 0 1

Offspring 2:
0 1 0 0 0 1 0 1 0 0 0 1

Rule: If the toss is 0, then swap the bits between P1 and P2



Uniform crossover with crossover mask

Here, each gene in the offspring is created by copying the
corresponding gene from one or the other parent, chosen
according to a randomly generated binary crossover mask of the
same length as the chromosome.

Where there is a 1 in the mask, the gene is copied from the first
parent

Where there is a 0 in the mask, the gene is copied from the


second parent.

The reverse is followed to create the other offspring.

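The mask rule above can be written directly as a pair of list comprehensions (a minimal sketch; the names are ours):

```python
def uniform_crossover(p1, p2, mask):
    """mask[i] == 1: c1 takes gene i from p1 (c2 from p2); mask[i] == 0: the reverse."""
    c1 = [a if m == 1 else b for a, b, m in zip(p1, p2, mask)]
    c2 = [b if m == 1 else a for a, b, m in zip(p1, p2, mask)]
    return c1, c2

# Slide example: mask 10011101 gives offspring 11100111 and 01000101
p1 = [1,1,0,0,0,1,0,1]
p2 = [0,1,1,0,0,1,1,1]
print(uniform_crossover(p1, p2, [1,0,0,1,1,1,0,1]))
```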


Uniform crossover with crossover mask:
Illustration

Before Crossover

Parent 1 : 1 1 0 0 0 1 0 1

Parent 2 : 0 1 1 0 0 1 1 1

Mask 1 0 0 1 1 1 0 1

Offspring 1: When there is a 1 in the mask, the gene is


1 1 1 0 0 1 1 1 copied from Parent 1 else from Parent 2.

Offspring 2: When there is a 1 in the mask, the gene is


0 1 0 0 0 1 0 1 copied from Parent 2 else from Parent 1.

After Crossover



Half-uniform crossover (HUX)

In the half-uniform crossover scheme, exactly half of the
non-matching bits are swapped.
1 Calculate the Hamming distance (the number of differing bits)
between the given parents.
2 Divide this number by two.
3 The result is how many of the bits that do not match between the
two parents will be swapped.
4 Choose the locations of these bits probabilistically (with some
strategy, say coin tossing) and swap them.

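A sketch of the four steps, assuming the swapped positions are simply sampled at random from the differing positions (the function name is ours):

```python
import random

def hux(p1, p2, rng=random):
    """Half-uniform crossover: swap exactly half of the non-matching bits."""
    diff = [i for i in range(len(p1)) if p1[i] != p2[i]]  # Hamming distance = len(diff)
    swap = rng.sample(diff, len(diff) // 2)               # half of them, chosen at random
    c1, c2 = list(p1), list(p2)
    for i in swap:
        c1[i], c2[i] = c2[i], c1[i]
    return c1, c2
```

For the slide's parents (Hamming distance 4), each child differs from its parent in exactly two positions, while the children still differ from each other in all four.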


Half-uniform crossover: Illustration

Before crossover

Parent 1 : 1 1 0 0 0 0 1 0
Here, Hamming
distance is 4
Parent 2 : 1 0 0 1 1 0 1 1

Tossing: 1 0 1 1
If toss is 1, then swap the
bits else remain as it is

Offspring 1: 1 0 0 0 1 0 1 1

Offspring 2:
1 1 0 1 0 0 1 0

After crossover



Shuffle crossover

A single crossover point is selected. It divides a chromosome into
two parts, called schemata.

In both parents, genes are shuffled within each schema, following
some strategy for shuffling the bits.

Schemata are then exchanged to create the offspring (as in
single-point crossover).



Shuffle crossover: Illustration
Before crossover

P1 : 1 1 0 0 | 0 1 1 0
P2 : 1 0 0 1 | 1 0 1 1
(k-point)

After shuffling bits
P1' : 0 0 1 0 1 1 0 1
P2' : 0 1 1 1 0 1 0 1

After single-point crossover
Offspring 1: 0 0 1 0 1 1 0 1
Offspring 2: 0 1 1 1 0 1 0 1



Matrix crossover
The matrix crossover strategy is explained with the following illustration.

Each chromosome I1 : r11 r12 r13 r14 ... r1n and I2 : r21 r22 r23 r24 ... r2n
is first rewritten as a two-dimensional matrix (four genes per row), P1 and
P2 respectively — a two-dimensional representation of the chromosomes.

The two matrices are then divided into a number of non-overlapping zones,
and the corresponding zones are shuffled between them to produce the child
matrices C1 and C2.


Three parent crossover

In this techniques, three parents are randomly chosen.

Each bit of the first parent is compared with the bit of the second
parent.

If both are the same, the bit is taken for the offspring.

Otherwise, the bit from the third parent is taken for the offspring.



Three parent crossover: Illustration

P1: 1 1 0 1 0 0 0 1

P2: 0 1 1 0 1 0 0 1

P3: 0 1 1 0 1 1 0 1

C1: 0 1 1 0 1 0 0 1

Note: Sometimes, the third parent can be taken as the crossover mask.

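The bitwise rule above is a one-liner (an illustrative sketch; the function name is ours):

```python
def three_parent_crossover(p1, p2, p3):
    """Take each bit where p1 and p2 agree; otherwise take the bit from p3."""
    return [a if a == b else c for a, b, c in zip(p1, p2, p3)]

# Slide example: C1 = 01101001
print(three_parent_crossover([1,1,0,1,0,0,0,1],
                             [0,1,1,0,1,0,0,1],
                             [0,1,1,0,1,1,0,1]))
```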


Comments on the binary crossover techniques

1 Non-uniform variation:
It cannot combine all possible schemas (i.e. building blocks).

For example, it cannot in general combine instances of
1 1 * * * * * 1
and
* * * * 1 1 * *
to form an instance of
1 1 * * 1 1 * 1.

2 Positional bias:
The schemas that can be created or destroyed by a crossover
depends strongly on the location of the bits in the chromosomes.



Comments on the binary crossover techniques

3 End-point bias:
It is also observed that single-point crossover treats some loci
preferentially, that is, the segments exchanged between the two
parents always contain the end points of the strings.
4 Hamming cliff problem:
A one-bit change can make a large (or a small) jump in the
phenotype, while a multi-bit change can make a small (or a large)
jump.
For example, 1000 =⇒ 0111
(Here, the Hamming distance is 4, but the distance between the
phenotypes is 1.)
Similarly, 0000 =⇒ 1000
(Here, the Hamming distance is 1, but the distance between the
phenotypes is 8.)



Comments on the binary crossover techniques

To reduce the positional bias and end-point bias, two-point
crossover and multi-point crossover schemes have evolved.

In contrast, UX and HUX distribute the patterns in the parent
chromosomes widely, resulting in too much disruption in the
offspring.

To avoid binary code related problem, gray coding can be used.

In summary, binary coding is the simplest encoding and its


crossover techniques are fastest compared to the crossover
techniques in other GA encoding schemes.



Crossover Techniques in Real Coded GA



Crossover techniques in Real coded GA

Following are the few well known crossover techniques for the
real-coded GAs.

Linear crossover
Blend crossover
Binary simulated crossover



Linear crossover in Real-coded GAs

This scheme uses some linear functions of the parent


chromosomes to produce the new children.

For example:
Suppose P1 and P2 are the values of a parameter in the two parents;
then the corresponding offspring values can be obtained as

Ci = αi P1 + βi P2

where i = 1, 2, · · · , n (the number of children), and αi and βi are
some constants.



Linear crossover: An example
Example :
Suppose P1 = 15.65 and P2 = 18.83
α1 = 0.5 = β1
α2 = 1.5 and β2 = −0.5
α3 = −0.5 and β3 = 1.5
Answer :
C1 = 0.5 × (P1 + P2 ) = 17.24
C2 = 1.5 × P1 − 0.5 × P2 = 14.06
C3 = −0.5 × P1 + 1.5 × P2 = 20.42

C2=14.06 C1=17.24 C3=20.42

10.0 P1=15.65 P2=18.83 25.0

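The worked example can be reproduced with a few lines of Python (a sketch; the default coefficient pairs are the three (αi, βi) pairs from the example):

```python
def linear_crossover(p1, p2, coeffs=((0.5, 0.5), (1.5, -0.5), (-0.5, 1.5))):
    """Children C_i = alpha_i * P1 + beta_i * P2 for each (alpha_i, beta_i) pair."""
    return [a * p1 + b * p2 for a, b in coeffs]

# Slide example: 17.24, 14.06, 20.42
print(linear_crossover(15.65, 18.83))
```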


Advantages and limitations

Advantages
1 It is simple to calculate and hence fast in computation.
2 It can generate a large set of offspring from two parent values.
3 It allows control over a wide range of variations.

Limitations
1 The values of αi and βi need to be decided.
2 It is difficult for inexperienced users to choose the right values
of αi and βi.
3 If the αi and βi values are not chosen properly, the solution may
get stuck in a local optimum.



Blend crossover in Real-coded GAs

This scheme can be stated as follows.

1 Let P1 and P2 be the values of a parameter in the two parents'
chromosomes, such that P1 < P2.
2 Then the blend crossover scheme creates children solutions
lying in the range

[{P1 − α (P2 − P1 )} · · · {P2 + α (P2 − P1 )}]

where α is a constant to be decided so that the children solutions
do not come out of the range of the domain of the said parameter.



Blend crossover in Real-coded GAs

3 Another parameter γ is computed from α and a random number r
in the range (0.0, 1.0), both exclusive, as follows:

γ = (1 + 2α) r − α

4 The children solutions C1 and C2 are determined from the parents


as follows,
C1 = (1 − γ) P1 + γP2
C2 = (1 − γ) P2 + γP1



Blend crossover : An example

Example :
P1 = 15.65 and P2 = 18.83
α = 0.5 and r = 0.6, so γ = (1 + 2α) r − α = 0.7

New offspring:
C1 = (1 − γ) P1 + γ P2 = 17.88
C2 = (1 − γ) P2 + γ P1 = 16.60

(Parents P1 = 15.65 and P2 = 18.83 in the domain [10.0, 25.0])
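The γ-based rule can be sketched as follows (an illustrative sketch; the function name is ours). With α = 0.5 and r = 0.6, γ = 0.7:

```python
def blend_crossover(p1, p2, alpha, r):
    """Blend crossover: gamma = (1 + 2*alpha)*r - alpha; the children are
    affine combinations of the two parents."""
    gamma = (1 + 2 * alpha) * r - alpha
    c1 = (1 - gamma) * p1 + gamma * p2
    c2 = (1 - gamma) * p2 + gamma * p1
    return c1, c2

print(blend_crossover(15.65, 18.83, alpha=0.5, r=0.6))  # ≈ (17.876, 16.604)
```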


Simulated binary crossover in Real-coded GAs

This scheme is based on the probability distribution of generated


children solution from the given parents.

A spread factor α is used to represent the spread of the children
solutions with respect to that of the parents, as given below:

α = (C1 − C2) / (P1 − P2)

Here, P1 and P2 represent the parent points, and C1 and C2 are the
two children solutions.



Simulated binary crossover in Real-coded GAs

Three different cases may occur:

Case 1: α < 1 (Contracting Crossover)


The spread of children is less than the parents.

Case 2: α > 1 (Expanding Crossover)


The spread of children is more than the parents.

Case 3: α = 1 (Stationary Crossover)


The spread of children is same as that of the parents.



Simulated Binary Crossover

Probability Distribution:

Case 1: For Contracting Crossover

C(α) = 0.5 (q + 1) α^q

Case 2: For Expanding Crossover

E(α) = 0.5 (q + 1) / α^(q+2)


Simulated binary crossover in Real-coded GAs
Following steps are used to create two children solutions C1 and C2
from the parents P1 and P2 .

1 Create a random number r ∈ [0.0, 1.0].

2 Determine α′ such that

∫₀^α′ C(α) dα = r ,  if r < 0.5
and
∫₁^α′ E(α) dα = r − 0.5 ,  if r > 0.5

3 Using the value of α′, obtain the two children solutions as follows:

C1 = 0.5 [(P1 + P2 ) − α′ |P2 − P1 |]

C2 = 0.5 [(P1 + P2 ) + α′ |P2 − P1 |]

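Solving the two integrals above in closed form gives the sketch below (the closed forms are our derivation from the stated densities; they are consistent with the worked example, where r = 0.6 yields α′ ≈ 1.0772):

```python
def sbx(p1, p2, q, r):
    """Simulated binary crossover.  Integrating C(a) = 0.5(q+1)a^q from 0 to a'
    (r < 0.5), or E(a) = 0.5(q+1)/a^(q+2) from 1 to a' (r > 0.5), and solving
    for a' gives the closed forms below for the spread factor."""
    if r < 0.5:
        a = (2 * r) ** (1.0 / (q + 1))                 # contracting: a' < 1
    else:
        a = (1.0 / (2 * (1 - r))) ** (1.0 / (q + 1))   # expanding: a' > 1
    c1 = 0.5 * ((p1 + p2) - a * abs(p2 - p1))
    c2 = 0.5 * ((p1 + p2) + a * abs(p2 - p1))
    return c1, c2

print(sbx(15.65, 18.83, q=2, r=0.6))  # a' ≈ 1.0772, as in the worked example
```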


Simulated binary crossover in Real-coded GAs
Example:
P1 = 15.65, P2 = 18.83, q = 2
Assuming expanding crossover with r > 0.5 (r = 0.6), α′ = 1.0772.

Offspring: C1 ≈ 15.53, C2 ≈ 18.95
(Parents P1 = 15.65 and P2 = 18.83 in the domain [10.0, 25.0])



Advantages and limitations
Advantages
1 We can generate a large number of offspring from two parents.
2 More exploration with diverse offspring.
3 Results are accurate and usually terminate at the global optimum.
4 Termination within a smaller number of iterations.
5 The crossover techniques are independent of the length of the
chromosome.

Limitations
1 Computationally expensive compared to binary crossover.
2 If the values of the parameters involved in the crossover
techniques are not chosen judiciously, it may lead to premature
convergence with not-necessarily-optimum solutions.



Crossover Techniques in Order GAs



Crossover techniques in order GA
Binary crossover techniques are not applicable to Order-coded
GAs.
Example (reference: TSP)
Consider any two chromosomes with the Order-coded encoding scheme.

A B C D E F G H
Before crossover

H G F E D C B A

K-point

A B C D E C B A
After Single point
binary crossover
H G F E D F G H

Here, the offspring are not valid chromosomes

Since the sequence of gene values is important, real-coded
crossover techniques, which produce a real number from two
given real numbers, are also not applicable to Order-coded GAs.
Crossover techniques in order GA
Some important crossover techniques in Order-coded GAs are:
1 Single-point order crossover
2 Two-point order crossover
3 Partially mapped crossover (PMX)
4 Position based crossover
5 Precedence-preservation crossover (PPX)
6 Edge recombination crossover

Assumptions: For all crossover techniques, we assume the following:

Let L be the length of the chromosome.
P1 and P2 are two parents (selected from the mating pool).
C1 and C2 denote the offspring (initially empty).



Single point order crossover

Given two parents P1 and P2 with chromosome length, say L.


Steps:

1 Randomly generate a crossover point K such that (1 < K < L).


2 Copy the left schema of P1 into C1 (initially empty) and left
schema of P2 into C2 (also initially empty).
3 For the right schema of C1 , copy the gene values from P2 in the
same order as they appear there, skipping those already present
in the left schema.
4 Repeat the same procedure to complete C2 from P1 .



Single point order crossover: Illustration

Example :

Crossover Point K

P1 : A C D E B F G H J I

P2 : E D C J I H B A F G

C1 : A C D E J I H B F G

C2: E D C J A B F G H I

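The steps can be sketched with list filters (an illustrative sketch; the function name is ours). It reproduces the illustration above:

```python
def single_point_order_crossover(p1, p2, k):
    """Left schema kept from each parent; the rest is filled from the other
    parent in order, skipping genes already present."""
    c1 = p1[:k] + [g for g in p2 if g not in p1[:k]]
    c2 = p2[:k] + [g for g in p1 if g not in p2[:k]]
    return c1, c2

# Slide example with k = 4
c1, c2 = single_point_order_crossover(list("ACDEBFGHJI"), list("EDCJIHBAFG"), 4)
print("".join(c1), "".join(c2))  # ACDEJIHBFG EDCJABFGHI
```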


Two-point order crossover

It is similar to the single-point order crossover, but with two k−points.


Steps:

1 Randomly generate two crossover points K1 and K2 such that
1 < K1 < K2 < L.
2 The schema in middle of P1 and P2 are copied into C1 and C2
(initially both are empty), respectively in the same location as well
as in the same order.
3 The remaining schema in C1 and C2 are copied from P2 and P1
respectively, so that an element already selected in child solution
does not appear again.



Two-point order crossover: Illustration

Example :

K1 K2

P1 : A C D E B F G H J I

P2 : E D C J I H B A F G

C1 : E D C J B F G I H A

C2: A C D E I H B F G J



Precedence preservation order crossover

Let the parent chromosomes be P1 and P2 and the length of


chromosomes be L.
Steps:

(a) Create a vector V of length L randomly filled with elements from


the set {1, 2}.

(b) This vector defines the order in which genes are successfully
drawn from P1 and P2 as follows.

1 We scan the vector V from left to right.


2 Let the current position in the vector V be i (where i = 1, 2, · · · , L).
3 Let j (where j = 1, 2, · · · , L) and k (where j = 1, 2, · · · , L) denotes
the j th and k th gene of P1 and P2 , respectively. Initially j = k = 1.



Precedence preservation order crossover

4 If the i th value is 1, then
delete the j th gene value from P1 as well as from P2 and
append it to the offspring (which is initially empty).
5 Else
delete the k th gene value from P2 as well as from P1 and
append it to the offspring.
6 Repeat Step 2 until both P1 and P2 are empty and the offspring
contains all gene values.



Precedence preservation order crossover :
Example

Example :

Random Vector σ 2 1 1 2 1 1 2 2 1 2

P1 : A C D E B F G H J I

P2 : E D C J I H B A F G

C1 : E C D J B F H A I G

C2: ?

Note : We can create another offspring by interchanging the roles
of 1 and 2.

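The PPX steps can be sketched as below, assuming the gene drawn at each step is always the first remaining gene of the chosen parent (with this reading, the slide's vector yields the offspring EACDBFJIGH; the exact string depends on the drawing convention used):

```python
def ppx(p1, p2, v):
    """Precedence-preservation crossover: scan v left to right; 1 draws the
    first remaining gene of p1, 2 that of p2; each drawn gene is deleted
    from both parents and appended to the (initially empty) offspring."""
    p1, p2 = list(p1), list(p2)
    child = []
    for choice in v:
        g = p1[0] if choice == 1 else p2[0]
        child.append(g)
        p1.remove(g)
        p2.remove(g)
    return child

v = [2, 1, 1, 2, 1, 1, 2, 2, 1, 2]
print("".join(ppx(list("ACDEBFGHJI"), list("EDCJIHBAFG"), v)))
```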


Position-based order crossover

Steps :

1 Choose n crossover points K1 , K2 · · · Kn such that n ≪ L, the
length of the chromosome.
2 The gene values at K1th , K2th · · · Knth positions in P1 are directly
copied into offspring C1 (Keeping their position information intact).
3 The remaining gene values in C1 will be obtained from P2 in the
same order as they appear there except they are already not
copied from P1 .
4 We can reverse the role of P1 and P2 to get another offspring C2 .



Position-based order crossover : Example

Let us consider three k-points, namely K1 , K2 and K3 , in this example.

K1 K2 K3

P1 : A C D E B F G H J I

P2 : E D C J I H B A F G

C1 : E C D J B I A H F G

C2: D E C B I F G A H J



Edge recombination order crossover

This crossover technique is used to solve the TSP when the
cities are not completely connected to each other.

In this technique, an edge table, which contains the adjacency
information (but not the order), is built.

In other words, the edge table provides connectivity information.



Edge recombination order crossover: Illustration
Example

Let us consider a problem instance of a TSP with 8 cities.
Assume any two chromosomes P1 and P2 for the mating.

P1 : 1 2 4 6 8 7 5 3

P2 : 4 3 5 7 8 6 2 1

[Connectivity graph over the 8 cities]


Edge recombination order crossover: Illustration

Edge table for the connectivity graph:

City : Connectivity
1 : 2 4 3
2 : 1 4 7 6
3 : 1 4 5
4 : 1 2 3 6
5 : 3 7 8
6 : 2 4 8
7 : 2 5 8
8 : 5 6 7


Edge recombination order crossover: Illustration
Steps:
Let the child chromosome be C1 (initially empty).

1 Start the child tour with the starting city of P1 . Let this city be X .
2 Append city X to C1 .
3 Delete all occurrences of X from the connectivity lists of all cities
(right-hand column).
4 From city X , choose the next city, say Y , as one with the minimum
number of remaining connectivity links (or any one, if there is no
choice).
5 Make X = Y (i.e. the new city Y becomes city X ).
6 Repeat Steps 2–5 until the tour is complete.
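In standard edge recombination the edge table is usually built from the two parent tours themselves (treated as cycles); the slide's table is instead read off the full connectivity graph, so some rows can differ. A sketch of building the table from the parents:

```python
def edge_table(*tours):
    """Adjacency table: for each city, the set of its neighbours over all
    given tours (order information is discarded)."""
    table = {}
    for tour in tours:
        n = len(tour)
        for i, city in enumerate(tour):
            nbrs = table.setdefault(city, set())
            nbrs.add(tour[i - 1])         # predecessor (wraps around)
            nbrs.add(tour[(i + 1) % n])   # successor (wraps around)
    return table

t = edge_table([1, 2, 4, 6, 8, 7, 5, 3], [4, 3, 5, 7, 8, 6, 2, 1])
print(t[1])  # neighbours of city 1 in either parent tour
```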
Any questions??



Mutation and Fitness Scalling in GAs

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

15.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 15.03.2016 1 / 30


Important GA Operations

1 Encoding
2 Fitness Evaluation and Selection
3 Mating pool
4 Reproduction
Crossover
Mutation
Inversion
5 Convergence test



This lecture includes ...

1 Encoding
2 Fitness evaluation and Selection
3 Mating pool
4 Crossover
5 Mutation
6 Inversion
7 Convergence test
8 Fitness scaling



Mutation Operation
In genetic algorithm, the mutation is a genetic operator used to
maintain genetic diversity from one generation of a population (of
chromosomes) to the next.
It is analogous to biological mutation.
In GA, the concept of biological mutation is modeled artificially to
bring a local change over the current solutions.

[Figure: mutation in the natural biological process vs. mutation in a
genetic algorithm — in both, mutation lets evolution escape a local
optimum and move towards the global optimum]


Mutation Operation in GAs

Like the different crossover techniques in different GAs, there are many
variations in mutation operations.

Binary Coded GA :
Flipping
Interchanging
Reversing
Real Coded GA :
Random mutation
Polynomial mutation
Order GA :
Tree-encoded GA :



Mutation operation in Binary coded GA



Mutation Operation in Binary coded GA

In binary-coded GA, the mutation operator is simple and
straightforward.
In this case, one (or a few) 1(s) is (are) converted to 0(s) and
vice-versa.
A common method of implementing the mutation operator involves
generating a random variable, compared against a mutation
probability (µp ), for each bit in a sequence.
This mutation probability tells us whether or not a particular bit will
be mutated (i.e. modified).
Note :

To avoid large deflection, µp is generally kept to a low value.
It is generally varied in the range 0.1/L to 1.0/L, where L is the
string length.



Mutation in Binary-coded GA : Flipping
Here, a mutation chromosome of the same length as the
individual's chromosome is created, with a probability pµ of 1s in
its bits.
For every 1 in the mutation chromosome, the corresponding bit in
the parent chromosome is flipped (0 to 1 or 1 to 0), and the mutated
chromosome is produced.

1 0 1 1 0 0 1 0

offspring

1 0 0 0 1 0 0 1

Mutation chromosome

0 0 1 1 1 0 1 1

Mutated offspring

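An equivalent per-bit view of flipping (each bit is flipped independently with probability µp; a sketch, the function name is ours):

```python
import random

def flip_mutation(chrom, mu_p, rng=random):
    """Flip each bit of a binary chromosome independently with probability mu_p."""
    return [1 - g if rng.random() < mu_p else g for g in chrom]
```

With mu_p = 0 the chromosome is unchanged; with mu_p = 1 every bit is flipped.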


Binary-coded GA : Interchanging

Two positions of a child's chromosome are chosen randomly, and
the bits corresponding to those positions are interchanged.

* *
1 0 1 1 0 1 0 0

Child chromosome

1 1 1 1 0 0 0 1

Mutated chromosome



Mutation in Binary-coded GA : Reversing

A position is chosen at random, and the bit next to that position
is reversed to produce the mutated child.

*
0 1 1 0 0 1 0 1

Child chromosome

0 1 1 0 0 1 1 1

Mutated chromosome



Mutation operation in Real-coded GA





Mutation in Real-coded GA : Random mutation

Here, the mutated solution is obtained from the original solution
using the following rule:

Pmutated = Poriginal + (r − 0.5) × ∆

where r is a random number lying between 0.0 and 1.0, and ∆ is
the maximum value of the perturbation, decided by the user.

Example :
Poriginal = 15.6
r = 0.7
∆ = 2.5
Then, Pmutated = 15.6 + (0.7 − 0.5) × 2.5 = 16.1



Mutation in Real-coded GA : Polynomial mutation
It is a mutation operation based on the polynomial distribution.
Following steps are involved.
1 Calculate a random number r lying between 0.0 and 1.0
2 Calculate the perturbation factor δ using the following rule:

δ = (2r)^(1/(q+1)) − 1 ,  if r < 0.5
δ = 1 − [2 (1 − r)]^(1/(q+1)) ,  if r ≥ 0.5

where q is an exponent (positive or negative value) decided by the
user.
3 The mutated solution is then determined from the original solution
as follows:

Pmutated = Poriginal + δ × ∆

where ∆ is the user-defined maximum perturbation allowed
between the original and mutated values.



Mutation in Real-coded GA : Polynomial mutation

Example :
Poriginal = 15.6, r = 0.7, q = 2, ∆ = 1.2. Then Pmutated = ?

δ = 1 − [2 (1 − r)]^(1/(q+1)) = 0.1565
Pmutated = 15.6 + 0.1565 × 1.2 = 15.7878

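The three steps translate directly into Python (a sketch; the function name is ours). With the example's inputs it reproduces 15.7878:

```python
def polynomial_mutation(p, r, q, delta_max):
    """Perturbation delta drawn from the polynomial distribution; the
    mutated value is p + delta * delta_max."""
    if r < 0.5:
        delta = (2 * r) ** (1.0 / (q + 1)) - 1        # negative perturbation
    else:
        delta = 1 - (2 * (1 - r)) ** (1.0 / (q + 1))  # positive perturbation
    return p + delta * delta_max

print(polynomial_mutation(15.6, r=0.7, q=2, delta_max=1.2))  # ≈ 15.7879
```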


Revisiting the flow in GA

Start

Initial Population Encoding

No
Fitness evaluation &
? Converge ?
Selection

Selection
Yes
Select Mate

Stop

Crossover

Reproduction
Mutation

Inversion



Termination and/or convergence criteria

Each iteration should be tested with some convergence test.
Commonly known convergence tests or termination conditions are:

1 A solution is found that satisfies the objective criteria.
2 A fixed number of generations is executed.
3 Allocated budget (such as computation time) reached.
4 The highest ranking solution fitness is reaching or has reached a
plateau such that successive iterations no longer produce better
result.
5 Manual inspection.
6 Combination of the above.



Fitness Scaling in GA



Issue with fitness values
Let us look into a scenario, which is depicted in the following figure.

[Figure: fitness values over the search space — a few best individuals
stand far above many worst individuals]

Here, the fitness values span a wide range of values.
This highly favours the individuals with large fitness values, and the
search thus gets stuck at a local optimum (premature termination /
inaccurate result).
Issue with fitness values
Now, let us look into another scenario, which is depicted in the
following figure.

[Figure: fitness values over the search space, all lying within a
narrow band]

Here, the fitness values span a narrow range of values.
In this case, successive iterations will not show any improvement,
and the search gets stuck at a local optimum (premature termination /
inaccurate result).
Summary of observations

It is observed that:

If the fitness values are too far apart, then selection will pick
several copies of the good individuals, and many of the worst
individuals will not be selected at all.

This tends to fill the entire population with very similar
chromosomes and limits the ability of the GA to explore a large
part of the search space.

If the fitness values are too close to each other, then the GA will
tend to select one copy of each individual; consequently, it will not
be guided by small fitness variations, and the search scope will be
reduced.



Why fitness scaling?

As a way out, we could use crossover or mutation (or both) with
a higher fluctuation in the values of the design parameters.

However, this leads to a chaotic search:
it may jump from one local optimum to another, and
it needs a higher number of iterations.

As an alternative to the above, we can use a fitness scaling
strategy.

Fitness scaling is used to scale the raw fitness values so that the GA
sees a reasonable amount of difference in the scaled fitness values of
the best versus worst individuals.



Approaches to fitness scaling

In fact, fitness scaling is a sort of discriminating operation in GA.
A few algorithms are known for fitness scaling:

Linear scaling

Sigma scaling

Power law scaling

Note:
The fitness scaling is useful to avoid premature convergence, and
slow finishing.



Linear scaling

Algorithm

Input : F = {f1 , f2 · · · fN } is a set of raw fitness values of N
individuals in a population (initial).

Output : F′ = {f′1 , f′2 · · · f′N } is the set of fitness values after scaling.

Steps :

1 Calculate the average fitness value

f̄ = (Σ_{i=1}^{N} fi ) / N

2 Find fmax = MAX (F ), the maximum value in the set F .
3 Find fmin = MIN(F ), the minimum value in the set F .


Linear scaling

4 Calculate the following:

a = f̄ / (f̄ − fmin )
b = (f̄ × fmin ) / (fmin − f̄ )

(With these values, fmin is scaled to 0 and the average fitness is
preserved, i.e. f̄′ = f̄ .)

5 For each fi ∈ F do

f′i = a × fi + b
F′ = F′ ∪ {f′i }

where F′ is initially empty.

6 End



Linear scaling

Note :

1 For better scaling it is desirable to have f̄ = f̄ ′ .

2 In order to avoid dominance by super individuals, the number
of copies can be controlled with f ′max = C × f̄ ′ , where
C = (fmax − fmin ) / (f̄ − fmin ).



Sigma scaling
Algorithm
Input : F = {f1 , f2 , · · · , fN } is a set of raw fitness values of N
individuals in a population.
Output : F ′ = {f1′ , f2′ , · · · , fN′ } is a set of fitness values after scaling.

Steps :
1 Calculate the average fitness value

f̄ = (Σ_{i=1}^{N} fi ) / N

2 Determine the reference worst-case fitness value fw such that
fw = f̄ + S × σ
where σ = STD(F ) is the standard deviation of the fitness of the
population, and
S is a user-defined factor called the sigma scaling factor (usually
1 ≤ S ≤ 5)
Sigma scaling

3 Calculate fi′ as follows:


For each fi ∈ F do
fi′ = fw − fi , if (fw > fi )
else fi′ = 0
4 Discard all the individuals with fitness value 0
5 Stop
Note :

Linear scaling alone may yield some individuals with negative


fitness values.
Hence, linear scaling is usually adopted after sigma scaling to
avoid the possibility of negative-fitness individuals.
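The sigma scaling steps can be sketched as below. A minimal sketch: whether STD means the population or the sample standard deviation is not specified above, so the population form is assumed here.

```python
import statistics

def sigma_scaling(F, S=2.0):
    """Sigma scaling as in the steps above: f' = f_w - f when f_w > f, else 0.

    f_w = f_bar + S*sigma is the reference worst-case fitness; S is the
    sigma scaling factor (typically 1 <= S <= 5).
    """
    f_bar = sum(F) / len(F)
    sigma = statistics.pstdev(F)  # population standard deviation (assumed)
    f_w = f_bar + S * sigma
    return [f_w - f if f_w > f else 0.0 for f in F]
```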



Power law scaling

In power law scaling, the scaled fitness value is given by

fi′ = fi^k

where k is a problem-dependent constant.
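As a one-line sketch of the formula above:

```python
def power_law_scaling(F, k):
    """Power law scaling: f' = f**k, with k a problem-dependent constant."""
    return [f ** k for f in F]
```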





Multi-Objective Optimization: Introduction

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

18.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 18.03.2016 1 / 53


Multiobjective optimization problem: MOOP
There are three components in any optimization problem:
F: Objectives
minimize (maximize) fi (x1 , x2 , · · · , xn ), i = 1, 2, · · · , m

S: Constraints
Subject to
gj (x1 , x2 , · · · , xn ) ROPj cj , j = 1, 2, · · · , l

V: Design variables

xk ROPk dk , k = 1, 2, · · · , n
(Here, ROP denotes a relational operator, such as ≤, = or ≥.)
Note :
1 For a multi-objective optimization problem (MOOP), m ≥ 2
2 Objective functions can be either minimization, maximization or
both.
A formal specification of MOOP
Let us consider, without loss of generality, a multi-objective
optimization problem with n decision variables and m objective
functions

Minimize y = f (x) = [y1 = f1 (x), y2 = f2 (x), · · · , ym = fm (x)]

where

x = [x1 , x2 , · · · , xn ] ∈ X
y = [y1 , y2 , · · · , ym ] ∈ Y

Here :
x is called decision vector
y is called an objective vector
X is called a decision space
Y is called an objective space
Illustration: Decision space and objective space

[Figure: a mapping from the decision space X to the objective space Y ]

Thus, solving a MOOP implies searching for an x in the decision space
(X ) that yields an optimum vector (y) in the objective space (Y ).



A formal specification of MOOP (contd...)

In other words,

1 We wish to determine X̄ ⊆ X (called the feasible region in X ); any
point x̄ ∈ X̄ (which satisfies all the constraints in the MOOP) is called
a feasible solution.
2 Also, we wish to determine from among the set X̄ a particular
solution x̄ ∗ that yields the optimum values of all the objective functions.
Mathematically,

∀x̄ ∈ X̄ and ∃x̄ ∗ ∈ X̄ | fi (x̄ ∗ ) ≤ fi (x̄),

where ∀i ∈ [1, 2, · · · , m]
3 If this is the case, then we say that x̄ ∗ is a desirable solution.



Why solving a MOOP is an issue?

In a single-objective optimization problem, the task is typically to find
the one solution which optimizes the sole objective function.

In contrast to a single-objective optimization problem, in a MOOP:

The cardinality of the optimal set is more than one; that is, there are
m ≥ 2 goals of optimization instead of one.
There are m ≥ 2 different search points (possibly in different
decision spaces) corresponding to the m objectives.

Optimizing each objective individually does not necessarily give the
optimum solution.

That is possible only if the objective functions are independent of
their solution spaces.



Illustration: Single vs. multiple objectives

[Figure: a single objective over one search space vs. multiple
objectives f1 , f2 , f3 , f4 over their search spaces]


Why solving an MOOP is an issue?

In fact, the majority of real-world MOOPs come with a set of trade-off
optimal solutions. A set of trade-off optimal solutions is also
popularly termed Pareto-optimal solutions.

At a particular search point, one objective may be at its best whereas
another may be at its worst.

Also, sometimes MOOPs come with conflicting objectives.

Thus, optimizing one objective means compromising the other(s), and
vice-versa.



MOOP: Trade-off and conflicts in solutions

[Figure: trade-off optimal solutions for maximize f2 vs. minimize f1
(left), and conflicting objectives over the search space (right)]



Illustration: ideal solution vs. real solution
It is observed that in many real-life problems, we hardly have a
situation in which all the fi (x̄) have a minimum in X̄ at a common point
x̄ ∗ .
This is particularly true when the objective functions are conflicting in
their interests.

[Figure: with both F1 and F2 to be minimized — in the ideal situation
both objectives have a common minimum; in the real situation they do
not]



GA-based approach to solve MOOPs
MOEA : Multi-Objective Evolutionary Algorithm

MOEA follows the same reproduction operations as in GA but follows


different selection procedures and fitness assignment strategies.
There are also a number of stochastic approaches, such as
Simulated Annealing (SA), Ant Colony Optimization (ACO),
Particle Swarm Optimization (PSO) and Tabu Search (TS), that could
be used to solve MOOPs.
MOEA: GA-based approach to solve MOOP

There are two broad approaches to solve MOOPs with MOEA:


A priori approach (also called the preference-based approach)
A posteriori approach (does not require any prior knowledge)

Two major problems must be addressed when a GA is applied to


multi-objective optimization problems:

1 How to accomplish fitness assignment and selection in order to


guide the search toward the optimal solution set?
2 How to maintain a diverse population in order to prevent
premature convergence and achieve a well-distributed trade-off
front?



Schematic of a priori MOEA approach

[Schematic: a MOOP (minimize f1 , f2 , · · · , fm subject to constraints S
with design variables V), together with higher-level information, is used
to estimate a relative importance vector W = {w1 , w2 , · · · , wm }; this
converts the MOOP into the single-objective problem
F = Σ_{i=1}^{m} wi × fi subject to S and V, which a GA solves to give
a single optimal solution]


Schematic of a posteriori MOEA approach

[Schematic: a MOOP (minimize f1 , f2 , · · · , fm subject to constraints S
with design variables V) is fed to an ideal multi-objective optimizer,
which returns multiple Pareto-optimal solutions; one solution is then
chosen using higher-level information]


IDEAL multi-objective optimization

Here, effort is made to find the set of trade-off solutions by
considering all objectives to be important.
Steps

1 Find multiple trade-off optimal solutions with a wide range of
values for the objectives. (Note: here, we do not use any relative
preference vector information.) The task is to find as many
different trade-off solutions as possible.
2 Choose one of the obtained solutions using higher-level
information (i.e. evaluate and compare the obtained trade-off
solutions).



Illustration: Higher level information
Consider the decision making involved in buying a car.
Consider two objectives:

minimize Cost
maximize Comfort



Illustration: Higher level information

Here, solutions 1 and 2 are two extreme cases.

Between these two extreme solutions, there exist many other


solutions, where a trade-off between cost and comfort exists.

In this case, all such trade-off solutions are optimal solutions to the


multi-objective optimization problem.

Often, such trade-off solutions provide a clear front on an


objective space plotted with the objective values.

This front is called the Pareto-optimal front, and all such trade-off


solutions are called Pareto-optimal solutions (after the name of
Vilfredo Pareto, 1848–1923)



Choosing a solution with higher level information

Knowing the number of solutions that exist in the market with


different trade-offs between cost and comfort, which car does one
buy?

It involves many other considerations:

total finance available to buy the car


fuel consumption
depreciation value
road condition
physical health of the passengers
social status
after-sales service, vendor’s reputation, manufacturer’s past history,
etc.


Formal specification of MOEA approach

In the next few slides, we shall discuss the above idea of solving
MOOPs more precisely. Before that, let us get familiar with a few more
basic definitions and terminologies.

1 Concept of domination
2 Properties of dominance relation
3 Pareto-optimization
4 Solutions with multiple-objectives



Solution with multiple objectives : Ideal objective
vector
For each of the M conflicting objectives, there exists a different
optimal solution. An objective vector constructed with these individual
optimal objective values constitutes the ideal objective vector.

Definition 1: Ideal objective vector


Without any loss of generality, suppose the MOOP is defined as
Minimize fm (x), m = 1, 2, · · · , M
Subject to x ∈ S, where S denotes the search space,
and
fm∗ denotes the minimum solution for the m-th objective function; then
the ideal objective vector can be defined as

Z ∗ = f ∗ = [f1∗ , f2∗ , · · · , fM∗ ]
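For a finite sample of candidate solutions, the ideal objective vector can be approximated component-wise. A minimal sketch, assuming all objectives are to be minimized; `objective_values` is a hypothetical list of objective vectors, one per candidate solution.

```python
def ideal_objective_vector(objective_values):
    """Approximate Z* from sampled objective vectors.

    objective_values: list of [f1(x), ..., fM(x)] vectors, one per solution.
    Returns the component-wise minimum (all objectives minimized).
    """
    return [min(col) for col in zip(*objective_values)]
```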



Ideal objective vector : Physical interpretation

[Figure: (A) the ideal objective vector Z ∗ in the (f1 , f2 ) plane;
(B) a good solution vector should be as close to the ideal objective
vector as possible]


Ideal objective vector : Physical interpretation
Let us consider a MOOP with two objective functions f1 and f2 where
both are to be minimized.
If z ∗ = f ∗ = [f1∗ , f2∗ ], then both f1 and f2 are minimum at x ∗ ∈ S.
(That is, there is a feasible solution when the minimum solutions to
both the objective functions are identical.)
In general, the ideal objective vector z ∗ corresponds to a
non-existent solution (this is because the minimum solution for
each objective function need not be the same solution).
If there exists an ideal objective vector, then the objectives are
non-conflicting with each other, and the minimum solution to any
objective function would be the only optimal solution to the MOOP.
Although an ideal objective vector is usually non-existent, it is
useful in the sense that any solution closer to the ideal objective
vector is better. (In other words, it provides knowledge of the
lower bound on each objective function to normalize objective
values within a common range.)
Solution with multiple objectives : Utopian
objective vector
The Utopian objective vector corresponds to a solution which has an
objective value strictly better than (and not equal to) that of any
solution in the search space.

[Figure: the Utopian objective vector Z ∗∗ lies just below and to the left
of the ideal objective vector Z ∗ in the (f1 , f2 ) plane]



Solution with multiple objectives : Utopian
objective vector

The Utopian objective vector can be formally defined as follows.

Definition 2 : Utopian objective vector


A Utopian objective vector z ∗∗ has each of its components marginally
smaller than that of the ideal objective vector, that is,
zi∗∗ = zi∗ − εi with εi > 0 for all i = 1, 2, · · · , M

Note :
Like the ideal objective vector, the Utopian objective vector also
represents a non-existent solution.



Solution with multiple objectives : Nadir objective
vector
The ideal objective vector represents the lower bound of each
objective in the entire feasible search space. In contrast to this, the
Nadir objective vector, denoted as z nadir , represents the upper bound
of each objective in the entire Pareto-optimal set (note: not in the
entire search space).

[Figure: the Nadir objective vector z nadir bounds the Pareto-optimal
front between Z1∗ and Z2∗ , whereas (f1max , f2max ) bounds the entire
search space]


Solution with multiple objectives : Nadir objective
vector

Note :
z nadir is the upper bound with respect to the Pareto-optimal set,
whereas a vector of objectives W is found by using the worst feasible
function values fimax in the entire search space.


Usefulness of Nadir objective vector

In order to normalize each objective in the entire range of the


Pareto-optimal region, the knowledge of the Nadir and ideal objective
vectors can be used as follows:

f̄i = (fi − zi∗ ) / (zi^nadir − zi∗ )
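The normalization formula can be sketched directly. A minimal sketch with hypothetical helper names; it assumes zi^nadir > zi∗ for every objective so the denominators are positive.

```python
def normalize_objectives(f, z_ideal, z_nadir):
    """Normalize each objective value using the ideal and Nadir vectors:
    f_bar_i = (f_i - z*_i) / (z_nadir_i - z*_i).
    Assumes z_nadir_i > z*_i for every objective i.
    """
    return [(fi - zi) / (zn - zi) for fi, zi, zn in zip(f, z_ideal, z_nadir)]
```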



Concept of domination

Notation

Suppose f1 , f2 , · · · , fM are the objective functions,


and xi and xj are any two solutions.
The operator ◁ between two solutions xi and xj , as xi ◁ xj , is used to
denote that solution xi is better than solution xj on a particular
objective.
Alternatively, xi ▷ xj for a particular objective implies that solution
xi is worse than solution xj on this objective.
Note :
If an objective function is to be minimized, the operator ◁ means
”<” (the less-than operator), whereas if the objective function is to be
maximized, the operator ◁ means ”>” (the greater-than operator).



Concept of domination

Definition 3 : Domination
A solution xi is said to dominate another solution xj if both conditions I
and II are true.
Condition I :
The solution xi is no worse than xj in all objectives. That is,
fk (xi ) ̸▷ fk (xj ) for all k = 1, 2, · · · , M

Condition II :
The solution xi is strictly better than xj in at least one objective. That is,
fk̄ (xi ) ◁ fk̄ (xj ) for at least one k̄ ∈ {1, 2, · · · , M}
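The two conditions can be checked directly in code. A minimal sketch with a hypothetical helper name, assuming all objectives are to be minimized so that "better" means "<".

```python
def dominates(fi, fj):
    """Return True if the solution with objective vector fi dominates fj.

    Assumes every objective is to be minimized, so "better" means "<".
    Condition I: fi is no worse than fj in all objectives.
    Condition II: fi is strictly better than fj in at least one objective.
    """
    no_worse_in_all = all(a <= b for a, b in zip(fi, fj))
    strictly_better_in_one = any(a < b for a, b in zip(fi, fj))
    return no_worse_in_all and strictly_better_in_one
```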



Illustration 1

Consider that f1 and f2 are two objectives to be minimized.

[Figure: three solutions x1 , x2 , x3 in the (f1 , f2 ) plane, both
objectives to be minimized; x1 is better than x2 in both objectives,
whereas neither of the pairs (x1 , x3 ) and (x2 , x3 ) has one member
better in both objectives]



Illustration 2

[Figure: three solutions x1 , x2 , x3 in the (f1 , f2 ) plane, with f1 to be
minimized and f2 to be maximized; for each pair, which solution
dominates the other?]



Points to be noted

Note :

If either of the conditions I and II is violated, then the solution xi


does not dominate the solution xj .

If xi dominates the solution xj , it is also mathematically denoted as


xi ⪯ xj .
The domination can alternatively be stated in any of the
following ways:

xj is dominated by xi
xi is non-dominated by xj
xi is non-inferior to xj


Illustration 3

[Figure: five solutions in the (f1 , f2 ) plane, with f1 (2–18) to be
maximized and f2 (1–5) to be minimized]

Here, 1 dominates 2, 5 dominates 1, etc.



Properties of dominance relation

Definition 3 defines the dominance relation between any two


solutions.

Whether this dominance relation satisfies the four binary relation
properties is examined below.

Reflexive :
The dominance relation is NOT reflexive.
Any solution x does not dominate itself.

Condition II of Definition 3 does not allow the reflexive property to


be satisfied.
Symmetric :
The dominance relation is also NOT symmetric:
x ⪯ y does not imply y ⪯ x.


Properties of dominance relation

Antisymmetric :

The dominance relation cannot be antisymmetric.

Transitive :

The dominance relation is TRANSITIVE:


if x ⪯ y and y ⪯ z, then x ⪯ z.


Properties of dominance relation
Note :

1 An interesting property that the dominance relation possesses is: if


solution x does not dominate solution y, this does not imply that y
dominates x.
2 In order for a binary relation to qualify as an ordering relation, it
must be at least transitive. Hence, the dominance relation qualifies as
an ordering relation.
3 A relation defines a partially ordered set if it is reflexive,
antisymmetric and transitive. Since the dominance relation is NOT
REFLEXIVE and NOT ANTISYMMETRIC, it is NOT a PARTIAL
ORDER RELATION.
4 Since the dominance relation is not reflexive, it is a STRICT
PARTIAL ORDER.


Pareto optimality

[Figure: the five solutions again, with f1 (2–18) to be maximized and
f2 (1–5) to be minimized; solutions 3 and 5 form the non-dominated
front]


Pareto optimality

Consider solutions 3 and 5.

Solution 5 is better than solution 3 with respect to f1 , while 5 is


worse than 3 with respect to f2 .

Thus, condition I (of Definition 3) is not satisfied for both of these


solutions.

Hence, we can conclude neither that 5 dominates 3 nor that 3


dominates 5.

In other words, we cannot say which of the two solutions 3 and 5 is
better.


Non-dominated set

[Figure: the same five solutions, with f1 (2–18) to be maximized and
f2 (1–5) to be minimized]



Non-dominated set

From the figure it is evident that:

There is a set of solutions, namely 1, 2, 3, 4 and 5.

1 dominates 2; 5 dominates 1; etc.

Neither 3 dominates 5 nor 5 dominates 3.


We say that solutions 3 and 5 are non-dominated with respect to
each other.

Similarly, we say that solutions 1 and 4 are non-dominated.

In this example, there is not a single solution which dominates all


other solutions.



Non-dominated set: A counter example

[Figure: a counter example — five solutions with f1 (2–22) to be
maximized and f2 (1–5) to be minimized]



Non-dominated set

Definition 4 : Non-dominated set


Among a set of solutions P, the non-dominated set of solutions P ′ contains
those which are not dominated by any member of the set P.



Non-dominated set

[Figure: the five solutions again; here P = {1, 2, 3, 4, 5} and the
non-dominated set is P ′ = {3, 5}]



How to find a non-dominated set ?

For a given finite set of solutions, we can perform all pair-wise


comparisons:

find which solution dominates which, and


find which solutions are non-dominated with respect to each other.

Properties of solutions in the non-dominated set P ′ :

Any two members of P ′ do not dominate each other, that is,
∀xi , xj ∈ P ′ : xi ⋠ xj and xj ⋠ xi .

For any solution outside the non-dominated set, we can always find a
solution in this set which dominates it, that is,
∀xi ∈ P with xi ∈/ P ′ , ∃xj ∈ P ′ such that xj ⪯ xi .
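The pairwise-comparison procedure above can be sketched as follows. A simple O(N²) sketch with hypothetical names; `dominates` is the two-condition check of Definition 3, written here for the case where all objectives are minimized.

```python
def dominates(fi, fj):
    """True if fi dominates fj (all objectives minimized)."""
    return (all(a <= b for a, b in zip(fi, fj))
            and any(a < b for a, b in zip(fi, fj)))

def non_dominated_set(P):
    """Return the members of P not dominated by any other member of P."""
    return [p for p in P
            if not any(dominates(q, p) for q in P if q is not p)]
```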



Some important observations

The above definition is not applicable to the ideal situation.

[Figure: the five-solution example (left) contrasted with the ideal
situation where both f1 and f2 are minimized and a single solution is
best (right)]



Some important observations

The non-dominated set concept is applicable when there is a trade-off
among the solutions.

[Figure: trade-off examples — the five-solution plot (f1 maximized, f2
minimized) and a trade-off front on F1 , F2 (both minimized)]



Pareto optimal set

Definition 5: Pareto optimal set


When the set P is the entire search space, that is P = S, the resulting
non-dominated set P 0 is called the Pareto-optimal set.



Examples: Pareto optimal sets
The following figures show the Pareto-optimal set for a set of feasible
solutions over an entire search space, under four different situations
with two objective functions f1 and f2 .

[Figure: Pareto-optimal sets for the four combinations — (min f1 ,
min f2 ), (min f1 , max f2 ), (max f1 , min f2 ) and (max f1 , max f2 )]



Pareto optimal fronts

In a visual representation, all Pareto-optimal solutions lie on a front


called the Pareto-optimal front or, simply, the Pareto front.



Examples

[Figure: examples of Pareto fronts for different minimize/maximize
combinations of f1 and f2 ]



Examples

[Figure: further examples of Pareto fronts for different
minimize/maximize combinations of f1 and f2 ]



Few good articles to read.
1 ”An Updated Survey of GA-Based Multiobjective Optimization
Techniques” by Carlos A. Coello Coello, ACM Computing Surveys,
Vol. 32, No. 2, June 2000.

2 ”Comparison of Multiobjective Evolutionary Algorithms: Empirical


Results” by E. Zitzler, K. Deb and L. Thiele, Evolutionary
Computation, Vol. 8, No. 2, 2000.





Solving MOOP: Non-Pareto MOEA approaches

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

22.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 22.03.2016 1 / 32


Multi-objective evolutionary algorithm
To distinguish GAs that solve single-objective optimization
problems from those that solve MOOPs, a new term,
Evolutionary Algorithm (EA), has been coined.
In many research articles, it is popularly abbreviated as MOEA,
the short form of Multi-Objective Evolutionary Algorithm.
The following is the MOEA framework, where Reproduction is the
same as in GA but different strategies are followed in Selection.

[Flowchart: MOOP → Initialization of Population → Selection →
Convergence Test; on Yes → Solution, on No → Reproduction →
back to Selection]



Difference between GA and MOEA

1 The differences between GA and MOEA lie in the input (single


objective vs. multiple objectives) and the output (a single solution vs.
trade-off solutions, also called Pareto-optimal solutions).

2 Two major problems are handled in MOEA:

How to accomplish fitness assignment (evaluation) and selection


thereafter, in order to guide the search toward the Pareto-optimal
set.
How to maintain a diverse population, in order to prevent premature
convergence and achieve a well-distributed Pareto-optimal front.



Classification of MOEA techniques

MOEA Techniques

A priori approach:
  Aggregation (Ordering): Lexicographic ordering
  Aggregation (Scalarization): Linear fitness evaluation (SOEA),
    Non-linear fitness evaluation (SOEA), Goal attainment,
    Weighted Min-max method, Game theory

A posteriori approach:
  Independent sampling
  Hybrid selection
  Criterion selection (VEGA)
  Pareto selection: Ranking, Ranking and Niching, Demes, Elitist
Classification of MOEA techniques

Note :

An a priori technique requires knowledge to define the relative


importance of the objectives prior to the search.

An a posteriori technique searches for Pareto-optimal solutions from


a set of feasible solutions.


MOEA techniques to be discussed

1 A priori approaches
Lexicographic ordering
Simple weighted approach (SOEA)

2 A posteriori approaches
Criterion selection (VEGA)
Pareto-based approaches

Rank-based approach (MOGA)


Rank + Niche based approach (NPGA)
Non-dominated sorting based approach (NSGA)
Elitist non-dominated sorting based approach (NSGA-II)



MOEA techniques to be discussed

1 Non-Pareto based approaches


Lexicographic ordering
Simple weighted approach (SOEA)
Criterion selection (VEGA)

2 Pareto-based approaches

Rank-based approach (MOGA)


Rank + Niche based approach (NPGA)
Non-dominated sorting based approach (NSGA)
Elitist non-dominated sorting based approach (NSGA-II)



Lexicographic Ordering



Lexicographic ordering method

Reference :
”Compaction of Symbolic Layout using Genetic Algorithms” by M. P.
Fourman, in Proceedings of the 1st International Conference on Genetic
Algorithms, pages 141–153, 1985.

It is an a priori technique based on the principle of ”aggregation
with ordering”.



Lexicographic ordering method
Suppose a MOOP with k objectives and n constraints over a decision
space x is denoted as:

Minimize

f = [f1 , f2 , · · · , fk ]
Subject to

gj (x) ≤ cj , where j = 1, 2, · · · , n

1 The objectives are ranked in the order of their importance (done by


the programmer). Suppose the objectives are arranged in the
following order:
f = [f1 < f2 < f3 < · · · < fk ]

Here, fi < fj implies fi is of higher importance than fj .



Lexicographic ordering method
2 The optimum solution x̄ ∗ is then obtained by minimizing each
objective function one at a time, as follows.

(a) Minimize f1 (x)


Subject to gj (x) ≤ cj , j = 1, 2, · · · , n
Let its solution be x̄1∗ , that is f1∗ = f1 (x̄1∗ )

(b) Minimize f2 (x)


Subject to gj (x) ≤ cj , j = 1, 2, · · · , n
f1 (x) = f1∗
Let its solution be x̄2∗ , that is f2∗ = f2 (x̄2∗ )

.................................................................
.................................................................
(c) At the i-th step, we have
Minimize fi (x)
Subject to gj (x) ≤ cj , j = 1, 2, · · · , n
fl (x) = fl∗ , l = 1, 2, · · · , i − 1



Lexicographic ordering method

This procedure is repeated until all k objectives have been considered


in the order of their importance.

The solution obtained at the end is x̄k∗ , that is, fk∗ = fk (x̄k∗ ).
This is taken as the desired solution x̄ ∗ of the given multi-objective
optimization problem.
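The sequence of constrained single-objective solves can be sketched as below. A toy sketch over a finite candidate set, with hypothetical names; a real implementation would call a constrained optimizer at each step rather than enumerate candidates.

```python
def lexicographic_optimize(candidates, objectives, tol=1e-9):
    """Minimize the objectives one at a time, most important first.

    candidates: finite list of feasible decision vectors.
    objectives: list of functions [f1, f2, ...], f1 most important.
    After minimizing f_i, only candidates achieving (approximately) the
    optimum f_i* are kept, enforcing the equality constraint f_i(x) = f_i*.
    """
    feasible = list(candidates)
    for f in objectives:
        best = min(f(x) for x in feasible)
        feasible = [x for x in feasible if abs(f(x) - best) <= tol]
    return feasible[0]  # the desired solution x*
```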



Remarks on Lexicographic ordering method

Remarks :

Deciding the priorities (i.e. ranks) of the objective functions is an issue.


The solution may vary if a different ordering is taken.

Different strategies can be followed to address the above issues:


1 Random selection of an objective function at each run.
2 A naive approach: try all k ! orderings of the k objective
functions and then select the best observed result.

Note :
It produces a single solution rather than a set of Pareto-optimal
solutions.



Single Objective Evolutionary Algorithm



SOEA: Single-Objective Evolutionary Algorithm

This is an a priori technique based on the principle of ”linear


aggregation of functions”.

It is alternatively termed the ”Single-Objective Evolutionary


Algorithm” (SOEA).

In much of the literature, this is also termed the weighted sum approach.

In fact, it is a naive approach to solve a MOOP.



SOEA approach to solve MOOPs
This method consists of adding all the objective functions together
using different weighting coefficients for each objective.
This means that our multi-objective optimization problem is
transformed into a scalar optimization problem.
In other words, in order to optimize, say, n objective functions
f1 , f2 , · · · , fn , it computes fitness using

fitness = Σ_{i=1}^{n} wi × fi (x)

where wi ≥ 0 for each i = 1, 2, . . . , n are the weighting coefficients


representing the relative importance of the objectives. It is usually
assumed that

Σ_{i=1}^{n} wi = 1
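The scalarization can be sketched as follows. A minimal sketch with a hypothetical helper name; the objective values are assumed to be pre-scaled to comparable units, and the weights are normalized inside the function so that they sum to 1, as the method usually assumes.

```python
def weighted_sum_fitness(objective_values, weights):
    """Combine n objective values into one scalar fitness.

    weights must be non-negative; they are normalized here so that
    they sum to 1.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    return sum((w / total) * f for w, f in zip(weights, objective_values))
```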



Comments on SOEA

1 This is the simplest approach and works in the same framework as


the Simple GA.
2 The results of solving an optimization problem can vary
significantly as the weighting coefficients change.
3 In other words, it produces different solutions with different values
of the wi ’s.
4 Since very little is usually known about how to choose these
coefficients, it may result in a local optimum.



Local optimum solution in SOEA

[Figure: with both f1 and f2 minimized, the weighted line
w1 f1 + w2 f2 locates a single point on one Pareto front; different
weights can land on different fronts]



Comments on SOEA

3 As a way out of this, it is necessary to solve the same problem for


many different values of the wi ’s.
4 The weighting coefficients do not proportionally reflect the relative
importance of the objectives; they are only factors which, when
varied, locate points in the Pareto set.
5 This method depends not only on the wi values but also on the units
in which the functions are expressed.
6 In that case, we have to scale the objective values, that is,

fitness = Σ_{i=1}^{n} wi × fi (x) × ci

where the ci ’s are constant multipliers that scale the objectives
properly.



Naive Approach : Weighted sum approach
7 The technique cannot be used to find Pareto-optimal solutions
which lie on a non-convex portion of the Pareto-optimal front. In
that case, it gives only one solution, which might be on the Pareto
front.

[Figure: a non-convex Pareto-optimal front in the feasible objective
space; the SOEA solution is a single point on the front]



Vector Evaluated Genetic Algorithm



Vector Evaluated Genetic Algorithm (VEGA)

Proposed by David Schaffer (1985) in

"Multiple Objective Optimization with Vector Evaluated Genetic
Algorithms", Proceedings of the First International Conference on
Genetic Algorithms, pages 93-100, 1985.

It is normally considered the first implementation of a MOEA.

VEGA is an a posteriori MOEA technique based on the principle
of the criterion selection strategy.



Vector Evaluated Genetic Algorithm (VEGA)

About VEGA :

It is an extension of the Simple Genetic Algorithm (SGA).

It is an example of a criterion (or objective) selection technique,
where a fraction of each succeeding population is selected based
on separate objective performance. The specific objective for
each fraction is randomly selected at each generation.

VEGA differs from SGA in the way in which the selection operation
is performed.



Basic steps in VEGA

1 Suppose the given MOOP is to optimize k objective functions
f1 , f2 , · · · , fk .
2 A sub-population is selected according to each objective function
in turn.
3 Thus, k sub-populations, each of size M/k, are selected, where M is
the size of the mating pool (M ≤ N), and N is the size of the input
population.
4 These sub-populations are shuffled together to obtain a new
ordering of individuals.
5 Apply the standard GA operations related to reproduction.
6 This produces the next generation, and Steps 2-5 continue until the
termination condition is reached.
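The per-objective selection and shuffling steps above can be sketched as follows. This is an illustrative sketch only: binary tournaments stand in for the usual proportional (Roulette-wheel) selection to keep the code short, and minimization of each objective is assumed.

```python
import random

def vega_select(population, objectives, M):
    """One generation of VEGA mating-pool selection (a sketch).

    population: list of individuals; objectives: k functions, each to
    be minimized (an assumption); M: mating-pool size, divisible by k."""
    k = len(objectives)
    pool = []
    for f in objectives:                 # step 2: select per objective in turn
        for _ in range(M // k):          # step 3: sub-population of size M/k
            a, b = random.sample(population, 2)
            pool.append(a if f(a) <= f(b) else b)
    random.shuffle(pool)                 # step 4: shuffle sub-populations
    return pool                          # step 5 (reproduction) proceeds as in SGA
```

Each objective thus contributes an M/k share of the mating pool before the shuffled pool goes through ordinary crossover and mutation.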



Overview of the VEGA

[Figure: selection in VEGA. The population of size N at generation i is split into k sub-populations according to fitness values for f1, f2, ..., fk; the sub-populations are shuffled together into one mating pool; crossover and mutation applied to this pool produce generation (i+1) of size N.]


VEGA selection strategy

VEGA consists of the following three major steps:

1 Creating k sub-populations, each of size M/k
2 Shuffling the sub-populations
3 Reproduction of offspring for the next generation (same as in SGA)

We explain the above steps with the following consideration:

Suppose, given a MOOP, we are to optimize k objective
functions f = f1 , f2 , · · · , fk .

Given the population of size N with individuals I1 , I2 , · · · , IN ,

we are to create a mating pool of size M, where M ≤ N.



VEGA: Creation of sub-populations
1 Create a mating pool of size M (M ≤ N)
Generate the i-th sub-population of size M/k, where i = 1, 2, · · · , k.
To do this, follow a proportional selection strategy (such as
Roulette-wheel selection) using the i-th objective function alone.

[Figure: from the current population I1, I2, ..., IN of size N, proportional selection w.r.t. f1 creates sub-population 1, selection w.r.t. f2 creates sub-population 2, ..., selection w.r.t. fk creates sub-population k, together forming a mating pool of size M.]
VEGA: Shuffle the sub-populations
2 Shuffle the sub-populations
Use some shuffling operation (e.g. repeatedly generate two random
numbers i and j between 1 and M, both inclusive, and swap Ii
and Ij, which may lie in different sub-populations).

[Figure: the concatenated sub-populations of size M before and after shuffling, with Ii and Ij swapped.]


VEGA: Reproduction
3 Reproduction: Perform reproduction to produce a new generation
of population size N.

Apply the standard reproduction procedure with
crossover, mutation operators, etc.

[Figure: the mating pool I1, ..., IM is transformed by crossover and mutation into a new generation I1, ..., IN of size N.]
Comments on VEGA

Advantages:

1 VEGA can be implemented in the same framework as SGA (only
with a modification of the selection operation).
2 VEGA can be viewed as optimizing f1 , f2 , · · · , fk simultaneously.
That is, f (x) = ê1 f1 (x) + ê2 f2 (x) + · · · + êk fk (x), where êi is the
i-th unit vector.
Thus, VEGA is a generalization from the scalar genetic algorithm to
a vector evaluated genetic algorithm (and hence its name!).
3 VEGA leads to solutions close to the local optima with regard to each
individual objective.



Comments on VEGA

Disadvantages:

1 The solutions generated by VEGA are locally non-dominated, but
not necessarily globally non-dominated. This is because their
non-dominance is limited to the current population only.
2 "Speciation" problem in VEGA : it involves the evolution of
"species" within the population (which excel on different
objectives).
3 This is so because VEGA selects individuals who excel in one
objective, without looking at the others.
4 This leads to "middling" performance (i.e. an individual with
acceptable performance, perhaps above average, but not
outstanding for any of the objective functions).



Any Questions?



Solving MOOP: Pareto-based MOEA approaches

Debasis Samanta

Indian Institute of Technology Kharagpur


dsamanta@iitkgp.ac.in

29.03.2016

Debasis Samanta (IIT Kharagpur) Soft Computing Applications 29.03.2016 1 / 70


MOEA strategies

MOEA Solution Techniques:

A priori approaches: Lexicographic ordering, Game theory approach, Non-linear fitness evaluation, SOEA, Min-Max method.

A posteriori approaches: Independent sampling, Aggregate selection, Criterion selection (VEGA), Pareto selection (Ranking (MOGA), Ranking and Niching, Demes, Elitist).




MOGA : Multi-Objective Genetic Algorithm



MOGA : Multi-Objective Genetic Algorithm

It is a Pareto-based approach built on a ranking
mechanism proposed by Carlos M. Fonseca and Peter J. Fleming
(1993).
Reference :
C. M. Fonseca and P. J. Fleming, "Genetic Algorithms for
Multiobjective Optimization : Formulation, Discussion and
Generalization", in Proceedings of the 5th International Conference
on Genetic Algorithms, pages 416-423, 1993.
Regarding the "generation" and "selection" of the Pareto-optimal
set, ordering and scaling techniques are required.

MOGA follows these methodologies:

For ordering: dominance-based ranking,
For scaling: linearized fitness assignment and fitness averaging.



Flowchart of MOGA

[Figure: MOGA flowchart; the generational loop repeats until the population has converged.]


Dominance-based ranking

Definition 6 : Rank of a solution


The rank of a certain individual corresponds to the number of
chromosomes in the current population by which it is dominated.
More formally,
If an individual xi is dominated by pi individuals in the current
generation, then rank(xi ) = 1 + pi



Example 1: Dominance-based ranking

[Figure: minimization of f1 and f2; solution xi shown with the region Xi of solutions dominating it shaded.]

Rank(xi) = 1 + |Xi|, where |Xi| is the number of solutions in the shaded region.


Example 2: Dominance-based ranking

[Figure: maximization of f1 and f2; xi is dominated by 11 solutions.]

Rank(xi) = 1 + 11 = 12


Example 3: Dominance-based ranking

[Figure: maximization of f1 and f2; each point is labeled with its domination count, e.g. non-dominated points labeled 1, a heavily dominated interior point labeled 8.]
Interpretation : Dominance-based ranking

Note :

1 Domination count = the number of individuals by which an
individual is dominated.
2 All non-dominated individuals are assigned rank 1.
3 All dominated individuals are penalized according to the
population density of the corresponding region of the trade-off
surface.

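The ranking rule above (rank = 1 + number of dominating individuals) can be sketched as follows, assuming maximization of every objective as in the examples:

```python
def dominates(a, b):
    """a dominates b (maximization): a is no worse than b in every
    objective and strictly better in at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def moga_ranks(points):
    """MOGA rank of each point: 1 + number of individuals dominating it."""
    n = len(points)
    return [1 + sum(dominates(points[j], points[i])
                    for j in range(n) if j != i)
            for i in range(n)]
```

Note that every non-dominated point receives rank 1, while dominated points are penalized in proportion to how crowded the dominating region is.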


Fitness Assignment in MOGA

Steps :

1 Sort the population in ascending order according to their ranks.

2 Assign fitness to individuals by interpolating from the best (rank 1) to
the worst (rank ≤ N, N being the population size) according to
some linear function.
3 Average the fitness of individuals with the same rank, so that all of
them are sampled at the same rate.

This procedure keeps the global population fitness constant while


maintaining appropriate selective pressure, as defined by the function
used.



Fitness Assignment in MOGA

[Figure: linear interpolation of fitness over the sorted population; individuals with rank i and rank i+1 shown along the front in (f1, f2) space.]

Example : Linearization : f̄i = ∑_{j=1}^{k} f_j^i / f̄_j^i

where f_j^i denotes the j-th objective function of a solution in the i-th rank
and f̄_j^i denotes the average value of the j-th objective over all the
solutions in the i-th rank.
Illustration of MOGA

[Figure: illustration of MOGA's blocked fitness assignment over the ranked population.]


Remarks on MOGA

The fitness assignment (Step 3) in MOGA attempts to keep the global
population fitness constant while maintaining appropriate
selection pressure.

MOGA follows a blocked fitness assignment, which is likely to
produce a large selection pressure that might lead to premature
convergence.

MOGA is found to produce better (near-optimal) results in the majority of
MOOPs.



Niched Pareto Genetic Algorithm (NPGA)

J. Horn and N. Nafpliotis, 1993. Reference : "Multiobjective
Optimization using the Niched Pareto Genetic Algorithm" by J. Horn and
N. Nafpliotis, Technical Report, University of Illinois at
Urbana-Champaign, Urbana, Illinois, USA, 1993.



Niched Pareto Genetic Algorithm (NPGA)
NPGA is based on a tournament selection scheme
(built on the Pareto dominance principle).
In this technique, two individuals are first randomly selected for a
tournament.
To find the winner, a comparison set containing a
number of other individuals of the population is randomly selected.
Then the dominance of both candidates with respect to the
comparison set is tested.
If exactly one candidate is non-dominated with respect to the
comparison set, that candidate is selected as the winner.
Otherwise, niched sharing is followed to decide the winner.
The above can be specified as follows.
Niched Pareto Genetic Algorithm (NPGA)

Pareto-domination tournament
Let N be the size of the population and k the number of objective functions.
Steps :

1 i = 1 (the first iteration)
2 Randomly select any two candidates C1 and C2
3 Randomly select a "Comparison Set" (CS) of individuals from the
current population.
Let its size be N ∗ (where N ∗ = P% of N; P is decided by the
programmer)
4 Check the dominance of C1 and C2 against each individual in CS



Niched Pareto Genetic Algorithm (NPGA)

4 If C1 is dominated by CS but C2 is not, then select C2 as the
winner.
Else if C2 is dominated by CS but C1 is not, then select C1 as the
winner.
Otherwise (when both or neither of C1 and C2 are dominated by CS),
do sharing (C1 , C2 ) and choose the winner.
5 If i = N ′ then exit (selection is done),
else set i = i + 1 and go to step 2



Niched Pareto Genetic Algorithm (NPGA)

Sharing is used when neither candidate is preferred over the
other.

This maintains genetic diversity and allows a reasonable
representation of the Pareto-optimal front to develop.

The basic idea behind sharing is that the more individuals are
located in the neighborhood of a certain individual, the more its
fitness value is degraded.

The sharing procedure for any candidate is as follows.



Niched Pareto Genetic Algorithm (NPGA)

Procedure do sharing(C1 , C2 )

1 j = 1. Let x = C1
2 Compute a normalized (Euclidean) distance measure to the
individual xj in the current population as follows:

d_xj = sqrt( ∑_{i=1}^{k} ( (f_i^x − f_i^j) / (f_i^U − f_i^L) )² )

where f_i^j denotes the i-th objective function of the j-th individual,
and f_i^U and f_i^L denote the upper and lower values of the i-th objective
function.



Niched Pareto Genetic Algorithm (NPGA)

3 Let σshare = niche radius.
Compute the following sharing value:

sh(d_xj) = 1 − (d_xj / σshare)² , if d_xj < σshare
         = 0 , otherwise

4 Set j = j + 1; if j < N, go to step 2, else calculate the "niche count"
for the candidate as follows:

n1 = ∑_{j=1}^{N} sh(d_xj)

5 Repeat steps 1-4 for C2 .
Let the niche count for C2 be n2 .
6 If n1 < n2 then choose C1 as the winner (it lies in the less crowded
region), else choose C2 .

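The sharing function and niche count of the procedure above can be sketched as follows. The distance metric is supplied by the caller (e.g. the normalized Euclidean distance of step 2); the function names are ours:

```python
def sharing(d, sigma_share):
    """NPGA sharing value sh(d): 1 - (d/sigma)^2 inside the niche
    radius, 0 outside."""
    return 1.0 - (d / sigma_share) ** 2 if d < sigma_share else 0.0

def niche_count(candidate, population, sigma_share, dist):
    """Niche count: sum of sharing values between the candidate and
    every population member; dist is a caller-supplied distance."""
    return sum(sharing(dist(candidate, x), sigma_share)
               for x in population)
```

The tournament tie is then broken in favor of the candidate with the smaller niche count, i.e. the one in the less crowded region.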


Niched Pareto Genetic Algorithm (NPGA)

[Figure: from the initial population of size N, candidates C1 and C2 and a comparison set of size N* are drawn through a random population index.]


Niched Pareto Genetic Algorithm (NPGA)

This approach was proposed by Horn and Nafpliotis [1993]. It
is based on a tournament scheme and Pareto dominance. In this
approach, a comparison is made against a set of individuals
(typically 10% of the population) to determine dominance. When both competitors
are dominated or non-dominated (that is, there is a tie), the result of the
tournament is decided through fitness sharing (also called equivalence
class sharing).
The pseudocode for the Pareto domination tournament, assuming that all
of the objectives are to be maximized, is presented below. Let us
consider the following.



Pareto domination Tournament

S = an array of N individuals in the current population.

random_pop_index = an array holding the N indices of S, in a random
order.
t_dom = the size of the comparison set.



Algorithm Selection

[Figure: candidate_1, candidate_2 and the comparison individuals are drawn from the population list through random_pop_index and comparison_set_index.]


Algorithm Selection
This algorithm returns an individual from the current population S.
Begin
shuffle(random_pop_index);
candidate_1 = random_pop_index[1];
candidate_2 = random_pop_index[2];
candidate_1_dominated = FALSE;
candidate_2_dominated = FALSE;
for comparison_set_index = 3 to t_dom + 3 do
comparison_individual = random_pop_index[comparison_set_index];
if S[comparison_individual] dominates S[candidate_1] then
candidate_1_dominated = TRUE;
end if
if S[comparison_individual] dominates S[candidate_2] then
candidate_2_dominated = TRUE;
end if
end for
Algorithm Selection

if candidate_1_dominated and not candidate_2_dominated then
return candidate_2
else if candidate_2_dominated and not candidate_1_dominated then
return candidate_1
else
if niche_count(candidate_1) > niche_count(candidate_2) then
return candidate_2
else
return candidate_1
end if
end if
End



Algorithm Selection

This approach does not apply Pareto selection to the entire population,
but only to a segment of it at each run. The technique is therefore very
fast and produces good non-dominated fronts that can be kept for a large
number of generations.
However, besides requiring a sharing factor, this approach also
requires a good choice of the value of t_dom to perform well,
complicating its appropriate use in practice.



Points to Ponder on Multi-objective Evolutionary
Algorithms

How would you solve a two-objective optimization problem?

Strategy 1 : Solve the objectives individually
Strategy 2 : Solve one as the main objective and the other as a constraint
Strategy 3 : C = C1 + C2 , X = X1 ∪ X2
Justify the three strategies.
What are the issues when one objective is a minimization and the
other a maximization?
Explain the weighted-sum approach.
What are its issues?
How does a Pareto-based approach address these issues?



Non-dominated Sorting Genetic Algorithm (NSGA)

N. Srinivas and K. Deb, 1994

Reference : "Multiobjective Optimization using Nondominated Sorting
in Genetic Algorithms" by N. Srinivas and K. Deb, Evolutionary
Computation, Vol. 2, No. 3, Pages 221-248, 1994.

The algorithm is based on the concepts of

1 a non-dominated sorting procedure (a ranking selection method to
select good non-dominated points), and
2 a niche method (used to maintain stable sub-populations of good
points).



Non-dominated Sorting Procedure

Notations :

xi - denotes the i-th solution

Si - denotes the set of solutions dominated by the solution xi
P - denotes a set of solutions (input solutions)
Pk - denotes the non-domination front at the k-th level (output of
sorting based on the non-domination rank of solutions)
ni - denotes the domination count, i.e. the number of solutions
which dominate xi



Non-dominated Sorting Procedure

[Figure: solutions in (f1, f2) space, both maximized, classified into Front 1, Front 2, Front 3 and Front 4; xi lies on one of the fronts.]


Non-dominated Sorting Procedure

Steps :

1 [INITIALIZATION] For each xi ∈ P, set ni = 0 and Si = φ
2 For each xi ∈ P do

1 For all xj 6= xi and xj ∈ P:
If xi ≤ xj (i.e. xi dominates xj ),
then Si = Si ∪ {xj }
else if xj ≤ xi , then ni = ni + 1
2 If ni = 0, keep xi in P1 (the first front). Set a front counter k = 1.
< −− This completes the finding of the first front −− >



Non-dominated Sorting Procedure

< −− To find other fronts −− >

3 While Pk = φ do

1 [Initialize] Q = φ (for sorting next non-dominated solutions)


2 For each xi ∈ Pk and For each xj ∈ Si do

Update nj = nj − 1
If nj = 0 then xj in Q Else Q = Q ∪ xj

3 Set k = k + 1 and Pk = Q (Go for the next front)



Non-dominated Sorting Procedure

Note : The time complexity of this procedure is O(MN²), where M is the
number of objectives and N is the population size.

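The sorting procedure above can be sketched as follows, assuming maximization of all objectives as in the illustration that follows:

```python
def fast_nondominated_sort(points):
    """Sort points into non-domination fronts P1, P2, ... (O(MN^2)).
    Returns a list of fronts, each a list of indices into points."""
    def dominates(a, b):
        # maximization: no worse everywhere, strictly better somewhere
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    N = len(points)
    S = [[] for _ in range(N)]   # S[i]: indices of solutions dominated by i
    n = [0] * N                  # n[i]: domination count of i
    fronts = [[]]
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                S[i].append(j)
            elif dominates(points[j], points[i]):
                n[i] += 1
        if n[i] == 0:
            fronts[0].append(i)  # first front: never dominated
    k = 0
    while fronts[k]:             # peel off successive fronts
        nxt = []
        for i in fronts[k]:
            for j in S[i]:
                n[j] -= 1
                if n[j] == 0:
                    nxt.append(j)
        k += 1
        fronts.append(nxt)
    return fronts[:-1]           # drop the trailing empty front
```

Each pass removes the current front and promotes every solution whose domination count drops to zero into the next one.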


An Illustration

[Figure: five solutions plotted in (f1, f2) space, both objectives maximized; f1 ranges over roughly 2-18 and f2 over 1-5.]

Here, solution 1 dominates solution 2, solution 5 dominates solution 1, etc.


Non-dominated Sorting Procedure

Three non-dominated fronts result, as shown.

The solutions in the front [3, 5] are the best, followed by solutions
[1, 4].

Finally, solution [2] belongs to the worst non-dominated front.

Thus, the ordering of solutions in terms of their non-domination
level is : [3, 5], [1, 4], [2]



NSGA Technique

NSGA is based on a front-by-front classification of the individuals.

It finds a front (of non-dominated individuals) at a particular rank and
then assigns a fitness value (called the dummy fitness value) to each
solution in the front, followed by sharing of the fitness values.
It then selects some individuals for the mating pool based on the
individuals' shared fitness values.
Offspring are reproduced for the next generation, completing one
iteration.



Flowchart of NSGA

[Figure: flowchart. Start: encode the MOOP and create the initial population; set front k = 1. While the population is not fully classified: identify the non-dominated individuals, assign them a dummy fitness value, apply fitness sharing, and set k = k + 1. Once classified: select for the mating pool, perform reproduction (crossover, mutation), evaluate each individual, and repeat until converged.]
Assign dummy fitness value

Once all non-dominated individuals of a front are identified, a
common fitness value, called the dummy fitness value, is assigned to
each individual of that front.
The dummy fitness value is a very large number, typically
proportional to the population size (which includes the
number of individuals in all fronts, including the current front).
The proportionality constant (≥ 1) is decided by the programmer.



Assign dummy fitness value

Note :

The same fitness value is assigned to give an equal reproductive
potential to all the individuals belonging to the front.
A higher value is assigned to individuals in an upper-layer front to
ensure selection pressure, that is, a better chance of being
selected for mating.
As we go from an upper front to the next lower front, the individuals
already processed are ignored (i.e. logically removed from the
population set), and thus the effective population size successively
decreases as we move from one front to the next.



Sharing the fitness value

In a given front, all individuals are assigned a dummy
fitness value which is proportional to the population size.
The dummy fitness values of these classified individuals are then
shared.
Sharing is achieved by dividing the dummy fitness value by "a
quantity proportional to the number of individuals around it". This
quantity is called the niche count.
It is calculated by computing a "sharing coefficient",
denoted sh(dij ), between the individuals xi , xj belonging to the front
under process, as follows:

sh(dij ) = 1 − (dij / Tshare )² , if dij < Tshare
         = 0 , otherwise

In the above, the parameter dij is the "phenotype distance"
between the two individuals xi and xj in the current front.



Sharing the fitness value

Tshare is the maximum phenotype distance allowed between any
two individuals to become members of a niche. This value should
be decided by the programmer.
The niche count of xi can then be calculated as

γ(xi ) = ∑_{xj ∈ Pk} sh(dij )

where Pk denotes the set of non-dominated individuals in the
current front k.
The shared fitness value of xi is then f̄i = f / γ(xi ), where f is the
dummy fitness value.



NSGA Technique

[Figure: the current front in (f1, f2) space; individuals in crowded regions of the front have a higher niche count.]


Sharing the fitness value

Note :

Sharing is followed to ensure population diversity, that is, to
give a fair chance to individuals lying in less crowded niches.
The niche count γ(xi ) is always greater than or equal to 1, since
the self-sharing term sh(dii ) = 1 (as dii = 0).
After the fitness-sharing operation, all individuals in the current
front are moved to a population set to be considered for mating,
subject to their selection.
The assignment of dummy fitness values followed by fitness
sharing is then repeated for the next front (with a smaller
population size).



NSGA Selection
Once all fronts are processed, we obtain a set of population
members with their individual shared fitness values.
This population is then passed through a selection procedure.
NSGA follows "stochastic remainder proportionate selection",
which is as follows.
1 Calculate the cumulative probabilities of all individuals as in the
Roulette-Wheel scheme.
2 Generate a random number ri (for the first time only); this is the
probability used to pick an individual.
3 If Pj−1 ≤ ri < Pj (where Pj and Pj−1 are two cumulative
probabilities), then consider the j-th individual for selection.
4 Let Ej = pj × N (N being the population size, pj the j-th individual's
selection probability, and Ej its expected count).
5 If the integer part of Ej is nonzero, then select the j-th solution for
mating.
6 The fractional part is used as the random number for the selection
of the next solution.
7 Repeat steps 3-6 until the mating pool of the desired size is reached.
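The slide's wording is terse, so here is the standard form of stochastic remainder proportionate selection as a sketch (one common reading of the procedure above): each individual i with expected count e_i = f_i / f_avg receives int(e_i) guaranteed copies, plus one more with probability equal to the fractional part.

```python
import random

def stochastic_remainder_selection(fitness, rng=random.random):
    """Standard stochastic remainder proportionate selection (a sketch).
    Returns a mating pool of indices into the fitness list."""
    avg = sum(fitness) / len(fitness)
    pool = []
    for i, f in enumerate(fitness):
        e = f / avg                        # expected count of individual i
        pool.extend([i] * int(e))          # deterministic (integer) part
        if rng() < e - int(e):             # stochastic (fractional) part
            pool.append(i)
    return pool
```

The injectable `rng` argument (our addition) just makes the stochastic part testable; in use it defaults to `random.random`.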
Reproduction and generation of new population

Follow the standard reproduction procedure as in Simple GA.



Remark on NSGA

Several comparative studies reveal that NSGA is outperformed by
both MOGA and NPGA.
NSGA is also a highly inefficient algorithm because of the way in
which it classifies individuals (it has O(MN³) time complexity).
Deb et al. proposed an improved version of the NSGA algorithm
called NSGA-II.
It needs a sharing parameter Tshare to be specified.
It is a non-elitist approach.



Elitist Multi-objective Genetic Algorithm : NSGA-II

K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, 2002.

Reference :
"A Fast and Elitist Multiobjective Genetic Algorithm : NSGA-II" by
K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, in IEEE Transactions on
Evolutionary Computation, Vol. 6, No. 2, Pages 182-197, April 2002.



Elitist Multi-objective Genetic Algorithm : NSGA-II

The following approach to finding the non-dominated front is used in
NSGA-II.
Notation

P = the set of input solutions

xi and xj denote any two solutions
P ′ = the set of solutions on the non-dominated front.



Elitist Multi-objective Genetic Algorithm : NSGA-II

Approach

In this approach, every solution from the population P is checked against a
partially filled population P ′ (which is initially empty). To start with, the
first solution from P is moved into the empty P ′ . Thereafter, each
solution xi ∈ P is compared, one by one, with all members of the set P ′ .
If the solution xi dominates any member of P ′ , then that member is
removed from P ′ . Otherwise, if solution xi is dominated by any
member of P ′ , the solution xi is ignored. On the other hand, if solution
xi is not dominated by any member of P ′ , it is moved into P ′ . This is
how the set P ′ grows with non-dominated solutions. When all solutions
of P have been checked, the remaining members of P ′ constitute the
solutions on the non-dominated front.


Elitist Multi-objective Genetic Algorithm : NSGA-II
This front is removed and the same steps are repeated with the
remaining solutions to find the next non-domination front, until P is
empty. This approach is precisely stated in the following.
Steps :
P = {x1 , x2 , · · · , xN }

1 Initialize P ′ = {x1 }; set the solution counter i = 2.
2 Set j = 1.
3 Compare solution xi with xj from P ′ for domination.
4 If xi ≤ xj , delete xj from P ′ .
5 If j < |P ′ |, then increment j by one and go to step 3.
Else go to step 7.



Elitist Multi-objective Genetic Algorithm : NSGA-II

6 If xj ≤ xi , then increment i by one and go to step 2.
7 Insert xi in P ′ , i.e. P ′ = P ′ ∪ {xi }.
8 If i < N, increment i by one and go to step 2.
9 Output P ′ as the non-dominated front of the current population.
10 If P ≠ φ, repeat steps 1-9.
11 Stop.

Note : The time complexity of this procedure is O(MN²) per front, and in
the worst case O(MN³) (when each front contains only one solution).


Overview of NSGA-II

Let P be the current population.

Create the offspring population Q from P using the genetic operators
(mating pool, crossover and mutation).
The two populations are combined to form a large population
R of size 2N.
Apply the non-dominated sorting procedure to classify the entire
population R.
After the non-domination sorting, the new population P ′ is obtained
by filling in the non-dominated fronts one by one as long as |P ′ | < N.
The filling starts with the best non-dominated front, followed by the
second non-dominated front, followed by the third, and so on.



Overview of NSGA-II

Since the total population size is 2N, not all fronts may be
accommodated in the N slots available in P ′ . All fronts which could
not be accommodated are simply rejected. When the last allowed
front is being considered, there may exist more solutions in the last
front than the remaining slots in the new population.
Instead of arbitrarily discarding some members from the last
acceptable front, the solutions which make the diversity of the
selected solutions the highest are chosen.
This is accomplished by calculating the crowding distance of the
solutions in the last acceptable front.
This way a new generation is obtained, and the steps are
repeated until a termination condition is satisfied.



NSGA-II Technique

[Figure: parent population P and offspring Q are combined and non-dominated sorted into fronts F1, F2, ...; P' is filled front by front, the last acceptable front Fi is truncated by crowding distance, and the remaining fronts are rejected.]


Basic Steps : NSGA-II
Steps :
1 Combine the parent (P) and offspring (Q) populations to obtain
R = P ∪ Q.
2 Perform a non-dominated sorting of R and identify the different
fronts of non-dominated solutions Fi , i = 1, 2, · · · , etc.
3 Set the new population P ′ = φ. Set a front counter i = 1 (to indicate
the front allowed to fill P ′ ).
4 While |P ′ | + |Fi | ≤ N, fill P ′ : that is, P ′ = P ′ ∪ Fi and i = i + 1.
5 Perform a crowding sort on Fi and include the most widely spread
N − |P ′ | solutions, using the crowding distance values in the
sorted Fi , in P ′ .
6 Create the offspring population Q ′ from P ′ by using the
crowded-tournament selection, crossover and mutation operators.
7 P = P ′ , Q = Q ′ .
8 Repeat steps 1-7 until the solutions converge.
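Steps 3-5 above (the survivor selection) can be sketched as follows, assuming the fronts and the crowding distances have already been computed; the names are illustrative:

```python
def nsga2_survivor_selection(fronts, crowding, N):
    """Fill the next population from sorted fronts F1, F2, ...

    fronts: list of fronts, each a list of individual ids, best first;
    crowding: dict mapping id -> crowding distance; N: population size."""
    new_pop = []
    for front in fronts:
        if len(new_pop) + len(front) <= N:
            new_pop.extend(front)              # the whole front fits
        else:
            # last acceptable front: keep the most widely spread members
            best = sorted(front, key=lambda i: crowding[i], reverse=True)
            new_pop.extend(best[:N - len(new_pop)])
            break
    return new_pop
```

Whole fronts are admitted as long as they fit; the first front that does not fit is truncated by descending crowding distance, and all later fronts are rejected.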
Crowding sort procedure

The crowding sort is built on the following two concepts:

the crowding distance metric (d), and

the crowded comparison operator ≺c .

Note :

The crowding distance di of a solution xi is a measure of the
search space around xi which is not occupied by any other
solution in the population. (It acts like an inverse population
density: di is lower in a dense region.)

The crowded comparison operator ≺c (a binary operator)
compares two solutions and returns a winner.


Crowding sort procedure

Definition 8 : Crowding comparison operator


A solution xi wins over a solution xj iff any of the following conditions
is true:
If solution xi has a better rank (i.e. dominance rank), that is,
rank(xi ) < rank(xj )

If they have the same rank but solution xi has a larger crowding
distance than solution xj ,

that is, rank(xi ) = rank(xj ) and di > dj (where di and dj are the


crowding distances of xi and xj , respectively).
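The operator can be written down directly. A small Python sketch (the dominance ranks and crowding distances are assumed to be precomputed):

```python
def crowded_less_than(rank_i, d_i, rank_j, d_j):
    """True if solution i wins over solution j: a better (lower) dominance
    rank wins outright; equal ranks are broken by the larger crowding distance."""
    if rank_i != rank_j:
        return rank_i < rank_j
    return d_i > d_j
```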



Crowding sort procedure

Note :

The first condition ensures that xi lies on a better non-domination


front.

The second condition resolves the tie when both solutions lie on the
same non-dominated front by deciding on their crowding distances.

(Note : In the crowding tournament of NSGA-II, only the second
condition applies, as all competing solutions belong to the same front.)



Crowding sort procedure

Definition 7 : Crowding distance measure


The crowding distance measure di is an estimate of the size of the
largest cuboid enclosing the point xi (the i-th solution) without including
any other point in the population.



Crowding sort procedure

[Figure : (a) solutions in 2D objective space (f1 , f2 ) with the neighbours xi−1 and xi+1 of xi ; (b) solutions in 3D objective space (f1 , f2 , f3 )]



Crowding sort procedure

In Fig.(a), the crowding distance is the average side length of the


rectangle formed by the two nearest neighbours xi−1 and xi+1 of xi
at two opposite corners.

In Fig.(b), the crowding distance is the sum of the three adjacent


edges of the cuboid formed by the two neighbours xi−1 and xi+1 of xi
at two opposite corners of the cuboid.



Crowding distance calculation

Given a set of non-dominated solutions F and objective functions


f = (f1 , f2 , · · · , fM ), the procedure to calculate the crowding distance of
each solution xi ∈ F is stated below. Steps :

1 Let l = |F | (the number of solutions in F )


2 For each xi ∈ F , set di = 0 (initialize the crowding distance)
3 For each objective fk ∈ f
F k = sort(F , k)
Sort all solutions in F with respect to the objective values fk
The sorting is performed in ascending order of the objective values fk
This results in M sorted vectors F 1 , F 2 , · · · , F M

The sorted vectors are shown in the figure.



Crowding distance calculation

[Figure : the sorted vectors F 1 , F 2 , · · · , F M ; the j-th entry of F k is the solution with the j-th smallest value of objective fk , j = 1, · · · , l]


Crowding distance calculation
1 For each solution j = 2 to l − 1, compute

d_j = Σ_{k=1}^{M} (f_k^{j+1} − f_k^{j−1}) / (f_k^{MAX} − f_k^{MIN})

2 Set d_1 = d_l = ∞ (the boundary solutions are the least crowded and
hence get the largest crowding distance)

[Figure : crowding distance of a solution illustrated in the 2-objective (f1 , f2 ) space]


Crowding distance calculation

Note :

1 All objective functions are assumed to be minimized


2 The time complexity of the above operation is O(MN log N), which is
the complexity of M sorting operations over N elements
3 The parameters fkMAX and fkMIN , k = 1, 2, · · · , M can be set to the
population-maximum and population-minimum values of the k-th
objective function, and are used here to normalize the objective
values.
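The whole calculation can be sketched in Python. This is an illustration only: the front is given as a list of objective-value tuples, all objectives are minimized, and fkMAX/fkMIN are taken over the front itself:

```python
import math

def crowding_distance(front):
    # front: list of M-tuples of objective values for one non-dominated front
    l = len(front)
    d = [0.0] * l
    M = len(front[0])
    for k in range(M):                          # one sorting pass per objective
        order = sorted(range(l), key=lambda i: front[i][k])
        d[order[0]] = d[order[-1]] = math.inf   # boundary solutions get infinite distance
        f_min = front[order[0]][k]
        f_max = front[order[-1]][k]
        if f_max == f_min:
            continue                            # degenerate objective, nothing to add
        for pos in range(1, l - 1):
            i = order[pos]
            d[i] += (front[order[pos + 1]][k] - front[order[pos - 1]][k]) / (f_max - f_min)
    return d
```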



Crowding Tournament

Once the crowding distance of all solutions in the last acceptable front


F is calculated, we select N − |P ′ | solutions from F to
complete the size |P ′ | = N.
To do this, we follow the tournament selection operation
according to <c (the crowding comparison operator). That is, we
select xi over xj (xi <c xj ) if di > dj .
Note that we prefer those solutions which are not in a crowded
region of the search space. This ensures better population diversity.



Crowding Tournament

Remarks :

NSGA-II is an elitist approach

It uses a faster non-dominated sorting procedure, with time


complexity O(MN 2 ) compared to O(MN 3 ) in NSGA.

It does not require the explicit sharing concept; instead it uses a crowding


tournament selection with time complexity O(MN log N)

Thus, the overall time complexity of NSGA-II is O(MN 2 )



Applications
Of
Genetic Algorithm

Supervised by: Made by:


Dr. A. Chatterjee GROUP- 2
Devesh Garg Y8191
Hemendra Goyal Y8215
Utsav Kumar Y8546
Vijesh Bhute Y8561
Content:
 Introduction

 Algorithm

 Selection

 Reproduction

 Crossover

 Mutation

 Main Problem Statement

 Applications

 Strengths of GA

 Other Optimizing Techniques

 Important Problems of GA

 Why GA works?

 Conclusion

 References
INTRODUCTION

The Genetic Algorithm was developed by John Holland, his colleagues and his students at the
University of Michigan in 1975. Their book "Adaptation in Natural and Artificial Systems" was
the first to discuss GAs.

Genetic Algorithms are search algorithms based on the mechanics of natural selection and natural
genetics. They combine survival of the fittest among string structures with a structured yet
randomized information exchange, to form a search algorithm with some of the innovative flair of a
human touch. In every new generation a new set of strings is created using bits and pieces of the
fittest of the old. While randomized, genetic algorithms are no simple random walk: they efficiently
exploit historical information to speculate on new search points with expected improved performance.

The central theme of research on genetic algorithms has been robustness, the balance between
efficiency and efficacy necessary for survival in many different environments. Genetic Algorithms are
theoretically and empirically proven to provide robust search in complex spaces. These algorithms
are computationally simple and yet powerful in their search for improvement.

The key features of GAs are:

1. GAs work with a coding of the parameter set and not the parameters themselves.

2. GAs search from a population of points and not from a single point.

3. GAs use objective function information and not derivatives or other auxiliary knowledge.

4. GAs use probabilistic transition rules and not deterministic rules.


Algorithm

There are four major steps in which a Genetic Algorithm works:

 Selection
 Reproduction
 Crossover
 Mutation

Stopping Criterion: We need to provide a stopping criterion, such as a minimum tolerance on the
function value, a limit on the total time the GA runs, or a maximum number of generations.

In our main problem statement, we have provided both a minimum tolerance (10^-6) and a
maximum of 400 generations.

For the purpose of explaining the different parts of GA we have considered a problem statement and
written code to explain different parts of the GA.

We have considered a simple problem in order to explain the functioning of GA in optimization, namely

x(1)^2 - x(2)^2 - 5.13

and the fitness function considered is

fitness function = 1/abs(x(1)^2 - x(2)^2 - 5.13)

INITIAL POPULATION
Initially, many individual solutions are randomly generated to form an initial population. The
population size depends on the nature of the problem, but typically contains several hundred or
several thousand possible solutions. Traditionally, the population is generated randomly, covering the
entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in
areas where optimal solutions are likely to be found.

We generate a population of 10 strings and take the transpose, so that each column represents one
of the strings that we have randomly chosen.

function y=generation()
x1=[];
for i=1:10
x1=[x1;generate()];%10cross8: each row is one string
end
y=transpose(x1);%8cross10: each column is one string
end

Where generate function is…

function y=generate()
for j=1:8
if(rand>0.5)
x1(j)=1;
else
x1(j)=0;
end
end
y=x1;
end

It gives us a string of eight bits, each assigned either 0 or 1.

initial_generation =

1 1 0 1 1 0 1 0
0 1 1 1 1 0 1 1
1 1 1 0 1 0 1 0
0 0 0 1 1 0 1 0
0 0 1 1 0 0 0 1
1 1 0 1 1 0 0 0
1 0 1 0 1 0 1 1
1 1 1 0 0 0 1 0
1 0 1 0 0 0 1 0
0 1 1 1 1 0 1 1

SELECTION

During each successive generation, a proportion of the existing population is selected to breed a
new generation. Individual solutions are selected through a fitness-based process, where
fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain
selection methods rate the fitness of each solution and preferentially select the best solutions. Other
methods rate only a random sample of the population, as this process may be very time-consuming.

Most functions are STOCHASTIC and designed so that a small proportion of less fit solutions are
selected. This helps keep the diversity of the population large, preventing premature convergence
on poor solutions. Some of the SELECTION methods are as follows:

Roulette Wheel Selection

It works on the concept of the game Roulette. Under this game each individual gets a slice of the
wheel, but more fit ones get larger slices than less fit ones. The wheel is then spun, and whichever
individual "owns" the section on which it lands each time is chosen. It is a form of fitness-proportionate
selection in which the chance of an individual being selected is proportional to its fitness relative to
the fitness of its competitors.
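A compact Python sketch of one spin of the wheel (an illustration alongside the MATLAB version used later in this report):

```python
import random

def roulette_select(population, fitness):
    """Pick one individual with probability proportional to its fitness slice."""
    total = sum(fitness)
    r = random.uniform(0, total)        # where the ball lands on the wheel
    running = 0.0
    for individual, f in zip(population, fitness):
        running += f
        if r <= running:
            return individual
    return population[-1]               # guard against floating-point round-off
```

With fitnesses [1, 1, 8] the third individual owns 80% of the wheel, so repeated spins pick it roughly four times out of five.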

Scaling Selection

As the average fitness of the population increases, the strength of the selective pressure also
increases and the fitness function becomes more discriminating. This method can be helpful in
making the best selection later on when all individuals have relatively high fitness and only small
differences in fitness distinguish one from another.

Tournament Selection

Subgroups of individuals are chosen from the larger population, and members of each subgroup
compete against each other. Only one individual from each subgroup is chosen to reproduce.

Rank Selection

Each individual in the population is assigned a numerical rank based on fitness, and selection is
based on these ranking rather than absolute differences in fitness. The advantage of this method is
that it can prevent very fit individuals from gaining dominance early at the expense of less fit ones,
which would reduce the population's genetic diversity and might hinder attempts to find an
acceptable solution.

Hierarchical Selection

Individuals go through multiple rounds of selection each generation. Lower-level evaluations are
faster and less discriminating, while those that survive to higher levels are evaluated more
rigorously. The advantage of this method is that it reduces overall computation time by using faster,
less selective evaluation to weed out the majority of individuals that show little or no promise, and
only subjecting those who survive this initial test to more rigorous and more computationally
expensive fitness evaluation.

REPRODUCTION
Reproduction is a process in which individual strings are copied according to their objective function
values f. We call this function the fitness function.

Put another way, we can think of fitness as some measure of profit, utility or goodness that we want to
maximize. The fitness value of a string signifies that

strings with a higher fitness value have a higher probability of contributing one or more offspring to the
next generation. This operator is an artificial version of natural selection. The reproduction operator
may be implemented in algorithmic form in a number of ways. We have used the concept of the
Roulette wheel here, where each current string in the population has a roulette wheel slot sized in
proportion to its fitness.

Each time we require another offspring, a simple spin of the weighted roulette wheel yields the
reproductive candidate. In this way, more highly fit strings have a higher number of offspring in the
succeeding generation. Once a string has been selected for replication, it is entered into a
mating pool, a tentative new population for further operator action.

Program for selection …

Here x1 is an 8 cross 10 array, where each column is one of the strings that we randomly generated.
convert is a function that converts binary to decimal by breaking the 8-bit string into two 4-bit strings
and calculating their decimal values, so the solution space for each variable ranges from 0 to 15.
%binary is 2cross1
binary1=convert(x1(1:8));
binary2=convert(x1(9:16));
binary3=convert(x1(17:24));
binary4=convert(x1(25:32));
binary5=convert(x1(33:40));
binary6=convert(x1(41:48));
binary7=convert(x1(49:56));
binary8=convert(x1(57:64));
binary9=convert(x1(65:72));
binary10=convert(x1(73:80));
%main gives scalar

Fitness is calculated by passing the values to the main function, where they are inserted in the actual
equation; the error is calculated and the fitness value is returned.

fitness1=[main(binary1); main(binary2);main(binary3); main(binary4); main(binary5); main(binary6);


main(binary7); main(binary8); main(binary9); main(binary10)];

Roulette selection is used to select the strings which will go for reproduction. The probability of
each of the 10 strings undergoing reproduction is calculated on the basis of its fitness value. Using a
random value passed down through a chain of conditions, we find the string that will undergo
reproduction. The strings selected for reproduction are stored in the matrix named next.

sum=0;
for i=1:10
sum=sum+fitness1(i);
end
sum
probability=[];%10cross1
probability(1)=fitness1(1)/sum;
for i=2:10
probability(i)=probability(i-1)+fitness1(i)/sum;
end
next=[];%8cross10 after selection
for i=1:10
r=rand;%one spin of the wheel, reused in every comparison below
if(r<probability(1))
next=[next;x1(1:8)];
elseif(r<probability(2))
next=[next;x1(9:16)];
elseif(r<probability(3))
next=[next;x1(17:24)];
elseif(r<probability(4))
next=[next;x1(25:32)];
elseif(r<probability(5))
next=[next;x1(33:40)];
elseif(r<probability(6))
next=[next;x1(41:48)];
elseif(r<probability(7))
next=[next;x1(49:56)];
elseif(r<probability(8))
next=[next;x1(57:64)];
elseif(r<probability(9))
next=[next;x1(65:72)];
elseif(r<probability(10))
next=[next;x1(73:80)];
end
end
initial_generation=transpose(x1);%10 cross 8 // Row represents string
after_selection=transpose(next);%8 cross 10 // Column represents string

after_select=next; % the strings selected for reproduction are stored

after_select =

1 1 1 0 1 0 1 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 0 1 1 0 0 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 0 1 1 0 0 0
1 1 1 0 1 0 1 0

Pie chart representing the probability distribution in the Roulette Selection.


CROSSOVER
After reproduction, simple crossover proceeds in two steps. First, members of the newly
reproduced strings in the mating pool are mated at random. Second, each pair of strings undergoes
crossing over as follows: an integer position k along the string is selected uniformly at random
between 1 and the string length less one, i.e. [1, l − 1]. Two new strings are created by swapping all
characters between positions k + 1 and l inclusive.
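The two-step swap can be sketched in Python (an illustration of the same single-point scheme as the MATLAB code below, on bit strings of length l):

```python
import random

def single_point_crossover(parent1, parent2):
    """Pick k uniformly in [1, l-1] and swap all characters after position k."""
    l = len(parent1)
    k = random.randint(1, l - 1)
    child1 = parent1[:k] + parent2[k:]
    child2 = parent2[:k] + parent1[k:]
    return child1, child2
```

Because k never falls at either end of the string, each child always keeps at least one character from each parent.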

Here we crossover the selected strings using the above procedure; the rest of the strings are directly
copied into the matrix.

%let crossover probability be 80%


cross=[];%8crossn stores the strings selected for the crossover
count=0;%counts the length of crossover strings
set=[];%takes care of the indexes of the strings not chosen for crossover
for i=1:10
if(rand<=0.8)
cross=[cross;after_selection((8*i-7):8*i)];
count=count+1;
else
set=[set;i];
end
end
rest=[];
for i=1:length(set)
rest=[rest;after_selection((8*set(i)-7):8*set(i))];
end
tran_cross=transpose(cross);%8crossn: each column is a string selected for crossover
%CROSSOVER
comb1=[];
c1=greatest_integer(count/2);
if(c1<count/2)
comb1=[tran_cross((8*count-7):8*count)];
end
for i=1:2:(count-1)
r=rand*7;
great_int=greatest_integer(r)+1;%crossover point k in [1, l-1], as described above
temp1=tran_cross((8*i-7):(8*(i-1)+great_int));
temp2=tran_cross((8*i-7+great_int):8*i);
temp3=tran_cross((8*(i+1)-7):(8*i+great_int));
temp4=tran_cross((8*(i+1)-7+great_int):8*(i+1));
crossover_d11=[temp1 temp4];
crossover_d12=[temp3 temp2];
comb1=[comb1;crossover_d11;crossover_d12];
end
after_crossover=[rest;comb1];
Where greatest_integer function is…

function g=greatest_integer(A)
g=A-mod(A,1);
end

after_crossover =

0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 1 1 0 0 0 1
0 0 1 0 1 0 1 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 0 1 1 0 0 0
1 1 1 1 0 0 0 1
1 1 1 0 1 0 1 0
1 1 0 1 1 0 0 0

MUTATION
Mutation adds new information in a random way to the genetic search process and ultimately helps
to avoid getting trapped at local optima.

Mutation is, in a way, the process of randomly disturbing genetic information. It operates at the bit
level: when the bits are being copied from the current string to the new string, there is a probability
that each bit may become mutated. This probability is usually quite small and is called the mutation
probability. A coin-toss mechanism is employed: if a random number between zero and one is less
than the mutation probability, then the bit is inverted, so that zero becomes one and one becomes
zero. This introduces a bit of diversity to the population by scattering the occasional points.
This random scattering may produce a better optimum, or even modify a part of the genetic code that
will be beneficial in later operations. On the other hand, it might produce a weak individual that will
never be selected for further operations.
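The coin-toss mechanism amounts to a one-liner in Python (a sketch of generic bit-flip mutation, not the project's MATLAB code):

```python
import random

def mutate(bits, p_m=0.01):
    """Flip each bit independently with probability p_m (the mutation probability)."""
    return [1 - b if random.random() < p_m else b for b in bits]
```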

The mutation probability is taken as 0.01, and using a random value we flip the binary digit
whenever the probability allows us to do so.

%MUTATION
changes=0;
for i=1:numel(after_crossover)%numel, not length: visit every bit of the matrix
if(rand<0.01)
after_crossover(i)=invert(after_crossover(i));
changes=changes+1;
end
end
after_mutation=after_crossover;
initial_generation
after_select
after_crossover
after_mutation
changes%gives number of changes
initial_fitness=sum;
initial_fitness
final_fitness=fitness(after_mutation);
final_fitness
end

after_mutation =

0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
1 1 1 1 0 0 0 1
0 0 1 0 1 0 1 0
0 0 1 1 0 0 0 1
0 0 1 1 0 0 0 1
0 0 0 1 1 0 0 0
1 1 1 1 0 0 0 1
1 1 1 0 1 0 1 0
1 1 0 1 1 0 0 0

changes = 0

initial_fitness = 0.4753
final_fitness = 1.4484

MAIN PROBLEM STATEMENT

The main problem statement which we are going to present is taken from the research paper "The
application of Genetic Algorithm (GA) to estimate the rate parameters for solid state reduction of
iron ore in presence of graphite". GAs have found applications in many diverse fields like
physics, astronomy, finance, chemistry, etc. Here, we have applied GA to estimate the various rate
parameters using experimental data, and compared the results with the values presented in the
literature. The reduction of hematite to iron has been considered to occur in three sequential steps,
namely:

(i) Hematite (Fe2O3) to Magnetite (Fe3O4),


3Fe2O3 + CO = 2Fe3O4 + CO2

(ii) Magnetite (Fe3O4) to Wustite (FeO),


Fe3O4 + CO = 3FeO + CO2

(iii) Wustite (FeO) to Iron (Fe),


FeO + CO = Fe + CO2

Mass balance equations were assumed to follow first order kinetics:

dH/dt = -kh*H
dM/dt = 0.97*kh*H - km*M
dW/dt = 0.93*km*M - kw*W

where H, M, W represent the concentrations of hematite, magnetite and wustite at time t,
respectively, and each rate constant follows the Arrhenius form k = k0*exp(-E/RT) (these
equations match the rate terms used in the fitness-function code below).
The unknown parameters are three rate constants, kh, km, kw, and the activation energies Eh, Em and
Ew, which we are going to find using GA and experimental data. The parameter measured in the
experiment is the degree of reduction, computed from Wh, the total weight of hematite at time t, the
loss in weight of the packed bed, and Z, the exit gas composition, i.e., the CO/CO2 ratio.

The predicted values of the degree of reduction have been obtained from the respective rates of
hematite, magnetite and wustite, as implemented in the fitness-function code below:

alpha_pred = (S0 + (0.033*dH/dt + 0.068*dM/dt + 0.222*dW/dt)*t)/936

where S0 gives the total concentration of oxygen consumed in time t, and 936 kg/m3 is the total
removable oxygen. H, M and W can be found from the experimental concentration versus time
graph. GA has been used to find the unknown parameters by minimizing the error between the
experimental and predicted values of the degree of reduction. We have used a double
vector to generate the populations. The GA parameters used by us are:

 Fitness function: 1/(error between the experimental and predicted degree of reduction)

 Generations: 400

 Mutation Probability: 0.01

 Cross over probability: 0.8

 Selection technique: Tournament selection

 Range at 1000°C for kh (s-1), km (s-1), kw (s-1), Eh (kJ/mol), Em (kJ/mol), Ew (kJ/mol), respectively:

[10^14,10^18], [10^12,10^17], [10^10,10^12], [100,600], [150,600], [175,600]

 The total time chosen is 4800 s, and the initial values of H, M and W are 3100, 0 and 0
respectively (from the graph):
[Figure: Evolution of the concentration of various iron oxide phases as well as pure iron during packed
bed reduction of iron ore - graphite composite pellets under argon atmosphere at 1000°C]

We made the following code to write the fitness function using MATLAB:
function z=solve2(y)
b1=y(1)*exp(-y(4)/10583.722);
b2=y(2)*exp(-y(5)/10583.722);
b3=y(3)*exp(-y(6)/10583.722);
h=3100*exp(-b1*1300);
m=0.97*3100*b1*(exp(-b1*4800)-exp(-b2*4800))/(b2-b1);
w=0.93*b2*0.97*3100*((exp(-b1*4800)-exp(-b3*4800))/(b3-b1)-(exp(-b2*4800)-exp(-b3*4800))/(b3-b2))/(b2-b1);
ht=-h*b1;
mt=0.97*h*b1-m*b2;
wt=0.93*m*b2-w*b3;
S0=0.033*(h-3100)+0.068*m+0.222*w;
z=1/(0.62-((S0+(0.033*ht+0.068*mt+0.222*wt)*4800)/936));
end
The code was run using the built-in MATLAB function ga. The GUI interface was used, the
conditions were inserted, and the code was run for 400 generations. The results found were:

              kh (s-1)      km (s-1)      kw (s-1)      Eh (kJ/mol)  Em (kJ/mol)  Ew (kJ/mol)
GA            3.944*10^17   9.705*10^16   3.973*10^11   188.1611     393.6013     321.0304
Literature    6.00*10^17    7.50*10^16    1.70*10^11    380          410          330


The results show quite good agreement, despite several minor changes in the problem as
mentioned below.
Sources of error in our calculation are:
• The mutation probabilities are not the same as those used in the paper.
• The conditions mentioned in the problem are slightly varied: for example, elitism is not
considered, and the range is not the same as in the paper.

Applications of Genetic Algorithm


1. Acoustics - GA is used to distinguish between sonar reflections from different types of objects.
It is also used to design active noise control systems, which cancel out undesired sound by
producing sound waves that destructively interfere with the unwanted noise.

2. Aerospace Engineering - To design the wing shape of supersonic aircraft, minimizing aerodynamic
drag at supersonic cruising speeds, drag at subsonic speeds, and aerodynamic load.

3. Financial Markets - To predict the future performance of publicly traded stocks.

4. Geophysics - To locate earthquake hypocenters based on seismological data.

5. Materials Engineering - To design electrically conductive carbon-based polymers known as


polyanilines, and to design the exposure pattern for an electron lithography beam.

6. Routing and Scheduling - To find optimal routing paths in telecommunication networks, which are
used to relay data from senders to recipients.

7. Systems Engineering - To perform the multi-objective task of designing wind turbines used to


generate electric power.

8. Data mining- GA can also be used in data mining and pattern recognition. Given a large set of
data, GA can perform a search for the optimum set of data which satisfies the conditions. Initial
population in this case may be single objective conditions and at the end of the GA, we get a
combined complex condition which when applied on the large data set can lead to finding the
required set of data.
Strengths of GAs
1. The main strength of GA is that it is intrinsically parallel, whereas most other algorithms are
serial and search the solution space of a problem in one direction at a time; if the solution
they find turns out to be suboptimal, all previous work must be abandoned and started
over. Since a GA maintains multiple offspring, it can explore the solution space in
multiple directions at once, even though some of those directions may turn out to be dead
ends.

2. GA is better at solving problems where the space of all potential solutions is truly huge.
Nonlinear problems fall into this category, where changing one component may have ripple
effects on the entire system, and where multiple changes that are individually detrimental
may lead to much greater improvements in the fitness function.

3. GA performs well on problems for which the fitness function is complex, discontinuous, noisy,
time-varying, or has many local optima. Whereas other search algorithms can become
trapped by local optima, GA works well to avoid them.

4. One of the main features of GA is its ability to manipulate many parameters simultaneously.
Many problems cannot be stated as single value to be maximized or minimized but
expressed as multiple objectives.

5. GAs know nothing about the problems they are deployed to solve. The virtue of this
technique is that it allows GA to start out with an open mind. Since decisions in GA are based
on randomness, all possible search pathways are theoretically open to a GA.

Other Optimizing Techniques


Different Data Mining Techniques are

1. Neural Networks or pattern Recognition


2. Memory based reasoning
3. Cluster Detection or Market Basket Analysis
4. Link Analysis
5. Visualization
6. Decision Tree or Rule Induction
7. Simulated Annealing
8. Hill Climbing

Neural Network and Genetic Algorithm

Neural networks (NN) fall into the category of supervised models. That is, our data will be a set of
rows, where each row contains an input and a corresponding output for that input. The NN learns by
seeing the difference between the correct output and the one it predicted, and then adjusting its
parameters. So one can't use a NN without input-output data.

Examples: Graph extrapolation. Facial recognition.

Genetic algorithms (GA) are basically optimizers. Here we will have some set of parameters that we
want to optimize for something. We will need an evaluation function that takes these parameters
and tells us how good these parameters are. So we keep changing these parameters somehow until
we get an acceptable value from our evaluation function, or until we see that things are not
improving any more.

Examples: Scheduling airplanes/shipping. Timetables . Finding the best characteristics for a simple
agent in an artificial environment. Rendering an approximation of a picture with random polygons.

If we have data that is suitable for supervising a model, then we can use a NN. If we want to
optimize some parameters, then use a GA. But most importantly, it is the nature of our data and
what we want out of it that should decide what model to use.

Simulated annealing:

Another optimization technique similar to evolutionary algorithms is known as simulated annealing.


The idea borrows its name from the industrial process of annealing in which a material is heated to
above a critical point to soften it, then gradually cooled in order to erase defects in its crystalline
structure, producing a more stable and regular lattice arrangement of atoms. In simulated annealing,
as in genetic algorithms, there is a fitness function that defines a fitness landscape; however, rather
than a population of candidates as in GAs, there is only one candidate solution. Simulated annealing
also adds the concept of "temperature", a global numerical quantity which gradually decreases over
time. At each step of the algorithm, the solution mutates (which is equivalent to moving to an
adjacent point of the fitness landscape).

The fitness of the new solution is then compared to the fitness of the previous solution; if it is
higher, the new solution is kept. Otherwise, the algorithm makes a decision whether to keep or
discard it based on temperature. If the temperature is high, as it is initially, even changes that cause
significant decreases in fitness may be kept and used as the basis for the next round of the
algorithm, but as temperature decreases, the algorithm becomes more and more inclined to only
accept fitness-increasing changes. Finally, the temperature reaches zero and the system "freezes";
whatever configuration it is in at that point becomes the solution. Simulated annealing is often used
for engineering design applications such as determining the physical layout of components on a
computer chip.
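A minimal Python sketch of the scheme just described (an illustration only; the neighbourhood move, cooling rate and iteration budget are arbitrary choices):

```python
import math
import random

def simulated_annealing(f, x0, step=0.1, t0=1.0, cooling=0.995, iters=2000):
    """Minimize f from x0: always accept downhill moves, accept uphill moves
    with probability exp(-delta/T), and cool the temperature T each step."""
    x, fx, t = x0, f(x0), t0
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)   # mutate: move to a neighbour
        fc = f(candidate)
        if fc < fx or random.random() < math.exp(-(fc - fx) / t):
            x, fx = candidate, fc
        t *= cooling                                  # temperature decreases over time
    return x, fx
```

Early on, when T is high, almost any move is accepted; once T has decayed, only fitness-improving moves survive, which is exactly the freezing behaviour described above.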
Important Problems of GA:

There are several problems related to using GA, but we would like to stress 4 major ones:

1. Difficulty in population selection,

2. How to define the fitness function,

3. Premature or rapid convergence of the GA,

4. Convergence to a local optimum instead of the global optimum.

How to overcome them is always a big question.

Population Selection Problem:

Population selection is a big question, but different selections for different problems have been
worked out. Consider an optimization problem with n independent variables, each bounded
between given values [a, b]. In such problems we first decide how much accuracy is required;
if we then go for a bit-string representation, the number of bits m per variable is decided by

2^(m-1) < (b-a) * 10^(decimal order of accuracy) < 2^m
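With this formula the string length can be computed directly. A Python sketch (`bits_needed` is a hypothetical helper; it returns the smallest m whose 2^m bit patterns can resolve the interval [a, b] at the requested number of decimal places):

```python
import math

def bits_needed(a, b, decimal_places):
    """Smallest m with 2**m >= (b - a) * 10**decimal_places + 1, i.e. enough
    bit patterns to distinguish every point of [a, b] at the given accuracy."""
    points = (b - a) * 10 ** decimal_places
    return math.ceil(math.log2(points + 1))
```

For example, encoding a variable on [-1, 2] to 6 decimal places needs 22 bits, while the 0..15 integer range used earlier in this report needs only 4.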

Defining Fitness Function:

Generally the GA fitness value increases with the iterations, so we have to define our
fitness function in such a way that it increases whether we are searching for a minimum or a
maximum. In our main problem we were supposed to find the optimal minimum
value, so we defined our fitness function so that the fitness of the optimal solution
increases with every iteration.

We defined it as

1/(difference between the fitness of a particular individual and the total fitness value)

So we can always find a way to define our fitness function such that it fits into
the GA.

Rapid Convergence of GA:

Rapid convergence is a very common and very general problem in GA. It occurs
because the fitness function changes its value rapidly, and that results in premature
convergence. There are several methods to modify the fitness function so that
convergence occurs slowly, which allows enough time for the GA to search the whole space and
find the global optimum. Two ways are defined below.
1. F' = a*F + b

F is the normal fitness value of the population string.

Here a is a small number while b is a relatively larger number; it can be clearly seen that
even if F increases rapidly, F' will increase slowly.

2. F' = a*F^k, which is called the power law.

Here k is kept low so that even if F increases rapidly, it does not cause rapid convergence.
There is a slightly more advanced way to define the fitness function, in which it is
modified in each and every generation:

k = func(t), where t is the generation.

In this case the fitness function is modified in every generation.
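Both scalings are one-liners. A Python sketch (the constants a, b and the exponent k are placeholders to be tuned per problem):

```python
def linear_scale(f, a=0.1, b=10.0):
    """F' = a*F + b: with small a and larger b, F' grows slowly even if F jumps."""
    return a * f + b

def power_scale(f, k=0.5):
    """F' = F**k (power law): k < 1 compresses large fitness values."""
    return f ** k
```

With a = 0.1 and b = 10, doubling the raw fitness from 100 to 200 only raises the scaled fitness from 20 to 30, so the selection pressure grows far more gently.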

Convergence to Local Optima:

Another important problem is that most of the time GA converges to a local optimum. We may
very well think that we have found the optimal solution when actually we have not.
It is important to note that the GA's ability to find the optimal solution lies in the
hands of the user: the optimal solution will be achieved only if the programmer has written the code
so that the GA has the ability to search the whole space. What we should do in such a case is
modify our crossover and mutation functions, because they are responsible for changing the
population in each and every iteration.

Why does GA work?

There is a concept of a Schema, defined for example as

*10101100101

where * is called the don't-care symbol and can stand for either 1 or 0. So the above schema matches the two strings shown below:

110101100101
010101100101

From this we can easily see that 1010100011 represents only one string, but ********** represents all strings of length 10.

Also, a single string (1010100011) is matched by 2^10 schemata, for example:


1010100011
*010100011
1*10100011

and so on.
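Testing whether a schema matches a string is straightforward; a minimal sketch (the function name is our own):

```python
def matches(schema, s):
    """True if string s is matched by the schema: every fixed
    position must agree, while '*' matches either bit."""
    return len(schema) == len(s) and all(
        c == '*' or c == b for c, b in zip(schema, s))

print(matches("*10101100101", "110101100101"))  # True
print(matches("*10101100101", "010101100101"))  # True
print(matches("*10101100101", "111101100101"))  # False (fixed position differs)
```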

Now there are two important schema properties that should be noted here:

• Order of the schema (denoted o(S)): the total number of fixed positions (0's and 1's) in the schema.

• Defining length of the schema (denoted Ω(S)): the distance between the first and the last fixed string position.

For Example –

S1 = ***001*110
S2 = 11101**001

So,
o(S1) = 6, o(S2) = 8 ;
Ω(S1) = 10-4 = 6, Ω(S2) = 10-1 = 9

There are two important things to note here:

• The order of a schema is useful in calculating the survival probability of the schema under mutation.
• The defining length of a schema is useful in calculating the survival probability of the schema under crossover.
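Both properties are easy to compute. The sketch below uses 0-indexed positions, so the defining lengths come out equal to the 1-indexed differences in the example above:

```python
def order(schema):
    """o(S): number of fixed (non-'*') positions in the schema."""
    return sum(c != '*' for c in schema)

def defining_length(schema):
    """Ω(S): distance between the first and last fixed positions."""
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0

print(order("***001*110"), defining_length("***001*110"))  # 6 6
print(order("11101**001"), defining_length("11101**001"))  # 8 9
```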

Now, as discussed earlier, each iteration of the GA consists of the following repeated steps:

t ⟵ t+1

Select P(t) from P(t-1) and recombine P(t); then evaluate P(t). For this analysis we will require some new terms.

£(S,t): the number of strings in the population which match the schema S at time t.

for example

S = ****111***********************
o(S) = 3
Ω(S) = 2

If this is the population

X1 = 11110111000000101010101010100
X2 = 01110011001010010101001010101
X3 = 01010100101011111111111111000
X4 = 00000011111111111100000000011
X5 = 10110110000000011010101001101
X6 = 10101111110100000000111111111
X7 = 00000000011111111101101010010
X8 = 01010111111000000000111111111
X9 = 10101010010100000111111111000
X10 = 10101010100100101001010010101
X11 = 10000111101010101111111100000
X12 = 10101011111111111111111111111
X13 = 10101111000000010101001010101
X14 = 10001000000000000011111111111
X15 = 10101111111111110000000000000
X16 = 10000000010101010010101001001
X17 = 10001110000000010101000010000
X18 = 10000000000000000000000000000
X19 = 11110000111101010010101001000
X20 = 00000000000000000011111111111

Then in the above population X6, X13, X15 and X17 are the 4 strings that match the schema (each has 111 in positions 5-7).

So £(S,t) = 4
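Counting £(S,t) is just a matter of testing every population string against the schema. A small sketch, run here on a subset of the population above (function and variable names are our own):

```python
def schema_count(schema, population):
    """£(S,t): number of strings in the population matching the schema."""
    return sum(all(c == '*' or c == b for c, b in zip(schema, x))
               for x in population)

S = "****111" + "*" * 22               # ****111***...  (o(S)=3, Ω(S)=2)
subset = [
    "10101111000000010101001010101",  # X13 -- matches (bits 5-7 are 111)
    "10101111111111110000000000000",  # X15 -- matches
    "10001110000000010101000010000",  # X17 -- matches
    "10000000000000000000000000000",  # X18 -- does not match
]
print(schema_count(S, subset))  # 3
```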

eval(S,t) is defined as the average fitness of the strings in the population which match the schema, i.e.

eval(S,t) = (1/p) * Σ_{i=1..p} eval(x_i)

where the x_i are the strings which match the schema and p is the number of such strings.
The expected number of String which will match in the next iteration with the Schema is
given by the relation below:

£(S,t+1) = £(S,t) · pop_size · eval(S,t) / Ftotal(t)

or, equivalently,

£(S,t+1) = £(S,t) · eval(S,t) / F(S,t)

Here pop_size is the population size, Ftotal(t) is the sum of the fitness values of all the strings, and F(S,t) = Ftotal(t)/pop_size is the average fitness of the entire population.

Now from the above relation it is clear that “above average” Schema receive an increasing
number of String in the next generation and “below average” Schema receive decreasing
number of String in the next generation.

Now if we assume that schema S remains above average by a constant fraction α, that means

eval(S,t) = F(S,t) + α·F(S,t)

So,

£(S,t) = £(S,0)·(1 + α)^t

Now we will consider the effect of Crossover and Mutation.


So, first consider Crossover

Consider 2 Strings:

S1 = ***111**************************
S2 = 111***************************01

It can clearly be seen that there is a relatively high chance that S1 will survive crossover, but on the other hand it is almost certain that S2 will be destroyed, unless it is crossed with itself or with a similar schema. In general, a crossover site is selected uniformly among the m-1 possible sites, where m is the string length. This implies that the probability of destruction of schema S is:

pd(S) = Ω(S)/(m-1)

So the probability of survival of the schema is

ps(S) = 1 - Ω(S)/(m-1)

For the two schemata above (string length m = 33): ps(S1) = 30/32, ps(S2) = 0.

Since crossover itself is applied only with probability pc(S), this becomes

ps(S) = 1 - pc(S)·Ω(S)/(m-1)

where pc(S) is the crossover probability. In fact, a disrupted schema may still survive (for example when both parents match it), so

ps(S) ≥ 1 - pc(S)·Ω(S)/(m-1)

So, the modified expected number of strings which will match the schema in the next iteration is given by:

£(S,t+1) = £(S,t) · (eval(S,t)/F(S,t)) · [1 - pc(S)·Ω(S)/(m-1)]

Now we will consider Mutation


Since the probability of alteration of a single bit is pm, the probability of survival of a bit is 1 - pm. So

ps(S) = (1 - pm)^o(S)

Since pm << 1,

ps(S) ≈ 1 - o(S)·pm
hence,

£(S,t+1) = £(S,t) · (eval(S,t)/F(S,t)) · [1 - pc(S)·Ω(S)/(m-1) - o(S)·pm]


£(S,t) = £(S,0) · [ (eval(S,t)/F(S,t)) · (1 - pc(S)·Ω(S)/(m-1) - o(S)·pm) ]^t

So this is the expected number of strings matching the schema after t generations. If

(eval(S,t)/F(S,t)) · (1 - pc(S)·Ω(S)/(m-1) - o(S)·pm) > 1

then the number of strings matching the schema increases exponentially, otherwise it decreases exponentially. This is how the GA discriminates between above-average and below-average schemata.
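The growth estimate can be evaluated numerically. The parameter values in this sketch are hypothetical, chosen only to show an above-average, short, low-order schema gaining strings:

```python
def expected_count(xi, eval_S, F_avg, p_c, p_m, delta, o, m):
    """Schema-theorem estimate:
    £(S,t+1) >= £(S,t) * (eval(S,t)/F(S,t)) * [1 - p_c*Ω(S)/(m-1) - o(S)*p_m]
    xi = £(S,t), delta = Ω(S), o = o(S), m = string length."""
    return xi * (eval_S / F_avg) * (1 - p_c * delta / (m - 1) - o * p_m)

# Above-average schema (eval_S > F_avg) with Ω(S)=2, o(S)=3, m=29:
nxt = expected_count(xi=4, eval_S=1.2, F_avg=1.0,
                     p_c=0.8, p_m=0.01, delta=2, o=3, m=29)
print(round(nxt, 3))  # 4.382 -- the schema is expected to gain strings
```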

Conclusion:
We have explored the Genetic Algorithm as an optimization technique and studied its advantages over other optimization techniques. The comparison showed that the GA, thanks to its stochastic nature, parallelism and natural-selection mechanism, is better than the others in many ways. GAs can be applied to complex non-linear functions and to problems involving the search of huge solution spaces where other optimization techniques may not work. For simple problems, however, a GA may be quite expensive in terms of time.

We also explained the algorithm with the help of an example showing all major steps of the GA, including reproduction and selection, crossover and mutation. We found that the total fitness of the population increases with iterations and hence we approach the optimum values. We discussed varied applications of GA in many fields and finally showed the use of GA in finding the rate parameters of the reduction of iron ore in the presence of graphite.
References:

1. Golap Md. Chowdhury, Gour G. Roy, “Application of Genetic Algorithm (GA) to estimate the rate
parameters for solid state reduction of iron ore in presence of graphite”, Computational
Materials Science 45 (2009) 176–180

2. Dorit Wolf and Ralf Moros, “Estimating rate constants of heterogeneous catalytic reactions without supposition of rate determining surface steps – an application of a genetic algorithm”, Chemical Engineering Science, Vol. 52, No. 7, pp. 1189–1199, 1997

3. David E. Goldberg, “Genetic Algorithms in search, optimization and machine learning”

4. Adam Marczyk, “Genetic Algorithms and Evolutionary Computation”

5. www.rennard.org/alife/english/gavintrgb.html

6. www.ai-junkie.com/ga/intro/gat1.html

7. www.genetic-programming.org/

8. en.wikipedia.org/wiki/Genetic_algorithm
