Genetic Algorithm
The genetic algorithm is a method based on natural selection, the mechanism that drives
biological evolution, for solving both constrained and unconstrained optimization problems. A
population of individual solutions is repeatedly modified by the genetic algorithm. The genetic
algorithm picks individuals from the present population to be parents at each phase and utilizes
them to generate the following generation's children. The population "evolves" toward an ideal
solution over generations. The genetic algorithm may be used to handle several optimization
problems that aren't well suited for traditional optimization techniques, such as issues with
discontinuous, nondifferentiable, stochastic, or highly nonlinear objective functions. The
genetic algorithm can also be applied to mixed-integer programming problems, in which
certain components must be integer valued.
INTRODUCTION TO OPTIMIZATION
[Figure: optimization as a process that maps a set of inputs to a set of outputs]
The term "optimization" refers to the process of determining input values so as to obtain the
"best" output values. The definition of "best" varies from problem to problem, but in
mathematical terms it refers to maximizing or minimizing one or more objective functions by
varying the input parameters.
The search space is the collection of all potential solutions or values for the inputs. There is a
point or a collection of points in this search area that provides the best solution. The goal of
optimization is to locate a certain point or group of points inside the search space.
Nature has long served as a source of inspiration for all of humanity. Genetic Algorithms (GAs)
are search-based algorithms based on natural selection and genetics principles. GAs are a
subset of Evolutionary Computing, a considerably bigger discipline of computation.
GAs were developed at the University of Michigan by John Holland and his students and
colleagues, most notably David E. Goldberg, and have since been tried on a variety of
optimization problems with a high degree of success.
We have a pool or population of possible solutions to a problem in GAs. These solutions are
then subjected to recombination and mutation (as in natural genetics), resulting in the birth of
new children, and the process is repeated over generations. Each individual (or candidate
solution) is given a fitness value (based on its objective function value), and the fitter ones have
a better probability of mating and producing additional "fitter" individuals. This is consistent
with Darwin's "Survival of the Fittest" theory.
In this manner, we keep "evolving" better individuals or solutions over generations until we
reach a stopping criterion.
Genetic algorithms are sufficiently randomized in nature, but they perform much better than
random local search (in which we just try numerous random solutions, keeping track of the
best so far), because they exploit historical information as well.
GAs offer a number of advantages that have helped them become quite popular. These include
the following:
• Does not require any derivative information (which may not be available for many real-
world problems).
• Is faster and more efficient compared to traditional methods.
• Has very good parallel capabilities.
• Optimizes both continuous and discrete functions, as well as multi-objective problems.
• Provides a list of "good" solutions rather than just a single solution.
• Always gets an answer to the problem, which gets better over time.
• Is useful when the search space is very large and there are a large number of
parameters involved.
GAs also suffer from certain limitations:
• For some problems, computing the fitness value repeatedly might be computationally
expensive.
• Being stochastic, there are no guarantees on the optimality or the quality of the
solution.
• If not implemented properly, the GA may not converge to the optimal solution.
Genetic Algorithms can deliver a "good-enough" answer "quickly enough." As a result, genetic
algorithms are appealing for tackling optimization problems. The following are the reasons
why genetic algorithms are required:
1. Solving difficult problems
There are a huge number of NP-Hard problems in computer science. This essentially means
that even the most powerful computing systems would take a very long time (years!) to solve
such a problem. In this situation, GAs prove to be an effective tool for quickly delivering
workable near-optimal solutions.
2. Failure of gradient-based methods
Traditional calculus-based methods work by starting at a random point and moving in the
direction of the gradient until we reach the top of the hill. This technique is fast and efficient
for single-peaked objective functions, such as the cost function in linear regression. However,
in most real-world situations we have a very complex objective landscape, made up of many
peaks and many valleys, which causes such methods to fail, since they have an inherent
tendency to get stuck at a local optimum.
Population - It is a subset of all the possible (encoded) solutions to the given problem. The
population for a GA is analogous to a population of human beings, except that instead of
human beings we have candidate solutions representing human beings.
Chromosome - A chromosome is one particular solution to the given problem.
Genotype - The genotype is the population in the computation space. In the computation
space, the solutions are represented in a way which can be easily understood and
manipulated by a computing system.
Phenotype - is the population in the real-world solution space in which solutions are
represented in the same manner they are in real-world settings.
Decoding and Encoding - For simple problems, the phenotype and genotype spaces are the
same. In most cases, however, the phenotype and genotype spaces are different. Decoding is
the transformation of a solution from the genotype space to the phenotype space, while
encoding is the transformation from the phenotype space to the genotype space. Since
decoding is carried out repeatedly in a GA during the fitness value calculation, it should be
fast.
Consider the 0/1 Knapsack Problem as an example. The Phenotype space is made up of
solutions that just have the item numbers of the items to be chosen.
However, in the genotype space it can be represented as a binary string of length n (where n
is the number of items). A 1 at position x indicates that the xth item is picked, while a 0
indicates that it is not. In this case, the genotype and phenotype spaces are different.
[Figure: Encoding maps the phenotype space (actual solution space) to the genotype space
(computation space), e.g. the binary string 0 1 0 1 0 1 0 1; decoding maps back from
genotype to phenotype]
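As an illustration, the genotype-phenotype mapping for the knapsack encoding above can be sketched in Python (the function names `decode` and `encode` are our own, not part of any library):

```python
def decode(genotype):
    # Genotype -> phenotype: binary string to the list of picked item numbers
    return [i for i, gene in enumerate(genotype) if gene == 1]

def encode(phenotype, n_items):
    # Phenotype -> genotype: list of picked item numbers to a binary string
    picked = set(phenotype)
    return [1 if i in picked else 0 for i in range(n_items)]

genotype = [0, 1, 0, 1, 0, 1, 0, 1]
print(decode(genotype))                        # items 1, 3, 5 and 7 are picked
assert encode(decode(genotype), 8) == genotype  # the mappings are inverses
```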
Fitness function - A fitness function is a function which takes a candidate solution as input
and produces as output the suitability of the solution. In some cases, the fitness function and
the objective function may be the same, while in others they might be different depending on
the problem.
Genetic operators - These alter the genetic composition of the offspring. They include
crossover, mutation, and selection, among others.
BASIC STRUCTURE
The basic structure of a GA is as follows –
We begin by selecting parents for mating from an initial population (which may be produced
at random or seeded by other heuristics). To create new offspring, use crossover and mutation
operators on the parents. Finally, these offspring replace the population's current members, and
the cycle resumes. In this approach, genetic algorithms attempt to approximate human
evolution to a degree.
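The loop described above can be sketched as a minimal, self-contained Python program. The toy objective here (counting 1-bits, often called OneMax) and the parameter values are our own illustrative choices; the selection, crossover, and mutation operators used are simple variants of those discussed later in this chapter.

```python
import random

def fitness(chromosome):
    # Toy objective (OneMax): count the 1-bits; higher is fitter
    return sum(chromosome)

def tournament(pop, k=3):
    # Pick k individuals at random and return the fittest one
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    # One-point crossover: take the head of one parent, the tail of the other
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]

def mutate(chromosome, pm=0.05):
    # Bit-flip mutation with a small per-gene probability
    return [1 - g if random.random() < pm else g for g in chromosome]

def genetic_algorithm(n_pop=30, n_genes=10, generations=100):
    pop = [[random.randint(0, 1) for _ in range(n_genes)] for _ in range(n_pop)]
    for _ in range(generations):
        # Carrying the best individual forward (elitism) is a common
        # refinement, not mandated by the basic structure above
        children = [max(pop, key=fitness)]
        while len(children) < n_pop:
            child = crossover(tournament(pop), tournament(pop))
            children.append(mutate(child))
        pop = children  # generational replacement
    return max(pop, key=fitness)

random.seed(42)
best = genetic_algorithm()
print(fitness(best))
```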
Binary Representation
0 0 1 0 1 1 1 0 0 1
Figure 5: Binary Representation
In other situations, particularly those involving numbers, we can represent the numbers in
their binary form. The problem with this kind of encoding is that different bits have different
significance, and therefore mutation and crossover operators can have unintended
consequences. Gray coding can help with this to some extent, as a change in one bit does not
have a massive effect on the solution.
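Gray coding can be computed directly from a binary value; the expression `n ^ (n >> 1)` is the standard binary-reflected Gray code. A minimal sketch:

```python
def binary_to_gray(n):
    # Adjacent integers differ in exactly one bit in Gray code
    return n ^ (n >> 1)

def gray_to_binary(g):
    # Inverse transform: fold the Gray bits back down
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# 7 (0111) and 8 (1000) differ in four bits in plain binary,
# but their Gray codes 0100 and 1100 differ in only one bit.
print(binary_to_gray(7), binary_to_gray(8))  # 4 12
```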
Real Value Representation
The real-valued representation is the most natural for problems where we want to define the
genes using continuous rather than discrete variables. However, the precision of these real or
floating-point numbers is limited to the precision of the computer.
0.5 0.2 0.6 0.8 0.7 0.4 0.3 0.9 0.4 0.7
Figure 6:Real value representation
Integer Representation
For discrete-valued genes, we cannot always limit the solution space to binary 'yes' or 'no'.
For example, if we wish to encode the four directions North, South, East, and West, we can
encode them as 0, 1, 2, 3. In such situations, integer representation is preferred.
5 2 6 8 7 4 3 9 4 7
Figure 7: Integer representation
Permutation Representation
The answer to many issues is represented as an order of elements. In these situations,
permutation representation is the best option.
A classic example of this representation is the travelling salesman problem (TSP). In the
TSP, the salesman has to take a tour of all the cities, visiting each city exactly once, before
returning to the starting city. The total distance of the tour has to be minimized. Since the
solution to this TSP is naturally an ordering or permutation of all the cities, it makes sense to
use a permutation representation for this problem.
1 5 9 8 7 4 2 3 6 0
Figure 8: Permutation representation
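For a permutation chromosome such as a TSP tour, the fitness is typically the total tour length. A small sketch, with hypothetical city coordinates of our own choosing:

```python
import math

def tour_length(perm, coords):
    # Total length of the closed tour that visits the cities in the given
    # order and returns to the starting city.
    total = 0.0
    for i in range(len(perm)):
        x1, y1 = coords[perm[i]]
        x2, y2 = coords[perm[(i + 1) % len(perm)]]
        total += math.hypot(x2 - x1, y2 - y1)
    return total

# Four hypothetical cities on the corners of a unit square
coords = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(tour_length([0, 1, 2, 3], coords))  # tour along the perimeter: 4.0
print(tour_length([0, 2, 1, 3], coords))  # self-crossing tour is longer
```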
GENETIC ALGORITHM - POPULATION
• The diversity of the population should be maintained; otherwise, premature
convergence may occur.
• A very large population can cause a GA to slow down, while a smaller population
might not be enough for a good mating pool. Therefore, an optimal population size
needs to be decided by trial and error.
Typically, the population is defined as a two-dimensional array of size (population size) x
(chromosome size).
Population Initialization
In a GA, there are two major ways for initializing a population. They are −
• Random initialization - populate the initial population with completely random
solutions.
• Heuristic initialization - populate the initial population using a known heuristic for
the problem.
It has been observed that seeding the entire population with a heuristic can result in the
population having similar solutions and very little diversity. Experiments suggest that it is
the random solutions which drive the population to optimality. Therefore, rather than
filling the entire population with heuristic-based solutions, we use heuristic initialization to
seed the population with a few good solutions and fill in the rest with random
solutions.
It has also been discovered that in some circumstances, heuristic initialization merely affects
the population's initial fitness, but in the end, it is the variety of the solutions that leads to
optimality.
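The mixed strategy described above — seed a few heuristic solutions and fill the rest randomly — can be sketched as follows. The 10% seed fraction and the all-ones "heuristic" are purely illustrative placeholders:

```python
import random

def random_individual(n_genes):
    # A completely random binary chromosome
    return [random.randint(0, 1) for _ in range(n_genes)]

def init_population(n_pop, n_genes, heuristic, seed_fraction=0.1):
    # Seed a small fraction with heuristic solutions and fill the
    # remainder with random ones to preserve diversity.
    n_seeded = int(n_pop * seed_fraction)
    pop = [heuristic(n_genes) for _ in range(n_seeded)]
    pop += [random_individual(n_genes) for _ in range(n_pop - n_seeded)]
    return pop

# Hypothetical heuristic: start from the all-ones string
pop = init_population(20, 8, heuristic=lambda n: [1] * n)
print(len(pop))  # 20
```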
Population Models
There are two population models that are widely used. They are −
Steady State
In a steady-state GA, we generate one or two offspring in each iteration, and they replace
one or two individuals in the population. A steady-state GA is also known as an incremental
GA.
Generational
In a generational model, we generate 'n' offspring, where n is the population size, and the
entire population is replaced by the new one at the end of the iteration.
GENETIC ALGORITHM - FITNESS FUNCTION
Simply explained, the fitness function is a function that takes a candidate solution to a
problem as input and outputs how "fit" or "excellent" the answer is for the problem under
discussion.
The calculation of fitness value is performed several times in a GA, therefore it should be
quick. A sluggish fitness value computation can have a negative impact on a GA, making it
extremely slow.
In most cases the fitness function and the objective function are the same, since the objective
is to maximize or minimize the given objective function. However, for more complex
problems with multiple objectives and constraints, an algorithm designer might choose a
different fitness function.
A fitness function should possess the following characteristics:
• The fitness function should be sufficiently fast to compute.
• It must quantitatively measure how fit a given solution is, or how fit the individuals
that can be produced from that solution are.
Due to the inherent complexities of the problem at hand, calculating the fitness function
directly might not be possible in some cases. In such cases, we approximate the fitness to
suit our needs.
The fitness calculation for a 0/1 knapsack solution is shown below. It is a simple fitness
function which just sums the profit values of the items being picked (those whose gene is 1),
scanning the elements from left to right until the knapsack is full.
Item number:    0  1  2  3  4  5  6
Chromosome:     0  1  0  1  1  0  1
Profit values:  2  9  8  5  4  0  2
Weight values:  7  5  3  1  5  9  8
Knapsack capacity = 15
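Under one reasonable reading of the scan described above (items whose gene is 1 are packed left to right as long as they still fit), the fitness of this chromosome works out to 18:

```python
def knapsack_fitness(chromosome, profits, weights, capacity):
    # Scan items left to right; an item whose gene is 1 is packed if it
    # still fits, and its profit is added to the fitness.
    total_weight, total_profit = 0, 0
    for gene, profit, weight in zip(chromosome, profits, weights):
        if gene == 1 and total_weight + weight <= capacity:
            total_weight += weight
            total_profit += profit
    return total_profit

# Values from the table above
chromosome = [0, 1, 0, 1, 1, 0, 1]
profits    = [2, 9, 8, 5, 4, 0, 2]
weights    = [7, 5, 3, 1, 5, 9, 8]
print(knapsack_fitness(chromosome, profits, weights, capacity=15))  # 18
```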
Roulette Wheel Selection
In roulette wheel selection, the circular wheel is divided into pies in proportion to each
individual's fitness. A fixed point is chosen on the wheel circumference and the wheel is
spun; the individual in front of the fixed point when the wheel stops is selected as a parent.

Chromosome   Fitness value
A            8.2
B            3.2
C            1.4
D            1.2
E            4.2
F            0.3

Figure 10: Roulette Wheel Selection (the wheel is spun and the individual in front of the
fixed point, here D, is selected as the parent)
Clearly, a fitter individual has a larger pie on the wheel and therefore a greater chance of
landing in front of the fixed point when the wheel is rotated. Thus, the probability of
choosing an individual is directly proportional to its fitness.
For the implementation, we use the following steps −
• Calculate S = the sum of all fitnesses.
• Generate a random number r between 0 and S.
• Starting from the top of the population, keep adding the fitnesses to a partial sum P,
until P > r.
• The individual for which P exceeds r is the chosen individual.
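These steps translate directly into Python; the fitness values reused here from Figure 10 are for illustration only:

```python
import random

def roulette_wheel_select(population, fitnesses):
    # S = sum of all fitnesses; pick r uniformly in [0, S]; walk the
    # population accumulating a partial sum P, and return the first
    # individual for which P exceeds r.
    S = sum(fitnesses)
    r = random.uniform(0, S)
    P = 0.0
    for individual, fit in zip(population, fitnesses):
        P += fit
        if P > r:
            return individual
    return population[-1]  # guard against floating-point round-off

random.seed(1)
pop = ["A", "B", "C", "D", "E", "F"]
fits = [8.2, 3.2, 1.4, 1.2, 4.2, 0.3]
picks = [roulette_wheel_select(pop, fits) for _ in range(1000)]
print(picks.count("A"))  # A, the fittest, is picked most often
```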
Stochastic Universal Sampling (SUS)
Stochastic Universal Sampling is quite similar to roulette wheel selection, except that instead
of having just one fixed point, we have multiple fixed points, as shown in the figure below.
Therefore, all the parents are chosen in just one spin of the wheel. Also, such a setup
encourages the highly fit individuals to be chosen at least once.
It should be noted that fitness proportional selection approaches do not function in situations when
fitness might be negative.
Chromosome   Fitness value
A            8.2
B            3.2
C            1.4
D            1.2
E            4.2
F            0.3

Figure: Stochastic Universal Sampling (one spin of a wheel with multiple fixed points
selects all the parents at once)
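A sketch of SUS with n equally spaced pointers (the pointer spacing S/n is the standard formulation; the fitness values are the same illustrative ones as before):

```python
import random

def stochastic_universal_sampling(population, fitnesses, n_select):
    # One spin with n equally spaced pointers; each pointer selects the
    # individual whose cumulative-fitness segment it falls in.
    S = sum(fitnesses)
    step = S / n_select
    start = random.uniform(0, step)
    pointers = [start + i * step for i in range(n_select)]
    selected, P, i = [], fitnesses[0], 0
    for ptr in pointers:
        while P < ptr:
            i += 1
            P += fitnesses[i]
        selected.append(population[i])
    return selected

random.seed(2)
pop = ["A", "B", "C", "D", "E", "F"]
fits = [8.2, 3.2, 1.4, 1.2, 4.2, 0.3]
print(stochastic_universal_sampling(pop, fits, 4))
```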
Tournament Selection
In K-way tournament selection, we select K individuals from the population at random and
choose the best out of these to become a parent. The same process is repeated to select the
next parent. Tournament selection is also extremely popular in the literature as it can even
work with negative fitness values.
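A K-way tournament is only a few lines of Python; the fitness values below are made up, and deliberately include negative ones:

```python
import random

def tournament_select(population, fitnesses, k=3):
    # K-way tournament: sample K individuals at random and return the
    # one with the best fitness (works even with negative fitnesses).
    contestants = random.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitnesses[i])
    return population[winner]

random.seed(3)
pop = ["A", "B", "C", "D", "E", "F"]
fits = [-1.0, -3.5, 2.0, 0.5, -0.2, 1.1]  # negative fitnesses are fine
print(tournament_select(pop, fits))
```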
Rank Selection
Rank selection also works with negative fitness values, and is mostly used when the
individuals in the population have very close fitness values, so that fitness-proportionate
methods exert almost no selection pressure. Here, every individual in the population is
ranked according to its fitness, and the parents are selected based on each individual's rank
rather than its fitness. Higher-ranked individuals are preferred over lower-ranked ones.
Chromosome Fitness value Rank
A 8.1 1
B 8.0 4
C 8.05 2
D 7.95 6
E 8.02 3
F 7.99 5
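Selection by rank can be sketched as follows, using the table above; the linear rank-to-weight scheme (weight 1 for the worst, n for the best) is one common choice, not the only one:

```python
import random

def rank_select(population, fitnesses):
    # Rank individuals by fitness and select with probability
    # proportional to rank weight, not to raw fitness.
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    weights = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank  # 1 for the worst, len(population) for the best
    return random.choices(population, weights=weights, k=1)[0]

random.seed(4)
pop = ["A", "B", "C", "D", "E", "F"]
fits = [8.1, 8.0, 8.05, 7.95, 8.02, 7.99]  # very close fitness values
picks = [rank_select(pop, fits) for _ in range(1000)]
print(picks.count("A") > picks.count("D"))  # True
```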
Random Selection
In this strategy, we randomly select parents from the existing population. Since there is no
selection pressure towards fitter individuals, this strategy is usually avoided.
GENETIC ALGORITHM - CROSSOVER
The crossover operator is analogous to reproduction and biological crossover. In crossover,
more than one parent is selected and one or more offspring are produced using the genetic
material of the parents. Crossover is usually applied in a GA with a high probability – pc.
Crossover operators include the following:
• One-point crossover
• Multi-point crossover
• Uniform crossover
• Whole arithmetic recombination
• Davis’ order crossover
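One-point and uniform crossover, two of the operators listed above, can be sketched as follows; the all-zeros and all-ones parents are chosen so that the origin of each gene is visible in the children:

```python
import random

def one_point_crossover(p1, p2):
    # Swap the tails of the parents after a random crossover point
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def uniform_crossover(p1, p2):
    # Each gene is inherited from either parent with equal probability
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if random.random() < 0.5:
            c1.append(g1); c2.append(g2)
        else:
            c1.append(g2); c2.append(g1)
    return c1, c2

random.seed(5)
p1, p2 = [0] * 8, [1] * 8
print(one_point_crossover(p1, p2))
print(uniform_crossover(p1, p2))
```

Note that between the two children, every parental gene appears exactly once, which the test below checks by counting the 1-bits.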