A Review on Optimization Algorithm for Deep Learning Method in Bioinformatics Field
Abstract—In the past few years, deep learning has been widely used in bioinformatics to solve common problems such as protein sequence prediction, phylogenetic inference, multiple sequence alignment and many more. It has been in the spotlight as a powerful approach that has made significant advances on issues that have haunted the artificial intelligence community for many years. However, deep learning still suffers from several weaknesses, such as being trapped at local minima, lower performance and high computational time. Therefore, a global optimization technique such as the differential search algorithm can be used to assist deep learning in obtaining the best results. This review covers the fundamentals of deep learning and its involvement in bioinformatics, as well as the implementation of the differential search algorithm and its involvement in bioinformatics.

Keywords—bioinformatics, deep learning, neural network, backpropagation, optimization algorithm, differential search algorithm

I. INTRODUCTION

Bioinformatics is a research area widely known for integrating numerous core subjects such as biology, mathematics, engineering and computer science. As a multidisciplinary field, bioinformatics usually involves the development of new methods and software tools for better understanding of biological data, since its main aim is to help researchers extract the knowledge encoded in biological data. As mentioned by Hapudeniya [1], essential problems in bioinformatics such as protein sequence prediction, phylogenetic inference and multiple sequence alignment are usually hard in nature, especially in the non-deterministic polynomial-time (NP) sense. Therefore, advanced computational technologies, tools and algorithms are needed to solve these problems [1].

To solve the growing problems in bioinformatics, artificial intelligence methods such as deep learning offer an efficient and powerful approach. Deep learning is one of the artificial intelligence methods that has consistently been in the spotlight because of its major advances on problems that have haunted the artificial intelligence community for a long time [2]. Deep learning has turned out to be very good at predicting the activity of potential drug molecules [3], predicting the effects of mutations in non-coding DNA on gene expression and disease [4-5] and reconstructing brain circuits [6]. This shows that deep learning can be considered a method for solving multiple problems that arise in bioinformatics.

Despite its success and advantages, deep learning still suffers from several weaknesses, such as being trapped at local minima, lower performance and high computational time [7-9]. Because of these drawbacks, optimization algorithms can be used to assist deep learning in achieving the best results. There are several optimization algorithms, but this paper focuses on the Differential Search Algorithm (DSA). Several comparison studies [10-11] have reported that DSA is a more powerful optimization algorithm than Particle Swarm Optimization (PSO), Artificial Bee Colony (ABC) and the Gravitational Search Algorithm (GSA).

II. DEEP LEARNING

Representation learning is a set of methods that allows a machine to be fed raw data and to automatically discover the representations required for classification or detection [2]. Deep learning is one example of a representation learning method with multiple levels of representation. It is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear modules [12-15]. Deep learning exploits the idea of a hierarchy of concepts, in which higher-level, more abstract concepts are learned from lower-level ones. This architecture is often developed with a greedy layer-by-layer strategy, and deep learning serves to disentangle these abstractions and choose which features are helpful for training [16].
Fig. 1 Deep learning with more than two hidden layers [17].

Fig. 1 visualizes a deep learning architecture with more than two hidden layers. At each layer, the input is transformed by a processing unit whose parameters are learned during the training phase [15]. A key aspect of deep learning is that it requires little engineering by hand, since the layers of features are learned from data using a general-purpose learning procedure.

The next subsections briefly describe the fundamental architecture of deep learning, which consists of two main components: the neural network and backpropagation.

A. Neural Network

Neural networks have existed since the 1950s and have contributed to many findings. The most successful deep learning methods usually involve neural networks. Schmidhuber [15] explains that a neural network consists of many simple, connected processors called neurons, each of which produces a sequence of real-valued activations. These neurons are activated through weighted connections from previously active neurons as well as through sensors perceiving the environment. Schmidhuber [15] also states that the network's behavior may require long causal chains of computational stages, where each stage transforms, typically in a non-linear manner, the aggregate activation of the network.

Deep learning operates on the neural network concept. The two look similar because deep learning applies the neural network concept of neurons and hidden layers, but there are slight differences between them. First, deep learning has more hidden layers than a conventional neural network, which has only one or two hidden layers. Second, deep learning can be trained in both unsupervised and supervised manners for both unsupervised and supervised learning tasks [18-19].

B. Backpropagation

The backpropagation procedure takes place after the feedforward procedure. At the end of the feedforward procedure, the network produces an actual output value, which is then compared with the expected value. This comparison yields an error value, which triggers the backpropagation procedure: the connection weights in the network are adjusted, working backwards from the output layer to the hidden layers, until the correct output, or an output close to the expected output, is produced.

However, several problems may occur when using backpropagation. The first is that it requires differentiability in order to compute the gradient [20]. Therefore, backpropagation cannot handle discontinuous node transfer functions or discontinuous optimality criteria. The other problem is scaling [20]. Backpropagation works efficiently on simple training problems, but as the problem complexity increases, its performance may fall off rapidly. Backpropagation has the advantage that it can escape local minima. Unfortunately, it leaves them without knowing whether the next one it finds is better or worse, so it ends up bouncing between local minima without much improvement, which makes training very slow.

C. Deep Learning in Bioinformatics

As mentioned before, deep learning is a powerful approach for learning complex patterns at multiple layers. Because of this characteristic, it can also capture multiple levels of data abstraction and processing within cells. This makes deep learning suitable for genomics studies, as mentioned by Park and Kellis [21], and for pathway analysis.

Alipanahi et al. [22] used a deep learning strategy in their genomics research to calculate protein-nucleic acid interactions from diverse experimental datasets. They found that the deep learning method performed better than other state-of-the-art methods. Furthermore, Alipanahi et al. incorporated deep learning into their own algorithm, called DeepBind. This method allows the binding affinity of a protein to a DNA or RNA sequence to be predicted in two steps: a convolution module for representation learning and a prediction module for feature combination. The method is illustrated in Fig. 2.

Fig. 2 Deep convolutional neural network designed by Alipanahi et al. [22].

III. OPTIMIZATION

Optimization algorithms have been widely used in many research areas such as bioinformatics, biotechnology and computational biology, and in many applications to solve a variety of problems. Optimization, as
Authorized licensed use limited to: ULAKBIM UASL - EGE UNIVERSITESI. Downloaded on November 18,2024 at 14:54:08 UTC from IEEE Xplore. Restrictions apply.
2016 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES)
stated by Banga [23], aims to make a design or system as efficient and functional as possible. Optimization algorithms are not a new concept, as they are widely used to help researchers analyze, enhance, encode and optimize biological data.

Therefore, optimization algorithms can be considered as a means to support deep learning in achieving better results. With advantages such as great algorithmic flexibility and reduced computational burden [24], they can be of great help in training networks with many layers and in covering the drawbacks of deep learning. In this section, an optimization algorithm called the Differential Search Algorithm (DSA) is discussed briefly.

A. Differential Search Algorithm (DSA)

The differential search algorithm (DSA) is a relatively new meta-heuristic algorithm proposed by Civicioglu [25]. It is inspired by the Brownian-like random walk movement that organisms use to migrate [26]. DSA mimics the behavior of organisms that move away from a habitat with low food capacity to a new habitat with more food capacity. They stay in the new habitat until they find another habitat with more food capacity than the current one. In DSA, the search space is pictured as the food area or habitat, while each point in the search space is pictured as a position in the organism migration. The main goal of DSA, as stated in [26], is to find the global optimal solution of the problem. The standard DSA pseudocode is described in Fig. 3.

There are four main steps in DSA. The first is initialization: minimum and maximum bounds are prescribed to initialize the organisms, which use NP × D-dimensional parameter vectors (NP organisms, each with D dimensions). Second, after initialization, a Brownian-like random walk is modeled by generating stopover vectors between organisms. The third step is the search for a stopover site, which is calculated by an organism of the superorganism. The last step is selection, whose purpose is to choose the next population from the stopover sites and the current organism population.

B. DSA for Proteomic Analysis Pipeline

As mentioned before, DSA has been widely used in bioinformatics and many other research fields. One application of DSA can be seen in the work of Xie et al. [27]. They stated that electron transfer dissociation (ETD) is exceptionally helpful for peptide fragmentation in mass spectrometry. Unfortunately, ETD spectra usually receive low scores in the identification of 2+ ions. To solve this problem, Xie et al. proposed a new method that combines an ion-charge enhancing method with DSA. Using this method, they observed that the complementary identification results show great improvement in ETD identification.

C. Comparative Study of Optimization Algorithms

Optimization plays an important role in solving a variety of problems in many applications and research fields. It has already gained wide attention in research areas such as biotechnology, bioinformatics, computational biology and many more. In the past few years, many other optimization algorithms have been used by researchers to solve their research problems.

In this section, a comparative study is made between several optimization algorithms: ABC, GA, FA and DSA. The comparison is shown in Table 1 below:
Firefly Algorithm (FA) [30]
• Advantage: suitable for high-dimensional and non-linear problems
• Disadvantage: difficult to reach the optimal solution within a reasonable time
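The four DSA steps described in Section III-A (initialization, generation of stopover vectors, search for a stopover site, and selection) can be illustrated with a minimal Python sketch. This is a simplified illustration under stated assumptions, not the pseudocode of Civicioglu [25]: the function names `dsa_sketch` and `sphere`, the parameter `move_prob`, the Gaussian scale factor used as a stand-in for the Brownian-like step, the clamping of out-of-bound coordinates and the toy objective are all choices made here for illustration only.

```python
import random


def sphere(x):
    """Toy objective (illustrative assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def dsa_sketch(objective, dim, low, high, pop_size=20, iterations=200,
               move_prob=0.3, seed=0):
    """Simplified DSA-style minimizer; returns (best_position, best_value)."""
    rng = random.Random(seed)

    # Step 1, initialization: organisms drawn uniformly inside the bounds.
    population = [[rng.uniform(low, high) for _ in range(dim)]
                  for _ in range(pop_size)]
    fitness = [objective(org) for org in population]

    for _ in range(iterations):
        # Step 2, stopover vectors: pair each organism with a randomly
        # permuted "donor" and draw one random-walk scale factor
        # (Gaussian here, which is an assumption of this sketch).
        donors = population[:]
        rng.shuffle(donors)
        scale = rng.gauss(0.0, 1.0)

        for i, (org, donor) in enumerate(zip(population, donors)):
            # Step 3, stopover site: move a random subset of coordinates
            # relative to the donor, then clamp the result to the bounds.
            stopover = []
            for x, d in zip(org, donor):
                if rng.random() < move_prob:
                    x = x + scale * (d - x)
                stopover.append(min(max(x, low), high))

            # Step 4, selection: keep the stopover site only if it is better.
            value = objective(stopover)
            if value < fitness[i]:
                population[i] = stopover
                fitness[i] = value

    best = min(range(pop_size), key=fitness.__getitem__)
    return population[best], fitness[best]


best_position, best_value = dsa_sketch(sphere, dim=3, low=-5.0, high=5.0)
print(best_position, best_value)
```

Because selection is greedy, the best objective value in the population can never get worse from one iteration to the next. This differs from the behavior described for backpropagation in Section II-B, which can leave a minimum without knowing whether the next one is better.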