2016 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES)

A Review on Optimization Algorithm for Deep Learning Method in Bioinformatics Field
Siti Noorain Mohmad Yousoff*, ‘Amirah Baharin, Afnizanfaizal Abdullah*
Synthetic Biology Research Group,
Faculty of Computing,
Universiti Teknologi Malaysia,
81310 UTM, Johor, Malaysia
*ainyousoff@gmail.com, ami.b2525@gmail.com, *afnizanfaizal@utm.my

Abstract: In the past few years, deep learning has been widely used in bioinformatics to solve common problems such as protein sequence prediction, phylogenetic inference and multiple sequence alignment. It has been in the spotlight as a powerful approach that has made significant advances on problems that have troubled the artificial intelligence community for many years. However, deep learning still suffers from several weaknesses, such as becoming trapped at local minima, lower performance and high computational time. Therefore, a global optimization technique such as the differential search algorithm can be used to assist deep learning in obtaining better results. This review covers the fundamentals of deep learning and its involvement in bioinformatics, as well as the differential search algorithm and its involvement in bioinformatics.

Keywords: bioinformatics, deep learning, neural network, backpropagation, optimization algorithm, differential search algorithm

I. INTRODUCTION

Bioinformatics is a research area that integrates numerous core subjects such as biology, mathematics, engineering and computer science. As a multidisciplinary field, bioinformatics usually involves the development of new methods and software tools for better understanding of biological data, since its main aim is to help researchers extract the knowledge encoded in biological data. As mentioned by Hapudeniya [1], essential problems in bioinformatics such as protein sequence prediction, phylogenetic inference and multiple sequence alignment are usually hard in nature, particularly in the non-deterministic polynomial-time sense. Therefore, advanced computational technologies, tools and algorithms are needed to solve these problems [1].

To solve the growing problems in bioinformatics, artificial intelligence methods such as deep learning offer an efficient and powerful approach. Deep learning is one of the artificial intelligence methods that has been in the spotlight because of its major advances on problems that have troubled the artificial intelligence community for a long time [2]. Deep learning has turned out to be very good at predicting the activity of potential drug molecules [3], predicting the effects of mutations in non-coding DNA on gene expression and disease [4-5] and reconstructing brain circuits [6]. This shows that deep learning can be considered as a method for solving many of the problems that arise in bioinformatics.

Despite its success and advantages, deep learning still suffers from several weaknesses, such as becoming trapped at local minima, lower performance and high computational time [7-9]. Because of these drawbacks, optimization algorithms can be used to assist deep learning in achieving better results. There are several optimization algorithms, but this paper focuses on the Differential Search Algorithm (DSA). Several comparison studies [10-11] have shown that DSA is a more powerful optimization algorithm than Particle Swarm Optimization (PSO), Artificial Bee Colony (ABC) and Gravitational Search Algorithm (GSA).

II. DEEP LEARNING

Representation learning is a set of methods that allows a machine to be fed raw data and to automatically discover the representations required for classification or detection [2]. Deep learning is one example of a representation learning method, with multiple levels of representation. It is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear modules [12-15]. Deep learning works by exploiting a hierarchy of concepts in which the higher-level, more abstract concepts are learned from the lower-level ones. This architecture is often built with a greedy layer-by-layer strategy, and deep learning serves to unravel these abstractions and choose which elements are helpful for training [16].
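The idea of an architecture composed of multiple non-linear modules can be illustrated with a minimal sketch in plain Python. The layer sizes, the tanh non-linearity, the random initial weights and the helper names `make_layer` and `forward` are all assumptions chosen for illustration; they are not taken from any of the cited works.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (hypothetical helper)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Pass input x through every layer and return the activations of each level."""
    activations = [x]
    for weights, biases in layers:
        prev = activations[-1]
        # Each unit applies a non-linear function (tanh) to a weighted sum
        # of the activations in the level below.
        level = [math.tanh(sum(w * a for w, a in zip(row, prev)) + b)
                 for row, b in zip(weights, biases)]
        activations.append(level)
    return activations

rng = random.Random(0)
sizes = [4, 5, 5, 5, 2]  # input, three hidden layers, output (illustrative sizes)
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
levels = forward([0.1, -0.2, 0.3, 0.4], layers)
print([len(level) for level in levels])  # [4, 5, 5, 5, 2]
```

Each entry of `levels` is one level of representation computed from the level below it. In this sketch the weights are random, whereas in a trained network they would be learned from data.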

978-1-4673-7791-1/16/$31.00 ©2016 IEEE


Fig. 1 Deep learning with more than two hidden layers [17].

Fig. 1 visualizes a deep learning architecture with more than two hidden layers. At each layer, the input is transformed by a processing unit whose parameters are learned during the training phase [15]. The key aspect of deep learning is that it requires little engineering by hand, since the layers of features are learned from data using a general-purpose learning procedure.

The next subsections briefly describe the two main components that make up deep learning: the neural network and backpropagation.

A. Neural Network

Neural networks have existed since the 1950s and have already contributed to many findings. The most successful deep learning methods usually involve neural networks. Schmidhuber [15] explains that a neural network consists of many simple, connected processors called neurons, each of which produces a sequence of real-valued activations. Neurons are activated through weighted connections from previously active neurons, as well as through sensors perceiving the environment. Schmidhuber [15] also states that the network's behavior may require long causal chains of computational stages, where each stage transforms, typically in a non-linear manner, the aggregate activation of the network.

Deep learning uses the neural network concept to operate. The two look similar, since deep learning applies the neural network concept of neurons and hidden layers, but there are slight differences between them. Firstly, deep learning has more hidden layers than a conventional neural network, which has only one or two hidden layers. Secondly, deep learning can be trained in both an unsupervised and a supervised manner, for both unsupervised and supervised learning tasks [18-19].

B. Backpropagation

The backpropagation procedure takes place after the feedforward procedure. At the end of the feedforward procedure, the network produces an actual value, which is then compared with the expected value. This comparison yields an error value. The error drives the backpropagation procedure, in which the connection weights in the network are adjusted, working backwards from the output layer to the hidden layers, until the correct output, or an output close to the expected output, is produced.

However, several problems may arise when using backpropagation. One is that it requires differentiability in order to compute the gradient [20]. Therefore, backpropagation cannot handle discontinuous node transfer functions or discontinuous optimality criteria. The other problem is scaling [20]: backpropagation works efficiently on simple training problems, but as problem complexity increases, its performance may fall off rapidly. Backpropagation does have the advantage that it can escape local minima. Unfortunately, it leaves them without knowing whether the next one it finds is better or worse, so it ends up bouncing between local minima without much improvement, which makes for a very slow training rate.

C. Deep Learning in Bioinformatics

As mentioned before, deep learning is a powerful approach for learning complex patterns at multiple layers. Because of this characteristic, it can also be used to capture multiple levels of data abstraction and processing within cells. This makes deep learning suitable for genomics studies, as mentioned by Park and Kellis [21], and for pathway analysis.

Alipanahi et al. [22] used a deep learning strategy in their genomics research to calculate protein-nucleic acid interactions from diverse experimental datasets. They found that the deep learning method performed better than other state-of-the-art methods. Furthermore, Alipanahi et al. incorporated deep learning into their own algorithm, called DeepBind. This method allows the binding affinity of a protein to a DNA or RNA sequence to be predicted in two steps: a convolution module for representation learning and a prediction module for feature combination. The method is illustrated in Fig. 2.

Fig. 2 Deep convolutional neural network designed by Alipanahi et al. [22].

III. OPTIMIZATION

Optimization algorithms have been widely used in research fields such as bioinformatics, biotechnology and computational biology, in many applications, to solve a variety of problems. Optimization, as
stated by Banga [23], is an algorithm that aims to make a design or system as efficient and functional as possible. Optimization algorithms are not a new concept; they are widely used to help researchers analyze, enhance, encode and optimize many kinds of biological data.

Therefore, optimization algorithms can be considered as a way to support deep learning in achieving better results. Because optimization algorithms offer advantages such as great algorithmic flexibility and a reduced computational burden [24], they can be of great help in training large networks and in covering the drawbacks of deep learning. In this section, the optimization algorithm called Differential Search Algorithm (DSA) is discussed briefly.

A. Differential Search Algorithm (DSA)

The differential search algorithm (DSA) is a relatively new meta-heuristic algorithm proposed by Civicioglu [25]. It is inspired by the Brownian-like random walk movement that organisms use to migrate [26]. DSA works in a way similar to the behavior of organisms that move away from a habitat with low food capacity to a new habitat with more food capacity. They stay in the new habitat for a time, until they find another habitat with more food capacity than the current one. In DSA, the search space is pictured as the food area or habitat, while each point in the search space is pictured as a position in an organism's migration. The main goal of DSA, as stated in [26], is to find the global optimal solution of the problem. The standard DSA pseudocode is given in Fig. 3.

Fig. 3 DSA pseudo code [26].

DSA involves four main steps. The first is initialization: minimum and maximum bounds are prescribed to initialize the organisms, which use an NP*D-dimensional parameter vector. Second, a Brownian-like random walk model is used to generate the stopover vectors between organisms. The third step is the search of the stopover site, which is calculated by an organism of the superorganism. The last is the selection step, whose purpose is to choose the next population between the stopover sites and the organism population.

B. DSA for Proteomic Analysis Pipeline

As mentioned before, DSA has been widely used in bioinformatics and many other research fields. One application of DSA can be seen in the work of Xie et al. [27]. They state that electron transfer dissociation (ETD) is exceptionally helpful for peptide fragmentation in mass spectrometry. Unfortunately, ETD spectra usually receive low scores in the identification of 2+ ions. To solve this problem, Xie et al. proposed a new method that combines an ion charge enhancing method with DSA. Using this new method, they observed that the complementary identification results showed great improvement in ETD identification.

C. Comparative study of optimization algorithms

Optimization plays an important role in solving a variety of problems in many applications and research fields. It has already gained wide attention in fields such as biotechnology, bioinformatics and computational biology. Over the past few years, researchers have used many other optimization algorithms to solve their research problems.

In this section, a comparison is made between several optimization algorithms: ABC, GA, FA and DSA. The comparison is shown in Table 1.

TABLE 1. COMPARISON TABLE FOR OPTIMIZATION ALGORITHMS

Algorithm | Reference(s) | Advantage(s) | Disadvantage(s)
Artificial Bee Colony (ABC) | [28] | Good at global exploration | Poor in the exploitation process
Genetic Algorithm (GA) | [28-29] | Has ability for global search; strong robustness | Lower convergence rate; easily trapped at local optima
Firefly Algorithm (FA) | [30] | Suitable for high-dimensional and non-linear problems | Difficult to reach the optimal solution within reasonable time

TABLE 1 (continued)

Algorithm | Reference(s) | Advantage(s) | Disadvantage(s)
Differential Search Algorithm (DSA) | [10-11], [25] | Unique crossover and mutation operators; more powerful compared to PSO, ABC, DE and GSA | Lack of a strategy, which may affect its local search ability

IV. IMPLEMENTATION OF DSA IN DEEP LEARNING METHOD

As mentioned in [31], there are three main reasons behind the popularity and success of deep learning: drastically increased chip processing abilities, the significantly lowered cost of computing hardware, and recent advances in machine learning and information processing research. In addition, advantages such as the generative nature of the model and an unsupervised pre-training step [31] are key to the success of deep learning.

However, although deep learning offers a powerful and efficient approach to a variety of problems, several limitations remain. The main problems are overfitting and becoming stuck at local minima. Once these problems occur, they result in lower performance and high computational time.

Therefore, an optimization algorithm such as the differential search algorithm (DSA) can be considered to help cover the limitations of deep learning. DSA offers several advantages: it is good at finding the optimal solution, good at exploring the search space and, most notably, good at locating the region of the global minimum. This last advantage is very useful in helping deep learning with its tendency to become trapped at local minima. As mentioned in [32], a global optimization technique is required in order to avoid the local minima problem. Once the local minima problem has been solved, the performance of the deep learning method may improve and its computational time may decrease.

V. IMPLEMENTATION OF PROPOSED METHOD IN BIOINFORMATICS FIELD

Recently, bio-based chemical products such as xylitol have been in the spotlight because of their major contributions and advantages in both the pharmaceutical and food industries. Xylitol has been widely used as a sugar substitute in the pharmaceutical industry to help diabetic patients, as it does not increase their blood sugar level or insulin response. In the food industry, xylitol has been widely used in sugar-free chewing gum, and it is believed that chewing sugar-free gum containing xylitol can prevent tooth decay. Because of these advantages of xylitol, researchers have started to show an interest in how to produce more xylitol within a short time.

Accordingly, powerful approaches are needed to manipulate and simulate microbial hosts such as E. coli to produce more xylitol. The method proposed in this paper, the integration of DSA into deep learning (DL), can be used in this case. It can be used as a means to predict which particular genes or pathways in E. coli may affect the production of xylitol. In this way, analysis and simulation of the E. coli model can be done within a short time, since computational methods provide fast results compared with wet experiments in a laboratory setting.

VI. CONCLUSION

High demand from the bioinformatics field for advanced computational technologies and tools that help researchers manipulate, analyze and extract the knowledge encoded in biological data has motivated many computational biologists and computer scientists to develop software and tools suited to the current situation. Deep learning has become one of the artificial intelligence methods that offers many advantages, and it has already made major advances by solving many problems that troubled artificial intelligence for many years. In this paper, deep learning, its fundamentals and its involvement in bioinformatics have been reviewed. However, despite its popularity and advantages, deep learning still suffers from several limitations, such as becoming stuck at local minima, lower performance and high computational time [7-9].

An optimization algorithm is needed in this situation to assist deep learning in obtaining better results. A global optimization technique is required in order to avoid the local minima problem [27]. Therefore, this paper has focused on the global optimization technique called DSA. A brief review of DSA and its involvement in bioinformatics has been given. With advantages such as being good at locating the region of the global minimum, good at finding the optimal solution and good at exploring the search space, DSA can be considered as an algorithm to implement in deep learning to cover its limitations. With the implementation of DSA in deep learning, it is hoped that better results can be achieved.

For future work, it is hoped that more researchers, computational biologists and computer scientists will propose hybrid methods combining deep learning and optimization algorithms in order to obtain better results. As mentioned by Mohamad et al. [28,31], hybrid methods are highly recommended over filter methods for producing better results.
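One possible shape of such a hybrid is sketched below in plain Python: a simplified, DSA-style population search that tunes the weights of a very small network instead of backpropagation. This is a sketch under stated assumptions, not the reference DSA and not a method evaluated in this review. The function `dsa_minimize`, the gamma-based step size, the 0.5 participation probability, the re-sampling rule at the bounds, the mapping of that rule to the third step, and the 2-2-1 XOR network are all illustrative choices.

```python
import math
import random

def dsa_minimize(objective, dim, low, high, pop_size=20, generations=200, seed=0):
    """Simplified DSA-style search (illustrative; not the reference implementation)."""
    rng = random.Random(seed)
    # Step 1, initialization: pop_size organisms drawn uniformly inside the bounds.
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    fit = [objective(p) for p in pop]
    for _ in range(generations):
        # Step 2, stopover vectors: each organism moves relative to a shuffled donor.
        donors = pop[:]
        rng.shuffle(donors)
        # Assumed step-size rule: gamma-distributed magnitude with a random sign.
        scale = rng.gammavariate(2.0 * rng.random() + 0.01, 1.0) * (rng.random() - rng.random())
        for i in range(pop_size):
            stop = []
            for j in range(dim):
                # Only a random subset of dimensions takes part in the move.
                if rng.random() < 0.5:
                    v = pop[i][j] + scale * (donors[i][j] - pop[i][j])
                else:
                    v = pop[i][j]
                # Step 3 (as interpreted here): sites outside the bounds are re-sampled.
                if v < low or v > high:
                    v = rng.uniform(low, high)
                stop.append(v)
            # Step 4, selection: keep the stopover site only if it is better.
            f = objective(stop)
            if f < fit[i]:
                pop[i], fit[i] = stop, f
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def predict(w, x):
    """2-2-1 network: two tanh hidden units and one sigmoid output; w holds 9 weights."""
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return 1.0 / (1.0 + math.exp(-(w[6] * h0 + w[7] * h1 + w[8])))

def loss(w):
    """Mean squared error over the four XOR patterns."""
    return sum((predict(w, x) - y) ** 2 for x, y in XOR) / len(XOR)

weights, err = dsa_minimize(loss, dim=9, low=-5.0, high=5.0)
print(f"best loss: {err:.4f}")
```

Because the search only evaluates the loss and never computes a gradient, the same loop would also work with non-differentiable transfer functions. The greedy selection step guarantees only that the best loss never gets worse than the best initial organism; it does not guarantee that the global minimum is reached.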

ACKNOWLEDGMENT

We would like to express our appreciation to the Malaysia Ministry of Higher Education for supporting this project under the Fundamental Research Grant Scheme (Project Vot No. 11H26). We would also like to thank the Research Management Center, Universiti Teknologi Malaysia, for managing this project.

REFERENCES

[1] Hapudeniya, M. (2010). Artificial neural networks in bioinformatics. Sri Lanka Journal of Bio-Medical Informatics, 1(2).
[2] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[3] Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E., & Svetnik, V. (2015). Deep neural nets as a method for quantitative structure-activity relationships. J. Chem. Inf. Model., 55, 263-274.
[4] Leung, M. K., Xiong, H. Y., Lee, L. J., & Frey, B. J. (2014). Deep learning of the tissue-regulated splicing code. Bioinformatics, 30, i121-i129.
[5] Xiong, H. Y. et al. (2015). The human splicing code reveals new insights into the genetic determinants of disease. Science, 347(6218).
[6] Helmstaedter, M. et al. (2013). Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature, 500, 168-174.
[7] LeCun, Y., Huang, F. J., & Bottou, L. (2004). Learning methods for generic object recognition with invariance to pose and lighting. In Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on (Vol. 2, pp. II-97). IEEE.
[8] Tirumala, S. S. (2014). Implementation of Evolutionary Algorithms for Deep Architectures. In AIC, 164-171.
[9] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.
[10] Civicioglu, P. (2012). Transforming geocentric Cartesian coordinates to geodetic coordinates by using differential search algorithm. Comput. Geosci., 46, 229-247.
[11] Liu, J., Teo, K., Wang, X., & Wu, C. (2015). An exact penalty function-based differential search algorithm for constrained global optimization. Soft Comput., 1-9.
[12] Deng, L., & Yu, D. (2014). Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing, 7, 3-4.
[13] Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1-127.
[14] Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.
[15] Schmidhuber, J. (2014). Deep Learning in Neural Networks: An Overview. Neural Networks, 61, 85-117.
[16] Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.
[17] Exploring Deep Learning & CNNs - RSIP Vision. (2015). Retrieved June 20, 2016, from http://www.rsipvision.com/exploring-deep-learning/
[18] Erhan, D., Courville, A., Bengio, Y., & Vincent, P. (2010). Why Does Unsupervised Pre-training Help Deep Learning? 9, 201-208.
[19] Hinton, G., & Salakhutdinov, R. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313, 504-507.
[20] Montana, D. J., & Davis, L. (1989). Training Feedforward Neural Networks Using Genetic Algorithms. In IJCAI 89, 762-767.
[21] Park, Y., & Kellis, M. (2015). Deep learning for regulatory genomics. Nature Biotechnology, 33(8), 825-826.
[22] Alipanahi, B., Delong, A., Weirauch, M., & Frey, B. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8), 831-838.
[23] Banga, J. R. (2008). Optimization in computational systems biology. BMC Systems Biology, 2(1), 1.
[24] Lamos-Sweeney, J. (2012). Deep learning using genetic algorithms. 3066-3072.
[25] Civicioglu, P. (2012). Transforming geocentric Cartesian coordinates to geodetic coordinates by using differential search algorithm. Comput. Geosci., 46, 229-247.
[26] Liu, B. (2014). Composite Differential Search Algorithm. Journal of Applied Mathematics, 2014.
[27] Xie, L. Q., Shen, C. P., Liu, M. B., Chen, Z. D., Du, R. Y., Yan, G. Q., ... & Yang, P. Y. (2012). Improved proteomic analysis pipeline for LC-ETD-MS/MS using charge enhancing methods. Molecular BioSystems, 8(10), 2692-2698.
[28] Li, B., Li, Y., & Gong, L. (2014). Protein secondary structure optimization using an improved artificial bee colony algorithm based on AB off-lattice model. Engineering Applications of Artificial Intelligence, 27, 70-79.
[29] Pond, S. L. K., Posada, D., Gravenor, M. B., Woelk, C. H., & Frost, S. D. (2006). GARD: a genetic algorithm for recombination detection. Bioinformatics, 22(24), 3096-3098.
[30] Ali, N., Othman, A. Z., Husain, M. N., & Misran, H. M. (2014). A review of firefly algorithm. Engineering and Applied Sciences, 9(10), 1-5.
[31] Deng, L. (2012). Three classes of deep learning architectures and their applications: a tutorial survey. APSIPA Transactions on Signal and Information Processing.
[32] Dai, C., Chen, W., & Zhu, Y. (2010). Seeker optimization algorithm for digital IIR filter design. IEEE Transactions on Industrial Electronics, 57(5), 1710-1718.
[33] Mohamad, M. S., Omatu, S., Deris, S., Yoshioka, M., Abdullah, A., & Ibrahim, Z. (2013). An enhancement of binary particle swarm optimization for gene selection in classifying cancer classes. Algorithms for Molecular Biology, 8(1), 1.
