Study of Ensemble of Activation Functions in Deep Learning

Social media is a platform for people around the world to discuss their issues and opinions. Before knowing the actual aspects of social media, we need to know what we mean by social media. Social media is a term that describes the interaction between groups or individuals who create, share, and sometimes exchange ideas, images, videos, etc. on the Internet or in virtual communities. Children grow up surrounded by mobile devices and interactive social networking sites such as Twitter, MySpace, F

Uploaded by

sangeetasharma.cs

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

7 views

Study of Ensemble of Activation Functions in Deep Learning

Uploaded by

sangeetasharma.cs

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 10

Study of ensemble of activation functions in Deep

learning

Name:Rehan Khan

Roll:19EARCS095

ABSTRACT
Artificial neural networks (ANN), generally known as neural networks, are a
type of machine learning algorithms that had been developed in part to mimic
the real architecture of the human brain. Because they can approximate
complex functions from data, neural networks are naturally strong. The
transdisciplinary fields of image recognition, voice recognition, natural
language processing, and others have been impacted by this generalization
capability. In neural networks, activation functions are a crucial component.
They explain a network node's output in terms of a set of inputs. a simplistic
introduction to deep neural networks, a summary of what activation functions
are and how they are used in neural networks, a look at their most prevalent
characteristics, a look at the various types of activation functions, a look at
some of the problems, constraints, and potential alternatives that activation
functions face, and a conclusion.

1.INTRODUCTION

Using input data to classify and predict outcomes, neural networks are multi-
layered networks of neurons made up of nodes. There are three layers to it: an
input layer, a secret layers, and an output layer. Information is processed from
one layer to the next using nodes, each of which has a weight that is
considered into account.
Activation functions make the neural network non-linear i.e., the output
depends linearly on the input features. Although linear equations are simple
and easy to comprehend, their complexity is constrained, and they are no
longer able to analyse and identify complicated mappings from the data.
Without an activation function, a neural network frequently performs and also
has limited capacity as a linear regression model. A neural network is
anticipated to handle more complex tasks than just modelling complex sorts of
information such as images, videos, audio, voice, and text, in addition to
learning and computing a linear characteristic. To simplify complicated, high-
dimensional, and nonlinear data sets with several hidden layers for extracting
knowledge and meaningful information, we apply activation function
capabilities and artificial neural network.
In simple terms, the activation function decides whether a neuron in a neural
network should be activated or not, which is like to the working of our brain.
These functions play an important role in back propagation that is used to
update the weights and biases in a neural network.

2.LIST OF ACTIVATION FUNCTIONS

2.1 Linear:

The input is proportional to activation in the linear activation function,

sometimes referred to as "no activation" or the "identity function" (multiplied
by x1). The function just passes the value it was provided, doing nothing to the
weighted sum of the input.
2.2 Sigmoid:

Sigmoid transforms the values between the range 0 and 1. It is often used in
models when the output must be a probability prediction. Sigmoid is the best
option due to its range because probability only occurs between a range of 0
and 1. The gradient is smooth and the function is differentiable.

2.3 Tanh:
The tanh function, also known as the hyperbolic tangent, has gained greater
traction than the sigmoid function because it often provides superior training
results for multi-layer neural networks. All of the sigmoid function's beneficial
characteristics are inherited by the tanh function. A formulation of the tanh
function is:

2.4 ReLu:

ReLU has a derivative function and enables for backpropagation while still
being computationally efficient, while giving the impression of being a linear

function. The fact that not all of the neurons are activated simultaneously by
the ReLU function is crucial in this situation. Only if the output of the linear
transformation is less than 0 will the neurons become inactive. The ReLU
function is far more computationally efficient than sigmoid and tanh functions
since it only activates a subset of neurons. Due to ReLU's linear, non-saturating
nature, gradient descent's convergence towards the loss function's global

minimum is sped up.

2.5: Leaky ReLU:

Leaky ReLU has the same benefits as ReLU, and it also allows backpropagation
even for negative input values. This small adjustment ensures that the gradient
of the graph's left side is non-zero for negative input values. As a result, we
wouldn't see any dead neurons there anymore. Here is the Leaky ReLU
function's derivative.

2.6 Swish:
This function is constrained below but unconstrained above, meaning
that as X approaches negative infinity, Y approaches a constant value, but
as X approaches infinity, Y approaches infinity. Swish is a smooth

function, thus it doesn't change course suddenly like ReLU does close to x
= 0. Instead, it gently curves higher from 0 towards values 0, then
downward.
Due to the swish function's non-monotonic nature, the expression of the
input data and weight that may be learned is improved. It has the
following mathematical representation:

2.7 Softmax Function:

The sigmoid function's output fell between 0 and 1, which is comparable
to probability. Assume we have five output values that are, successively,
0.8, 0.9, 0.7, 0.8, and 0.6. How do we go with it? A concatenation of
several sigmoids is how the Softmax function is defined. The relative
probability are computed. The softmax function, like the sigmoid/logistic
activation function, yields the likelihood of each class in the neural
network. In the case of multi-class classification, it is most frequently
utilized as an activation function for the last layer of the neural network.
It has the following mathematical representation:

2.8 Mish:
It shares the swish-like quality of being unbounded above and bounded below.
Due of its non-monotonic nature, it offers advantages comparable to those of
swoosh. It has self-gates as well. Here, the gate is tanh(x)) rather than
(x).However, the gate's function remains the same. The actual cause of the
performance difference between swish and mish is this variation in the gate. It
has infinite order and is continuously differentiable. It belongs to the C class,

whereas ReLU belongs to the C0 class. For an activation function, this high
degree of differentiability is particularly beneficial.

3 Challenge faced by activation function

The vanishing gradient problem and the dead neuron problem are the two
main difficulties that commonly employed activation functions must over-
come. Both topics are covered in this section.

3.1 Problem with vanishing gradients

As backpropagation advances further into the network, this issue arises when
the gradient values approach zero, causing the weights to become saturated
and improperly updated. As a result, the loss stops dropping, and the network
is unable to complete its training. The vanishing gradient problem is the name
given to this issue. Saturated neurons are those whose weights have not been

adequately updated.
3.2 Death of a neuron Problem

As it has been previously mentioned, activation functions have an output value

that is between 0 and 1. When the value is almost 0, the relevant neurons are
made to become inactive and stop contributing to the output. Furthermore,
the weights might be changed in a way that makes a significant chunk of the
network's weighted total equal to zero. This situation may require the
deactivation of a significant amount of the input, which would impair network
performance in an unfixable way. As a result, these neurons that have forcibly
turned off are referred to as "dead neurons," and the issue is called the "dead
neuron problem."

4. CONCLUSION
Activation functions suffer from many issues like vanishing gradients,
exploding gradients, local oscillation, etc. Better performing activation functions are
those which are highly differentiable and vary within limits. Based on the use case, it
is beneficial to find optima and different functions give different results based on their
nature.
Many different activation functions have been invented that deal with the
aforementioned problems and still provide nonlinearity to the neural network model
at hand. The functions inside a model need to be ordered in such a way that the effect
of one does not get nullified by the next function in line. Fine-tuning parameters like
the number of neurons in a hidden layer and using a dropout and learning rate helps
achieve the desired results in addition to the activation function selection.
Sigmoid function and its variations usually work best in classification problems.
When in doubt, the ReLU function provides better results in most cases and is thus
widely used in deep learning networks. While designing your own activation
function, it is important to keep in mind that it would be used in the backpropagation
of errors and weights hence its effectiveness must be studied on different kinds of
datasets.

References:
1. Saha, Snehanshu & Mathur, Archana & Bora, Kakoli & Agrawal, Surbhi &
Basak, Suryoday. (2018). SBAF: A New Activation Function for Artificial
Neural Net based Habitability Classification.
2. https://www.v7labs.com/blog/neural-networks-activation-functions
3. https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-
5281e2f5 6035
4. https://iq.opengenus.org/architecture-of-densenet121/
5. Wang Y, Li Y, Song Y, Rong X. The Influence of the Activation Function in a
Convolution Neural Network Model of Facial Expression Recognition.
Applied Sciences. 2020; 10(5):1897. https://doi.org/10.3390/app10051897

6. www.v7labs.com

7. Medium.com

8. Towardsdatascience.com

9. Paperswithcode.com
10. Parisi, Luca & Ma, Renfei & RaviChandran, Narrendar & Lanzillotta, Matteo.
(2021). hyper-sinh: An accurate and reliable function from shallow to deep
learning in TensorFlow and Keras. Machine Learning with Applications.
100112. 10.1016/j.mlwa.2021.100112.

11. Albert Gidon, Timothy Adam Zolnik, Pawel Fidzinski, Felix Bolduan, Athanasia
Papoutsi, Panayiota Poirazi, Martin Holtkamp, Imre Vida, and Matthew Evan
Larkum. Dendritic action potentials and computation in human layer 2/3 cortical
neurons. Science, 367(6473):83–87, 2020

Cell-Dyn 3700 - Operations Manual
100% (3)
Cell-Dyn 3700 - Operations Manual
794 pages
MKR056 - MKR050 Info PDF
No ratings yet
MKR056 - MKR050 Info PDF
9 pages
Activation Function
No ratings yet
Activation Function
43 pages
Need and Use of Activation Functions in Anndeep Learning
No ratings yet
Need and Use of Activation Functions in Anndeep Learning
7 pages
Activation
No ratings yet
Activation
7 pages
4 4 Choosing The Right Activation Function For Neural Networks
No ratings yet
4 4 Choosing The Right Activation Function For Neural Networks
25 pages
Performance Analysis of Various Activation Functio
No ratings yet
Performance Analysis of Various Activation Functio
7 pages
Deep Learning: International Islamic University of Chittagong
No ratings yet
Deep Learning: International Islamic University of Chittagong
31 pages
Activation Function
No ratings yet
Activation Function
36 pages
Functii de Activare1
No ratings yet
Functii de Activare1
89 pages
UNIT-III Activation-function
No ratings yet
UNIT-III Activation-function
6 pages
Types of Neural Network Activation Functions_ How to Choose_ (1)
No ratings yet
Types of Neural Network Activation Functions_ How to Choose_ (1)
36 pages
ML_Lec-22
No ratings yet
ML_Lec-22
25 pages
Unit 3 Deep Learning
No ratings yet
Unit 3 Deep Learning
11 pages
Activation Functions - Ipynb - Colaboratory
No ratings yet
Activation Functions - Ipynb - Colaboratory
10 pages
Activation Functions
No ratings yet
Activation Functions
8 pages
Act_Fun
No ratings yet
Act_Fun
7 pages
Module1 - Upto Loss Function
No ratings yet
Module1 - Upto Loss Function
137 pages
5 TH
No ratings yet
5 TH
22 pages
Lecture 2.1.2activation Function
No ratings yet
Lecture 2.1.2activation Function
15 pages
Fundamentals Deep Learning Activation Functions When To Use Them
No ratings yet
Fundamentals Deep Learning Activation Functions When To Use Them
15 pages
Activation Function in NN
No ratings yet
Activation Function in NN
29 pages
Unit 5 Activation Function
No ratings yet
Unit 5 Activation Function
15 pages
Perceptron in Machine Learning
No ratings yet
Perceptron in Machine Learning
11 pages
7 Types of Neural Network Activation Functions
No ratings yet
7 Types of Neural Network Activation Functions
16 pages
Mod 2.3 - Activation Function, Loss Functions
No ratings yet
Mod 2.3 - Activation Function, Loss Functions
12 pages
Activation Functions in Neural Networks - 241102 - 224129
No ratings yet
Activation Functions in Neural Networks - 241102 - 224129
7 pages
Ad3451 Ml Unit 4 Notes
No ratings yet
Ad3451 Ml Unit 4 Notes
34 pages
Module1
No ratings yet
Module1
124 pages
Neural Network example and Activation Functions Summary
No ratings yet
Neural Network example and Activation Functions Summary
2 pages
4 - Activation Functions in Neural Networks
No ratings yet
4 - Activation Functions in Neural Networks
12 pages
Pr1_ANN_Writeup.docx
No ratings yet
Pr1_ANN_Writeup.docx
7 pages
Lect 5- Non Linear Activation Functions
No ratings yet
Lect 5- Non Linear Activation Functions
41 pages
Activation Functions and Their Characteristics in Deep Neural Networks
No ratings yet
Activation Functions and Their Characteristics in Deep Neural Networks
6 pages
Mod 2.3 - Activation Function
No ratings yet
Mod 2.3 - Activation Function
9 pages
3. Activation Function
No ratings yet
3. Activation Function
14 pages
Artificial Neural Networks(ANN)
No ratings yet
Artificial Neural Networks(ANN)
67 pages
ML Mentorship Prahitha Movva V1
No ratings yet
ML Mentorship Prahitha Movva V1
5 pages
lecture 9-NN- modified
No ratings yet
lecture 9-NN- modified
94 pages
SoftComp 02
No ratings yet
SoftComp 02
33 pages
Activation Function
No ratings yet
Activation Function
9 pages
Unit 2b
No ratings yet
Unit 2b
11 pages
Unit Iv
No ratings yet
Unit Iv
34 pages
Activation Functions
No ratings yet
Activation Functions
9 pages
Activation Function
No ratings yet
Activation Function
44 pages
DL Answers
No ratings yet
DL Answers
24 pages
4-Neural Networks and Activation Function
No ratings yet
4-Neural Networks and Activation Function
28 pages
NN unit_1
No ratings yet
NN unit_1
27 pages
Activation functions 2
No ratings yet
Activation functions 2
5 pages
Forward_and_Backward_Propagation_Deep_Learning_1703697260
No ratings yet
Forward_and_Backward_Propagation_Deep_Learning_1703697260
9 pages
Activation Function
No ratings yet
Activation Function
4 pages
Deep Learning Tutorial 3
No ratings yet
Deep Learning Tutorial 3
12 pages
Activation Function
No ratings yet
Activation Function
31 pages
Lec08-1Activation Functions
No ratings yet
Lec08-1Activation Functions
19 pages
Deep Learning
No ratings yet
Deep Learning
10 pages
activatn fn 2
No ratings yet
activatn fn 2
10 pages
Artificial Neural Network Notes
No ratings yet
Artificial Neural Network Notes
9 pages
activation fn
No ratings yet
activation fn
15 pages
UNIT II DNN
No ratings yet
UNIT II DNN
24 pages
Activation F
No ratings yet
Activation F
4 pages
3-Activation Function, Loss Function-24-07-2024
No ratings yet
3-Activation Function, Loss Function-24-07-2024
19 pages
Feedforward Neural Networks: Fundamentals and Applications for The Architecture of Thinking Machines and Neural Webs
From Everand
Feedforward Neural Networks: Fundamentals and Applications for The Architecture of Thinking Machines and Neural Webs
Fouad Sabry
No ratings yet
Wiggins Vr300 Non Pressurising Ultra Fast Fil Diesel Refuelling Systems
No ratings yet
Wiggins Vr300 Non Pressurising Ultra Fast Fil Diesel Refuelling Systems
3 pages
Department of Education: Bids and Awards Committee (BAC) Members
No ratings yet
Department of Education: Bids and Awards Committee (BAC) Members
3 pages
Concrete Mixed Design For 60MPa
No ratings yet
Concrete Mixed Design For 60MPa
1 page
ATI Flash Recovery
No ratings yet
ATI Flash Recovery
3 pages
PPE Quiz 1
No ratings yet
PPE Quiz 1
1 page
Task1 Vet ChowKokKeong3
No ratings yet
Task1 Vet ChowKokKeong3
26 pages
2025 Call for Application Form
No ratings yet
2025 Call for Application Form
3 pages
GC 2024 05 19
No ratings yet
GC 2024 05 19
4 pages
Lean & Six Sigma For Clinical Laboratory by DR Annabel DSouza Sekar
No ratings yet
Lean & Six Sigma For Clinical Laboratory by DR Annabel DSouza Sekar
27 pages
RS-485 2X227 AWG SFUTP PVC - 9FY7F1V129 - V - 1 - R - 1
No ratings yet
RS-485 2X227 AWG SFUTP PVC - 9FY7F1V129 - V - 1 - R - 1
2 pages
Lift Control SLC4 Information For The Expert I Subranges of The Main Card of Central Unit - AZE0
No ratings yet
Lift Control SLC4 Information For The Expert I Subranges of The Main Card of Central Unit - AZE0
6 pages
Project Report 2004 Solar: A Solar System Simulator Author: Sam Morris Supervisor: Dr. Richard Banach
No ratings yet
Project Report 2004 Solar: A Solar System Simulator Author: Sam Morris Supervisor: Dr. Richard Banach
56 pages
Unable To Save Application Information During Test Application or MER Creation
No ratings yet
Unable To Save Application Information During Test Application or MER Creation
6 pages
ADVANCED FOOTSTEP POWER GENERATION SYSTEM - 4 Steps - Instructables
No ratings yet
ADVANCED FOOTSTEP POWER GENERATION SYSTEM - 4 Steps - Instructables
26 pages
pmt-hps-fte-qualified-switch-firmware-versions (003)
No ratings yet
pmt-hps-fte-qualified-switch-firmware-versions (003)
18 pages
L148N50B规格书 A2
No ratings yet
L148N50B规格书 A2
28 pages
Resume Marjorie Turingan v2021-2
No ratings yet
Resume Marjorie Turingan v2021-2
3 pages
Jaguar 870 840 - Technical Data
No ratings yet
Jaguar 870 840 - Technical Data
2 pages
Cdi5 WK8
No ratings yet
Cdi5 WK8
5 pages
Transit Us A
No ratings yet
Transit Us A
59 pages
4th Sem Detailed Syllabus (B. Sc. in Data Science)
No ratings yet
4th Sem Detailed Syllabus (B. Sc. in Data Science)
5 pages
Subnet Mask Cheat Sheet - NetworkCalc
No ratings yet
Subnet Mask Cheat Sheet - NetworkCalc
1 page
5331 Proje Wep PDF 1
No ratings yet
5331 Proje Wep PDF 1
5 pages
NBR Iec 60439-1-2-3 PDF
90% (20)
NBR Iec 60439-1-2-3 PDF
130 pages
1934-Suket KTP Dki 2
No ratings yet
1934-Suket KTP Dki 2
1 page
java basics-páginas-1
No ratings yet
java basics-páginas-1
26 pages
Rosenerlande Projet
No ratings yet
Rosenerlande Projet
30 pages
Ayal (He) by Rabindranath Tagore
No ratings yet
Ayal (He) by Rabindranath Tagore
210 pages