
Name: Vikiron Mondal

Stream: CSE (AI & ML)


Roll: 13030822083
Registration No: 221300110327 OF 2022-23
Year: 3rd
Semester: 6th
Subject Name: Deep Learning
Subject Code: PCCAIML602
Topic: Activation Function in Neural Networks
Introduction to Neural Networks
Neural networks are computational models inspired by the human brain. They
learn from data to make predictions.

Activation functions are crucial. They introduce non-linearity, allowing networks to learn complex patterns.

Data Input: Data flows into the network.
Hidden Layers: Data gets transformed.
Output: A prediction is made.
What are Activation Functions?
Activation functions decide whether a neuron should be activated. They introduce non-linearity, enabling networks to learn complex relationships in data.

1. Neuron Activation: Controls neuron output.
2. Non-Linearity: Enables complex learning.
3. Data Transformation: Maps input to output.
Why Activation Functions Matter
Without activation functions, neural networks would essentially be linear regression models. This severely limits their
ability to learn and model complex data patterns.

Activation functions enable networks to approximate any continuous function, allowing them to learn intricate patterns
and relationships within data that linear models simply cannot capture. They introduce the necessary non-linearity for
deep learning.

Pattern Recognition
Activation functions play a critical role in extracting key data features. They allow neurons to selectively respond to specific input patterns, enabling the network to identify and learn complex features.

Complexity Handling
Activation functions enable neural networks to solve non-linear problems. Many real-world problems are inherently non-linear, and without activation functions, neural networks would be unable to model these complexities effectively.
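To make the point above concrete, here is a minimal NumPy sketch (not from the original slides; the weights and input are arbitrary illustrative values) showing that stacking layers without an activation function collapses into a single linear map, while inserting a ReLU does not.

```python
import numpy as np

# Two "layers" with no activation function: their composition is itself
# a single linear map, so extra depth adds no expressive power.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # weights of layer 1 (illustrative values)
W2 = rng.normal(size=(2, 4))   # weights of layer 2
x = rng.normal(size=3)         # an arbitrary input vector

two_linear_layers = W2 @ (W1 @ x)
single_linear_layer = (W2 @ W1) @ x                          # the same map as one matrix
print(np.allclose(two_linear_layers, single_linear_layer))   # True

# Adding a non-linearity (here ReLU) breaks this equivalence, which is
# what allows deeper networks to model non-linear patterns.
relu = lambda z: np.maximum(0.0, z)
nonlinear_output = W2 @ relu(W1 @ x)
print(np.allclose(nonlinear_output, single_linear_layer))    # generally False
```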
Types of Activation Functions
Various activation functions exist, each with unique characteristics. These functions play a critical role in determining the output of a neuron and
influencing the network's ability to learn complex patterns.

Common types: Sigmoid, Tanh, ReLU, and Softmax. Each of these activation functions has its own strengths and weaknesses, making them
suitable for different types of neural network architectures and tasks.

Linear
A basic activation function that simply outputs the input value. It doesn't introduce non-linearity, limiting the network's ability to learn complex patterns.

Sigmoid
Outputs values between 0 and 1, making it suitable for binary classification problems. However, it can suffer from vanishing gradients.

Tanh
Similar to Sigmoid but outputs values between -1 and 1. It often converges faster than Sigmoid but can still suffer from vanishing gradients.

ReLU
Outputs the input directly if it is positive; otherwise, it outputs zero. It helps to alleviate the vanishing gradient problem and is computationally efficient.

Softmax
Converts a vector of numbers into a vector of probabilities, where the probability of each value is proportional to its relative scale within the vector. It is commonly used in the final layer of a classification neural network.
Sigmoid Activation Function
The Sigmoid activation function is a classic choice in neural networks, outputting values between 0 and 1. This makes it
particularly useful in scenarios where probabilities are required, such as in binary classification problems.

However, the Sigmoid function is prone to vanishing gradients, especially when dealing with very high or very low input
values. This limitation restricts its effectiveness in deep networks, where gradients need to propagate through multiple
layers.

Formula
σ(x) = 1 / (1 + e^(-x))

Pros
Output is easy to interpret as probabilities, making it suitable for binary classification. The function is also continuous and differentiable, which is essential for gradient-based learning methods.

Cons
Vanishing gradient problems can hinder learning, especially in deep networks. The output is not zero-centered, which can slow down learning. Also, the exponential computation can be relatively slow.
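As a quick reference, here is a minimal NumPy sketch of the Sigmoid function (not part of the original slides; the sample inputs are arbitrary).

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation: squashes any real-valued input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid(x))   # close to 0 for large negative inputs, close to 1 for large positive ones

# The derivative is sigmoid(x) * (1 - sigmoid(x)), which approaches 0 at both
# extremes; this saturation is the source of the vanishing-gradient issue noted above.
print(sigmoid(x) * (1 - sigmoid(x)))
```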
Tanh Activation Function
The Tanh (Hyperbolic Tangent) activation function is another popular choice in neural networks, similar to Sigmoid but
with an output range between -1 and 1. This makes it naturally zero-centered, which can help accelerate learning in some
cases.

While Tanh addresses some of the vanishing gradient issues present in Sigmoid, it's not entirely immune, particularly with
very deep networks or extreme input values.

Formula

tanh(x) = (e^x - e^-x) / (e^x + e^-x)

Range
Output values are bounded between -1 and 1, providing a clear range for activation.

Pros
The zero-centered output can lead to faster convergence during training compared to Sigmoid.

Cons
Like Sigmoid, Tanh can still suffer from vanishing gradients, especially in deep networks.
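A minimal NumPy sketch of Tanh for comparison (again, not from the original slides; the sample inputs are arbitrary).

```python
import numpy as np

def tanh(x):
    """Tanh activation: squashes inputs into (-1, 1) and is centered at zero."""
    return np.tanh(x)   # equivalent to (e^x - e^-x) / (e^x + e^-x)

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(tanh(x))   # zero-centered outputs

# Tanh is a rescaled Sigmoid: tanh(x) = 2 * sigmoid(2x) - 1, so it saturates
# (and its gradient vanishes) at the extremes in the same way.
```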
ReLU and Leaky ReLU
ReLU outputs the input directly if it is positive, otherwise zero. Leaky ReLU addresses the "dying ReLU" problem by allowing a small, non-zero slope for negative inputs.

1. ReLU: Simple and fast.
2. Dying ReLU: Neurons stuck outputting zero can stop learning.
3. Leaky ReLU: Avoids the dying problem.
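Here is a minimal NumPy sketch of both variants (not from the original slides; the slope value alpha = 0.01 is a common default, chosen here only for illustration).

```python
import numpy as np

def relu(x):
    """ReLU: passes positive inputs through unchanged and zeroes out negatives."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: like ReLU, but keeps a small slope (alpha) for negative inputs
    so the neuron still receives a gradient and does not 'die'."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(relu(x))         # negatives become 0:           [0, 0, 0, 0.5, 3]
print(leaky_relu(x))   # negatives keep a small slope: [-0.03, -0.005, 0, 0.5, 3]
```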
Softmax Activation Function
Softmax converts raw scores into a probability distribution.

It's used in the output layer for multi-class classification problems.

1. Probability Distribution: Raw scores become probabilities that sum to 1.
2. Multi-Class: Each class receives its own probability.
3. Output Layer: Applied at the final layer of the network.
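A minimal, numerically stable NumPy sketch of Softmax (not from the original slides; subtracting the maximum score before exponentiating is a standard stability trick, and the logits below are arbitrary).

```python
import numpy as np

def softmax(scores):
    """Softmax: converts a vector of raw scores (logits) into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

logits = np.array([2.0, 1.0, 0.1])   # illustrative raw scores for 3 classes
probs = softmax(logits)
print(probs)          # approximately [0.659, 0.242, 0.099]
print(probs.sum())    # 1.0
```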
Comparison and Conclusion
Each activation function has its strengths and weaknesses.

Choose based on the specific task and network architecture. Experimentation is vital.

Function   Use Case                Pros                       Cons
Sigmoid    Binary classification   Easy to interpret          Vanishing gradients
Tanh       Hidden layers           Centered output            Gradient issues
ReLU       Hidden layers           Fast                       Dying ReLU
Softmax    Multi-class output      Probability distribution   None
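To tie the table together, here is a minimal sketch of a common pairing: ReLU in the hidden layer and Softmax at the multi-class output. It is only an illustration; the layer sizes, weights, and input are arbitrary values, not taken from the slides.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# A tiny two-layer forward pass with arbitrary illustrative weights.
rng = np.random.default_rng(1)
W_hidden, b_hidden = rng.normal(size=(8, 4)), np.zeros(8)   # 4 features -> 8 hidden units
W_out, b_out = rng.normal(size=(3, 8)), np.zeros(3)         # 8 hidden units -> 3 classes

x = rng.normal(size=4)                          # one input example with 4 features
hidden = relu(W_hidden @ x + b_hidden)          # non-linear hidden representation
class_probs = softmax(W_out @ hidden + b_out)   # probabilities over the 3 classes
print(class_probs, class_probs.sum())           # sums to 1
```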
