MACHINE LEARNING COURSE CODE-A8703 MODULE-05
Unit 5
5. 2000s: Resurgence and Deep Learning
Larger datasets, innovative architectures, and greater processing power spurred a
comeback. Deep learning, which stacks many layers of neurons, proved remarkably
effective across a range of disciplines.
6. 2010s-Present: Deep Learning Dominance
Deep learning architectures such as convolutional neural networks (CNNs) and recurrent
neural networks (RNNs) came to dominate machine learning. Breakthroughs in game playing,
image recognition, and natural language processing demonstrated their power.
How does a Neural Network work?
Let’s understand how a neural network works with an example:
1. Consider a neural network for email classification. The input layer takes features such as the
email content, sender information, and subject.
2. These inputs, multiplied by learned weights, pass through the hidden layers. Through training,
the network learns to recognize patterns that indicate whether an email is spam.
The output layer, with a binary activation function, predicts whether the email is spam (1)
or not (0). As the network iteratively refines its weights through backpropagation, it
becomes adept at distinguishing spam from legitimate email, showcasing the
practicality of neural networks in real-world applications such as email filtering; a minimal
code sketch follows.
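To make this concrete, here is a minimal NumPy sketch of the forward pass of such a spam classifier. The feature encoding, weights, and network size are hypothetical illustrations, not values from this module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical features from one email:
# [suspicious-word score, sender reputation, normalized subject length]
x = np.array([0.8, 0.2, 0.5])

# Illustrative weights a trained network might have learned
W1 = np.array([[ 0.9, -0.4,  0.1],    # 2 hidden neurons, 3 inputs
               [-0.3,  0.7,  0.6]])
b1 = np.array([0.1, -0.2])
W2 = np.array([[1.2, -0.8]])           # 1 output neuron, 2 hidden inputs
b2 = np.array([-0.1])

h = sigmoid(W1 @ x + b1)               # hidden layer activations
p_spam = sigmoid(W2 @ h + b2)[0]       # output: probability of spam

print("spam" if p_spam >= 0.5 else "not spam", round(p_spam, 3))
```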
Working of a Neural Network
1. Neural networks are complex systems that mimic some features of how the human brain
functions.
2. A neural network is composed of an input layer, one or more hidden layers, and an output
layer, each made up of interconnected artificial neurons.
3. The basic process has two stages: forward propagation and backpropagation.
Figure 1: Neural Network architecture with various components
Forward Propagation
1. Input Layer: Each feature of the input data is represented by a node in this layer, which
receives the input.
2. Weights and Connections: The weight of each connection indicates how strong the
connection is. These weights are adjusted throughout training.
3. Hidden Layers: Each hidden-layer neuron processes its inputs by multiplying them by
weights, summing them, and passing the result through an activation function. This
introduces non-linearity, enabling the network to recognize intricate patterns.
4. Output: The process repeats layer by layer until the output layer is reached, producing
the final result (sketched in code below).
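As a sketch of steps 1–4, the forward pass can be written as a loop over layers; NumPy, the ReLU activation, and the layer sizes below are illustrative assumptions:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Propagate input x through a list of (weights, bias) pairs."""
    a = x
    for W, b in layers:
        a = relu(W @ a + b)   # weighted sum, then non-linear activation
    return a

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]          # input, two hidden layers, output (illustrative)
layers = [(rng.normal(size=(m, n)) * 0.1, np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

output = forward(rng.normal(size=4), layers)
print(output)
```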
Backpropagation
1. Loss Calculation: The network’s output is compared against the true target values, and a loss
function computes the difference. For a regression problem, the Mean Squared
Error (MSE) is commonly used as the cost function.
Loss Function:
$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$,
where $y_i$ is the true target value, $\hat{y}_i$ is the network’s prediction, and $n$ is the number of samples.
2. Gradient Descent: The network then uses gradient descent to reduce the loss. To lower the
error, each weight is updated based on the derivative of the loss with respect to that weight:
$w \leftarrow w - \eta\,\frac{\partial L}{\partial w}$, where $\eta$ is the learning rate.
3. Adjusting Weights: This iterative process, backpropagation, applies these updates
backward across the network, connection by connection.
4. Training: During training with different data samples, the entire process of forward
propagation, loss calculation, and backpropagation is done iteratively, enabling the network
to adapt and learn patterns from the data.
5. Activation Functions: Activation functions such as the rectified linear unit (ReLU) or
sigmoid introduce non-linearity into the model. Whether a neuron “fires” is determined by
its total weighted input. A worked training loop combining these steps follows.
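Combining the steps above, here is a minimal, self-contained training loop for a single sigmoid neuron with the MSE loss, using plain gradient descent; the dataset (logical AND) and learning rate are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny illustrative dataset: 2 features per sample, binary targets (logical AND)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

rng = np.random.default_rng(1)
w = rng.normal(size=2)
b = 0.0
lr = 0.5                              # learning rate (eta)

for epoch in range(2000):
    # Forward propagation
    y_hat = sigmoid(X @ w + b)
    loss = np.mean((y - y_hat) ** 2)  # MSE loss from the formula above

    # Backpropagation: chain rule through MSE and the sigmoid
    dL_dyhat = 2.0 * (y_hat - y) / len(y)
    dyhat_dz = y_hat * (1.0 - y_hat)  # sigmoid derivative
    delta = dL_dyhat * dyhat_dz
    grad_w = X.T @ delta
    grad_b = delta.sum()

    # Gradient descent: w <- w - lr * dL/dw
    w -= lr * grad_w
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))  # approaches [0, 0, 0, 1]
```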
The structure of a neural network is different from that of conventional microprocessors,
so it has to be emulated in software.
Large neural networks therefore required long processing times.
Applications of Artificial Neural Network
1. Text Classification and Categorization
This is a vital part of many applications, such as web search, information filtering, language
identification, readability assessment, and sentiment analysis. Artificial neural networks are
widely used for these tasks.
2. Named Entity Recognition (NER)
Named entity recognition focuses on categorizing named entities into predefined classes such as
persons, organizations, locations, dates, and times. The most effective and powerful named entity
recognition systems make use of artificial neural networks.
3. Part-of-Speech Tagging
Part-of-speech tagging is used for parsing, text-to-speech conversion, information extraction, and
many other applications. The process involves tagging words as adjectives, verbs, nouns, adverbs,
pronouns, and so on.
4. Machine Translation
Machine translation is widely used around the world; however, it still has limitations, and in
certain domains the quality of the translations is substandard. To improve translation quality,
researchers are applying neural networks.
5. Semantic Parsing and Question Answering
Such systems automate the answering of various types of questions (definition questions,
biographical questions, multilingual questions, and many others) posed to the system in
natural language.
Using artificial neural networks, it is possible to create high-performance question answering
systems.
6. Paraphrase Detection
This essentially involves figuring out whether two sentences mean the same thing. This is particularly
important in question answering systems because there are several ways in which your users could
ask the very same question.
7. Speech Recognition
Artificial neural networks are used rather extensively in speech recognition. It involves making use
of natural language processing to convert voice data into a machine-readable format.
8. Language Generation & Multi-document Summarization
Natural language generation (NLG) can be used for a variety of writing tasks, including generating
reports and other texts from data the system has analyzed, drafting summaries of
electronic medical records, and producing textual weather forecasts from weather data.
9. Character Recognition
Character recognition is applied to receipts, invoices, cheques, legal documents, and more. Using
artificial neural networks, character recognition can even be performed on handwritten characters
with an accuracy of around 85%.
10. Spell Checking
This is widely used in text editors to inform users if their text contains spelling errors. Several
spell-checking tools now make use of artificial neural networks.
Neural Network Limitations
1. Data Requirements
Large Data Needs: ANNs require vast amounts of labelled data to train effectively, which can
be difficult and expensive to obtain.
Data Quality: The quality of the data is crucial. Poor-quality data, including noisy, incomplete,
or biased datasets, can lead to inaccurate and unreliable models.
2. Computational Complexity
High Resource Consumption: Training large ANNs, especially deep networks, requires
significant computational power and time. This often necessitates access to specialized
hardware like GPUs or TPUs.
Training Time: The process of training can be very time-consuming, particularly for deep
networks with large datasets.
3. Overfitting
Susceptibility to Overfitting: ANNs can easily overfit to the training data, especially when
the model is complex and the dataset is not sufficiently large or diverse.
Complex Regularization Needs: Preventing overfitting requires careful application of
regularization techniques and validation strategies, which can be complex and
time-consuming.
4. Interpretability and Transparency
Black Box Nature: ANNs are often considered "black boxes" due to their complex internal
workings, making it difficult to understand how they make decisions.
Lack of Explainability: Despite advancements in explainable AI, fully interpreting the
decision-making process of ANNs remains challenging.
5. Hyperparameter Sensitivity
Difficult Hyperparameter Tuning: ANNs have many hyperparameters that need careful
tuning (e.g., learning rate, number of layers, neurons per layer). Finding the optimal
configuration often requires extensive experimentation.
Initialization Issues: The initial weights of an ANN can significantly affect the training
process and final performance, making proper initialization crucial.
6. Scalability Issues
Memory and Computational Limits: Scaling ANNs to handle very large datasets or model
sizes can be challenging due to memory and computational constraints.
Deployment Complexity: Large models can be difficult to deploy efficiently in real-time
applications, requiring significant resources and optimization.
7. Expertise Required
Need for Specialized Knowledge: Designing, training, and tuning ANNs often requires deep
expertise in machine learning and neural network architecture, which can limit accessibility
for non-experts.
8. Adversarial Vulnerability
Susceptibility to Adversarial Attacks: ANNs can be vulnerable to adversarial attacks, where
small, carefully crafted changes to input data can cause the network to make significant errors.
9. Hardware Dependence
Specialized Hardware Needs: Effective training and inference of ANNs often require access
to specialized hardware, such as GPUs or TPUs, which may not be available to all users.
10. Ethical and Bias Concerns
Propagation of Biases: ANNs can learn and amplify biases present in the training data,
leading to unfair or discriminatory outcomes.
Ethical Considerations: The use of ANNs in sensitive applications raises ethical questions
about privacy, fairness, and accountability.
Understanding the Biological Neuron
Biological neurons are the fundamental units of the brain and nervous system. They are specialized
cells that transmit information through electrical and chemical signals. Understanding the structure
and function of biological neurons helps to appreciate how artificial neural networks (ANNs) are
inspired by biological processes.
o When the action potential reaches the axon terminals, it triggers the release of
neurotransmitters into the synapse. These chemicals cross the synaptic gap and bind
to receptors on the dendrites of the adjacent neuron, continuing the transmission of
the signal.
Comparison with Artificial Neural Networks
Artificial Neural Networks (ANNs) are computational models inspired by the structure and function
of biological neurons. Here’s how they compare:
1. Neurons and Perceptrons:
o In ANNs, artificial neurons (or perceptrons) are simplified models that take multiple
inputs, apply weights, sum them, and pass the result through an activation function to
produce an output, analogous to the signal processing in biological neurons (a minimal
sketch follows this list).
2. Synapses and Weights:
o The connections between artificial neurons have weights that are adjusted during
training, similar to how synapses strengthen or weaken in biological neurons based on
activity.
3. Layers and Networks:
o ANNs are composed of layers of neurons: input layer (receives data), hidden layers
(processes data), and output layer (produces results), mimicking the complex
networks of interconnected neurons in the brain.
4. Learning and Flexibility:
o The process of training an ANN, involving adjustments to weights based on error
minimization (e.g., backpropagation), is analogous to synaptic plasticity, where the
strength of synapses changes based on experience and learning.
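As a minimal sketch of point 1, here is a single perceptron in Python; the step activation and the example weights (which implement a logical AND) are illustrative assumptions:

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of inputs, then a step activation: 'fire' (1) or not (0)."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Illustrative weights: this neuron fires only if both inputs are active
w = np.array([1.0, 1.0])
b = -1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", perceptron(np.array(x, dtype=float), w, b))
```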
2. Tanh (Hyperbolic Tangent): Similar to the sigmoid, but its outputs range from -1 to 1; often
used in hidden layers.
3. ReLU (Rectified Linear Unit): The simplest and most popular activation function; it outputs
the input directly if it is positive, otherwise it outputs 0.
Advantages:
Computationally efficient, simple implementation.
Reduces the likelihood of the vanishing gradient problem.
Speeds up the convergence of stochastic gradient descent.
Disadvantages:
Dying ReLU problem: neurons can become inactive and only output zero.
Limitations:
Sensitive to high learning rates, which can cause neurons to die.
Real-Time Applications:
Image classification.
Convolutional neural networks (CNNs).
Utilization:
Widely used in hidden layers of deep neural networks due to efficiency and effectiveness.
Leaky ReLU (summarized here; discussed in detail below):
Advantages:
Addresses the dying ReLU problem by allowing a small gradient when x ≤ 0.
Disadvantages:
Introduces a slight complexity in the computation.
Limitations:
The choice of the slope parameter α can significantly affect performance.
Real-Time Applications:
Object detection.
Various deep learning models needing robust activation functions.
Utilization:
Used in hidden layers where avoiding zero gradients is crucial.
PReLU (Parametric ReLU):
Limitations:
More computationally intensive due to additional parameters.
Real-Time Applications:
Image and video recognition tasks.
Deep CNNs.
Utilization:
Applied in deeper networks where learning the activation parameters dynamically can be beneficial.
6. Leaky ReLU: A variant of ReLU that addresses the "dying ReLU" problem, in which neurons
become inactive and output only zero for any input. Leaky ReLU introduces a small slope for
negative input values, allowing a small, non-zero gradient when the input is less than zero. A
short code sketch appears after the advantages below.
Advantages of Leaky ReLU
1. Mitigates Dying ReLU Problem:
o Unlike ReLU, which outputs zero for all negative inputs, Leaky ReLU allows a small
gradient to pass through, ensuring that neurons can continue to learn even when
their pre-activation values are negative.
o Although less common than in vision tasks, Leaky ReLU can be used in NLP models,
especially in feedforward layers of sequence-to-sequence models and transformers,
to enhance gradient flow and mitigate the dying ReLU problem.
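A minimal NumPy sketch of Leaky ReLU and its gradient; the slope α = 0.01 is a common default but an assumption here:

```python
import numpy as np

ALPHA = 0.01  # slope for negative inputs (illustrative default)

def leaky_relu(x, alpha=ALPHA):
    # Returns x for positive inputs, alpha * x otherwise
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=ALPHA):
    # Gradient is 1 for positive inputs, alpha otherwise: never exactly zero,
    # so neurons keep receiving a learning signal (no "dying ReLU")
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(leaky_relu(x))       # [-0.02 -0.005 0. 0.5 2.]
print(leaky_relu_grad(x))  # [0.01 0.01 0.01 1. 1.]
```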
7. Softmax: Used in the output layer for multi-class classification problems. Outputs a probability
distribution for multiple categories.
Advantages:
Converts logits into probabilities.
Suitable for multi-class classification.
Disadvantages:
Not suitable for hidden layers due to computational complexity.
Limitations:
Only useful in the output layer.
Real-Time Applications:
Multi-class classification problems.
Neural networks for language models.
Utilization:
Typically used in the output layer of neural networks for classification tasks.
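A minimal, numerically stable softmax sketch in NumPy; the logits below are illustrative (subtracting the maximum is a standard stability trick that does not change the result):

```python
import numpy as np

def softmax(logits):
    # Subtracting the max avoids overflow in exp() for large logits
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw scores for 3 classes
probs = softmax(logits)
print(probs, probs.sum())            # a probability distribution summing to 1
```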
Choosing the right activation function: the best choice depends on the specific problem
you're trying to solve. Here are some general guidelines:
a. Classification (binary): Sigmoid or tanh
b. Classification (multi-class): Softmax
c. Regression: Linear or ReLU
Real-Time Application Examples
1. Image Recognition: ReLU and its variants (Leaky ReLU, PReLU) are widely used in CNNs
for tasks like object detection and facial recognition.
2. Speech Recognition: Tanh and ReLU are commonly used in RNNs and LSTMs for
processing sequential data like audio signals.
3. Natural Language Processing (NLP): Softmax is used in the output layer for language
models and text classification tasks.
o Significance: Demonstrated the potential for multi-layer networks and introduced
new training techniques.
5. Hopfield Network (1982)
Creator: John Hopfield.
Description: A recurrent neural network where each neuron is connected to every other
neuron. It is used to store and recall patterns.
Structure: Symmetric weight connections and a binary threshold activation function.
Significance: Showed how neural networks could be used for associative memory and
optimization problems.
6. Multilayer Perceptron (MLP) and Backpropagation (1986)
Pioneers: David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams.
Description: An MLP is a neural network with one or more hidden layers and non-linear
activation functions (e.g., sigmoid, tanh).
Training: Uses the backpropagation algorithm to adjust weights by minimizing the error
through gradient descent.
Significance: Enabled the training of deep neural networks and solved complex problems, such
as XOR, that single-layer perceptrons could not (see the example below).
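As a sketch of this point, a one-hidden-layer MLP can learn XOR, which no single-layer perceptron can represent; scikit-learn and the specific hyperparameters here are illustrative choices, not the historical implementation:

```python
from sklearn.neural_network import MLPClassifier

# XOR is not linearly separable, so a single-layer perceptron fails on it
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer with a tanh non-linearity is enough to separate XOR
mlp = MLPClassifier(hidden_layer_sizes=(4,), activation="tanh",
                    solver="lbfgs", random_state=0, max_iter=1000)
mlp.fit(X, y)
print(mlp.predict(X))  # expected: [0 1 1 0] (a different seed may be needed)
```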
Advantages and Disadvantages of Early Implementations
Advantages
1. Foundational Work: Established the basic principles and mechanisms of neural networks.
2. Demonstrated Learning: Showed that neural networks could learn from data and perform
various tasks.
3. Practical Applications: Early models were used in simple practical applications, such as
pattern recognition and control systems.
Disadvantages
1. Limited Complexity: Early models, especially single-layer networks, struggled with complex
tasks and non-linearly separable problems.
2. Training Challenges: Efficient training methods for deep networks were not available until
the development of backpropagation.
3. Computational Constraints: Early implementations were limited by the computational
power of the time, restricting their scalability and practicality.
There are several different architectures for ANNs, each with their own strengths and weaknesses.
Some of the most common architectures include:
1. Feedforward Neural Networks: This is the simplest type of ANN architecture, where the
information flows in one direction from input to output. The layers are fully connected,
meaning each neuron in a layer is connected to all the neurons in the next layer.
2. Recurrent Neural Networks (RNNs): These networks have a “memory” component, where
information can flow in cycles through the network. This allows the network to process
sequences of data, such as time series or speech.
3. Convolutional Neural Networks (CNNs): These networks are designed to process data
with a grid-like topology, such as images. The layers consist of convolutional layers, which
learn to detect specific features in the data, and pooling layers, which reduce the spatial
dimensions of the data.
4. Autoencoders: These are neural networks used for unsupervised learning. They
consist of an encoder that maps the input data to a lower-dimensional representation and a
decoder that maps the representation back to the original data (a code sketch follows this list).
5. Generative Adversarial Networks (GANs): These are neural networks that are used for
generative modelling. They consist of two parts: a generator that learns to generate new
data samples, and a discriminator that learns to distinguish between real and generated
data.
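As a sketch of architectures 1 and 4, here are minimal PyTorch definitions; the layer sizes are illustrative assumptions, not prescriptions from this module:

```python
import torch.nn as nn

# 1. Feedforward network: information flows input -> hidden -> output
feedforward = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),   # fully connected hidden layer
    nn.Linear(128, 10),               # output layer (e.g., 10 classes)
)

# 4. Autoencoder: the encoder compresses, the decoder reconstructs
autoencoder = nn.Sequential(
    nn.Linear(784, 32), nn.ReLU(),    # encoder: 784 -> 32 dimensions
    nn.Linear(32, 784), nn.Sigmoid(), # decoder: 32 -> 784 reconstruction
)
print(feedforward)
```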
Important Questions
2 Marks Questions
1. Define a neural network.
2. What is a feedforward neural network?
3. What are weights and bias in an artificial neuron?
4. What is the purpose of an activation function in an artificial neural network?
5. What is a single-layer perceptron?
6. List any four activation functions.
7. What is a feed-forward network?
8. What are the main components of a biological neuron?
9. What is the ReLU (Rectified Linear Unit) activation function?
10. What is the advantage of using ReLU over Sigmoid?
11. What is a multi-layer perceptron (MLP)?
5 Marks Questions
1. Explain the basic concept of artificial neural networks and their importance in machine learning.
2. What are activation functions? Explain in detail.
3. Define an Artificial Neural Network. Explain its working procedure with an example.
4. Define a neural network and explain its significance in the field of artificial intelligence.
5. Explain the concept of the input function and output function in an artificial neuron.
6. What are the main differences between biological and artificial neural networks?
7. Describe the structure of an artificial neuron, including its components such as input weights, bias, and
activation function.
8. Compare sigmoid, ReLU, and Tanh activation functions, highlighting their differences, advantages, and
disadvantages.
9. Compare and contrast biological neurons with artificial neurons in terms of structure and function.