Machine Learning Unit-2: Backpropagation Algorithm
Introduction:
• Backpropagation, an abbreviation for "backward propagation of errors", is a common method of training artificial neural networks (ANN).
• The method calculates the gradient of a loss function with respect to all the weights (w) in the network.
• The gradient is fed to an optimization method such as gradient descent, which uses it to update the weights in an attempt to minimize the loss function.
Historical Background:
• Donald Hebb (1949) proposed the following hypothesis in his book "The Organization of Behavior":
• "Neural pathways are strengthened every time they are used."
Perceptron:
• It is a step function based on a linear combination of real-valued inputs. If the combination is above a threshold, it outputs a 1; otherwise it outputs a –1.
• A perceptron can only learn examples that are linearly separable.
[Figure: a perceptron with inputs x0 = 1, x1, …, xn, weights w0, w1, …, wn, a summation unit Σ, and a threshold output of 1 or –1.] [3]
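A minimal Python sketch of this output rule, assuming the threshold is folded into the bias weight w0 with a fixed input x0 = 1 (the function name and the AND example are illustrative, not from the slides):

```python
# A perceptron's output rule: threshold a linear combination of inputs.
def perceptron_output(weights, inputs):
    # Linear combination w0*1 + w1*x1 + ... + wn*xn (w0 acts as the bias)
    s = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    # Step function: 1 if above the threshold (0 here), otherwise -1
    return 1 if s > 0 else -1

# Example: weights implementing logical AND, a linearly separable function
print(perceptron_output([-0.5, 0.4, 0.4], [1, 1]))    # 1
print(perceptron_output([-0.5, 0.4, 0.4], [1, -1]))   # -1
```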
Back Propagation Algorithm:
Delta Rule for weight update:
• The delta rule calculates the gradient-based weight change that is used for updating the weights.
• We will minimize the following error (E), summed over the training examples d:
  E = ½ Σ_d (t_d – o_d)²
• For a new training example X = (x1, x2, …, xn), update each weight (w) according to this rule:
  w_i = w_i + Δw_i
  where Δw_i = –η ∂E/∂w_i and η is the learning rate of the neurons.
• For a linear unit, this derivative works out to:
  Δw_i = η Σ_d (t_d – o_d) x_id
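A minimal Python sketch of this update for a single linear unit, assuming batch updates over a small training set (the function and the example data are illustrative):

```python
# One epoch of the delta rule for a single linear unit.
def delta_rule_epoch(weights, examples, targets, eta=0.1):
    deltas = [0.0] * len(weights)
    for x, t in zip(examples, targets):
        o = sum(w * xi for w, xi in zip(weights, x))   # linear output o = w . x
        for i, xi in enumerate(x):
            deltas[i] += eta * (t - o) * xi            # accumulate eta * (t_d - o_d) * x_id
    return [w + d for w, d in zip(weights, deltas)]    # w_i = w_i + delta_w_i

# Example: learn the target function o = 2*x from a few (x, t) pairs
w = [0.0]
for _ in range(50):
    w = delta_rule_epoch(w, [[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0])
print(w)  # approaches [2.0]
```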
Phases of Back Propagation Algorithm:
• Phase 1: Feed forward the ANN
• Phase 2: Back propagation of errors
• Phase 3: Update of weights
Back propagation Algorithm (Small Version)
[2]
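A minimal Python sketch of these three phases for a network with one hidden layer, assuming sigmoid activations and the squared error above; biases are omitted for brevity, and the structure and names are illustrative rather than the exact version cited in [2]:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, t, W_hidden, W_out, eta=0.5):
    # Phase 1: feed forward the ANN
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in W_hidden]
    o = [sigmoid(sum(w * hj for w, hj in zip(ws, h))) for ws in W_out]

    # Phase 2: back propagation of errors
    # Output deltas: o*(1-o)*(t-o); hidden deltas: h*(1-h) times the
    # weighted sum of the downstream output deltas.
    d_out = [ok * (1 - ok) * (tk - ok) for ok, tk in zip(o, t)]
    d_hid = [hj * (1 - hj) * sum(d_out[k] * W_out[k][j] for k in range(len(o)))
             for j, hj in enumerate(h)]

    # Phase 3: update the weights: w = w + eta * delta * input
    for k, ws in enumerate(W_out):
        for j in range(len(h)):
            ws[j] += eta * d_out[k] * h[j]
    for j, ws in enumerate(W_hidden):
        for i in range(len(x)):
            ws[i] += eta * d_hid[j] * x[i]
    return o
```

Repeated calls to train_step over the training set implement the full algorithm.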
What is the gradient descent rule?
• Back propagation calculates the gradient (the partial derivatives) of the network's error with respect to the network's modifiable weights.
• This gradient is then used by a gradient descent algorithm to find weights that minimize the error.
[Figure: the error surface E(W) plotted over two weights w1 and w2.] [3]
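As a concrete illustration of the rule, a minimal sketch of descent on a one-dimensional toy error E(w) = (w – 3)², a stand-in for the surface above (all names are illustrative):

```python
# Gradient descent on a toy error function E(w) = (w - 3)^2,
# whose gradient is E'(w) = 2 * (w - 3).
def descend(w, eta=0.1, steps=100):
    for _ in range(steps):
        grad = 2 * (w - 3)      # gradient of the error at the current w
        w = w - eta * grad      # move against the gradient
    return w

print(descend(0.0))  # approaches 3.0, the minimum of E
```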
Learning Rates:
• Different learning rates affect the performance of a neural network significantly: too small a rate makes learning slow, while too large a rate can overshoot the minimum or diverge, as the sketch below illustrates.
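Reusing the toy error E(w) = (w – 3)² from the sketch above, a quick comparison of learning rates (the η values are chosen purely for illustration):

```python
# Compare how far descent gets on E(w) = (w - 3)^2 after 20 steps
# for different learning rates eta.
for eta in (0.01, 0.1, 1.1):
    w = 0.0
    for _ in range(20):
        w = w - eta * 2 * (w - 3)
    print(f"eta={eta}: w={w:.3f}")
# eta=0.01: w is still far from 3 (learning too slow)
# eta=0.1:  w is close to 3
# eta=1.1:  w blows up (each step overshoots the minimum)
```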
Limitations of the back propagation algorithm:
• It is not guaranteed to find the global minimum of the error function; it may get trapped in a local minimum.
• Improvements:
  • Add momentum to the weight update (see the sketch below).
  • Use stochastic gradient descent.
  • Train several networks with different initial values for the weights.
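A minimal sketch of the momentum improvement, assuming the common form Δw(t) = –η ∂E/∂w + α Δw(t–1) with an illustrative momentum coefficient α:

```python
# Gradient descent with momentum on the toy error E(w) = (w - 3)^2.
# Each update mixes in a fraction alpha of the previous update, which
# helps the descent roll through flat regions and shallow local minima.
def descend_with_momentum(w, eta=0.1, alpha=0.9, steps=100):
    prev_delta = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)
        delta = -eta * grad + alpha * prev_delta
        w, prev_delta = w + delta, delta
    return w

print(descend_with_momentum(0.0))  # approaches 3.0
```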
References
• Artificial Neural Networks, https://en.wikipedia.org/wiki/Artificial_neural_network
• Backpropagation Algorithm, https://en.wikipedia.org/wiki/Backpropagation
• Backpropagation in Data Mining, GeeksforGeeks, https://www.geeksforgeeks.org/backpropagation-in-data-mining/
Thank You