
Deep Learning Assignment

Name - Pratham Dhiman
IT Section - 2, 6th Semester
Roll No - UE218075

Q1) Develop and Explain a Deep Feedforward Network.

Ans:-
A deep feedforward network is also known as a feedforward neural network or a multilayer
perceptron (MLP).

1. Input Layer:
   • Let's start with a simple example where we have two input features: x1 and x2.
   • The input layer simply passes these features forward without any processing:
     Input Layer: x = [x1, x2]
2. Hidden Layers:
   • Let's introduce two hidden layers with three neurons each in our example.
   • Each neuron in a layer applies an activation function to the weighted sum of its inputs from the previous layer.
     Hidden Layer 1: h^(1) = φ(W^(1) x + b^(1))
     Hidden Layer 2: h^(2) = φ(W^(2) h^(1) + b^(2))
     Where:
   • W^(1) and W^(2) are weight matrices for the first and second hidden layers respectively.
   • b^(1) and b^(2) are bias vectors for the first and second hidden layers respectively.
   • φ is the activation function, such as ReLU, Sigmoid, or Tanh.
3. Output Layer:
   • Finally, we have the output layer where we compute the final output.
   • Let's assume a binary classification task with a single output neuron and a sigmoid activation function.
     Output Layer: ŷ = σ(W^(3) h^(2) + b^(3))
     Where:
   • ŷ is the predicted output.
   • σ(z) = 1 / (1 + e^(−z)) is the sigmoid activation function.
4. Compute Loss:
   • Calculate the loss between the predicted output ŷ and the actual target output y using a suitable loss function such as mean squared error (MSE) or binary cross-entropy.
     Loss: J(W, b) = MSE(ŷ, y)
5. Backward Propagation (Gradient Calculation):
   • Compute the gradients of the loss function with respect to the parameters of each layer using backpropagation.
   • Update the parameters using gradient descent:
     W^(l) := W^(l) − α ∂J/∂W^(l)
     b^(l) := b^(l) − α ∂J/∂b^(l)    for l = 1, 2, 3.
     For example, to update W^(3) and b^(3), we would calculate:
     ∂J/∂W^(3) = (∂J/∂ŷ) (∂ŷ/∂W^(3))    ∂J/∂b^(3) = (∂J/∂ŷ) (∂ŷ/∂b^(3))
     Similarly, gradients for W^(2) and b^(2), as well as W^(1) and b^(1), would be computed and updated.
6. Convergence:
• Repeat the forward and backward propagation steps for a specified number of
iterations or until convergence, adjusting the parameters with each iteration to
minimize the loss function.
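
To make steps 1–6 concrete, below is a minimal NumPy sketch of this network (two input features, two hidden layers of three neurons, and a single sigmoid output) trained with MSE and plain gradient descent. The tanh hidden activation, learning rate, iteration count, and toy data are illustrative assumptions, not part of the question.

import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed): 4 samples with 2 features each, binary targets
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Parameters for the 2 -> 3 -> 3 -> 1 architecture described above
W1, b1 = rng.normal(0, 1, (2, 3)), np.zeros((1, 3))
W2, b2 = rng.normal(0, 1, (3, 3)), np.zeros((1, 3))
W3, b3 = rng.normal(0, 1, (3, 1)), np.zeros((1, 1))

alpha = 0.5  # learning rate (assumed)

for iteration in range(5000):
    # Forward pass: h1 = phi(W1 x + b1), h2 = phi(W2 h1 + b2), y_hat = sigma(W3 h2 + b3)
    h1 = np.tanh(X @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    y_hat = sigmoid(h2 @ W3 + b3)

    # Loss: J(W, b) = MSE(y_hat, y)
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: chain rule, layer by layer
    d_yhat = 2 * (y_hat - y) / len(X)        # dJ/dy_hat
    d_z3 = d_yhat * y_hat * (1 - y_hat)      # through the sigmoid
    dW3, db3 = h2.T @ d_z3, d_z3.sum(axis=0, keepdims=True)

    d_h2 = d_z3 @ W3.T
    d_z2 = d_h2 * (1 - h2 ** 2)              # through tanh
    dW2, db2 = h1.T @ d_z2, d_z2.sum(axis=0, keepdims=True)

    d_h1 = d_z2 @ W2.T
    d_z1 = d_h1 * (1 - h1 ** 2)
    dW1, db1 = X.T @ d_z1, d_z1.sum(axis=0, keepdims=True)

    # Gradient descent update: W(l) := W(l) - alpha * dJ/dW(l), same for b(l)
    W3 -= alpha * dW3; b3 -= alpha * db3
    W2 -= alpha * dW2; b2 -= alpha * db2
    W1 -= alpha * dW1; b1 -= alpha * db1

print("final loss:", loss)
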
Q2) Write an example function for a convolution operation and explain it in detail.

Ans - A convolution operation operates on all the pixel values within its kernel's
receptive field, producing a single value by multiplying the kernel weights with the
pixel values element-wise, summing the products, and adding a bias term to the result.
Without padding, this also reduces the spatial dimensions of the input matrix.

import numpy as np

def convolution2D(image, kernel):
    # Get dimensions of the image and kernel
    image_height, image_width = image.shape
    kernel_height, kernel_width = kernel.shape

    # Compute the padding required so the output has the same size as the input
    pad_height = kernel_height // 2
    pad_width = kernel_width // 2

    # Pad the image symmetrically with zeros
    padded_image = np.pad(image,
                          ((pad_height, pad_height), (pad_width, pad_width)),
                          mode='constant')

    # Initialize the result matrix (float, so integer inputs are not truncated)
    result = np.zeros(image.shape, dtype=float)

    # Perform convolution
    for i in range(image_height):
        for j in range(image_width):
            # Extract the region of interest from the padded image
            region = padded_image[i:i + kernel_height, j:j + kernel_width]

            # Element-wise multiplication between the region and the kernel,
            # summed to give the output value at position (i, j)
            result[i, j] = np.sum(region * kernel)

    return result

Now, let's explain each part of this function in detail:

1. Function Definition:
• The convolution2D function takes two arguments: image (a 2D numpy array
representing the image) and kernel (a 2D numpy array representing the
convolution kernel).
2. Dimensions and Padding:
• It computes the dimensions of the image and the kernel (image_height, image_width,
kernel_height, kernel_width).
• It calculates the padding required for the image to ensure that the output of the
convolution operation has the same dimensions as the input image.
• Padding is done symmetrically using np.pad with zeros (mode='constant').
3. Initializing the Result Matrix:
• It initializes a zero matrix (result) with the same dimensions as the input image
to store the result of the convolution.
4. Convolution Operation:
• The function performs convolution using nested loops over each pixel position in
the image.
• At each position (i, j) in the image, it extracts the corresponding region of
interest from the padded image.
• It then performs element-wise multiplication between the region and the kernel
and sums up the results to obtain the output value for that position in the result
matrix.
• This process is repeated for all positions in the image, resulting in the final output
matrix (result).
5. Return Statement:
• The function returns the result matrix representing the output of the convolution
operation between the image and the kernel.
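
As a quick usage check (the 5×5 image and the 3×3 averaging kernel below are assumed values chosen only for illustration), the function can be called as follows:

import numpy as np

# Small test image and a 3x3 averaging kernel (illustrative values)
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0

output = convolution2D(image, kernel)
print(output.shape)   # (5, 5) -- same size as the input, thanks to the zero padding
print(output)
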

Q3) Describe how to find the gradient in a Recurrent Neural Network (RNN).

Ans:-

1. Backward Pass (Backpropagation Through Time - BPTT):
   • Start by initializing the gradients for all the parameters to zeros.
   • Compute the gradient of the loss L with respect to the output at each time step: ∂L_t/∂y_t.
   • Then, for each time step t from T down to 1:
      • Use the chain rule to propagate the gradient ∂L_t/∂y_t backward through the output-to-hidden connections and update the gradients of the corresponding weights and biases.
      • Compute the gradient of the loss with respect to the hidden state at time step t: ∂L/∂h_t. Because h_t also influences the loss at later time steps, this gradient combines the contribution from the current output with the gradient flowing back from step t+1.
      • Backpropagate this gradient through time by recursively computing ∂L/∂h_t for earlier steps and updating the gradients of the weights and biases for the hidden-to-hidden connections.
      • Update the gradients for the input-to-hidden connections as well.
   A minimal NumPy sketch of this procedure is given below.
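
Below is a minimal NumPy sketch of BPTT for a vanilla RNN with a tanh hidden state and a squared-error loss at every time step. The dimensions, random initialisation, and toy sequence are assumptions made only for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed): input dim 3, hidden dim 4, output dim 2, T = 5 time steps
D, H, O, T = 3, 4, 2, 5
Wxh = rng.normal(0, 0.1, (D, H))   # input-to-hidden weights
Whh = rng.normal(0, 0.1, (H, H))   # hidden-to-hidden weights
Why = rng.normal(0, 0.1, (H, O))   # output (hidden-to-output) weights
bh, by = np.zeros(H), np.zeros(O)

xs = rng.normal(0, 1, (T, D))      # toy input sequence
ys = rng.normal(0, 1, (T, O))      # toy targets

# ---- Forward pass: keep every hidden state, since BPTT needs them ----
hs = {-1: np.zeros(H)}
y_hats, loss = {}, 0.0
for t in range(T):
    hs[t] = np.tanh(xs[t] @ Wxh + hs[t - 1] @ Whh + bh)
    y_hats[t] = hs[t] @ Why + by
    loss += 0.5 * np.sum((y_hats[t] - ys[t]) ** 2)

# ---- Backward pass (BPTT): initialize all gradients to zero ----
dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
dbh, dby = np.zeros_like(bh), np.zeros_like(by)
dh_next = np.zeros(H)              # gradient flowing back from step t+1

for t in reversed(range(T)):       # from T down to 1
    dy = y_hats[t] - ys[t]                       # dL_t/dy_t
    dWhy += np.outer(hs[t], dy)                  # output-to-hidden connections
    dby += dy
    dh = dy @ Why.T + dh_next                    # dL/dh_t: current output term + future term
    dz = dh * (1 - hs[t] ** 2)                   # through the tanh nonlinearity
    dWxh += np.outer(xs[t], dz)                  # input-to-hidden connections
    dWhh += np.outer(hs[t - 1], dz)              # hidden-to-hidden connections
    dbh += dz
    dh_next = dz @ Whh.T                         # pass the gradient on to step t-1

print("loss:", loss, "| ||dWhh||:", np.linalg.norm(dWhh))
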

In Recurrent Neural Networks (RNNs), we often encounter the need to compute derivatives
during backpropagation to update the network's parameters (weights and biases). Implicit and
explicit derivatives are two approaches used in this context.

1. Explicit Derivatives:
   • Explicit derivatives involve directly computing the gradients of the loss function with respect to the parameters using standard differentiation rules.
   • For example, given a loss function L and a parameter θ, the explicit derivative ∂L/∂θ is computed using formulas derived from the chain rule and other differentiation rules:
     ∂L/∂θ = (∂L/∂output) × (∂output/∂hidden state) × (∂hidden state/∂θ)
     Here, each term on the right-hand side is explicitly computed using known mathematical formulas.
2. Implicit Derivatives:
   • Implicit derivatives, on the other hand, involve computing derivatives indirectly, using implicit differentiation techniques or iterative methods such as numerical differentiation.
   • In the context of RNNs, implicit derivatives are often used when explicit differentiation becomes computationally expensive or impractical due to the complex dynamics of the network.
     dL/dθ = lim(ε → 0) [L(θ + ε) − L(θ)] / ε
     This equation represents a numerical approximation of the derivative dL/dθ: compute the difference in the loss function L for a slightly perturbed value of the parameter θ and divide by the perturbation ε.
   • Implicit derivatives are especially useful in scenarios where analytical solutions for derivatives are not readily available or are too complex to compute directly.
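
As an illustration of the numerical route, the sketch below approximates dL/dθ for a single parameter with the forward-difference formula above and compares it with the analytic derivative. The quadratic stand-in loss and the step size ε are assumptions chosen only for the example.

def loss(theta):
    # Stand-in loss chosen for illustration: L(theta) = (theta - 3)^2
    return (theta - 3.0) ** 2

def numerical_derivative(f, theta, eps=1e-6):
    # Forward-difference approximation of dL/dtheta, matching the formula above
    return (f(theta + eps) - f(theta)) / eps

theta = 1.0
approx = numerical_derivative(loss, theta)
exact = 2 * (theta - 3.0)      # analytic (explicit) derivative of the stand-in loss
print(approx, exact)           # approx is close to -4.0, the exact value
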

In RNNs, both explicit and implicit derivatives can be used depending on the specific
requirements of the problem, the computational resources available, and the complexity of the
network's architecture.
