TensorFlow
Interview Preparation
Deep Learning Interview Questions
Deep Learning is one of the hottest topics nowadays, and for good reason: the industry has advanced to the point where machines and computer programs are taking over tasks once performed by humans. Artificial Intelligence is expected to create 2.3 million jobs in the coming years, and to crack those job interviews, here is a set of Deep Learning Interview Questions, divided into two sections:
Q2. Do you think Deep Learning is better than Machine Learning? If so, why?
1. Dendrite: Receives signals from other neurons
2. Cell Body: Sums all the inputs
3. Axon: Transmits signals to the other cells
Answer: For a perceptron, there can be one more input called bias. While the weights determine the slope of the classifier line, the bias allows us to shift the line to the left or right. Normally, the bias is treated as another weighted input with a fixed input value x0 = 1.
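For illustration, here is a minimal NumPy sketch (the input and weight values below are made up) showing that a separate bias term is equivalent to an extra weight with a fixed input x0 = 1:

import numpy as np

def perceptron(x, w, b):
    # Weighted sum plus bias, followed by a unit-step activation.
    return 1 if np.dot(w, x) + b >= 0 else 0

# Equivalent view: treat the bias as an extra weight w0 with fixed input x0 = 1.
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
b = -0.1
x_aug = np.concatenate(([1.0], x))   # prepend x0 = 1
w_aug = np.concatenate(([b], w))     # the bias becomes weight w0
assert np.isclose(np.dot(w, x) + b, np.dot(w_aug, x_aug))
print(perceptron(x, w, b))           # 0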
Q5. What are activation functions?
Answer: The activation function translates the inputs into outputs. It decides whether a neuron should be activated or not by calculating the weighted sum of its inputs and adding a bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron. Commonly used activation functions include:
1. Linear or Identity
2. Unit or Binary Step
3. Sigmoid or Logistic
4. Tanh
5. ReLU
6. Softmax
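As a quick reference, here is a minimal NumPy sketch of the activation functions listed above:

import numpy as np

def linear(x):      return x                        # identity
def binary_step(x): return np.where(x >= 0, 1, 0)   # unit step
def sigmoid(x):     return 1 / (1 + np.exp(-x))     # logistic
def tanh(x):        return np.tanh(x)
def relu(x):        return np.maximum(0, x)
def softmax(x):
    e = np.exp(x - np.max(x))   # shift for numerical stability
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])
print(relu(z))            # [0. 0. 2.]
print(softmax(z).sum())   # 1.0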
The perceptron learning rule updates each weight as Wj(t+1) = Wj(t) + η (d – y) x, where:
Wj(t) – Old Weight
d – Desired Output
y – Actual Output
x – Input
η – Learning Rate
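A minimal NumPy sketch of one application of this learning rule (the weights, inputs, and learning rate below are hypothetical):

import numpy as np

def perceptron_update(w, x, d, y, lr=0.1):
    # Wj(t+1) = Wj(t) + eta * (d - y) * xj, applied to all weights at once.
    return w + lr * (d - y) * x

w = np.array([0.2, -0.4])
x = np.array([1.0, 0.5])
d, y = 1, 0                            # desired 1, actual 0 -> weights move toward x
print(perceptron_update(w, x, d, y))   # [ 0.3  -0.35]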
Batch Gradient Descent: Calculate the gradients for the whole dataset
and perform just one update at each iteration.
Stochastic Gradient Descent: Use only a single training example to calculate the gradient and update the parameters at each step.
Mini-batch Gradient Descent: Split the dataset into small batches and perform one update per batch, combining the advantages of both of the above optimization algorithms.
Q11. Implement Gradient Descent in Python.
Answer:
def sgd_updates(params, grads, lr=0.01):
    # Pair each parameter with its gradient descent update: p <- p - lr * g
    updates = []
    for p, g in zip(params, grads):
        updates.append([p, p - g * lr])
    return updates
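As a quick sanity check, here is a hypothetical one-parameter usage example (minimizing f(w) = w², whose gradient is 2w):

import numpy as np

params = [np.array(4.0)]            # start at w = 4
grads  = [2 * params[0]]            # gradient of f(w) = w^2 is 2w
for p, new_p in sgd_updates(params, grads, lr=0.1):
    print(p, "->", new_p)           # 4.0 -> 3.2: one step toward the minimum at 0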
An MLP consists of an input layer that receives the signal, an output layer that makes a decision or prediction about the input, and, in between those two, an arbitrary number of hidden layers that are the true computational engine of the MLP.
Input Nodes: The Input nodes provide information from the outside
world to the network and are together referred to as the “Input
Layer”. No computation is performed in any of the Input nodes – they
just pass on the information to the hidden nodes.
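A minimal sketch of such an MLP, assuming TensorFlow 2.x with Keras (the layer sizes are arbitrary):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),                      # input layer: just passes data on
    tf.keras.layers.Dense(128, activation="relu"),     # hidden layer: the computational engine
    tf.keras.layers.Dense(10, activation="softmax"),   # output layer: makes the prediction
])
model.summary()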
These were some basic Deep Learning Interview Questions. Now, let’s
move on to some advanced ones.
Advanced Deep Learning Interview
Questions
Q16. Which are better, Deep Networks or Shallow ones, and why?
Answer: Both shallow and deep networks are capable of approximating a function, but for the same level of accuracy, a deeper network can be far more efficient in terms of computation and number of parameters, because it learns a hierarchy of features.
Q17. Why is Weight Initialization important in Neural Networks?
Answer: Biases can generally be initialized to zero. The rule for setting the weights is to be close to zero without being too small.
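One common way to realize this rule, sketched in NumPy (the layer sizes below are hypothetical):

import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 784, 128          # hypothetical layer sizes
# Weights: small random values close to zero, but not so small that the signal dies out.
W = rng.normal(loc=0.0, scale=0.01, size=(fan_in, fan_out))
# Biases: generally safe to initialize to zero.
b = np.zeros(fan_out)
print(W.std(), b.sum())             # ~0.01 and 0.0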
Q18. What’s the difference between a feed-forward and a
backpropagation neural network?
Q19. What are Hyperparameters? Name a few used in any Neural Network.
Answer: Hyperparameters are the variables that determine the network structure and how the network is trained, and they are set before training begins. For example:
The Number of Hidden Layers: Many hidden units within a layer, combined with regularization techniques, can increase accuracy. A smaller number of units may cause underfitting.
Training Hyperparameters
Learning Rate: The learning rate defines how quickly a network updates
its parameters. A low learning rate slows down the learning process but
converges smoothly. A larger learning rate speeds up the learning but
may not converge.
Batch Size: The mini-batch size is the number of sub-samples given to the network after which a parameter update happens. A good default for batch size might be 32. Also try 64, 128, 256, and so on.
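A minimal sketch showing where these two hyperparameters plug in, assuming TensorFlow 2.x with Keras (the model and data below are hypothetical):

import tensorflow as tf

# Hypothetical model and random data; the point is where the hyperparameters go.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(1),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # the learning rate
    loss="mse",
)
x = tf.random.normal((256, 8))
y = tf.random.normal((256, 1))
model.fit(x, y, batch_size=32, epochs=2)  # mini-batch size: 32 as a good default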
Use a small dropout value of 20%-50% of neurons, with 20% providing a good starting point. A probability too low has minimal effect, and a value too high results in under-learning by the network.
Use a larger network. You are likely to get better performance when
dropout is used on a larger network, giving the model more of an
opportunity to learn independent representations.
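A minimal Keras sketch (assuming TensorFlow 2.x; the layer sizes are arbitrary) of a 20% dropout placed after a hidden layer:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(256, activation="relu"),    # a larger hidden layer
    tf.keras.layers.Dropout(0.2),                     # drop 20% of activations, training only
    tf.keras.layers.Dense(10, activation="softmax"),
])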
Q22. While training a neural network, you notice that the loss does not decrease in the first few epochs. What could be the reason?
Answer: The reasons could be: the learning rate is too low, the regularization parameter is too high, or the optimization is stuck in a local minimum.
Q23. Name a few commonly used Deep Learning frameworks.
Answer:
TensorFlow
Caffe
The Microsoft Cognitive Toolkit/CNTK
Torch/PyTorch
MXNet
Chainer
Keras
Q24. What are Tensors?
Answer: Tensors are the de facto standard for representing data in deep learning. They are just multidimensional arrays that allow you to represent data with higher dimensions. In general, Deep Learning deals with high-dimensional data sets, where the dimensions refer to the different features present in the data set.
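For example, in TensorFlow, tensors of different ranks can be created directly:

import tensorflow as tf

scalar = tf.constant(3.0)                        # rank-0 tensor
vector = tf.constant([1.0, 2.0, 3.0])            # rank-1 tensor
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])   # rank-2 tensor
print(scalar.shape, vector.shape, matrix.shape)  # (), (3,), (2, 2)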
Q25. What is a Computational Graph?
Answer:
One can think of a Computational Graph as an alternative way of conceptualizing the mathematical calculations that take place in a TensorFlow program. The operations assigned to different nodes of a Computational Graph can be performed in parallel, thus providing better computational performance.
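A minimal sketch, assuming TensorFlow 2.x, where tf.function traces a Python function into a computational graph:

import tensorflow as tf

@tf.function   # traces the Python function into a TensorFlow computational graph
def f(a, b):
    c = a * b       # each operation becomes a node in the graph
    d = a + c       # independent nodes can be scheduled in parallel
    return d

print(f(tf.constant(2.0), tf.constant(3.0)))   # tf.Tensor(8.0, ...)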
The pooling layer progressively reduces the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network. The pooling layer operates on each feature map independently.
Two common problems arise when gradients are propagated back through many layers:
Vanishing Gradient
Exploding Gradient
Earlier layers in the Network are important because they are
responsible to learn and detect simple patterns and are the building
blocks of our Network.
If they give improper and inaccurate results, then how can we expect the next layers, and the complete network, to perform well and produce accurate results? The training process takes too long, and the prediction accuracy of the model decreases.
The Gradient Descent process works best when these updates are small and controlled. When the magnitudes of the gradients accumulate and become very large, the network becomes unstable, which can cause poor prediction results or even a model that learns nothing useful at all.
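A toy numerical sketch of both effects, treating the backpropagated gradient as a product of per-layer factors (the factors 0.5 and 1.5 are made up for illustration):

import numpy as np

# A gradient flowing back through many layers is (roughly) a product of
# per-layer factors. Factors < 1 shrink it; factors > 1 blow it up.
g = 1.0
for _ in range(50):
    g *= 0.5                      # e.g. saturated sigmoid derivatives
print(f"vanishing: {g:.3e}")      # ~8.9e-16

g = 1.0
for _ in range(50):
    g *= 1.5                      # e.g. large weights
print(f"exploding: {g:.3e}")      # ~6.4e+08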
Answer: Long Short-Term Memory (LSTM) networks can process not only single data points, but also entire sequences of data. They are a special kind of Recurrent Neural Network capable of learning long-term dependencies.
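A minimal Keras sketch of an LSTM that reads an entire sequence, assuming TensorFlow 2.x (the sequence length and feature sizes are hypothetical):

import tensorflow as tf

# Sequence input: 10 time steps of 8 features each.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 8)),
    tf.keras.layers.LSTM(32),    # processes the whole sequence, carrying long-term state
    tf.keras.layers.Dense(1),
])
model.summary()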
Answer: A capsule is a vector that specifies the features of an object and their likelihood. These features can be any of the instantiation parameters, such as position, size, orientation, deformation, velocity, hue, texture, and much more.
A capsule can also specify its attributes like angle and size so that it can
represent the same generic information. Now, just like a neural
network has layers of neurons, a capsule network can have layers of
capsules.
Now, let’s continue these Deep Learning Interview Questions and move
to the section on autoencoders and RBMs.
Q36. In terms of Dimensionality Reduction, how does Autoencoder
differ from PCAs?
Answer: An autoencoder with non-linear activation functions and multiple layers can learn non-linear transformations, whereas PCA is restricted to linear transformations. Autoencoders are therefore better suited to large, complex data sets, while PCA is a good fit when the data is roughly linear.
Q37. Give some real-life examples where autoencoders can be applied.
Denoising Image: The input seen by the autoencoder is not the raw
input but a stochastically corrupted version. A denoising autoencoder is
thus trained to reconstruct the original input from the noisy version.
Q38. What are the components of an Autoencoder?
Answer:
Encoder
Code
Decoder
Q39. Explain the architecture of an Autoencoder.
Answer: Encoder: This part of the network compresses the input into a latent space representation. The encoder layer encodes the input image as a compressed representation in a reduced dimension. The compressed image is a distorted version of the original image.
Code: This part of the network represents the compressed input that is
fed to the decoder.
Decoder: This layer decodes the encoded image back to the original
dimension. The decoded image is a lossy reconstruction of the original
image and it is reconstructed from the latent space representation.
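A minimal sketch of this encoder-code-decoder architecture, assuming TensorFlow 2.x with Keras (the layer sizes, such as 784-dimensional inputs with a 32-dimensional code, are arbitrary):

import tensorflow as tf

inputs = tf.keras.Input(shape=(784,))
# Encoder: compress the input into the latent space.
encoded = tf.keras.layers.Dense(128, activation="relu")(inputs)
# Code (bottleneck): the compressed representation fed to the decoder.
code = tf.keras.layers.Dense(32, activation="relu")(encoded)
# Decoder: reconstruct the input from the latent representation.
decoded = tf.keras.layers.Dense(128, activation="relu")(code)
outputs = tf.keras.layers.Dense(784, activation="sigmoid")(decoded)

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")   # the target is the input itself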
Q40. What is the Bottleneck in an Autoencoder?
Answer: The layer between the encoder and the decoder, i.e. the code, is also known as the Bottleneck. It is a well-designed approach to deciding which aspects of the observed data are relevant information and which aspects can be discarded.
Q41. Are there any variations of Autoencoders?
Answer:
Convolution Autoencoders
Sparse Autoencoders
Deep Autoencoders
Contractive Autoencoders
In a deep autoencoder, the first four or five shallow layers represent the encoding half of the net, and the second set of four or five layers makes up the decoding half.
Q43. What is a Restricted Boltzmann Machine?
Answer: A Restricted Boltzmann Machine is a stochastic neural network with one layer of visible units connected to one layer of hidden units. An RBM shares a similar idea to an autoencoder, but it uses stochastic units with a particular distribution instead of deterministic units. The task of training is to find out how these two sets of variables are actually connected to each other.
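A toy NumPy sketch of the stochastic hidden units (all shapes, weights, and values below are hypothetical): given the visible units, the hidden states are sampled from a Bernoulli distribution rather than computed deterministically:

import numpy as np

rng = np.random.default_rng(0)

def sample_hidden(v, W, b):
    # Stochastic units: compute activation probabilities, then sample binary states
    # from a Bernoulli distribution instead of passing values on deterministically.
    p_h = 1 / (1 + np.exp(-(v @ W + b)))       # P(h = 1 | v)
    return (rng.random(p_h.shape) < p_h).astype(float), p_h

v = rng.integers(0, 2, size=6).astype(float)   # visible units
W = rng.normal(0, 0.1, size=(6, 4))            # weights connecting the two sets
b = np.zeros(4)                                # hidden biases
h, p_h = sample_hidden(v, W, b)
print(p_h.round(2), h)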
So, with this, we come to the end of this Deep Learning Interview Questions article. I hope this set of questions is enough to help you crack any Deep Learning interview, but if you are applying for a specific role, you will also need sound knowledge of that industry, because most of these jobs are for specialists.