Artificial Neural Networks and Deep Learning
Lecture 6
Deep Learning & Convolutional Neural Networks (CNN)
Agenda
• Deep Learning
• Convolution
• Stride
• Padding
• Convolutional Neural Networks (CNN)
Deep Learning
• Deep learning is a family of neural network algorithms that can learn useful representations or features directly from images, text, and sound.
Deep Learning, cont.
Deep learning is a set of algorithms that learn to represent the data. The most popular ones are:
• Convolutional Neural Networks
• Deep Belief Networks
• Deep Auto-Encoders
• Recurrent Neural Networks (LSTM)
One of the promises of deep learning is that it will substitute hand-crafted feature extraction. The idea is that the model will "learn" the best features needed to represent the given data.
Layers and layers
• Deep learning models are formed by multiple layers.
• In the context of artificial neural networks, a multilayer perceptron (MLP) with more than 2 hidden layers is already a deep model.
• As a rule of thumb, deeper models will perform better than shallow models; the problem is that the deeper you go, the more data you will need to avoid over-fitting.
Fat + Short vs. Thin + Tall
Given the same number of parameters, which one is better?
[Figure: a shallow (fat + short) network and a deep (thin + tall) network, both taking inputs x1 ... xN.]
As a rule of thumb, the deeper model will perform better than the shallow one, but the deeper you go, the more data you will need to avoid over-fitting.
Layer types
• Convolution layer
• Pooling layer
• Dropout layer
• Batch normalization layer
• Fully connected layer
• ReLU, Tanh, Sigmoid (activation layers)
• Softmax, Cross-Entropy, SVM, Euclidean (loss layers)
Some guys from Deep Learning
[Figure: photos of well-known deep learning researchers.]
Old vs. New
Actually, the only new thing is the use of something that learns how to represent the data (feature selection) automatically, based on the given dataset.
It is not about saying that SVMs or decision trees are bad; in fact, some people use SVMs at the end of a deep neural network to do classification.
The point is simply that the learned feature selection can easily adapt to new data.
Convolution
Convolution
• Convolution is a mathematical operation that computes the integral of the product of two functions (signals), with one of the signals flipped. For example, below we convolve two signals f(t) and g(t).
• Convolution is commutative: conv(a, b) == conv(b, a).
1. Flip the signal g horizontally (rotate it 180 degrees).
2. Slide the flipped g over f, multiplying and accumulating all of its values.
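Written out, the operation described above is the standard convolution integral; the g(t - τ) term is what "flips" g before sliding it over f:

    (f * g)(t) = \int_{-\infty}^{+\infty} f(\tau) \, g(t - \tau) \, d\tau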
Applications of convolution
People use convolution in signal processing for the following use cases:
• Filter signals (1D audio, 2D image processing)
• Check how much one signal is correlated with another
• Find patterns in signals
Example
• Convolve two signals x = (0, 1, 2, 3, 4) and w = (1, -1, 2).
1. The first thing is to flip w horizontally (or rotate it 180 degrees).
2. After that, we slide the flipped w over the input x.
• Observe that on steps 3, 4, and 5 the flipped window is completely inside the input signal.
• In the cases where the flipped window is not fully inside the input window (x), we can either consider the missing values to be zero, or calculate only what can be calculated; e.g. on step 1 we multiply 1 by zero, and the rest is simply ignored.
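A quick NumPy check of this example (assuming NumPy is available; np.convolve performs the flip-and-slide internally):

    import numpy as np

    x = np.array([0, 1, 2, 3, 4])
    w = np.array([1, -1, 2])

    # 'full' treats samples outside x as zero, so every overlap position is kept.
    print(np.convolve(x, w, mode="full"))   # [0 1 1 3 5 2 8]

    # 'valid' keeps only steps 3-5, where the flipped w fits entirely inside x.
    print(np.convolve(x, w, mode="valid"))  # [1 3 5]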
Transforming convolution to a computation graph
[Figure: the 1D convolution example expressed as a computation graph.]
2D Convolution
• 2D convolutions are used as image filters.
2D Convolution, cont.
Consider a 5x5 image convolved with a 3x3 filter, producing an output (feature map).
[Figure: the 3x3 filter sliding over the 5x5 image to produce the feature map.]
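A small SciPy sketch of the same setup; the pixel and filter values below are made up for illustration:

    import numpy as np
    from scipy.signal import convolve2d

    image = np.array([[1, 1, 1, 0, 0],
                      [0, 1, 1, 1, 0],
                      [0, 0, 1, 1, 1],
                      [0, 0, 1, 1, 0],
                      [0, 1, 1, 0, 0]])
    kernel = np.array([[1, 0, 1],
                       [0, 1, 0],
                       [1, 0, 1]])

    # 'valid' slides the (flipped) kernel only where it fits fully inside the
    # image: a 5x5 input with a 3x3 filter gives a 3x3 feature map.
    feature_map = convolve2d(image, kernel, mode="valid")
    print(feature_map.shape)  # (3, 3)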
Example
[Figure: worked example of a 2D convolution.]
Stride
• By default, when we are doing convolution we move our window one pixel at a time (stride = 1), but sometimes in convolutional neural networks we want to move more than one pixel at a time.
• For example, with kernels of size 2 we will use a stride of 2. Setting both the kernel size and the stride to 2 will result in the output being exactly half the size of the input along both dimensions, as the sketch below shows.
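One way to see the halving, sketched with SciPy (a stride-S convolution gives the same values as keeping every S-th output of the dense stride-1 convolution; real frameworks skip the positions directly):

    import numpy as np
    from scipy.signal import convolve2d

    image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 input
    kernel = np.ones((2, 2))                          # 2x2 kernel

    dense = convolve2d(image, kernel, mode="valid")   # stride 1 -> shape (5, 5)
    strided = dense[::2, ::2]                         # stride 2 -> shape (3, 3)
    print(strided.shape)                              # half of 6x6 in each dimension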
Convolution and stride
Output size: (N - F) / stride + 1
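For instance, with an input of size N = 7 and a filter of size F = 3 (numbers chosen only for illustration):

    stride 1: (7 - 3)/1 + 1 = 5
    stride 2: (7 - 3)/2 + 1 = 3
    stride 3: (7 - 3)/3 + 1 = 2.33  -> not an integer, so a stride of 3 does not fit this input.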
Input padding
• By default, the convolution output will always be smaller than the input. To avoid this behaviour we need to use padding.
• In order to keep the convolution result the same size as the input, and to avoid an effect called circular convolution, we pad the signal with zeros.
• Where you put the zeros depends on what you want to do:
  • in the 1D case you concatenate them at each end;
  • in 2D they are normally placed all the way around the original signal.
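A quick NumPy illustration of both cases (np.pad pads with zeros by default):

    import numpy as np

    x = np.array([1, 2, 3])      # 1D: zeros are concatenated at each end
    print(np.pad(x, 1))          # [0 1 2 3 0]

    img = np.ones((3, 3))        # 2D: zeros go all the way around
    print(np.pad(img, 1).shape)  # (5, 5)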
• We have a 5x5 input convolved with a 3x3 filter (k = 3).
[Figure: left, convolution with no padding and stride of 1; right, convolution with padding and stride of 1.]
Output size for 2D
• If we consider padding and stride, for an input of spatial size [H, W] padded by P, with a square kernel of size F and stride S, the output size of the convolution is defined as:

W_out = (W - F + 2P)/S + 1
H_out = (H - F + 2P)/S + 1

• Example: a 5x5 (WxH) input with a conv layer using Stride = 1, Pad = 1, F = 3 (3x3 kernel).
The output size: ((5 - 3 + 2)/1) + 1 = 5, so the output is 5x5, the same as the input.
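A small helper to check the formula (the function name is mine, not from the lecture):

    def conv_output_size(n, f, pad=0, stride=1):
        """Output size along one dimension: (n - f + 2*pad) / stride + 1."""
        return (n - f + 2 * pad) // stride + 1

    print(conv_output_size(5, 3, pad=1, stride=1))  # 5 -> same size as the input
    print(conv_output_size(5, 3, pad=0, stride=1))  # 3 -> no padding shrinks it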
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN) have become an important tool for object recognition and related tasks:
• Object detection
• Action classification
• Image captioning (description)
Large Scale Object Recognition Challenge
[Figure: results from the Large Scale Object Recognition Challenge.]
CNN
• A CNN is composed of layers that filter (convolve) the inputs to extract useful information.
• CNNs have two special kinds of layers: convolution layers and pooling layers.
• A convolution layer has a set of filters. Its output is a set of feature maps, each one obtained by convolving the image with one of the filters.
• CNNs are particularly well suited to working with images.
• Common architecture for a CNN:
[CONV -> ReLU -> Pool -> CONV -> ReLU -> Pool -> FC -> Softmax_loss (during training)]
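A minimal sketch of that architecture in PyTorch (assuming PyTorch is available; the channel counts, 32x32 input size, and 10 classes are illustrative assumptions, not from the lecture):

    import torch
    import torch.nn as nn

    # [CONV -> ReLU -> Pool -> CONV -> ReLU -> Pool -> FC -> Softmax_loss]
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),   # CONV: 16 filters
        nn.ReLU(),
        nn.MaxPool2d(2),                              # Pool: 32x32 -> 16x16
        nn.Conv2d(16, 32, kernel_size=3, padding=1),  # CONV: 32 filters
        nn.ReLU(),
        nn.MaxPool2d(2),                              # Pool: 16x16 -> 8x8
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 10),                    # FC: class scores
    )

    # During training, CrossEntropyLoss applies the softmax and the loss together.
    loss_fn = nn.CrossEntropyLoss()
    scores = model(torch.randn(1, 3, 32, 32))
    print(scores.shape)  # torch.Size([1, 10])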
Main actor: the convolution layer
• The most important operations in a convolutional neural network are the convolution layers.
• A filter will look for a particular thing over all of the image; this means that it will look for a pattern in the whole image with just one filter, as sketched below.
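As a sketch of "one filter scans the whole image", here is a hand-crafted vertical-edge filter (in a CNN the filter values would be learned, not fixed like this):

    import numpy as np
    from scipy.signal import convolve2d

    edge_filter = np.array([[1, 0, -1],
                            [1, 0, -1],
                            [1, 0, -1]])

    image = np.zeros((6, 6))  # dark on the left, bright on the right
    image[:, 3:] = 1.0

    # The same 3x3 filter is slid over every position; the feature map
    # responds strongly only where the image contains a vertical edge.
    feature_map = convolve2d(image, edge_filter, mode="valid")
    print(feature_map)  # each row reads [0. 3. 3. 0.]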
ReLU
• It is common to apply a rectified linear nonlinearity: y_i = max(z_i, 0).
Why might we do this?
• Convolution is a linear operation.
• Therefore, we need a nonlinearity; otherwise 2 convolution layers would be no more powerful than 1.
• A ReLU is used after every convolution operation.
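As a one-line NumPy sketch:

    import numpy as np

    z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
    y = np.maximum(z, 0)  # ReLU: y_i = max(z_i, 0)
    print(y)              # [0.  0.  0.  1.5 3. ]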
ReLU, cont.
• Other functions are also used to increase nonlinearity, for example the saturating hyperbolic tangent and the sigmoid function.
• Compared to these, ReLU is preferable because it makes the neural network train several times faster.
Pooling Layers
• The other type of layer is a pooling layer. These layers reduce the size of the representation and build in invariance to small transformations.
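A compact NumPy sketch of 2x2 max pooling (the reshape trick assumes the input size is divisible by the pool size):

    import numpy as np

    x = np.arange(16, dtype=float).reshape(4, 4)

    # Group pixels into 2x2 blocks and keep each block's maximum:
    pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
    print(pooled)  # [[ 5.  7.]
                   #  [13. 15.]]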
Summary
• Deep Learning: shallow vs. deep models
• Convolution: 1D convolution, 2D convolution, stride, padding
• Convolutional Neural Networks (CNN): convolution layer, ReLU, pooling layer
Next: more details on Convolutional Neural Networks (CNN)

Thanks