
Syllabus

The document discusses the architecture and functionality of Convolutional Neural Networks (CNNs), highlighting the importance of convolutional layers and filters in detecting patterns within images. It explains how CNNs reduce the number of parameters through shared weights and max pooling, allowing for efficient image processing. Additionally, it contrasts CNNs with fully connected networks, emphasizing the advantages of CNNs in handling image data.

Uploaded by

manas.dubey007
Copyright
© All Rights Reserved

SYLLABUS

• We know it is good to learn a small model.
• From this fully connected model, do we really need all the edges?
• Can some of these be shared?
• Some patterns are much smaller than the whole image.

A small region can be represented with fewer parameters, e.g. a “beak” detector.

“upper-left beak” detector and “middle beak” detector: they can be compressed to the same parameters.

A CNN is a neural network with some convolutional layers
(and some other layers). A convolutional layer has a number
of filters, each of which performs a convolution operation.

A filter (e.g. a beak detector)

6 x 6 image:

1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1:

 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2:

-1  1 -1
-1  1 -1
-1  1 -1

These are the network parameters to be learned.
Each filter detects a small pattern (3 x 3).
Filter 1, stride = 1

 1 -1 -1
-1  1 -1
-1 -1  1

Slide the filter over the 6 x 6 image one pixel at a time and take the dot product with the 3 x 3 patch underneath. The first two positions in the top row give 3 and -1.
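
The dot product above can be checked by hand or with a few lines of code. This is a minimal sketch in plain Python; the image and Filter 1 are copied from the slide, while `patch_dot` is a hypothetical helper name introduced only for illustration.

```python
# 6 x 6 image and Filter 1, copied from the slide.
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]


def patch_dot(image, kernel, top, left):
    """Dot product of `kernel` with the image patch whose top-left corner is (top, left).

    Hypothetical helper, not part of the slides.
    """
    size = len(kernel)
    return sum(
        image[top + i][left + j] * kernel[i][j]
        for i in range(size)
        for j in range(size)
    )


first = patch_dot(IMAGE, FILTER_1, 0, 0)   # filter on the top-left corner
second = patch_dot(IMAGE, FILTER_1, 0, 1)  # shifted one pixel to the right
print(first, second)  # 3 -1
```

The value 3 is the largest this filter can produce on a 0/1 image: it occurs where the patch contains the diagonal pattern the filter is looking for.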

If stride = 2

With Filter 1 and stride 2, the filter moves two pixels at a time over the 6 x 6 image. The first two positions in the top row give 3 and -3.

Filter 1, stride = 1: output over the whole 6 x 6 image

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
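
The sliding procedure, including the effect of the stride, can be sketched as a small function. This is an illustration in plain Python, assuming a square image, a square filter and no padding; `conv2d` is a hypothetical name, not a library call.

```python
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]


def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` and return the grid of dot products.

    Assumes a square image and kernel and no padding (hypothetical helper).
    """
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    return [
        [
            sum(
                image[r * stride + i][c * stride + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


map_stride_1 = conv2d(IMAGE, FILTER_1, stride=1)  # 4 x 4 output
map_stride_2 = conv2d(IMAGE, FILTER_1, stride=2)  # 2 x 2 output
for row in map_stride_1:
    print(row)
print(map_stride_2)  # [[3, -3], [-3, 0]]
```

With stride 1 the 6 x 6 image gives a 4 x 4 output; with stride 2 it gives a 2 x 2 output whose top row is the 3 and -3 shown on the stride = 2 slide.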
Filter 2, stride = 1

-1  1 -1
-1  1 -1
-1  1 -1

Repeat this for each filter. Filter 2 gives a second feature map:

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

Feature map: two 4 x 4 images, forming a 2 x 4 x 4 matrix.
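
Applying every filter and stacking the results is what produces the 2 x 4 x 4 feature map. The sketch below assumes plain Python lists and redefines a hypothetical `conv2d` helper so that it runs on its own; the image and filters are the ones from the slides.

```python
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTERS = [
    [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]],   # Filter 1
    [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]],   # Filter 2
]


def conv2d(image, kernel):
    """Stride-1 convolution without padding (hypothetical helper)."""
    k = len(kernel)
    out_size = len(image) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


# One feature map per filter, stacked along a new leading axis.
feature_maps = [conv2d(IMAGE, f) for f in FILTERS]
shape = (len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))
print(shape)  # (2, 4, 4)
```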
Color image

[Figure: the color image is a stack of 6 x 6 channels, and Filter 1 and Filter 2 are each a stack of 3 x 3 slices, one slice per channel.]
Convolution vs. Fully Connected

[Figure: Convolution: the 6 x 6 image is convolved with Filter 1 and Filter 2. Fully connected: the 6 x 6 image is flattened into inputs x1, x2, ..., x36.]
Flatten the 6 x 6 image into 36 inputs, numbered 1 to 36 row by row. The first output of Filter 1 (value 3) connects only to inputs 1, 2, 3, 7, 8, 9, 13, 14 and 15.

Only connect to 9 inputs, not fully connected: fewer parameters!
The second output of Filter 1 (value -1) connects to inputs 2, 3, 4, 8, 9, 10, 14, 15 and 16, using the same 9 weights as the first output.

Fewer parameters: each output connects to only 9 inputs.
Even fewer parameters: shared weights.
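
The connection pattern and the savings can be made concrete with a short calculation. This is a sketch under stated assumptions: inputs and outputs are numbered from 1 in row-major order, stride is 1, biases are ignored, and the fully connected comparison assumes a layer with the same 16 outputs. `connected_inputs` is a hypothetical helper.

```python
IMAGE_SIZE = 6    # 6 x 6 image, 36 flattened inputs
KERNEL_SIZE = 3   # 3 x 3 filter
OUT_SIZE = IMAGE_SIZE - KERNEL_SIZE + 1  # 4 outputs per row with stride 1


def connected_inputs(neuron):
    """1-based indices of the flattened inputs that feed output `neuron` (1-based)."""
    row, col = divmod(neuron - 1, OUT_SIZE)
    return [
        (row + i) * IMAGE_SIZE + (col + j) + 1
        for i in range(KERNEL_SIZE)
        for j in range(KERNEL_SIZE)
    ]


print(connected_inputs(1))  # [1, 2, 3, 7, 8, 9, 13, 14, 15]
print(connected_inputs(2))  # [2, 3, 4, 8, 9, 10, 14, 15, 16]

# Weight counts for one filter's 16 outputs (biases ignored).
fully_connected = IMAGE_SIZE**2 * OUT_SIZE**2   # every input to every output
sparse_unshared = OUT_SIZE**2 * KERNEL_SIZE**2  # 9 inputs per output, separate weights
shared = KERNEL_SIZE**2                         # the same 9 weights reused everywhere
print(fully_connected, sparse_unshared, shared)  # 576 144 9
```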
Convolution → Max Pooling → Convolution → Max Pooling → Flattened → Fully Connected Feedforward network → cat, dog, ……

Convolution and Max Pooling can repeat many times.
Filter 1:

 1 -1 -1
-1  1 -1
-1 -1  1

Feature map from Filter 1:

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

Filter 2:

-1  1 -1
-1  1 -1
-1  1 -1

Feature map from Filter 2:

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3
• Subsampling pixels will not change the object.

[Figure: an image of a bird is subsampled; the smaller image still shows a bird.]

We can subsample the pixels to make the image smaller, so fewer parameters are needed to characterize the image.

• Reducing number of connections
• Shared weights on the edges
• Max pooling further reduces the complexity
[Figure: 6 x 6 image → Conv → Max Pooling → new image, but smaller (2 x 2 image)]

After max pooling, Filter 1 gives

3 0
3 1

and Filter 2 gives

-1 1
 0 3

Each filter is a channel.
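
Max pooling keeps only the largest value in each block of the feature map. The sketch below assumes non-overlapping 2 x 2 blocks (the block size is not stated on the slide, but it is what turns a 4 x 4 map into a 2 x 2 image); `max_pool` is a hypothetical helper, and the two feature maps are the ones computed for Filter 1 and Filter 2.

```python
MAP_1 = [
    [3, -1, -3, -1],
    [-3, 1, 0, -3],
    [-3, -3, 0, 1],
    [3, -2, -2, -1],
]
MAP_2 = [
    [-1, -1, -1, -1],
    [-1, -1, -2, 1],
    [-1, -1, -2, 1],
    [-1, 0, -4, 3],
]


def max_pool(feature_map, size=2):
    """Take the maximum over non-overlapping size x size blocks (hypothetical helper)."""
    n = len(feature_map) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(n)
        ]
        for r in range(n)
    ]


pooled_1 = max_pool(MAP_1)
pooled_2 = max_pool(MAP_2)
print(pooled_1)  # [[3, 0], [3, 1]]
print(pooled_2)  # [[-1, 1], [0, 3]]
```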
Convolution → Max Pooling → a new image, smaller than the original image.

The number of channels is the number of filters.

Convolution and Max Pooling can repeat many times.


Convolution → Max Pooling → a new image → Convolution → Max Pooling → a new image → Flattened → Fully Connected Feedforward network → cat, dog, ……
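[Figure: the 2 x 2 x 2 pooled output (values 3, 0, 3, 1 and -1, 1, 0, 3) is flattened into a vector of 8 values, which is the input to the Fully Connected Feedforward network.]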
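
Flattening simply lays the pooled channels out as one long vector. A minimal sketch, assuming channel-by-channel, row-major ordering; the ordering is an assumption for illustration, and any fixed ordering works as long as it is used consistently.

```python
# Pooled output: 2 channels (one per filter), each 2 x 2.
POOLED = [
    [[3, 0], [3, 1]],    # channel from Filter 1
    [[-1, 1], [0, 3]],   # channel from Filter 2
]

# Channel by channel, row by row (assumed ordering).
flattened = [value for channel in POOLED for row in channel for value in row]
print(flattened)       # [3, 0, 3, 1, -1, 1, 0, 3]
print(len(flattened))  # 8
```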
CNN in Keras

Only modified the network structure and input format (vector -> 3-D tensor).

input → Convolution → Max Pooling → Convolution → Max Pooling

• There are 25 3x3 filters.
• Input_shape = (28, 28, 1): 28 x 28 pixels; the last value is the number of channels (1: black/white, 3: RGB).
CNN in Keras

Only modified the network structure and input format (vector -> 3-D array).

Input: 1 x 28 x 28
→ Convolution: 25 x 26 x 26
→ Max Pooling: 25 x 13 x 13
→ Convolution: 50 x 11 x 11
→ Max Pooling: 50 x 5 x 5

How many parameters for each filter? First convolution: 9. Second convolution: 225 = 25 x 9.
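
The per-filter counts follow from the fact that a filter spans every input channel. A short sketch of the arithmetic, assuming 3 x 3 filters in both convolutions and ignoring bias terms, as the slide's counts do; `params_per_filter` is a hypothetical helper.

```python
def params_per_filter(kernel_size, in_channels):
    """Weights in one filter: kernel_size x kernel_size for each input channel (no bias)."""
    return kernel_size * kernel_size * in_channels


first = params_per_filter(kernel_size=3, in_channels=1)    # input has 1 channel
second = params_per_filter(kernel_size=3, in_channels=25)  # input has 25 channels
print(first, second)  # 9 225

# Total weights per layer: 25 filters in the first convolution, 50 in the second.
print(25 * first, 50 * second)  # 225 11250
```

The second convolution's filters are larger because each one looks at all 25 channels produced by the first convolution.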
CNN in Keras

Only modified the network structure and input format (vector -> 3-D array).

Input: 1 x 28 x 28
→ Convolution: 25 x 26 x 26
→ Max Pooling: 25 x 13 x 13
→ Convolution: 50 x 11 x 11
→ Max Pooling: 50 x 5 x 5
→ Flattened: 1250
→ Fully connected feedforward network
→ Output
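
The shapes in this pipeline can be traced with two small rules. This is a sketch, assuming 3 x 3 filters with stride 1 and no padding, and 2 x 2 max pooling that discards any leftover row or column; these settings are inferred from the shapes on the slide. `conv_shape` and `pool_shape` are hypothetical helpers, not Keras functions.

```python
def conv_shape(shape, filters, kernel=3):
    """Output shape of a stride-1, unpadded convolution on (channels, height, width)."""
    _, height, width = shape
    return (filters, height - kernel + 1, width - kernel + 1)


def pool_shape(shape, size=2):
    """Output shape of non-overlapping max pooling; leftover rows and columns are dropped."""
    channels, height, width = shape
    return (channels, height // size, width // size)


shape = (1, 28, 28)
trace = [shape]
for step in (
    lambda s: conv_shape(s, filters=25),
    pool_shape,
    lambda s: conv_shape(s, filters=50),
    pool_shape,
):
    shape = step(shape)
    trace.append(shape)

print(trace)
# [(1, 28, 28), (25, 26, 26), (25, 13, 13), (50, 11, 11), (50, 5, 5)]

flattened_size = shape[0] * shape[1] * shape[2]
print(flattened_size)  # 1250
```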
[Figure: a 19 x 19 matrix → Neural Network → next move (19 x 19 positions)]

Board encoding: Black: 1, white: -1, none: 0.

A fully-connected feedforward network can be used, but a CNN performs much better.

The following is a quotation from their Nature article:
Note: AlphaGo does not use Max Pooling.
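
The board encoding above can be sketched as a 19 x 19 list of lists. This is an illustration only: `encode_board` and the 0-based (row, column) stone coordinates are assumptions introduced here, not AlphaGo's actual input format.

```python
BOARD_SIZE = 19
BLACK, WHITE, EMPTY = 1, -1, 0  # values from the slide


def encode_board(black_stones, white_stones):
    """Return a 19 x 19 matrix; stones are 0-based (row, col) pairs (assumed convention)."""
    board = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for row, col in black_stones:
        board[row][col] = BLACK
    for row, col in white_stones:
        board[row][col] = WHITE
    return board


board = encode_board(black_stones=[(3, 3), (15, 15)], white_stones=[(3, 15)])
print(len(board), len(board[0]))               # 19 19
print(board[3][3], board[3][15], board[0][0])  # 1 -1 0
```

Because the board is a grid, it can be fed to convolutional layers just like an image.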
CNN

The filters move in the frequency direction.

[Figure: a spectrogram treated as an image, with axes Frequency and Time]

Source of image: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.703.6858&rep=rep1&type=pdf
