An Introduction To Neural Networks. Instituto Tecgraf PUC-Rio. Name: Fernanda Duarte. Advisor: Marcelo Gattass
A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P, if its performance at tasks in T, as measured by P, improves
with experience E. (Mitchell, 1997)
Applications
- Digit Recognition
- Face Recognition
- Recommendation Engines
Formal tasks: Playing board games, solving puzzles, mathematical and logic
problems → Easier to code!
Feedforward Neural Networks
(or Deep Feedforward Networks, or Multilayer Perceptrons (MLP))
(see the “Deep Learning” book, Ian Goodfellow et al.)
- Nonlinearity (extends the kinds of functions that the network can represent, e.g. the XOR
(“exclusive or”) function, which no purely linear model can compute; see the sketch below)
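As a concrete illustration, here is a minimal sketch of the ReLU-based XOR solution given in the
Goodfellow et al. book; the weight values are hand-picked for this example rather than learned, and
the variable names are only illustrative:

    import numpy as np

    # A single hidden layer with a nonlinearity (ReLU here) can represent XOR,
    # which no purely linear model can. Weights follow the worked example in
    # the "Deep Learning" book.
    W = np.array([[1., 1.],
                  [1., 1.]])   # hidden-layer weights
    c = np.array([0., -1.])    # hidden-layer biases
    w = np.array([1., -2.])    # output-layer weights
    b = 0.0                    # output-layer bias

    def xor_net(x):
        h = np.maximum(0.0, W.T @ x + c)   # hidden layer: ReLU(W^T x + c)
        return w @ h + b                   # linear output layer

    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x, xor_net(np.array(x, dtype=float)))   # prints 0, 1, 1, 0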
Feedforward Neural Networks
Goal: Approximate some function f*.
In this case, f^(1) is called the first hidden layer of the network, f^(2) is
called the second hidden layer, and the final layer f^(3) is called the output
layer.
- Why hidden layer? Behavior not directly specified → learning algorithm must
decide how to use those layers to form the f(x) that best approximates f*.
- Length of the chain gives the depth of the model → deep learning (you can
“stack” multiple layers)
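Concretely, the chain structure from the cited book can be written for a three-layer network as

    f(x) = f^(3)( f^(2)( f^(1)(x) ) )

where each f^(l) is one layer, and the number of composed functions gives the depth.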
Graph representation of the network
- The feedforward network model is associated with a directed acyclic graph
(DAG) describing how the functions are composed together.
Artificial neuron
Fully-connected layers
- Neurons between two adjacent layers are fully pairwise connected, but neurons
within a single layer share no connections.
Feedforward computation
- The abstraction of a layer has the nice property that it allows us to use efficient
vectorized code (e.g. matrix multiplies).
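As a minimal sketch (the shapes and names below are illustrative: a layer with 3 inputs and 4 neurons):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 3))   # one row of weights per neuron in the layer
    b = rng.standard_normal(4)        # one bias per neuron
    x = rng.standard_normal(3)        # input vector (e.g. the previous layer's activations)

    # The whole layer is one matrix multiply plus an elementwise nonlinearity,
    # instead of an explicit loop over individual neurons.
    h = sigmoid(W @ x + b)
    print(h.shape)   # (4,)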
Feedforward computation
Example of activation function: Sigmoid
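The sigmoid squashes any real number into the interval (0, 1):

    σ(z) = 1 / (1 + e^(−z)),   with derivative   σ'(z) = σ(z) · (1 − σ(z)).

The second identity is convenient later, because backpropagation can compute the derivative directly
from the stored activation.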
A loss function C is a measure of how wrong the model is in terms of its ability to
estimate the relationship between x and y, i.e., y = f*(x), with the chosen
parameters (e.g. Mean Squared Error (MSE)).
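For MSE over n training examples with targets y_i and predictions ŷ_i, this is

    C = (1/n) · Σ_i (y_i − ŷ_i)²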
Parameter update during the learning process:

    w_ij ← w_ij − η · ∂C/∂w_ij

where η is the learning rate (a hyperparameter).
Gradient Descent
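As a minimal sketch of gradient descent under MSE, consider a one-parameter linear model ŷ = w·x
(the toy data, learning rate and variable names below are illustrative assumptions, not taken from
the slides):

    import numpy as np

    # Toy data generated from y = 2x; the model is y_hat = w * x.
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = 2.0 * x

    w = 0.0      # initial parameter
    eta = 0.1    # learning rate (hyperparameter)

    for step in range(100):
        y_hat = w * x
        loss = np.mean((y - y_hat) ** 2)          # MSE
        grad = np.mean(-2.0 * (y - y_hat) * x)    # dC/dw
        w = w - eta * grad                        # update: w <- w - eta * dC/dw

    print(w)   # converges toward 2.0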
Backpropagation
How to compute the gradient of the cost function with respect to the weights w and biases b
(efficiently)?
The backprop algorithm gives us detailed insights into how changing the weights and biases
changes the overall behaviour of the network.
See: http://neuralnetworksanddeeplearning.com/chap2.html#the_backpropagation_algorithm
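In the notation of the chapter linked above, the algorithm reduces to four equations: the output-layer
error δ^L = ∇_a C ⊙ σ'(z^L); the recurrence that propagates the error backwards,
δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ'(z^l); and the gradients themselves, ∂C/∂b^l_j = δ^l_j and
∂C/∂w^l_jk = a^(l−1)_k · δ^l_j (⊙ denotes the elementwise product).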
Learning process (summary)
For each training example i in the training set:
1 - Feedforward computation;
2 - Backpropagation;
3 - Weight update.
- Learning rate (η)
- Epoch (one complete pass over the training set)
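Putting the three steps together, here is a minimal sketch of the full loop for a one-hidden-layer
sigmoid network trained with MSE to learn XOR from data (this time the weights are learned rather
than hand-picked; all sizes, hyperparameters and variable names are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)

    # Toy dataset: the four XOR examples (2 features each).
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    Y = np.array([0, 1, 1, 0], dtype=float)

    # One hidden layer with 4 units, one output unit.
    W1 = rng.standard_normal((4, 2)); b1 = np.zeros(4)
    W2 = rng.standard_normal((1, 4)); b2 = np.zeros(1)
    eta = 0.5   # learning rate

    for epoch in range(5000):            # one epoch = one pass over the training set
        for x, y in zip(X, Y):
            # 1 - Feedforward computation
            z1 = W1 @ x + b1; a1 = sigmoid(z1)
            z2 = W2 @ a1 + b2; a2 = sigmoid(z2)

            # 2 - Backpropagation (chain rule, layer by layer)
            delta2 = (a2 - y) * a2 * (1 - a2)          # output-layer error
            delta1 = (W2.T @ delta2) * a1 * (1 - a1)   # hidden-layer error

            # 3 - Weight update (gradient descent)
            W2 -= eta * np.outer(delta2, a1); b2 -= eta * delta2
            W1 -= eta * np.outer(delta1, x);  b1 -= eta * delta1

    # Predictions should approach [0, 1, 1, 0] (a poor local minimum is possible
    # for some random seeds, since this is plain SGD on a tiny network).
    print([round(float(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)[0]), 2) for x in X])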
(Figure: the AlexNet convolutional network architecture)
Convolutional Neural Networks (CNN or ConvNets)
- Useful when the proximity between two data points indicates how related they are (e.g. pixels in
images!) → CNNs preserve spatial structure.
- The neurons in a layer will only be connected to a small region of the layer before it, instead of all
of the neurons in a fully-connected manner → fewer parameters!
- Convolutional Neural Networks take advantage of the fact that the input consists of images and
they constrain the architecture in a more sensible way. In particular, unlike a regular Neural
Network, the layers of a ConvNet have neurons arranged in 3 dimensions: width, height, depth.
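(For example, a 32x32 RGB image is a 32x32x3 input volume: width 32, height 32, and depth 3 for the
color channels.)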
Convolutional Neural Networks (CNN or ConvNets)
Every layer of a ConvNet transforms one volume of activations to another through a differentiable
function.
We use three main types of layers to build a basic CNN architecture: Convolutional Layer, Pooling Layer,
and Fully-Connected Layer.
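A common layer pattern (one example, in the spirit of the CS231n notes listed in the references) is:

    INPUT → [CONV → RELU → POOL] x 2 → FC → RELU → FC (class scores)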
Convolutional Layer
Forward pass: We slide (or convolve) each filter across the input volume and compute dot products
between the entries of the filter and the corresponding entries of the input, followed by a nonlinear
activation function applied elementwise.
Every filter produces a 2-dimensional activation map. (e.g. if we use 12 filters of dimensions 5x5x3 on a
32x32x3 input, with zero-padding that preserves the spatial size, the conv layer's output volume is
32x32x12, i.e., 12 activation maps of dimensions 32x32)
Intuitively, the network will learn filters that activate when they see some type of visual feature, such as
an edge of some orientation, for example.
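A minimal sketch of this computation for a single filter, written as an explicit loop to make the dot
products visible (no padding, stride 1; real implementations are vectorized, and the function name
here is only illustrative):

    import numpy as np

    def conv2d_single_filter(x, w, b):
        """x: input volume (H, W, D); w: filter (F, F, D); b: scalar bias.
        Returns one (H-F+1, W-F+1) activation map (no padding, stride 1)."""
        H, W_in, D = x.shape
        F = w.shape[0]
        out = np.zeros((H - F + 1, W_in - F + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                patch = x[i:i+F, j:j+F, :]         # local region of the input
                out[i, j] = np.sum(patch * w) + b  # elementwise product summed = dot product, plus bias
        return np.maximum(out, 0.0)                # elementwise ReLU nonlinearity

    x = np.random.default_rng(0).standard_normal((32, 32, 3))   # e.g. a 32x32 RGB image
    w = np.random.default_rng(1).standard_normal((5, 5, 3))     # one 5x5x3 filter
    print(conv2d_single_filter(x, w, 0.0).shape)                # (28, 28) without zero-padding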
Zero-padding: the border of the input volume can be padded with zeros, which lets us control the
spatial size of the output volume (most commonly, keep it equal to the input size).
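The output spatial size follows the formula (W − F + 2P)/S + 1, for input width W, filter size F,
zero-padding P and stride S. In the 12-filter example above: W = 32, F = 5, P = 2, S = 1 gives
(32 − 5 + 4)/1 + 1 = 32, which is how the 32x32x12 output volume is obtained.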
Pooling Layer
Its function is to progressively reduce the spatial size of the representation to reduce the amount of
parameters and computation in the network, and hence to also control overfitting.
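The most common form is max pooling with 2x2 filters and stride 2: each 2x2 block of an activation map
is replaced by its maximum, so a 32x32x12 volume becomes 16x16x12 (75% of the activations are
discarded while the depth stays the same).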
Fully-Connected Layer
Same as before.
http://yann.lecun.com/exdb/mnist/
https://www.youtube.com/watch?v=d14TUNcbn1k
http://neuralnetworksanddeeplearning.com/chap1.html
http://www.deeplearningbook.org/
https://www.youtube.com/watch?v=1L0TKZQcUtA
https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220
References
Backpropagation:
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
https://ayearofai.com/rohan-lenny-1-neural-networks-the-backpropagation-algorithm-explained-abf4609d4f9d
http://neuralnetworksanddeeplearning.com/chap2.html
https://www.youtube.com/watch?v=tIeHLnjs5U8
References
CNNs:
https://hashrocket.com/blog/posts/a-friendly-introduction-to-convolutional-neural-networks
https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
http://web.stanford.edu/class/cs231a/lectures/intro_cnn.pdf
http://cs231n.stanford.edu/