DeepLearning L1 Intro

Deep learning introduces neural networks that can learn representations of data directly from large datasets. This overcomes limitations of hand-engineered features. Recent progress is due to large datasets, powerful GPUs, and improved techniques like backpropagation for training networks. The basic building block of neural networks is the perceptron, which performs a weighted sum of its inputs and applies an activation function. Networks are trained by minimizing a loss function using gradient descent and backpropagation to update weights. Techniques like dropout and early stopping help prevent overfitting during training.


Deep Learning: Introduction

Pr. Tarik Fissaa


DATA – INE2

Academic Year: 2022/2023
What is Deep Learning?
Why Deep Learning and Why Now?
Why Deep Learning?
Hand-engineered features are time-consuming, brittle, and not scalable in practice.

Can we learn the underlying features directly from data?


Why Now?

Neural networks date back decades, so why the resurgence?

1. Big Data: larger datasets, easier collection & storage
2. Hardware: Graphics Processing Units (GPUs), massively parallelizable
3. Software: improved techniques, new models, toolboxes
The Perceptron
The structural building block for deep learning
The perceptron: Forward Propagation
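In the forward pass, the perceptron takes the weighted sum of its inputs plus a bias and passes it through a nonlinear activation function g, producing ŷ = g(w0 + Σ xi wi). A minimal NumPy sketch, with a sigmoid activation and purely illustrative weights (none of these values come from the slides):

    import numpy as np

    def sigmoid(z):
        # squash any real number into the interval (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def perceptron_forward(x, w, b):
        # weighted sum of the inputs plus the bias, then the nonlinearity
        z = np.dot(w, x) + b
        return sigmoid(z)

    # illustrative values only
    x = np.array([-1.0, 2.0])    # inputs x1, x2
    w = np.array([3.0, -2.0])    # weights w1, w2
    b = 1.0                      # bias w0
    print(perceptron_forward(x, w, b))   # sigmoid(-6) ~ 0.002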
Common Activation Functions
Importance of Activation Functions
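Without a nonlinear activation, any stack of layers collapses into a single linear function, so the activation is what lets the network represent nonlinear decision boundaries. Three activations that are standard choices here (the selection is an assumption, not read off the slide) can be sketched as:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # output in (0, 1)

    def tanh(z):
        return np.tanh(z)                 # output in (-1, 1), zero-centered

    def relu(z):
        return np.maximum(0.0, z)         # zero for negative inputs, identity otherwise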
The Perceptron: Example
Building neural networks with Perceptrons
The Perceptron: simplified
Multi Output Perceptron
Because all inputs are densely connected to all outputs, these layers are called Dense layers
Single Layer Neural Network
Deep Neural Network
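A dense layer connects every input to every output through a weight matrix and a bias vector; a deep network is simply several such layers stacked with nonlinearities in between. A minimal sketch, assuming sigmoid activations and random, untrained parameters:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def dense(x, W, b):
        # every input connected to every output: z = W x + b
        return W @ x + b

    # a 2 -> 3 -> 3 -> 1 architecture with random (untrained) parameters
    sizes = [2, 3, 3, 1]
    params = [(rng.standard_normal((n_out, n_in)), np.zeros(n_out))
              for n_in, n_out in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        h = x
        for W, b in params:
            h = sigmoid(dense(h, W, b))   # hidden layers and output layer
        return h

    print(forward(np.array([4.0, 5.0])))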
Applying Neural Networks
Example

Will I pass this class?

Let's start with a simple two-feature model:

𝑥1 = Number of lectures you attend.

𝑥2 = Hours spent on the final project


Example problem: Will I pass this class?

[Figure: training data plotted with x1 = number of lectures you attend and x2 = hours spent on the final project as the two features.]
Quantifying Loss
The loss of our network measures the cost incurred from incorrect predictions
Empirical Loss
The empirical loss measures the total loss over our entire dataset
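Written out, the empirical loss is the average of the per-example loss over the n training pairs (x^(i), y^(i)); this is the standard formulation:

    J(W) = \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}\left( f(x^{(i)}; W),\, y^{(i)} \right)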
Binary Cross Entropy Loss
Cross entropy loss can be used with models that output a probability between 0 and 1.
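In its standard form, with f(x^(i); W) the predicted probability and y^(i) ∈ {0, 1} the true label:

    J(W) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y^{(i)} \log f(x^{(i)}; W) + \left(1 - y^{(i)}\right) \log\left(1 - f(x^{(i)}; W)\right) \right]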
Mean Squared Error (MSE) Loss
Mean squared error loss can be used with regression models that output continuous real numbers.
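Its standard form averages the squared difference between the true value and the prediction:

    J(W) = \frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - f(x^{(i)}; W) \right)^{2}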
Training Neural Networks
Loss Optimization
We want to find the network weights that achieve the lowest loss
Gradient Descent
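In outline, gradient descent initializes the weights randomly, then repeatedly computes the gradient of the loss and takes a small step in the opposite direction, W ← W − η ∂J(W)/∂W. A minimal sketch, where compute_gradient is a hypothetical placeholder for whatever routine (e.g. backpropagation) returns ∂J/∂W:

    import numpy as np

    def gradient_descent(compute_gradient, n_weights, lr=0.01, n_steps=1000):
        # 1. initialize weights randomly
        W = np.random.normal(size=n_weights)
        # 2. loop until convergence (here, a fixed number of steps)
        for _ in range(n_steps):
            grad = compute_gradient(W)   # 3. compute the gradient dJ(W)/dW
            W = W - lr * grad            # 4. update weights against the gradient
        return W                         # 5. return the learned weights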
Computing gradients: Backpropagation

How does a small change in one weight (e.g. w2) affect the final loss J(W)?
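Backpropagation answers this with the chain rule. For a weight w2 that feeds the output ŷ directly, and a weight w1 deeper in the network that acts through a hidden activation z1 (notation assumed from the standard presentation):

    \frac{\partial J(W)}{\partial w_2} = \frac{\partial J(W)}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial w_2}
    \qquad
    \frac{\partial J(W)}{\partial w_1} = \frac{\partial J(W)}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z_1} \cdot \frac{\partial z_1}{\partial w_1}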

Repeat this for every weight in the network using gradients from later layers
Neural Networks in Practice:
Optimization
Training Neural Networks is difficult

"Visualizing the Loss Landscape of Neural Nets", Hao Li et al., Dec 2017
Loss functions can be difficult to optimize
Setting the Learning Rate
How do we deal with this?

Idea 1:
Try lots of different learning rates and see what works "just right".

Idea 2:
Do something smarter!
Design an adaptive learning rate that adapts to the landscape.
Adaptive Learning Rates

• Learning rates are no longer fixed


• Can be made larger or smaller depending on:
• How large the gradient is
• How fast learning is happening
• Size of particular weights
• Etc...
Gradient Descent Algorithms
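As one concrete example of an adaptive scheme, the Adam update rescales each parameter's step by running estimates of the gradient's mean and (uncentered) variance. A minimal sketch with the commonly used default hyperparameters (the choice of Adam here is an assumption, not taken from the slide):

    import numpy as np

    def adam_step(W, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        # running estimates of the gradient's mean and uncentered variance
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        # bias correction, important during the first steps (t starts at 1)
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # per-parameter step: larger where gradients have been small and steady
        W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
        return W, m, v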
Neural Networks in Practice:
Mini-batches
Mini-batches while training

More accurate estimation of gradient


Smoother convergence

Allows for larger learning rates



Mini-batches lead to faster training


Can parallelize computation and achieve significant speed increases on GPUs
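A sketch of what mini-batching typically looks like in a training loop: shuffle the data each epoch, slice out a batch, and update the weights with the gradient averaged over that batch (the batch size and the compute_gradient helper are assumptions):

    import numpy as np

    def train_minibatch_sgd(X, y, W, compute_gradient, lr=0.1,
                            batch_size=32, n_epochs=10):
        n = X.shape[0]
        for _ in range(n_epochs):
            order = np.random.permutation(n)          # shuffle once per epoch
            for start in range(0, n, batch_size):
                idx = order[start:start + batch_size]
                # gradient averaged over the mini-batch, not a single example
                grad = compute_gradient(W, X[idx], y[idx])
                W = W - lr * grad
        return W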
Neural Networks in Practice:
Overfitting
Regularization

What is it?
Technique that constrains our optimization problem to discourage complex models

Why?
To improve the generalization of our model on unseen data
Regularization 1: Dropout
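Dropout randomly zeroes a fraction of a layer's activations at each training step, so the network cannot rely too heavily on any single unit; at test time all units are kept. A minimal sketch of the inverted-dropout variant (the 0.5 rate is a typical default, assumed here):

    import numpy as np

    def dropout(h, rate=0.5, training=True):
        if not training:
            return h                       # keep every unit at test time
        # keep each activation with probability (1 - rate), zero the rest
        mask = (np.random.rand(*h.shape) >= rate).astype(h.dtype)
        # rescale so the expected activation magnitude is unchanged
        return h * mask / (1.0 - rate)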
Regularization 2: Early Stopping
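Early stopping monitors the loss on a held-out validation set and stops training once it stops improving, keeping the weights from the best epoch. A minimal sketch; train_one_epoch, validation_loss, and the patience value are hypothetical names chosen for illustration:

    import copy

    def fit_with_early_stopping(model, train_one_epoch, validation_loss,
                                max_epochs=100, patience=5):
        best_loss = float("inf")
        best_model = copy.deepcopy(model)
        stale_epochs = 0
        for _ in range(max_epochs):
            train_one_epoch(model)
            val_loss = validation_loss(model)
            if val_loss < best_loss:
                best_loss, best_model = val_loss, copy.deepcopy(model)
                stale_epochs = 0
            else:
                stale_epochs += 1
                if stale_epochs >= patience:
                    break                  # validation loss stopped improving
        return best_model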
Summary: The Core Foundations
