Lec16 - Autoencoders

Autoencoders are unsupervised neural networks designed to reproduce their input. They consist of an encoder that compresses the input into a latent space representation and a decoder that reconstructs the input from the latent space. Variations include denoising autoencoders which add noise to the input to learn a more robust representation, and sparse autoencoders which add regularization to activate only a few nodes in the latent space. Contractive autoencoders enforce the latent space to be invariant to small perturbations in the input.

Autoencoders

• Supervised learning uses explicit labels/correct output
in order to train a network.
• E.g., classification of images.

• Unsupervised learning relies on the data only.
• E.g., CBOW and skip-gram word embeddings: the output is
determined implicitly from word order in the input data.
• Key point is to produce a useful embedding of words.
• The embedding encodes structure such as word similarity
and some relationships.
• We still need to define a loss – this is a form of implicit supervision.
Autoencoders
• Autoencoders are designed to reproduce their
input, especially for images.
• Key point is to reproduce the input from a learned
encoding.

https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders
• Compare PCA/SVD
• PCA takes a collection of vectors (images) and produces a
usually smaller set of vectors that can be used to
approximate the input vectors via linear combination.
• Very efficient for certain applications.
• Fourier and wavelet compression is similar.

• Neural network autoencoders
• Can learn nonlinear dependencies
• Can use convolutional layers
• Can use transfer learning

https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders: structure
• Encoder: compress the input into a latent space of
usually smaller dimension: h = f(x).
• Decoder: reconstruct the input from the latent space:
r = g(f(x)), with r as close to x as possible.

https://towardsdatascience.com/deep-inside-autoencoders-7e41f319999f
Autoencoders: applications
• Denoising: input clean image + noise and train to
reproduce the clean image.

https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders: Applications
• Image colorization: input black-and-white images and train
the network to produce color images.

https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders: Applications
• Watermark removal

https://www.edureka.co/blog/autoencoders-tutorial/
Properties of Autoencoders
• Data-specific: Autoencoders are only able to
compress data similar to what they have been
trained on.
• Lossy: The decompressed outputs will be degraded
compared to the original inputs.
• Learned automatically from examples: It is easy to
train specialized instances of the algorithm that will
perform well on a specific type of input.

https://www.edureka.co/blog/autoencoders-tutorial/
Capacity
• As with other NNs, overfitting is a problem when
capacity is too large for the data.

• Autoencoders address this through some combination of:
• Bottleneck layer – fewer degrees of freedom than the space of
possible outputs.
• Training to denoise.
• Sparsity through regularization.
• Contractive penalty.
Bottleneck layer (undercomplete)
• Suppose input images are n×n and the latent space has
dimension m < n×n.
• Then the latent space is not sufficient to reproduce
all possible images exactly.
• The network needs to learn an encoding that captures the
important features in the training data, sufficient for
approximate reconstruction.
Simple bottleneck layer in Keras
from keras.layers import Input, Dense
from keras.models import Model

input_img = Input(shape=(784,))
encoding_dim = 32
encoded = Dense(encoding_dim, activation='relu')(input_img)   # encoder
decoded = Dense(784, activation='sigmoid')(encoded)           # decoder
autoencoder = Model(input_img, decoded)

• Maps flattened 28×28 images (784 pixels) into a 32-dimensional latent vector.
• Can also use more layers and/or convolutional layers.

https://blog.keras.io/building-autoencoders-in-keras.html
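A minimal sketch of training this bottleneck autoencoder on MNIST, in the spirit of the Keras blog tutorial; the optimizer, loss, epoch count, and batch size below are illustrative choices, not values given in the slides:

from keras.datasets import mnist
import numpy as np

# Load MNIST, scale to [0, 1], and flatten 28x28 images to 784-vectors
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))

# The target is the input itself: the network learns to reproduce x
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,
                epochs=50, batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))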
Denoising autoencoders
• A basic autoencoder trains to minimize the loss
between x and the reconstruction g(f(x)).
• Denoising autoencoders train to minimize the loss
between x and g(f(x + w)), where w is random noise.
• Same possible architectures, different training data
(see the sketch below).

• Kaggle has a dataset on damaged documents.

https://blog.keras.io/building-autoencoders-in-keras.html
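A minimal sketch of the "same architecture, different training data" point, assuming the flattened MNIST arrays x_train/x_test from the earlier sketch; the noise level noise_factor is an illustrative choice:

import numpy as np

noise_factor = 0.5
x_train_noisy = np.clip(x_train + noise_factor * np.random.normal(size=x_train.shape), 0., 1.)
x_test_noisy = np.clip(x_test + noise_factor * np.random.normal(size=x_test.shape), 0., 1.)

# Train to map the corrupted input x + w back to the clean input x,
# i.e., minimize the loss between x and g(f(x + w))
autoencoder.fit(x_train_noisy, x_train,
                epochs=50, batch_size=256,
                shuffle=True,
                validation_data=(x_test_noisy, x_test))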
Denoising autoencoders
• Denoising autoencoders can’t simply memorize the
input–output relationship.
• Intuitively, a denoising autoencoder learns a
projection from a neighborhood of our training
data back onto the training data.

https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf
Sparse autoencoders
• Construct a loss function to penalize activations
within a layer.
• Usually we regularize the weights of a network, not the
activations.
• Which individual nodes of a trained model activate
is data-dependent.
• Different inputs will result in activations of different
nodes through the network.
• The network selectively activates regions
depending on the input data.

https://www.jeremyjordan.me/autoencoders/
Sparse autoencoders
• Construct a loss function to penalize activations in the
network.
• L1 regularization: penalize the absolute values of the
vector of activations a in layer h for observation i.
• KL divergence: use cross-entropy between the average
activation and the desired target activation.

https://www.jeremyjordan.me/autoencoders/
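A minimal sketch of the L1 variant in Keras, mirroring the earlier bottleneck model; the penalty weight 1e-5 is an illustrative value (the KL-divergence variant instead penalizes the average activation of each latent node, and is not shown here):

from keras import regularizers
from keras.layers import Input, Dense
from keras.models import Model

input_img = Input(shape=(784,))
# The L1 activity regularizer adds lambda * sum(|a_i|) over the latent
# activations to the loss, so only a few latent nodes fire per input
encoded = Dense(32, activation='relu',
                activity_regularizer=regularizers.l1(1e-5))(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
sparse_autoencoder = Model(input_img, decoded)
sparse_autoencoder.compile(optimizer='adam', loss='binary_crossentropy')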
Contractive autoencoders
• Arrange for similar inputs to have similar activations.
• I.e., the derivatives of the hidden layer activations are
small with respect to the input.
• Denoising autoencoders make the reconstruction function
(encoder + decoder) resist small perturbations of the input.
• Contractive autoencoders make the feature extraction
function (i.e., the encoder) resist infinitesimal perturbations of
the input.

https://www.jeremyjordan.me/autoencoders/
Contractive autoencoders
• Contractive autoencoders make the feature
extraction function (i.e., the encoder) resist infinitesimal
perturbations of the input.

https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf
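A minimal sketch of the contractive penalty, assuming TensorFlow 2.x and a standalone Keras encoder model (e.g., encoder = Model(input_img, encoded)); the weight lam is an illustrative value, and in a full training step the penalty would simply be added to the reconstruction loss:

import tensorflow as tf

def contractive_penalty(encoder, x, lam=1e-4):
    # Squared Frobenius norm of the encoder Jacobian dh/dx, averaged over
    # the batch: small derivatives of the latent code with respect to the
    # input mean the encoder resists infinitesimal perturbations.
    with tf.GradientTape() as tape:
        tape.watch(x)
        h = encoder(x)                    # latent code, shape (batch, latent_dim)
    J = tape.batch_jacobian(h, x)         # shape (batch, latent_dim, input_dim)
    return lam * tf.reduce_mean(tf.reduce_sum(tf.square(J), axis=[1, 2]))

# Usage inside a custom training step (schematically):
#   loss = reconstruction_loss + contractive_penalty(encoder, x_batch)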
Autoencoders
• Both the denoising and contractive autoencoder can
perform well.
• Advantage of the denoising autoencoder: simpler to implement –
it requires adding only one or two lines of code to a regular
autoencoder, and there is no need to compute the Jacobian of the
hidden layer.
• Advantage of the contractive autoencoder: the gradient is
deterministic, so second-order optimizers (conjugate gradient,
L-BFGS, etc.) can be used; it might be more stable than the
denoising autoencoder, which uses a sampled gradient.
• To learn more on contractive autoencoders:
• Contractive Auto-Encoders: Explicit Invariance During Feature
Extraction. Salah Rifai, Pascal Vincent, Xavier Muller, Xavier
Glorot and Yoshua Bengio, 2011.

https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf
