Lec16 - Autoencoders
https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders
• Compare with PCA/SVD
• PCA takes a collection of vectors (images) and produces a
usually smaller set of vectors that can be used to
approximate the input vectors via linear combination.
• Very efficient for certain applications.
• Fourier and wavelet compression are similar.
https://www.edureka.co/blog/autoencoders-tutorial/
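For concreteness, a minimal sketch of this idea using scikit-learn's PCA (the data shape and component count are illustrative, not from the slides):

```python
# Compress vectors to a few principal components, then
# reconstruct them as linear combinations of those components.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 784)          # e.g. 100 flattened 28x28 images

pca = PCA(n_components=32)            # keep 32 basis vectors
codes = pca.fit_transform(X)          # compress: shape (100, 32)
X_hat = pca.inverse_transform(codes)  # reconstruct from the basis

print(np.mean((X - X_hat) ** 2))      # reconstruction error
```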
Autoencoders: structure
• Encoder: compress the input into a latent space of
(usually) smaller dimension: h = f(x)
• Decoder: reconstruct the input from the latent space:
r = g(f(x)), with r as close to x as possible
https://towardsdatascience.com/deep-inside-autoencoders-7e41f319999f
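A minimal Keras sketch of this structure (the layer sizes, activations, and MSE loss are illustrative assumptions; the Keras blog linked later in the deck builds similar models):

```python
# Minimal autoencoder: encoder h = f(x), decoder r = g(f(x)).
from tensorflow import keras
from tensorflow.keras import layers

input_dim, latent_dim = 784, 32       # e.g. flattened 28x28 images

x = keras.Input(shape=(input_dim,))
h = layers.Dense(latent_dim, activation="relu")(x)    # encoder: h = f(x)
r = layers.Dense(input_dim, activation="sigmoid")(h)  # decoder: r = g(f(x))

autoencoder = keras.Model(x, r)
autoencoder.compile(optimizer="adam", loss="mse")     # drive r toward x
# autoencoder.fit(X_train, X_train, ...)              # targets = inputs
```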
Autoencoders: Applications
• Denoising: input a clean image plus noise and train to
reproduce the clean image.
https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders: Applications
• Image colorization: input black-and-white images and
train to produce color images.
https://www.edureka.co/blog/autoencoders-tutorial/
Autoencoders: Applications
• Watermark removal
https://www.edureka.co/blog/autoencoders-tutorial/
Properties of Autoencoders
• Data-specific: Autoencoders are only able to
compress data similar to what they have been
trained on.
• Lossy: The decompressed outputs will be degraded
compared to the original inputs.
• Learned automatically from examples: It is easy to
train specialized instances of the algorithm that will
perform well on a specific type of input.
https://www.edureka.co/blog/autoencoders-tutorial/
Capacity
• As with other NNs, overfitting is a problem when
capacity is too large for the data.
https://blog.keras.io/building-autoencoders-in-keras.html
Denoising autoencoders
• A basic autoencoder trains to minimize the loss
between x and the reconstruction g(f(x)).
• Denoising autoencoders train to minimize the loss
between x and g(f(x+w)), where w is random noise.
• Same possible architectures, different training data.
https://blog.keras.io/building-autoencoders-in-keras.html
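A minimal sketch of the change in training data, reusing the autoencoder model sketched earlier (`X_train` and the noise level are assumptions):

```python
# Denoising setup: corrupt the inputs, keep the clean targets.
import numpy as np

noise_factor = 0.5                                       # illustrative
w = noise_factor * np.random.normal(size=X_train.shape)  # random noise w
X_noisy = np.clip(X_train + w, 0.0, 1.0)                 # x + w, kept in [0, 1]

# Same architecture as before: inputs are x + w, targets are x.
autoencoder.fit(X_noisy, X_train, epochs=10, batch_size=256)
```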
Denoising autoencoders
• Denoising autoencoders can’t simply memorize the
input-output relationship.
• Intuitively, a denoising autoencoder learns a
projection from a neighborhood of our training
data back onto the training data.
https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf
Sparse autoencoders
• Construct a loss function to penalize activations
within a layer.
• Note: we usually regularize the weights of a network, not
the activations; sparse autoencoders penalize the
activations instead.
• Which individual nodes of a trained model activate is
data-dependent: different inputs result in activations of
different nodes through the network.
• The network thus selectively activates regions depending
on the input data.
https://www.jeremyjordan.me/autoencoders/
Sparse autoencoders
• Construct a loss function to penalize activations in the
network.
• L1 regularization: penalize the absolute value of the
vector of activations a in layer h for observation i
https://www.jeremyjordan.me/autoencoders/
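In Keras this can be expressed with an activity regularizer on the hidden layer, which adds λ·Σ|a_i| to the loss; a minimal sketch (the λ value and layer sizes are illustrative):

```python
# Sparse autoencoder: L1 penalty on the latent activations.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

x = keras.Input(shape=(784,))
h = layers.Dense(32, activation="relu",
                 activity_regularizer=regularizers.l1(1e-5))(x)  # adds lambda * sum(|a_i|)
r = layers.Dense(784, activation="sigmoid")(h)

sparse_autoencoder = keras.Model(x, r)
sparse_autoencoder.compile(optimizer="adam", loss="mse")
```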
Contractive autoencoders
• Arrange for similar inputs to have similar activations.
• That is, the derivatives of the hidden-layer activations
are small with respect to the input.
• Denoising autoencoders make the reconstruction function
(encoder + decoder) resist small perturbations of the input.
• Contractive autoencoders make the feature extraction
function (i.e., the encoder) resist infinitesimal perturbations
of the input.
https://www.jeremyjordan.me/autoencoders/
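Written out, the contractive loss from the Rifai et al. paper cited at the end of this deck adds the squared Frobenius norm of the encoder's Jacobian to the reconstruction term (λ weights the penalty):

```latex
% Reconstruction error plus a contraction penalty on the encoder f.
\mathcal{L}_{\text{CAE}}(x)
  = \lVert x - g(f(x)) \rVert^2
  + \lambda \left\lVert \frac{\partial f(x)}{\partial x} \right\rVert_F^2
```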
Autoencoders
• Both the denoising and the contractive autoencoder can
perform well.
• Advantages of the denoising autoencoder:
  • Simpler to implement: requires adding only one or two
    lines of code to a regular autoencoder.
  • No need to compute the Jacobian of the hidden layer.
• Advantages of the contractive autoencoder:
  • The gradient is deterministic: can use second-order
    optimizers (conjugate gradient, L-BFGS, etc.).
  • Might be more stable than the denoising autoencoder,
    which uses a sampled gradient.
• To learn more on contractive autoencoders:
• Contractive Auto-Encoders: Explicit Invariance During Feature
Extraction. Salah Rifai, Pascal Vincent, Xavier Muller, Xavier
Glorot and Yoshua Bengio, 2011.
https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf