Autoencoders
Autoencoders are a type of neural network used for unsupervised learning, primarily for
dimensionality reduction, feature extraction, and anomaly detection. They are trained to
reconstruct input data by encoding it into a lower-dimensional representation (latent space) and
then decoding it back to the original dimension.
1. Architecture of Autoencoders
An autoencoder consists of two main components:
1. Encoder: Compresses the input into a lower-dimensional latent space representation:
h = f(Wx + b)
where x is the input, W and b are the encoder weights and bias, f is the activation function, and h is the encoded representation.
2. Decoder: Reconstructs the input from the compressed representation:
x̂ = g(W′h + b′)
where x̂ is the reconstructed input and W′, b′, g are the decoder's weights, bias, and activation function.
The goal is to minimize the reconstruction loss between x and x̂.
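For real-valued inputs, the reconstruction loss is typically the mean squared error (the same criterion used in the PyTorch implementation at the end of this section):
L(x, x̂) = ‖x − x̂‖² = Σᵢ (xᵢ − x̂ᵢ)²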
2. Types of Autoencoders
a) Vanilla Autoencoder
● Uses a simple feedforward neural network for encoding and decoding.
● Suitable for basic feature extraction.
b) Variational Autoencoder (VAE)
● Probabilistic version that learns a distribution of latent representations rather than a single
point.
● Useful in generating new samples (e.g., image synthesis).
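A minimal PyTorch sketch of the VAE's key ingredient, the reparameterization trick (layer sizes and names here are illustrative assumptions, not from the text): the encoder outputs a mean and log-variance, and the latent sample is drawn as z = mu + sigma * eps so that sampling stays differentiable.

import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    def __init__(self, in_dim=4, latent_dim=2):  # sizes are illustrative
        super().__init__()
        self.fc = nn.Linear(in_dim, 8)
        self.fc_mu = nn.Linear(8, latent_dim)       # mean of q(z|x)
        self.fc_logvar = nn.Linear(8, latent_dim)   # log-variance of q(z|x)

    def forward(self, x):
        h = torch.relu(self.fc(x))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I),
        # so gradients flow through the sampling step
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return z, mu, logvar

# KL divergence of q(z|x) = N(mu, sigma^2) from the N(0, I) prior,
# added to the reconstruction loss when training a VAE
def kl_term(mu, logvar):
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())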
c) Denoising Autoencoder
● Trained to remove noise from corrupted input data.
● Used in image denoising and signal processing.
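Training a denoising autoencoder differs from the vanilla case by one step: the input is corrupted, but the loss is computed against the clean input. A minimal, self-contained sketch (the stand-in model, batch, and noise level are illustrative assumptions):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 2), nn.ReLU(), nn.Linear(2, 4))  # stand-in autoencoder
criterion = nn.MSELoss()
x = torch.randn(32, 4)                    # stand-in batch of clean inputs

noisy_x = x + 0.1 * torch.randn_like(x)   # corrupt the input (0.1 is an illustrative noise level)
loss = criterion(model(noisy_x), x)       # the target is the CLEAN input, not the noisy one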
d) Sparse Autoencoder
● Adds a sparsity constraint so that only a small number of neurons are active at a time (see the sketch below).
● Useful for learning disentangled features.
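One common way to impose the sparsity constraint is an L1 penalty on the encoder activations, added to the reconstruction loss. A minimal sketch (the stand-in networks and penalty weight are illustrative assumptions):

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(4, 8), nn.ReLU())   # stand-in encoder
decoder = nn.Linear(8, 4)                             # stand-in decoder
x = torch.randn(32, 4)

h = encoder(x)
recon = decoder(h)
sparsity_penalty = 1e-3 * h.abs().mean()              # L1 term pushes activations toward zero
loss = nn.functional.mse_loss(recon, x) + sparsity_penalty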
e) Convolutional Autoencoder (CAE)
● Uses CNNs for image-related tasks.
● Used in image compression and enhancement.
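A convolutional autoencoder typically pairs strided Conv2d layers in the encoder with ConvTranspose2d layers in the decoder. A minimal sketch for 1-channel 28×28 images (all sizes are illustrative assumptions):

import torch
import torch.nn as nn

conv_ae = nn.Sequential(
    # Encoder: 1x28x28 -> 16x14x14 -> 32x7x7
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    # Decoder: mirror the encoder with transposed convolutions back to 1x28x28
    nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
)

x = torch.randn(8, 1, 28, 28)
assert conv_ae(x).shape == x.shape   # reconstruction has the input's shape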
3. Applications of Autoencoders
✅ Dimensionality Reduction (similar to PCA but non-linear)
✅ Anomaly Detection (e.g., fraud detection, medical diagnosis)
✅ Image Denoising (removing noise from images)
✅ Feature Learning (extracting meaningful representations for downstream tasks)
✅ Data Compression (reducing memory/storage requirements)
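The anomaly-detection use above relies on a simple principle: an autoencoder trained on normal data reconstructs normal samples well but anomalous ones poorly, so a threshold on the per-sample reconstruction error flags anomalies. A minimal sketch (the stand-in model, data, and threshold rule are illustrative assumptions):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 2), nn.Linear(2, 4))  # stand-in trained autoencoder
x = torch.randn(100, 4)                                  # stand-in data

with torch.no_grad():
    errors = ((model(x) - x) ** 2).mean(dim=1)           # per-sample reconstruction error
threshold = errors.mean() + 3 * errors.std()             # e.g., mean + 3 std as a cutoff
anomalies = errors > threshold                           # boolean mask of flagged samples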
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs) are deep learning models designed to generate new,
realistic data samples by training two neural networks in a competitive setting. Introduced by Ian
Goodfellow in 2014, GANs have been widely used in image synthesis, style transfer, data
augmentation, and deepfake generation.
1. Architecture of GANs
GANs consist of two primary components that compete against each other:
1. Generator (G)
o Takes a random noise vector z as input and tries to generate realistic data (e.g.,
images, text).
o Its goal is to fool the Discriminator into believing that its generated samples are real.
2. Discriminator (D)
o A binary classifier that differentiates between real samples (from actual data) and
fake samples (from the Generator).
o Trained to maximize accuracy in distinguishing real from fake.
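Formally, the two networks play the minimax game introduced by Goodfellow et al. (2014):
min_G max_D V(D, G) = E_{x ∼ p_data}[log D(x)] + E_{z ∼ p_z}[log(1 − D(G(z)))]
where the Discriminator maximizes V (classifying correctly) and the Generator minimizes it (fooling the Discriminator).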
Training Process (Adversarial Learning):
● The Generator improves to produce more realistic outputs.
● The Discriminator improves to become better at detecting fakes.
● This cycle continues until the Generator produces data indistinguishable from real data.
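A minimal PyTorch sketch of one adversarial training step following the alternating scheme above (network sizes, noise dimension, and learning rate are illustrative assumptions):

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))                # noise -> fake sample
D = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # sample -> P(real)
bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(32, 4)   # stand-in batch of real data
z = torch.randn(32, 8)      # random noise vector z
fake = G(z)

# Discriminator step: push D(real) toward 1 and D(fake) toward 0
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make D classify fakes as real
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()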
2. Types of GANs
a) Vanilla GAN
● The original GAN structure with a simple Generator and Discriminator.
● Often suffers from mode collapse (the Generator learns to produce only a few kinds of samples instead of diverse
outputs).
b) Deep Convolutional GAN (DCGAN)
● Uses Convolutional Neural Networks (CNNs) for both Generator and Discriminator.
● Produces higher-quality images.
c) Conditional GAN (cGAN)
● Introduces labels as input, allowing the Generator to create specific classes of data.
● Example: Generating images of dogs vs. cats by conditioning the network on labels.
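Conditioning is often implemented by concatenating a one-hot label vector (or a label embedding) to the Generator's noise input; a minimal sketch (all dimensions and class names are illustrative assumptions):

import torch
import torch.nn as nn

num_classes, noise_dim, data_dim = 2, 8, 4   # e.g., class 0 = cat, class 1 = dog
G = nn.Sequential(nn.Linear(noise_dim + num_classes, 16), nn.ReLU(), nn.Linear(16, data_dim))

labels = torch.tensor([0, 1, 1, 0])
one_hot = nn.functional.one_hot(labels, num_classes).float()  # label as a one-hot vector
z = torch.randn(4, noise_dim)
fake = G(torch.cat([z, one_hot], dim=1))     # Generator sees noise + desired class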
d) Wasserstein GAN (WGAN)
● Replaces the standard loss function with the Wasserstein distance, improving stability and
reducing mode collapse.
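In a WGAN the Discriminator becomes a "critic" that outputs an unbounded score rather than a probability, and its loss approximates the Wasserstein distance. A minimal sketch of the critic loss with the original weight-clipping constraint (the stand-in networks are illustrative; 0.01 is the clip value suggested in the WGAN paper):

import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))  # no sigmoid: raw score
real, fake = torch.randn(32, 4), torch.randn(32, 4)                    # stand-in batches

# Critic maximizes E[critic(real)] - E[critic(fake)], i.e., minimizes the negative
critic_loss = critic(fake).mean() - critic(real).mean()
critic_loss.backward()

# Enforce the Lipschitz constraint by clipping weights (original WGAN approach)
with torch.no_grad():
    for p in critic.parameters():
        p.clamp_(-0.01, 0.01)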
e) StyleGAN
● Developed by NVIDIA, allows control over specific features of generated images (e.g., hair
color, age).
● Used in deepfake and high-resolution image synthesis.
3. Applications of GANs
✅ Image Generation – Used in AI art, deepfakes, and synthetic data generation.
✅ Super-Resolution – Enhancing image resolution (e.g., ESRGAN for upscaling images).
✅ Data Augmentation – Generating synthetic datasets for machine learning.
✅ Text-to-Image Synthesis – Generating images from text descriptions (e.g., DALL·E).
✅ Anomaly Detection – Identifying fake vs. real data in security and medical imaging
Implementation of a Simple Autoencoder in PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# 1. Load and preprocess data
iris = load_iris()
X = iris.data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Split features and labels together so the test labels are available for the plot below
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, iris.target, test_size=0.2, random_state=42)
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
# 2. Define the Autoencoder model
class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        # Encoder: 4 input features -> 2-dimensional latent space
        self.encoder = nn.Sequential(
            nn.Linear(4, 3),
            nn.ReLU(),
            nn.Linear(3, 2)
        )
        # Decoder: 2-dimensional latent space -> 4 reconstructed features
        self.decoder = nn.Sequential(
            nn.Linear(2, 3),
            nn.ReLU(),
            nn.Linear(3, 4)
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
model = Autoencoder()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
# 3. Training
epochs = 100
losses = []
for epoch in range(epochs):
    output = model(X_train_tensor)
    loss = criterion(output, X_train_tensor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
    if epoch % 10 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item():.4f}")
# 4. Plot the training loss
plt.plot(losses)
plt.xlabel("Epoch")
plt.ylabel("Reconstruction Loss")
plt.title("Autoencoder Training Loss")
plt.grid()
plt.show()
# 5. Encode the test set and visualize the 2-D latent space,
# coloring each point by its true class label
with torch.no_grad():
    encoded_data = model.encoder(X_test_tensor).numpy()
plt.scatter(encoded_data[:, 0], encoded_data[:, 1], c=y_test, cmap='viridis')
plt.title("Encoded Feature Space")
plt.xlabel("Latent Dimension 1")
plt.ylabel("Latent Dimension 2")
plt.colorbar()
plt.grid(True)
plt.show()
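As a quick follow-up check, the held-out reconstruction error can be compared with the final training loss; a small addition reusing the model, criterion, and X_test_tensor defined above:

# Evaluate reconstruction quality on the held-out test set
with torch.no_grad():
    test_loss = criterion(model(X_test_tensor), X_test_tensor)
print(f"Test reconstruction loss: {test_loss.item():.4f}")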