
🎓 Lecture Topic: Similarity Learning

1. Introduction

Similarity learning is a machine learning paradigm where the goal is to learn a function that
measures how similar or dissimilar two data samples are.

Instead of predicting class labels directly (like in standard classification), similarity learning
focuses on learning relationships between samples — for example:

● Are these two faces from the same person?
● Are these two X-ray images from the same disease class?
● Are these two products visually similar?

Key Idea:
The model learns to map input data into a space where similar samples are close together, and dissimilar samples are far apart.

2. Why Similarity Learning?

Traditional classifiers work well when you have a fixed number of classes, but:

● What if new classes appear later?
● What if you have very few samples per class (few-shot learning)?

Similarity learning handles this by learning relationships, not explicit labels.

Example:
In facial recognition, it’s easier to learn “who looks like whom” than to train a classifier for every person in the world.

3. Core Concept

We aim to learn a similarity function $S(x_i, x_j)$ that gives:

$$
S(x_i, x_j) =
\begin{cases}
\text{high value (e.g., close to 1)} & \text{if } x_i \text{ and } x_j \text{ are similar} \\
\text{low value (e.g., close to 0)} & \text{if } x_i \text{ and } x_j \text{ are dissimilar}
\end{cases}
$$
This function is often implemented using neural networks (e.g., Siamese, Triplet, or
Contrastive networks).
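
As a minimal sketch of this idea (assuming PyTorch; the `EmbeddingNet` architecture and the `similarity` helper are illustrative, not a standard API), a single shared network can embed both inputs, with cosine similarity playing the role of $S(x_i, x_j)$:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Maps an input vector to a low-dimensional embedding."""
    def __init__(self, in_dim=784, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x):
        return self.net(x)

def similarity(model, xi, xj):
    """S(xi, xj): cosine similarity of the two embeddings, in [-1, 1]."""
    zi, zj = model(xi), model(xj)
    return F.cosine_similarity(zi, zj, dim=-1)

model = EmbeddingNet()
xi, xj = torch.randn(1, 784), torch.randn(1, 784)
print(similarity(model, xi, xj))  # meaningful only after training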

4. Key Approaches

A. Siamese Network

● Uses two identical subnetworks (with shared weights).
● Each network extracts a feature embedding for one input.
● A distance metric (e.g., Euclidean or cosine distance) compares the embeddings.

Loss Function (Contrastive Loss):

$$L = (1 - y) \cdot D^2 + y \cdot \max(0,\, m - D)^2$$

where

● $y = 0$ if similar, $y = 1$ if dissimilar
● $D$ = distance between embeddings
● $m$ = margin separating dissimilar pairs

Used in: Face verification, signature matching, fingerprint comparison.
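
This loss translates directly into code; a minimal PyTorch sketch (the `contrastive_loss` function and the toy batch below are illustrative):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, y, margin=1.0):
    """Contrastive loss over a batch of embedding pairs.

    y = 0 for similar pairs, y = 1 for dissimilar pairs (as above).
    """
    d = F.pairwise_distance(z1, z2)  # Euclidean distance D per pair
    return ((1 - y) * d.pow(2) + y * F.relu(margin - d).pow(2)).mean()

# Toy batch: first pair labeled similar (y=0), second dissimilar (y=1).
z1, z2 = torch.randn(2, 64), torch.randn(2, 64)
y = torch.tensor([0.0, 1.0])
print(contrastive_loss(z1, z2, y))
```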

B. Triplet Network

● Uses three inputs: anchor, positive, and negative.
  ○ Anchor (A): reference sample
  ○ Positive (P): similar to the anchor
  ○ Negative (N): dissimilar to the anchor

Triplet Loss:

$$L = \max(0,\, D(A, P) - D(A, N) + \alpha)$$

where $\alpha$ is a margin.

Goal:
Pull anchor and positive closer, push anchor and negative apart.

Used in: FaceNet (Google’s face recognition model).
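
A matching PyTorch sketch of the triplet loss (PyTorch also ships an equivalent built-in, `nn.TripletMarginLoss`):

```python
import torch
import torch.nn.functional as F

def triplet_loss(za, zp, zn, alpha=0.2):
    """Triplet loss: pull anchor toward positive, push it away from negative."""
    d_ap = F.pairwise_distance(za, zp)  # D(A, P)
    d_an = F.pairwise_distance(za, zn)  # D(A, N)
    return F.relu(d_ap - d_an + alpha).mean()  # max(0, ...) averaged over batch

# Toy batch of 4 triplets with 64-dimensional embeddings.
za, zp, zn = torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 64)
print(triplet_loss(za, zp, zn))
```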


C. Contrastive Learning (Self-Supervised)

● Learns similarities without explicit labels.
● Generates positive and negative pairs using data augmentations.
● Examples:
  ○ SimCLR
  ○ MoCo
  ○ BYOL

Intuition:
If two augmented versions of the same image are passed through a model, their embeddings should be similar.
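
A simplified sketch of the NT-Xent loss used by SimCLR (assuming PyTorch; the projection head and augmentation pipeline are omitted, and the `nt_xent_loss` function below is a condensed illustration rather than the reference implementation):

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Simplified NT-Xent loss over two batches of augmented views.

    z1[i] and z2[i] are embeddings of two augmentations of the same image;
    every other embedding in the batch acts as a negative.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)          # (2n, d)
    sim = z @ z.t() / temperature           # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))       # exclude self-similarity
    # The positive for row i is its other view: i + n for the first half,
    # i - n for the second half.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(8, 64), torch.randn(8, 64)
print(nt_xent_loss(z1, z2))
```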

5. Similarity Metrics

To measure how close two embeddings are:

| Metric | Formula | Interpretation |
| --- | --- | --- |
| Euclidean distance | $\lVert x_i - x_j \rVert_2$ | Geometric distance in feature space |
| Cosine similarity | $\dfrac{x_i \cdot x_j}{\lVert x_i \rVert \, \lVert x_j \rVert}$ | Measures angular similarity |
| Manhattan distance | $\lVert x_i - x_j \rVert_1$ | Sum of absolute differences |
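
For concreteness, here are the three metrics computed on a toy pair with NumPy; since the vectors point in the same direction, cosine similarity is 1.0 even though both distances are nonzero (cosine similarity is scale-invariant, the distances are not):

```python
import numpy as np

xi = np.array([1.0, 2.0, 3.0])
xj = np.array([2.0, 4.0, 6.0])  # same direction as xi, twice the length

euclidean = np.linalg.norm(xi - xj)        # ||xi - xj||_2  -> ~3.742
cosine = xi @ xj / (np.linalg.norm(xi) * np.linalg.norm(xj))  # -> 1.0
manhattan = np.abs(xi - xj).sum()          # ||xi - xj||_1  -> 6.0

print(euclidean, cosine, manhattan)
```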
