Machine Learning

Unit - IV

By
Mrs. P Jhansi Lakshmi
Assistant Professor
Department of CSE, VFSTR
Syllabus
UNIT – IV
CLUSTERING: Mixture densities; K-means Clustering; Supervised
learning after clustering; Spectral clustering; Hierarchical clustering.
NONPARAMETRIC METHODS: Nonparametric density estimation;
Histogram estimator; Kernel estimator; k-nearest neighbor estimator;
Generalization to multivariate data; Nonparametric classification;



CLUSTERING: Mixture densities
Mixture densities
Mixture density: a combination of density laws associated with several groups:

p(x) = \sum_{i=1}^{k} p(x|G_i)\, P(G_i)

where G_i are the mixture components, also called groups or clusters, p(x|G_i) are the component densities, and P(G_i) are the mixture proportions.
• The number of components, k, is a hyperparameter and should be specified
beforehand. Given a sample and k, learning corresponds to estimating the component
densities and proportions.
• When we assume that the component densities obey a parametric model, we need
only estimate their parameters.
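
To make the definition concrete, here is a minimal NumPy sketch (not part of the original slides) that evaluates a two-component Gaussian mixture density p(x) = \sum_i p(x|G_i) P(G_i); the component means, standard deviations, and proportions are illustrative assumptions.

import numpy as np

def gaussian(x, mu, sigma):
    # univariate Gaussian component density p(x|G_i)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# k = 2 components with mixture proportions P(G_i)
P = np.array([0.3, 0.7])
mus, sigmas = np.array([-2.0, 1.0]), np.array([0.5, 1.5])

def mixture_density(x):
    # p(x) = sum_i p(x|G_i) P(G_i)
    return sum(P[i] * gaussian(x, mus[i], sigmas[i]) for i in range(len(P)))

print(mixture_density(0.0))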



Mixture densities
• When the mixture is made up of multivariate Gaussian components, we have a Gaussian mixture:
• Component densities: p(x|G_i) \sim N(\mu_i, \Sigma_i), and
• Parametrization: \Phi = \{P(G_i), \mu_i, \Sigma_i\}_{i=1}^{k}

are the parameters that should be estimated from the iid (independent and identically distributed) unlabeled sample X = \{x^t\}_{t=1}^{N}.



Mixture densities
Mixture density: p(x) = \sum_{i=1}^{k} p(x|G_i)\, P(G_i)
• P(G_i): the proportion of group G_i in the mixture
• p(x|G_i): the component density, i.e., the probability that x belongs to (is generated by) group G_i



Mixture densities
• Parametric classification is a bona fide mixture model where the groups, G_i, correspond to classes, the component densities p(x|G_i) correspond to class densities p(x|C_i), and the P(G_i) correspond to class priors, P(C_i):

p(x) = \sum_{i=1}^{K} p(x|C_i)\, P(C_i)



Mixture densities
• In the supervised case, we know how many groups there are and learning the parameters
is trivial because we are given the labels, namely, which instance belongs to which class
(component).
• We know the sample X = \{x^t, r^t\}_t, where r_i^t = 1 if x^t \in C_i and 0 otherwise, so the parameters can be calculated using maximum likelihood.
• When each class is Gaussian distributed, we have a Gaussian mixture, and the parameters are estimated as:

\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}, \qquad
m_i = \frac{\sum_t r_i^t x^t}{\sum_t r_i^t}, \qquad
S_i = \frac{\sum_t r_i^t (x^t - m_i)(x^t - m_i)^T}{\sum_t r_i^t}
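
As an illustration of these maximum-likelihood estimates, a minimal NumPy sketch (not from the slides) follows; the function name fit_gaussian_classes and the one-hot label layout are assumptions made for this example.

import numpy as np

def fit_gaussian_classes(X, r):
    """ML estimates for labeled Gaussian classes.
    X : (N, d) data matrix; r : (N, K) one-hot labels (r[t, i] = 1 if x^t in C_i).
    Returns priors P^(C_i), means m_i, and covariances S_i."""
    N, d = X.shape
    K = r.shape[1]
    priors = r.sum(axis=0) / N                       # P^(C_i) = sum_t r_i^t / N
    means = (r.T @ X) / r.sum(axis=0)[:, None]       # m_i
    covs = np.empty((K, d, d))
    for i in range(K):
        diff = X - means[i]                          # (x^t - m_i)
        w = r[:, i][:, None]
        covs[i] = (w * diff).T @ diff / r[:, i].sum()   # S_i
    return priors, means, covs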



Classes vs. Clusters
Supervised: X = \{x^t, r^t\}_t
• Classes C_i, i = 1, ..., K
• p(x) = \sum_{i=1}^{K} p(x|C_i)\, P(C_i), where p(x|C_i) \sim N(\mu_i, \Sigma_i)
• \Phi = \{P(C_i), \mu_i, \Sigma_i\}_{i=1}^{K}, estimated by

\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}, \qquad
m_i = \frac{\sum_t r_i^t x^t}{\sum_t r_i^t}, \qquad
S_i = \frac{\sum_t r_i^t (x^t - m_i)(x^t - m_i)^T}{\sum_t r_i^t}

Unsupervised: X = \{x^t\}_t
• Clusters G_i, i = 1, ..., k
• p(x) = \sum_{i=1}^{k} p(x|G_i)\, P(G_i), where p(x|G_i) \sim N(\mu_i, \Sigma_i)
• \Phi = \{P(G_i), \mu_i, \Sigma_i\}_{i=1}^{k}, but the labels r_i^t are unknown.
Mixture densities
• In the unsupervised case, we are given only the sample X = \{x^t\}_t and not the labels r^t; that is, we do not know which x^t comes from which component.
• So we should estimate both:
• First, we should estimate the labels, i.e., the component that each instance belongs to; and
• second, once we estimate the labels, we should estimate the parameters of the
components given the set of instances belonging to them.



Supervised learning after clustering
Supervised learning after clustering
• Dimensionality reduction methods find correlations between features and
group features
• Clustering methods find similarities between instances and group instances
• Allows knowledge extraction through
number of clusters,
prior probabilities,
cluster parameters, i.e., center, range of features.

Example: CRM, customer segmentation



Supervised learning after clustering
• Clustering is also used as a preprocessing stage.
• After clustering, the estimated group labels h_j (soft) or b_j (hard) may be seen as the dimensions of a new k-dimensional space, where we can then learn our discriminant or regressor.
• Local representation (only one b_j is 1, all others are 0; only a few h_j are nonzero) vs. distributed representation (after PCA, all z_j are nonzero). A sketch of this preprocessing idea follows below.
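
A minimal scikit-learn sketch of clustering as preprocessing (illustrative, not from the slides): fit a Gaussian mixture, use the soft posteriors h_j as a new k-dimensional representation, and train a discriminant on it. The toy data, the choice of k = 3 components, and the logistic-regression discriminant are assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # toy inputs
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy labels

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
H = gmm.predict_proba(X)                 # soft labels h_j: the new k-dimensional space

clf = LogisticRegression().fit(H, y)     # learn the discriminant in the new space
print(clf.score(H, y))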



Mixture of Mixtures
• In classification, the input comes from a mixture of classes (supervised).
• If each class is also a mixture, e.g., of Gaussians (unsupervised), we have a mixture of mixtures:

p(x|C_i) = \sum_{j=1}^{k_i} p(x|G_{ij})\, P(G_{ij})

where k_i is the number of components making up p(x|C_i) and G_{ij} is component j of class i.
• Different classes may need different numbers of components. Learning the parameters of the components is done separately for each class.



Spectral Clustering
Why Spectral Clustering?
Spectral clustering has some unique advantages:
• It makes no assumptions about the shapes of the clusters and can handle intertwined spirals, etc.
• Methods like EM or k-means require an iterative process to find a local minimum and are very sensitive to initialization, so we need multiple restarts to get high-quality clusters.
Process of spectral clustering (a minimal sketch follows this list):
• Construct a similarity graph (e.g., a k-nearest-neighbor graph) over all the data points.
• Embed the data points in a low-dimensional space (spectral embedding), in which the clusters are more obvious, using the eigenvectors of the graph Laplacian.
• Apply a classical clustering algorithm (e.g., k-means) to partition the embedding.
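
A minimal NumPy/scikit-learn sketch of this pipeline (an illustration under assumed parameters n_neighbors, sigma, and n_clusters, not the exact algorithm of the slides): k-NN similarity graph, unnormalized graph Laplacian, eigenvector embedding, then k-means.

import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import KMeans

def spectral_clustering(X, n_clusters=2, n_neighbors=10, sigma=1.0):
    # 1. Similarity graph: Gaussian kernel on k-NN distances, symmetrized
    D = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    B = np.where(D > 0, np.exp(-D**2 / (2 * sigma**2)), 0.0)
    B = np.maximum(B, B.T)                       # make the graph undirected

    # 2. Graph Laplacian L = D - B and its smallest eigenvectors
    deg = np.diag(B.sum(axis=1))
    L = deg - B
    eigvals, eigvecs = np.linalg.eigh(L)
    Z = eigvecs[:, :n_clusters]                  # spectral embedding

    # 3. k-means on the new coordinates
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(Z)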



Spectral Clustering
• Instead of clustering in the original space, a possibility is to first map the
data to a new space with reduced dimensionality such that similarities are
made more apparent and then cluster in there.
• Any feature selection or extraction method can be used for this purpose,
and one such method is the Laplacian eigenmaps.
• After such a mapping, points that are similar are placed nearby, and this is
expected to enhance the performance of clustering.
This is the idea behind spectral clustering.



Spectral Clustering
There are two steps:
1. In the original space, we define a local neighborhood (by either fixing the
number of neighbors or a distance threshold), and then for instances that
are in the same neighborhood, we define a similarity measure
• for example, using the Gaussian kernel—that is inversely proportional to the
distance between them.
• Remember that instances not in the same local neighborhood are assigned a
similarity of 0 and hence can be placed anywhere with respect to each other. Given
this Laplacian, instances are positioned in the new space using feature embedding.

2. Then run k-means clustering with the new data coordinates in this new
space.
Spectral Clustering
• Let B be the matrix of pairwise similarities and D the diagonal degree matrix with d_i = \sum_j B_{ij} on the diagonal.
• The graph Laplacian is defined as L = D − B.
• This is the unnormalized Laplacian. There are two ways to normalize it:
• one is closely related to a random walk, L_{rw} = D^{-1} L = I − D^{-1} B, and the other constructs a symmetric matrix, L_{sym} = D^{-1/2} L D^{-1/2} = I − D^{-1/2} B D^{-1/2}.
• They may lead to better performance in clustering.

Spectral Clustering
• It is always a good idea to do dimensionality reduction before clustering
using Euclidean distance.
• Using Laplacian eigenmaps makes more sense than multidimensional
scaling proper or principal components analysis because those two check
for the preservation of pairwise similarities between all pairs of instances
• whereas here with Laplacian eigenmaps, we care about preserving the
similarity between neighboring instances only.
• This has the effect that instances that are nearby in the original space,
probably within the same cluster, will be placed very close in the new
space.



Hierarchical Clustering
Hierarchical Clustering
• There are methods for clustering that use only similarities of instances,
without any other requirement on the data;
• The aim is to find groups such that instances in a group are more similar to
each other than instances in different groups.
• This is the approach taken by hierarchical clustering.
• Hierarchical Clustering or Hierarchical Cluster Analysis or HCA is a
method of clustering which seeks to build a hierarchy of clusters in a given
dataset



Hierarchical Clustering
• Cluster based on similarities/distances.
• Distance measure between instances x^r and x^s:

Minkowski (L_p) distance (Euclidean for p = 2):
d_m(x^r, x^s) = \left[ \sum_{j=1}^{d} |x_j^r - x_j^s|^p \right]^{1/p}

City-block distance:
d_{cb}(x^r, x^s) = \sum_{j=1}^{d} |x_j^r - x_j^s|



Agglomerative Clustering
• Start with N groups, each containing one instance, and merge the two closest groups at each iteration.
• Distance between two groups G_i and G_j (see the sketch after this list):
• Single-link: d(G_i, G_j) = \min_{x^r \in G_i,\, x^s \in G_j} d(x^r, x^s)
• Complete-link: d(G_i, G_j) = \max_{x^r \in G_i,\, x^s \in G_j} d(x^r, x^s)
• Average-link: the average of d(x^r, x^s) over all pairs; centroid distance: d(m_i, m_j)
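
A minimal SciPy sketch of agglomerative clustering with these linkage criteria (illustrative, not from the slides; the toy data and the cut into 3 clusters are assumptions).

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

rng = np.random.default_rng(0)
# three well-separated toy groups in 2-D
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 2)) for c in (0, 3, 6)])

Z = linkage(X, method="single")        # also: "complete", "average", "centroid"
labels = fcluster(Z, t=3, criterion="maxclust")   # cut the merge tree into 3 clusters
# dendrogram(Z) would draw the merge tree shown on the next slide.
print(labels)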



Example: Single-Link Clustering

Dendrogram



K-means Clustering
K-means Clustering
• K-means is a simplified approach to clustering.
• It is similar to a Gaussian mixture, the difference being that k-means makes 'hard' choices: each instance is assigned entirely to the cluster with the nearest mean.
• That is, we start with some initial values for the cluster means and update them iteratively throughout training.
• This is in contrast with the softer, probabilistic assignments made in Gaussian mixture models.



k-Means Clustering
• Find k reference vectors (prototypes/codebook vectors/codewords)
which best represent data

• Reference vectors, mj, j =1,...,k



Encoding/Decoding
• Given x, the encoder sends the index i of the closest code word, and the decoder generates the code word m_i corresponding to the received index.
• The error incurred is \|x - m_i\|^2.



k-Means Clustering
• Use the nearest (most similar) reference vector:

\|x^t - m_i\| = \min_j \|x^t - m_j\|

• Reconstruction error:

E(\{m_i\}_{i=1}^{k} \mid X) = \sum_t \sum_i b_i^t \|x^t - m_i\|^2

b_i^t = \begin{cases} 1 & \text{if } \|x^t - m_i\| = \min_j \|x^t - m_j\| \\ 0 & \text{otherwise} \end{cases}
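
A minimal NumPy sketch of k-means built from exactly these two alternating steps (illustrative; the random initialization and convergence test are assumptions).

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    m = X[rng.choice(len(X), size=k, replace=False)]       # initial means
    for _ in range(n_iter):
        # hard assignment b_i^t: index of the nearest reference vector
        d = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
        b = d.argmin(axis=1)
        # update: each mean becomes the average of its assigned instances
        new_m = np.array([X[b == i].mean(axis=0) if np.any(b == i) else m[i]
                          for i in range(k)])
        if np.allclose(new_m, m):       # stop when the means no longer move
            break
        m = new_m
    return m, b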



k-means Clustering



Nonparametric Methods
Introduction
• We discussed the parametric and semiparametric approaches where we assumed that
the data is drawn from one or a mixture of probability distributions of known form.
• Now, we discuss the nonparametric approach that is used when no such assumption can
be made about the input density and the data speaks for itself.
• We consider nonparametric approaches for density estimation, classification, outlier detection, and regression, and see how their time and space complexity can be kept in check.



• In nonparametric estimation, we assume that similar inputs have similar
outputs.
• Nonparametric methods do not assume any a priori parametric form for the
underlying densities.
• A nonparametric model is not fixed but its complexity depends on the size
of the training set.
• Nonparametric methods are also called Instance-based or Memory-based
learning algorithms.



Nonparametric Density Estimation
• In density estimation, we assume that the sample X = \{x^t\}_{t=1}^{N} is drawn independently from some unknown probability density p(·).
• \hat{p}(·) is our estimator of p(·).
• We start with the univariate case, where the x^t are scalars, and later generalize to the multivariate case.



Nonparametric Density Estimation
• The nonparametric estimator for the cumulative distribution function, F(x), at point x is the proportion of sample points that are less than or equal to x:

\hat{F}(x) = \frac{\#\{x^t \le x\}}{N}

where \#\{x^t \le x\} denotes the number of training instances whose x^t is less than or equal to x.



Nonparametric Density Estimation
• Similarly, the nonparametric estimate for the density function, which is the derivative of the cumulative distribution, can be calculated as

\hat{p}(x) = \frac{1}{h}\left[\frac{\#\{x^t \le x + h\} - \#\{x^t \le x\}}{N}\right]

• h is the length of the interval, and instances that fall in this interval are assumed to be "close enough."



Histogram Estimator
• The oldest and most popular method is the histogram, where the input space is divided into equal-sized intervals called bins of width h.
• Given an origin x_0 and a bin width h, the bins are the intervals [x_0 + mh, x_0 + (m+1)h) for positive and negative integers m, and the estimate is given as

\hat{p}(x) = \frac{\#\{x^t \text{ in the same bin as } x\}}{Nh}
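
A minimal NumPy sketch of the histogram estimator (illustrative; the origin x0 = 0 and bin width h = 0.5 are assumptions).

import numpy as np

def histogram_estimator(x, data, x0=0.0, h=0.5):
    """Estimate p(x) as (# of x^t in the same bin as x) / (N * h)."""
    N = len(data)
    m = np.floor((x - x0) / h)                      # index of the bin containing x
    in_bin = (np.floor((data - x0) / h) == m)       # which x^t fall in that bin
    return in_bin.sum() / (N * h)

data = np.random.default_rng(0).normal(size=1000)
print(histogram_estimator(0.2, data))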



Histogram Estimator
• In constructing the histogram, we have to choose both an origin and a bin
width.
• The choice of origin affects the estimate near boundaries of bins, but it is
mainly the bin width that has an effect on the estimate:
• With small bins, the estimate is spiky, and with larger bins, the estimate is smoother
(see figure 8.1).
• The estimate is 0 if no instance falls in a bin and there are discontinuities at bin
boundaries.

• One advantage of the histogram is that once the bin estimates are calculated
and stored, we do not need to retain the training set.



Histograms for various bin lengths. ‘×’denote data points.



Naive estimator
• The naive estimator frees us from setting an origin. It is defined as

\hat{p}(x) = \frac{\#\{x - h/2 < x^t \le x + h/2\}}{Nh}

and is equal to the histogram estimate where x is always at the center of a bin of size h (see figure).
The estimator can also be written as

\hat{p}(x) = \frac{1}{Nh} \sum_{t=1}^{N} w\!\left(\frac{x - x^t}{h}\right)

with the weight function defined as

w(u) = \begin{cases} 1 & \text{if } |u| < 1/2 \\ 0 & \text{otherwise} \end{cases}



Naive estimator



Kernel Estimator
• To get a smooth estimate, we use a smooth weight function called a kernel function. The most popular is the Gaussian kernel:

K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left[-\frac{u^2}{2}\right]

• The kernel estimator, also called Parzen windows, is defined as

\hat{p}(x) = \frac{1}{Nh} \sum_{t=1}^{N} K\!\left(\frac{x - x^t}{h}\right)
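
A minimal NumPy sketch of the Gaussian kernel estimator (illustrative; the window width h = 0.3 is an assumption).

import numpy as np

def kernel_estimator(x, data, h=0.3):
    u = (x - data) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)    # Gaussian kernel K(u)
    return K.sum() / (len(data) * h)                # (1 / Nh) * sum_t K((x - x^t)/h)

data = np.random.default_rng(0).normal(size=1000)
print(kernel_estimator(0.0, data))   # close to the N(0,1) density at 0 (about 0.40)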



Kernel Estimator



Kernel Estimator
• The kernel function K(·) determines the shape of the influences and the
window width h determines the width.
• Just like the naive estimate is the sum of “boxes,” the kernel estimate is the
sum of “bumps.”
• All the x^t have an effect on the estimate at x, and this effect decreases smoothly as |x - x^t| increases.
• To simplify calculation, K(·) can be taken to be 0 if |x - x^t| > 3h.
• There exist other kernels easier to compute that can be used, as long as K(u)
is maximum for u = 0 and decreasing symmetrically as |u| increases.



Kernel Estimator
• When h is small, each training instance has a large effect in a small region
and no effect on distant points.
• When h is larger, there is more overlap of the kernels and we get a smoother
estimate.



k-Nearest Neighbor Estimator
• The nearest neighbor class of estimators adapts the amount of smoothing to
the local density of data.
• The degree of smoothing is controlled by k, the number of neighbors taken
into account, which is much smaller than N, the sample size.
• Let us define a distance between a and b, for example, |a − b|, and for each x, let d_1(x) \le d_2(x) \le \cdots \le d_N(x) be the distances, arranged in ascending order, from x to the points in the sample: d_1(x) is the distance to the nearest sample, d_2(x) is the distance to the next nearest, and so on.



k-Nearest Neighbor Estimator
• If x^t are the data points, then we define d_1(x) = \min_t |x - x^t|, and if i is the index of the closest sample, namely i = \arg\min_t |x - x^t|, then d_2(x) = \min_{j \ne i} |x - x^j|, and so forth.
• The k-nearest neighbor (k-nn) density estimate is

\hat{p}(x) = \frac{k}{2 N d_k(x)}

where d_k(x) is the distance to the kth closest instance to x.
• This is like a naive estimator with h = 2 d_k(x).
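
A minimal NumPy sketch of the k-nn density estimate (illustrative; k = 10 is an assumption).

import numpy as np

def knn_density(x, data, k=10):
    d = np.sort(np.abs(data - x))            # distances in ascending order: d_1(x) <= ... <= d_N(x)
    return k / (2 * len(data) * d[k - 1])    # p_hat(x) = k / (2 N d_k(x))

data = np.random.default_rng(0).normal(size=1000)
print(knn_density(0.0, data))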



k-Nearest Neighbor Estimator



k-Nearest Neighbor Estimator
• To get a smoother estimate, we can use a kernel function whose effect decreases with increasing distance:

\hat{p}(x) = \frac{1}{N d_k(x)} \sum_{t=1}^{N} K\!\left(\frac{x - x^t}{d_k(x)}\right)

• This is like a kernel estimator with adaptive smoothing parameter h = d_k(x).
• K(·) is typically taken to be the Gaussian kernel.



Generalization to Multivariate Data
• Given a sample of d-dimensional observations X = \{x^t\}_{t=1}^{N}, the multivariate kernel density estimator is

\hat{p}(x) = \frac{1}{N h^d} \sum_{t=1}^{N} K\!\left(\frac{x - x^t}{h}\right)

with the requirement that \int K(x)\, dx = 1.



Generalization to Multivariate Data
Multivariate Gaussian kernel:
• spheric: K(u) = \left(\frac{1}{\sqrt{2\pi}}\right)^{d} \exp\!\left[-\frac{\|u\|^2}{2}\right]
• ellipsoidal: K(u) = \frac{1}{(2\pi)^{d/2}\, |S|^{1/2}} \exp\!\left[-\frac{1}{2} u^T S^{-1} u\right]
where S is the sample covariance matrix. This corresponds to using the Mahalanobis distance instead of the Euclidean distance.
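
A minimal NumPy sketch of a multivariate kernel density estimate with the ellipsoidal (Mahalanobis) Gaussian kernel (illustrative; the window width h is an assumption).

import numpy as np

def multivariate_kde(x, data, h=0.5):
    N, d = data.shape
    S = np.cov(data, rowvar=False)                    # sample covariance matrix S
    S_inv, det = np.linalg.inv(S), np.linalg.det(S)
    u = (x - data) / h                                # (N, d) scaled differences
    maha = np.einsum("ij,jk,ik->i", u, S_inv, u)      # u^T S^{-1} u for each u
    K = np.exp(-0.5 * maha) / ((2 * np.pi) ** (d / 2) * np.sqrt(det))
    return K.sum() / (N * h**d)

data = np.random.default_rng(0).normal(size=(500, 2))
print(multivariate_kde(np.zeros(2), data))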



Nonparametric Classification
• When used for classification, we use the nonparametric approach to
estimate the class-conditional densities p(x|Ci).
• The kernel estimator of the class-conditional density p(x|C_i) is given as

\hat{p}(x|C_i) = \frac{1}{N_i h^d} \sum_{t=1}^{N} K\!\left(\frac{x - x^t}{h}\right) r_i^t

where r_i^t is 1 if x^t \in C_i and 0 otherwise, and N_i = \sum_t r_i^t.



Nonparametric Classification
• The MLE of the prior density is \hat{P}(C_i) = N_i / N.
• Then, the discriminant can be written as

g_i(x) = \hat{p}(x|C_i)\, \hat{P}(C_i) = \frac{1}{N h^d} \sum_{t=1}^{N} r_i^t\, K\!\left(\frac{x - x^t}{h}\right)

and x is assigned to the class for which the discriminant takes its maximum.
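
A minimal NumPy sketch of kernel-based nonparametric classification (illustrative; the spherical Gaussian kernel and h = 0.5 are assumptions): evaluate g_i(x) for each class and pick the maximum.

import numpy as np

def kernel_classify(x, X, y, h=0.5):
    """X: (N, d) training data, y: (N,) integer class labels."""
    N, d = X.shape
    u = (x - X) / h
    K = np.exp(-0.5 * (u**2).sum(axis=1)) / ((2 * np.pi) ** (d / 2))   # spherical Gaussian kernel
    classes = np.unique(y)
    scores = [K[y == c].sum() / (N * h**d) for c in classes]           # g_i(x) for each class
    return classes[int(np.argmax(scores))]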



Nonparametric Classification
• For the special case of the k-NN estimator, we have

\hat{p}(x|C_i) = \frac{k_i}{N_i\, V^k(x)}

where k_i is the number of neighbors out of the k nearest that belong to C_i, and V^k(x) is the volume of the d-dimensional hypersphere centered at x with radius r = \|x - x_{(k)}\|, where x_{(k)} is the k-th nearest observation to x.
Density estimator: \hat{p}(x) = \frac{k}{N\, V^k(x)}
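
Since the resulting posterior estimate is proportional to k_i for a fixed k, the k-nn classifier reduces to a majority vote among the k nearest neighbors. A minimal NumPy sketch (illustrative; k = 5 is an assumption).

import numpy as np

def knn_classify(x, X, y, k=5):
    """X: (N, d) training data, y: (N,) integer class labels."""
    d = np.linalg.norm(X - x, axis=1)          # distances to all training points
    nearest = y[np.argsort(d)[:k]]             # labels of the k nearest neighbors
    values, counts = np.unique(nearest, return_counts=True)
    return values[counts.argmax()]             # class with the most neighbors (largest k_i)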

