K-Means Clustering

Uploaded by

datasciencetrainingnucot

K-means clustering is an unsupervised machine learning algorithm used for
partitioning a dataset into a set of clusters. The goal of K-means clustering is to
group similar data points together and discover underlying patterns in the data.

Suppose you want to cluster data that lies on a straight line into 3 groups. So K is
3 here.

Step 1: Choose the number of clusters (K) you want to identify.

Step 2: Select K initial clusters (there are various ways to do this). A simple
option is to pick K of the data points at random.

Step 3: Measure the distance between the first point and each of the three initial
clusters.

Step 4: Assign the first point to the nearest cluster. Then do the same for all the
remaining points: measure the distances and assign each point to its nearest cluster.

Step 5: Calculate the mean of each cluster.

We then repeat the distance measurements, this time against the cluster means
instead of the initial points. The dataset is divided into K clusters, each cluster
has a mean value, and every data point is placed in the cluster whose mean is
closest to it.

The resulting clustering is very different from what we assumed initially.

We can assess the quality of the clustering by adding up the variation within each
cluster. Since K-means clustering can't see the best clustering, its only option is to
keep track of these clusters and their total variance, and do the whole thing again
with different starting points.
So it starts again from the beginning: it randomly selects 3 points, calculates the
distances, and clusters the data. It then calculates the mean of each cluster,
re-clusters based on the new means, and repeats these steps until the clusters no
longer change. After several such runs, the clustering with the lowest total
variance is kept.
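
The walkthrough above can be sketched in a few lines of Python. This is a minimal
illustration, not a reference implementation: the data values, the function names
(kmeans_1d, total_variation), the number of restarts, and the fixed random seed are
all assumptions made for the example. It clusters points on a line with K = 3,
restarts from several random sets of starting points, and keeps the run with the
smallest total within-cluster variation.

```python
import random

def kmeans_1d(points, k, rng):
    """One run of K-means on a list of numbers; returns (means, clusters)."""
    # Step 2: pick K distinct data points as the initial cluster centers.
    means = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(100):  # safety cap on the number of passes
        # Steps 3-4: assign every point to the cluster with the nearest mean.
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda j: abs(x - means[j]))
            clusters[nearest].append(x)
        # Step 5: recompute each mean; an empty cluster keeps its old center.
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:  # the clusters can no longer change
            break
        means = new_means
    return means, clusters

def total_variation(means, clusters):
    """Sum of squared distances from each point to its cluster mean."""
    return sum((x - m) ** 2 for m, c in zip(means, clusters) for x in c)

# Hypothetical data: three loose groups on a line.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Run several times from different starting points and keep the best run.
runs = [kmeans_1d(points, 3, rng) for _ in range(10)]
best_means, best_clusters = min(runs, key=lambda r: total_variation(*r))
print(sorted(best_means), total_variation(best_means, best_clusters))
```

Some runs may settle on a poor clustering because of where they started, which is
why the total variation of every run is compared before one is chosen.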

How to pick K?
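A common heuristic for this question, offered here as an assumption and not as
something the text specifies, is the "elbow" approach: compute the total
within-cluster variation for several values of K and look for the point where adding
another cluster stops helping much. The sketch below uses hypothetical
one-dimensional data. To stay short it does not run K-means; because the data is tiny
and lies on a line, it finds the best split into K contiguous groups by brute force.
The function names are made up for the example.

```python
from itertools import combinations

def variation(group):
    """Sum of squared distances from each value to the group mean."""
    mean = sum(group) / len(group)
    return sum((x - mean) ** 2 for x in group)

def best_total_variation(points, k):
    """Lowest total variation over all splits of sorted 1-D data into k groups."""
    pts = sorted(points)
    best = float("inf")
    # Choose k-1 cut positions between neighbouring points.
    for cuts in combinations(range(1, len(pts)), k - 1):
        bounds = (0, *cuts, len(pts))
        total = sum(variation(pts[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best

# Hypothetical data: three loose groups on a line.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]
for k in range(1, 6):
    print(k, best_total_variation(points, k))
```

For this data the total variation falls from 295.5 at K = 1 to 75.0 at K = 2 and
1.5 at K = 3. Beyond that, extra clusters barely reduce it, which suggests K = 3.
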
How K-means Clustering Works:

1. Initialization:
- Choose the number of clusters (K) you want to partition the data into.
- Randomly initialize K centroids (cluster centers) in the feature space.

2. Assign Data Points to Clusters:
- For each data point, calculate the distance to each centroid.
- Assign the data point to the cluster whose centroid is closest (usually using
Euclidean distance).

3. Update Cluster Centroids:
- Recalculate the centroid of each cluster by taking the mean of all data points
assigned to that cluster.

4. Repeat:
- Repeat steps 2 and 3 until the centroids no longer change significantly or until a
specified number of iterations is reached.
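
The four steps above can be sketched for points with any number of features. This is
a minimal sketch under stated assumptions, not a production implementation: the
function name kmeans, its parameters (max_iter, tol, seed), and the sample data are
hypothetical, and the centroids are initialized from randomly chosen data points,
which is only one of several possible ways to place them in the feature space.

```python
import math
import random

def kmeans(points, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # 1. Initialization: use K randomly chosen data points as the centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # 2. Assignment: index of the closest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # 3. Update: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep centroid
        # 4. Repeat until no centroid moves more than `tol`.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, labels

# Hypothetical 2-D data with two well-separated groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(data, k=2)
print(centroids, labels)
```

The tol and max_iter parameters correspond to the two stopping conditions in step 4:
centroids that no longer change significantly, or a fixed number of iterations.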

Real-life Applications:

1. Customer Segmentation:
- Segment customers based on their purchasing behavior, demographics, or
website interactions. This helps businesses target specific customer groups with
personalized marketing strategies.

2. Image Compression:
- Cluster similar pixels together in an image to reduce redundancy and compress
the image size without significant loss of information.

3. Anomaly Detection:
- Identify outliers or anomalies in large datasets by clustering normal data points
together. Any data points that do not fit well into any cluster may be considered
anomalies.

4. Document Clustering:
- Group similar documents together based on their content for tasks such as topic
modeling, information retrieval, and recommendation systems.

5. Genetic Clustering:
- Cluster genes based on their expression patterns to identify groups of genes that
are co-regulated or functionally related, aiding in biological research.

6. Market Segmentation:
- Divide markets into segments based on geographical, demographic, or
behavioral characteristics of consumers, helping businesses tailor their products
and services to different market segments.

7. Social Network Analysis:
- Identify communities or groups of individuals with similar interests or
interactions in social networks, enabling targeted advertising or content
recommendations.
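
Several of these applications reduce to the same operation once centroids are
available: find the centroid nearest to a point. The sketch below illustrates this
for image compression and anomaly detection. Everything in it is assumed for the
example: the helper name nearest, the pixel values, the two-colour palette (standing
in for centroids that K-means might produce), and the distance threshold.

```python
import math

def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to `point`."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]

# Image compression: store one palette index per pixel instead of full RGB.
pixels = [(250, 10, 12), (245, 5, 8), (12, 11, 240),
          (8, 14, 250), (252, 8, 10), (10, 9, 245)]
palette = [(249, 8, 10), (10, 11, 245)]  # assumed K = 2 centroids of the colours
indices = [nearest(p, palette)[0] for p in pixels]
restored = [palette[i] for i in indices]  # approximate image after decompression
print(indices)

# Anomaly detection: flag points that are far from every centroid.
centroids = [(1.0, 1.0), (5.0, 5.0)]  # assumed cluster centers
threshold = 1.5                       # hypothetical cut-off distance
readings = [(1.1, 0.9), (4.8, 5.3), (3.0, 9.0)]
anomalies = [r for r in readings if nearest(r, centroids)[1] > threshold]
print(anomalies)
```

In practice the threshold would have to be chosen from the data, for example from
the spread of distances observed within each cluster.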

Overall, K-means clustering is a versatile and widely used algorithm with
applications across various domains where data needs to be grouped into clusters
based on similarity.
