Clustering Techniques in Data Analysis

Uploaded by

ANANTHI K

ROHINI COLLEGE OF ENGINEERING AND TECHNOLOGY

4.3 CLUSTERING

• Given a set of objects, place them in groups such that the objects in a group are similar
(or related) to one another and different from (or unrelated to) the objects in other
groups.
• Cluster analysis can be a powerful data-mining tool for any organization that
needs to identify discrete groups of customers, sales transactions, or other types of
behaviors and things. For example, insurance providers use cluster analysis to detect fraudulent
claims, and banks use it for credit scoring.

• Cluster analysis uses mathematical models to discover groups of similar customers
based on the smallest variations among customers within each group.

• A cluster is a group of objects that belong to the same class. In other words, similar
objects are grouped in one cluster and dissimilar objects are grouped in other clusters.
• Clustering is the process of partitioning a set of data into a set of meaningful subclasses. Every
data item in a subclass shares a common trait. It helps a user understand the natural grouping or
structure in a data set.
• The various types of clustering methods are partitioning methods, hierarchical clustering, fuzzy
clustering, density-based clustering and model-based clustering.
• Cluster analysis is the process of grouping a set of data objects into clusters.

Desirable properties of a clustering algorithm are as follows:

1. Scalability (in terms of both time and space)
2. Ability to deal with different data types.
3. Minimal requirements for domain knowledge to determine input parameters.
4. Interpretability and usability.

CS3491-ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

Clustering of data is a method by which large sets of data are grouped into clusters of smaller sets
of similar data. Clustering can be considered the most important unsupervised learning problem.

• A cluster is therefore a collection of objects which are similar to one another and dissimilar
to the objects belonging to other clusters.

In this case we can easily identify the 4 clusters into which the data can be divided; the similarity
criterion is distance: two or more objects belong to the same cluster if they are "close" according to
a given distance (in this case geometrical distance). This is called distance-based clustering.

• Clustering means grouping data, or dividing a large data set into smaller data sets that share
some similarity.
• A clustering algorithm attempts to find natural groups of components or data based on some
similarity. The clustering algorithm also finds the centroid of each group of data.
• To determine cluster membership, most algorithms evaluate the distance between a point
and the cluster centroids. The output from a clustering algorithm is basically a statistical
description of the cluster centroids with the number of components in each cluster.
• Cluster centroid: The centroid of a cluster is a point whose parameter values are the mean
of the parameter values of all the points in the cluster. Each cluster has a well-defined
centroid.
• Distance: The distance between two points is taken as a common metric to assess the
similarity among the components of a population. The commonly used distance measure is
the Euclidean metric, which defines the distance between two points as the square root of
the sum of the squared differences of their coordinates.
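
The centroid and distance definitions above can be illustrated with a minimal Python sketch. The function names (centroid, euclidean, nearest_centroid) and the toy points are assumptions made for illustration only; they are not part of these notes or of any particular library.

```python
import math

def centroid(points):
    """Mean of each coordinate over all points in a cluster."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def euclidean(a, b):
    """Euclidean distance: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point (cluster membership)."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))

# Hypothetical toy data: two small groups of 2-D points.
cluster_a = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
cluster_b = [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0), (9.0, 9.0)]

centroids = [centroid(cluster_a), centroid(cluster_b)]
print(centroids)                                # [(1.5, 1.5), (8.5, 8.5)]
print(nearest_centroid((3.0, 2.0), centroids))  # 0, i.e. closer to cluster_a
```

Membership is decided here purely by the smallest distance to a centroid, which is the distance-based criterion described above.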


• The goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But
how do we decide what constitutes a good clustering? It can be shown that there is no absolute
"best" criterion which would be independent of the final aim of the clustering.
Consequently, it is the user who must supply this criterion, in such a way that the result of the
clustering will suit their needs.
• Clustering analysis helps construct a meaningful partitioning of a large set of objects. Cluster
analysis has been widely used in numerous applications, including pattern recognition, data
analysis, image processing etc.
• Clustering algorithms may be classified as listed below:
1. Exclusive clustering
2. Overlapping clustering
3. Hierarchical clustering
4. Probabilistic clustering

• A good clustering method will produce high-quality clusters with high intra-class similarity and
low inter-class similarity. The quality of a clustering result depends on both the similarity
measure used by the method and its implementation. The quality of a clustering method is
also measured by its ability to discover some or all of the hidden patterns.
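
One simple way to check the "high intra-class similarity, low inter-class similarity" idea is to compare average distances inside a cluster with average distances between clusters. The sketch below is only an illustrative assumption: the function names and toy data are hypothetical, and these notes do not prescribe a specific quality measure.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_intra_distance(cluster):
    """Average pairwise distance inside one cluster (needs at least two points).
    A small value means high intra-class similarity."""
    pairs = [(p, q) for i, p in enumerate(cluster) for q in cluster[i + 1:]]
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)

def mean_inter_distance(cluster_1, cluster_2):
    """Average distance between points of two different clusters.
    A large value means low inter-class similarity."""
    total = sum(euclidean(p, q) for p in cluster_1 for q in cluster_2)
    return total / (len(cluster_1) * len(cluster_2))

# Hypothetical toy clusters: each is tight, and the two are far apart.
tight = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
far = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

print(mean_intra_distance(tight))       # small: points in the cluster are close
print(mean_inter_distance(tight, far))  # large: the clusters are well separated
```

Under this assumed measure, a good clustering shows intra-cluster averages that are much smaller than the inter-cluster averages.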

• Types of clustering techniques. The major clustering techniques are:

a) Partitioning methods
b) Hierarchical methods
c) Density-based methods.
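
As an illustration of a partitioning method, the sketch below implements a minimal version of k-means, a well-known partitioning algorithm that these notes do not describe in detail. It is a sketch under stated assumptions, not a reference implementation: the initial centroids are passed in by the caller (rather than chosen at random) so that the result is reproducible, and all names and data are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_point(points):
    """Centroid: coordinate-wise mean of the points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop changing."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical toy data with two obvious groups and hand-picked starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, initial_centroids=[(1.0, 1.0), (9.0, 8.0)])
print(centroids)
print(clusters)
```

Each point ends up in exactly one cluster, which is what makes this a partitioning (exclusive) method; hierarchical and density-based methods organize the data differently and are not covered by this sketch.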
