Evaluation of Classification Algorithms
Precision
It explains how many of the predicted positive cases
actually turned out to be positive. Precision is useful in
cases where a False Positive is a higher concern than a
False Negative. Precision for a label is defined as
the number of true positives divided by the
number of predicted positives:
Precision = TP / (TP + FP)
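A minimal sketch of computing precision, assuming scikit-learn is available; the label vectors below are made-up toy values for illustration.

from sklearn.metrics import precision_score

# Toy ground-truth and predicted labels (illustrative values only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Precision = TP / (TP + FP): TP = 3 (indices 0, 2, 6), FP = 1 (index 5)
print(precision_score(y_true, y_pred))  # 0.75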
Recall (Sensitivity)
It explains how many of the actual positive cases we were
able to predict correctly with our model. Recall is a useful
metric in cases where a False Negative is of higher concern
than a False Positive.
Recall for a label is defined as the number of true positives
divided by the total number of actual positives:
Recall = TP / (TP + FN)
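A matching sketch for recall, again assuming scikit-learn and the same toy labels as above.

from sklearn.metrics import recall_score

# Same toy labels as in the precision sketch (illustrative values only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Recall = TP / (TP + FN): TP = 3 (indices 0, 2, 6), FN = 1 (index 3)
print(recall_score(y_true, y_pred))  # 0.75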
F1 Score
It gives a combined idea about the Precision and Recall
metrics. The F1 Score is the harmonic mean of Precision and Recall:
F1 = 2 * (Precision * Recall) / (Precision + Recall)
For a given average of Precision and Recall, it is highest
when Precision equals Recall.
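A short sketch showing that the library result matches the harmonic-mean formula, assuming scikit-learn and the same toy labels as above.

from sklearn.metrics import f1_score, precision_score, recall_score

# Same toy labels as above (illustrative values only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p = precision_score(y_true, y_pred)  # 0.75
r = recall_score(y_true, y_pred)     # 0.75

# Harmonic mean of precision and recall
print(2 * p * r / (p + r))       # 0.75
print(f1_score(y_true, y_pred))  # 0.75, same value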
AUC-ROC
The Receiver Operating Characteristic (ROC) curve plots the
TPR (True Positive Rate) against the FPR (False Positive Rate)
at various threshold values. The AUC (Area Under the Curve)
summarises the curve as a single number and measures how well
the model separates the 'signal' (positive class) from the
'noise' (negative class).
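A minimal sketch of building the ROC curve and computing AUC, assuming scikit-learn; the labels and predicted probabilities below are made-up toy values.

from sklearn.metrics import roc_auc_score, roc_curve

# Toy ground-truth labels and predicted probabilities (illustrative values only)
y_true  = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

# ROC curve: TPR and FPR at the thresholds implied by the scores
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# AUC summarises the whole curve as a single number between 0 and 1
print(roc_auc_score(y_true, y_score))  # ~0.89 for these values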
K-fold cross-validation
In K-fold cross-validation, the dataset is divided into K
folds, which are used to assess how well the model generalises
to new, unseen data.
K represents the number of groups into which the data
sample is divided. For example, if K = 5, it is called
5-fold cross-validation.
It involves splitting the dataset into K subsets or folds,
where each fold is used as the validation set in turn while
the remaining K-1 folds are used for training.
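A minimal sketch of 5-fold cross-validation, assuming scikit-learn; the dataset (Iris) and the logistic-regression model are arbitrary choices for illustration.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=5 splits the data into 5 folds; each fold serves once as the
# validation set while the remaining 4 folds are used for training
scores = cross_val_score(model, X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # average performance across the folds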