Machine Learning With Python - Machine Learning Algorithms - KNN
• Lazy learning algorithm: KNN is a lazy learning algorithm because it has no specialized training phase; it simply stores all the training data and uses it at classification time.
The K-nearest neighbors (KNN) algorithm uses 'feature similarity' to predict the values of new data points: a new data point is assigned a value based on how closely it matches the points in the training set.
We can understand its working with the help of the following steps:
Step 1: For implementing any algorithm, we need a dataset. So during the first step of KNN, we must load the training as well as the test data.
Step 2: Next, we need to choose the value of K, i.e. the number of nearest data points to consider. K can be any integer.
Step 3: For each point in the test data, compute its distance to every training point, select the K training points with the smallest distances, and assign the most common class among them (for classification) or the average of their values (for regression).
Step 4: End
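The steps above can be sketched as a minimal from-scratch classifier. This is only an illustrative sketch using NumPy and Euclidean distance; the toy dataset, labels, and function name are invented for the example:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Step 3a: compute the Euclidean distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Step 3b: take the indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Step 3c: majority vote among the labels of those k neighbours
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# illustrative toy data: two well-separated clusters
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array(['Red', 'Red', 'Red', 'Blue', 'Blue', 'Blue'])

print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # Red
```

Because the new point lies inside the first cluster, all three of its nearest neighbours are Red, so the vote is unanimous.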
Example
The following is an example to understand the concept of K and the working of the KNN algorithm. Suppose we have a dataset which can be plotted as follows:
We can see in the above diagram the three nearest neighbors of the new data point (marked with a black dot). Among those three, two lie in the Red class, hence the black dot will also be assigned to the Red class.
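The majority vote in this example can be checked directly. The neighbour labels below are assumed from the description of the diagram (two Red, one of another class):

```python
from collections import Counter

# labels of the three nearest neighbours, as described for the diagram
neighbour_labels = ['Red', 'Red', 'Blue']

# the new point takes the class held by the majority of its K neighbours
predicted = Counter(neighbour_labels).most_common(1)[0][0]
print(predicted)  # Red
```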
Implementation in Python
As we know, the K-nearest neighbors (KNN) algorithm can be used for both classification and regression. The following are recipes in Python for using KNN as a classifier and as a regressor:
KNN as Classifier
First, import the necessary Python packages:
import pandas as pd
Next, download the iris dataset from its weblink as follows:
path = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
Next, assign column names and read the dataset into a pandas dataframe:
headernames = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
dataset = pd.read_csv(path, names=headernames)
dataset.head()
Data preprocessing will be done with the help of the following script lines:
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values
Next, divide the data into train and test splits:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40)
Next, scale the features as follows:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
Next, train the model with the help of the KNeighborsClassifier class of sklearn:
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors=8)
classifier.fit(X_train, y_train)
At last we need to make predictions. It can be done with the help of the following script:
y_pred = classifier.predict(X_test)
Next, print the results as follows:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
result = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(result)
result1 = classification_report(y_test, y_pred)
print("Classification Report:")
print(result1)
result2 = accuracy_score(y_test, y_pred)
print("Accuracy:", result2)
Confusion Matrix:
[[21  0  0]
 [ 0 16  0]
 [ 0  7 16]]
Classification Report:
precision    recall    f1-score    support
Accuracy: 0.8833333333333333
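The recipe above fixes n_neighbors=8, but in practice K is usually chosen by comparing accuracy across several candidate values. Below is a minimal sketch of that selection loop using scikit-learn's bundled copy of the iris data (so no download is needed); the candidate K values and random_state are illustrative choices, not part of the original recipe:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.40, random_state=0)

# scale as in the recipe: fit on the training set only
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# try a few values of K and keep the test accuracy for each
scores = {}
for k in (1, 3, 5, 8, 11):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = clf.score(X_test, y_test)

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

A proper tuning loop would use cross-validation on the training set rather than the test set, but the sketch shows the basic idea of sweeping K.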
KNN as Regressor
First, import the packages and load the iris dataset as in the classifier recipe above:
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor
data = pd.read_csv(path, names=headernames)  # path and headernames as defined above
array = data.values
X = array[:, :2]
Y = array[:, 2]
data.shape  # output: (150, 5)
knnr = KNeighborsRegressor(n_neighbors=10)
knnr.fit(X, Y)
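Once fitted, the regressor predicts by averaging the target values of the K nearest neighbours. A small self-contained sketch with one-dimensional toy data (the data and variable names here are invented so the averaging is easy to check by hand):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# toy data: y = x, so predictions are easy to verify
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = np.array([0.0, 1.0, 2.0, 3.0])

toy_reg = KNeighborsRegressor(n_neighbors=2)
toy_reg.fit(X_toy, y_toy)

# the 2 nearest neighbours of 1.4 are x=1 and x=2, so the
# prediction is the mean of their targets: (1 + 2) / 2 = 1.5
print(toy_reg.predict([[1.4]]))  # [1.5]
```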
Pros
· It is a very simple algorithm to understand and interpret.
· It is very useful for nonlinear data because it makes no assumptions about the underlying data distribution.
· It achieves relatively high accuracy, although better supervised learning models exist.
Cons
· It is computationally somewhat expensive because it stores all of the training data and defers distance computation until prediction time.
Applications of KNN
The following are some of the areas in which KNN can be applied successfully:
Banking System
KNN can be used in a banking system to predict whether an individual is fit for loan approval, i.e. whether that individual has characteristics similar to those of defaulters.
Calculating Credit Ratings
KNN algorithms can be used to estimate an individual's credit rating by comparing it with the ratings of persons having similar traits.
Politics
With the help of KNN algorithms, we can classify a potential voter into various classes like "Will Vote", "Will Not Vote", "Will Vote to Party 'Congress'", or "Will Vote to Party 'BJP'".
Other areas in which KNN algorithm can be used are Speech Recognition, Handwriting Detection, Image
Recognition and Video Recognition.