LAB 7
Implementation of K-nearest Neighbor Classification Model
Background
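Installing the pandas, numpy, scipy and scikit-learn (sklearn) libraries
Go to the Terminal in PyCharm.
Type pip install pandas (and likewise pip install numpy, pip install scipy and pip install scikit-learn).
Or you can install them all by going to Settings > Project Interpreter, clicking +, searching for each package (e.g. scikit-learn) and clicking Install Package.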
KNN Classifier
In pattern recognition and machine learning, k-nearest neighbors (KNN) is a simple algorithm
that stores all available cases and classifies new cases based on a similarity measure (e.g.
distance). KNN is a non-parametric method where the input consists of the k closest training
examples in the feature space. The output is a class membership. An object is classified by a
majority vote of its neighbors, with the object being assigned to the class most common among
its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply
assigned to the class of that single nearest neighbor.
KNN Classification Model
The algorithm for the KNN classification model is given below:
Algorithm: KNearestNeighbors
Input:  Training data X (m samples)
        Training data labels Y
        Sample x to classify
        Number of neighbors k
Output: Decision y_p about sample x
for i ← 1 to m do
    Compute the distance d(X_i, x) between training sample X_i and unlabeled sample x
end for
Compute the set I containing the indices of the k smallest distances d(X_i, x)
Compute the decision class y_p as the majority label among Y_i for i in I
return y_p
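The steps of the algorithm above can be sketched in plain Python as follows. This is only a minimal illustration, not a reference solution: the function name knn_classify, the toy data and the choice of Euclidean distance are all assumptions (the algorithm itself does not fix a distance measure).

```python
import math
from collections import Counter


def knn_classify(X, Y, x, k=3):
    """Classify sample x by majority vote among its k nearest training samples."""
    # Step 1: distance from every training sample X[i] to x
    # (Euclidean distance is an assumption; other measures are possible).
    distances = [math.dist(X_i, x) for X_i in X]
    # Step 2: index set I of the k smallest distances.
    I = sorted(range(len(X)), key=lambda i: distances[i])[:k]
    # Step 3: majority label among Y[i] for i in I.
    votes = Counter(Y[i] for i in I)
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration only
X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
Y = ["A", "A", "B", "B"]

y_p = knn_classify(X, Y, (1.2, 1.4), k=3)
print(y_p)  # A
```

With k = 3 the three nearest neighbors of (1.2, 1.4) carry the labels A, A and B, so the vote returns A. Choosing an odd k avoids tied votes when there are two classes.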
Example
fit performs training on the training data (X_train, Y_train).
predict performs testing on the test data (X_test) and returns the predicted outputs.
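A minimal sketch of this fit/predict pattern, assuming scikit-learn's KNeighborsClassifier is the classifier in use. The toy points and labels below are made up for illustration and are not the lab data.

```python
from sklearn.neighbors import KNeighborsClassifier

# Toy training data, made up for illustration (not the lab dataset)
X_train = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
           [5.0, 5.0], [6.0, 5.5], [5.5, 6.0]]
Y_train = ["A", "A", "A", "B", "B", "B"]
X_test = [[1.2, 1.4], [5.8, 5.6]]

# n_neighbors is the k of KNN
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, Y_train)      # training: stores the labeled samples
Y_pred = model.predict(X_test)   # testing: majority vote among the 3 nearest
print(Y_pred)
```

Each test point here lies inside one of the two clusters, so the first is predicted as A and the second as B.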
Task 1
Write a program to implement a KNN classifier and classify the given vector (the last row of the table) for k = 3.
Age    Loan      Class (Defaulter)
25     40000     N
35     60000     N
45     80000     N
20     20000     N
35     120000    N
52     18000     N
23     95000     Y
40     62000     Y
60     100000    Y
48     220000    Y
33     150000    Y
48     142000    ?
Task 2
Modify Task 1 and perform it for all odd values of k from 1 to 10 (k = 1, 3, 5, 7, 9).
Find and display the accuracy for each value of k.
Find the best k, i.e. the one that gives the highest accuracy.
Also compute the confusion matrix for the best value of k.
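As a hint for the evaluation step, accuracy and the confusion matrix can be computed with scikit-learn's metrics module. The sketch below uses made-up true and predicted labels, not results from the lab data; it only shows how the two functions are called and how to read their output.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration only (not results from the lab data)
y_true = ["N", "N", "Y", "Y", "N", "Y"]
y_pred = ["N", "Y", "Y", "Y", "N", "N"]

# Fraction of predictions that match the true labels
acc = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes, in the order given by labels
cm = confusion_matrix(y_true, y_pred, labels=["N", "Y"])

print(acc)
print(cm)
```

Four of the six predictions match, so the accuracy is 4/6. In the matrix, the first row counts the true N samples (2 predicted N, 1 predicted Y) and the second row counts the true Y samples (1 predicted N, 2 predicted Y).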
Task 3
Implement the KNN algorithm yourself in Python for the Iris dataset, without using a built-in
KNN classifier library.
1. Load the dataset
2. Split the dataset into test and train sets
3. Run the KNN algorithm to make predictions for k = 5
4. Compute the accuracy and the confusion matrix
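Steps 1 and 2 could be started as sketched below. This assumes that using scikit-learn for loading and splitting the data is acceptable, since the task only rules out the built-in KNN classifier; the 80/20 split and the random_state value are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Step 1: load the Iris dataset bundled with scikit-learn
iris = load_iris()
X, y = iris.data, iris.target   # 150 samples, 4 features, 3 classes

# Step 2: hold out 20% of the samples for testing (split ratio is an assumption);
# stratify=y keeps the class proportions equal in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible, which helps when comparing accuracy between runs.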
DELIVERABLES:
1. Screenshots of all the tasks listed above, showing commands as well as results.
2. [Link], [Link], [Link]
3. Submit hard and soft copies of the report before the deadline. Marks will be deducted for
late submissions.