GATE Machine Learning Question Bank
K-Nearest Neighbor & Naive Bayes Classifier
Subject: Computer Science & Engineering / Data Science & AI
Topics: K-Nearest Neighbor Algorithm, Naive Bayes Classifier
Total Questions: 55 (30 MCQs + 15 MSQs + 10 NAT)
Section A: Multiple Choice Questions (MCQs)
Choose the correct option. Each question carries 1 or 2 marks.
K-Nearest Neighbor Questions
Q1. [1 Mark] K-Nearest Neighbor (KNN) algorithm is classified as:
(A) Parametric learning algorithm
(B) Non-parametric learning algorithm
(C) Generative learning algorithm
(D) Feature selection algorithm
Q2. [2 Marks] In the KNN algorithm, which distance measure is most commonly used for continuous
features?
(A) Hamming distance
(B) Manhattan distance
(C) Euclidean distance
(D) Cosine similarity
Q3. [1 Mark] KNN is also known as:
(A) Eager learning algorithm
(B) Lazy learning algorithm
(C) Active learning algorithm
(D) Reinforcement learning algorithm
Q4. [2 Marks] Consider a 2D dataset with points: A(1,1), B(2,3), C(4,2), D(5,5). For a query point
Q(3,3) with K=2, which points will be the nearest neighbors using Euclidean distance?
(A) A and B
(B) B and C
(C) C and D
(D) A and C
Q5. [1 Mark] In KNN, the value of K should be:
(A) Always even
(B) Always odd for binary classification
(C) Always a prime number
(D) Equal to the number of features
Q6. [2 Marks] Which of the following is NOT an advantage of the KNN algorithm?
(A) Simple to understand and implement
(B) No assumptions about data distribution
(C) Fast training phase
(D) High memory requirements during prediction
Q7. [1 Mark] In weighted KNN, closer neighbors are given:
(A) Lower weights
(B) Higher weights
(C) Equal weights
(D) Negative weights
Q8. [2 Marks] The time complexity of the KNN prediction phase for N training samples, D
dimensions, and K neighbors is:
(A) O(1)
(B) O(K)
(C) O(ND)
(D) O(ND + K*log(K))
Q9. [1 Mark] KNN suffers from which problem in high-dimensional spaces?
(A) Underfitting
(B) Curse of dimensionality
(C) Vanishing gradients
(D) Local minima
Q10. [2 Marks] For KNN regression, the predicted output is typically:
(A) The class label of the majority neighbors
(B) The weighted average of neighbor values
(C) The maximum value among neighbors
(D) The median of neighbor values
Naive Bayes Questions
Q11. [1 Mark] Naive Bayes classifier is based on:
(A) Maximum likelihood estimation
(B) Bayes' theorem
(C) Central limit theorem
(D) Law of large numbers
Q12. [2 Marks] The "naive" assumption in Naive Bayes classifier refers to:
(A) Simple computational requirements
(B) Features are conditionally independent given the class
(C) Equal prior probabilities for all classes
(D) Gaussian distribution of features
Q13. [1 Mark] Naive Bayes classifier is a:
(A) Discriminative model
(B) Generative model
(C) Ensemble model
(D) Deep learning model
Q14. [2 Marks] In Naive Bayes, the posterior probability P(C|X) is calculated as:
(A) P(X|C) × P(C)
(B) P(X|C) × P(C) / P(X)
(C) P(C|X) × P(X) / P(C)
(D) P(X) × P(C) / P(X|C)
Q15. [1 Mark] Which variant of Naive Bayes is suitable for binary features?
(A) Gaussian Naive Bayes
(B) Multinomial Naive Bayes
(C) Bernoulli Naive Bayes
(D) Complementary Naive Bayes
Q16. [2 Marks] Laplace smoothing in Naive Bayes is used to handle:
(A) Overfitting
(B) Zero probability problem
(C) Underfitting
(D) Computational complexity
Q17. [1 Mark] Naive Bayes performs particularly well for:
(A) Image classification
(B) Text classification
(C) Time series prediction
(D) Clustering problems
Q18. [2 Marks] The likelihood P(X|C) in Naive Bayes for continuous features following Gaussian
distribution is calculated using:
(A) Frequency count
(B) Probability density function
(C) Cumulative distribution function
(D) Uniform distribution
Q19. [1 Mark] In Multinomial Naive Bayes, features represent:
(A) Binary presence/absence
(B) Continuous values
(C) Count or frequency data
(D) Ordinal categories
Q20. [2 Marks] Which assumption violation affects Naive Bayes performance the LEAST?
(A) Feature independence assumption
(B) Identical distribution assumption
(C) Gaussian distribution assumption (for continuous features)
(D) Equal class prior assumption
Mixed Questions
Q21. [1 Mark] Both KNN and Naive Bayes can be used for:
(A) Only classification
(B) Only regression
(C) Both classification and regression
(D) Neither classification nor regression
Q22. [2 Marks] Compared to Naive Bayes, KNN is:
(A) More sensitive to irrelevant features
(B) Less sensitive to irrelevant features
(C) Equally sensitive to irrelevant features
(D) Not affected by irrelevant features
Q23. [1 Mark] Which algorithm requires storage of entire training dataset during prediction?
(A) Naive Bayes
(B) KNN
(C) Both
(D) Neither
Q24. [2 Marks] For a dataset with missing values, which approach is more suitable?
(A) KNN (after imputation)
(B) Naive Bayes (with appropriate handling)
(C) Both perform equally
(D) Neither can handle missing values
Q25. [1 Mark] Which algorithm has a faster training phase?
(A) KNN
(B) Naive Bayes
(C) Both have same training time
(D) Depends on dataset size
Q26. [2 Marks] In terms of decision boundary, KNN creates:
(A) Linear decision boundaries
(B) Non-linear, complex decision boundaries
(C) Quadratic decision boundaries
(D) No decision boundaries
Q27. [1 Mark] Which algorithm is more interpretable?
(A) KNN
(B) Naive Bayes
(C) Both are equally interpretable
(D) Neither is interpretable
Q28. [2 Marks] For imbalanced datasets, which technique can be applied to improve KNN
performance?
(A) Weighted KNN
(B) SMOTE with KNN
(C) Distance-weighted voting
(D) All of the above
Q29. [1 Mark] Naive Bayes assumes that features follow:
(A) Any probability distribution
(B) A specific probability distribution depending on the variant
(C) Uniform distribution only
(D) Normal distribution only
Q30. [2 Marks] The space complexity of storing a trained model is higher for:
(A) KNN
(B) Naive Bayes
(C) Both have same space complexity
(D) Depends on the number of features
Section B: Multiple Select Questions (MSQs)
Select ALL correct options. No partial marks are awarded.
Q31. [2 Marks] Which of the following are valid distance measures for KNN?
(A) Euclidean distance
(B) Manhattan distance
(C) Hamming distance
(D) Cosine similarity
Q32. [2 Marks] The performance of the KNN algorithm depends on:
(A) Choice of K value
(B) Distance measure used
(C) Feature scaling/normalization
(D) Training set size
Q33. [2 Marks] Which statements about KNN are TRUE?
(A) KNN is a non-parametric algorithm
(B) KNN requires parameter tuning for K
(C) KNN has no explicit training phase
(D) KNN always produces linear decision boundaries
Q34. [2 Marks] Advantages of Naive Bayes classifier include:
(A) Fast training and prediction
(B) Handles categorical and continuous features
(C) Requires small training dataset
(D) Not sensitive to irrelevant features
Q35. [2 Marks] Which Naive Bayes variants exist?
(A) Gaussian Naive Bayes
(B) Multinomial Naive Bayes
(C) Bernoulli Naive Bayes
(D) Polynomial Naive Bayes
Q36. [2 Marks] Challenges with the KNN algorithm include:
(A) Sensitive to curse of dimensionality
(B) Computationally expensive during prediction
(C) Sensitive to noise and outliers
(D) Memory intensive storage requirements
Q37. [2 Marks] For text classification using Naive Bayes, which preprocessing steps are
commonly used?
(A) Tokenization
(B) Stop word removal
(C) TF-IDF weighting
(D) Stemming/Lemmatization
Q38. [2 Marks] To improve KNN performance, which techniques can be used?
(A) Feature selection/dimensionality reduction
(B) Data normalization/scaling
(C) Cross-validation for optimal K
(D) Ensemble methods
Q39. [2 Marks] Which statements about Naive Bayes independence assumption are TRUE?
(A) It assumes features are independent given the class
(B) Violation of this assumption always leads to poor performance
(C) It simplifies probability calculations significantly
(D) Real-world datasets rarely satisfy this assumption perfectly
Q40. [2 Marks] Both KNN and Naive Bayes:
(A) Are supervised learning algorithms
(B) Can handle multi-class classification
(C) Are suitable for online learning
(D) Provide probability estimates for predictions
Q41. [2 Marks] For handling categorical features in KNN:
(A) Hamming distance can be used
(B) One-hot encoding can be applied
(C) Gower distance is suitable
(D) Euclidean distance works directly
Q42. [2 Marks] Hyperparameters that need tuning include:
(A) K value in KNN
(B) Distance metric in KNN
(C) Smoothing parameter in Naive Bayes
(D) Feature weights in KNN
Q43. [2 Marks] Applications where Naive Bayes excels:
(A) Spam email detection
(B) Sentiment analysis
(C) Medical diagnosis
(D) Image recognition
Q44. [2 Marks] Limitations of Naive Bayes include:
(A) Strong independence assumption
(B) Zero probability problem
(C) Limited expressiveness for complex relationships
(D) Requires large datasets
Q45. [2 Marks] Cross-validation techniques applicable to both algorithms:
(A) K-fold cross-validation
(B) Leave-one-out cross-validation
(C) Stratified cross-validation
(D) Time series cross-validation
Section C: Numerical Answer Type (NAT) Questions
Enter the numerical answer (integer or decimal up to 2 places).
Q46. [1 Mark] Consider a 1D KNN problem with training points: {2, 4, 6, 8, 10} and
corresponding labels: {0, 1, 0, 1, 0}. For query point 5 with K=3, what is the predicted class using
majority voting? (If two training points are equidistant, prefer the smaller-valued point.)
Q47. [2 Marks] In a 2D space, calculate the Euclidean distance between points A(1, 3) and B(4,
7). Round your answer to 2 decimal places.
Q48. [2 Marks] Given the following data for Naive Bayes:
P(Sunny|Play=Yes) = 0.6, P(Sunny|Play=No) = 0.4
P(Play=Yes) = 0.7, P(Play=No) = 0.3
P(Sunny) = 0.5
Calculate P(Play=Yes|Sunny). Round to 2 decimal places.
Q49. [1 Mark] For a KNN classifier with K=5, if the 5 nearest neighbors have distances {1.2, 1.5,
2.1, 2.3, 2.8} and classes {A, B, A, A, B}, what is the predicted class if we use inverse distance
weighting? Enter 1 for class A, 2 for class B.
Q50. [2 Marks] In Naive Bayes, if P(Feature1=1|Class=Positive) = 0.8,
P(Feature2=1|Class=Positive) = 0.6, and P(Class=Positive) = 0.4, calculate P(Feature1=1,
Feature2=1|Class=Positive) assuming independence. Round to 2 decimal places.
Q51. [1 Mark] For KNN with K=7, if 4 neighbors vote for class 1 and 3 neighbors vote for class 2,
what is the confidence level (as percentage) for the predicted class? Round to nearest integer.
Q52. [2 Marks] Calculate the Manhattan distance between points P(2, 5) and Q(7, 1).
Q53. [1 Mark] In a Naive Bayes classifier, if we have 100 training examples with 60 positive and
40 negative cases, what is the prior probability P(Positive)? Express as decimal.
Q54. [2 Marks] For a 3-NN classifier, given neighbor distances {0.5, 1.0, 1.5} with corresponding
classes {A, B, A}, calculate the weighted vote for class A using inverse distance weighting.
Round to 2 decimal places.
Q55. [1 Mark] In Naive Bayes with Laplace smoothing (α=1), if a feature value appears 3 times in
10 examples of a class, and we have 5 possible feature values, what is the smoothed
probability? Round to 2 decimal places.
ANSWER KEY
Section A: MCQs
1. (B) - KNN is non-parametric as it makes no assumptions about data distribution
2. (C) - Euclidean distance is most commonly used for continuous features
3. (B) - KNN is lazy learning as it delays computation until query time
4. (B) - Points B(2,3) and C(4,2) are closest to Q(3,3) (see the verification sketch after this list)
5. (B) - Odd K avoids ties in binary classification
6. (D) - High memory requirement is a disadvantage, not advantage
7. (B) - Closer neighbors get higher weights in weighted KNN
8. (C) - Need to calculate distance to all N points in D dimensions
9. (B) - Curse of dimensionality affects KNN performance
10. (B) - KNN regression uses weighted average of neighbor values
11. (B) - Naive Bayes is based on Bayes' theorem
12. (B) - "Naive" refers to conditional independence assumption
13. (B) - Naive Bayes is a generative model
14. (B) - Standard Bayes' theorem formula
15. (C) - Bernoulli NB for binary features
16. (B) - Laplace smoothing handles zero probability
17. (B) - Excels in text classification
18. (B) - Uses probability density function for continuous features
19. (C) - Multinomial NB for count/frequency data
20. (A) - Feature independence violation often has minimal impact
21. (C) - Both can do classification and regression
22. (A) - KNN more sensitive to irrelevant features
23. (B) - KNN requires storing entire training set
24. (B) - Naive Bayes handles missing values better
25. (A) - KNN has a faster (essentially no) training phase
26. (B) - KNN creates complex, non-linear boundaries
27. (B) - Naive Bayes more interpretable via probabilities
28. (D) - All techniques help with imbalanced data
29. (B) - Depends on specific NB variant used
30. (A) - KNN stores entire dataset
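The nearest-neighbor computation behind Answer 4 is easy to check directly. Below is a minimal plain-Python verification sketch; the variable names query and points are illustrative and not part of the question paper.

import math

query = (3, 3)
points = {"A": (1, 1), "B": (2, 3), "C": (4, 2), "D": (5, 5)}

# Euclidean distance from the query point to every training point
distances = {name: math.dist(query, p) for name, p in points.items()}

# The K=2 points with the smallest distances are the nearest neighbors
k_nearest = sorted(distances, key=distances.get)[:2]
print(distances)   # A: 2.83, B: 1.00, C: 1.41, D: 2.83
print(k_nearest)   # ['B', 'C']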
Section B: MSQs
31. (A)(B)(C)(D) - All are valid distance measures
32. (A)(B)(C)(D) - All factors affect KNN performance
33. (A)(B)(C) - KNN is non-parametric, needs K tuning, no training phase
34. (A)(B)(C) - Fast, handles both feature types, small dataset OK
35. (A)(B)(C) - Three main variants exist
36. (A)(B)(C)(D) - All are challenges with KNN
37. (A)(B)(C)(D) - All are common preprocessing steps
38. (A)(B)(C)(D) - All of these techniques improve KNN (see the sketch after this list)
39. (A)(C)(D) - Independence assumption properties
40. (A)(B)(D) - Both supervised, multi-class, provide probabilities
41. (A)(B)(C) - Valid approaches for categorical features
42. (A)(B)(C)(D) - All are hyperparameters to tune
43. (A)(B)(C) - Naive Bayes applications
44. (A)(B)(C) - Main limitations of Naive Bayes
45. (A)(B)(C) - Valid CV techniques for both algorithms
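For Answers 35, 38, and 42, the sketch below uses scikit-learn (assumed to be installed; the iris dataset and step names are purely illustrative) to show the three main Naive Bayes variants and a scaled, cross-validated search for K in KNN.

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Answers 38 and 42: scale the features, then choose K by 5-fold cross-validation
knn_pipeline = Pipeline([("scale", StandardScaler()),
                         ("knn", KNeighborsClassifier())])
search = GridSearchCV(knn_pipeline, {"knn__n_neighbors": [1, 3, 5, 7, 9]}, cv=5)
search.fit(X, y)
print("Best K:", search.best_params_["knn__n_neighbors"])

# Answer 35: the variant should match the feature type. GaussianNB suits the
# continuous iris features; MultinomialNB expects count data and BernoulliNB
# expects binary features, so they are poorer fits here. Their alpha parameter
# (default 1.0) is the Laplace smoothing term from Answer 42.
for variant in (GaussianNB, MultinomialNB, BernoulliNB):
    model = variant().fit(X, y)
    print(variant.__name__, "training accuracy:", round(model.score(X, y), 2))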
Section C: NAT Questions
46. 0 (points 4 and 6 are nearest at distance 1; the third neighbor is a tie between 2 and 8 at distance 3; taking the smaller-valued point 2 gives labels {1, 0, 0}, so the majority class is 0)
47. 5.00 (√[(4-1)² + (7-3)²] = √[9+16] = √25 = 5; see the verification sketch after this list)
48. 0.84 (P(Yes|Sunny) = (0.6×0.7)/0.5 = 0.42/0.5 = 0.84)
49. 1 (Class A: 1/1.2 + 1/2.1 + 1/2.3 ≈ 1.74; Class B: 1/1.5 + 1/2.8 ≈ 1.02; class A wins)
50. 0.48 (0.8 × 0.6 = 0.48, assuming independence)
51. 57 (4/7 = 0.571 ≈ 57%)
52. 9 (|7-2| + |1-5| = 5 + 4 = 9)
53. 0.60 (60/100 = 0.6)
54. 2.67 (Class A weight: 1/0.5 + 1/1.5 = 2.00 + 0.67 = 2.67)
55. 0.27 ((3+1)/(10+5) = 4/15 = 0.267 ≈ 0.27)
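The NAT arithmetic above can be reproduced with the short plain-Python sketch below (values copied from selected NAT answers; Q46 is omitted because of the distance tie noted in its answer).

import math

# Q47: Euclidean distance between (1, 3) and (4, 7)
print(round(math.dist((1, 3), (4, 7)), 2))          # 5.0

# Q48: Bayes' theorem, P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
print(round(0.6 * 0.7 / 0.5, 2))                    # 0.84

# Q49: inverse-distance-weighted votes for classes A and B
vote_a = 1 / 1.2 + 1 / 2.1 + 1 / 2.3
vote_b = 1 / 1.5 + 1 / 2.8
print(round(vote_a, 2), round(vote_b, 2))           # 1.74 1.02 -> class A wins

# Q52: Manhattan distance between (2, 5) and (7, 1)
print(abs(7 - 2) + abs(1 - 5))                      # 9

# Q54: inverse-distance weight for class A (neighbors at distances 0.5 and 1.5)
print(round(1 / 0.5 + 1 / 1.5, 2))                  # 2.67

# Q55: Laplace smoothing, (count + alpha) / (total + alpha * num_values)
print(round((3 + 1) / (10 + 1 * 5), 2))             # 0.27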
Note: This question bank covers fundamental concepts of K-Nearest Neighbor and Naive
Bayes algorithms as typically tested in GATE examinations. Practice these questions to
strengthen your understanding of these important machine learning algorithms.