
DECEMBER 10, 2023

MACHINE LEARNING
ASSIGNMENTS
MARIAM MOHAMED ABDELMONAAM ISMAIL (‫)مريم محمد عبد المنعم اسماعيل‬
20047
DR. AHMED YOUSSEF BADAWY
Assignment 1
1. A _________ is a decision support tool that uses a tree-like graph
or model of decisions and their possible consequences, including
chance event outcomes, resource costs, and utility.
a) Decision tree

2. Decision Tree is a display of an algorithm.
a) True

3. What is a Decision Tree?
a) Flow-Chart
b) Structure in which internal node represents test on an attribute, each
branch represents outcome of test, and each leaf node represents class
label
c) Flow-Chart & Structure in which internal node represents test on an
attribute, each branch represents outcome of test, and each leaf node
represents class label
d) None of the mentioned

4. Decision Trees can be used for Classification Tasks.
a) True
b) False

5. Which of the following are Decision Tree nodes?
a) Decision Nodes
b) End Nodes
c) Chance Nodes
d) All of the mentioned

6. Decision Nodes are represented by ____________
a) Disks
b) Squares
c) Circles
d) Triangles

7. Chance Nodes are represented by __________
a) Disks
b) Squares
c) Circles
d) Triangles

8. End Nodes are represented by __________
a) Disks
b) Squares
c) Circles
d) Triangles

9. Which of the following are advantages of Decision Trees?
a) Possible scenarios can be added
b) Use a white box model, if a given result is provided by the model
c) Worst, best, and expected values can be determined for different scenarios
d) All of the mentioned

10. Decision trees are also known as CART. What is CART?
(A) Classification and Regression Trees
(B) Customer Analysis and Research Tool
(C) Communication Access Real-time Translation
(D) Computerized Automatic Rating Technique

11. What are the advantages of Classification and Regression Trees (CART)?
(A) Decision trees implicitly perform variable screening or
feature selection
(B) Can handle both numerical and categorical data
(C) Can handle multi-output problems.
(D) All of the above

12. What are the advantages of Classification and Regression Trees (CART)?
(A) Decision trees require relatively less effort from users for data
preparation
(B) Nonlinear relationships between parameters do not affect tree
performance.
(C) Both (A) and (B)
(D) None of these

13. What are the disadvantages of Classification and Regression Trees
(CART)?
(A) Decision trees can be unstable because small variations in the data might
result in a completely different tree being generated
(B) Decision trees require relatively less effort from users for
data preparation
(C) Nonlinear relationships between parameters do not affect
tree performance.
(D) Decision trees implicitly perform variable screening or
feature selection

14. Decision tree learners may create biased trees if some classes dominate. What is the solution?
(A) balance the dataset prior to fitting
(B) imbalance the dataset prior to fitting
(C) balance the dataset after fitting
(D) No solution possible

15. Decision tree can be used for ______.
(A) classification
(B) regression
(C) Both
(D) None of these

16. Decision tree is a ______ algorithm.
(A) supervised learning
(B) unsupervised learning
(C) Both
(D) None of these

17. Suppose your target variable is whether a passenger will survive or not. What type of decision tree do you need to predict the target variable?
(A) classification tree
(B) regression tree
(C) clustering tree
(D) dimensionality reduction tree

18. Suppose your target variable is the price of a house. What type of decision tree do you need to predict the target variable?
(A) classification tree
(B) regression tree
(C) clustering tree
(D) dimensionality reduction tree

19. What is the maximum depth in a decision tree?
(A) the length of the longest path from a root to a leaf
(B) the length of the shortest path from a root to a leaf
(C) the length of the longest path from a root to a sub-node
(D) None of these

20. What is splitting in the decision tree?
(A) Dividing a node into two or more sub-nodes based on
if-else conditions
(B) Removing a sub-node from the tree
(C) Balance the dataset prior to fitting
(D) All of the above

21: What is a leaf or terminal node in the decision tree?
(A) The end of the decision tree where it cannot be split into further sub-nodes.
(B) Maximum depth
(C) A subsection of the entire tree
(D) A node that represents the entire population or sample

22: What is pruning in a decision tree?
(A) Removing a sub-node from the tree
(B) Dividing a node into two or more sub-nodes based on
if-else conditions
(C) Balance the dataset prior to fitting
(D) All of the above

23: In a decision tree, the measure of the probability of a particular variable being wrongly classified when it is randomly chosen is called _____.
(A) Pruning
(B) Information gain
(C) Maximum depth
(D) Gini impurity

24: Suppose in a classification problem, you are using a decision tree and you use the Gini index as the criterion for the algorithm to select the feature for the root node. The feature with the _____ Gini index will be selected.
(A) maximum
(B) highest
(C) least
(D) None of these

25: In a decision tree algorithm, entropy helps to determine a feature
or attribute that gives maximum information about a class which is
called _____.
(A) Pruning
(B) Information gain
(C) Maximum depth
(D) Gini impurity

26: In a decision tree algorithm, how can you reduce the level of entropy
from the root node to the leaf node?
(A) Pruning
(B) Information gain
(C) Maximum depth
(D) Gini impurity

27: What are the advantages of the decision tree?
(A) Decision trees are easy to visualize
(B) Non-linear patterns in the data can be captured easily
(C) Both
(D) None of the above

28: What are the disadvantages of the decision tree?
(A) Over-fitting of the data is possible.
(B) The small variation in the input data can result in a
different decision tree
(C) We must balance the dataset before training the model
(D) All of the above

29: In Decision Trees, for predicting a class label, the algorithm starts
from which node of the tree?
(A) Root
(B) Leaf
(C) Terminal
(D) Sub-node

30: In a decision tree, which one is true for the root node?
(A) The root node represents the entire population or sample
(B) The Node that does not split is called the root node
(C) Root and leaf node are the same
(D) All of the above

31: What is a Decision Node in a decision tree?
(A) The end of the decision tree where it cannot be split into
further sub-nodes.
(B) When a sub-node splits into further sub-nodes, then it is
called the decision node
(C) The entire population or sample
(D) All of the above
32: What is a Branch in a decision tree?
(A) A subsection of the entire tree
(B) When a sub-node splits into further sub-nodes, then it is
called the decision node
(C) The entire population or sample
(D) All of the above

33: In the decision tree algorithm, a node that is divided into sub-nodes
is called a _____ node of sub-nodes whereas sub-nodes are the _____ of
a parent node.

(A) child, parent
(B) root, leaf
(C) leaf, root
(D) parent, child

34: For a decision tree, which options are true? (Select two)
(A) The root node represents the entire population or sample
(B) When a sub-node splits into further sub-nodes, then it is
called the decision node
(C) When a sub-node splits into further sub-nodes, then it is
called the root node
(D) leaf node and terminal node are different

35: For a decision tree, which options are true? (Select two)
(A) Nodes that do not split are called leaf or terminal nodes
(B) Nodes that do not split are called root nodes
(C) A node, which is divided into sub-nodes is called a parent node of
sub-nodes whereas sub nodes are the child of a parent node
(D) A node that is divided into sub-nodes is called a child node of sub-
nodes whereas sub-nodes are the leaf of a parent node.

36: For a decision tree, which options are true? (Select two)
(A) Splitting and pruning are the same
(B) When we remove sub-nodes of a decision node, this
process is called splitting
(C) Splitting is a process of dividing a node into two or more
sub-nodes
(D) When we remove sub-nodes of a decision node, this process is called
pruning

37: A decision tree classifier selects the attribute that has the _____
Entropy or Largest Information gain.
(A) smallest
(B) largest
(C) mean
(D) median

38: A decision tree algorithm is ______.
(A) stacked queues and deques
(B) searching and sorting
(C) linked lists
(D) recursive

39: A decision tree classifier selects the attribute that has the
smallest Entropy or _____ Information gain.
(A) smallest
(B) largest
(C) mean
(D) median

Assignment 2
Question 1: What is the KNN algorithm?
(A) The KNN algorithm is non-parametric and does not make
assumptions about the underlying distribution of the data.
(B) The KNN works by finding the K closest data points (neighbours) to
the query point and predicts the output based on the labels of these
neighbours.
(C) The KNN algorithm is a lazy machine learning algorithm for
classification and regression tasks. It can work well with both binary and
multi-class classification problems.
(D) All of the above
Question 2: Euclidean and Minkowski distance are the most used
distance metrics in the KNN algorithm. What are the other distance
metrics used in the KNN algorithm?
(A) Cosine distance
(B) Haversine distance
(C) Manhattan distance
(D) All of the above
Question 3: What are the disadvantages of using the KNN algorithm?
(A) As the number of dimensions increases, the distance between any two
points in the space becomes increasingly large, making it difficult
to find meaningful nearest neighbors.
(B) Computationally expensive, especially for large datasets, and requires
a large amount of memory to store the entire dataset.
(C) Sensitive to the choice of K and distance metric.
(D) All of the above

Question 4: How do you choose the value of K (the number of
neighbours to consider) in the KNN algorithm? (Select two)
(A) A small value of K, for example, K=1, will result in a more flexible
model but may be prone to overfitting.
(B) A large value of K, for example, K=n, where n is the size of the
dataset, will result in a more stable model but may not capture the local
variations in the data.
(C) A large value of K, for example, K=n, where n is the size of the
dataset, will result in a more flexible model but may be prone to
overfitting.
(D) A small value of K, for example, K=1, will result in a more stable
model but may not capture the local variations in the data.

Question 5: How do you handle imbalanced data in the KNN algorithm?
(A) Weighted voting, where the vote of each neighbor is weighted by its
inverse distance to the query point. This gives more weight to the closer
neighbors and less weight to the farther neighbors, which can help to
reduce the effect of the majority class.
(B) Oversample the minority class.
(C) Undersample the majority class.
(D) All of the above.

Question 6: How would you choose the distance metric in KNN?
(A) Euclidean distance is a good default choice for continuous data. It
works well when the data is dense and the differences between features are
important.
(B) Manhattan distance is a good choice when the data has many outliers
or when the scale of the features is different. For example, if we are
comparing distances between two cities, the distance metric should not be
affected by the difference in elevation or terrain between the cities.
(C) Minkowski distance with p=1 is equivalent to Manhattan distance, and
Minkowski distance with p=2 is equivalent to Euclidean distance.
Minkowski distance allows you to control the order of the distance metric
based on the nature of the problem.
(D) All of the above

Question 7: What are the ideal use cases for KNN?
(A) KNN is best suited for small to medium-sized datasets with relatively
low dimensionality. It can be useful in situations where the decision
boundary is linear. It can be effective in cases where the data is clustered
or has distinct groups.
(B) KNN is best suited for large datasets with relatively high
dimensionality. It can be useful when the decision boundary is highly
irregular or nonlinear. It can be effective in cases where the data is
clustered or has distinct groups.
(C) KNN is best suited for small to medium-sized datasets with relatively
low dimensionality. It can be useful when the decision boundary is highly
irregular or nonlinear. It can be effective in cases where the data is
clustered or has distinct groups.
(D) KNN is best suited for small to medium-sized datasets with relatively
low dimensionality. It can be useful when the decision boundary is highly
irregular or nonlinear. It can be effective in cases where the data is not
clustered or doesn’t have distinct groups.

Question 8: How does the KNN algorithm work? (Select two)
(A) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors. For
regression, the most common class among the ‘k’ neighbors is assigned as
the predicted class for the new data point.
(B) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors. For
classification, averages the values of the most common class among the ‘k’
neighbor to the target data point.
(C) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors. For
classification, the most common class among the ‘k’ neighbors is assigned
as the predicted class for the new data point.
(D) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors. For
regression tasks, instead of a majority vote, the algorithm takes the average
of the ‘k’ nearest neighbors’ values as the prediction.

Question 9: What’s the bias and variance trade-off for KNN? (Select
two)
(A) A small ‘k’ results in a low bias but high variance (the model is
sensitive to noise).
(B) A large ‘k’ results in a low bias but high variance (the model is
sensitive to noise).
(C) A large ‘k’ leads to high bias but low variance (smoothing over the
data).
(D) A small ‘k’ leads to high bias but low variance (smoothing over the
data).

Question 10: Which options are correct about instance-based learning,
model-based learning, and online learning? (Select two)
(A) KNN is an instance-based learning algorithm, meaning it memorizes
the entire training dataset and makes predictions based on similarity to
instances. That’s why KNN is not naturally suited for online learning
because it memorizes the entire training dataset. When new data is added,
the entire model needs to be recalculated.
(B) Model-based learning involves learning a mapping from inputs to
outputs and generalizing to new, unseen data. For example, SVM,
Decision Trees, etc.
(C) KNN is a model-based learning algorithm, meaning it memorizes the
entire training dataset and makes predictions based on similarity to
instances. That’s why KNN is not naturally suited for online learning
because it memorizes the entire training dataset. When new data is added,
the entire model needs to be recalculated.
(D) Instance-based learning involves learning a mapping from inputs to
outputs and generalizing to new, unseen data. For example, SVM,
Decision Trees, etc.

Assignment 3
Q: What is the KNN algorithm?
A: The KNN (K-Nearest Neighbours) algorithm is a non-parametric and lazy
machine learning algorithm used for classification and regression tasks. It
works by finding the K nearest data points (neighbors) to the query point
and predicts the output based on the labels of these neighbors.
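As a rough sketch of that procedure, the core of KNN classification can be written in a few lines of NumPy. The function and variable names below are illustrative, not from any library:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    # Distance from the query point to every training point (Euclidean)
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of the k nearest neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [6.0, 5.5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.2, 1.4])))  # predicts class 0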

Q: What is the distance metric used in the KNN algorithm?
A: Euclidean distance is the most used distance metric in the KNN algorithm.
However, other distance metrics such as Manhattan distance, Minkowski
distance, and Hamming distance can also be used depending on the
problem.
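In scikit-learn, for example, the metric is simply a constructor argument of KNeighborsClassifier; a minimal sketch with toy data:

from sklearn.neighbors import KNeighborsClassifier

X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = [0, 0, 1, 1]

# Manhattan distance instead of the default Minkowski (p=2, i.e. Euclidean)
clf = KNeighborsClassifier(n_neighbors=3, metric='manhattan')
clf.fit(X, y)
print(clf.predict([[1.1, 1.0]]))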

Q: What is the curse of dimensionality in the KNN algorithm?
A: The curse of dimensionality refers to the problem that arises when the
number of dimensions in the feature space increases. As the number of
dimensions increases, the distance between any two points in the space
becomes increasingly large, making it difficult to find meaningful nearest
neighbors. This problem can be addressed by reducing the dimensionality
of the feature space or by using dimensionality reduction techniques such
as PCA.
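A common pattern is to chain PCA and KNN in one scikit-learn pipeline, so the reduction is fitted only on training folds. A sketch on synthetic data (the component count of 10 is an arbitrary example):

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# 100-dimensional synthetic data reduced to 10 components before KNN
X, y = make_classification(n_samples=500, n_features=100, random_state=0)
pipe = make_pipeline(PCA(n_components=10), KNeighborsClassifier(n_neighbors=5))
print(cross_val_score(pipe, X, y, cv=5).mean())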

Q: What are the advantages of using the KNN algorithm?
A: The advantages of using the KNN algorithm are:

• Simple to implement
• Non-parametric and does not make assumptions about the underlying
distribution of the data
• Can be used for both classification and regression tasks
• Can handle multi-class classification problems

• Can handle both numerical and categorical data

Q: What are the disadvantages of using the KNN algorithm?
A: The disadvantages of using the KNN algorithm are:
• Computationally expensive, especially for large datasets
• Sensitive to the choice of K and distance metric
• Requires a large amount of memory to store the entire dataset
• Can be affected by the presence of noisy or irrelevant features
• Cannot handle missing data

Q: How do you choose the value of K in the KNN algorithm?
A: The choice of K in the KNN algorithm depends on the problem and the
dataset. A small value of K (e.g., K=1) will result in a more flexible model
but may be prone to overfitting. A large value of K (e.g., K=n, where n is
the size of the dataset) will result in a more stable model but
may not capture the local variations in the data. The choice of K can be
determined using techniques such as cross-validation or grid search.
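For instance, a grid search over K with 5-fold cross-validation might look like this (using the Iris dataset purely as a placeholder):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# Try odd values of K from 1 to 29 and keep the best by cross-validated accuracy
grid = GridSearchCV(KNeighborsClassifier(), {'n_neighbors': range(1, 31, 2)}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)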

Q: What is the difference between classification and regression in the KNN algorithm?
A: In classification, the output of the KNN algorithm is a categorical
variable (e.g., class label), whereas in regression, the output is a
continuous variable (e.g., real number). The distance metric and the choice
of K are the same for both classification and regression, but the prediction
function is different.
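The two prediction functions are visible in scikit-learn's two estimators; a toy sketch (the data values are made up):

from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X = [[0], [1], [2], [3], [4], [5]]
y_class = [0, 0, 0, 1, 1, 1]             # categorical target
y_reg = [0.0, 0.8, 2.1, 2.9, 4.2, 5.1]   # continuous target

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_reg)
print(clf.predict([[2.5]]))  # majority vote over the 3 nearest labels
print(reg.predict([[2.5]]))  # mean of the 3 nearest target values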

Q: How do you handle imbalanced data in the KNN algorithm?
A: One approach to handling imbalanced data in the KNN algorithm is to use weighted voting, where the vote of each neighbor is weighted by its inverse distance to the query point. This gives more weight to the closer
neighbors and less weight to the farther neighbors, which can help to
reduce the effect of the majority class. Another approach is to oversample
the minority class or undersample the majority class to balance the dataset.
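In scikit-learn, weighted voting is just weights='distance'; a minimal sketch with a made-up imbalanced toy set:

from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [10], [11]]
y = [0, 0, 0, 1, 1]  # class 1 is the minority

# Each neighbour's vote is weighted by the inverse of its distance
clf = KNeighborsClassifier(n_neighbors=5, weights='distance')
clf.fit(X, y)
print(clf.predict([[9]]))  # the two nearby minority points outvote the majority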

Q: Can the KNN algorithm be used for text classification?
A: Yes, the KNN algorithm can be used for text classification by representing the text data as a bag-of-words or TF-IDF vector and using a distance metric such as cosine similarity. However, the KNN algorithm may not be the
most efficient algorithm for text classification, especially for large
datasets. Other algorithms such as Naive Bayes, SVM, and neural
networks may be more suitable.
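A sketch of that setup, with hypothetical toy documents and labels:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

docs = ["cheap meds buy now", "meeting at noon", "win cash now", "lunch with the team"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham (toy data)

# Cosine distance requires the brute-force neighbour search in scikit-learn
pipe = make_pipeline(
    TfidfVectorizer(),
    KNeighborsClassifier(n_neighbors=3, metric='cosine', algorithm='brute'))
pipe.fit(docs, labels)
print(pipe.predict(["buy cheap cash"]))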

Q: What are the parameters of KNN in scikit-learn?
A: In the scikit-learn library, the main parameters of the KNN algorithm are:
• n_neighbors: The number of neighbors to consider for classification or
regression. This is the K parameter in the KNN algorithm.
• weights: The weight function used in prediction. Possible values are
"uniform", where all neighbors have equal weight, or "distance", where the
weight of each neighbor is proportional to its inverse distance from the
query point.
• algorithm: The algorithm used to compute nearest neighbors. Possible
values are "brute", which performs a brute-force search over all possible
neighbors, "kd_tree", which uses a k-d tree to find the nearest neighbors,
and "ball_tree", which uses a ball tree to find the nearest neighbors.
• leaf_size: The number of points at which the k-d tree or ball tree
algorithm switches to brute-force search. Larger values lead to faster
queries but higher memory consumption.
• metric: The distance metric used to compute the distance between two
points. Possible values are "euclidean" (default), "manhattan",
"chebyshev", "minkowski", "wminkowski", "seuclidean", "mahalanobis",
and others.

• p: The power parameter for the Minkowski distance metric. When p=1,
this is equivalent to the Manhattan distance, and when p=2, this is
equivalent to the Euclidean distance.

There are also additional parameters that can be used for specific purposes,
such as n_jobs to control the number of CPU cores used for computation,
and metric_params to pass additional parameters to the distance metric
function.
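Putting those parameters together, a fully spelled-out constructor call might look like this (the specific values are arbitrary examples, not recommendations):

from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier(
    n_neighbors=7,        # K, the number of neighbours
    weights='distance',   # inverse-distance weighted voting
    algorithm='auto',     # let scikit-learn choose brute / kd_tree / ball_tree
    leaf_size=30,         # tree-to-brute-force switchover point
    metric='minkowski',
    p=2,                  # p=2 makes Minkowski equivalent to Euclidean
    n_jobs=-1,            # use all available CPU cores
)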

Q: What are the default values of parameters for KNN in scikit-learn?
A: In the scikit-learn library, the default values of the main parameters for the KNN algorithm are:
• n_neighbors: 5
• weights: "uniform"
• algorithm: "auto"
• leaf_size: 30
• metric: "minkowski"
• p: 2

These default values are used when no values are specified for these
parameters during the initialization of the KNeighborsClassifier or
KNeighborsRegressor classes. However, it is recommended to tune
these parameters for the specific task and dataset to achieve the best
performance of the model.
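The defaults can also be inspected programmatically rather than memorized, for example:

from sklearn.neighbors import KNeighborsClassifier

print(KNeighborsClassifier().get_params())
# e.g. {'algorithm': 'auto', 'leaf_size': 30, 'metric': 'minkowski',
#       'n_neighbors': 5, 'p': 2, 'weights': 'uniform', ...}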

Q: What are the evaluation metrics for the KNN algorithm in classification tasks?
A: The common evaluation metrics for the KNN algorithm in classification tasks are:
• Accuracy: The proportion of correctly classified instances over the total
number of instances.

• Precision: The proportion of true positives over the total number of
predicted positives.
• Recall: The proportion of true positives over the total number of actual
positives.
• F1 score: The harmonic mean of precision and recall.
• ROC curve and AUC: The ROC (Receiver Operating Characteristic)
curve shows the trade-off between the true positive rate and false positive
rate for different threshold values, while the AUC (Area Under the Curve)
measures the overall performance of the classifier.
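All of these are available in sklearn.metrics; a sketch with made-up predictions:

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]                 # e.g. clf.predict(X_test)
y_score = [0.2, 0.6, 0.9, 0.8, 0.4, 0.1]    # e.g. clf.predict_proba(X_test)[:, 1]

print(accuracy_score(y_true, y_pred))
print(precision_score(y_true, y_pred))
print(recall_score(y_true, y_pred))
print(f1_score(y_true, y_pred))
print(roc_auc_score(y_true, y_score))  # AUC is computed from scores, not hard labels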

Q: What are the evaluation metrics for the KNN algorithm in regression tasks?
A: The common evaluation metrics for the KNN algorithm in regression tasks are:
• Mean Absolute Error (MAE): The average absolute difference between the predicted values and the actual values.
• Mean Squared Error (MSE): The average squared difference between the
predicted values and the actual values.
• Root Mean Squared Error (RMSE): The square root of the MSE.
• R-squared: The proportion of the variance in the dependent variable that
is explained by the independent variable.
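Their sklearn.metrics counterparts, again with made-up numbers:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.8, 5.4, 2.0, 6.5]  # e.g. reg.predict(X_test)

print(mean_absolute_error(y_true, y_pred))
print(mean_squared_error(y_true, y_pred))
print(np.sqrt(mean_squared_error(y_true, y_pred)))  # RMSE
print(r2_score(y_true, y_pred))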

Q: How do you perform cross-validation for the KNN algorithm?
A: Cross-validation is a technique used to evaluate the performance of a machine learning model. The common approach for performing cross-validation for the KNN algorithm is k-fold cross-validation, where the dataset is divided into k equally sized folds. The KNN model is trained on k-1 folds and tested on the remaining fold, and this process is repeated k times with a different fold used for testing each time. The average performance over the k iterations is then used as the estimate of the model performance.
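In scikit-learn this whole procedure is one call to cross_val_score; a sketch using the Iris dataset as a stand-in:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # the averaged estimate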

Q: Can the KNN algorithm handle imbalanced classes?
A: Yes, the KNN algorithm can handle imbalanced classes by using weighted
voting or adjusting the decision threshold. In weighted voting, each
neighbor’s vote is weighted by its inverse distance to the query point,
giving more weight to the closer neighbors and less weight to the farther
neighbors. Adjusting the decision threshold involves
changing the threshold used to classify an instance as positive or negative.
By increasing the threshold, the algorithm becomes more conservative and
tends to classify more instances as negative, which can help to balance the
classes.
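Threshold adjustment can be sketched with predict_proba; in this toy example the threshold is lowered below 0.5 to favour a minority positive class (the data and the 0.3 threshold are hypothetical):

from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3], [10], [11]]
y = [0, 0, 0, 0, 1, 1]  # class 1 is the minority

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
proba = clf.predict_proba([[8]])[:, 1]   # estimated probability of class 1
threshold = 0.3                          # instead of the implicit 0.5
print((proba >= threshold).astype(int))  # prediction under the custom threshold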

Q: How do you tune the hyperparameters of the KNN algorithm?
A: The two main hyperparameters of the KNN algorithm are the number of neighbors (K) and the distance metric. The optimal values of these
hyperparameters can be determined using techniques such as grid search
or randomized search. Grid search involves testing a range of values for
each hyperparameter and selecting the combination of hyperparameters
that gives the best performance. Randomized search is similar to grid
search but samples hyperparameters randomly from a distribution rather
than testing all possible combinations.
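A randomized search over both hyperparameters might be sketched like this (Iris as a placeholder dataset, 20 random draws):

from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions={'n_neighbors': randint(1, 31),
                         'metric': ['euclidean', 'manhattan', 'minkowski']},
    n_iter=20, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)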
