Machine Learning
Answer: A
2) Bayesian classifiers is
Answer: A
3) Algorithm is
Answer: B
4) Bias is
Answer: B
Answer: A
6) Case-based learning is
Answer: C
7) Classification is
Answer: A
Answer: A
9) Classification accuracy is
Answer: B
Answer: B
11) Cluster is
Answer: A
Answer: C
A) Complete
B) Consistent
C) Constant
D) None of these
Answer: A
Answer: A
A) Complete
B) Consistent
C) Constant
D) None of these
Answer: B
Answer: B
Answer: C
Answer: C
19) Hybrid is
Answer: A
20) Discovery is
Answer: C
Answer: C
23) Enrichment is
Answer: A
Answer: A
Answer: B
26) Heuristic is
Answer: B
Answer: A
Answer: B
Answer: B
Answer: A
Answer: C
Answer: C
Answer: A
35) Learning is
Answer: C
Answer: C
Answer: A
Answer: C
39) Node is
A) A component of a network
B) In the context of KDD and data mining, this refers to random
errors in a database table.
C) One of the defining aspects of a data warehouse
D) None of these
Answer: A
Answer: B
Answer: C
Answer: A
Answer: B
Answer: B
45) Prediction is
Answer: A
Answer: C
Answer: B
(a) Regression
(b) Classification
(c) Clustering
(d) inference of associative rules
(e) All (a), (b), (c) and (d) above.
Answer: E
Explanation: Regression, classification, clustering, and
inference of association rules are all data mining tasks.
Answer: A
Explanation: In a data warehouse, if D1 and D2 are two
conformed dimensions, then D1 may be an exact replica of D2.
(a) Informatica
(b) Oracle warehouse builder
(c) Datastage
(d) Visual studio
(e) DT/studio.
Answer: D
Explanation: Visual Studio is not an ETL tool.
i) Data streams
ii) Sequence data
iii) Networked data
iv) Text data
v) Spatial data
A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection
A) cost-sensitive
B) work-sensitive
C) time-sensitive
D) technical-sensitive
Answer: C) time-sensitive
A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection
A) i, ii and iv only
B) ii, iii and iv only
C) i, ii and iii only
D) All i, ii, iii and iv
A) Knowledge Database
B) Knowledge Discovery Database
C) Knowledge Data House
D) Knowledge Data Definition
A) Data
B) Information
C) Query
D) Useful information
A) Data Mining
B) Data Warehousing
C) Document Mining
D) Text Mining
A) OLAP
B) OLTP
C) SMTP
D) FTP
Answer: B) OLTP
64) An .................. system is market-oriented and is used
for data analysis by knowledge workers, including managers,
executives, and analysts.
A) OLAP
B) OLTP
C) Both of the above
D) None of the above
Answer: A) OLAP
A) Star schema
B) Snowflake schema
C) Fact constellation
D) Star-snowflake schema
A) top-down view
B) data warehouse view
C) data source view
D) business query view
A) many to many
B) one to one
C) one to many
D) many to one
A) top-down view
B) data warehouse view
C) data source view
D) business query view
A) Metadata
B) Current detail data
C) Lightly summarized data
D) Component Key
A) Information processing
B) Analytical processing
C) Data mining
D) Transaction processing
A) DBMS
B) RDBMS
C) Sybase
D) SQL Server
Answer: B) RDBMS
A) Information processing
B) Analytical processing
C) Data mining
D) Transaction processing
A) Multidimensional cube
B) Dimensions cube
C) Data cube
D) Data model
A) Forecasting
B) Data Mining
C) Analysis of large volumes of product sales data
D) All of the above
A) normalized
B) informational
C) summary
D) denormalized
Answer: C) summary
A) Hardware
B) Software
C) End users
D) Middle ware
A) flexibility
B) quantify
C) qualify
D) ability
Answer: A) flexibility
A) Operational database
B) Relational database
C) Multidimensional database
D) Data repository
Answer: B
Explanation: Data access tools to be used when deciding on the
data structure of a data mart.
82) The process of removing the deficiencies and loopholes in
the data is called
Answer: C
Explanation: The process of removing the deficiencies and
loopholes in the data is called cleaning up of data.
(a) OLTP
(b) OLAP
(c) Spread sheet
(d) XML
(e) All (a), (b), (c) and (d) above.
Answer: B
Explanation: Online Analytical Processing (OLAP) manages both
current and historic transactions.
(a) Partitioning
(b) Grid
(c) Cluster
(d) Table
(e) Data source.
Answer: C
Explanation: Cluster is the collection of data objects that are
similar to one another within the same group.
Answer: A
Explanation: The KDD process includes data cleaning, data
integration, data selection, data transformation, data mining,
pattern evaluation, and knowledge presentation.
Answer: B
Explanation: Dimensional models can be created at Architecture
models level.
(a) Verbose
(b) Descriptive
(c) Equally unavailable
(d) Complete
(e) Indexed.
Answer: C
Explanation: Equally unavailable is not related to dimension
table attributes.
89) Data warehouse bus matrix is a combination of
Answer: A
Explanation: Data warehouse bus matrix is a combination of
Dimensions and data marts.
Answer: E
Explanation: "Ensure that the transaction edit flag is used for
analysis" is not a managing issue in the modeling process.
Answer: A
Explanation: Data modeling technique used for data marts is
Dimensional modeling.
Answer: C
Explanation: An OLAP tool provides for Slicing and dicing.
Answer: C
Explanation: The synonym for data mining is Knowledge discovery
in Database.
Answer: D
Explanation: The correct statement is that the fact table of a
data warehouse is the main store of all of the recorded
transactions over time.
Answer: A
Explanation: The most common kind of query in a data
warehouse is the inside-out query.
Answer: B
Explanation: Concept description is the basic form of
descriptive data mining.
(a) If a set cannot pass a test, all of its supersets will fail
the same test as well
(b) To improve the efficiency the level-wise generation of
frequent item sets
(c) If a set can pass a test, all of its supersets will fail
the same test as well
(d) To decrease the efficiency the level-wise generation of
frequent item sets
(e) All (a), (b), (c) and (d) above.
Answer: B
Explanation: The Apriori property is used to improve the
efficiency of the level-wise generation of frequent itemsets.
Answer: D
Explanation: A disposable data mart is a set of data created
to support a specific short-lived business situation.
I. Administrative.
II. Business.
III. Operational.
Answer: E
Explanation: The different types of metadata are
administrative, business, and operational.
Answer: D
Explanation: Multiple regression is an extension of linear
regression involving more than one predictor variable.
Answer: B
Explanation: Rapid changing dimension policy should not be
considered for each dimension attribute.
Answer: A
Explanation: A business Intelligence system requires data from
Data warehouse
(a) Biomedical
(b) DNA data analysis
(c) Financial data analysis
(d) Retail industry and telecommunication industry
(e) All (a), (b), (c) and (d) above.
Answer: E
Explanation: Data mining application domains are Biomedical,
DNA data analysis, Financial data analysis and Retail industry
and telecommunication industry
Answer: A
Explanation: The generalization of multidimensional attributes
of a complex object class can be performed by examining each
attribute, generalizing each attribute to simple-valued data,
and constructing a multidimensional data cube, which is called
an object cube.
Answer: A
Explanation: Building a data mart for a business process or
department that is very critical for your organization is a
high-risk, high-reward project.
Answer: A
Explanation: A business intelligence system will have OLAP,
data mining, and reporting tools.
Solution: (B)
A) PCA
B) K-Means
Solution: (A)
A) TRUE
B) FALSE
Solution: (A)
Y = X^2. Note that they are not only associated, but one is a
function of the other, and the Pearson correlation between them
is 0.
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (A)
1. Number of Trees
2. Depth of Tree
3. Learning Rate
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (B)
6) Imagine, you are working with “Analytics Vidhya” and you want
to develop a machine learning algorithm which predicts the
number of views on the articles.
Your analysis is based on features like author name, number of
articles written by the same author on Analytics Vidhya in the
past, and a few other features. Which of the following
evaluation metrics would you choose in that case?
2. Accuracy
3. F1 Score
A) Only 1
B) Only 2
C) Only 3
D) 1 and 3
E) 2 and 3
F) 1 and 2
Solution:(A)
B)
C)
A) 1 is tanh, 2 is ReLU and 3 is SIGMOID activation functions.
Solution: (D)
[0,0,0,1,1,1,1,1]
Solution: (A)
So the answer is A.
D) Both A and B
E) None of these
Solution: (D)
Both are true. OHE will fail to encode categories that are
present in test but not in train, so this can be one of the
main challenges when applying OHE. The challenge given in
option B is also true: you need to be more careful when applying
OHE if the frequency distribution isn’t the same in train and
test.
10) The skip-gram model is one of the best models used in the
Word2vec algorithm for word embedding. Which one of the
following models depicts the skip-gram model?
A) A
B) B
C) Both A and B
D) None of these
Solution: (B)
A) ReLU
B) tanh
C) SIGMOID
D) None of these
Solution: (B)
A) TRUE
B) FALSE
Solution: (B)
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 1 and 3
F) 2 and 3
Solution: (E)
In statistical hypothesis testing, a type I error is the
incorrect rejection of a true null hypothesis (a “false
positive”), while a type II error is incorrectly retaining a
false null hypothesis (a “false negative”).
1. Stemming
3. Object Standardization
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Stop words are words that are not relevant to the context of
the data, for example is/am/are.
15) Suppose you want to project high dimensional data into lower
dimensions. The two most famous dimensionality reduction
algorithms used here are PCA and t-SNE. Let’s say you have
applied both algorithms respectively on data “X” and you got the
datasets “X_projected_PCA” , “X_projected_tSNE”.
Solution: (B)
Context: 16-17
Given below are three scatter plots for two features (Image 1, 2
& 3 from left to right).
A) Features in Image 1
B) Features in Image 2
C) Features in Image 3
Solution: (D)
A) Only 1
B) Only 2
C) Only 3
D) Either 1 or 3
E) Either 2 or 3
Solution: (E)
You cannot remove both features, because after removing both
you will lose all of the information. So you should either
remove only one feature or use a regularization algorithm like
L1 or L2.
18) Adding a non-important feature to a linear regression model
may result in.
1. Increase in R-square
2. Decrease in R-square
A) Only 1 is correct
B) Only 2 is correct
C) Either 1 or 2
D) None of these
Solution: (A)
E) D1 = C1, D2 = C2, D3 = C3
F) Cannot be determined
Solution: (E)
Correlation between the features won’t change if you add or
subtract a value in the features.
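This follows directly from the definitions. A short derivation for arbitrary constants a and b (symbols introduced here only for illustration):

```latex
\operatorname{cov}(X + a,\, Y + b)
  = \mathbb{E}\big[(X + a - \mathbb{E}[X] - a)(Y + b - \mathbb{E}[Y] - b)\big]
  = \operatorname{cov}(X, Y),
\qquad
\operatorname{var}(X + a) = \operatorname{var}(X)

r(X + a,\, Y + b)
  = \frac{\operatorname{cov}(X + a,\, Y + b)}{\sqrt{\operatorname{var}(X + a)\,\operatorname{var}(Y + b)}}
  = r(X, Y)
```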
Your model has 99% accuracy after taking the predictions on test
data. Which of the following is true in such a case?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: (A)
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) Only 1
E) Only 2
Solution: (A)
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3
Solution: (D)
D) Can’t estimate
Solution: (D)
A) 1000-1500 second
B) 1500-3000 Second
D) None of these
Solution: (D)
H TE VE
1 105 90
2 200 85
3 250 96
4 105 85
5 300 100
A) 1
B) 2
C) 3
D) 4
E) 5
Solution: (D)
26) What would you do in PCA to get the same projection as SVD?
C) Not possible
D) None of these
Solution: (A)
When the data has a zero mean vector PCA will have same
projections as SVD, otherwise you have to centre the data first
before taking SVD.
You can also think that this black box algorithm is same as 1-NN
(1-nearest neighbor).
B) FALSE
Solution: (A)
28) Instead of using the 1-NN black box we want to use the j-NN
(j>1) algorithm as the black box. Which of the following options
is correct for finding k-NN using j-NN?
2. J > k
3. Not possible
A) 1
B) 2
C) 3
Solution: (A)
29) Suppose you are given 7 Scatter plots 1-7 (left to right)
and you want to compare Pearson correlation coefficients between
variables of each scatterplot.
2. 1>2>3 > 4
3. 7<6<5<4
4. 7>6>5>4
A) 1 and 3
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: (B)
1.
If a classifier is confident about an incorrect
classification, then log-loss will penalise it heavily.
A) 1 and 3
B) 2 and 3
C) 1 and 2
D) 1,2 and 3
Solution: (D)
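To illustrate the statement that log-loss heavily penalises confident mistakes, here is a minimal sketch of binary log-loss for a single prediction. The helper name `log_loss_single` and the probabilities are made up for illustration.

```python
import math

def log_loss_single(y_true, p_pred, eps=1e-15):
    """Binary log-loss for one example; p_pred is the predicted P(y = 1)."""
    p = min(max(p_pred, eps), 1 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# The true label is 1 in all three cases (illustrative probabilities).
confident_right = log_loss_single(1, 0.99)
unsure = log_loss_single(1, 0.50)
confident_wrong = log_loss_single(1, 0.01)

print(confident_right, unsure, confident_wrong)
```

The loss grows without bound as the predicted probability of the true class approaches zero, which is why the confident wrong prediction costs far more than the unsure one.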
Question 31-32
A) 0
B) 0.4
C) 0.8
D) 1
Solution: (C)
A) 1NN
B) 3NN
C) 4NN
Solution: (A)
33) Suppose you are given the below data and you want to apply a
logistic regression model for classifying it in two given
classes.
You are using logistic regression with L1 regularization.
Where C is the
regularization parameter and w1 & w2 are the coefficients of x1
and x2.
Solution: (B)
Note: All other hyper parameters are same and other factors are
not affected.
A) Only 1
B) Only 2
C) Both 1 and 2
Solution: (A)
A) 2 and 3
B) 1 and 3
C) 1 and 2
D) All of above
Solution: (D)
1. Accuracy is ~0.91
A) 1 and 3
B) 2 and 4
C) 1 and 4
D) 2 and 3
Solution: (C)
The true positive rate is the fraction of actual positives that
are predicted as positive, so here it is 100/105 ≈ 0.95. It is
also known as “sensitivity” or “recall”.
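The arithmetic can be checked with a small sketch. The function name `true_positive_rate` is hypothetical, and the count of 5 false negatives is inferred from the 100/105 figure in the explanation.

```python
def true_positive_rate(true_positives, false_negatives):
    """Recall / sensitivity: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)

# 100 of 105 actual positives were predicted positive,
# which implies 5 false negatives.
tpr = true_positive_rate(100, 5)
print(round(tpr, 2))  # 0.95
```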
2. Depth of tree
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
E) Can’t say
Solution: (E)
Context 38-39
Imagine, you have a 28 * 28 image and you run a 3 * 3
convolution neural network on it with the input depth of 3 and
output depth of 8.
38) What is the dimension of output feature map when you are
using the given parameters.
Solution: (A)
39) What are the dimensions of the output feature map when you
are using the following parameters?
Solution: (B)
Same as above
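For questions of this kind, the spatial output size of a convolution follows the standard formula floor((n + 2p - k) / s) + 1. The sketch below applies it to the 28 * 28 input and 3 * 3 kernel from the context. The stride and padding values are assumptions for illustration, since the parameter lists of the two questions are not reproduced here.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size of a convolution output: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# 28x28 input, 3x3 kernel and output depth 8 come from the context above.
# The stride and padding values below are assumed for illustration only.
for stride, padding in [(1, 0), (1, 1), (2, 0)]:
    side = conv_output_size(28, 3, stride, padding)
    print(f"stride={stride}, padding={padding} -> {side} x {side} x 8")
```

The input depth of 3 does not appear in the result: each of the 8 filters spans the full input depth and produces one output channel.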
40) Suppose, we were plotting the visualization for different
values of C (Penalty parameter) in SVM algorithm. Due to some
reason, we forgot to tag the C values with visualizations. In
that case, which of the following option best explains the C
values for the images below (1,2,3 left to right, so C values
are C1 for image1, C2 for image2 and C3 for image3 ) in case of
rbf kernel.
A) C1 = C2 = C3
B) C1 > C2 > C3
C) C1 < C2 < C3
D) None of these
Solution: (C)
2. The most widely used metrics and tools to assess a classification model are:
A. Confusion matrix
B. Cost-sensitive accuracy
C. Area under the ROC curve
D. All of these
ANSWER: D
6. Statistical significance is
A. The science of collecting, organizing and applying numerical facts
B. Measure of the probability that a certain hypothesis is incorrect given certain
observations
C. One of the defining aspects of a data warehouse, which is specially built around
all the existing applications of the operational data
D. None of these
ANSWER: B
7. Which of the following is an example of feature extraction?
A. Constructing bag of words vector from an email
B. Applying PCA projections to large high-dimensional data
C. Removing stopwords in a sentence
D. All of these
ANSWER: D
8. How can you prevent a clustering algorithm from getting stuck in bad local optima?
A. Set the same seed value for each run
B. Use multiple random initializations
C. Both A and B
D. None of these
ANSWER: B
12. Classification is
A. Subdivision of a set of examples into a number of classes
B. Measure of the accuracy, of the classification of a concept that is given by a
certain theory
C. The task of assigning a classification to a set of examples
D. None of these
ANSWER: A
14. Cluster is
A. Group of similar objects that differ significantly from other objects
B. Operations on a database to transform or simplify data in order to prepare it for
a machine-learning algorithm
C. Symbolic representation of facts or ideas from which information can potentially
be extracted
D. None of these
ANSWER: A
15. Suppose you are given an EM algorithm that finds maximum likelihood estimates for
a model with latent variables. You are asked to modify the algorithm so that it finds MAP
estimates instead. Which step or steps do you need to modify?
A. Expectation
B. Maximization
C. No modification necessary
D. Both A & B
ANSWER: B
16. Compared to the variance of the Maximum Likelihood Estimate (MLE), the variance
of the Maximum A Posteriori (MAP) estimate is ________
A. Higher
B. Same
C. Lower
D. It could be any of the above
ANSWER: C
19. Predicting whether or not it will rain tomorrow evening at a particular time
is a type of _________ problem.
A. Classification
B. Regression
C. Unsupervised learning
D. All of these
ANSWER: A
21. A feature F1 can take certain values: A, B, C, D, E, & F, and represents the grade of
students from a college. Which of the following statements is true in this case?
A. Feature F1 is an example of nominal variable.
B. Feature F1 is an example of ordinal variable.
C. It doesn’t belong to any of the above category.
D. Both of A & B
ANSWER: B
22. If your training loss increases with number of epochs, which of the following could
be a possible issue with the learning process?
A. Regularization is too low and model is overfitting
B. Regularization is too high and model is underfitting
C. Step size is too large
D. Step size is too small
ANSWER: C
23. Given a large dataset of medical records from patients suffering from heart disease,
try to learn whether there might be different clusters of such patients for which we might
tailor separate treatments. What kind of learning problem is this?
A. Supervised learning
B. Unsupervised learning
C. Both A and B
D. None of these
ANSWER: B
26. Classifying email as spam, labeling webpages based on their content, and voice
recognition are examples of _____.
A. Supervised learning
B. Unsupervised learning
C. Machine learning
D. Deep learning
ANSWER: A
27. Deep learning is a subfield of machine learning whose algorithms are
inspired by the structure and function of the brain, called _____.
A. Machine learning
B. Artificial neural networks
C. Deep learning
D. Robotics
ANSWER: B
29. When the number of output classes is greater than one, there are two main possibilities
to manage a classification problem:
A. One-vs-all, One-vs-one
B. One-vs-one, Many-vs-one
C. One-vs-many, Many-vs-one
D. None of these
ANSWER: A
30. For a neural network, which one of these structural assumptions is the one that
most affects the trade-off between underfitting (i.e. a high bias model) and overfitting
(i.e. a high variance model):
A. The learning rate
B. The number of hidden nodes
C. The initial choice of weights
D. The use of a constant-term unit input
ANSWER: B
31. ___________ refers to a model that can neither model the training data nor
generalize to new data.
A. Good fitting
B. Overfitting
C. Underfitting
D. All of these
ANSWER: C
32. Given two Boolean random variables, A and B, where P(A) = 1/2, P(B) = 1/3, and P(A
| ¬B) = 1/4, what is P(A | B)?
A. 1/6
B. 1/4
C. 3/4
D. 1
ANSWER: D
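The answer follows from the law of total probability, P(A) = P(A|B) P(B) + P(A|¬B) P(¬B), solved for P(A|B). A minimal check with exact fractions (variable names are chosen here for illustration):

```python
from fractions import Fraction

p_a = Fraction(1, 2)
p_b = Fraction(1, 3)
p_a_given_not_b = Fraction(1, 4)

# Law of total probability: P(A) = P(A|B) P(B) + P(A|not B) P(not B)
# Rearranged for P(A|B):
p_a_given_b = (p_a - p_a_given_not_b * (1 - p_b)) / p_b
print(p_a_given_b)  # 1
```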
33. Suppose your model is overfitting. Which of the following is NOT a valid way to
try and reduce the overfitting?
A. Increase the amount of training data
B. Improve the optimization algorithm being used for error minimization
C. Decrease the model complexity
D. Reduce the noise in the training data
ANSWER: B
34. Predicting whether or not it will rain tomorrow evening at a particular time
is a type of _________ problem.
A. Classification
B. Regression
C. Unsupervised learning
D. All of these
ANSWER: A
35. Given a large dataset of medical records from patients suffering from heart disease,
try to learn whether there might be different clusters of such patients for which we might
tailor separate treatments. What kind of learning problem is this?
A. Supervised learning
B. Unsupervised learning
C. Both A and B
D. Neither A nor B
ANSWER: B
46. Which of the following is a wrong statement about the maximum likelihood approach?
A. This method doesn’t always involve probability calculations
B. It finds a tree that best accounts for the variation in a set of sequences
C. The method is similar to the maximum parsimony method
D. The analysis is performed on each column of a multiple sequence alignment
ANSWER: A
47. The main disadvantage of maximum likelihood methods is that they are _____
A. Mathematically less folded
B. Mathematically less complex
C. Computationally lucid
D. Computationally intense
ANSWER: D
A) TRUE
B) FALSE
Solution: (A)
Yes, linear regression is a supervised learning algorithm because it uses true labels for
training. A supervised learning algorithm should have an input variable (x) and an output
variable (Y) for each example.
A) TRUE
B) FALSE
Solution: (A)
A) TRUE
B) FALSE
Solution: (A)
4) Which of the following methods do we use to find the best fit line for data in Linear
Regression?
C2 General
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the
line of best fit.
5) Which of the following evaluation metrics can be used to evaluate a model while
modeling a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared error
metric to evaluate model performance. The remaining options are used in the case
of a classification problem.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute penalty, which makes some of the
coefficients zero.
A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.
8) Suppose that we have N independent variables (X1, X2, …, Xn) and the dependent variable is
Y. Now imagine that you are applying linear regression by fitting the best-fit line using least
square error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1
and Y.
9) Given the two characteristics below, which of the following options is correct
for the Pearson correlation between V1 and V2?
You are given two variables V1 and V2 that have the following two
characteristics.
Solution: (D)
10) Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right
to conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between two variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
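A small sketch makes the y=|x| and y=x^2 examples concrete. The `pearson` helper and the sample points are illustrative, and the result relies on the x values being symmetric around zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]  # symmetric around zero (illustrative data)
print(pearson(xs, [x ** 2 for x in xs]))  # y = x^2: zero up to rounding
print(pearson(xs, [abs(x) for x in xs]))  # y = |x|: zero up to rounding
```

In both cases y is fully determined by x, yet the positive and negative halves cancel in the covariance, so the linear correlation vanishes.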
11) Which of the following offsets, do we use in linear regression’s least square line
fit? Suppose horizontal axis is independent variable and vertical axis is dependent
variable.
A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of above
Solution: (A)
12) True-False: Overfitting is more likely when you have a huge amount of data to
train on?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly
i.e. overfitting.
13) We can also compute the coefficient of linear regression with the help of an
analytical method called “Normal Equation”. Which of the following is/are true about
Normal Equation?
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, the Normal Equation can also be used to find the coefficients.
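For reference, the Normal Equation is commonly written as below, where X is the design matrix (with a column of ones for the intercept), y is the vector of targets, and θ is the coefficient vector. This notation is an assumption, since the question's own statements are not reproduced here. The closed form requires X^T X to be invertible.

```latex
\hat{\theta} = (X^{\top} X)^{-1} X^{\top} y
```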
14) Which of the following statements is true about the sum of residuals of A and B?
The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I
want to find the sum of residuals in both cases A and B.
Note:
A) A has higher sum of residuals than B
B) A has lower sum of residuals than B
C) Both have same sum of residuals
D) None of these
Solution: (C)
For a least-squares fit with an intercept, the sum of residuals is always zero, so both have
the same sum of residuals.
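This property can be checked numerically. The sketch below fits a one-feature least-squares line with an intercept and sums the residuals. The data points and the `fit_line` helper are illustrative, not taken from the graphs in the question.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a0 + a1 * x (single feature, with intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    a0 = mean_y - a1 * mean_x
    return a0, a1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data, not from the question
ys = [2.1, 3.9, 6.2, 7.8, 10.5]
a0, a1 = fit_line(xs, ys)
residuals = [y - (a0 + a1 * x) for x, y in zip(xs, ys)]
print(sum(residuals))  # zero up to floating-point rounding
```

The result depends on the intercept term: a line forced through the origin does not in general have residuals that sum to zero.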
Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with penalty x.
Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
16) What will happen when you apply very large penalty?
Solution: (B)
In lasso some of the coefficient values become exactly zero, but in the case of Ridge the
coefficients become close to zero but not exactly zero.
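The difference is easiest to see in the simplified one-coefficient (orthonormal design) case, where both penalties have closed-form solutions. The sketch below is made under that assumption and is not a general solver. The function names and the starting coefficients are hypothetical.

```python
def ridge_shrink(beta_ols, lam):
    """Minimiser of 0.5*(b - beta_ols)**2 + 0.5*lam*b**2."""
    return beta_ols / (1.0 + lam)

def lasso_shrink(beta_ols, lam):
    """Minimiser of 0.5*(b - beta_ols)**2 + lam*abs(b) (soft-thresholding)."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    return magnitude if beta_ols >= 0 else -magnitude

ols_coefficients = [3.0, -0.8, 0.2]   # hypothetical unpenalised estimates
for lam in [0.5, 1.0, 10.0]:
    ridge = [ridge_shrink(b, lam) for b in ols_coefficients]
    lasso = [lasso_shrink(b, lam) for b in ols_coefficients]
    print(lam, ridge, lasso)
```

Ridge rescales every coefficient by the same factor, so none reaches zero for a finite penalty, while lasso subtracts a fixed amount and clips at zero.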
17) What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will
become zero.
18) Which of the following statement is true about outliers in Linear regression?
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
19) Suppose you plotted a scatter plot between the residuals and predicted values in
linear regression and you found that there is a relationship between them. Which of
the following conclusion do you make about this situation?
Solution: (A)
There should not be any relationship between predicted values and residuals. If there exists
any relationship between them, it means that the model has not perfectly captured the
information in the data.
Suppose that you have a dataset D1 and you design a linear regression model of degree 3
polynomial and you found that the training and testing error is “0”, or in other words it
perfectly fits the data.
20) What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (A)
Since a degree 4 polynomial is more complex than the degree 3 model (and so more prone to
overfitting), it will again perfectly fit the training data. In such a case the training error
will be zero but the test error may not be zero.
21) What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that degree 2 polynomial will over fit the data
B) There are high chances that degree 2 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (B)
If a degree 3 polynomial fits the data perfectly, it’s highly likely that a simpler model
(degree 2 polynomial) might under fit the data.
22) In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
Solution: (C)
Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will be
high and variance will be low.
Which of the following is true about below graphs(A,B, C left to right) between the cost
function and Number of iterations?
23) Suppose l1, l2 and l3 are the three learning rates for A,B,C respectively. Which of
the following is true about l1,l2 and l3?
A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these
Solution: (A)
In case of a high learning rate, the step will be large; the objective function will decrease
quickly initially, but it will not find the global minimum, and the objective function starts
increasing after a few iterations.
In case of a low learning rate, the step will be small, so the objective function will decrease
slowly.
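The effect of the learning rate can be sketched on a toy problem: gradient descent on f(w) = w^2, whose gradient is 2w. The function name, starting point, and the three rates are illustrative choices. On this simple quadratic the too-high rate diverges from the first step rather than after a few iterations.

```python
def gradient_descent_costs(learning_rate, steps=20, w0=5.0):
    """Cost f(w) = w**2 after each gradient step (gradient is 2*w)."""
    w, costs = w0, []
    for _ in range(steps):
        w -= learning_rate * 2 * w
        costs.append(w ** 2)
    return costs

for lr in [0.01, 0.4, 1.1]:   # illustrative low, moderate, and too-high rates
    costs = gradient_descent_costs(lr)
    print(f"lr={lr}: cost after 1 step={costs[0]:.3f}, after 20 steps={costs[-1]:.3e}")
```

The low rate is still far from the minimum after 20 steps, the moderate rate converges quickly, and the too-high rate makes the cost grow at every step.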
Question Context 24-25:
We have been given a dataset with n records in which we have input attribute as x and
output attribute as y. Suppose we use a linear regression method to model this data. To test
our linear regressor, we split the data in training set and test set randomly.
24) Now we increase the training set size gradually. As the training set size increases,
what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the
model. If the values used to train contain more outliers gradually, then the error might just
increase.
25) What do you expect will happen with bias and variance as you increase the size
of training data?
Solution: (D)
As we increase the size of the training data, the bias would increase while the variance
would decrease.
Consider the following data where one input(X) and one output(Y) is given.
26) What would be the root mean square training error for this data if you run a
Linear Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
Suppose you have been given the following scenario for training and validation error for
Linear Regression.
Scenario  Learning Rate  Number of iterations  Training Error  Validation Error
1         0.1            1000                  100             110
2         0.2            600                   90              105
3         0.3            400                   110             110
4         0.4            300                   120             130
5         0.4            250                   130             150
27) Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation
error.
28) Suppose you got the tuned hyper parameters from the previous question. Now,
imagine you want to add a variable in variable space such that this added feature is
important. Which of the following would you observe in such a case?
Solution: (D)
If the added feature is important, the training and validation error would decrease.
Suppose, you got a situation where you find that your linear regression model is under
fitting the data.
29) In such situation which of the following options would you consider?
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
Solution: (A)
In case of under fitting, you need to introduce more variables in variable space or you can add
some polynomial degree variables to make the model more complex so that it is able to fit the
data better.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
A) TRUE
B) FALSE
Solution: A
True, logistic regression is a supervised learning algorithm because it uses true labels for
training. A supervised learning algorithm should have input variables (x) and a target
variable (Y) when you train the model.
A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don’t be confused by the name regression.
3) True-False: Is it possible to design a logistic regression algorithm using a Neural
Network Algorithm?
A) TRUE
B) FALSE
Solution: A
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the
one-vs-all method for 3-class classification in logistic regression.
5) Which of the following methods do we use to best fit the data in Logistic
Regression?
Solution: B
Logistic regression is trained using maximum likelihood estimation.
6) Which of the following evaluation metrics can not be applied in case of logistic
regression output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since logistic regression is a classification algorithm, its output cannot be a continuous
real value, so mean squared error cannot be used for evaluating it.
7) One of the very good methods to analyze the performance of Logistic Regression
is AIC, which is similar to R-Squared in Linear Regression. Which of the following is
true about AIC?
Solution: A
We select the logistic regression model with the lowest AIC. For more information,
refer to this source: http://www4.ncsu.edu/~shu3/Presentation/AIC.pdf
A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing
features is to help convergence of the technique used for optimization.
A) LASSO
B) Ridge
C) Both
D) None of these
Solution: A
Context: 10-11
Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.
In the above equation, P(y = 1 | x; w) is viewed as a function of x that we can obtain by
changing the parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For real values of x ranging from −∞ to +∞, the logistic function gives an output in the
interval (0, 1).
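The range claimed above can be checked numerically. This is a minimal sketch; the helper name `logistic` and the sample inputs are ours, chosen only for illustration. The inputs stay moderate because floating-point arithmetic rounds the result to exactly 0 or 1 for extreme values, even though the mathematical function never reaches them.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Evaluate g on inputs from strongly negative to strongly positive.
inputs = [-30.0, -5.0, 0.0, 5.0, 30.0]
outputs = [logistic(z) for z in inputs]

for z, p in zip(inputs, outputs):
    print(f"g({z:+.1f}) = {p:.6f}")
```

Every printed value lies strictly between 0 and 1, and g(0) is exactly 0.5.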
11) In the above question, which function keeps p in (0, 1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
Context: 12-13
Suppose you train a logistic regression classifier and your hypothesis function H is
12) Which of the following figure will represent the decision boundary as given by
above classifier?
A)
B)
C)
D)
Solution: B
Option B is the right answer. The boundary is given by y = g(6 − x2), which
is the line shown in both option A and option B. Option B is correct because when you
put x2 = 6 into the equation you get y = g(0) = 0.5, so y = 0.5 lies on
the line; if you increase x2 beyond 6, the argument of g becomes negative, so that side
is the region where the output is y = 0.
13) If you replace coefficient of x1 with x2 what would be the output figure?
A)
B)
C)
D)
Solution: D
14) Suppose you have been given a fair coin and you want to find out the odds of
getting heads. Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For
a fair coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds
are 1.
15) The logit function (given as l(x)) is the log of the odds function. What could be the
range of the logit function on the domain x = [0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability
function, which has values from 0 to 1, into an equivalent function with values between 0
and ∞. When we take the natural log of the odds function, we get a range of values from -∞
to ∞.
16) Which of the following options is true?
A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally
distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Solution: A
Only A is true.
17) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
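The odds, logit, and inverse-logit relationships used in questions 14, 15, and 17 can be checked with a few lines of code. This is a minimal sketch; the function names `odds`, `logit`, and `logistic` are ours, and the probabilities are arbitrary sample points.

```python
import math

def odds(p):
    """Odds of success: p / (1 - p), defined for 0 <= p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds: log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(odds(p))

def logistic(z):
    """Logistic function, which is the inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-z))

fair_coin_odds = odds(0.5)

for p in [0.001, 0.25, 0.5, 0.75, 0.999]:
    z = logit(p)
    print(f"p={p}: odds={odds(p):.4f}, logit={z:+.4f}, logistic(logit)={logistic(z):.4f}")
```

The odds stay non-negative, the logit takes both negative and positive values, and applying `logistic` to the logit recovers the original probability.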
MCQ For UNIT 2
Solution: (B)
Ordinal variables are variables whose categories have an order. For example,
grade A should be considered a higher grade than grade B.
A) PCA
B) K-Means
Solution: (A)
A deterministic algorithm is one whose output does not change across runs. PCA
gives the same result if we run it again, but k-means does not, because of its random initialization.
3) [True or False] The Pearson correlation between two variables can be zero even though their
values are still related to each other.
A) TRUE
B) FALSE
Solution: (A)
Take Y = X^2 with X distributed symmetrically about zero. The two variables are not only associated, one is
a function of the other, yet the Pearson correlation between them is 0.
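A small numerical check of this example, as a sketch only: the helper `pearson` and the sample values are ours, and the x values are deliberately symmetric about zero, which is what makes the correlation vanish.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]   # symmetric about zero
ys = [x ** 2 for x in xs]       # y is fully determined by x

r = pearson(xs, ys)
print(f"Pearson r between x and x^2: {r:.6f}")
```

Pearson correlation measures only linear association, so a perfect quadratic dependence can still give a coefficient of zero.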
4) Which of the following statement(s) is/are true for Gradient Descent (GD) and
Stochastic Gradient Descent (SGD)?
2. In SGD, you have to run through all the samples in your training set for a
single update of a parameter in each iteration.
3. In GD, you either use the entire data or a subset of training data to update a
parameter in each iteration.
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (A)
In SGD, each iteration uses a batch that generally contains a random sample of the data,
whereas in GD each iteration uses all of the training observations.
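The difference between the two update rules can be sketched on a toy problem. Everything here is an illustrative assumption: the one-parameter model y_hat = w * x, the noise-free data generated from y = 2x, the learning rate, and the choice of a single random sample per SGD step (a mini-batch would work the same way).

```python
import random

# Toy data generated from y = 2x (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def gradient(w, batch):
    """Gradient of mean squared error for the model y_hat = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def gd_step(w, lr):
    # Batch gradient descent: every update uses the whole training set.
    return w - lr * gradient(w, list(zip(xs, ys)))

def sgd_step(w, lr, rng):
    # Stochastic gradient descent: every update uses one randomly chosen sample.
    i = rng.randrange(len(xs))
    return w - lr * gradient(w, [(xs[i], ys[i])])

rng = random.Random(0)
w_gd, w_sgd = 0.0, 0.0
for _ in range(200):
    w_gd = gd_step(w_gd, lr=0.01)
    w_sgd = sgd_step(w_sgd, lr=0.01, rng=rng)

print(f"GD estimate:  {w_gd:.6f}")
print(f"SGD estimate: {w_sgd:.6f}")
```

Both estimates approach the true slope of 2; the only difference is how much data each update looks at.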
5) Which of the following hyperparameter(s), when increased, may cause a random
forest to overfit the data?
1. Number of Trees
2. Depth of Tree
3. Learning Rate
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (B)
Usually, increasing the depth of the trees will cause overfitting. Learning rate is not a
hyperparameter of random forest. Increasing the number of trees does not cause overfitting; it tends to reduce variance.
6) Imagine, you are working with “Analytics Vidhya” and you want to develop a
machine learning algorithm which predicts the number of views on the articles.
Your analysis is based on features like author name, number of articles written by the
same author on Analytics Vidhya in past and a few other features. Which of the
following evaluation metric would you choose in that case?
2. Accuracy
3. F1 Score
A) Only 1
B) Only 2
C) Only 3
D) 1 and 3
E) 2 and 3
F) 1 and 2
Solution:(A)
The number of views of an article is a continuous target variable, so this is
a regression problem. Mean squared error is therefore used as the evaluation
metric.
7) Given below are three images (1,2,3). Which of the following option is correct for
these images?
A)
B)
C)
A) 1 is tanh, 2 is ReLU and 3 is SIGMOID activation functions.
Solution: (D)
8) Below are the 8 actual values of target variable in the train file.
[0,0,0,1,1,1,1,1]
Solution: (A)
So the answer is A.
9) Let’s say you are working with categorical feature(s) and you have not looked at
the distribution of the categorical variable in the test data.
You want to apply one hot encoding (OHE) on the categorical feature(s). What
challenges may you face if you have applied OHE on a categorical variable of the train
dataset?
A) All categories of categorical variable are not present in the test dataset.
D) Both A and B
E) None of these
Solution: (D)
Both are true. OHE will fail to encode categories that are present in test but not in
train, so this is one of the main challenges when applying OHE. The challenge given in
option B is also true: you need to be more careful when applying OHE if the frequency distribution
is not the same in train and test.
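The unseen-category problem can be illustrated with a hand-written encoder. This is a sketch under stated assumptions: `fit_one_hot` and `transform_one_hot` are hypothetical helpers written for this example, not a library API, and mapping an unseen category to an all-zero row is just one possible policy (raising an error is another).

```python
def fit_one_hot(values):
    """Learn the category-to-column mapping from the training values only."""
    categories = sorted(set(values))
    return {cat: idx for idx, cat in enumerate(categories)}

def transform_one_hot(values, mapping):
    """Encode values; a category unseen during fit becomes an all-zero row."""
    rows = []
    for value in values:
        row = [0] * len(mapping)
        if value in mapping:
            row[mapping[value]] = 1
        rows.append(row)
    return rows

train = ["red", "green", "red", "blue"]
test = ["green", "purple"]  # "purple" never appears in train

mapping = fit_one_hot(train)
encoded_test = transform_one_hot(test, mapping)
unseen = sorted(set(test) - set(mapping))

print("columns:", mapping)
print("encoded test rows:", encoded_test)
print("categories missing from train:", unseen)
```

The row for "purple" carries no information, which is why the train and test category distributions should be compared before encoding.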
10) The skip-gram model is one of the best models used in the Word2vec algorithm for word
embeddings. Which one of the following models depicts the skip-gram model?
A) A
B) B
C) Both A and B
D) None of these
Solution: (B)
Both models (model1 and model2) are used in the Word2vec algorithm. Model1 represents
a CBOW model, whereas Model2 represents the skip-gram model.
11) Let’s say, you are using activation function X in hidden layers of neural network.
At a particular neuron for any given input, you get the output as “-0.0001”. Which of
the following activation function could X represent?
A) ReLU
B) tanh
C) SIGMOID
D) None of these
Solution: (B)
The function is tanh because its output range is (-1, 1). ReLU and sigmoid never produce negative outputs.
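The elimination argument can be written out directly by comparing the observed output against each activation's output range. This is a minimal sketch; the `ranges` table and variable names are ours, and open versus closed interval ends are ignored because they do not matter for a strictly negative value.

```python
import math

observed = -0.0001

# Output ranges of the three candidate activations.
ranges = {
    "ReLU": (0.0, math.inf),
    "sigmoid": (0.0, 1.0),
    "tanh": (-1.0, 1.0),
}
candidates = [name for name, (low, high) in ranges.items() if low < observed < high]

# The pre-activation input that tanh would need to produce the observed output.
z = math.atanh(observed)

print("activations that can output", observed, "->", candidates)
print("tanh input that yields it:", z)
```

Only tanh remains as a candidate, matching option B.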
12) [True or False] LogLoss evaluation metric can have negative values.
A) TRUE
B) FALSE
Solution: (B)
13) Which of the following statements is/are true about “Type-1” and “Type-2” errors?
3. Type1 error occurs when we reject a null hypothesis when it is actually true.
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 1 and 3
F) 2 and 3
Solution: (E)
In statistical hypothesis testing, a type I error is the incorrect rejection of a true null
hypothesis (a “false positive”), while a type II error is incorrectly retaining a false null
hypothesis (a “false negative”).
14) Which of the following is/are one of the important step(s) to pre-process the text
in NLP based projects?
1. Stemming
3. Object Standardization
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Stemming is a rudimentary rule-based process of stripping the suffixes (“ing”, “ly”, “es”, “s”,
etc.) from a word.
Stop words are words that are not relevant to the context of the data, for
example is/am/are.
Object standardization is also a good way to pre-process the text.
15) Suppose you want to project high dimensional data into lower dimensions. The
two most famous dimensionality reduction algorithms used here are PCA and t-SNE.
Let’s say you have applied both algorithms respectively on data “X” and you got the
datasets “X_projected_PCA” , “X_projected_tSNE”.
The t-SNE algorithm considers nearest-neighbour points when reducing the dimensionality of the data.
So, after using t-SNE, the reduced dimensions can also be interpreted in
the nearest-neighbour space. With PCA this is not the case.
Context: 16-17
Given below are three scatter plots for two features (Image 1, 2 & 3 from left to right).
16) In the above images, which of the following is/are example of multi-collinear
features?
A) Features in Image 1
B) Features in Image 2
C) Features in Image 3
Solution: (D)
In Image 1, the features have a high positive correlation, whereas in Image 2 they have a high negative
correlation, so in both images the pair of features is an example of
multicollinear features.
17) In previous question, suppose you have identified multi-collinear features. Which
of the following action(s) would you perform next?
A) Only 1
B) Only 2
C) Only 3
D) Either 1 or 3
E) Either 2 or 3
Solution: (E)
You cannot remove both features, because after removing both you will
lose all of the information. You should either remove only one feature or use a
regularization algorithm such as L1 or L2.
18) Adding a non-important feature to a linear regression model may result in.
1. Increase in R-square
2. Decrease in R-square
A) Only 1 is correct
B) Only 2 is correct
C) Either 1 or 2
D) None of these
Solution: (A)
After adding a feature to the feature space, whether that feature is important or unimportant,
the R-squared never decreases (and in practice it almost always increases).
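This behaviour of R-squared can be reproduced with a small ordinary least squares fit. The sketch below uses only the standard library; the helpers `solve` and `r_squared`, the synthetic data, and the random seed are all assumptions made for illustration. The second model adds a feature that is pure noise.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def r_squared(features, y):
    """Fit least squares with an intercept and return R^2 on the training data."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    preds = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(50)]
noise_feature = [rng.gauss(0, 1) for _ in range(50)]  # unrelated to y
y = [3.0 * xi + rng.gauss(0, 2) for xi in x]

r2_base = r_squared([[xi] for xi in x], y)
r2_extra = r_squared([[xi, zi] for xi, zi in zip(x, noise_feature)], y)
print(f"R^2 with x only:        {r2_base:.6f}")
print(f"R^2 with x plus noise:  {r2_extra:.6f}")
```

The second value is never lower than the first, which is why adjusted R-squared or a validation set is usually preferred for judging whether a new feature helps.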
19) Suppose you are given three variables X, Y and Z. The Pearson correlation
coefficients for (X, Y), (Y, Z) and (X, Z) are C1, C2 & C3 respectively.
Now, you have added 2 to all values of X (i.e. new values become X+2), subtracted 2
from all values of Y (i.e. new values are Y-2), and Z remains the same. The new
coefficients for (X,Y), (Y,Z) and (X,Z) are given by D1, D2 & D3 respectively. How do
the values of D1, D2 & D3 relate to C1, C2 & C3?
E) D1 = C1, D2 = C2, D3 = C3
F) Cannot be determined
Solution: (E)
Correlation between the features won’t change if you add or subtract a value in the
features.
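The shift invariance can be confirmed on arbitrary numbers. This is a sketch only: the helper `pearson` and the three sample lists are made up for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 12.0]
z = [9.0, 3.0, 4.0, 1.0, 0.5]

# C1, C2, C3 before the shift.
c = (pearson(x, y), pearson(y, z), pearson(x, z))

# Add 2 to X, subtract 2 from Y, leave Z unchanged.
x_new = [v + 2 for v in x]
y_new = [v - 2 for v in y]

# D1, D2, D3 after the shift.
d = (pearson(x_new, y_new), pearson(y_new, z), pearson(x_new, z))

print("before:", [round(v, 6) for v in c])
print("after: ", [round(v, 6) for v in d])
```

Adding a constant moves the mean by the same amount, so every deviation from the mean, and therefore the coefficient, is unchanged.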
20) Imagine you are solving a classification problem with highly imbalanced classes.
The majority class is observed 99% of the time in the training data.
Your model has 99% accuracy on the test data. Which of the
following is true in such a case?
3. Precision and recall metrics are good for imbalanced class problems.
4. Precision and recall metrics aren’t good for imbalanced class problems.
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: (A)
21) In ensemble learning, you aggregate the predictions for weak learners, so that an
ensemble of these models will give a better prediction than prediction of individual
models.
Which of the following statements is / are true for weak learners used in ensemble
model?
2. They have high bias, so they cannot solve complex learning problems
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) Only 1
E) Only 2
Solution: (A)
Weak learners are sure about particular part of a problem. So, they usually don’t overfit
which means that weak learners have low variance and high bias.
22) Which of the following options is/are true for K-fold cross-validation?
1. Increase in K will result in higher time required to cross validate the result.
3. If K=N, then it is called Leave one out cross validation, where N is the number
of observations.
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3
Solution: (D)
Larger k value means less bias towards overestimating the true expected error (as training
folds will be closer to the total dataset) and higher running time (as you are getting closer to
the limit case: Leave-One-Out CV). We also need to consider the variance between the k
folds accuracy while selecting the k.
Time taken by an algorithm for training (on a model with max_depth 2) 4-fold is 10
seconds and for the prediction on remaining 1-fold is 2 seconds.
23) Which of the following option is true for overall execution time for 5-fold cross
validation with 10 different values of “max_depth”?
D) Can’t estimate
Solution: (D)
Each iteration for depth “2” in 5-fold cross validation will take 10 seconds for training and 2
seconds for testing. So 5 folds will take 12*5 = 60 seconds. Since we are searching over
10 depth values, the algorithm would take 60*10 = 600 seconds. But training and testing
a model at a depth greater than 2 will take more time than at depth “2”, so the overall time would
be greater than 600 seconds.
24) In previous question, if you train the same algorithm for tuning 2 hyper
parameters say “max_depth” and “learning_rate”.
You want to select the right value against “max_depth” (from given 10 depth values)
and learning rate (from given 5 different learning rates). In such cases, which of the
following will represent the overall time?
A) 1000-1500 seconds
B) 1500-3000 seconds
D) None of these
Solution: (D)
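The timing arithmetic behind questions 23 and 24 can be laid out step by step. This is a sketch that only computes lower bounds: it assumes every depth value costs at least as much as depth 2 and that each learning rate repeats the full depth search; the variable names are ours.

```python
train_seconds_depth2 = 10    # training on 4 folds at max_depth = 2 (from the question)
predict_seconds_depth2 = 2   # predicting on the held-out fold (from the question)
n_folds = 5
n_depth_values = 10
n_learning_rates = 5

per_fold = train_seconds_depth2 + predict_seconds_depth2
per_depth = per_fold * n_folds                      # one full 5-fold pass at depth 2
lower_bound_depth_search = per_depth * n_depth_values
lower_bound_grid_search = lower_bound_depth_search * n_learning_rates

print("one 5-fold pass at depth 2:", per_depth, "seconds")
print("lower bound, 10 depths:", lower_bound_depth_search, "seconds")
print("lower bound, 10 depths x 5 learning rates:", lower_bound_grid_search, "seconds")
```

Under these assumptions the grid search needs at least 3000 seconds, so neither listed range in question 24 fits.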
25) Given below is a scenario for training error TE and Validation error VE for a
machine learning algorithm M1. You want to choose a hyperparameter (H) based on
TE and VE.
H   TE    VE
1   105   90
2   200   85
3   250   96
4   105   85
5   300   100
Which value of H will you choose based on the above table?
A) 1
B) 2
C) 3
D) 4
E) 5
Solution: (D)
26) What would you do in PCA to get the same projection as SVD?
C) Not possible
D) None of these
Solution: (A)
When the data has a zero mean vector PCA will have same projections as SVD, otherwise
you have to centre the data first before taking SVD.
Question Context 27-28
Assume there is a black box algorithm, which takes training data with multiple
observations (t1, t2, t3,…….. tn) and a new observation (q1). The black box outputs
the nearest neighbor of q1 (say ti) and its corresponding class label ci.
You can also think that this black box algorithm is same as 1-NN (1-nearest
neighbor).
27) It is possible to construct a k-NN classification algorithm based on this black box
alone.
A) TRUE
B) FALSE
Solution: (A)
In the first step, you pass an observation (q1) to the black box algorithm, which
returns the nearest observation and its class.
In the second step, you throw that nearest observation out of the train data and again input the
observation (q1). The black box algorithm will again return the nearest observation and its
class. Repeating this k times yields the k nearest neighbours.
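The procedure described above can be sketched in code. The black box here is a stand-in written for this example (`black_box_1nn` is hypothetical and uses squared Euclidean distance), and the final majority vote and the toy data are our assumptions; the sketch also assumes k is no larger than the number of training points.

```python
from collections import Counter

def black_box_1nn(train, query):
    """Hypothetical black box: returns the (point, label) pair nearest to query."""
    return min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))

def knn_from_black_box(train, query, k):
    """Build k-NN by calling the 1-NN black box k times, removing each neighbour found."""
    remaining = list(train)
    labels = []
    for _ in range(k):
        neighbor = black_box_1nn(remaining, query)
        labels.append(neighbor[1])
        remaining.remove(neighbor)  # throw the found neighbour out before the next call
    # Majority vote over the k collected labels.
    return Counter(labels).most_common(1)[0][0]

train = [((0.0, 0.0), "a"), ((0.1, 0.0), "b"), ((0.2, 0.1), "b"), ((5.0, 5.0), "a")]
query = (0.0, 0.05)

pred_1 = knn_from_black_box(train, query, k=1)
pred_3 = knn_from_black_box(train, query, k=3)
print("1-NN prediction:", pred_1)
print("3-NN prediction:", pred_3)
```

With k=1 the single nearest point decides the label, while with k=3 the two "b" neighbours outvote it.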
28) Instead of using a 1-NN black box, we want to use a j-NN (j>1) algorithm as the black
box. Which of the following options is correct for finding k-NN using j-NN?
2. J > k
3. Not possible
A) 1
B) 2
C) 3
Solution: (A)
29) Suppose you are given 7 Scatter plots 1-7 (left to right) and you want to compare
Pearson correlation coefficients between variables of each scatterplot.
1. 1<2<3<4
2. 1>2>3 > 4
3. 7<6<5<4
4. 7>6>5>4
A) 1 and 3
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: (B)
From image 1 to image 4 the correlation is decreasing (in absolute value). From image 4 to image 7
the correlation is increasing in absolute value, but the values are negative (for example, 0, -0.3, -0.7, -0.99).
30) You can evaluate the performance of a binary class classification problem using
different metrics such as accuracy, log-loss, F-Score. Let’s say, you are using the
log-loss function as evaluation metric.
2. For a particular observation, the classifier assigns a very small probability for the
correct class then the corresponding contribution to the log-loss will be very large.
A) 1 and 3
B) 2 and 3
C) 1 and 2
D) 1,2 and 3
Solution: (D)
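The second statement, that a very small probability for the correct class gives a very large log-loss contribution, can be seen from the per-observation formula. This is a minimal sketch; the function name `log_loss_term` and the sample probabilities are ours.

```python
import math

def log_loss_term(y_true, p_pred):
    """Per-observation binary log-loss: -[y*log(p) + (1-y)*log(1-p)], with 0 < p < 1."""
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))

# The true class is 1 in all three cases; only the predicted probability changes.
confident_right = log_loss_term(1, 0.99)
unsure = log_loss_term(1, 0.5)
confident_wrong = log_loss_term(1, 0.001)

print(f"p=0.99  -> loss {confident_right:.4f}")
print(f"p=0.5   -> loss {unsure:.4f}")
print(f"p=0.001 -> loss {confident_wrong:.4f}")
```

Each term is the negative log of a probability, so it is never negative, and it grows without bound as the probability assigned to the correct class approaches zero.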
Question 31-32
Note: Visual distance between the points in the image represents the actual distance.
D) 0.4
C) 0.8
D) 1
Solution: (C)
In leave-one-out cross validation, we select (n-1) observations for training and 1
observation for validation. Consider each point as a cross validation point and then find the 3
nearest points to it. If you repeat this procedure for all points, you will get the
correct classification for all positive-class points in the above figure, but the negative class will be
misclassified. Hence you will get 80% accuracy.
32) Which of the following value of K will have least leave-one-out cross validation
accuracy?
A) 1NN
B) 3NN
C) 4NN
Solution: (A)
Each point will always be misclassified by 1-NN, which means that you will get 0%
accuracy.
33) Suppose you are given the below data and you want to apply a logistic regression
model for classifying it in two given classes.
You are using logistic regression with L1 regularization.
Which of the following option is correct when you increase the value of C from zero to a
very large value?
Solution: (B)
Looking at the image, we see that even using only x2 we can efficiently perform
classification. So at first w1 will become 0. As the regularization parameter increases further, w2
will come closer and closer to 0.
34) Suppose we have a dataset which can be trained with 100% accuracy with help of
a decision tree of depth 6. Now consider the points below and choose the option
based on these points.
Note: All other hyper parameters are same and other factors are not affected.
A) Only 1
B) Only 2
C) Both 1 and 2
Solution: (A)
If you fit a decision tree of depth 4 to such data, it is more likely to underfit the data.
In case of underfitting, you will have high bias and low variance.
35) Which of the following options can be used to get global minima in k-Means
Algorithm?
A) 2 and 3
B) 1 and 3
C) 1 and 2
D) All of above
Solution: (D)
36) Imagine you are working on a project which is a binary classification problem.
You trained a model on training dataset and get the below confusion matrix on
validation dataset.
Based on the above confusion matrix, choose which option(s) below will give you
correct predictions?
1. Accuracy is ~0.91
A) 1 and 3
B) 2 and 4
C) 1 and 4
D) 2 and 3
Solution: (C)
The true positive rate is how often you predict the positive class correctly, so the true
positive rate would be 100/105 = 0.95, also known as “sensitivity” or “recall”.
37) For which of the following hyperparameters, higher value is better for decision
tree algorithm?
2. Depth of tree
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
E) Can’t say
Solution: (E)
For all three options A, B and C, increasing the value of the
parameter does not necessarily improve performance. For example, if we have a very high value of
depth of tree, the resulting tree may overfit the data and would not generalize well. On the
other hand, if we have a very low value, the tree may underfit the data. So, we can’t say for
sure that “higher is better”.
Context 38-39
Imagine, you have a 28 * 28 image and you run a 3 * 3 convolution neural network on
it with the input depth of 3 and output depth of 8.
Solution: (A)
39) What are the dimensions of the output feature map when you are using the following
parameters?
Solution: (B)
Same as above
40) Suppose, we were plotting the visualization for different values of C (Penalty
parameter) in SVM algorithm. Due to some reason, we forgot to tag the C values with
visualizations. In that case, which of the following option best explains the C values
for the images below (1,2,3 left to right, so C values are C1 for image1, C2 for image2
and C3 for image3 ) in case of rbf kernel.
A) C1 = C2 = C3
B) C1 > C2 > C3
C) C1 < C2 < C3
D) None of these
Solution: (C)
MCQ questions for unit 4: Naïve Bayes and Support Vector Machine
Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure
of how accurately a model can predict values for previously unseen data.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what
sizes of datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
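Because training time grows at least quadratically with the number of samples, large datasets are a poor fit.
Datasets which have a clear classification boundary will function best with SVMs.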
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned
above in such a way that it maximises your efficiency, reduces error and overfitting.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
15. Suppose you are using RBF kernel in SVM with high Gamma value. What does this
signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away
from the hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a
low cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more
points correctly. It is also simply referred to as the cost of misclassification.
17. Which of the following are real world applications of the SVM?
Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems,
ranging from regression to clustering and handwriting recognition.
18. We usually use feature normalization before using the Gaussian kernel in SVM. What is
true about feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Statements one and two are correct.
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
1. Which of the following is a widely used and effective machine learning algorithm based on
the idea of bagging?
a. Decision Tree
b. Regression
c. Classification
d. Random Forest - answer
2. To find the minimum or the maximum of a function, we set the gradient to zero because:
a. The value of the gradient at extrema of a function is always zero - answer
b. Depends on the type of problem
c. Both A and B
d. None of the above
3. The most widely used metrics and tools to assess a classification model are:
a. Confusion matrix
b. Cost-sensitive accuracy
c. Area under the ROC curve
d. All of the above - answer
4. Which of the following is a good test dataset characteristic?
a. Large enough to yield meaningful results
b. Is representative of the dataset as a whole
c. Both A and B - answer
d. None of the above
5. Which of the following is a disadvantage of decision trees?
a. Factor analysis
b. Decision trees are robust to outliers
c. Decision trees are prone to be overfit - answer
d. None of the above
6. How do you handle missing or corrupted data in a dataset?
a. Drop missing rows or columns
b. Replace missing values with mean/median/mode
c. Assign a unique category to missing values
d. All of the above - answer
7. What is the purpose of performing cross-validation?
a. To assess the predictive performance of the models
b. To judge how the trained model performs outside the sample on test data
c. Both A and B - answer
8. Why is second order differencing in time series needed?
a. To remove stationarity
b. To find the maxima or minima at the local point
c. Both A and B - answer
d. None of the above
9. When performing regression or classification, which of the following is the correct way to
preprocess the data?
a. Normalize the data → PCA → training - answer
b. PCA → normalize PCA output → training
c. Normalize the data → PCA → normalize PCA output → training
d. None of the above
10. Which of the following is an example of feature extraction?
a. Constructing bag of words vector from an email
b. Applying PCA projects to a large high-dimensional data
c. Removing stopwords in a sentence
d. All of the above - answer
11. What is pca.components_ in Sklearn?
a. Set of all eigen vectors for the projection space - answer
b. Matrix of principal components
c. Result of the multiplication matrix
d. None of the above options
12. Which of the following is true about Naive Bayes ?
a. Assumes that all the features in a dataset are equally important
b. Assumes that all the features in a dataset are independent
c. Both A and B - answer
d. None of the above options
13. Which of the following statements about regularization is not correct?
a. Using too large a value of lambda can cause your hypothesis to underfit the data.
b. Using too large a value of lambda can cause your hypothesis to overfit the data.
c. Using a very large value of lambda cannot hurt the performance of your hypothesis.
d. None of the above - answer
14. How can you prevent a clustering algorithm from getting stuck in bad local optima?
a. Set the same seed value for each run
b. Use multiple random initializations - answer
c. Both A and B
d. None of the above
15. Which of the following techniques can be used for normalization in text mining?
a. Stemming
b. Lemmatization
c. Stop Word Removal
d. Both A and B - answer
16. In which of the following cases will K-means clustering fail to give good results? 1) Data
points with outliers 2) Data points with different densities 3) Data points with nonconvex
shapes
a. 1 and 2
b. 2 and 3
c. 1, 2, and 3 - answer
d. 1 and 3
17. Which of the following is a reasonable way to select the number of principal components
"k"?
a. Choose k to be the smallest value so that at least 99% of the variance is retained. - answer
b. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
c. Choose k to be the largest value so that 99% of the variance is retained.
d. Use the elbow method
18. You run gradient descent for 15 iterations with a=0.3 and compute J(theta) after each
iteration. You find that the value of J(Theta) decreases quickly and then levels off. Based on
this, which of the following conclusions seems most plausible?
a. Rather than using the current value of a, use a larger value of a (say a=1.0)
b. Rather than using the current value of a, use a smaller value of a (say a=0.1)
c. a=0.3 is an effective choice of learning rate- answer
d. None of the above
19. What is a sentence parser typically used for?
a. It is used to parse sentences to check if they are utf-8 compliant.
b. It is used to parse sentences to derive their most likely syntax tree structures. - answer
c. It is used to parse sentences to assign POS tags to all tokens.
d. It is used to check if sentences can be parsed into meaningful tokens.
20. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
a prediction hθ(x) = 0.2. This means
a. Our estimate for P(y=1 | x)
b. Our estimate for P(y=0 | x) - answer
c. Our estimate for P(y=1 | x)
d. Our estimate for P(y=0 | x)
1) If you remove any one of the red points from the data, will the
decision boundary change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack
in the constraints. So the decision boundary would completely change.
21. [True or False] If you remove the non-red circled points from the data, the decision
boundary will change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
23. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high level of misclassification penalty, soft margin will not hold existence as
there will be no room for error.
25. The minimum time complexity for training an SVM is O(n^2). According to this fact, what
sizes of datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
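Because training time grows at least quadratically with the number of samples, large datasets are a poor fit.
Datasets which have a clear classification boundary will function best with SVMs.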
29. Suppose you are using RBF kernel in SVM with high Gamma value. What does this
signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for
modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far
away from the hyperplane
For a low gamma, the model will be too constrained and include all points of the training
dataset, without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
31. Suppose you are building an SVM model on data X. The data X can be error prone,
which means that you should not trust any specific data point too much. Now suppose
you want to build an SVM model with a quadratic kernel function (polynomial
degree 2) that uses slack variable C as one of its hyperparameters. Based on that, answer
the following question.
What would happen when you use a very large value of C (C->infinity)?
Note: For small C, the model was also classifying all data points correctly.
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision
boundary will perfectly separate the data if possible.
32. What would happen when you use very small C (C~0)?
A) Misclassification would happen
B) Data will be correctly classified
C) Can’t say
D) None of these
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying
a few points, because the penalty is so low.
33. If I am using all features of my dataset and I achieve 100% accuracy on my training set,
but ~70% on validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if
we’re overfitting our data.
34. Which of the following are real world applications of the SVM?
A) Text and Hypertext Categorization
B) Image Classification
C) Clustering of News Articles
D) All of the above
Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems,
ranging from regression to clustering and handwriting recognition.
Question Context: 16 – 18
Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you
correctly infer that your SVM model is underfitting.
35. Which of the following options would you be more likely to consider when iterating on the SVM next
time?
A) You want to increase your data points
B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features
Solution: C
The best option here would be to create more features for the model.
36. Suppose you gave the correct answer in previous question. What do you think that is
actually happening?
1.We are lowering the bias
2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
A more complex model will lower the bias and increase the variance.
37. In the above question, suppose you want to change one of the SVM's hyperparameters so
that the effect is the same as in the previous questions, i.e. the model will not underfit.
Which change would you make?
A) We will increase the parameter C
B) We will decrease the parameter C
C) Changing C has no effect
D) None of these
Solution: A
Increasing the C parameter would be the right thing to do here, as it penalizes
misclassification more heavily (weaker regularization), so the model fits the training data
more closely.
38. We usually use feature normalization before using the Gaussian kernel in SVM. What is
true about feature normalization?
1.We do feature normalization so that new feature will dominate other
2. Some times, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Statements one and two are correct.
40. Suppose the classes are equally distributed in the data, and training one classifier in
the one-vs-all setting takes 10 seconds. How many seconds would it take to train the
one-vs-all method end to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
One-vs-all trains one classifier per class, so with 4 classes it would take 10×4 = 40 seconds
41. Suppose your problem has changed and the data now has only 2 classes. How many times
do you think we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
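The counting behind the two one-vs-all answers above can be written as a short sketch. The function names are hypothetical, and the 4-class figure is an assumption read off the "10×4" in the explanation.

```python
def one_vs_all_models(n_classes):
    """Number of binary classifiers that one-vs-all needs to train."""
    # With only two classes, a single binary classifier separates them.
    return 1 if n_classes == 2 else n_classes

def one_vs_all_training_time(n_classes, seconds_per_model):
    """Total training time if every binary classifier takes the same time."""
    return one_vs_all_models(n_classes) * seconds_per_model

total_seconds = one_vs_all_training_time(n_classes=4, seconds_per_model=10)
binary_case = one_vs_all_models(2)
```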
1. Support Vector Machine works well with,
a) Linear Scenarios
b) Non-linear Scenarios
c) Both of these
d) None of these
3. Two classes separated by a margin with two boundaries are called as,
a) Linear Vectors
b) Support Vectors
c) Test Vectors
d) None of these
8. To find out a trade-off between precision and number of support vectors, scikit-learn provides
an implementation called as,
a) NuSVC
b) BuSVC
c) MuSVC
d) AuSVC
Answer: a) NuSVC
a)
b)
c)
d) None of these
Answer: a)
a)
b)
c)
d) None of these
Answer: b)
11. The sigmoid kernel is based on this function:
a)
b)
c)
d) None of these
Answer: c)
a) 1
b) 2
c) Both 1 and 2
d) None of these
Answer: a) Discriminative
Answer: b) Find the optimal separating hyperplane which maximizes the margin of training data.
a)
b)
c)
d) None of these
Answer: a)
Answer: a) SVM algorithms use a set of mathematical functions that are defined as the kernel
9. Probability provides a way of summarizing the ______ that comes from our laziness and
ignorance.
A. Belief
B. Uncertainty
C. Joint probability distributions
D. Randomness
ANSWER: B
10. The entries in the full joint probability distribution can be calculated as
A. Using variables
B. Both Using variables & information
C. Using information
D. All of the above
ANSWER: C
12. Naïve Bayes algorithm is based on _______ and used for solving classification problems.
A. Bayes Theorem
B. Candidate elimination algorithm
C. EM algorithm
D. None of the above
ANSWER: A
19. Support vectors are the data points that lie closest to the decision surface.
A. TRUE
B. FALSE
ANSWER: A
22. Which of the following are real world applications of the SVM?
A. Text and Hypertext Categorization
B. Image Classification
C. Clustering of News Articles
D. All of the above
ANSWER:D
23. Gaussian naive Bayes is useful when working with continuous values whose probabilities
can be modeled using a Gaussian distribution
A. Bernoulli
B. multinomial
C. Gaussian
D. All of above
ANSWER: C
24. A multinomial distribution is useful to model feature vectors where each value
represents the number of occurrences of a term or its relative frequency
A. Bernoulli
B. multinomial
C. Gaussian
D. All of above
ANSWER: B
26. The two classes are normally separated by a margin with two boundaries where a few
elements lie. Those elements are called
A. principal components
B. support vectors
C. factors
D. None
ANSWER: B
27. What is/are true about kernels in SVM? 1. A kernel function maps low-dimensional data to
a high-dimensional space. 2. It is a similarity function
A. 1
B. 2
C. 1 and 2
D. None of these
ANSWER: C
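Both statements can be checked on a small example. The sketch below uses two well-known kernels: the RBF kernel as a similarity score, and the quadratic kernel (x·z)^2 together with its explicit 3-dimensional feature map for 2-D inputs. The function names and sample vectors are chosen here for illustration only.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Similarity in (0, 1]: equals 1 for identical points, decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def quadratic_kernel(x, z):
    """Computes (x . z)^2 directly in the original 2-D space."""
    return sum(a * b for a, b in zip(x, z)) ** 2

def quadratic_feature_map(x):
    """Explicit map from 2-D to 3-D whose dot product equals quadratic_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, 0.5)
implicit = quadratic_kernel(x, z)
explicit = sum(a * b for a, b in zip(quadratic_feature_map(x), quadratic_feature_map(z)))
```

The two values agree, which is the point of the kernel trick: the kernel returns the dot product in the higher-dimensional space without ever building the mapped vectors.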
28. Support vector machine (SVM) is a _________ classifier
A. Discriminative
B. Generative
ANSWER: A
30. The training examples closest to the separating hyperplane are called as _______
A. Training vector
B. Testing Vector
C. Support margin
D. Support vector
ANSWER:D
33. When using R, which of the following package is used for SVM?
A. b1072
B. c1071
C. d2012
D. e1071
ANSWER:D
35. Which of the following might be valid reasons for preferring an SVM over a neural
network?
A. An SVM can automatically learn to apply a non-linear transformation on the input space;
a neural net cannot.
B. An SVM can effectively map the data to an infinite-dimensional space; a neural net
cannot.
C. An SVM should not get stuck in local minima, unlike a neural net.
D. The transformed (basis function) representation constructed by an SVM is usually
easier to visualise/interpret than for a neural net.
ANSWER: B,C
36. You are given a labeled binary classification data set with N data points and D features.
Suppose that N < D. In training an SVM on this data set, which of the following kernels
is likely to be most appropriate?
A. Linear kernel
B. Quadratic kernel
C. Higher-order polynomial kernel
D. RBF kernel
ANSWER: A
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
Regression,
Classification
Clustering
Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
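A quick check of the coefficient interpretation, using the model from the question. The input values 5, 6 and 10 are arbitrary choices for illustration.

```python
def predict(x1, x2):
    """The model from the question: y = 2 + 3*x1 + 4*x2."""
    return 2 + 3 * x1 + 4 * x2

# Raise x1 by one unit while holding x2 constant.
change = predict(x1=6, x2=10) - predict(x1=5, x2=10)
```

The change equals the coefficient of x1, whatever value x2 is held at.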
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
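The multiple coefficient of determination is R^2 = SSR/SST = 1 - SSE/SST. A minimal sketch with the numbers from the question (the function name is hypothetical):

```python
def coefficient_of_determination(sst, sse):
    """R^2 = 1 - SSE/SST, the share of total variation explained by the model."""
    return 1 - sse / sst

r_squared = coefficient_of_determination(sst=200, sse=50)
```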
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
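The definition in option (a) can be sketched with the standard library. The sample values below are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Square root of (sample variance / number of sample instances)."""
    return math.sqrt(statistics.variance(sample) / len(sample))

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(sample)
```

Here statistics.variance is the sample variance (divisor n - 1), which matches the wording of the option.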
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
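The error measures named in these questions differ only in how the individual differences are aggregated. A minimal sketch follows, with made-up actual and predicted values:

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of the absolute (positive) differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))

actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```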
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
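The termination rule described in the question can be seen in a small one-dimensional K-Means sketch. The data, the initial means and the function name are hypothetical; real implementations usually stop on a tolerance rather than exact equality.

```python
def k_means_1d(values, initial_means, max_iter=100):
    """Plain K-Means on 1-D data; stops when the means no longer change."""
    means = list(initial_means)
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster with the nearest mean.
        clusters = [[] for _ in means]
        for v in values:
            nearest = min(range(len(means)), key=lambda i: abs(v - means[i]))
            clusters[nearest].append(v)
        # Update step: recompute each mean (keep the old one if a cluster is empty).
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:  # identical to the previous iteration: terminate
            break
        means = new_means
    return means

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_means = k_means_1d(values, initial_means=[1.0, 2.0])
```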
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1.True- False: Over fitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
over fitting.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove such a column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
Unsupervised methods such as PCA need no target variable. LDA, by contrast, is an example of
a supervised dimensionality reduction algorithm, which does use one.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions then you will lose some information of data
most of the times and you won’t be able to interpret the lower dimension data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from
a college.
1) Which of the following statement is true in following case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables whose categories have some order. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more information with which to fit the training data. Testing accuracy, however, increases
only if the added features are found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and, for a new example x, it
outputs a prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared error
metric to evaluate the model performance. The remaining options are used in case of a
classification problem.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True, In case of lasso regression we apply absolute penalty which makes some of the coefficients
zero.
19. Suppose that we have N independent variables (X1, X2, …, Xn) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. Suppose you are given two variables V1 and V2 with the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider both statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
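The y=|x| and y=x^2 examples can be verified directly. The sketch below implements the Pearson coefficient by hand and evaluates it on a small symmetric set of x values, which is an assumption needed for the coefficient to come out as zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

xs = [-3, -2, -1, 0, 1, 2, 3]             # symmetric around zero
r_abs = pearson(xs, [abs(x) for x in xs])      # y = |x|
r_square = pearson(xs, [x * x for x in xs])    # y = x^2
r_linear = pearson(xs, [2 * x + 1 for x in xs])  # y = 2x + 1, for contrast
```

Both nonlinear relationships give a coefficient of (numerically) zero even though y is fully determined by x, while the linear one gives 1.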
22. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
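For a model with an intercept and a single feature, the Normal Equation θ = (XᵀX)⁻¹Xᵀy reduces to a 2×2 system that can be solved in closed form. The sketch below is that special case only; the data and the function name are made up for illustration.

```python
def normal_equation_1d(xs, ys):
    """Solve (X^T X) theta = X^T y for X = [1, x]; returns (intercept, slope)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sum_x], [sum_x, sum_xx]] and X^T y = [sum_y, sum_xy].
    det = n * sum_xx - sum_x * sum_x
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / det
    slope = (n * sum_xy - sum_x * sum_y) / det
    return intercept, slope

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
intercept, slope = normal_equation_1d(xs, ys)
```

There is no learning rate and no loop over iterations. With many features, however, the matrix XᵀX grows and inverting it becomes the expensive step, which is why statement 2 holds.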
25. What will happen when you apply very large penalty in case of Ridge?
A) Some of the coefficient will become absolute zero
B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso, some of the coefficient values become zero, but in the case of Ridge the coefficients
become close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
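The difference between the two penalties can be sketched for a single coefficient. Assuming the simplified objectives 0.5*(b - β)^2 + 0.5*λ*β^2 for Ridge and 0.5*(b - β)^2 + λ*|β| for Lasso, where b is the unpenalized estimate, the minimizers have the closed forms below. The function names and the sample coefficients are hypothetical.

```python
def ridge_shrink(b, penalty):
    """Ridge solution: scales the coefficient toward zero, never exactly to zero."""
    return b / (1.0 + penalty)

def lasso_shrink(b, penalty):
    """Lasso solution (soft-thresholding): small coefficients become exactly zero."""
    magnitude = max(abs(b) - penalty, 0.0)
    return magnitude if b >= 0 else -magnitude

unpenalized = [4.0, -1.5, 0.3]
ridge = [ridge_shrink(b, penalty=1.0) for b in unpenalized]
lasso = [lasso_shrink(b, penalty=1.0) for b in unpenalized]
```

Under these assumptions every Ridge coefficient stays nonzero, while the Lasso sets the smallest one to exactly zero. This is why Lasso can be used for variable selection.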
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True, a Neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output is not a real-valued quantity,
so mean squared error cannot be used to evaluate it
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select as the best logistic regression model the one that has the least AIC.
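AIC is defined as 2k - 2·ln(L), where k is the number of estimated parameters and L is the maximized likelihood. A minimal sketch of the comparison follows; the parameter counts and log-likelihood values are invented purely for illustration.

```python
def aic(num_params, log_likelihood):
    """Akaike Information Criterion: 2k - 2*ln(L); lower is better."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical fitted models: B fits slightly better but uses twice the parameters.
model_a = aic(num_params=3, log_likelihood=-120.0)
model_b = aic(num_params=6, log_likelihood=-118.5)
preferred = "A" if model_a < model_b else "B"
```

The extra parameters of model B are not justified by its small gain in likelihood, so the model with the smaller AIC is kept.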
Solution: A
In case of lasso we apply an absolute penalty. After increasing the penalty in lasso, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation the P (y =1|x; w) , viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49 In above question what do you think which function would make p between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
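The odds and logit definitions used in the last two explanations can be checked with a few lines. The probabilities passed in are arbitrary sample values, and the endpoints 0 and 1 themselves are excluded because the logit is unbounded there.

```python
import math

def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1 - p)

def logit(p):
    """Natural log of the odds; defined for 0 < p < 1."""
    return math.log(odds(p))

fair_coin_odds = odds(0.5)
fair_coin_logit = logit(0.5)
near_zero, near_one = logit(0.001), logit(0.999)
```

As p approaches 0 the logit grows without bound in the negative direction, and as p approaches 1 it grows without bound in the positive direction.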
A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
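That the logistic function is the inverse of the logit can be confirmed with a round trip. This is a small sketch with arbitrarily chosen sample probabilities.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p with 0 < p < 1."""
    return math.log(p / (1 - p))

probabilities = (0.1, 0.5, 0.9)
round_trip = [logistic(logit(p)) for p in probabilities]
```

Applying the logit and then the logistic function returns the original probability (up to floating-point rounding), which is what Logistic(x) = Logit_inv(x) states.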
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the training data. Testing accuracy, however, increases only if the
added features are found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression has to fit, where the probability of each
category is predicted over the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the following above figure shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
In figure 3 (C), the decision boundary is not smooth, which means it is overfitting the data.
1. The training error in first plot is maximum as compare to second and third plot.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
polynomial (right graph) might have very high accuracy on the training population but is expected to
fail badly on the test dataset. The left graph has the maximum training error because it underfits the
training data
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty, which means a less complex decision boundary, as shown
in the first figure (A).
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data in less time while
getting comparatively similar accuracy (may not be the same)?
Solution: D
If you decrease the number of iterations, training will surely take less time but will not give the
same accuracy. To get similar (though not identical) accuracy, you also need to increase the learning rate.
62. Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability)
for a two-class classification problem.
Which of the following images shows the cost function for y = 1?
Solution: A
A is the true answer as loss function decreases as the log probability increases
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms a linear decision surface, but the examples in the figure are not
linearly separable.
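The figure is not reproduced here, so the following is only a sketch under the assumption that it shows an XOR-style labelling of the four binary points. A coarse grid search over linear rules of the form w1*x1 + w2*x2 + b > 0 (the grid range and step are arbitrary choices) never gets all four points right.

```python
from itertools import product

# XOR-style labelling on binary inputs (an assumption about the missing figure).
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(w1, w2, b):
    """Fraction of points classified correctly by the rule w1*x1 + w2*x2 + b > 0."""
    correct = sum(int(w1 * x1 + w2 * x2 + b > 0) == y for (x1, x2), y in points)
    return correct / len(points)

# Coarse grid over weights and bias in [-3, 3] with step 0.5.
grid = [i / 2 for i in range(-6, 7)]
best = max(accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(best)  # 0.75: at most three of the four points are classified correctly
```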
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context: 8 – 9
Suppose you are using a linear SVM classifier on a 2-class classification problem. You have been
given the following data, in which the points circled in red represent the support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the points that are not circled in red from the data, the decision
boundary will change.
A) True
B) False
Solution: B
On the other hand, the rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high misclassification penalty, the soft margin effectively ceases to exist, as there is no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
Datasets which have a clear classification boundary will function best with SVM’s.
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
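To make the effect of gamma concrete, here is a small sketch of the usual RBF kernel form exp(-gamma * d^2), where d is the distance between two points. The distances and gamma values are arbitrary illustrative choices.

```python
import math

def rbf_kernel(distance, gamma):
    """RBF kernel value exp(-gamma * d^2) for two points a distance d apart."""
    return math.exp(-gamma * distance ** 2)

for gamma in (0.1, 10.0):
    near = rbf_kernel(0.5, gamma)
    far = rbf_kernel(3.0, gamma)
    print(f"gamma = {gamma}: near = {near:.4f}, far = {far:.2e}")
```

With a high gamma the kernel value for the distant pair is practically zero, so only nearby points influence the model; with a low gamma even distant points still contribute.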
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error-prone, which means that
you should not trust any specific data point too much. Now suppose you want to build an SVM model
with a quadratic kernel function (polynomial of degree 2) that uses the slack parameter C as one of its
hyperparameters. Based on that, answer the following question.
What would happen when you use a very large value of C (C -> infinity)?
Note: For small C, the model was also classifying all data points correctly.
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
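The role of C in questions 19 and 20 can be sketched with the standard soft-margin objective, 0.5 * ||w||^2 + C * (sum of hinge losses). The 1-D data set and the two candidate classifiers below are hypothetical, chosen only to show how the preferred solution flips with C.

```python
def soft_margin_objective(w, b, data, C):
    """0.5 * w^2 + C * sum of hinge losses for a 1-D linear classifier f(x) = w*x + b."""
    hinge = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in data)
    return 0.5 * w * w + C * hinge

# Toy 1-D data (hypothetical): the point (0.5, -1) sits close to the positive class.
data = [(-3, -1), (-2, -1), (0.5, -1), (2, 1), (3, 1)]

narrow = (4 / 3, -5 / 3)  # separates every point, but with a small margin
wide = (0.5, 0.0)         # larger margin, misclassifies (0.5, -1)

for C in (0.01, 100.0):
    scores = {name: soft_margin_objective(w, b, data, C)
              for name, (w, b) in (("narrow", narrow), ("wide", wide))}
    print(C, min(scores, key=scores.get))
```

For the small C the wide-margin classifier has the lower objective despite its one mistake; for the large C the classifier that separates every point wins.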
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
A better model will lower the bias and increase the variance.
25. In the above question, suppose you want to change one of the SVM’s hyperparameters so that the
effect is the same as in the previous questions, i.e. the model does not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here, as it weakens the regularization and lets
the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose the classes are equally distributed in the data. Say training one SVM in the one-vs-all
setting takes 10 seconds. How many seconds would it take to train the one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
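The arithmetic behind questions 27 and 28 can be written out as a tiny sketch, assuming one binary classifier per class and an equal training time for each (the function name is illustrative):

```python
def one_vs_all_training_time(n_classes, seconds_per_model):
    """One binary classifier is trained per class, so total time scales with n_classes."""
    n_models = n_classes
    return n_models, n_models * seconds_per_model

print(one_vs_all_training_time(4, 10))  # (4, 40): four models, 40 seconds in total
```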
29. Suppose your problem has changed and the data now has only 2 classes. How many times do we
need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only once would give you appropriate results.
Suppose you are using an SVM with linear kernel of polynomial degree 2. You have applied it to the
data and found that it fits perfectly, meaning training and testing accuracy are both 100%.
30. Now suppose you increase the complexity (or the degree of the polynomial of this kernel). What do
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question, after increasing the complexity you found that training accuracy was still
100%. What do you think is the reason for that?
1. Since the data is fixed and we are fitting more polynomial terms or parameters, the algorithm starts
memorizing everything in the data
2. Since the data is fixed, the SVM doesn’t need to search in a big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered ways of improving the base
learners’ results.
9. In Random Forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
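A minimal sketch of what each tree receives under that description follows. The function name, its parameters and the fixed seed are illustrative assumptions. Many implementations draw the feature subset at each split rather than once per tree; this sketch draws it once per tree to match the wording above.

```python
import random

def draw_tree_inputs(n_samples, n_features, max_features, seed=0):
    """Pick the rows and columns that one tree in the ensemble would be trained on."""
    rng = random.Random(seed)
    # Bootstrap: sample row indices with replacement, same size as the data.
    rows = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Feature subset: sample column indices without replacement.
    cols = sorted(rng.sample(range(n_features), max_features))
    return rows, cols

rows, cols = draw_tree_inputs(n_samples=10, n_features=6, max_features=2)
print(rows, cols)
```

Because rows are drawn with replacement, some samples usually repeat and others are left out, which is what makes the trees differ from one another.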
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees as
possible in model building. Random Forest is a black-box model, so you lose interpretability
after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
13. Which of the following algorithms doesn’t use learning rate as one of its hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees as
possible in model building. Random Forest is a black-box model, so you lose interpretability
after using it.
16. True-False: Bagging is suitable for high-variance, low-bias models?
A) TRUE
B) FALSE
Solution: A
Bagging is suitable for high-variance, low-bias models, or in other words for complex models.
17. To apply bagging to regression trees, which of the following is/are true?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenarios is gain ratio preferred over Information Gain?
Solution: A
For features with high cardinality, gain ratio is preferred over the Information Gain technique.
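The following sketch shows why, using the textbook definitions: information gain is the label entropy minus the weighted entropy after the split, and gain ratio divides that by the entropy of the feature itself (the split information). The tiny data set is made up for illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(feature, labels):
    """Label entropy minus the weighted label entropy within each feature value."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature):
        subset = [y for f, y in zip(feature, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    # Split information is the entropy of the feature itself.
    return information_gain(feature, labels) / entropy(feature)

labels = [1, 1, 0, 0]
row_id = ["a", "b", "c", "d"]  # high cardinality: one distinct value per row
group = ["x", "x", "y", "y"]   # only two distinct values

print(information_gain(row_id, labels), information_gain(group, labels))
print(gain_ratio(row_id, labels), gain_ratio(group, labels))
```

Both features reach the same information gain, but the ID-like feature is penalised by its large split information, so gain ratio ranks the two-valued feature higher.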
20. Suppose you are given the following scenarios of training and validation error for Gradient
Boosting. Which of the following hyperparameter settings would you choose in such a case?
Scenario  Depth  Training Error  Validation Error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation accuracy, but we would select 2 because its lower depth makes
it the better hyperparameter choice.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
This sheet is for 1 Mark questions
Sr. No | Question | a | b | c | d | Correct Answer | Image
 | Write down question | Option a | Option b | Option c | Option d | a/b/c/d | img.jpg
1 | In reinforcement learning if feedback is negative one it is defined as____. | Penalty | Overlearning | Reward | None of above | A
2 | According to____ , it’s a key success factor for the survival and evolution of all species. | Claude Shannon's theory | Gini Index | Darwin’s theory | None of above | C
3 | How can you avoid overfitting ? | By using a lot of data | By using inductive machine learning | By using validation only | None of above | A
4 | What are the popular algorithms of Machine Learning? | Decision Trees and Neural Networks (back propagation) | Probabilistic networks and Nearest Neighbor | Support vector machines | All | D
5 | What is “Training set”? | Training set is used to test the accuracy of the hypotheses generated by the learner. | A set of data is used to discover the potentially predictive relationship. | Both A & B | None of above | B
6 | Common deep learning applications include____ | Image classification, Real-time visual tracking | Autonomous car driving, Logistic optimization | Bioinformatics, Speech recognition | All above | D
7 | what is the function of “Supervised Learning”? | Classifications, Predict time series, Annotate strings | Speech recognition, Regression | Both A & B | None of above | C
8 | Commons unsupervised applications include | Object segmentation | Similarity detection | Automatic labeling | All above | D
9 | Reinforcement learning is particularly efficient when______________. | the environment is not completely deterministic | it's often very dynamic | it's impossible to have a precise error measure | All above | D
10 | if there is only a discrete number of possible outcomes (called categories), the process becomes a______. | Regression | Classification. | Modelfree | Categories | B
11 | Which of the following are supervised learning applications | Spam detection, Pattern detection, Natural Language Processing | Image classification, Real-time visual tracking | Autonomous car driving, Logistic optimization | Bioinformatics, Speech recognition | A
12 | During the last few years, many ______ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state. | Logical | Classical | Classification | None of above | D
13 | Which of the following sentence is correct? | Machine learning relates with the study, design and development of the algorithms that give computers the capability to learn without being explicitly programmed. | Data mining can be defined as the process in which the unstructured data tries to extract knowledge or unknown interesting patterns. | Both A & B | None of the above | C
14 | What is “Overfitting” in Machine learning? | when a statistical model describes random error or noise instead of underlying relationship “overfitting” occurs. | Robots are programed so that they can perform the task based on data they gather from sensors. | While involving the process of learning “overfitting” occurs. | a set of data is used to discover the potentially predictive relationship | A
15 | What is “Test set”? | Test set is used to test the accuracy of the hypotheses generated by the learner. | It is a set of data is used to discover the potentially predictive relationship. | Both A & B | None of above | A
16 | ________is much more difficult because it's necessary to determine a supervised strategy to train a model for each feature and, finally, to predict their value | Removing the whole line | Creating sub-model to predict those features | Using an automatic strategy to input them according to the other known values | All above | B
17 | How it's possible to use a different placeholder through the parameter_______. | regression | classification | random_state | missing_values | D
18 | If you need a more powerful scaling feature, with a superior control on outliers and the possibility to select a quantile range, there's also the class________. | RobustScaler | DictVectorizer | LabelBinarizer | FeatureHasher | A
19 | scikit-learn also provides a class for per-sample normalization, Normalizer. It can apply________to each element of a dataset | max, l0 and l1 norms | max, l1 and l2 norms | max, l2 and l3 norms | max, l3 and l4 norms | B
20 | There are also many univariate methods that can be used in order to select the best features according to specific criteria based on________. | F-tests and p-values | chi-square | ANOVA | All above | A
21 | Which of the following selects only a subset of features belonging to a certain percentile | SelectPercentile | FeatureHasher | SelectKBest | All above | A
22 | ________performs a PCA with non-linearly separable data sets. | SparsePCA | KernelPCA | SVD | None of the Mentioned | B
23 | A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from a college. Which of the following statement is true in following case? | Feature F1 is an example of nominal variable. | Feature F1 is an example of ordinal variable. | It doesn’t belong to any of the above category. | Both of these | B
24 | What would you do in PCA to get the same projection as SVD? | Transform data to zero mean | Transform data to zero median | Not possible | None of these | A
25 | What is PCA, KPCA and ICA used for? | Principal Components Analysis | Kernel based Principal Component Analysis | Independent Component Analysis | All above | D
26 | Can a model trained for item based similarity also choose from a given set of items? | YES | NO | | | A
27 | What are common feature selection methods in regression task? | correlation coefficient | Greedy algorithms | All above | None of these | C
28 | The parameter______ allows specifying the percentage of elements to put into the test/training set | test_size | training_size | All above | None of these | C
29 | In many classification problems, the target ______ is made up of categorical labels which cannot immediately be processed by any algorithm. | random_state | dataset | test_size | All above | B
30 | _______adopts a dictionary-oriented approach, associating to each category label a progressive integer number. | LabelEncoder class | LabelBinarizer class | DictVectorizer | FeatureHasher | A
31 | If Linear regression model perfectly fits i.e., train error is zero, then _____________________ | a) Test error is also always zero | b) Test error is non zero | c) Couldn’t comment on Test error | d) Test error is equal to Train error | c
32 | Which of the following metrics can be used for evaluating regression models? i) R Squared ii) Adjusted R Squared iii) F Statistics iv) RMSE / MSE / MAE | a) ii and iv | b) i and ii | c) ii, iii and iv | d) i, ii, iii and iv | d
33 | How many coefficients do you need to estimate in a simple linear regression model (One independent variable)? | a) 1 | b) 2 | c) 3 | d) 4 | b
34 | In a simple linear regression model (One independent variable), If we change the input variable by 1 unit. How much output variable will change? | a) by 1 | b) no change | c) by intercept | d) by its slope | d
35 | Function used for linear regression in R is __________ | a) lm(formula, data) | b) lr(formula, data) | c) lrm(formula, data) | d) regression.linear(formula, data) | a
36 | In syntax of linear model lm(formula,data,..), data refers to ______ | a) Matrix | b) Vector | c) Array | d) List | b
37 | In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________ | a) (X-intercept, Slope) | b) (Slope, X-Intercept) | c) (Y-Intercept, Slope) | d) (slope, Y-Intercept) | c
38 | Linear Regression is a supervised machine learning algorithm. | A) TRUE | B) FALSE | | | a
39 | It is possible to design a Linear regression algorithm using a neural network? | A) TRUE | B) FALSE | | | a
40 | Which of the following methods do we use to find the best fit line for data in Linear Regression? | A) Least Square Error | B) Maximum Likelihood | C) Logarithmic Loss | D) Both A and B | a
41 | Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable? | A) AUC-ROC | B) Accuracy | C) Logloss | D) Mean-Squared-Error | d
42 | Which of the following is true about Residuals ? | A) Lower is better | B) Higher is better | C) A or B depend on the situation | D) None of these | a
43 | Overfitting is more likely when you have huge amount of data to train? | A) TRUE | B) FALSE | | | b
44 | Which of the following statement is true about outliers in Linear regression? | A) Linear regression is sensitive to outliers | B) Linear regression is not sensitive to outliers | C) Can’t say | D) None of these | a
45 | Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusion do you make about this situation? | A) Since the there is a relationship means our model is not good | B) Since the there is a relationship means our model is good | C) Can’t say | D) None of these | a
46 | Naive Bayes classifiers are a collection ------------------of algorithms | Classification | Clustering | Regression | All | a
47 | Naive Bayes classifiers is _______________ Learning | Supervised | Unsupervised | Both | None | a
48 | Features being classified is independent of each other in Naïve Bayes Classifier | False | TRUE | | | b
49 | Features being classified is __________ of each other in Naïve Bayes Classifier | Independent | Dependent | Partial Dependent | None | a
50 | Bayes Theorem is given by where 1. P(H) is the probability of hypothesis H being true. 2. P(E) is the probability of the evidence(regardless of the hypothesis). 3. P(E|H) is the probability of the evidence given that hypothesis is true. 4. P(H|E) is the probability of the hypothesis given that the evidence is there. | True | FALSE | | | a | bayes.jpg
51 | In given image, P(H|E) is__________probability. | Posterior | Prior | | | a | bayes.jpg
52 | In given image, P(H) is__________probability. | Posterior | Prior | | | b | bayes.jpg
53 | Conditional probability is a measure of the probability of an event given that another event has already occurred. | True | FALSE | | | a
54 | Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. | True | FALSE | | | a
55 | Bernoulli Naïve Bayes Classifier is ___________distribution | Continuous | Discrete | Binary | | c
56 | Multinomial Naïve Bayes Classifier is ___________distribution | Continuous | Discrete | Binary | | b
57 | Gaussian Naïve Bayes Classifier is ___________distribution | Continuous | Discrete | Binary | | a
58 | Binarize parameter in BernoulliNB scikit sets threshold for binarizing of sample features. | True | FALSE | | | a
59 | Gaussian distribution when plotted, gives a bell shaped curve which is symmetric about the _______ of the feature values. | Mean | Variance | Discrete | Random | a
60 | SVMs directly give us the posterior probabilities P(y = 1|x) and P(y = −1|x) | True | FALSE | | | b
61 | Any linear combination of the components of a multivariate Gaussian is a univariate Gaussian. | True | FALSE | | | a
62 | Solving a non linear separation problem with a hard margin Kernelized SVM (Gaussian RBF Kernel) might lead to overfitting | True | FALSE | | | a
63 | SVM is a ------------------ algorithm | Classification | Clustering | Regression | All | a
64 | SVM is a ------------------ learning | Supervised | Unsupervised | Both | None | a
65 | The linear SVM classifier works by drawing a straight line between two classes | True | FALSE | | | a
66 | Which of the following function provides unsupervised prediction ? | cl_forecastB | cl_nowcastC | cl_precastD | None of the Mentioned | D | --
67 | Which of the following is characteristic of best machine learning method ? | fast | accuracy | scalable | All above | D | --
68 | What are the different Algorithm techniques in Machine Learning? | Supervised Learning and Semi-supervised Learning | Unsupervised Learning and Transduction | Both A & B | None of the Mentioned | C | --
69 | What is the standard approach to supervised learning? | split the set of example into the training set and the test | group the set of example into the training set and the test | a set of observed instances tries to induce a general rule | learns programs from data | A | --
70 | Which of the following is not Machine Learning? | Artificial Intelligence | Rule based inference | Both A & B | None of the Mentioned | B | --
71 | What is Model Selection in Machine Learning? | The process of selecting models among different mathematical models, which are used to describe the same data set | when a statistical model describes random error or noise instead of underlying relationship | Find interesting directions in data and find novel observations/ database cleaning | All above | A | --
72 | Which are two techniques of Machine Learning ? | Genetic Programming and Inductive Learning | Speech recognition and Regression | Both A & B | None of the Mentioned | A | --
73 | Even if there are no actual supervisors ________ learning is also based on feedback provided by the environment | Supervised | Reinforcement | Unsupervised | None of the above | B | --
74 | What does learning exactly mean? | Robots are programed so that they can perform the task based on data they gather from sensors. | A set of data is used to discover the potentially predictive relationship. | Learning is the ability to change according to external stimuli and remembering most of all previous experiences. | It is a set of data is used to discover the potentially predictive relationship. | C | --
75 | When it is necessary to allow the model to develop a generalization ability and avoid a common problem called______. | Overfitting | Overlearning | Classification | Regression | A | --
76 | Techniques involve the usage of both labeled and unlabeled data is called___. | Supervised | Semi-supervised | Unsupervised | None of the above | B | --
77 | In reinforcement learning if feedback is negative one it is defined as____. | Penalty | Overlearning | Reward | None of above | A | --
78 | According to____ , it’s a key success factor for the survival and evolution of all species. | Claude Shannon's theory | Gini Index | Darwin’s theory | None of above | C | --
79 | A supervised scenario is characterized by the concept of a _____. | Programmer | Teacher | Author | Farmer | B | --
80 | overlearning causes due to an excessive ______. | Capacity | Regression | Reinforcement | Accuracy | A | --
81 | Which of the following is an example of a deterministic algorithm? | PCA | K-Means | None of the above | | A | --
82 | Which of the following model model include a backwards elimination feature selection routine? | MCV | MARS | MCRS | All above | B | --
83 | Can we extract knowledge without apply feature selection | YES | NO | | | A | --
84 | While using feature selection on the data, is the number of features decreases. | NO | YES | | | B | --
85 | Which of the following are several models for feature extraction | regression | classification | None of the above | | C | --
86 | _____ provides some built-in datasets that can be used for testing purposes. | scikit-learn | classification | regression | None of the above | A | --
87 | While using _____ all labels are turned into sequential numbers. | LabelEncoder class | LabelBinarizer class | DictVectorizer | FeatureHasher | A | --
88 | _______produce sparse matrices of real numbers that can be fed into any machine learning model. | DictVectorizer | FeatureHasher | Both A & B | None of the Mentioned | C | --
89 | scikit-learn offers the class______, which is responsible for filling the holes using a strategy based on the mean, median, or frequency | LabelEncoder | LabelBinarizer | DictVectorizer | Imputer | D | --
90 | Which of the following scale data by removing elements that don't belong to a given range or by considering a maximum absolute value. | MinMaxScaler | MaxAbsScaler | Both A & B | None of the Mentioned | C | --
91 | scikit-learn also provides a class for per-sample normalization,_____ | Normalizer | Imputer | Classifier | All above | A | --
92 | ______dataset with many features contains information proportional to the independence of all features and their variance. | normalized | unnormalized | Both A & B | None of the Mentioned | B | --
93 | In order to assess how much information is brought by each component, and the correlation among them, a useful tool is the_____. | Concuttent matrix | Convergance matrix | Supportive matrix | Covariance matrix | D | --
94 | The_____ parameter can assume different values which determine how the data matrix is initially processed. | run | start | init | stop | C | --
95 | ______allows exploiting the natural sparsity of data while extracting principal components. | SparsePCA | KernelPCA | SVD | init parameter | A | --
96 | Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable? | AUC-ROC | Accuracy | Logloss | Mean-Squared-Error | D | --
97 | Which of the following is true about Residuals ? | Lower is better | Higher is better | A or B depend on the situation | None of these | A | --
98 | Overfitting is more likely when you have huge amount of data to train? | TRUE | FALSE | | | B | --
99 | Which of the following statement is true about outliers in Linear regression? | Linear regression is sensitive to outliers | Linear regression is not sensitive to outliers | Can’t say | None of these | A | --
100 Suppose you plotted a scatter plot Since there is a Since there is a Can't say None of these A --
between the residuals and predicted relationship, our relationship, our
values in linear regression and you found model is not good model is good
that there is a relationship between them.
Which of the following conclusions do you
make about this situation?
101 Let's say a "Linear regression" model You will always have test You can not have test error None of the above C --
perfectly fits the training data (train error error zero zero
is zero). Now, which of the following
statements is true?
102 In a linear regression problem, we are If R Squared increases, this If R Squared decreases, this Individually R squared None of these. C --
using "R-squared" to measure variable is significant. variable is not significant. cannot tell about variable
goodness-of-fit. We add a feature in linear importance. We can't say
regression model and retrain the same anything about it right now.
model. Which of the following options is
true?
103 Which of the one is true about Linear Regression with Linear Regression with Linear Regression with None of these A --
Heteroskedasticity? varying error terms constant error terms zero error terms
104 Which of the following assumptions do 1,2 and 3. 1,3 and 4. 1 and 3. All of above. D --
we make while deriving linear regression
parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation
4. The predictor x is non-stochastic and is measured error-free
105 To test linear relationship of y(dependent) Scatter plot Barchart Histograms None of these A --
and x(independent) continuous variables,
which of the following plot best suited?
106 Which of the following steps / assumptions The polynomial degree Whether we learn the The use of a constant-term A --
in regression modeling impacts the weights by matrix inversion
trade-off between under-fitting and over-fitting or gradient descent
the most?
107 Can we calculate the skewness of TRUE FALSE B --
variables based on mean and median?
108 Which of the following is true about Ridge regression uses Lasso regression uses Both use subset selection None of above B --
"Ridge" or "Lasso" regression subset selection of features subset selection of features of features
methods in case of feature selection?
109 Which of the following statement(s) can 1 and 2 1 and 3 2 and 4 None of the above A --
be true post adding a variable in a linear
regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
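Note on the question above: a minimal sketch, assuming ordinary least squares evaluated on the training data, where R-squared cannot decrease when a variable is added. It uses the usual definition, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), to show why statements 1 and 2 are both possible. The sample size, feature counts, and R-squared values are hypothetical numbers chosen for illustration.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical model with 30 samples and 3 predictors.
before = adjusted_r2(0.800, n_samples=30, n_features=3)

# Statement 2: R^2 barely rises after adding a 4th predictor,
# so the penalty for the extra feature makes adjusted R^2 fall.
weak = adjusted_r2(0.801, n_samples=30, n_features=4)

# Statement 1: R^2 rises a lot, so adjusted R^2 rises as well.
strong = adjusted_r2(0.900, n_samples=30, n_features=4)

print(before, weak, strong)
```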
110 How many coefficients do you need to 1 2 Can't Say B --
estimate in a simple linear regression
model (One independent variable)?
111 In given image, P(H) Posterior Prior B bayes.jpg
is__________probability.
112 Conditional probability is a measure of the True FALSE A --
probability of an event given that another
event has already occurred.
113 Gaussian distribution when plotted, gives Mean Variance Discrete Random A --
a bell shaped curve which is symmetric
about the _______ of the feature values.
114 SVMs directly give us the posterior True FALSE B --
probabilities P(y = 1|x) and P(y = -1|x)
115 SVM is a ------------------ algorithm Classification Clustering Regression All A --
116 What is/are true about kernel in SVM? 1 2 1 and 2 None of these C --
1. Kernel functions map low dimensional data to high dimensional space
2. It's a similarity function
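Note on the question above: a minimal sketch of the "similarity function" view, using the Gaussian RBF kernel as one common example. The function name `rbf_kernel` and the default `gamma` are assumptions of this sketch, not any library's API. The value is 1 for identical points and shrinks toward 0 as the points move apart.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points
near = rbf_kernel([0.0, 0.0], [1.0, 0.0])   # close points
far = rbf_kernel([0.0, 0.0], [3.0, 0.0])    # distant points

print(same, near, far)
```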
117 Suppose you are building a SVM model on Misclassification would Data will be correctly Can't say None of these A --
data X. The data X can be error prone happen classified
which means that you should not trust
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its hyper
parameters. What would happen when you
use very small C (C~0)?
118 The cost parameter in the SVM means: The number of cross-validations The kernel to be used The tradeoff between None of the above C --
to be made misclassification and
simplicity of the model
119 Bayes' theorem describes the True FALSE A --
probability of an event, based on prior
knowledge of conditions that might be
related to the event.
120 Bernoulli Naive Bayes Classifier is Continuous Discrete Binary C --
___________distribution
121 If you remove the non-red circled points TRUE FALSE B svm.jpg
from the data, the decision boundary will
change?
122 How do you handle missing or corrupted a. Drop missing rows or b. Replace missing values c. Assign a unique d. All of the above D --
data in a dataset? columns with mean/median/mode category to missing values
123 Binarize parameter in BernoulliNB scikit True FALSE A --
sets threshold for binarizing of sample
features.
124 Which of the following statements about A. Attributes are B. Attributes are C. Attributes are D. Attributes can B --
Naive Bayes is incorrect? equally important. statistically dependent of statistically independent of be nominal or numeric
one another given the class one another given the
value. class value.
125 SVMs are less effective when: The data is linearly separable The data is clean and ready The data is noisy and C --
to use contains overlapping
points
126 Naive Bayes classifiers is _______________ Supervised Unsupervised Both None A --
Learning
127 Features being classified are independent False TRUE B --
of each other in Naive Bayes Classifier
128 Features being classified are __________ of Independent Dependent Partial Dependent None A --
each other in Naive Bayes Classifier
129 Bayes Theorem is given by where 1. P(H) True FALSE A bayes.jpg
is the probability of hypothesis H being
true.
2. P(E) is the probability of the
evidence(regardless of the hypothesis).
3. P(E|H) is the probability of the evidence
given that hypothesis is true.
4. P(H|E) is the probability of the
hypothesis given that the evidence is
there.
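Note on the question above: a minimal sketch of Bayes' theorem, P(H|E) = P(E|H) P(H) / P(E), using the four quantities listed. The function name and the numbers are hypothetical. P(E) is expanded with the law of total probability over H and not-H, which requires the extra input P(E | not H).

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from the prior P(H) and the two likelihoods."""
    # P(E): probability of the evidence regardless of the hypothesis.
    evidence = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / evidence


# Hypothetical numbers: rare hypothesis, fairly reliable evidence.
p = posterior(prior_h=0.01, p_e_given_h=0.90, p_e_given_not_h=0.05)
print(round(p, 4))
```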
130 Any linear combination of the True FALSE A --
components of a multivariate Gaussian is
a univariate Gaussian.
This sheet
is for 2
Mark
questions
S.r No Question a b c d Correct Image
Answer
e.g 1 Write down question Option a Option b Option c Option d a/b/c/d img.jpg
1 A supervised scenario is characterized by Programmer Teacher Author Farmer B
the concept of a _____.
2 Overlearning is caused by an excessive Capacity Regression Reinforcement Accuracy A
______.
3 If there is only a discrete number of Modelfree Categories Prediction None of above B
possible outcomes, they are called _____.
4 What is the standard approach to split the set of examples into group the set of examples a set of observed learns programs from data A
supervised learning? the training set and the test into the training set and the instances tries to induce a
test general rule
5 Some people are using the term ___ Inference Interference Accuracy None of above A
instead of prediction only to avoid the
weird idea that machine learning is a sort
of modern magic.
6 The term _____ can be freely used, but Accuracy Cluster Regression Prediction D
with the same meaning adopted in
physics or system theory.
7 Which are two techniques of Machine Genetic Programming and Speech recognition and Both A & B None of the Mentioned A
Learning ? Inductive Learning Regression
8 Even if there are no actual supervisors Supervised Reinforcement Unsupervised None of the above B
________ learning is also based on
feedback provided by the environment
9 Common deep learning applications / Real-time visual object Classic approaches Automatic labeling Bio-inspired adaptive B
problems can also be solved using____ identification systems
10 Identify the various approaches for Concept Vs Classification Symbolic Vs Statistical Inductive Vs Analytical All above D
machine learning. Learning Learning Learning
11 What is the function of "Unsupervised Find clusters of the data and Find interesting directions Interesting coordinates All D
Learning"? find low-dimensional in data and find novel and correlations
representations of the data observations/ database
cleaning
12 What are the two methods used for the Platt Calibration and Isotonic Statistics and A
calibration in Supervised Learning? Regression Information Retrieval
13 What is the standard approach to split the set of examples into group the set of examples a set of observed learns programs from data A
supervised learning? the training set and the test into the training set and the instances tries to induce a
test general rule
14 Which of the following is not Machine Artificial Intelligence Rule based inference Both A & B None of the Mentioned B
Learning?
15 What is Model Selection in Machine The process of selecting when a statistical model Find interesting directions All above A
Learning? models among different describes random error or in data and find novel
mathematical models, which noise instead of underlying observations/ database
are used to describe the relationship cleaning
same data set
16 _____ provides some built-in datasets that scikit-learn classification regression None of the above A
can be used for testing purposes.
17 While using _____ all labels are LabelEncoder class LabelBinarizer class DictVectorizer FeatureHasher A
turned into sequential numbers.
18 _______produce sparse matrices of real DictVectorizer FeatureHasher Both A & B None of the Mentioned C
numbers that can be fed into any machine
learning model.
19 scikit-learn offers the class______, which is LabelEncoder LabelBinarizer DictVectorizer Imputer D
responsible for filling the holes using a
strategy based on the mean, median, or
frequency
20 Which of the following scale data by MinMaxScaler MaxAbsScaler Both A & B None of the Mentioned C
removing elements that don't belong to a
given range or by considering a maximum
absolute value.
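Note on the question above: a minimal sketch of the arithmetic behind min-max scaling and max-abs scaling for a single feature column. This is plain Python written for illustration, not scikit-learn's implementation; the function names and the default range [0, 1] are assumptions of the sketch.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Rescale values linearly so the minimum maps to low and the maximum to high."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # Constant column: map everything to the lower bound.
        return [low for _ in values]
    return [low + (v - vmin) * (high - low) / span for v in values]


def max_abs_scale(values):
    """Divide by the largest absolute value so results lie in [-1, 1]."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    return [v / peak for v in values]


print(min_max_scale([1, 2, 3]))
print(max_abs_scale([-4, 2, 1]))
```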
21 Which of the following models MCV MARS MCRS All above B
includes a backward elimination feature
selection routine?
22 Can we extract knowledge without applying YES NO A
feature selection?
23 While using feature selection on the data, NO YES B
does the number of features decrease?
24 Which of the following are several models regression classification None of the above C
for feature extraction
25 scikit-learn also provides a class for Normalizer Imputer Classifier All above A
per-sample normalization, _____
26 ______dataset with many features normalized unnormalized Both A & B None of the Mentioned B
contains information proportional to the
independence of all features and their
variance.
27 In order to assess how much information Concurrent matrix Convergence matrix Supportive matrix Covariance matrix D
is brought by each component, and the
correlation among them, a useful tool is
the_____.
28 The _____ parameter can assume different run start init stop C
values which determine how the data
matrix is initially processed.
29 ______allows exploiting the natural SparsePCA KernelPCA SVD init parameter A
sparsity of data while extracting principal
components.
30 Which of the following is an example of a PCA K-Means None of the above A
deterministic algorithm?
31 Let's say a "Linear regression" model A. You will always have test B. You can not have test C. None of the above c
perfectly fits the training data (train error error zero error zero
is zero). Now, Which of the following
statement is true?
32 In a linear regression problem, we are A. If R Squared increases, B. If R Squared decreases, C. Individually R squared D. None of these. c
using "R-squared" to measure this variable is significant. this variable is not cannot tell about variable
goodness-of-fit. We add a feature in linear significant. importance. We can't say
regression model and retrain the same anything about it right now.
model. Which of the following options is
true?
33 Which of the one is true about A. Linear Regression with B. Linear Regression with C. Linear Regression with D. None of these a
Heteroskedasticity? varying error terms constant error terms zero error terms
34 Which of the following assumptions do A. 1,2 and 3. B. 1,3 and 4. C. 1 and 3. D. All of above. d
we make while deriving linear regression
parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation
4. The predictor x is non-stochastic and is measured error-free
35 To test linear relationship of y(dependent) A. Scatter plot B. Barchart C. Histograms D. None of these a
and x(independent) continuous variables,
which of the following plot best suited?
36 Generally, which of the following A. 1 and 2 B. only 1 C. only 2 D. None of these. b
method(s) is used for predicting
continuous dependent variable?
1. Linear Regression
2. Logistic Regression
37 Suppose you are training a linear A. Both are False B. 1 is False and 2 is True C. 1 is True and 2 is False D. Both are True c
regression model. Now consider these
points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small.
Which of the above statement(s) are correct?
38 Suppose we fit "Lasso Regression" to a A. It is more likely for X1 to B. It is more likely for X1 to C. Can't say D. None of these b
data set, which has 100 features be excluded from the model be included in the model
(X1, X2, ..., X100). Now, we rescale one of
these features by multiplying with 10 (say
that feature is X1), and then refit Lasso
regression with the same regularization
parameter. Now, which of the following
options will be correct?
39 Which of the following is true about A. Ridge regression uses B. Lasso regression uses C. Both use subset D. None of above b
"Ridge" or "Lasso" regression subset selection of features subset selection of features selection of features
methods in case of feature selection?
40 Which of the following statement(s) can A. 1 and 2 B. 1 and 3 C. 2 and 4 D. None of the above a
be true post adding a variable in a linear
regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
41 We can also compute the coefficient of A. 1 and 2 B. 1 and 3. C. 2 and 3. D. 1,2 and 3. d
linear regression with the help of an
analytical method called "Normal
Equation". Which of the following is/are
true about "Normal Equation"?
1. We don't have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate
42 How many coefficients do you need to A. 1 B. 2 C. Can't Say b
estimate in a simple linear regression
model (One independent variable)?
43 If two variables are correlated, is it A. Yes B. No b
necessary that they have a linear
relationship?
44 Correlated variables can have zero A. True B. False a
correlation coefficient. True or False?
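Note on the question above: a minimal sketch showing two variables that are perfectly dependent yet have a Pearson correlation coefficient of zero, because the dependence is not linear. The data (y = x squared on a symmetric range) and the helper name `pearson_r` are chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [-2, -1, 0, 1, 2]

# y is fully determined by x, but the relationship is quadratic.
r_quadratic = pearson_r(xs, [x * x for x in xs])

# A linear relationship for comparison.
r_linear = pearson_r(xs, [2 * x + 1 for x in xs])

print(r_quadratic, r_linear)
```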
45 Which of the following options is true A. The relationship is B. The relationship is not C. The relationship is not D. The relationship is d
regarding "Regression" and symmetric between x and y symmetric between x and y symmetric between x and symmetric between x and y
"Correlation"? Note: y is dependent in both. in both. y in case of correlation but in case of correlation but in
variable and x is independent variable. in case of regression it is case of regression it is not
symmetric. symmetric.
46 What is/are true about kernel in SVM? 1 2 1 and 2 None of these c
1. Kernel functions map low dimensional data to high dimensional space
2. It's a similarity function
47 Suppose you are building a SVM model on Misclassification would Data will be correctly Can't say None of these a
data X. The data X can be error prone happen classified
which means that you should not trust
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its hyper
parameters. What would happen when you
use very small C (C~0)?
48 Suppose you are using a Linear SVM yes no a svm.jpg
classifier with 2 class classification
problem. Now you have been given the
following data in which some points are
circled red that are representing support
vectors. If you remove any one of the
red points from the data, will the
decision boundary change?
49 If you remove the non-red circled points TRUE FALSE b svm.jpg
from the data, the decision boundary will
change?
50 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above a
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
51 Suppose you are building a SVM model on We can still classify data We can not classify data Can't Say None of these a
data X. The data X can be error prone correctly for given setting of correctly for given setting
which means that you should not trust hyper parameter C of hyper parameter C
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its hyper
parameters. What would happen when you
use very large value of C (C -> infinity)?
52 SVM can solve linear and TRUE FALSE a
non-linear problems
53 The objective of the support vector TRUE FALSE a
machine algorithm is to find a hyperplane
in an N-dimensional space (N is the
number of features) that distinctly
classifies the data points.
54 Hyperplanes are _____________boundaries usual decision parallel b
that help classify the data points.
55 The _____of the hyperplane depends upon dimension classification reduction a
the number of features.
56 Hyperplanes are decision boundaries that TRUE FALSE a
help classify the data points.
57 SVM algorithms use a set of TRUE FALSE a
mathematical functions that are defined
as the kernel.
58 In SVM, Kernel function is used to map a TRUE FALSE a
lower dimensional data into a higher
dimensional data.
59 In SVR we try to fit the error within a TRUE FALSE a
certain threshold.
60 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above a
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
61 How do you handle missing or corrupted a. Drop missing rows or b. Replace missing values c. Assign a unique d. All of the above d
data in a dataset? columns with mean/median/mode category to missing values
62 What is the purpose of performing a. To assess the predictive b. To judge how the trained c. Both A and B c
cross-validation? performance of the models model performs outside the
sample on test data
63 Which of the following is true about Naive a. Assumes that all the b. Assumes that all the c. Both A and B d. None of the above option c
Bayes ? features in a dataset are features in a dataset are
equally important independent
64 Which of the following statements about A. Attributes are B. Attributes are C. Attributes are D. Attributes can b
Naive Bayes is incorrect? equally important. statistically dependent of statistically independent of be nominal or numeric
one another given the class one another given the
value. class value.
65 Which of the following PCA Decision Tree Naive Bayesian Linear regression a
is not supervised learning?
66 How can you avoid overfitting ? By using a lot of data By using inductive machine By using validation only None of above A --
learning
67 What are the popular algorithms of Decision Trees and Neural Probabilistic networks and Support vector machines All D --
Machine Learning? Networks (back Nearest Neighbor
propagation)
68 What is "Training set"? Training set is used to test A set of data is used to Both A & B None of above B --
the accuracy of the discover the potentially
hypotheses generated by the predictive relationship.
learner.
69 Identify the various approaches for Concept Vs Classification Symbolic Vs Statistical Inductive Vs Analytical All above D --
machine learning. Learning Learning Learning
70 What is the function of "Unsupervised Find clusters of the data and Find interesting directions Interesting coordinates All D --
Learning"? find low-dimensional in data and find novel and correlations
representations of the data observations/ database
cleaning
71 What are the two methods used for the Platt Calibration and Isotonic Statistics and A --
calibration in Supervised Learning? Regression Information Retrieval
72 ______can be adopted when it's necessary Supervised Semi-supervised Reinforcement Clusters B --
to categorize a large amount of data with
a few complete examples or when there's
the need to impose some constraints to a
clustering algorithm.
73 In reinforcement learning, this feedback is Overfitting Overlearning Reward None of above C --
usually called as___.
74 In the last decade, many researchers Deep learning Machine learning Reinforcement learning Unsupervised learning A --
started training bigger and bigger models,
built with several different layers that's
why this approach is called_____.
75 there's a growing interest in pattern Regression Accuracy Modelfree Scalable C --
recognition and associative memories
whose structure and functioning are
similar to what happens in the neocortex.
Such an approach also allows simpler
algorithms called _____
76 ______ showed better performance than Machine learning Deep learning Reinforcement learning Supervised learning B --
other approaches, even without a
context-based model
77 Common deep learning applications / Real-time visual object Classic approaches Automatic labeling Bio-inspired adaptive B --
problems can also be solved using____ identification systems
78 Some people are using the term ___ Inference Interference Accuracy None of above A --
instead of prediction only to avoid the
weird idea that machine learning is a sort
of modern magic.
79 The term _____ can be freely used, but Accuracy Cluster Regression Prediction D --
with the same meaning adopted in
physics or system theory.
80 If there is only a discrete number of Modelfree Categories Prediction None of above B --
possible outcomes, they are called _____.
81 A feature F1 can take certain values: A, B, Feature F1 is an example of Feature F1 is an example of It doesn't belong to any Both of these B --
C, D, E, & F and represents grade of nominal variable. ordinal variable. of the above category.
students from a college.
Which of the following statement is true in
following case?
82 What would you do in PCA to get the Transform data to zero mean Transform data to zero Not possible None of these A --
same projection as SVD? median
83 What is PCA, KPCA and ICA used for? Principal Components Kernel based Principal Independent Component All above D --
Analysis Component Analysis Analysis
84 Can a model trained for item based YES NO A --
similarity also choose from a given set of
items?
85 What are common feature selection correlation coefficient Greedy algorithms All above None of these C --
methods in regression task?
86 The parameter ______ allows specifying test_size train_size All above None of these C --
the percentage of elements to put into the
test/training set
87 In many classification problems, the random_state dataset test_size All above B --
target ______ is made up of categorical
labels which cannot immediately be
processed by any algorithm.
88 _______adopts a dictionary-oriented LabelEncoder class LabelBinarizer class DictVectorizer FeatureHasher A --
approach, associating to each category
label a progressive integer number.
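Note on the question above: a minimal sketch of the dictionary-oriented idea, mapping each category label to a progressive integer and back. This is plain Python for illustration, not the scikit-learn class itself; the helper names and the choice of sorted label order are assumptions of the sketch.

```python
def fit_label_mapping(labels):
    """Build a label -> integer dictionary from the distinct labels."""
    return {label: index for index, label in enumerate(sorted(set(labels)))}


def encode(labels, mapping):
    """Replace every label by its integer code."""
    return [mapping[label] for label in labels]


def decode(codes, mapping):
    """Invert the mapping to recover the original labels."""
    inverse = {index: label for label, index in mapping.items()}
    return [inverse[code] for code in codes]


labels = ["b", "a", "c", "a"]
mapping = fit_label_mapping(labels)
codes = encode(labels, mapping)
print(mapping, codes, decode(codes, mapping))
```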
89 ________is much more difficult because it's Removing the whole line Creating sub-model to Using an automatic All above B --
necessary to determine a supervised predict those features strategy to input them
strategy to train a model for each feature according to the other
and, finally, to predict their value known values
90 It's possible to use a different regression classification random_state missing_values D --
placeholder through the
parameter _______.
91 If you need a more powerful scaling RobustScaler DictVectorizer LabelBinarizer FeatureHasher A --
feature, with a superior control on outliers
and the possibility to select a quantile
range, there's also the class________.
92 scikit-learn also provides a class for per- max, l0 and l1 norms max, l1 and l2 norms max, l2 and l3 norms max, l3 and l4 norms B --
sample normalization, Normalizer. It can
apply________to each element of a dataset
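Note on the question above: a minimal sketch of per-sample normalization with the max, l1 and l2 norms, applied to one sample (one row) at a time. It is plain Python for illustration rather than the scikit-learn class; the function name `normalize_sample` is an assumption of the sketch.

```python
import math


def normalize_sample(sample, norm="l2"):
    """Divide one sample by its own max, l1 or l2 norm."""
    if norm == "l1":
        scale = sum(abs(v) for v in sample)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in sample))
    elif norm == "max":
        scale = max(abs(v) for v in sample)
    else:
        raise ValueError("norm must be 'l1', 'l2' or 'max'")
    if scale == 0:
        # An all-zero sample is returned unchanged.
        return list(sample)
    return [v / scale for v in sample]


row = [3.0, 4.0]
print(normalize_sample(row, "l2"))
print(normalize_sample(row, "l1"))
print(normalize_sample(row, "max"))
```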
93 There are also many univariate methods F-tests and p-values chi-square ANOVA All above A --
that can be used in order to select the
best features according to specific criteria
based on________.
94 Which of the following selects only a SelectPercentile FeatureHasher SelectKBest All above A --
subset of features belonging to a certain
percentile
95 ________performs a PCA with non-linearly SparsePCA KernelPCA SVD None of the Mentioned B --
separable data sets.
96 If two variables are correlated, is it Yes No B --
necessary that they have a linear
relationship?
97 Correlated variables can have zero TRUE FALSE A --
correlation coefficient. True or False?
98 Suppose we fit "Lasso Regression" to a It is more likely for X1 to be It is more likely for X1 to be Can't say None of these B --
data set, which has 100 features excluded from the model included in the model
(X1, X2, ..., X100). Now, we rescale one of
these features by multiplying with 10 (say
that feature is X1), and then refit Lasso
regression with the same regularization
parameter. Now, which of the following
options will be correct?
99 If a Linear regression model fits perfectly, Test error is also always zero Test error is non zero Couldn't comment on Test error is equal to Train C --
i.e., train error is zero, then Test error error
_____________________
100 Which of the following metrics can be ii and iv i and ii ii, iii and iv i, ii, iii and iv D --
used for evaluating regression models?
i) R Squared
ii) Adjusted R Squared
iii) F Statistics
iv) RMSE / MSE / MAE
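Note on the question above: a minimal sketch computing several of the listed metrics (MSE, RMSE, MAE and R Squared) from true and predicted values. The function name and the sample values are hypothetical; the formulas are the standard definitions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired true/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


metrics = regression_metrics([3, 5, 7], [2, 5, 9])
print(metrics)
```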
101 In syntax of linear model Matrix Vector Array List B --
lm(formula,data,..), data refers to ______
102 Linear Regression is a supervised TRUE FALSE A --
machine learning algorithm.
103 It is possible to design a Linear regression TRUE FALSE A --
algorithm using a neural network?
104 Which of the following methods do we Least Square Error Maximum Likelihood Logarithmic Loss Both A and B A --
use to find the best fit line for data in
Linear Regression?
105 Suppose you are training a linear Both are False 1 is False and 2 is True 1 is True and 2 is False Both are True C --
regression model. Now consider these
points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small.
Which of the above statement(s) are correct?
106 We can also compute the coefficient of 1 and 2 1 and 3. 2 and 3. 1,2 and 3. D --
linear regression with the help of an
analytical method called "Normal
Equation". Which of the following is/are
true about "Normal Equation"?
1. We don't have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate
107 Which of the following options is true The relationship is The relationship is not The relationship is not The relationship is D --
regarding "Regression" and symmetric between x and y symmetric between x and y symmetric between x and symmetric between x and y
"Correlation"? Note: y is dependent in both. in both. y in case of correlation but in case of correlation but in
variable and x is independent variable. in case of regression it is case of regression it is not
symmetric. symmetric.
108 In a simple linear regression model (One by 1 no change by intercept by its slope D --
independent variable), if we change the
input variable by 1 unit, how much will
the output variable change?
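Note on the question above: a minimal sketch of a least-squares fit for simple linear regression, y = intercept + slope * x. It estimates the two coefficients and shows that raising x by one unit changes the prediction by exactly the slope. The function name and the data points are hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Least-squares estimates (intercept, slope) for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_simple_linear([1, 2, 3, 4], [3, 5, 7, 9])


def predict(x):
    return intercept + slope * x


# Change in the output when the input grows by one unit.
change = predict(11) - predict(10)
print(intercept, slope, change)
```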
109 Generally, which of the following 1 and 2 only 1 only 2 None of these. B --
method(s) is used for predicting
continuous dependent variable?
1. Linear Regression
2. Logistic Regression
110 How many coefficients do you need to 1 2 3 4 B --
estimate in a simple linear regression
model (One independent variable)?
111 Suppose you are building a SVM model on We can still classify data We can not classify data Can't Say None of these A --
data X. The data X can be error prone correctly for given setting of correctly for given setting
which means that you should not trust hyper parameter C of hyper parameter C
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its hyper
parameters. What would happen when you
use very large value of C (C -> infinity)?
112 SVM can solve linear and TRUE FALSE A --
non-linear problems
113 The objective of the support vector TRUE FALSE A --
machine algorithm is to find a hyperplane
in an N-dimensional space (N is the
number of features) that distinctly
classifies the data points.
114 Hyperplanes are _____________boundaries usual decision parallel B --
that help classify the data points.
115 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above A --
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
116 SVM is a ------------------ learning Supervised Unsupervised Both None A --
117 The linear SVM classifier works by True FALSE A --
drawing a straight line between two
classes
118 In a real problem, you should check to see TRUE FALSE B --
if the SVM is separable and then include
slack variables if it is not separable.
119 Which of the following are real world Text and Hypertext Image Classification Clustering of News All of the above D --
applications of the SVM? Categorization Articles
120 The _____of the hyperplane depends upon dimension classification reduction A --
the number of features.
121 Hyperplanes are decision boundaries that TRUE FALSE A --
help classify the data points.
122 SVM algorithms use a set of TRUE FALSE A --
mathematical functions that are defined
as the kernel.
123 Naive Bayes classifiers are a collection Classification Clustering Regression All A --
------------------ of algorithms
124 In given image, P(H|E) Posterior Prior A bayes.jpg
is__________probability.
125 Solving a non linear separation problem True FALSE A
with a hard margin Kernelized SVM
(Gaussian RBF Kernel) might lead to
overfitting
126 100 people are at party. Given data gives TRUE FALSE A man.jpg
information about how many wear pink or
not, and if a man or not. Imagine a pink
wearing guest leaves, was it a man?
127 For the given weather data, Calculate 0.4 0.64 0.29 0.75 B weather
probability of playing data.jpg
128 In SVM, Kernel function is used to map a TRUE FALSE A --
lower dimensional data into a higher
dimensional data.
129 In SVR we try to fit the error within a TRUE FALSE A --
certain threshold.
130 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above A --
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
This sheet is for 3 Mark questions
Sr. No | Question | a | b | c | d | Correct Answer | Image
e.g. 1 | Write down question | Option a | Option b | Option c | Option d | a/b/c/d | img.jpg
1 | Which of the following is characteristic of best machine learning method? | fast | accuracy | scalable | All above | D |
2 | What are the different Algorithm techniques in Machine Learning? | Supervised Learning and Semi-supervised Learning | Unsupervised Learning and Transduction | Both A & B | None of the Mentioned | C |
3 | ______ can be adopted when it's necessary to categorize a large amount of data with a few complete examples or when there's the need to impose some constraints to a clustering algorithm. | Supervised | Semi-supervised | Reinforcement | Clusters | B |
4 | In reinforcement learning, this feedback is usually called as ___. | Overfitting | Overlearning | Reward | None of above | C |
5 | In the last decade, many researchers started training bigger and bigger models, built with several different layers; that's why this approach is called _____. | Deep learning | Machine learning | Reinforcement learning | Unsupervised learning | A |
6 | What does learning exactly mean? | Robots are programmed so that they can perform the task based on data they gather from sensors. | A set of data is used to discover the potentially predictive relationship. | Learning is the ability to change according to external stimuli and remembering most of all previous experiences. | It is a set of data used to discover the potentially predictive relationship. | C |
7 | When it is necessary to allow the model to develop a generalization ability and avoid a common problem called ______. | Overfitting | Overlearning | Classification | Regression | A |
8 | Techniques that involve the usage of both labeled and unlabeled data are called ___. | Supervised | Semi-supervised | Unsupervised | None of the above | B |
9 | There's a growing interest in pattern recognition and associative memories whose structure and functioning are similar to what happens in the neocortex. Such an approach also allows simpler algorithms called _____ | Regression | Accuracy | Model-free | Scalable | C |
10 | ______ showed better performance than other approaches, even without a context-based model | Machine learning | Deep learning | Reinforcement learning | Supervised learning | B |
11 | Which of the following sentences is correct? | Machine learning relates with the study, design and development of the algorithms that give computers the capability to learn without being explicitly programmed. | Data mining can be defined as the process in which the unstructured data tries to extract knowledge or unknown interesting patterns. | Both A & B | None of the above | C | --
12 | What is "Overfitting" in Machine learning? | When a statistical model describes random error or noise instead of the underlying relationship, "overfitting" occurs. | Robots are programmed so that they can perform the task based on data they gather from sensors. | While involving the process of learning "overfitting" occurs. | A set of data is used to discover the potentially predictive relationship | A | --
13 | What is "Test set"? | Test set is used to test the accuracy of the hypotheses generated by the learner. | It is a set of data used to discover the potentially predictive relationship. | Both A & B | None of above | A | --
14 | What is the function of "Supervised Learning"? | Classifications, Predict time series, Annotate strings | Speech recognition, Regression | Both A & B | None of above | C | --
15 | Common unsupervised applications include | Object segmentation | Similarity detection | Automatic labeling | All above | D | --
16 | Reinforcement learning is particularly efficient when ______________. | the environment is not completely deterministic | it's often very dynamic | it's impossible to have a precise error measure | All above | D | --
17 | During the last few years, many ______ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state. | Logical | Classical | Classification | None of above | D | --
18 | Common deep learning applications include ____ | Image classification, Real-time visual tracking | Autonomous car driving, Logistic optimization | Bioinformatics, Speech recognition | All above | D | --
19 | If there is only a discrete number of possible outcomes (called categories), the process becomes a ______. | Regression | Classification | Model-free | Categories | B | --
20 | Which of the following are supervised learning applications | Spam detection, Pattern detection, Natural Language Processing | Image classification, Real-time visual tracking | Autonomous car driving, Logistic optimization | Bioinformatics, Speech recognition | A | --
21 | Let's say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. | All categories of categorical variable are not present in the test dataset. | Frequency distribution of categories is different in train as compared to the test dataset. | Train and Test always have same distribution. | Both A and B | D | --
a. Supervised Learning
b. Unsupervised Learning
c. Semi-supervised Learning
d. Reinforcement Learning
4. Supervised Learning algorithms are accompanied by both Input and Expected Output?
a. True- answer
b. False
a. Regression- answer
b. Classification
c. Association Rule mining
d. All of these
8. k-NN algorithm does more computation on ‘test’ time rather than ‘train’ time.
a. True- answer
b. False
a. Manhattan
b. Minkowski
c. Jaccard
d. Mahalanobis
e. All can be used- answer
10. Which of the following machine learning algorithm can be used for imputing missing values
of both categorical and continuous variables?
a. K-NN- answer
b. Linear Regression
c. Logistic Regression
d. Decision Tree
11. Which of the following algorithms is NOT an example of an ensemble learning algorithm?
a. Random Forest
b. Adaboost
c. Gradient Boosting
d. Decision Trees- answer
a. Semi-supervised learning.
b. Supervised Learning
c. Unsupervised Learning
d. All of these
14. Unsupervised Learning algorithms are accompanied by both Input and Expected Output?
a. True
b. False (Only Input) - answer
a. Clustering- answer
b. Classification
c. Regression
d. Association
a. Centroid-based Clustering
b. Density-based Clustering
c. Hierarchical Clustering
d. All of the above- answer
17. Learning algorithms that use both labelled and unlabelled data can be categorised as
a. Supervised Algorithms
b. Unsupervised Algorithms
c. Semi-supervised Algorithms- answer
d. Reinforcement Learning
a. True- answer
b. False
19. When the number of output classes is greater than two, which of the following strategies
can be used to handle them?
a. One-vs-All
b. One-vs-One
c. Both of them- answer
d. None of the above
20. In One-vs-All strategy how many classifiers are trained for n classes
a. 1
b. n- answer
c. n/2
d. None of the above
21. In One-vs-One strategy how many classifiers are trained for n classes
a. 1
b. n
c. n*(n-1)/2- answer
d. n/2
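The classifier counts in questions 20 and 21 can be checked with a short sketch. The helper names below are illustrative assumptions, not part of any library; One-vs-One trains one binary classifier per unordered pair of classes.

```python
from itertools import combinations


def one_vs_all_count(n_classes: int) -> int:
    # One binary classifier per class: "this class" versus "everything else".
    return n_classes


def one_vs_one_count(n_classes: int) -> int:
    # One binary classifier per unordered pair of classes.
    return n_classes * (n_classes - 1) // 2


# Enumerate the pairs explicitly for 4 classes to confirm the formula.
pairs = list(combinations(range(4), 2))
ova_for_4 = one_vs_all_count(4)
ovo_for_4 = one_vs_one_count(4)
```

For 4 classes this gives 4 One-vs-All classifiers and 6 One-vs-One classifiers.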
22. When the model isn't able to capture the dynamics shown by the same training set, such a
situation is called
a. Underfitting- answer
b. Overfitting
c. Normal fitting
d. Regularization
23. When the model can associate almost perfectly all the known samples to the corresponding
output values, but when an unknown input is presented, the corresponding prediction error
can be very high, such situation is called as
a. Underfitting
b. Overfitting- answer
c. Normal fitting
d. None of these
a. Information Gain
b. Entropy- answer
c. Probability of an event
d. None of the above
a. Logistic Regression
b. Naïve Bayes
c. K-Nearest Neighbors- answer
d. Simple Neural Networks
28. Which of the following is not among the factors that affect the performance of a learner system?
a. Adaptive system
b. Non-adaptive system- answer
c. Both
d. None of the above
a. K-Nearest Neighbor
b. Decision Tree
c. K-means- answer
d. Linear Regression
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. All of the above- answer
32. In which type of Learning, both features and labels are given to an algorithm?
33. In which type of learning, the algorithm maps input variable to output variable?
a. Classification
b. Regression
c. Clustering- answer
d. None of the above
a. Classification- answer
b. Clustering
c. Regression
d. Association
a. Classification
b. Clustering
c. Regression- answer
d. Association
37. In which learning technique, the system discovers patterns from dataset?
a. Supervised Learning
b. Unsupervised Learning- answer
c. Reinforcement Learning
d. None of the above
38. In which type of learning, the problem can be solved without knowing labels?
a. Supervised Learning
b. Unsupervised Learning- answer
c. All of the above
d. None of the above
a. Clustering- answer
b. Association
c. Regression
d. None of the above
a. Clustering
b. Association- answer
c. Regression
d. None of the above
41. From the following, which is best suited to build a game of chess?
a. Supervised Learning
b. Unsupervised Learning
c. Deep Learning- answer
d. None of the above
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning- answer
d. None of the above
a. Supervised Learning
b. Unsupervised Learning- answer
c. Reinforcement Learning
d. None of the above
44. From the options, which application you should solve by deep learning for the best
performance?
a. Spam filtering
b. Image classification- answer
c. Sales prediction
d. Automatic labelling
45. A neural network model is said to be inspired by the human brain. Which of the following
statement(s) correctly represents a real neuron?
a. Underfitting
b. Overfitting- answer
c. Both
d. Not a and b
a. Underfitting- answer
b. Overfitting
c. Both
d. None of the above
Unit 1: Two marks questions
1. The goal(s) of the supervised learning system is (are) ___________
a. Training a system that must also work with samples never seen before.
b. To allow the model to develop a generalization ability and avoid a common problem
called over fitting
c. Supervisor: to provide the agent with a precise measure of its error
d. All of the above- answer
a. Reinforcement learning
b. Supervised learning- answer
c. Un supervised learning
d. Semi supervised learning
6. Identify the type of Machine learning approach to solve the given problems:
Decision Support System to predict the decision to play Match or not to play
a. Reinforcement learning
b. Supervised learning- answer
c. Un supervised learning
d. Semi supervised learning
7. Identify the type of Machine learning approach to solve the given problems:
Grouping of documents retrieved by Google Search Engine
a. Reinforcement learning
b. Supervised learning
c. Un supervised learning- answer
d. Semi supervised learning
8. Identify the type of Machine learning approach to solve the given problems:
a. Reinforcement learning
b. Supervised learning- answer
c. Unsupervised learning
d. Semi supervised learning
9. Identify the type of Machine learning approach to solve the given problems:
System to predict the suitable treatment
a. Reinforcement learning
b. Supervised learning
c. Un supervised learning
d. Semi supervised learning
10. Identify the type of Machine learning approach to solve the given problems:
System for Driverless Car
a. Reinforcement learning- answer
b. Supervised learning
c. Unsupervised learning
d. Semi supervised learning
1. For creating Training and Test datasets which statements are true?
a. Both datasets must reflect the original distribution
b. The original dataset must be randomly shuffled before the split phase in order to avoid
correlation between consequent elements
c. Both a and b - answer
d. None of the above
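Statement b (shuffle before splitting) can be illustrated with a minimal pure-Python sketch. The function name `shuffled_split`, the rounding rule, and the default of 0.25 are assumptions made for this example; it does not stratify, so it does not by itself guarantee statement a.

```python
import random


def shuffled_split(samples, test_size=0.25, seed=0):
    """Shuffle a copy of the data, then cut it into train and test parts."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)              # breaks any ordering between consecutive rows
    n_test = max(1, round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


train, test = shuffled_split(range(8), test_size=0.25, seed=0)
```

With 8 samples and a test fraction of 0.25, 6 samples go to training and 2 to testing, and every sample lands in exactly one of the two parts.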
4. The scikit-learn class Imputer (SimpleImputer in current releases) fills missing values using a strategy based on the:
a. mean
b. median
c. frequency (the most frequent entry)
d. All of the above- answer
10. Consider Q1=31 and Q3=119. The inter quartile range (IQR) will be______
a. 88 - answer
b. -88
c. 150
d. -150
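The IQR is simply Q3 minus Q1, as the sketch below shows. The 1.5 x IQR outlier fences are a common convention added here for illustration and are not part of the question.

```python
def interquartile_range(q1: float, q3: float) -> float:
    # IQR is the width of the middle 50% of the data.
    return q3 - q1


q1, q3 = 31, 119
iqr = interquartile_range(q1, q3)

# Conventional outlier fences (an assumption for illustration).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
```

Because Q3 is never smaller than Q1, the IQR cannot be negative, which rules out options b and d.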
MCQs on unit 2 (One mark question)
1) Which of the following contains train_test_split() function
A) sklearn.feature_extraction
B) sklearn.preprocessing
C) sklearn.model_selection- answer
D) sklearn.decomposition
2) Default value of test_size in train_test_split() when both test_size and train_size are None
A) 0.33
B) 0.25 - answer
C) 0.50
D) 0.20
A) Dictionary-oriented- answer
B) List-oriented
C) Tree-oriented
D) Map-oriented
A) SHA256
B) MD5
C) MurmurHash 3- answer
D) BLAKE3
6) When performing regression or classification, which of the following is the correct way to
preprocess the data?
10) Principal component analysis is a method to select only a subset of features which contain
the largest amount of?
A) Total covariance
B) Total variance - answer
C) Total count
D) Mean
11) In the following loss function which parameter controls the level of sparsity?
A) xi
B) c - answer
C) D
D) αi
12) Which parameter determines the number of atoms in the scikit-learn DictionaryLearning class?
A) alpha
B) n_jobs
C) n_components - answer
D) tol
14) Non negative matrix factorization algorithm optimizes a loss function based on?
A) L1 Norm
B) Frobenius norm - answer
C) linalgnorm
D) matrix norm
15) Which of the following encoding technique is efficient to deal with large number of possible
categories?
A) Effect Encoding
B) Feature Hashing
C) One Hot Encoding
D) Bin counting scheme - answer
16) Which scaling technique scales data without being affected by outliers?
A) Filter Methods
B) Wrapper Methods - answer
C) Embedded Methods
D) Subset Methods
18) From the following which can be applied on dataset with more than one dimension?
A) Mean
B) Standard Deviation
C) Covariance - answer
D) Variance
19) In principal component analysis the sparse loadings can be obtained by imposing which
constraint on regression coefficients:
A) Ridge
B) Lasso - answer
C) Linear
D) Logistic
21) The eigenvector with the ____ eigenvalue is the principal component of the dataset.
A) Lowest
B) Highest - answer
C) Mean
D) Zero
22) The trace of a matrix is equal to the ___ of its eigenvalues.
A) Difference
B) Sum - answer
C) Product
D) Mean
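A quick numeric check of question 22, as a sketch for a 2x2 matrix with real eigenvalues (the helper name and the example matrix are assumptions chosen for illustration):

```python
import math


def eigenvalues_2x2(a: float, b: float, c: float, d: float):
    """Eigenvalues of [[a, b], [c, d]], assuming they are real."""
    trace = a + d
    det = a * d - b * c
    disc = math.sqrt(trace * trace - 4 * det)   # real for symmetric matrices
    return (trace - disc) / 2, (trace + disc) / 2


# Example symmetric matrix [[2, 1], [1, 2]].
low, high = eigenvalues_2x2(2, 1, 1, 2)
trace = 2 + 2
determinant = 2 * 2 - 1 * 1
```

Here the eigenvalues are 1 and 3: their sum matches the trace (4) and their product matches the determinant (3).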
23) In which scaling technique can the upper and lower bounds be specified by the user?
A) Robust Scaling
B) Min Max Scaling - answer
C) Standardized Scaling
D) Z-score Scaling
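A minimal sketch of min-max scaling with a user-chosen range. The function below is a hand-written illustration; the parameter name `feature_range` mirrors the one used by scikit-learn's MinMaxScaler, but this is not that class.

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Linearly map values so min(values) -> lower bound and max(values) -> upper bound."""
    lower, upper = feature_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # All values identical: map everything to the lower bound.
        return [lower for _ in values]
    return [lower + (v - v_min) * (upper - lower) / span for v in values]


scaled = min_max_scale([10, 20, 30], feature_range=(-1.0, 1.0))
```

Because the bounds come from the observed minimum and maximum, a single extreme outlier compresses all other values, which is why this technique is not robust to outliers.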
24) Principal component analysis (PCA) can be used with variables of any mathematical types:
quantitative, qualitative, or a mixture of these types.
A) True
B) False - answer
25) Variances and covariances can be computed for variables of any mathematical types:
quantitative, qualitative, or a mixture of these types.
A) True
B) False - answer
Unit- 3: Regression (One mark)
1. A process by which we estimate the value of dependent variable on the basis of one or more
independent variables is called:
a. Correlation
b. Regression - answer
c. Residual
d. Slope
2. All data points falling along a straight line is called:
a. Linear relationship - answer
b. Non linear relationship
c. Residual
d. Scatter diagram
3. A relationship where the flow of the data points is best represented by a curve is called:
a. Linear relationship
b. Nonlinear relationship - answer
c. Linear positive
d. Linear negative
4. The value we would predict for the dependent variable when the independent variables are all
equal to zero is called:
(a) Slope
(b) Sum of residual
(c) Intercept - answer
(d) Difficult to tell
5. The predicted rate of response of the dependent variable to changes in the independent variable is
called:
(a) Slope - answer
(b) Intercept
(c) Error
(d) Regression equation
6. The slope of the regression line of Y on X is also called the:
(a) Correlation coefficient of X on Y
(b) Correlation coefficient of Y on X
(c) Regression coefficient of X on Y
(d) Regression coefficient of Y on X - answer
8. In simple linear regression, the number of unknown constants is:
(a) One
(b) Two - answer
(c) Three
(d) Four
9. In a simple regression equation, the number of variables involved is:
(a) 0
(b) 1
(c) 2 - answer
(d) 3
10. If the value of any regression coefficient is zero, then two variables are:
(a) Qualitative
(b) Correlation
(c) Dependent
(d) Independent- answer
11. In scikit-learn, a fitted LinearRegression exposes two instance variables, __________ and ____________
a) intercept_ and coef_ - answer
b) Intercept and coef
c) Slope and Intercept
d) Slope and Coef
12. _________ regression imposes an additional shrinkage penalty to the ordinary least squares loss
function to limit its squared L2 norm:
a) Lasso
b) LassoCV
c) Ridge - answer
d) ElasticNet
13. _____________ regressor imposes a penalty on the L1 norm of w to determine a potentially
higher number of null coefficients:
a) Lasso - answer
b) RidgeCV
c) Ridge
d) ElasticNet
14. A Regression approach to avoid the problem of outliers is offered by _______________
a) Linear Regression
b) Logistic Regression
c) RANSAC Regressor - answer
d) Polynomial Regressor
16. ________ occurs when our model neither fits the training data nor generalizes on the new data.
a) Over-fitting
b) Under-fitting - answer
c) Best fitting
d) None of the above
17. ________________ is the process of adding information in order to solve an ill-posed problem
or to prevent overfitting
a) Under-fitting
b) Regularization - answer
c) Best fitting
d) None of the above
18. ____________ selects only some features while reducing the coefficients of the others to zero.
This property is known as feature selection
a) Lasso - answer
b) RidgeCV
c) Ridge
d) ElasticNet
19. ______ combines both Lasso and Ridge Regression into one model with two penalty factors, one
proportional to L1 norm and other proportional to L2 norm.
a) LassoCV
b) RidgeCV
c) ElasticNet - answer
d) None of the above
20. ____________minimizes the cost function by gradually updating the weight values.
a) Linear Regression
b) Logistic Regression
c) RANSAC Regressor
d) Polynomial Regressor - answer
22. The Regression technique that uses sigmoid function is called________________
a) Linear Regression
b) Logistic Regression - answer
c) RANSAC Regressor
d) Polynomial Regressor
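The sigmoid used by logistic regression can be sketched as follows. The two-branch form is a common way to avoid overflow for large negative inputs; the weight and bias values are hypothetical and chosen only for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function: maps any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)          # safe: z < 0, so exp(z) cannot overflow
    return e / (1.0 + e)


def predict_proba(x: float, weight: float, bias: float) -> float:
    # Logistic regression: sigmoid of a linear function of the input.
    return sigmoid(weight * x + bias)


p_at_boundary = predict_proba(x=2.0, weight=1.5, bias=-3.0)  # linear part is 0
```

A linear score of zero corresponds to a probability of 0.5, which is the usual decision threshold.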
23. Confusion Matrix can be used to measure the performance of _______________ model.
a) Linear Regression
b) Logistic Regression - answer
c) RANSAC Regressor
d) Polynomial Regressor
24. The residual is defined as the difference between the:
a) actual value of y and the estimated value of y - answer
b) actual value of x and the estimated value of x
c) actual value of y and the estimated value of x
d) actual value of x and the estimated value of y
25)Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Answer:(A)
26)True- False: Overfitting is more likely when you have a huge amount of data to train.
A) TRUE
B) FALSE
Solution: (B)
27) What will happen when you apply very large penalty in the case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
28) Generally, which of the following method(s) is used for predicting continuous dependent
variable?
1. Linear Regression 2. Logistic Regression
A) 1 and 2
B)only 1
C)only 2
D)None of these
Solution:(B)
31)Which is L1 regression
A)Lasso
B)Ridge
C)polynomial
D)Isotonic
Answer A
32)Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature
selection?
A) Ridge regression uses subset selection of features
B)Lasso regression uses subset selection of features
C)Both use subset selection of features
D)None of the above
Solution:(B)
35) Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
36) What do you expect will happen with bias and variance as you increase the size of training
data?
A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
Solution: (D)
37) The Pearson correlation between two variables can be zero even though their values are still
related to each other.
A) TRUE
B) FALSE
Solution: (A)
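A small sketch of why the answer is TRUE: with y = x^2 on inputs symmetric around zero, y is fully determined by x, yet the linear (Pearson) correlation is zero. The data below is an illustrative assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]        # perfectly dependent on x, but not linearly
r = pearson(xs, ys)
```

Pearson correlation measures only linear association, so a purely quadratic relationship can go undetected.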
38) Which of the following statement(s) is / are true for Gradient Descent (GD) and Stochastic
Gradient Descent (SGD)?
1. In GD and SGD, you update a set of parameters in an iterative manner to minimize the
error function.
2. In SGD, you have to run through all the samples in your training set for a single update of
a parameter in each iteration.
3. In GD, you either use the entire data or a subset of training data to update a parameter in
each iteration.
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
Solution:(A)
39) When hypothesis tests and confidence limits are to be used, the residuals are assumed
to follow the __________distribution.
A) Formal
B) Mutual
C) Normal
D) Abnormal
Solution:(C)
40)The error due to simplistic assumptions made by the model in fitting the data is called as
A)variance
B)bias
C)MSE
D)none of these
Solution:(B)
43) Least square method calculates the best-fitting line for the observed data by minimizing the sum
of the squares of the _______ deviations.
a) Vertical
b) Horizontal
c) Both of these
d) None of these
Solution:(A)
Unit-3 (Two marks)
1. The regression line yhat = 3 + 2x has been fitted to the data points (4,8), (2,5), and (1,2). The
residual sum of squares will be:
a) 10
b) 15
c) 13
d) 22 - answer
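The arithmetic behind question 1 can be written out as a short sketch (the function and variable names are illustrative):

```python
def predict(x: float) -> float:
    # Fitted line from the question: yhat = 3 + 2x
    return 3 + 2 * x


points = [(4, 8), (2, 5), (1, 2)]

# Residual = actual y minus predicted y, for each data point.
residuals = [y - predict(x) for x, y in points]

# Residual sum of squares.
rss = sum(r ** 2 for r in residuals)
```

The residuals are -3, -2 and -3, so the residual sum of squares is 9 + 4 + 9 = 22.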
2. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs the
prediction h_θ(x) = 0.2. This means
a. Our estimate for P(y=1 | x)
b. Our estimate for P(y=0 | x) - answer
c. Our estimate for P(y=1 | x)
d. Our estimate for P(y=0 | x)
3. A regression analysis between sales (in $1000) and advertising (in $100) resulted in the following
least squares line: yhat = 75 +6x. This implies that if advertising is $800, then the predicted amount
of sales (in dollars) is:
a. $4875
b. $123,000 - answer
c. $487,500
d. $12,300
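The unit conversions in question 3 are easy to get wrong, so here is a worked sketch. It assumes, as the question states, that x is advertising in units of $100 and yhat is sales in units of $1000; the function name is illustrative.

```python
def predicted_sales_dollars(advertising_dollars: float) -> float:
    x = advertising_dollars / 100      # advertising is measured in $100 units
    y_hat = 75 + 6 * x                 # fitted line; result is in $1000 units
    return y_hat * 1000                # convert back to dollars


sales = predicted_sales_dollars(800)
```

Advertising of $800 gives x = 8, so yhat = 75 + 48 = 123, which is $123,000 in sales.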
4. The value for SSE equals zero. This means that the coefficient of determination (r^2) must equal:
a. 0.0.
b. -1.0.
c. 2.3.
d. 1.0 - answer
a) 12.58 - answer
b) 10.58
c) 11.85
d) 10.85
7. For the given results of a recently conducted study on the correlation of the number of hours spent
driving with the risk of developing acute backache. The slope of the line is_______.
a) 4.59 - answer
b) 10.58
c) 5.85
d) 10.85
8. For the given vectors of true and predicted outputs, the mean squared error is ________.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
a) 0.45
b) 0.375 - answer
c) 0.56
d) None of the above
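The value in question 8 can be reproduced with a few lines of plain Python (a sketch; the helper name is an assumption):

```python
def mean_squared_error(y_true, y_pred):
    # Average of the squared differences between true and predicted values.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
mse = mean_squared_error(y_true, y_pred)
```

The squared errors are 0.25, 0.25, 0 and 1, which sum to 1.5; dividing by 4 gives 0.375.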
9) The correct relationship between SST, SSR, and SSE is given by:
a) SSR = SST + SSE
b) SST = SSR + SSE
c) SSE = SSR – SST
d) all of the above
Solution:(B)
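The identity SST = SSR + SSE holds for a least squares line fitted with an intercept. The sketch below checks it numerically on a small made-up dataset and also shows that SSE = 0 implies r^2 = 1; the data and helper names are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sums_of_squares(xs, ys):
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    preds = [intercept + slope * x for x in xs]
    sst = sum((y - mean_y) ** 2 for y in ys)               # total
    ssr = sum((p - mean_y) ** 2 for p in preds)            # explained by regression
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))     # residual error
    return sst, ssr, sse


sst, ssr, sse = sums_of_squares([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
r_squared = 1 - sse / sst

# Points lying exactly on a line: SSE is zero, so r^2 is 1.
sst_p, ssr_p, sse_p = sums_of_squares([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_squared_perfect = 1 - sse_p / sst_p
```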
10)Stochastic gradient descent performs less computation per update than batch gradient descent.
A)True
B)False
Solution:(A)
11)A parameter that is external to model and whose value cannot be estimated from data is called as
A)Hyperparameter
B)Model Parameter
C)Outlier
D)Regularization constant
Solution:(A)
14)The most widely used metrics and tools to assess a classification model are:
A)Confusion matrix
B)Cost-sensitive accuracy
C)Area under the ROC curve
D)All of the above
Solution:(D)
16) In practice, Line of best fit or regression line is found when _____________
a) Sum of residuals (Σ(Y - h(X))) is minimum
b) Sum of the absolute value of residuals (Σ|Y - h(X)|) is maximum
c) Sum of the square of residuals (Σ(Y - h(X))^2) is minimum
d) Sum of the square of residuals (Σ(Y - h(X))^2) is maximum
Solution:(C)
Unit- 4 : Naïve Bayes and SVM
(one mark)
1. Naive bayes falls under which category-
a. Unsupervised classification learning
b. Supervised classification learning
c. Semi- supervised classification learning
d. Reinforcement learning
Ans - b
2. What machine learning task is the Naive Bayes algorithm used for?
a. dimensionality reduction
b. clustering
c. classification
d. regression
Ans - c
3. Naive Bayes assumption about data is-
a. input is independent, conditional on the output label.
b. input is dependent, conditional on the output label.
c. input is independent, not conditional on the output label.
d. input is dependent, not conditional on the output label.
Ans - a
4. Bayes rule:
a. P(A |B) = P(B|A) .P(B) / P(A)
b. P(A |B) = P(B|A) .P(A) / P(B)
c. P(A |B) = P(B|A) .P(A)
d. P(A |B) = P(B|A) .P(B)
Ans - b
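Bayes' rule from option b can be applied directly in code. The spam-filter probabilities below are hypothetical numbers chosen only to illustrate the formula.

```python
def posterior(prior_a: float, likelihood_b_given_a: float, evidence_b: float) -> float:
    # Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical example: A = "message is spam", B = "message contains a given word".
p_spam = 0.2
p_word_given_spam = 0.5
p_word_given_ham = 0.05

# Law of total probability gives the evidence P(B).
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = posterior(p_spam, p_word_given_spam, p_word)
```

With these numbers P(word) = 0.14, so the posterior is 0.10 / 0.14, roughly 0.71.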
7. Which type of naive bayes classifier is best suited for document classification problem -
a. Bernoulli naive bayes
b. Multinomial naive bayes
c. Gaussian naive bayes
d. Complement Naive bayes
Ans - b
8. Which type of naive bayes classifier is usually used for yes/no type boolean predictors-
a. Bernoulli naive bayes
b. Multinomial naive bayes
c. Gaussian naive bayes
d. Complement Naive bayes
Ans - a
(Two marks)
1. One marble jar has several different colored marbles inside of it. It has 1 red, 2 green, 4 blue, and
8 yellow marbles. All the marbles are the same size and shape. If Peter takes out a marble from the
jar without looking, what is the probability that he will NOT choose a yellow marble.
a. 7/15
b. 8/15
c. 7/8
d. 5/8
Ans- a
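The marble calculation can be verified with exact fractions (a short sketch; variable names are illustrative):

```python
from fractions import Fraction

marbles = {"red": 1, "green": 2, "blue": 4, "yellow": 8}
total = sum(marbles.values())

# P(not yellow) = 1 - P(yellow), using exact rational arithmetic.
p_yellow = Fraction(marbles["yellow"], total)
p_not_yellow = 1 - p_yellow
```

There are 15 marbles in total and 7 of them are not yellow, giving 7/15.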
2. If we train a Naive Bayes classifier using infinite training data that satisfies all of its modeling
assumptions , then in general, what can we say about the training error (error in training data) and
test error (error in held-out test data)?
a. It may not achieve either zero training error or zero test error
b. It will always achieve zero training error and zero test error.
c. It will always achieve zero training error but may not achieve zero test error.
d. It may not achieve zero training error but will always achieve zero test error.
Ans - a
12. Which of the following are real world applications of the SVM?
a. Text and Hypertext Categorization
b. Image Classification
c. Clustering of News Articles
d. All of the above
Ans- d
4. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a. Decision Tree
b. Regression
c. Classification
d. Random Forest - answer
5. In the given formula of the Decision Tree family, what do A and D represent?
Gain(A) = Cross_Entropy(D) – EntropyA(D)
a. Attribute, Decision
b. Attribute, Dataset- answer
c. Probability, Dataset
d. None of the above
6. In the given formula of the Decision Tree family, which of the given statements are true?
Gain(A) = Cross_Entropy(D) – EntropyA(D)
7. A _________ is a decision support tool that uses a tree-like graph or model of decisions and
their possible consequences, including chance event outcomes, resource costs, and utility.
a. Decision tree- answer
b. Graphs
c. Trees
d. Neural Networks
10. The most widely used metrics and tools to assess a classification model are:
a. Confusion matrix
b. Cost-sensitive accuracy
c. Area under the ROC curve
d. All of the above - answer
11. Which of the following is a good test dataset characteristic?
a. Large enough to yield meaningful results
b. Is representative of the dataset as a whole
c. Both A and B - answer
d. None of the above
12. Which of the following is a disadvantage of decision trees?
a. Factor analysis
b. Decision trees are robust to outliers
c. Decision trees are prone to be overfit - answer
d. None of the above
13. What is the purpose of performing cross-validation?
a. To assess the predictive performance of the models
b. To judge how the trained model performs outside the sample on test data
c. Both A and B – answer
d. None of the above
2.Bagging is the method for improving the performance by aggregating the results of weak
learners
A) 1
B) 2
C) 1 and 2- answer
D) None of these
A) 1
B) 2- answer
C) 1 and 2
D) None of these
16. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees- answer
17. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1- answer
B) 2
C) 1 and 2
D) None of these
18. True-False: Bagging is suitable for high variance, low bias models?
A) TRUE- answer
B) FALSE
19. In which of the following scenario a gain ratio is preferred over Information Gain?
20. In K-means clustering, the distance between each sample and each centroid is computed and the
sample is assigned to the cluster where the distance is minimum. This approach is often called ----
3)The algorithm stops when the centroids become stable and, therefore, the inertia is minimized
22. [True or False] k-NN algorithm does more computation on test time rather than train
time.
A) TRUE - answer
B) FALSE
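A toy k-NN classifier makes the answer concrete: "training" only stores the data, while every prediction computes a distance to each stored sample. The class name `TinyKNN` and the data are assumptions for this sketch, which uses Euclidean distance and majority voting.

```python
import math
from collections import Counter


class TinyKNN:
    def __init__(self, k: int = 3):
        self.k = k

    def fit(self, X, y):
        # Train time: nothing is computed, the samples are just remembered.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict_one(self, point):
        # Test time: one distance per stored sample, then a vote among the k nearest.
        distances = sorted(
            (math.dist(point, x), label) for x, label in zip(self.X, self.y)
        )
        votes = Counter(label for _, label in distances[: self.k])
        return votes.most_common(1)[0][0]


X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
model = TinyKNN(k=3).fit(X, y)
near_origin = model.predict_one((0.5, 0.5))
near_far_cluster = model.predict_one((5.5, 5.5))
```

The per-query cost grows with the number of stored samples, which is why k-NN is often described as a lazy learner.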
2. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these tree. Which of the following is true about individual(Tk) tree in Random Forest?
A) 1 and 3 - answer
B) 1 and 4
C) 2 and 3
D) 2 and 4
3. Which of the following algorithms do not use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4 - answer
4. Which of the following algorithm would you take into the consideration in your final model
building on the basis of performance?
Suppose you have given the following graph which shows the ROC curve for two different
classification algorithms such as Random Forest(Red) and Logistic Regression(Blue)
A) Random Forest- answer
B) Logistic Regression
C) Both of the above
D) None of these
5. Which of the following is true about training and testing error in such case?
Suppose you want to apply AdaBoost algorithm on Data D which has T observations. You
set half the data for training and half for testing initially. Now you want to increase the
number of data points for training T1, T2 … Tn where T1 < T2…. Tn-1 < Tn.
A) The difference between training error and test error increases as number of observations
increases
B) The difference between training error and test error decreases as number of
observations increases- answer
C) The difference between training error and test error will not change
D) None of These
6. In random forest or gradient boosting algorithms, features can be of any type. For example,
it can be a continuous feature or a categorical feature. Which of the following option is true
when you consider these types of features?
A) Only Random forest algorithm handles real valued attributes by discretizing them
B) Only Gradient boosting algorithm handles real valued attributes by discretizing them
C) Both algorithms can handle real valued attributes by discretizing them- answer
D) None of these
7. Consider the following figure for answering the next few questions. In the figure, X1 and X2
are the two features and the data point is represented by dots (-1 is negative class and +1 is a
positive class). And you first split the data based on feature X1(say splitting point is x11)
which is shown in the figure using vertical line. Every value less than x11 will be predicted
as positive class and greater than x11 will be predicted as negative class.
1. In each stage, introduce a new regression tree to compensate the shortcomings of existing
model
2. We can use the gradient descent method to minimize the loss function
A) 1
B) 2
C) 1 and 2- answer
D) None of these
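The two statements above can be illustrated with a minimal sketch of gradient boosting for squared-error loss. This is not the implementation of any particular library: the helper names (`fit_stump`, `gradient_boost`), the toy data, the number of stages and the learning rate are all assumptions chosen for illustration, and depth-1 regression trees (stumps) stand in for full regression trees.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: return (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # drop the largest x so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def gradient_boost(xs, ys, n_stages=20, lr=0.5):
    """Each stage fits a new stump to what the current model still gets wrong."""
    base = sum(ys) / len(ys)          # stage 0: predict the mean
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_stages):
        # For the loss 1/2 * (y - p)^2 the negative gradient is the residual y - p,
        # so fitting the residuals is a gradient descent step in function space.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, preds


# Hypothetical toy data, for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 3.1, 3.0, 5.2, 5.1]

base, stumps, preds = gradient_boost(xs, ys)
initial_mse = mse(ys, [base] * len(ys))
final_mse = mse(ys, preds)
print(initial_mse, final_mse)
```

The training error printed second should be lower than the first, since every stage corrects part of the remaining residual.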
a. cluster_centers_
b. inertia_
c. n_clusters
d. all of the above
10. In which of the following cases will K-means clustering fail to give good results?
1) Data points with outliers 2) Data points with different densities 3) Data points with
nonconvex shapes
1. 1 and 2
2. 2 and 3
3. 1, 2, and 3 - answer
4. 1 and 3
11. Which of the following is a reasonable way to select the number of clusters "k"?
1. Choose k to be the smallest value so that at least 99% of the variance is retained.
2. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
3. Choose k to be the largest value so that 99% of the variance is retained.
4. Use the elbow method- answer
12. A company has built a kNN classifier that gets 100% accuracy on training data. When they
deployed this model on the client side, it was found that the model is not at all accurate.
Which of the following might have gone wrong?
Note: The model was deployed successfully and no technical issues were found on the client side
except the model performance
13. In k-NN it is very likely to overfit due to the curse of dimensionality. Which of the
following option would you consider to handle such problem?
1. Dimensionality Reduction
2. Feature selection
A) 1
B) 2
C) 1 and 2 - answer
D) None of these
14. In the image below, which would be the best value for k assuming that the algorithm you are
using is k-Nearest Neighbor.
A) 3
B) 10 - answer
C) 20
D) 50
15. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
2. It has strong assumptions for the distribution of data points in dataspace
3. It has substantially high time complexity of order O(n^3)
4. It does not require prior knowledge of the no. of desired clusters
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3 - answer
1. After performing K-Means Clustering analysis on a dataset, you observed the following
dendrogram. Which of the following conclusion can be drawn from the dendrogram?
D. The above dendrogram interpretation is not possible for K-Means clustering analysis -
answer
3. In the figure below, if you draw a horizontal line on y-axis for y=2. What will be the number
of clusters formed?
A. 1
B. 2 - answer
C. 3
D. 4
4. What should be the best choice for number of clusters based on the following results:
A. 5
B. 6 - answer
C. 14
D. Greater than 14
5. Which of the following is/are not true about Centroid based K-Means clustering algorithm
and Distribution based expectation-maximization clustering algorithm:
Options:
A. 1 only
B. 5 only - answer
C. 1 and 3
D. 6 and 7
7. If you are using Multinomial mixture models with the expectation-maximization algorithm for
clustering a set of data points into two clusters, which of the assumptions are important:
• x1, x2,…, xN: These are inputs to the neuron. These can either be the actual observations
from input layer or an intermediate value from one of the hidden layers.
• w1, w2,…,wN: The Weight of each input.
• bi: Is termed as Bias units. These are constant values added to the input of the activation
function corresponding to each weight. It works similar to an intercept term.
• a: Is termed as the activation of the neuron which can be represented as
• and y: is the output of the neuron
Considering the above notations, will a line equation (y = mx + c) fall into the category of a
neuron?
A. Yes- answer
B. No
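A minimal sketch of why the answer is yes, assuming the usual form of a neuron (a weighted sum of the inputs plus a bias, passed through an activation function) and assuming an identity activation. The function name `neuron` and the values of m and c are hypothetical, chosen only for illustration.

```python
def neuron(inputs, weights, bias, activation=lambda z: z):
    """Weighted sum of the inputs plus the bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# A line y = m*x + c is a neuron with one input, weight m, bias c
# and an identity activation (assumed values below).
m, c = 2.0, 1.0


def line(x):
    return m * x + c


for x in (-1.0, 0.0, 3.0):
    print(x, neuron([x], [m], c), line(x))
```

Both columns of output agree for every x, because the neuron computes exactly m*x + c when the activation does nothing.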
9. In the graph below, we observe that the error has many “ups and downs”
Should we be worried?
A. Yes, because this means there is a problem with the learning rate of neural network.
B. No, as long as there is a cumulative decrease in both training and validation error,
we don’t need to worry - answer
1. Which of the following metrics, do we have for finding dissimilarity between two clusters in
hierarchical clustering?
1. Single-link
2. Complete-link
3. Average-link
Options:
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3 - answer
3. If you increase the number of hidden layers in a Multi Layer Perceptron, the classification
error of test data always decreases. True or False?
A. True
B. False - answer
4. You are building a neural network where it gets input from the previous layer as well as from
itself.
D. None of these
6. In a neural network, which of the following techniques is used to deal with overfitting?
A. Dropout
B. Regularization
C. Batch Normalization
D. All of these - answer
A. A unit which is not updated during training by any of its neighbours - answer
B. A unit which does not respond completely to any of the training patterns
D. None of these
9. For an image recognition problem (recognizing a cat in a photo), which architecture of neural
network would be better suited to solve the problem?
A. Multi Layer Perceptron
B. Convolutional Neural Network - answer
C. Recurrent Neural network
D. Perceptron
10. What are the factors to select the depth of neural network?
A. 1, 2, 4, 5
B. 2, 3, 4, 5
C. 1, 3, 4, 5
Options:
A. 2 only
B. 1 only
C. 1 and 2
D. 2 and 3 - answer
13. Recommendation systems are used in which of the following applications:
a. Banking
b. Shopping
c. Search Engine
d. All of the above – answer
17. For each pair of clusters, which algorithm computes the maximum distance between the clusters
using below formula?
a. Single link
b. Complete link -answer
c. Average link
d. Ward’s Linkage
18. ___________ is a graphical method for better understanding the agglomeration process. It shows
in a static way how the aggregations are performed, starting from the bottom (where all samples are
separated) up to the top (where the linkage is complete).
a. Flow chart
b. Histo graph
c. Dendrogram –answer
d. Decision tree
21. ___________ are general computers which can learn algorithms to map input
sequences to output sequences
a. CNN
b. RNN- answer
c. Deep Q-Learning
d. All of these
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
Regression,
Classification
Clustering
Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
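As a worked check of question 20, the multiple coefficient of determination is R^2 = 1 - SSE/SST. The function name `r_squared` below is just an illustrative choice.

```python
def r_squared(sst, sse):
    """Multiple coefficient of determination: share of total variation explained."""
    return 1.0 - sse / sst


r2 = r_squared(sst=200.0, sse=50.0)
print(r2)  # 0.75
```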
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
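A small sketch of the computation in question 30, using only the Python standard library. `statistics.variance` returns the sample variance (n - 1 in the denominator); the sample values are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Square root of (sample variance / number of sample instances)."""
    return math.sqrt(statistics.variance(sample) / len(sample))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
se = standard_error(sample)
print(se)
```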
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
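The error measures named in questions 28 and 35 can be compared side by side. This is a minimal sketch with made-up values; the function names are illustrative.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute (positive) difference between computed and desired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between predicted and actual output."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [3, 5, 2, 7]      # hypothetical targets
predicted = [2, 5, 4, 8]   # hypothetical model outputs

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, mse, rmse)
```

Squaring weights the single error of size 2 more heavily, which is why the mean squared error here is larger than the mean absolute error.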
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
over fitting.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove such a column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
A target variable is not required in general: PCA, for example, is unsupervised. LDA, by contrast,
is an example of a supervised dimensionality reduction algorithm.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions then you will lose some information of data
most of the times and you won’t be able to interpret the lower dimension data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain values: A, B, C, D, E, & F and represents the grades of students
from a college.
1) Which of the following statements is true in this case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables that have some order in their categories. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose you applied a Logistic Regression model on given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features to the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features increases the training accuracy, because the model has more information
with which to fit the training data. Testing accuracy increases only if the added features are
found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusions would you draw about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which best describes the bias.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
a prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous values as output, we use the mean squared error
metric to evaluate the model performance in such a case. The remaining options are used for
classification problems.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute-value penalty, which makes some of the coefficients
exactly zero.
19. Suppose that we have N independent variables (X1, X2, …, Xn) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. Suppose you are given two variables V1 and V2 with the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1; we need to
consider both statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between two variables might be zero even when they have a
relationship between them. A correlation coefficient of zero just means that they have no
linear relationship. Examples are y=|x| or y=x^2, taken over x values symmetric about zero.
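The point made in questions 20 and 21 can be checked numerically. The sketch below computes the Pearson correlation by hand on a small range of x values that is symmetric about zero (an assumption needed for the correlation of |x| and x^2 with x to vanish); the function name `pearson` and the data are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


xs = [-3, -2, -1, 0, 1, 2, 3]            # symmetric about zero (assumed)

r_square = pearson(xs, [x ** 2 for x in xs])    # y = x^2: clearly related to x
r_abs = pearson(xs, [abs(x) for x in xs])       # y = |x|: clearly related to x
r_line = pearson(xs, [2 * x + 1 for x in xs])   # y = 2x + 1: perfectly linear

print(r_square, r_abs, r_line)
```

The first two values come out at (numerically) zero even though y is fully determined by x, while the linear case gives a value of 1 up to rounding.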
22. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
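A minimal sketch of the Normal Equation, θ = (XᵀX)⁻¹Xᵀy, specialised to a single feature plus an intercept so that the 2x2 system can be solved by hand with Cramer's rule. The function name and the toy data are assumptions made for illustration.

```python
def normal_equation_1d(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = theta0 + theta1 * x."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]; solve with Cramer's rule.
    det = n * sxx - sx * sx
    theta0 = (sy * sxx - sx * sxy) / det
    theta1 = (n * sxy - sx * sy) / det
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

theta0, theta1 = normal_equation_1d(xs, ys)
print(theta0, theta1)
```

There is no learning rate and no loop over iterations here, which matches statements 1 and 3. With many features the matrix XᵀX grows with the square of the feature count and solving the system becomes the expensive step, which is the concern behind statement 2.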
25. What will happen when you apply very large penalty?
A) Some of the coefficient will become absolute zero
B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become exactly zero, but in the case of Ridge the
coefficients become close to zero but not zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
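The contrast between the two penalties in questions 25 and 26 can be seen in a one-dimensional sketch. The setup is an assumption made for illustration: a single coefficient b with unpenalised least-squares estimate `b_ols`, where ridge minimises ½(b − b_ols)² + ½λb² and lasso minimises ½(b − b_ols)² + λ|b|. Both have simple closed forms; the function names are hypothetical.

```python
import math


def ridge_1d(b_ols, lam):
    """Minimiser of 1/2*(b - b_ols)^2 + 1/2*lam*b^2: proportional shrinkage."""
    return b_ols / (1.0 + lam)


def lasso_1d(b_ols, lam):
    """Minimiser of 1/2*(b - b_ols)^2 + lam*|b|: soft-thresholding."""
    return math.copysign(max(abs(b_ols) - lam, 0.0), b_ols)


b_ols = 0.8  # assumed unpenalised estimate
for lam in (0.1, 1.0, 10.0):
    print(lam, ridge_1d(b_ols, lam), lasso_1d(b_ols, lam))
```

As the penalty grows, the ridge value keeps shrinking towards zero without reaching it, whereas the lasso value becomes exactly zero once the penalty exceeds |b_ols|.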
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusions would you draw about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following would you observe in such a case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since logistic regression is a classification algorithm, its output is not a real-valued quantity,
so mean squared error is not used for evaluating it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select as the best logistic regression model the one with the lowest AIC.
Solution: A
In lasso we apply an absolute penalty; after increasing the penalty, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation the P (y =1|x; w) , viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49. In the above question, which function do you think would make p lie between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
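A short numerical sketch of questions 50 and 51: the odds p/(1 − p) and the logit log(p/(1 − p)). The probabilities chosen are arbitrary illustrative values strictly between 0 and 1, since the logit is unbounded at the endpoints.

```python
import math


def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))


fair_coin_odds = odds(0.5)
print(fair_coin_odds)  # 1.0

# As p approaches 0 the logit heads to -infinity; as p approaches 1, to +infinity.
for p in (1e-9, 0.25, 0.5, 0.75, 1 - 1e-9):
    print(p, odds(p), logit(p))
```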
A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not so
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not so
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
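A quick round-trip check of the relationship in question 53, assuming the standard definitions logit(p) = log(p / (1 − p)) and logistic(x) = 1 / (1 + e^(−x)). The test values are arbitrary.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


# Applying logistic after logit should return the original probability,
# and applying logit after logistic should return the original number.
for p in (0.1, 0.5, 0.9):
    print(p, logistic(logit(p)))
for x in (-2.0, 0.0, 3.0):
    print(x, logit(logistic(x)))
```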
Suppose you are given two scatter plots “a” and “b” for two classes (blue for the positive and red
for the negative class). In scatter plot “a”, you correctly classified all data points using logistic
regression (the black line is the decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose you applied a Logistic Regression model on given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features to the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features increases the training accuracy, because the model has more information with
which to fit the training data. Testing accuracy increases only if the added features are found to be
significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fit, each predicting the
probability of one category against all the other categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the following statement(s) is true about the β0 and β1 values of the two logistic models
(Green, Black)?
Solution: B
Context 58-60
Below are three scatter plots (A, B, C, left to right) with hand-drawn decision boundaries for logistic
regression.
58. Which of the above figures shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D) None of these
Solution: C
In figure C the decision boundary is not smooth, which indicates that it is overfitting the data.
1. The training error in first plot is maximum as compare to second and third plot.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
(right graph) polynomial might have a very high accuracy on the training population but is expected to
fail badly on the test dataset. The left graph has the maximum training error because it underfits the
training data.
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty and therefore a less complex decision boundary, as shown in
the first figure (A).
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data so that it takes less time
and gives comparatively similar accuracy (it may not be the same)?
Solution: D
If you decrease the number of iterations while training, it will surely take less time but will not give
the same accuracy; to get similar (though not identical) accuracy you need to increase the learning rate.
62. Which of the following images shows the cost function for y = 1?
Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a
two-class classification problem.
Solution: A
A is the correct answer because the loss decreases as the log probability increases.
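As a minimal sketch of why the curve for y = 1 falls as the predicted probability rises, the snippet below evaluates the standard cross-entropy loss for a single example. The function names and the sample probabilities are illustrative assumptions, not part of the original question.

```python
import math

def log_loss_positive(p):
    """Cross-entropy loss for one example whose true label is y = 1."""
    return -math.log(p)

def log_loss_negative(p):
    """Cross-entropy loss for one example whose true label is y = 0."""
    return -math.log(1.0 - p)

# Hypothetical predicted probabilities of the positive class.
probabilities = [0.1, 0.3, 0.5, 0.7, 0.9]
losses_y1 = [log_loss_positive(p) for p in probabilities]
losses_y0 = [log_loss_negative(p) for p in probabilities]

for p, l1, l0 in zip(probabilities, losses_y1, losses_y0):
    print(f"p={p:.1f}  loss(y=1)={l1:.3f}  loss(y=0)={l0:.3f}")
```

For y = 1 the loss is exactly the negative of the log probability, so it decreases monotonically; for y = 0 the trend is reversed.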
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms a linear decision surface, but the examples in the figure are not linearly
separable.
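The figure is not reproduced here, so the following sketch assumes an XOR-style labelling of the four binary points, which is the usual example of data that is not linearly separable. It searches a coarse, hypothetical grid of weights for a linear rule w1*x1 + w2*x2 + b > 0 that classifies every point correctly.

```python
import itertools

def linearly_separable(points, labels, grid):
    """Return True if some (w1, w2, b) on the grid classifies every point correctly."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        predictions = [1 if w1 * x1 + w2 * x2 + b > 0 else 0 for x1, x2 in points]
        if predictions == labels:
            return True
    return False

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]   # separable, e.g. w1 = w2 = 1, b = -1.5
xor_labels = [0, 1, 1, 0]   # assumed labelling of the missing figure
grid = [i / 2 for i in range(-8, 9)]  # candidate values -4.0, -3.5, ..., 4.0

and_ok = linearly_separable(points, and_labels, grid)
xor_ok = linearly_separable(points, xor_labels, grid)
print("AND separable:", and_ok)
print("XOR separable:", xor_ok)
```

The grid search is only an illustration; no choice of weights can separate XOR, which is why a plain logistic regression on X1 and X2 cannot classify it perfectly.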
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a linear SVM classifier on a 2-class classification problem. You have been
given the following data, in which the points circled in red represent the support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, the rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high misclassification penalty, the soft margin effectively ceases to exist, as there is no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
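Because training time grows at least quadratically with the number of samples, large datasets are poorly
suited to SVMs. Datasets which have a clear classification boundary will function best with SVMs.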
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The effectiveness of an SVM depends on choosing the three basic settings mentioned above in
such a way that they maximise efficiency and reduce error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error-prone, which means that
you should not trust any specific data point too much. Suppose you want to build an SVM model
with a quadratic kernel function (polynomial of degree 2) that uses the slack penalty parameter C as one of its
hyperparameters. Based on that, answer the following question.
What would happen when you use a very large value of C (C -> infinity)?
Note: Even for small C, all data points were classified correctly.
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.
23. Which of the following options would you be most likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM’s hyperparameters so that the effect
would be the same as in the previous questions, i.e. the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here: a larger C penalizes misclassification more
heavily (weaker regularization), so the model fits the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose the classes are equally distributed in the data. Say that one training run in the one-vs-all
setting takes 10 seconds. How many seconds would it take to train the one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
29. Suppose your problem has changed and the data now has only 2 classes. How
many times do we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
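The counts in questions 27 to 29 can be checked with a small sketch. The helper names are hypothetical, and the timing assumes, as the question does, that every one-vs-all run takes the same 10 seconds.

```python
def one_vs_all_models(n_classes):
    """Number of binary classifiers trained under one-vs-all.

    A 2-class problem needs a single classifier; otherwise one per class.
    """
    return 1 if n_classes == 2 else n_classes

def total_training_seconds(n_classes, seconds_per_model):
    """Total time if every binary classifier takes the same time to train."""
    return one_vs_all_models(n_classes) * seconds_per_model

print(one_vs_all_models(4))            # classifiers for the 4-class problem
print(total_training_seconds(4, 10))   # seconds for the 4-class problem
print(one_vs_all_models(2))            # classifiers for the 2-class problem
```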
Suppose you are using an SVM with a polynomial kernel of degree 2. You have applied
it to data and found that it fits the data perfectly, that is, training and testing accuracy are both 100%.
30. Now suppose you increase the complexity (the degree of the polynomial of this kernel). What
do you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question, after increasing the complexity you found that training accuracy was still
100%. What do you think is the reason behind that?
1. Since the data is fixed and we are fitting more polynomial terms or parameters, the algorithm starts
memorizing everything in the data
2. Since the data is fixed, the SVM doesn’t need to search in a big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered ways of improving the base
learners’ results.
9. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees
as possible in model building. Random Forest is a black-box model, so you will lose
interpretability after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.
13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees
as possible in model building. Random Forest is a black-box model, so you will lose
interpretability after using it.
16. True-False: Bagging is suitable for high-variance, low-bias models?
A) TRUE
B) FALSE
Solution: A
Bagging is suitable for high-variance, low-bias models, in other words for complex models.
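A rough way to see why averaging helps high-variance, low-bias models is the simulation below. It is an idealised sketch: each "model" is stood in for by an unbiased but noisy prediction, and the models are assumed independent, which real bagged trees are not (bootstrap samples overlap, so the actual variance reduction is smaller). All names and numbers are illustrative.

```python
import random
import statistics

random.seed(0)

def noisy_prediction(true_value=1.0, noise_sd=1.0):
    """Stand-in for one low-bias, high-variance model: unbiased but noisy."""
    return random.gauss(true_value, noise_sd)

def bagged_prediction(n_models=25):
    """Average the predictions of n_models such models."""
    return statistics.mean([noisy_prediction() for _ in range(n_models)])

# Repeat the experiment many times to estimate the spread of each predictor.
single = [noisy_prediction() for _ in range(2000)]
bagged = [bagged_prediction() for _ in range(2000)]

var_single = statistics.variance(single)
var_bagged = statistics.variance(bagged)
print(f"variance of a single model:   {var_single:.3f}")
print(f"variance of the bagged model: {var_bagged:.3f}")
```

The bagged predictor stays centred on the true value while its variance drops, which is the effect bagging relies on.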
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenarios is gain ratio preferred over information gain?
Solution: A
For high-cardinality problems, gain ratio is preferred over the information gain technique.
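The preference for gain ratio on high-cardinality attributes can be illustrated with a toy dataset. The data below is made up for the example: `row_id` is unique per row (maximum cardinality, no predictive value on new data) and `weather` is a genuinely informative two-valued attribute. Gain ratio is computed as information gain divided by the split information, i.e. the entropy of the attribute itself.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain normalised by the split information of the feature."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy data.
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
row_id = list(range(8))
weather = ["sun", "sun", "sun", "sun", "sun", "rain", "rain", "rain"]

for name, feature in [("row_id", row_id), ("weather", weather)]:
    print(name, round(information_gain(feature, labels), 3), round(gain_ratio(feature, labels), 3))
```

Information gain favours `row_id` because every split is pure, while gain ratio penalises its large split information and ranks `weather` higher.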
20. Suppose you are given the following training and validation errors for Gradient
Boosting. Which of the following hyperparameters would you choose in such a case?
Scenario  Depth  Training Error  Validation Error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select 2 because its lower depth makes it the
better hyperparameter.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
((MARKS)) 1
((QUESTION)) Which of the following steps / assumptions in regression modeling impacts
the trade-off between under-fitting and over-fitting the most
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation.
((OPTION_A)) 1,2&3
((OPTION_B)) 1&3
((OPTION_C)) All of above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
1. Simple Linear regression will have high bias and low variance
2. Simple Linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance
((OPTION_A)) Only 1
((OPTION_B)) 1&3
((OPTION_C)) 1&4
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
Now, which of the following options will be correct?
((OPTION_A)) It is more likely for X1 to be excluded from the model
((OPTION_B)) It is more likely for X1 to be included in the model
((OPTION_C)) Can’t say
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate
((OPTION_A)) 1 and 2
((OPTION_B)) 1&3
((OPTION_C)) 2&3
((OPTION_D)) 1,2&3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_D)) 1,2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
Note: Consider remaining parameters are same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases
4. Testing accuracy always increases or remains the same
((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the
graph show the residuals for each predicted value. Use this information to compute
the SSE.
((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
Would a person with Salary $1 be considered an Outlier?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_C)) More information is required
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”
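The three functions named above can be written out directly. This is a minimal sketch using the standard definitions (logistic(x) = 1 / (1 + e^-x) and logit(p) = log(p / (1 - p))); the Python function names simply mirror the names in the question.

```python
import math

def logistic(x):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Logit (log-odds) function: maps a probability in (0, 1) to a real number."""
    return math.log(p / (1.0 - p))

def logit_inv(x):
    """Inverse of the logit function, written in its exp(x) / (1 + exp(x)) form."""
    return math.exp(x) / (1.0 + math.exp(x))

x = 0.7
print(logistic(x), logit_inv(x))   # the two values agree
print(logit(logistic(x)))          # recovers x up to rounding
```

The logistic function and the inverse logit are the same function, and the logit undoes both.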
Note: Consider remaining parameters are same.
((OPTION_A)) Training accuracy increases
((OPTION_B)) Training accuracy increases or remains the same
((OPTION_C)) Testing accuracy decreases
((OPTION_D)) Testing accuracy increases or remains the same
((OPTION_E))
((CORRECT_CHOICE)) A&D
((EXPLANATION))
Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a
two-class classification problem.
Note: Y is the target class
((OPTION_A)) A
((OPTION_B)) B
((OPTION_C)) BOTH
((OPTION_D)) NONE OF THESE
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The odds ratio is
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((OPTION_D)) That the statistical model is a poor fit of the data.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((OPTION_D)) Linear relationship between observations.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((OPTION_D)) There is no dependent variable.
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
pairs as follows: y1 = 22, x1 = 1, y2 = 3, x2 = 1, y3 = 3, x3 = 2. What
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?
((OPTION_A)) O(D)
((OPTION_B)) O(N)
((OPTION_C)) O(ND)
((OPTION_D)) O(ND^2)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
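Once the gradient g is already available, the update itself touches each of the D parameters exactly once, which is where the O(D) cost comes from. The sketch below assumes a plain gradient descent step w := w - lr * g; the function name and the example numbers are illustrative.

```python
def gradient_descent_update(weights, gradient, learning_rate):
    """One gradient descent step: a single pass over the D parameters."""
    return [w - learning_rate * g for w, g in zip(weights, gradient)]

weights = [1.0, 2.0, 3.0]      # D = 3 parameters
gradient = [0.5, 0.5, 0.5]     # precomputed gradient vector g
updated = gradient_descent_update(weights, gradient, learning_rate=0.1)
print(updated)
```

The number of training examples N only matters when computing the gradient, not when applying it.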
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you expect
it to), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.
((OPTION_A)) High variance
((OPTION_B)) High model bias
((OPTION_C)) High estimation bias
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Training set is normally a representation of a global distribution
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
values, but when an unknown input is presented, the corresponding prediction
error can be very high. This problem is called
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) ---------- may prove to be more difficult to discover as it could be initially considered
the result of a perfect fitting
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
measure e_m which takes two arguments and allows us to compute a total error value
over the whole dataset. Those two arguments are:
((OPTION_A)) expected and predicted output
((OPTION_B)) calculated and predicted output
((OPTION_C)) calculated and measured output
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) A
generic training algorithm has to find the global minimum or a point quite close to it
proposed a mathematical approach to determine whether a problem is learnable by a
((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and
artificial intelligence (AI)
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Linear in D
((OPTION_B)) Exponential in D
((OPTION_C)) Linear in N
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) -1.66
((OPTION_B)) 2
((OPTION_C)) 3
((OPTION_D)) 4
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?
((OPTION_A)) O(D)
((OPTION_B)) O(N)
((OPTION_C)) O(ND)
((OPTION_D)) O(ND^2)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) You observe the following while fitting a linear regression to the data: As
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you
expect it to), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.
((OPTION_A)) High variance
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Adding more basis functions in a linear model... (pick the most probable
option)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_C)) Serration
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) You are given data about seismic activity in Japan, and you want to
predict the magnitude of the next earthquake. This is an example of
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_C)) Serration
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) Classification
THIS IS
MANDATORY
OPTION
((OPTION_B)) Regression
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Clustering
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) B
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) Outcome
THIS IS
MANDATORY
OPTION
((OPTION_B)) Feature
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Attribute
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) It may be better to avoid the ROC curve as a metric, as it can suffer
from the accuracy paradox.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) B
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
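Note on the question above: the accuracy paradox is usually described for plain classification accuracy on imbalanced classes, which is consistent with the statement about the ROC curve being marked False. A minimal sketch follows, assuming a hypothetical dataset of 95 negatives and 5 positives and a trivial model that always predicts the majority class; the data and the helper names `accuracy` and `recall` are illustrative assumptions, not part of the question bank.

```python
# Hypothetical toy labels: 95 negatives and 5 positives (assumed for illustration).
y_true = [0] * 95 + [1] * 5

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos


print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

The accuracy looks high even though no positive sample is ever detected, which is the effect the term refers to.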
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) D
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The expected value or _______ of a random variable is the center of its
distribution.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Mode
THIS IS
MANDATORY
OPTION
((OPTION_B)) median
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) mean
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) C
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) D
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) variance
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The square root of the variance is called the ________ deviation
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) empirical
THIS IS
MANDATORY
OPTION
((OPTION_B)) mean
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) continuous
This is optional
((OPTION_D)) standard
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) D
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
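Note on the two questions above: the mean estimates the expected value, and the standard deviation is the square root of the variance. A short sketch using Python's standard `statistics` module follows; the sample values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(values)           # sample estimate of the expected value
variance = statistics.pvariance(values)  # population variance around the mean
std_dev = statistics.pstdev(values)      # population standard deviation

# The standard deviation equals the square root of the variance.
assert math.isclose(std_dev, math.sqrt(variance))

print(mean, variance, std_dev)
```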
((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) B
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
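Note on the question above: the relationship runs the other way, since the PDF is the derivative of the CDF (and the CDF is the integral of the PDF). A numerical check on the standard normal distribution is sketched below; the function names and the step size `h` are assumptions made for this illustration.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def numerical_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# The slope of the CDF should match the PDF at every test point.
for x in (-1.0, 0.0, 0.5, 2.0):
    slope = numerical_derivative(normal_cdf, x)
    assert math.isclose(slope, normal_pdf(x), rel_tol=1e-6)
```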
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Consider the results of a medical experiment that aims to predict whether someone is
going to develop myopia based on some physical measurements and heredity. In this
case, the input dataset consists of the person’s medical characteristics and the target
variable is binary: 1 for those who are likely to develop myopia and 0 for those who
aren’t. This can be best classified as
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Regression
THIS IS
MANDATORY
OPTION
((OPTION_C)) Clustering
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) B
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The purpose of a machine learning model is to approximate an unknown function
that associates input elements to output ones
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Training set is normally a representation of a global distribution
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The model has an excessive capacity and is no longer able to
generalize considering the original dynamics provided by the training set. This
problem is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION
((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_D)) None
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE))
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) It can associate almost perfectly all the known samples to the corresponding
output values, but when an unknown input is presented, the corresponding prediction
error can be very high. This problem is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION
((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_D)) None
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE))
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) ---------- may prove to be more difficult to discover as it could be initially
considered the result of a perfect fitting
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION
((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_D)) None
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE))
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
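Note on the three questions above: the behaviour they describe (near-perfect results on known samples, large errors on unknown inputs) can be sketched with a deliberately extreme model that simply memorizes its training pairs. Everything below is an assumption made for illustration: the target function y = 2x, the class name `MemorizingModel`, and the choice to return the mean training output for unseen inputs.

```python
# Hypothetical target: y = 2 * x. Training and test inputs are disjoint.
train = [(x, 2 * x) for x in range(0, 10, 2)]  # x = 0, 2, 4, 6, 8
test = [(x, 2 * x) for x in range(1, 10, 2)]   # x = 1, 3, 5, 7, 9


class MemorizingModel:
    """Stores every training pair; returns a constant for unseen inputs."""

    def fit(self, samples):
        self.table = dict(samples)
        outputs = [y for _, y in samples]
        self.default = sum(outputs) / len(outputs)  # fallback for unknown inputs
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


def mean_squared_error(model, samples):
    """Average squared difference between predicted and expected outputs."""
    return sum((model.predict(x) - y) ** 2 for x, y in samples) / len(samples)


model = MemorizingModel().fit(train)
train_error = mean_squared_error(model, train)
test_error = mean_squared_error(model, test)

print(train_error)  # 0.0
print(test_error)   # 36.0
```

A training error of zero can look like a perfect fit, which is why this kind of problem may go unnoticed until the model is evaluated on samples it has not seen.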
((QUESTION)) When working with a supervised scenario, we define a non-negative error
measure e_m which takes two arguments and allows us to compute a total error
value over the whole dataset. Those two arguments are
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) expected and predicted output
THIS IS
MANDATORY
OPTION
((OPTION_B)) calculated and predicted output
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) calculated and measured output
This is optional
((OPTION_D)) none
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Initial value represents a starting point over the surface of an n-variable function.
A generic training algorithm has to find the global minimum or a point quite close
to it (there's always a tolerance to avoid an excessive number of iterations and a
consequent risk of overfitting). This measure is also called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) loss function
THIS IS
MANDATORY
OPTION
((OPTION_B)) predicted output
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) measured output
This is optional
((OPTION_D)) mean square error
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
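Note on the two questions above: an error measure that takes the expected and the predicted output, summed over the whole dataset, gives a total error (loss) value. A minimal sketch follows, assuming squared error as the per-sample measure; the function names and the sample values are hypothetical.

```python
def squared_error(expected, predicted):
    """Non-negative error measure for a single sample."""
    return (expected - predicted) ** 2


def total_loss(expected_outputs, predicted_outputs, error_measure=squared_error):
    """Sum the per-sample error measure over the whole dataset."""
    return sum(
        error_measure(e, p) for e, p in zip(expected_outputs, predicted_outputs)
    )


# Hypothetical expected and predicted outputs.
expected = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]

loss = total_loss(expected, predicted)  # 0.25 + 0.0 + 1.0
print(loss)  # 1.25
```

Training then amounts to adjusting the model parameters so that this total value becomes as small as the tolerance allows.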
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) C
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same
output element
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) An exponential time could lead to computational explosions when the datasets
are too large or the optimization starting point is very far from an acceptable minimum.
Moreover, it's important to remember the so-called …….
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) B
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The first term is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION
((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The second term is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION
((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) B
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The third term is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION
((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) C
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE))
Either A
or B or C or D or
E
((EXPLANATION))
This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
THIS IS ALSO
MANDATORY
OPTION
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose you have the following data with one real-valued input
variable and one real-valued output variable. What is the leave-one-out
cross-validation mean squared error for linear regression (Y = bX + c)?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 10/27
THIS IS
MANDATORY
OPTION
((OPTION_B)) 20/27
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 50/27
This is optional
((OPTION_D)) 49/27
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
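The data table for this question did not survive in the source. As a minimal sketch, assuming three hypothetical points (0, 2), (2, 2) and (3, 1), leave-one-out cross-validation for Y = bX + c can be computed as below. The function names `fit_line` and `loocv_mse` and the data are illustrative assumptions, not part of the original question.

```python
from fractions import Fraction


def fit_line(points):
    """Ordinary least-squares fit of y = b*x + c; returns (b, c) as exact fractions."""
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)
    sy = sum(Fraction(y) for _, y in points)
    sxx = sum(Fraction(x) * x for x, _ in points)
    sxy = sum(Fraction(x) * y for x, y in points)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - b * sx) / n
    return b, c


def loocv_mse(points):
    """Hold out each point in turn, fit on the rest, and average the squared errors."""
    total = Fraction(0)
    for i, (x, y) in enumerate(points):
        rest = points[:i] + points[i + 1:]
        b, c = fit_line(rest)
        total += (y - (b * x + c)) ** 2
    return total / len(points)


# Assumed data: the original table is missing from the source.
data = [(0, 2), (2, 2), (3, 1)]
print(loocv_mse(data))  # 49/27
```

With these assumed points the three held-out squared errors are 4, 4/9 and 1, so the mean is 49/27, the value listed as option D.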
((QUESTION)) Which of the following is/are true about the “Maximum Likelihood
Estimate (MLE)”?
1. The MLE may not always exist
2. The MLE always exists
3. If the MLE exists, it may not be unique
4. If the MLE exists, it must be unique
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 1 and 4
THIS IS
MANDATORY
OPTION
((OPTION_B)) 2 and 3
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 1 and 3
This is optional
((OPTION_D)) 2 and 4
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
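Statement 3 (an MLE need not be unique) can be illustrated with a small sketch. It assumes a Uniform(θ, θ + 1) model and a hypothetical three-point sample; neither appears in the original question. Under this model the likelihood is 1 whenever every sample lies in [θ, θ + 1] and 0 otherwise, so a whole interval of θ values attains the maximum.

```python
from fractions import Fraction


def uniform_likelihood(theta, samples):
    """Likelihood of the samples under Uniform(theta, theta + 1): 1 if all fit, else 0."""
    return 1 if all(theta <= s <= theta + 1 for s in samples) else 0


# Hypothetical sample, chosen only for illustration.
samples = [Fraction(1, 5), Fraction(1, 2), Fraction(7, 10)]

# Evaluate the likelihood on a grid of candidate theta values from -1 to 1.
grid = [Fraction(i, 100) for i in range(-100, 101)]
best = max(uniform_likelihood(t, samples) for t in grid)
maximizers = [t for t in grid if uniform_likelihood(t, samples) == best]

# Every theta between max(samples) - 1 and min(samples) maximizes the likelihood.
print(len(maximizers), min(maximizers), max(maximizers))
```

Exact fractions are used so that the interval endpoints are not affected by floating-point rounding.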
((QUESTION)) Suppose a “Linear regression” model perfectly fits the training data
(training error is zero). Which of the following statements is true?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS ALSO
MANDATORY
OPTION
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
((OPTION_B)) The p-value for the null hypothesis that the beta coefficient = 0 is 0.0001
THIS IS ALSO
MANDATORY
OPTION
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between the dependent variable y and the predictor x is linear.
2. The model errors are statistically independent.
3. The errors are normally distributed with zero mean and constant standard deviation.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 1,2&3
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1&3
THIS IS ALSO
MANDATORY
OPTION
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_B)) Bar chart
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Histograms
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) 1&2
THIS IS
MANDATORY
OPTION
((OPTION_B)) Only 1
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Only 2
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following offsets do we use in a least-squares line fit? Suppose the horizontal axis is the
independent variable and the vertical axis is the dependent variable.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose we have generated the data with the help of a polynomial regression of degree 3 (degree 3 will
perfectly fit this data). Now consider the points below and choose the option based on these points.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. A polynomial of degree 3 will have low bias and high variance
4. A polynomial of degree 3 will have low bias and low variance
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Only 1
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1&3
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 1&4
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose you are training a linear regression model. Now consider
these points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) is/are correct?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Both are False
THIS IS
MANDATORY
OPTION
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose we fit “Lasso Regression” to a data set that has 100 features (X1, X2, …, X100). Now we rescale
one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with
the same regularization parameter.
Now, which of the following options will be correct?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
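The answer options for this question are not present in the source, so the following is only a sketch of the mechanism the question asks about. It assumes a one-feature Lasso with no intercept and the objective 0.5 * sum((y_i - b * x_i)^2) + lam * |b|, whose minimizer has a closed-form soft-threshold solution. The function name `lasso_single_feature` and the numbers are hypothetical.

```python
import math


def lasso_single_feature(x, y, lam):
    """Minimise 0.5 * sum((y_i - b * x_i)**2) + lam * |b| for one feature, no intercept.

    The closed-form solution is the soft-thresholded correlation divided by sum(x_i**2).
    """
    xy = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    shrunk = max(abs(xy) - lam, 0.0)
    return math.copysign(shrunk, xy) / xx


# Hypothetical data and regularization parameter.
x = [1.0, 2.0]
y = [0.1, 0.2]
lam = 1.0

b_original = lasso_single_feature(x, y, lam)
b_rescaled = lasso_single_feature([10 * v for v in x], y, lam)

print(b_original)  # coefficient is shrunk to exactly zero
print(b_rescaled)  # coefficient is non-zero after multiplying the feature by 10
```

In this simplified setting, multiplying the feature by 10 multiplies x·y by 10 while the penalty threshold stays the same, so the coefficient that was previously zeroed out now survives.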
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) 1 and 2
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1 and 3
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 2 and 4
This is optional
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following metrics can be used for evaluating regression
models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 2 and 4
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1 and 2
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 2, 3 and 4
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
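As a rough sketch of how several of the listed metrics are computed, the helper below returns R squared, adjusted R squared, MSE, RMSE and MAE from true and predicted values. The function name `regression_metrics`, its parameters and the sample numbers are assumptions for illustration; the F statistic is left out to keep the example short.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Return common regression metrics for predictions from a model with n_features predictors."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot
    # Adjusted R squared penalizes R squared for the number of predictors.
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {
        "r2": r2,
        "adjusted_r2": adjusted_r2,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
    }


# Hypothetical values: the last prediction is off by 1.
metrics = regression_metrics([1, 2, 3, 4], [1, 2, 3, 5], n_features=1)
print(metrics)
```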
((QUESTION)) We can also compute the coefficients of linear regression with the help
of an analytical method called the “Normal Equation”. Which of the
following is/are true about the “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 1 and 2
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1&3
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 2&3
This is optional
((OPTION_D)) 1,2&3
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
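The three statements can be seen in a small sketch of the normal equation, (XᵀX)β = Xᵀy, solved here with plain Gauss-Jordan elimination. The function name `normal_equation` and the sample data are assumptions; the sketch also assumes XᵀX is invertible. There is no learning rate and no iterative update loop, and the elimination step costs on the order of k³ operations for k coefficients, which is why the method slows down with many features.

```python
from fractions import Fraction


def normal_equation(X, y):
    """Solve (X^T X) beta = X^T y exactly; an intercept column of ones is prepended to X."""
    rows = [[Fraction(1)] + [Fraction(v) for v in row] for row in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * Fraction(t) for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination (assumes X^T X is invertible).
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical data lying exactly on y = 1 + 2x.
beta = normal_equation([[0], [1], [2]], [1, 3, 5])
print(beta)  # [Fraction(1, 1), Fraction(2, 1)]
```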
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the regression line is
defined as:
Y = β0 + β1 X1 + β2 X2 + … + βn Xn
Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y
changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a
positive or negative number).
2. The value of βi is always the same, regardless of the values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 1 and 2
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1 and 3
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 2 and 3
This is optional
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) 1
THIS IS
MANDATORY
OPTION
((OPTION_B)) 2
THIS IS ALSO
MANDATORY
OPTION
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The graphs below show two fitted regression lines (A and B) on randomly generated data. Now, I want to find
the sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
This is optional
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
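The graphs and answer options are missing from the source. As a sketch under the assumption that both lines are ordinary least-squares fits with an intercept, the residuals of such a fit always sum to zero, whatever the data look like. The function name `ols_residuals` and both datasets are hypothetical.

```python
from fractions import Fraction


def ols_residuals(xs, ys):
    """Fit y = slope * x + intercept by least squares and return the residuals."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Two hypothetical datasets: one tight around a line, one widely scattered.
residuals_a = ols_residuals([1, 2, 3, 4], [2, 1, 4, 3])
residuals_b = ols_residuals([1, 2, 3, 4], [10, -3, 7, 0])

print(sum(residuals_a), sum(residuals_b))  # 0 0
```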
((QUESTION)) If two variables are correlated, is it necessary that they have a linear
relationship?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) YES
THIS IS
MANDATORY
OPTION
((OPTION_B)) NO
THIS IS ALSO
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
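A short sketch, using hypothetical data, shows why the answer is NO: the points below follow y = x², which is not a linear relationship, yet the Pearson correlation is still high. The function name `pearson_r` is an assumption for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # quadratic, not linear
r = pearson_r(xs, ys)
print(round(r, 3))  # strongly correlated, but below 1
```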
((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION
((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y.
Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider that the remaining parameters are the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Only 2
THIS IS
MANDATORY
OPTION
((OPTION_B)) Only 1
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Only 3
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the
graph show the residual for each predicted value. Use this information to
compute the SSE.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) 3.02
THIS IS
MANDATORY
OPTION
((OPTION_B)) 0.75
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 1.01
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
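The graph with the residual values is not present in the source. As a sketch, assuming five hypothetical residuals of -0.2, 0.4, -0.8, 1.3 and -0.7, the sum of squared errors (SSE) is just the sum of the squared residuals. `Decimal` is used so the arithmetic is exact.

```python
from decimal import Decimal

# Assumed residuals: the original graph is missing from the source.
residuals = [Decimal("-0.2"), Decimal("0.4"), Decimal("-0.8"), Decimal("1.3"), Decimal("-0.7")]

# SSE = sum of squared residuals.
sse = sum(r * r for r in residuals)
print(sse)  # 3.02
```

With these assumed residuals the SSE is 3.02, the value listed as option A.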
((OPTION_A)) YES
THIS IS
MANDATORY
OPTION
((OPTION_B)) NO
THIS IS ALSO
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION
((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION
((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION
((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION
((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following methods do we use to best fit the data in
Logistic Regression?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) One of the very good methods to analyze the performance of Logistic
Regression is AIC, which is similar to R-Squared in Linear
Regression. Which of the following is true about AIC?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION
((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) LASSO
THIS IS
MANDATORY
OPTION
((OPTION_B)) Ridge
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose you have been given a fair coin and you want to find out the
odds of getting heads. Which of the following options is true for such a
case?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
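The answer options are missing from the source, but the quantity itself is easy to compute. As a small sketch, odds are the probability of the event divided by the probability of its complement; the function name `odds_in_favour` is an assumption for illustration.

```python
from fractions import Fraction


def odds_in_favour(p):
    """Odds of an event with probability p: p / (1 - p)."""
    p = Fraction(p)
    return p / (1 - p)


# A fair coin lands heads with probability 1/2.
print(odds_in_favour(Fraction(1, 2)))  # 1
```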
((QUESTION)) The logit function (given as l(x)) is the log of the odds function. What
could be the range of the logit function on the domain x = [0, 1]?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) (– ∞ , ∞)
THIS IS
MANDATORY
OPTION
((OPTION_B)) (0,1)
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) (0, ∞)
This is optional
((OPTION_D)) (- ∞, 0)
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
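A quick numerical sketch shows why the range is unbounded in both directions: as x approaches 0 the logit becomes arbitrarily negative, and as x approaches 1 it becomes arbitrarily positive. The function name `logit` and the probe values are illustrative assumptions.

```python
import math


def logit(p):
    """Log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1 - p))


# Probe values close to 0, at the midpoint, and close to 1.
for p in (1e-9, 0.5, 1 - 1e-9):
    print(p, logit(p))
```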
((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case
of Logistic Regression they do not
THIS IS
MANDATORY
OPTION
((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case
of Linear Regression they do not
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be
normally distributed
This is optional
((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to
be normally distributed
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
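The answer options are not present in the source. As a sketch of the relationship between the three functions named in the note, the logistic function undoes the logit, so Logistic(x) and Logit_inv(x) are the same function. The function names below are illustrative assumptions.

```python
import math


def logistic(x):
    """Logistic (sigmoid) function: maps any real x into (0, 1)."""
    return 1 / (1 + math.exp(-x))


def logit(p):
    """Log-odds: maps a probability in (0, 1) to a real number."""
    return math.log(p / (1 - p))


# Round trip: applying logit after logistic recovers the input (up to rounding).
for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(x, logit(logistic(x)))
```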
((QUESTION)) Suppose you applied a Logistic Regression model on given data and
got a training accuracy X and testing accuracy Y. Now, you want to
add a few new features to the same data. Select the option(s) which
is/are correct in such a case.
Note: Consider that the remaining parameters are the same.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A&D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One
of the problems you may face on such huge data is that Logistic Regression
will take a very long time to train.
What would you do if you want to train logistic regression on the same data
so that it takes less time while giving comparatively similar
accuracy (it may not be the same)?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Decrease the learning rate and decrease the number of iterations
THIS IS
MANDATORY
OPTION
((OPTION_B)) Decrease the learning rate and increase the number of iterations
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Increase the learning rate and increase the number of iterations
This is optional
((OPTION_D)) Increase the learning rate and decrease the number of iterations
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following images shows the cost function for y = 1?
Following is the loss function in logistic regression (Y-axis: loss function; X-axis: log probability) for a two-
class classification problem.
Note: Y is the target class.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) A
THIS IS
MANDATORY
OPTION
((OPTION_B)) B
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) BOTH
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
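The images for this question are missing from the source. As a sketch, the per-example logistic loss can be written in terms of the predicted probability p of class 1 (an assumption; the question's axis is described as log probability). For y = 1 the loss reduces to -log(p), which is large when p is near 0 and falls to 0 as p approaches 1. The function name `log_loss_single` is illustrative.

```python
import math


def log_loss_single(y, p):
    """Logistic (cross-entropy) loss for one example with label y in {0, 1} and 0 < p < 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# For y = 1 the loss decreases as the predicted probability of class 1 grows.
for p in (0.1, 0.5, 0.9):
    print(p, log_loss_single(1, p))
```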
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The odds ratio is:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.
THIS IS
MANDATORY
OPTION
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.
This is optional
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
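Option C can be checked numerically. The sketch below assumes a one-predictor logistic model with logit(p) = b0 + b1 * x and hypothetical coefficients; the ratio of the odds after a unit change in x to the original odds equals exp(b1), regardless of the starting value of x.

```python
import math


def odds(b0, b1, x):
    """Odds p / (1 - p) implied by a logistic model with logit(p) = b0 + b1 * x."""
    p = 1 / (1 + math.exp(-(b0 + b1 * x)))
    return p / (1 - p)


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.7

# Odds after a unit change in the predictor divided by the original odds.
odds_ratio = odds(b0, b1, 3.0) / odds(b0, b1, 2.0)
print(odds_ratio, math.exp(b1))
```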
((QUESTION)) Large values of the log-likelihood statistic indicate:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) That there are a greater number of explained vs. unexplained observations.
THIS IS
MANDATORY
OPTION
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.
This is optional
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Logistic regression assumes a:
((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.
((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.
((CORRECT_CHOICE)) B
((MARKS)) 1
((QUESTION)) In binary logistic regression:
((CORRECT_CHOICE)) C
((MARKS)) 1
((OPTION_D)) none
((CORRECT_CHOICE)) C
((CORRECT_CHOICE)) B
((MARKS)) 2
This sheet is for 3 Mark questions
S.r No | Question | Image | a | b | c | d | Correct Answer
e.g | Write down question | img.jpg | Option a | Option b | Option c | Option d | a/b/c/d
1 | Which of the following is characteristic of best machine learning method? | | fast | accuracy | scalable | All above | D
2 | What are the different Algorithm techniques in Machine Learning? | | Supervised Learning and Semi- | Unsupervised Learning and Transduction | Both A & B | None of the Mentioned | C
3 | ______ can be adopted when it's necessary to categorize a large amount of data with a few complete examples or when there's the need to | | Supervised | Semi-supervised | Reinforcement | Clusters | B
4 | In reinforcement learning, this feedback is usually called ___. | | Overfitting | Overlearning | Reward | None of above | C
5 | In the last decade, many researchers started training bigger and bigger models, built with several different layers; that's why this approach is called _____. | | Deep learning | Machine learning | Reinforcement learning | Unsupervised learning | A
6 | What does learning exactly mean? | | Robots are programmed so that they can | A set of data is used to discover the | Learning is the ability to change | It is a set of data is used to discover the | C
7 | When it is necessary to allow the model to develop a generalization ability and avoid a common problem called ______. | | Overfitting | Overlearning | Classification | Regression | A
8 | Techniques that involve the usage of both labeled and unlabeled data are called ___. | | Supervised | Semi-supervised | Unsupervised | None of the above | B
9 | There's a growing interest in pattern recognition and associative memories whose structure and functioning are similar to what happens in the neocortex. Such an | | Regression | Accuracy | Model-free | Scalable | C
10 | ______ showed better performance than other approaches, even without a context-based model | | Machine learning | Deep learning | Reinforcement learning | Supervised learning | B
14 | What is the function of 'Supervised Learning'? | -- | Classifications, Predict time series, Annotate strings | Speech recognition, Regression | Both A & B | None of above | C
15 | Common unsupervised applications include | -- | Object segmentation | Similarity detection | Automatic labeling | All above | D
16 | Reinforcement learning is particularly efficient when ______________. | -- | the environment is not completely deterministic | it's often very dynamic | it's impossible to have a precise error measure | All above | D
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
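The effect of a one-unit change in x1 can be verified directly from the equation in question 15. This is a minimal sketch; the function name `predict` and the sample inputs are chosen only for illustration.

```python
def predict(x1, x2):
    """The regression equation from the question: y = 2 + 3*x1 + 4*x2."""
    return 2 + 3 * x1 + 4 * x2

# Increase x1 by one unit while holding x2 constant.
change = predict(6, 10) - predict(5, 10)
print(change)  # 3, the coefficient of x1
```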
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
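The multiple coefficient of determination in question 20 follows from R^2 = 1 - SSE/SST. A short check of the arithmetic, with `r_squared` as a hypothetical helper name:

```python
def r_squared(sst, sse):
    """Multiple coefficient of determination: share of total variation explained."""
    return 1 - sse / sst

r2 = r_squared(sst=200, sse=50)
print(r2)  # 0.75
```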
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
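Stratification can be sketched with the standard library alone. The helper `stratified_split`, its parameters, and the toy labels below are assumptions made for this illustration, not a reference implementation: indices are grouped by class, and the same fraction of each class goes to the test set.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices so each class keeps roughly the same share in train and test."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

labels = ["yes"] * 8 + ["no"] * 4
train_idx, test_idx = stratified_split(labels)
print([labels[i] for i in test_idx])
```

With these toy labels, both sets keep the original 2:1 ratio of "yes" to "no".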
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
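The definition in question 30 can be computed as below. The sample values are made up for illustration; `statistics.variance` returns the sample variance (n - 1 denominator).

```python
import math
import statistics

sample = [4.0, 7.0, 6.0, 5.0, 8.0]  # illustrative data only

sample_variance = statistics.variance(sample)  # n - 1 in the denominator
standard_error = math.sqrt(sample_variance / len(sample))
print(sample_variance, standard_error)
```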
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
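A negative correlation coefficient can be reproduced with a small Pearson calculation. The function `pearson` and the data are illustrative assumptions; the point is only that when one attribute rises while the other tends to fall, r comes out negative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
ys = [10, 8, 9, 5, 3]  # tends to fall as xs rises
r = pearson(xs, ys)
print(round(r, 3))
```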
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
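The error measures named in questions 28 and 35 differ only in how the individual errors are combined. A minimal sketch with made-up actual and predicted values:

```python
import math

actual = [3.0, 5.0, 2.0, 7.0]      # illustrative values only
predicted = [2.0, 5.0, 4.0, 8.0]

errors = [p - a for p, a in zip(predicted, actual)]

mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
mse = sum(e ** 2 for e in errors) / len(errors)   # mean squared error
rmse = math.sqrt(mse)                             # root mean squared error
print(mae, mse, rmse)
```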
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
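The K-Means stopping rule described in question 45 can be sketched on one-dimensional data. The function name, the starting means, and the data are assumptions chosen for illustration.

```python
def k_means_1d(points, means, max_iter=100):
    """Run K-Means on 1-D points; stop when the means no longer change."""
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in means]
        for x in points:
            nearest = min(range(len(means)), key=lambda i: abs(x - means[i]))
            clusters[nearest].append(x)
        # Update step: recompute each mean (keep the old mean for an empty cluster).
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        # Termination: means identical to those of the previous iteration.
        if new_means == means:
            return means, iteration
        means = new_means
    return means, max_iter

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_means, iterations = k_means_1d(points, [1.0, 2.0])
```

With these starting means, the third iteration reproduces the means of the second, so the loop stops there.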
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it is easier to find a hypothesis that fits the training data exactly,
i.e., that overfits.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove that column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
LDA is an example of a supervised dimensionality reduction algorithm, but unsupervised methods
such as PCA do not need a target variable.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions, you will usually lose some information from the
data, and you will not be able to interpret the lower-dimensional features directly.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from
a college.
1) Which of the following statement is true in following case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables whose categories have some order. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more data to consider when fitting the logistic regression. Testing accuracy increases only if
the added feature is found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
a prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared error metric to
evaluate the model performance in such a case. The remaining options are used for
classification problems.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. Lasso regression applies an absolute-value (L1) penalty, which makes some of the
coefficients exactly zero.
19. Suppose that we have N independent variables (X1,X2… Xn) and dependent variable is Y.
Now Imagine that you are applying linear regression by fitting the best fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. You are given two variables V1 and V2 that have the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider the both of these two statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
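A small sketch makes the point concrete: on data that is symmetric around zero, y=x^2 and y=|x| are fully determined by x yet have (numerically) zero Pearson correlation with it. The helper name and the sample points are assumptions chosen for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

xs = [-3, -2, -1, 0, 1, 2, 3]                      # symmetric around zero
r_square = pearson(xs, [x ** 2 for x in xs])       # y = x^2
r_abs = pearson(xs, [abs(x) for x in xs])          # y = |x|
r_line = pearson(xs, [2 * x + 1 for x in xs])      # a straight line, for contrast
```

Only the straight line yields a correlation close to 1; the two nonlinear relationships yield values close to 0.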
22. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
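For a model of the form Y = A0 + A1X, the Normal Equation reduces to solving a 2x2 linear system, which the sketch below does directly. The function name and data are assumptions for illustration. With many features the matrix to invert grows with the number of features, which is why the method becomes slow.

```python
def normal_equation_simple(xs, ys):
    """Solve (X^T X) theta = X^T y for a model with an intercept and one feature."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sum_x], [sum_x, sum_xx]] and X^T y = [sum_y, sum_xy].
    det = n * sum_xx - sum_x * sum_x
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / det
    slope = (n * sum_xy - sum_x * sum_y) / det
    return intercept, slope

# Points that lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = normal_equation_simple(xs, ys)
```

No learning rate is chosen and no loop over training iterations is needed; the coefficients come out of one closed-form computation.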
25. What will happen when you apply a very large penalty in case of Ridge?
A) Some of the coefficient will become absolute zero
B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients
become close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
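The difference between the two penalties can be seen in closed form for a single feature without an intercept. This is a simplifying assumption made only for this sketch; the objectives are written in the comments.

```python
import math

def ridge_coefficient(xs, ys, penalty):
    """Minimizer of 0.5 * sum((y - w*x)^2) + 0.5 * penalty * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)          # shrinks toward zero, never exactly zero

def lasso_coefficient(xs, ys, penalty):
    """Minimizer of 0.5 * sum((y - w*x)^2) + penalty * |w| (soft thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - penalty, 0.0)  # becomes exactly zero once penalty >= |sxy|
    return math.copysign(shrunk, sxy) / sxx

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                       # y = 2x, so the unpenalized coefficient is 2

ridge_large = ridge_coefficient(xs, ys, penalty=1000.0)
lasso_large = lasso_coefficient(xs, ys, penalty=1000.0)
```

With the large penalty the Ridge coefficient is small but still positive, while the Lasso coefficient is exactly zero.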
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output is not a real-valued quantity, so
mean squared error is not used for evaluating it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select as the best logistic regression model the one which has the least AIC.
Solution: A
In the case of lasso we apply an absolute penalty; after increasing the penalty in lasso, some of the
coefficients of variables may become zero.
Context: 48-49
Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.
In the above equation, P(y = 1 | x; w) is viewed as a function of x that we can obtain by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49 In above question what do you think which function would make p between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
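The odds and logit definitions used in questions 50 and 51 can be checked numerically with a short sketch. The function names are illustrative; the endpoints p = 0 and p = 1 are excluded because the odds or the logarithm are undefined there.

```python
import math

def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

fair_coin_odds = odds(0.5)       # equal chance of heads and tails
logit_low = logit(0.001)         # probabilities near 0 map to large negative values
logit_mid = logit(0.5)           # odds of 1 map to 0
logit_high = logit(0.999)        # probabilities near 1 map to large positive values
```

As p moves toward 0 or 1, the logit grows without bound in the negative or positive direction.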
A) Linear Regression errors values has to be normally distributed but in case of Logistic Regression it is
not the case
B) Logistic Regression errors values has to be normally distributed but in case of Linear Regression it is
not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Both Linear Regression and Logistic Regression error values have not to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
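That the logistic function is the inverse of the logit can be verified with a round trip, as in this minimal sketch (function names are illustrative):

```python
import math

def logistic(z):
    """Logistic (sigmoid) function; maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log of the odds for a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Applying logit and then logistic should return the original probability.
probabilities = [0.1, 0.25, 0.5, 0.9]
round_trip = [logistic(logit(p)) for p in probabilities]

# Logistic outputs stay strictly between 0 and 1 for moderate inputs.
outputs = [logistic(z) for z in (-10.0, 0.0, 10.0)]
```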
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
data to consider when fitting the logistic regression. Testing accuracy increases only if the added
feature is found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression has to fit, where the probability of each
category is predicted over the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the following above figure shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
In the third figure (C) the decision boundary is not smooth, which means it is overfitting the data.
1. The training error in first plot is maximum as compare to second and third plot.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
polynomial (right graph) might have very high accuracy on the training population but is expected to
fail badly on the test dataset. The left graph has the maximum training error because it underfits the
training data.
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty and therefore a less complex decision boundary, which is
what the first figure (A) shows.
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data in less time while
getting comparatively similar accuracy (it may not be the same)?
Solution: D
If you decrease the number of iterations while training, it will surely take less time but will not give
the same accuracy. To get similar (though not exactly the same) accuracy, you need to increase the
learning rate.
62. Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log
probability) for a two-class classification problem.
Which of the following images shows the cost function for y = 1?
Solution: A
A is the true answer as loss function decreases as the log probability increases
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
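The figure for question 64 is not reproduced here. Assuming it shows an XOR-like labelling of the four binary points (an assumption, not stated in the text), the sketch below searches a grid of linear boundaries w0 + w1*X1 + w2*X2 > 0 and finds none that classifies XOR perfectly, while one exists for AND. The grid and function name are illustrative.

```python
from itertools import product

def linearly_separable(points, labels, grid):
    """Return True if some (w0, w1, w2) on the grid classifies every point correctly."""
    for w0, w1, w2 in product(grid, repeat=3):
        if all((w0 + w1 * x1 + w2 * x2 > 0) == bool(y)
               for (x1, x2), y in zip(points, labels)):
            return True
    return False

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
grid = [i / 2 for i in range(-6, 7)]        # candidate weights from -3.0 to 3.0

xor_separable = linearly_separable(points, [0, 1, 1, 0], grid)   # assumed labelling
and_separable = linearly_separable(points, [0, 0, 0, 1], grid)   # contrast case
```

The grid search is only an illustration; XOR fails for every linear boundary, not just the ones tried here, because the constraints on the four points contradict each other.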
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
With such a high misclassification penalty, the soft margin effectively ceases to exist, as there is no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVM’s?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
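Training time grows at least quadratically with the number of samples, so large datasets are not well
suited to SVM’s. Datasets which have a clear classification boundary will function best with SVM’s.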
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
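The effect of gamma can be sketched with the usual RBF kernel form exp(-gamma * ||a - b||^2), which measures similarity between two points. The specific gamma values and points below are illustrative assumptions.

```python
import math

def rbf_kernel(a, b, gamma):
    """RBF kernel value between points a and b: exp(-gamma * squared distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

origin = (0.0, 0.0)
near_point = (0.5, 0.0)
far_point = (3.0, 0.0)

# Low gamma: similarity decays slowly, so far-away points still have influence.
low_gamma_far = rbf_kernel(origin, far_point, gamma=0.1)
# High gamma: similarity decays quickly, so only nearby points matter.
high_gamma_near = rbf_kernel(origin, near_point, gamma=10.0)
high_gamma_far = rbf_kernel(origin, far_point, gamma=10.0)
```

With the high gamma the far point contributes essentially nothing, which is why the resulting boundary follows nearby points closely.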
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error prone, which means that
you should not trust any specific data point too much. Now suppose that you want to build an SVM model
which has a quadratic kernel function of polynomial degree 2 and uses the slack variable C as one of its
hyperparameters. Based upon that, give the answer for the following question.
What would happen when you use a very large value of C (C->infinity)?
Note: For small C, the model was also classifying all data points correctly
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you correctly
infer that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM’s hyperparameters so that the effect
would be the same as in the previous questions, i.e. the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here, as it weakens the regularization and
lets the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Now answer the questions below.
27. How many times we need to train our SVM model in such case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose you have the same distribution of classes in the data. Now, say that training 1 time in the
one-vs-all setting takes the SVM 10 seconds. How many seconds would it require to train the one-vs-all
method end to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
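The arithmetic behind questions 27 and 28 can be sketched as follows: one-vs-all builds one binary problem per class, so 4 classes at 10 seconds per training run take 40 seconds. The class names and helper function are hypothetical.

```python
def one_vs_all_labels(labels):
    """Build one binary labelling per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

# Hypothetical 4-class labels.
labels = ["a", "b", "c", "d", "a", "c"]
binary_problems = one_vs_all_labels(labels)

n_models = len(binary_problems)            # one binary classifier per class
seconds_per_model = 10                     # figure given in question 28
total_seconds = n_models * seconds_per_model
```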
29. Suppose your problem has changed now. Now, data has only 2 classes. What would you think how
many times we need to train SVM in such case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
Suppose you are using an SVM with a polynomial kernel of degree 2. Now suppose that you have applied
this to the data and found that it fits the data perfectly, meaning that training and testing accuracy
are both 100%.
30. Now, suppose that you increase the complexity (or degree of the polynomial of this kernel). What do
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question after increasing the complexity you found that training accuracy was still
100%. According to you what is the reason behind that?
1. Since data is fixed and we are fitting more polynomial term or parameters so the algorithm starts
memorizing everything in the data
2. Since data is fixed and SVM doesn’t need to search in big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In Bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered as improving the base
learners’ results.
9. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate
the results of these tree. Which of the following is true about individual (Tk) tree in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
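How each tree ends up with its own rows and features can be sketched as below. The function name, the feature fraction, and the per-tree feature subset follow the wording of the explanation above and are assumptions; many implementations instead resample the candidate features at every split.

```python
import random

def draw_tree_inputs(n_rows, n_features, n_trees, feature_fraction=0.5, seed=0):
    """For each tree, draw a bootstrap sample of rows and a random subset of features."""
    rng = random.Random(seed)
    n_selected = max(1, int(n_features * feature_fraction))
    plans = []
    for _ in range(n_trees):
        # Bootstrap: sample row indices with replacement (duplicates are expected).
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Feature subset: sample feature indices without replacement.
        features = sorted(rng.sample(range(n_features), n_selected))
        plans.append((rows, features))
    return plans

plans = draw_tree_inputs(n_rows=10, n_features=6, n_trees=3)
```

Because every tree draws its rows and features independently, the trees can be trained in any order or in parallel.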
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger number
of trees in model building if possible. Random Forest is a black-box model, so you will lose
interpretability after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these tree. Which of the following is true about individual(Tk) tree in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
13. Which of the following algorithms do not use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A single decision tree does not aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want, if possible, a
larger number of trees in model building. Random Forest is a black-box model, so you lose
interpretability after using it.
16. True-False: Bagging is suitable for high-variance, low-bias models.
A) TRUE
B) FALSE
Solution: A
Bagging is suitable for high-variance, low-bias models, that is, for complex models.
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenario a gain ratio is preferred over Information Gain?
Solution: A
For high-cardinality attributes, gain ratio is preferred over information gain.
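A small sketch of why this holds, using made-up data: an ID-like attribute with one value per row gets the same information gain as a genuinely useful two-valued attribute, but its gain ratio is lower because the split information in the denominator grows with cardinality. The variable names and values below are illustrative assumptions.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())

def information_gain(attribute, labels):
    """Reduction in label entropy after splitting on the attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(attribute):
        subset = [l for a, l in zip(attribute, labels) if a == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(attribute, labels):
    """Information gain divided by the split information of the attribute."""
    split_info = entropy(attribute)
    return information_gain(attribute, labels) / split_info if split_info > 0 else 0.0

labels = ["no", "no", "yes", "yes"]
record_id = ["a", "b", "c", "d"]   # high cardinality: one value per row
group = ["x", "x", "y", "y"]       # low cardinality, equally informative

ig_id, ig_group = information_gain(record_id, labels), information_gain(group, labels)
gr_id, gr_group = gain_ratio(record_id, labels), gain_ratio(group, labels)
```

Here both attributes reach the maximum information gain, while the gain ratio of `record_id` is half that of `group`.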
20. Suppose you have given the following scenario for training and validation error for Gradient
Boosting. Which of the following hyper parameter would you choose in such case?
Scenario  Depth  Training Error  Validation Error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select 2 because a lower depth is the
better hyperparameter.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
1. Techniques of feature engineering involve:
A. Clean dataset
B. Increase their signal-noise ratio
C. Reduce dimensionality
D. All of these
ANSWER: D
4. The original dataset must be randomly shuffled before the split phase
A. avoid a correlation between consequent elements
B. avoid a sequencing between consequent elements
C. build a relation between consequent elements
D. None of these
ANSWER: A
18. Which option should be considered only when the dataset is quite large, the number
of missing features is high, and any prediction could be risky
A. Removing the whole line
B. Creating sub-model to predict those features
C. Using an automatic strategy to impute them according to the other known values
D. All of these
ANSWER: A
19. While managing missing values, which method is said as best choice
A. Removing the whole line
B. Creating sub-model to predict those features
C. Using an automatic strategy to impute them according to the other known values
D. All of these
ANSWER: C
20. What are data preprocessing techniques to handle outliers
A. Winsorize (cap at threshold).
B. Transform to reduce skew (using Box-Cox or similar).
C. Remove outliers if you're certain they are anomalies or measurement errors.
D. All of above
ANSWER: D
21. Which of the following models includes a backwards elimination feature selection
routine?
A. MCV
B. MARS
C. MCRS
D. All of the Mentioned
ANSWER: B
25. If you split your data into train/test splits, is it still possible to overfit your
model?
A. True
B. False
C. None of these
D. All of these
ANSWER: A
26. How do you handle missing or corrupted data in a dataset
A. Drop missing rows or columns
B. Replace missing values with mean/median/mode
C. Assign a unique category to missing values
D. All of the above
ANSWER: D
28. The technique that rescales a feature or observation value to a distribution between
0 and 1 is known as
A. Mean Normalization
B. Max Normalization
C. Mode Normalization
D. Min-Max Normalization
ANSWER: D
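As a minimal sketch of Min-Max Normalization (the function name `min_max_scale` and the sample values are illustrative assumptions), each value is mapped to (v - min) / (max - min):

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: there is no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 40])
```

The smallest input becomes 0, the largest becomes 1, and 20 lands one third of the way between them.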
29. Feature Scaling is a technique to standardize the independent features present in the
data in a fixed range
A. True
B. False
C. None of these
D. All of these
ANSWER: A
30. To calculate the distance between centroid and data point which method is used
A. Euclidean Distance
B. Manhattan Distance
C. Minkowski Distance
D. All of the above
ANSWER: D
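The three options are related: Manhattan and Euclidean distance are the Minkowski distance of order 1 and 2 respectively. A short sketch with made-up coordinates (the names `point` and `centroid` are illustrative):

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

point, centroid = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski_distance(point, centroid, p=1)  # |3| + |4|
euclidean = minkowski_distance(point, centroid, p=2)  # sqrt(3^2 + 4^2)
```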
31. Feature Scaling is a technique to standardize the independent features present in the
data in a fixed range
A. True
B. False
C. None of these
D. All of these
ANSWER: A
32. ------------ performed during the data pre-processing to handle highly varying
magnitudes or values or units
A. Label encoding
B. Feature Scaling
C. Feature extraction
D. Normalization
ANSWER: B
34. Normalization is generally required when we are dealing with attributes on a different
scale
A. True
B. False
C. None of these
D. All of these
ANSWER: A
39. A -----------is a useful approach to remove all those elements whose contribution is
under a predefined level
A. correlation threshold
B. covariance threshold
C. variance threshold
D. None of these
ANSWER: C
40. Imagine, you have 1000 input features and 1 target feature in a machine learning problem.
You have to select 100 most important features based on the relationship between input
features and the target features.Do you think, this is an example of dimensionality
reduction?
A. Yes
B. No
C. None of these
D. All of these
ANSWER: A
41. When performing regression or classification, which of the following is the correct
way to preprocess the data
A. Normalize the data → PCA → training
B. PCA → normalize PCA output → training
C. Normalize the data → PCA → normalize PCA output → training
D. None of the above
ANSWER: A
43. Which of the following is a reasonable way to select the number of principal components
"k"
A. Choose k to be the smallest value so that at least 99% of the variance is retained.
B. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
C. Choose k to be the largest value so that 99% of the variance is retained.
D. Use the elbow method
ANSWER: A
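Option A can be sketched as a cumulative sum over the eigenvalues of the covariance matrix, sorted from largest to smallest. The eigenvalues below are made up for illustration, and the helper name `smallest_k_for_variance` is an assumption.

```python
def smallest_k_for_variance(eigenvalues, retained=0.99):
    """Smallest number of leading components whose variance share >= retained."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= retained:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a covariance matrix (made up for illustration).
k = smallest_k_for_variance([6.0, 3.0, 0.95, 0.04, 0.01])
```

With these values the first two components retain about 90% of the variance and the first three about 99.5%, so three components are kept.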
44. Dimensionality reduction algorithms are one of the possible ways to reduce the
computation time required to build a model.
A. TRUE
B. FALSE
C. None of these
D. All of these
ANSWER: A
45. When a dataset is made up of non-negative elements can we use non-negative matrix
factorization (NNMF) instead of standard PCA
A. Yes
B. NO
C. None of these
D. All of these
ANSWER: A
46. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
C. None of these
D. All of these
ANSWER: A
47. Which of the following is/are true about PCA? 1. PCA is an unsupervised method. 2.
It searches for the directions that data have the largest variance 3. Maximum number of
principal components <= number of features 4. All principal components are orthogonal to
each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of these
ANSWER: D
48. ------- allows exploiting the natural sparsity of data while extracting principal
components
A. Standard PCA
B. Kernel PCA
C. Sparse PCA
D. All of the above
ANSWER: C
50. Dictionary learning is a technique which allows rebuilding a sample starting from a
sparse dictionary of atoms
A. True
B. False
C. None of these
D. All of these
ANSWER: A
1. In practice, Line of best fit or regression line is found when _____________
a) Sum of residuals (∑(Y – h(X))) is minimum
b) Sum of the absolute value of residuals (∑|Y-h(X)|) is maximum
c) Sum of the squares of residuals (∑(Y – h(X))²) is minimum
d) Sum of the squares of residuals (∑(Y – h(X))²) is maximum
Answer: c
Explanation: Here we penalize higher error value much more as compared to the smaller one,
such that there is a significant difference between making big errors and small errors,
which makes it easy to differentiate and select the best fit line.
2. If a linear regression model fits perfectly, i.e., the train error is zero, then
_____________________
a) Test error is also always zero
b) Test error is non zero
c) Couldn’t comment on Test error
d) Test error is equal to Train error
Answer: c
Explanation: Test Error depends on the test data. If the Test data is an exact representation
of train data then test error is always zero. But this may not be the case.
3. Which of the following metrics can be used for evaluating regression models?
i) R Squared
ii) Adjusted R Squared
iii) F Statistics
iv) RMSE / MSE / MAE
a) ii and iv
b) i and ii
c) ii, iii and iv
d) i, ii, iii and iv
Answer: d
Explanation: These (R Squared, Adjusted R Squared, F Statistics, RMSE / MSE / MAE) are some
metrics which you can use to evaluate your regression model.
4. How many coefficients do you need to estimate in a simple linear regression model (One
independent variable)?
a) 1
b) 2
c) 3
d) 4
Answer: b
Explanation: In simple linear regression, there is one independent variable so 2
coefficients (Y=a+bx+error).
5. In a simple linear regression model (One independent variable), If we change the input
variable by 1 unit. How much output variable will change?
a) by 1
b) no change
c) by intercept
d) by its slope
Answer: d
Explanation: For linear regression Y=a+bx+error. If we neglect the error, then Y=a+bx. If x
increases by 1, then Y = a+b(x+1), which implies Y=a+bx+b. So Y increases by its slope.
8. In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to
__________
a) (X-intercept, Slope)
b) (Slope, X-Intercept)
c) (Y-Intercept, Slope)
d) (slope, Y-Intercept)
Answer: c
A) TRUE
B) FALSE
Solution: (A)
Yes, Linear regression is a supervised learning algorithm because it uses true labels for
training. Supervised learning algorithm should have input variable (x) and an output
variable (Y) for each example.
A) TRUE
B) FALSE
Solution: (A)
A) TRUE
B) FALSE
Solution: (A)
12. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify
the line of best fit.
13. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean
squared error metric to evaluate model performance. The remaining options are used in the case
of a classification problem.
14. True-False: Lasso Regularization can be used for variable selection in Linear
Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute penalty, which makes some of the
coefficients zero.
A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.
16. Suppose that we have N independent variables (X1,X2… Xn) and dependent variable is Y.
Now Imagine that you are applying linear regression by fitting the best fit line using least
square error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between
X1 and Y.
Suppose you are given two variables V1 and V2 that follow the two characteristics below.
17. Looking at the above two characteristics, which of the following options is correct for
the Pearson correlation between V1 and V2?
Solution: (D)
18. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that
they don’t move together linearly. We can take examples like y=|x| or y=x^2.
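Both examples can be checked numerically. The sketch below uses a small symmetric set of x values chosen for illustration; in both cases y is fully determined by x, yet the Pearson coefficient is zero.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]
r_square = pearson(xs, [x ** 2 for x in xs])   # y = x^2
r_abs = pearson(xs, [abs(x) for x in xs])      # y = |x|
```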
19. Which of the following offsets, do we use in linear regression’s least square line fit?
Suppose horizontal axis is independent variable and vertical axis is dependent variable.
A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of above
Solution: (A)
20. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data
exactly i.e. overfitting.
21. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, the Normal Equation can also be used to find the coefficients.
22. Which of the following statement is true about sum of residuals of A and B?
Below graphs show two fitted regression lines (A & B) on randomly generated data. Now, I
want to find the sum of residuals in both cases A and B.
Note:
Solution: (C)
For a least-squares fit with an intercept, the sum of residuals is always zero, so both have the same sum of residuals.
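This property can be verified on any data set. The sketch below fits y = a + b*x by ordinary least squares on made-up points (the data and the helper name `fit_line` are illustrative assumptions) and sums the residuals.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return a, b

# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)   # zero up to floating-point rounding
```

The result follows from the intercept term: setting the derivative of the squared error with respect to a to zero forces the residuals to sum to zero.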
23. Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with penalty x. Choose the option which best describes the bias.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Solution: (B)
If the penalty is very large, the model is less complex, so the bias would be high.
24. What will happen when you apply a very large penalty?
Solution: (B)
In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients
become close to zero but not exactly zero.
25. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will
become zero.
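The difference can be seen in a one-coefficient simplification, which is an assumption made here for illustration rather than the general multi-feature solver. With an unpenalized estimate b_ols, the absolute (lasso) penalty leads to soft-thresholding, while the squared (ridge) penalty only rescales the coefficient.

```python
def lasso_1d(b_ols, lam):
    """Minimizer of 0.5*(b_ols - beta)**2 + lam*abs(beta): soft-thresholding."""
    magnitude = max(abs(b_ols) - lam, 0.0)
    return magnitude if b_ols >= 0 else -magnitude

def ridge_1d(b_ols, lam):
    """Minimizer of 0.5*(b_ols - beta)**2 + 0.5*lam*beta**2: pure shrinkage."""
    return b_ols / (1.0 + lam)

lasso_coef = lasso_1d(0.5, lam=1.0)   # penalty exceeds |b_ols|: exactly zero
ridge_coef = ridge_1d(0.5, lam=1.0)   # shrunk, but still non-zero
```

Once the penalty exceeds the size of the unpenalized coefficient, the lasso estimate is exactly zero, whereas the ridge estimate only approaches zero as the penalty grows.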
26. Which of the following statement is true about outliers in Linear regression?
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
27. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
Solution: (A)
There should not be any relationship between predicted values and residuals. If there exists
any relationship between them, it means that the model has not perfectly captured the
information in the data.
Suppose that you have a dataset D1 and you design a linear regression model of degree 3
polynomial and you found that the training and testing error is “0” or in another terms
it perfectly fits the data.
28. What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (A)
Since a degree 4 polynomial is more complex than the degree 3 model, it will again fit the
training data perfectly and is likely to overfit. In such a case the training error will be
zero but the test error may not be zero.
29. What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that degree 2 polynomial will over fit the data
B) There are high chances that degree 2 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (B)
If a degree 3 polynomial fits the data perfectly, it’s highly likely that a simpler
model(degree 2 polynomial) might under fit the data.
30. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
Solution: (C)
Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will
be high and variance will be low.
Which of the following is true about below graphs(A,B, C left to right) between the cost
function and Number of iterations?
31. Suppose l1, l2 and l3 are the three learning rates for A,B,C respectively. Which of
the following is true about l1,l2 and l3?
A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these
Solution: (A)
With a high learning rate, the steps are large: the objective function decreases
quickly at first, but it does not settle at the global minimum and starts
increasing after a few iterations.
With a low learning rate, the steps are small, so the objective function decreases
slowly.
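A toy sketch of this behaviour, assuming the simple cost f(w) = w^2 and three hand-picked learning rates (all values here are illustrative, not taken from the graphs). On this convex toy problem the overly large rate makes the cost grow from the first step, which is a simpler picture than the "falls, then rises" curve described above.

```python
def cost_history(learning_rate, steps=20, start=5.0):
    """Gradient descent on the toy cost f(w) = w**2; returns the cost per step."""
    w = start
    history = [w ** 2]
    for _ in range(steps):
        w -= learning_rate * 2 * w   # gradient of w**2 is 2*w
        history.append(w ** 2)
    return history

slow = cost_history(0.01)       # small steps: cost falls, but slowly
good = cost_history(0.4)        # moderate steps: cost falls quickly
diverging = cost_history(1.1)   # steps too large: cost grows every iteration
```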
We have been given a dataset with n records in which we have input attribute as x and output
attribute as y. Suppose we use a linear regression method to model this data. To test our
linear regressor, we split the data in training set and test set randomly.
32. Now we increase the training set size gradually. As the training set size increases,
what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the
model. If the values used to train contain more outliers gradually, then the error might
just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
Solution: (D)
As we increase the size of the training data, the bias would increase while the variance
would decrease.
Consider the following data where one input(X) and one output(Y) is given.
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
Suppose you have been given the following scenario for training and validation error for
Linear Regression.
Scenario  Learning Rate  Number of iterations  Training Error  Validation Error
1         0.1            1000                  100             110
2         0.2            600                   90              105
3         0.3            400                   110             110
4         0.4            300                   120             130
5         0.4            250                   130             150
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B is the better choice because it gives the lowest training as well as validation
error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important.
Which of the following thing would you observe in such case?
Solution: (D)
If the added feature is important, the training and validation error would decrease.
Suppose, you got a situation where you find that your linear regression model is under
fitting the data.
37. In such situation which of the following options would you consider?
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
Solution: (A)
In case of under fitting, you need to introduce more variables into the variable space, or you can
add some polynomial degree variables to make the model more complex so that it can fit the
data better.
38. The situation is the same as in the previous question (under fitting). Which of the following
regularization algorithms would you prefer?
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
39. Which of the following step / assumption in regression modeling impacts the trade-off
between under-fitting and over-fitting the most.
Solution: A
Choosing the right degree of polynomial plays a critical role in fit of regression. If we
choose higher degree of polynomial, chances of overfit increase significantly.
40. Suppose you have the following data with one real-value input variable & one real-value
output variable. What is leave-one out cross validation mean square error in case of linear
regression (Y = bX+c)?
A. 10/27
B. 20/27
C. 50/27
D. 49/27
Solution: D
We need to calculate the residuals for each cross validation point. After fitting the line
with 2 points and leaving 1 point for cross validation.
Leave one out cross validation mean square error = (2^2 +(2/3)^2 +1^2) /3 = 49/27
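The arithmetic can be reproduced in code. The data table for this question did not survive, so the three points below are an assumption: they were chosen because they give exactly the residuals 2, 2/3 and 1 quoted in the solution. The helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - b * mean_x, b

def loocv_mse(points):
    """Leave-one-out cross-validation mean squared error for a line fit."""
    squared_errors = []
    for i, (x_out, y_out) in enumerate(points):
        train = points[:i] + points[i + 1:]          # drop the held-out point
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        squared_errors.append((y_out - (a + b * x_out)) ** 2)
    return sum(squared_errors) / len(squared_errors)

# Assumed data (not from the original table): reproduces residuals 2, 2/3, 1.
points = [(0.0, 2.0), (2.0, 2.0), (3.0, 1.0)]
mse = loocv_mse(points)
```

With these points the result agrees with 49/27, which is about 1.81.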
41. Which of the following is/ are true about “Maximum Likelihood estimate (MLE)”?
A. 1 and 4
B. 2 and 3
C. 1 and 3
D. 2 and 4
Solution: C
The MLE may not be a turning point i.e. may not be a point at which the first derivative
of the likelihood (and log-likelihood) function vanishes.
42. Let’s say, a “Linear regression” model perfectly fits the training data (train error
is zero). Now, Which of the following statement is true?
Solution: C
Test error may be zero if there is no noise in the test data. In other words, it will be zero
if the test data is a perfect representative of the train data, but this is not always the case.
C. Individually R squared cannot tell about variable importance. We can’t say anything about
it right now.
D. None of these.
Solution: C
“R squared” individually can’t tell whether a variable is significant or not because each
time when we add a feature, “R squared” can either increase or stay constant. But, it is
not true in case of “Adjusted R squared” (increases when features found to be significant).
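A short numeric sketch of the difference, using the standard formula Adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1) with n observations and p features. The R squared values and sample sizes below are made up for illustration.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

before = adjusted_r2(0.900, n_samples=50, n_features=5)
after = adjusted_r2(0.901, n_samples=50, n_features=6)  # R^2 rose only slightly
```

Here R squared goes up from 0.900 to 0.901 after adding a sixth feature, yet the adjusted value goes down, signalling that the new feature does not justify the extra parameter.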
44. Which one of the statement is true regarding residuals in regression analysis?
Solution: A
The sum of residuals in regression is always zero. If the sum of residuals is zero, the ‘Mean’
will also be zero.
D. None of these
Solution: A
46. Which of the following indicates a fairly strong relationship between X and Y?
D. None of these
Solution: A
Correlation between variables is 0.9. It signifies that the relationship between variables
is fairly strong.
On the other hand, p-value and t-statistics merely measure how strong is the evidence that
there is non zero association. Even a weak effect can be extremely significant given enough
data.
47. Which of the following assumptions do we make while deriving linear regression
parameters?
A. 1,2 and 3.
B. 1,3 and 4.
C. 1 and 3.
D. All of above.
Solution: D
When deriving regression parameters, we make all the four assumptions mentioned above. If
any of the assumptions is violated, the model would be misleading.
A. Scatter plot
B. Barchart
C. Histograms
D. None of these
Solution: A
To test the linear relationship between continuous variables Scatter plot is a good option.
We can find out how one variable is changing w.r.t. another variable. A scatter plot displays
the relationship between two quantitative variables.
49. Generally, which of the following method(s) is used for predicting continuous dependent
variable?
Linear Regression
Logistic Regression
A. 1 and 2
B. only 1
C. only 2
D. None of these.
Solution: B
50. A correlation between age and health of a person found to be -1.09. On the basis of
this you would tell the doctors that:
C. None of these
Solution: C
51. Which of the following offsets, do we use in case of least square line fit? Suppose
horizontal axis is independent variable and vertical axis is dependent variable.
A. Vertical offset
B. Perpendicular offset
D. None of above
Solution: A
We always consider residual as vertical offsets. Perpendicular offset are useful in case
of PCA.
52. Suppose we have generated the data with help of polynomial regression of degree 3 (degree
3 will perfectly fit this data). Now consider below points and choose the option based on
these points.
Simple Linear regression will have high bias and low variance
Simple Linear regression will have low bias and high variance
polynomial of degree 3 will have low bias and high variance
Polynomial of degree 3 will have low bias and Low variance
A. Only 1
B. 1 and 3
C. 1 and 4
D. 2 and 4
Solution: C
If we fit a polynomial of degree higher than 3, it will overfit the data because the model
becomes more complex. If we fit a polynomial of degree lower than 3, we have a less
complex model, so in this case the bias is high and the variance is low. A degree 3
polynomial will have low bias and low variance.
53. Suppose you are training a linear regression model. Now consider these points.
Solution: C
1.With small training dataset, it’s easier to find a hypothesis to fit the training data
exactly i.e. overfitting.
2. We can see this from the bias-variance trade-off. When hypothesis space is small, it
has higher bias and lower variance. So with a small hypothesis space, it’s less likely to
find a hypothesis to fit the data exactly i.e. underfitting.
54. Suppose we fit “Lasso Regression” to a data set, which has 100 features (X1,X2…X100).
Now, we rescale one of these feature by multiplying with 10 (say that feature is X1), and
then refit Lasso regression with the same regularization parameter.
C. Can’t say
D. None of these
Solution: B
Big feature values ⇒ smaller coefficients ⇒ smaller lasso penalty ⇒ more likely to
be kept
55. Which of the following is true about “Ridge” or “Lasso” regression methods in case
of feature selection?
D. None of above
Solution: B
“Ridge regression” will use all predictors in the final model whereas “Lasso regression” can
be used for feature selection because coefficient values can be zero.
56. Which of the following statement(s) can be true post adding a variable in a linear
regression model?
A. 1 and 2
B. 1 and 3
C. 2 and 4
Solution: A
Each time you add a feature, R squared either increases or stays constant, but
this is not true of Adjusted R squared. If Adjusted R squared increases, the feature
would be significant.
57. The following visualization shows the fit of three different models (in blue line) on
same training data. What can you conclude from these visualizations?
The training error in first model is higher when compared to second and third model.
The best model for this regression problem is the last (third) model, because it has
minimum training error.
The second model is more robust than first and third because it will perform better
on unseen data.
The third model is overfitting data as compared to first and second model.
All models will perform same because we have not seen the test data.
A. 1 and 3
B. 1 and 3
C. 1, 3 and 4
D. Only 5
Solution: C
The trend of the data looks like a quadratic trend over independent variable X. A higher
degree (Right graph) polynomial might have a very high accuracy on the train population
but is expected to fail badly on test dataset. But if you see in left graph we will have
training error maximum because it under-fits the training data.
58. Which of the following metrics can be used for evaluating regression models?
R Squared
Adjusted R Squared
F Statistics
A. 2 and 4.
B. 1 and 2.
C. 2, 3 and 4.
Solution: D
These (R Squared, Adjusted R Squared, F Statistics , RMSE / MSE / MAE ) are some metrics
which you can use to evaluate your regression model.
59. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about “Normal
Equation”?
A. 1 and 2
B. 1 and 3.
C. 2 and 3.
D. 1,2 and 3.
Solution: D
Instead of gradient descent, the Normal Equation can also be used to find the coefficients.
60. The expected value of Y is a linear function of the X(X1,X2….Xn) variables and regression
line is defined as:
Y = β0 + β1 X1 + β2 X2……+ βn Xn
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1,2 and 3
Solution: D
The unexplained variations of Y are independent random variables (in particular, not
“auto correlated” if the variables are time series)
They all have the same variance (“homoscedasticity”).
They are normally distributed.
61. How many coefficients do you need to estimate in a simple linear regression model (One
independent variable)?
A. 1
B. 2
C. Can’t Say
Solution: B
Note:
D) None of these
Solution: C
63. If two variables are correlated, is it necessary that they have a linear relationship?
A. Yes
B. No
Solution: B
64. Correlated variables can have zero correlation coefficient. True or False?
A. True
B. False
Solution: A
65. Suppose I applied a logistic regression model on data and got training accuracy X and
testing accuracy Y. Now I want to add few new features in data. Select option(s) which are
correct in such case.
A. Only 2
B. Only 1
C. Only 3
D. Only 4
Solution: A
Adding more features to model will always increase the training accuracy i.e. low bias.
But testing accuracy increases if feature is found to be significant.
66. The graph below represents a regression line predicting Y from X. The values on the
graph shows the residuals for each predictions value. Use this information to compute the
SSE.
A. 3.02
B. 0.75
C. 1.01
D. None of these
Solution: A
SSE is the sum of the squared errors of prediction, so SSE = (-.2)^2 + (.4)^2 + (-.8)^2
+ (1.3)^2 + (-.7)^2 = 3.02
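The same sum can be checked in a couple of lines, using the five residuals quoted in the solution:

```python
# Residuals read off the graph, as listed in the solution above.
residuals = [-0.2, 0.4, -0.8, 1.3, -0.7]
sse = sum(r ** 2 for r in residuals)   # sum of squared errors
print(round(sse, 2))
```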
67. Height and weight are well known to be positively correlated. Ignoring the plot scales
(the variables have been standardized), which of the two scatter plots (plot1, plot2) is
more likely to be a plot showing the values of height (Var1 – X axis) and weight (Var2 –
Y axis).
A. Plot2
B. Plot1
C. Both
D. Can’t say
Solution: A
Plot 2 is definitely a better representation of the association between height and weight.
As individuals get taller, they take up more volume, which leads to an increase in weight,
so a positive relationship is expected. The plot on the right has this positive relationship
while the plot on the left shows a negative relationship.
68. Suppose the distribution of salaries in a company X has median $35,000, and 25th and
75th percentiles are $21,000 and $53,000 respectively.
A. Yes
B. No
D. None of these.
Solution: C
69. Which of the following option is true regarding “Regression” and “Correlation” ?
C. The relationship is not symmetric between x and y in case of correlation but in case
of regression it is symmetric.
Solution: D
Correlation is a statistic metric that measures the linear association between two
variables. It treats y and x symmetrically.
Regression is setup to predict y from x. The relationship is not symmetric.
70. Can we calculate the skewness of variables based on mean and median?
A. True
B. False
Solution: B
The skewness is not directly related to the relationship between the mean and median.
71. Suppose you have n datasets with two continuous variables (y is dependent variable and
x is independent variable). We have calculated summary statistics on these datasets. All
of them give the following result:
A. Yes
B. No
C. Can’t Say
Solution: C
To answer this question, you should know about Anscombe’s quartet.
72. How does number of observations influence overfitting? Choose the correct answer(s).
A. 1 and 4
B. 2 and 3
C. 1 and 3
D. None of these
Solution: A
In particular, if we have very few observations, our models can rapidly overfit the data.
Because we have only a few points, as we increase model complexity (for example, the order
of the polynomial), it becomes very easy to hit all of our observations.
On the other hand, if we have lots and lots of observations, it is difficult to overfit even
with very complex models, because we have dense observations across our input.
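The few-observations case can be sketched as follows. With n points at distinct x values, a polynomial of degree n - 1 passes through every one of them, so training error drops to zero regardless of noise. The data values and the helper name `lagrange_interpolate` are made up for illustration:

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the unique polynomial of degree len(xs) - 1 through all points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Four noisy observations (made-up values) and the cubic through all of them.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.3, 2.8]
training_errors = [abs(lagrange_interpolate(xs, ys, x) - y) for x, y in zip(xs, ys)]
print(max(training_errors))  # essentially zero: the cubic hits every point
```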
73. Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with tuning parameter lambda to reduce its complexity. Choose the option(s)
below which describe the relationship of bias and variance with lambda.
Solution: C
If lambda is very large, the model is less complex, so bias is high and
variance is low.
74. Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with tuning parameter lambda to reduce its complexity. Choose the option(s)
below which describe the relationship of bias and variance with lambda.
Solution: B
If lambda is very small, the model is complex, so bias is low and variance
is high because the model will overfit the data.
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Solution: A
Specifically, when lambda is 0 we get the least-squares solution. As lambda goes to
infinity, the coefficients become very small, approaching 0.
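A small sketch of both limits, assuming the simplest possible setting: one feature, no intercept, and the penalised objective sum((y - w*x)^2) + lambda * w^2, whose minimiser is w = sum(x*y) / (sum(x^2) + lambda). The data and the name `ridge_slope` are illustrative:

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for y ~ w * x (one feature, no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]   # made-up data, roughly y = 2x

for lam in (0.0, 1.0, 100.0, 1e6):
    print(lam, ridge_slope(xs, ys, lam))
```

With lam = 0.0 the estimate equals the ordinary least-squares slope, and it shrinks steadily toward 0 as lam grows.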
76. Out of the three residual plots given below, which of the following represent worse
model(s) compared to others?
Note:
A. 1
B. 2
C. 3
D. 1 and 2
Solution: C
There should not be any relationship between predicted values and residuals. If any
relationship exists between them, the model has not fully captured the information in the data.
77. Which of the following method(s) does not have closed form solution for its coefficients?
A. Ridge regression
B. Lasso
D. None of both
Solution: B
The Lasso does not admit a closed-form solution in general. The L1 penalty is not
differentiable at zero, so the solution has to be computed iteratively.
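One common iterative scheme is cyclic coordinate descent with soft-thresholding. The sketch below assumes the objective 0.5 * sum((y - x.w)^2) + lam * sum(|w_j|). The tiny dataset and the function names are illustrative, and this is not presented as the method any particular library uses:

```python
def soft_threshold(value, threshold):
    # Shrink `value` toward zero by `threshold`; values inside the band become 0.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(rows, targets, lam, sweeps=200):
    """Minimise 0.5 * sum((y - x.w)^2) + lam * sum(|w_j|) by cyclic updates."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    for _ in range(sweeps):
        for j in range(n_features):
            rho = 0.0      # feature j against the residual that ignores feature j
            norm_sq = 0.0  # squared norm of feature j
            for row, target in zip(rows, targets):
                partial = sum(row[k] * weights[k] for k in range(n_features) if k != j)
                rho += row[j] * (target - partial)
                norm_sq += row[j] * row[j]
            weights[j] = soft_threshold(rho, lam) / norm_sq
    return weights

rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # made-up design matrix
targets = [2.0, -1.0, 1.0]                     # generated from weights (2, -1)
print(lasso_coordinate_descent(rows, targets, lam=0.0))  # approaches the least-squares fit
print(lasso_coordinate_descent(rows, targets, lam=1.0))  # second weight is exactly zero
```

The exact zero in the second call is the sparsity that the L1 penalty produces and that the L2 (ridge) penalty does not.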
78. Consider the following dataset
Which bold point, if removed will have the largest effect on fitted regression line as shown
in above figure(dashed)?
A) a
B) b
C) c
D) d
Solution: D
79. In a simple linear regression model (one independent variable), if we change the input
variable by 1 unit, how much will the output variable change?
A. By 1
B. No change
C. By intercept
D. By its Slope
Solution: D
Equation for simple linear regression: Y=a+bx. Now if we increase the value of x by 1 then
the value of y would be a+b(x+1) i.e. value of y will get incremented by b.
80. Logistic regression maps its output to a probability in the range [0, 1]. Which
of the following functions does logistic regression use to convert its output to a
probability in the range [0, 1]?
A. Sigmoid
B. Mode
C. Square
D. Probit
Solution: A
The sigmoid function is used in logistic regression to convert the output to a probability in [0, 1].
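For reference, a short sketch of the sigmoid, 1 / (1 + exp(-z)), written in two branches so that large negative inputs do not overflow. The function name is illustrative:

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

Any real input is mapped into [0, 1], with sigmoid(0) equal to 0.5.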
81. Which of the following statements is true about the partial derivatives of the cost functions
w.r.t. weights / coefficients in linear regression and logistic regression?
C. Can’t say
D. None of these
Solution: B
82. Suppose we are using a logistic regression model for an n-class classification problem.
In this case, we can use the one-vs-rest method. Which of the following options is true
regarding this?
D. None of these.
Solution: A
If there are n classes, then n separate logistic regression models have to be fit, where the
probability of each category is predicted against the rest of the categories combined.
For example, with three classes (-1, 0, 1), the three models are:
-1 vs 0 and 1
0 vs -1 and 1
1 vs 0 and -1
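The relabelling behind one-vs-rest can be sketched in a few lines. Each class gets its own 0/1 target vector, and one binary logistic model would then be trained per vector. The labels and the helper name `one_vs_rest_targets` are made up for illustration:

```python
def one_vs_rest_targets(labels):
    """Map each class to the 0/1 targets used by its 'class vs rest' model."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}

labels = [-1, 0, 1, 1, 0, -1]   # made-up labels for a 3-class problem
for cls, targets in one_vs_rest_targets(labels).items():
    print(f"{cls} vs rest:", targets)
```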
83. Below are two different logistic models with different values for β0 and β1.
Which of the following statement(s) is true about the β0 and β1 values of the two logistic models
(Green, Black)?
Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.
D. Can’t Say.
Solution: B
Name of Faculty: Dr Roshani Raut
Name of Subject: Machine Learning & Applications
Year: BE
Branch: IT
Q.no | Description Question | Choice A | Choice B | Choice C | Choice D | Unit No | Difficulty Level (Easy-1/Medium-2/Hard-3) | Blooms Taxonomy Level
1 | In multiclass classification number of classes must be | Less than two | Equals to two | Greater than two | option 1 and option 2 | 2 | 1 | 1
2 | Application of machine learning methods to large databases is called | Data Mining. | Artificial Intelligence | Big Data Computing | Internet of Things | 1 | 2 | 1
9 | What characterize unlabeled examples in machine learning | There is no prior knowledge | There is no confusing knowledge | There is prior knowledge | There is plenty of confusing knowledge | 1 | 2 | 1
10 | What does dimensionality reduction reduce? | stochastics | collinearity | performance | Entropy | 1 | 1 | 1
11 | Data used to build a data mining model. | Training data | Validation data | test data | hidden data | 1 | 1 | 1
12 | The problem of finding hidden structure in unlabeled data is called… | Supervised learning | Unsupervised learning | Reinforcement learning | None of the above | 1 | 1 | 1
13 | The difference between the actual Y value and the predicted Y value found using a regression equation is called the | slope | residual | outlier | scatter plot | 3 | 1 | 1
14 | Which of the following can only be used when training data are linearly separable? | Linear hard-margin SVM | Linear Logistic Regression | Linear Soft margin SVM | The centroid method | 2 | 1 | 1
15 | Impact of high variance on the training set? | overfitting | underfitting | both underfitting & overfitting | Depends upon the dataset | 2 | 1 | 1
16 | What do you mean by a hard margin? | The SVM allows very low error in classification | The SVM allows high amount of error in classification | Both 1 & 2 | None of the above | 2 | 1 | 1
17 | The effectiveness of an SVM depends upon: | Selection of Kernel | Kernel Parameters | Soft Margin Parameter C | All of the above | 2 | 1 | 1
18 | What is back propagation? | It is another name given to the curvy function in the perceptron | It is the transmission of error back through the network to adjust the inputs | It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn | None of the mentioned | 6 | 2 | 1
19 | What are support vectors? | All the examples that have a non-zero weight αk in a SVM | The only examples necessary to compute f(x) in an SVM. | All of the above | None of the above | 2 | 2 | 1
20 | Neural networks | optimize a convex cost function | always output values between 0 and 1 | can be used for regression as well as classification | All of the above | 3 | 2 | 1
21 | Of the following examples, which would you address using a supervised learning algorithm? | Given email labeled as Spam or not Spam, learn a spam filter | Given a set of news articles found on the web, group them into set of articles about the same story. | Given a database of customer data, automatically discover market segments and group customers into different market segments. | Find the patterns in Market Basket Analysis | 1 | 1 | 2
22 | Dimensionality Reduction Algorithms are one of the possible ways to reduce the computation time required to build a model | TRUE | FALSE | | | 1 | 1 | 2
23 | You are given reviews of few netflix series marked as positive, negative and neutral. Classifying reviews of a new netflix series is an example of | Supervised Learning | Unsupervised Learning | Semisupervised Learning | Reinforcement Learning | 1 | 1 | 2
24 | Which of the following is a good test dataset characteristic? | Large enough to yield meaningful results | Is representative of the dataset as a whole | Both A and B | None of the above | 1 | 2 | 2
25 | Following are the types of supervised learning | Classification | Regression | subgroup discovery | All of the above | 1 | 1 | 2
26 | Type of matrix decomposition model is | Descriptive model | Predictive Model | Logical model | None of the above | 1 | 3 | 2
27 | Following is powerful distance metrics used by Geometric model | Euclidean distance | Manhattan distance | Both A and B | square distance | 1 | 2 | 2
28 | The output of training process in machine learning is | machine learning model | machine learning algorithm | null | accuracy | 1 | 1 | 2
29 | A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from a college. Here feature type is | nominal | ordinal | categorical | boolean | 1 | 3 | 2
30 | PCA is | Forward Feature selection | Backward Feature selection | Feature Extraction | All of the above | 1 | 1 | 2
31 | Dimensionality reduction algorithms are one of the possible ways to reduce the computation time required to build a model. | True | FALSE | | | 1 | 1 | 2
32 | Which of the following techniques would perform better for reducing dimensions of a data set? | Removing columns which have too many missing values | Removing columns which have high variance in data | Removing columns with dissimilar data trends | None of these | 1 | 3 | 2
33 | Supervised learning and unsupervised clustering both require which is correct according to the statement. | output attribute. | hidden attribute. | input attribute. | categorical attribute | 1 | 2 | 2
34 | What characterizes a hyperplane in the geometrical model of machine learning? | A plane with 1 dimensional fewer than number of input attributes | A plane with 2 dimensional fewer than number of input attributes | A plane with 1 dimensional more than number of input attributes | A plane with 2 dimensional more than number of input attributes | 1 | 2 | 2
35 | Like the probabilistic view, the ________ view allows us to associate a probability of membership with each classification. | exemplar | deductive | classical | inductive | 1 | 2 | 2
36 | Database query is used to uncover this type of knowledge. | deep | hidden | shallow | multidimensional | 1 | 2 | 2
37 | A person trained to interact with a human expert in order to capture their knowledge. | knowledge programmer | knowledge developer | knowledge engineer | knowledge extractor | 1 | 2 | 2
38 | Some telecommunication company wants to segment their customers into distinct groups; this is an example of | Supervised learning | Unsupervised learning | Reinforcement learning | Data extraction | 1 | 2 | 2
39 | In the example of predicting number of babies based on stork's population, number of babies is | outcome | feature | observation | attribute | 1 | 3 | 2
40 | Linear Regression is a _______ machine learning algorithm. | Supervised | Unsupervised | Semi-Supervised | Can't say | 3 | 1 | 2
41 | A perceptron adds up all the weighted inputs it receives, and if it exceeds a certain value, it outputs a 1, otherwise it just outputs a 0. | TRUE | False | Sometimes – it can also output intermediate values as well | Can’t say | 2 | 1 | 2
42 | What is the purpose of the Kernel Trick? | To transform the data from nonlinearly separable to linearly separable | To transform the problem from regression to classification | To transform the problem from supervised to unsupervised learning. | All of the above | 2 | 1 | 2
43 | Which of the following can only be used when training data are linearly separable? | Linear hard-margin SVM | Linear Logistic Regression | Linear Soft margin SVM | Parzen windows | 2 | 1 | 2
44 | The firing rate of a neuron | determines how strongly the dendrites of the neuron stimulate axons of neighboring neurons | is more analogous to the output of a unit in a neural net than the output voltage of the neuron | only changes very slowly, taking a period of several seconds to make large adjustments | can sometimes exceed 30,000 action potentials per second | 2 | 1 | 2
45 | Which of the following method(s) do we use to find the best fit line for data in Linear Regression? | Least Square Error | Maximum Likelihood | Logarithmic Loss | Both A and B | 3 | 2 | 2
46 | Which of the following methods do we use to best fit the data in Logistic Regression? | Least Square Error | Maximum Likelihood | Jaccard distance | Both A and B | 3 | 2 | 2
47 | Which of the following evaluation metrics can not be applied in case of logistic regression output to compare with target? | AUC-ROC | Accuracy | Logloss | Mean-Squared-Error | 2 | 2 | 2
48 | Which of the following is an application of NN (Neural Network)? | Sales forecasting | Data validation | Risk management | All of the mentioned | 6 | 2 | 2
49 | Neural Networks are complex ______________ with many parameters. | Linear Functions | Nonlinear Functions | Discrete Functions | Exponential Functions | 6 | 2 | 2
50 | The cost parameter in the SVM means: | The number of cross-validations to be made | The kernel to be used | The tradeoff between misclassification and simplicity of the model | None of the above | 2 | 2 | 2
51 | Lasso can be interpreted as least-squares linear regression where | weights are regularized with the L1 norm | the weights have a Gaussian prior | weights are regularized with the L2 norm | the solution algorithm is simpler | 3 | 2 | 2
52 | The kernel trick | can be applied to every classification algorithm | changes ridge regression so we solve a d × d linear system instead of an n × n system, given n sample points with d features | is commonly used for dimensionality reduction | exploits the fact that in many learning algorithms, the weights can be written as a linear combination of input points | 2 | 2 | 2
53 | How does the bias-variance decomposition of a ridge regression estimator compare with that of ordinary least squares regression? | Ridge has larger bias, larger variance | Ridge has smaller bias, larger variance | Ridge has larger bias, smaller variance | Ridge has smaller bias, smaller variance | 2 | 2 | 2
54 | Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable? | AUC-ROC | Accuracy | Logloss | Mean-Squared-Error | 3 | 3 | 2
55 | What are tree based classifiers? | Classifiers which form a tree with each attribute at one level | Classifiers which perform series of condition checking with one attribute at a time | Both options except none | None of the options | 4 | 1 | 2
56 | What is gini index? | It is a type of index structure | It is a measure of purity | Both options except none | None of the options | 4 | 1 | 2
57 | Which of the following sentences are correct in reference to Information gain? a. It is biased towards single-valued attributes b. It is biased towards multi-valued attributes c. ID3 makes use of information gain d. The approach used by ID3 is greedy | a and b | a and d | b, c and d | All of the above | 4 | 1 | 2
58 | Multivariate split is where the partitioning of tuples is based on a combination of attributes rather than on a single attribute. | TRUE | FALSE | | | 4 | 1 | 2
59 | Gain ratio tends to prefer unbalanced splits in which one partition is much smaller than the other | TRUE | FALSE | | | 4 | 1 | 2
60 | The gini index is not biased towards multivalued attributes. | TRUE | FALSE | | | 4 | 1 | 2
61 | Gini index does not favour equal sized partitions. | TRUE | FALSE | | | 4 | 1 | 2
62 | When the number of classes is large Gini index is not a good choice. | TRUE | FALSE | | | 4 | 1 | 2
63 | Attribute selection measures are also known as splitting rules. | TRUE | FALSE | | | 4 | 1 | 2
64 | This clustering approach initially assumes that each data instance represents a single cluster. | expectation maximization | K-Means clustering | agglomerative clustering | conceptual clustering | 4 | 1 | 2
65 | Which statement is true about the K-Means algorithm? | The output attribute must be categorical | All attribute values must be categorical | All attributes must be numeric | Attribute values may be either categorical or numeric | 4 | 1 | 2
66 | The probability of a hypothesis before the presentation of evidence. | priori | posterior | conditional | subjective | 5 | 1 | 2
67 | KDD represents extraction of | data | knowledge | rules | model | 4 | 1 | 2
68 | The most general form of distance is | Manhattan | Euclidean | Mean | Minkowski | 4 | 1 | 2
69 | With Bayes theorem the probability of hypothesis H, specified by P(H), is referred to as | a conditional probability | an a priori probability | a bidirectional probability | a posterior probability | 5 | 1 | 2
70 | Simple regression assumes a __________ relationship between the input attribute and output attribute. | quadratic | inverse | linear | reciprocal | 3 | 1 | 2
71 | Which of the following algorithm comes under the classification | Apriori | Brute force | DBSCAN | K-nearest neighbor | 4 | 1 | 2
72 | Hierarchical agglomerative clustering is typically visualized as? | Dendrogram | Binary trees | Block diagram | Graph | 4 | 1 | 2
73 | The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support | Partitioning | Candidate generation | Itemset eliminations | Pruning | 4 | 1 | 2
74 | The distance between two points calculated using Pythagoras theorem is | Supremum distance | Euclidean distance | Linear distance | Manhattan Distance | 4 | 1 | 2
75 | Which learning Requires Self Assessment to identify patterns within data? | Unsupervised Learning | Supervised Learning | Semisupervised Learning | Reinforced Learning | 1 | 1 | 3
76 | Select the correct answers for following statements. 1. Filter methods are much faster compared to wrapper methods. 2. Wrapper methods use statistical methods for evaluation of a subset of features while Filter methods use cross validation. | Both are True | 1 is True and 2 is False | Both are False | 1 is False and 2 is True | 1 | 2 | 3
All the problems that arise All the problems that arise
All the problems that arise All the problems that arise when
when working with data in the when working with data in
when working with data in the working with data in the higher
77 The "curse of dimensionality" refers to higher dimensions, that did the lower dimensions, that 1 2 3
lower dimensions, that did not dimensions, that did not exist in
not exist in the lower did not exist in the higher
exist in the lower dimensions. the higher dimensions.
dimensions. dimensions.
78 | In simple term, machine learning is | Training based on historical data | Prediction to answer a query | Both A and B | Automization of complex tasks | 1 | 1 | 3
79 | If machine learning model output does not involve target variable then that model is called as | Descriptive model | Predictive Model | Reinforcement Learning | All of the above | 1 | 1 | 3
80 | Following are the descriptive models | Clustering | Classification | Association rule | Both a and c | 1 | 1 | 3
81 | Different learning methods does not include? | Memorization | Analogy | Deduction | Introduction | 1 | 3 | 3
82 | A measurable property or parameter of the data-set is | training data | feature | test data | validation data | 1 | 2 | 3
83 | Feature can be used as a | Binary split | Predictor | Both A and B | None of the above | 1 | 1 | 3
84 | It is not necessary to have a target variable for applying dimensionality reduction algorithms | True | FALSE | | | 1 | 1 | 3
85 | The most popularly used dimensionality reduction algorithm is Principal Component Analysis (PCA). Which of the following is/are true about PCA? 1. PCA is an unsupervised method 2. It searches for the directions that data have the largest variance 3. Maximum number of principal components <= number of features 4. All principal components are orthogonal to each other | 1&2 | 2&3 | 3&4 | All of the above | 1 | 3 | 3
86 | Which of the following is a reasonable way to select the number of principal components "k"? | Choose k to be the smallest value so that at least 99% of the variance is retained. - answer | Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer). | Choose k to be the largest value so that 99% of the variance is retained. | Use the elbow method | 1 | 3 | 3
115 | Which one of these is not a tree based learner? | CART | ID3 | Bayesian classifier | Random Forest | 4 | 2 | 3
116 | Which one of these is a tree based learner? | Rule based | Bayesian Belief Network | Bayesian classifier | Random Forest | 4 | 2 | 3
117 | What is the approach of basic algorithm for decision tree induction? | Greedy | Top Down | Procedural | Step by Step | 4 | 2 | 3
118 | Which of the following classifications would best suit the student performance classification systems? | If...then... Analysis | Market-basket analysis | Regression analysis | Cluster analysis | 4 | 2 | 3
119 | Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees? | Yes | No | | | 4 | 2 | 3
120 | This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration | K-Means clustering | conceptual clustering | expectation maximization | agglomerative clustering | 4 | 2 | 3
121 | The number of iterations in apriori ___________ | increases with the size of the data | decreases with the increase in size of the data | increases with the size of the maximum frequent set | decreases with increase in size of the maximum frequent set | 4 | 2 | 3
Superset of both closed
Frequent item sets is Superset of only closed frequent Superset of only maximal Subset of maximal frequent
122 frequent item sets and 4 2 3
item sets frequent item sets item sets
maximal frequent item sets
123 | A good clustering method will produce high quality clusters with | high inter class similarity | low intra class similarity | high intra class similarity | no inter class similarity | 4 | 2 | 3
Both techniques build models
Both models require numeric
Which statement is true about neural network and whose output is determined by The output of both models is Both models require input
124 attributes to range between 0 4 2 3
linear regression models? a linear sum of weighted input a categorical attribute value attributes to be numeric
and 1
attribute values
Outliers should be part of the The nature of the problem Outliers should be part of the
Outliers should be identified
125 Which statement about outliers is true? training dataset but should not determines how outliers are test dataset but should not be 2 2 3
and removed from a dataset
be present in the test data used present in the training data
High support and medium High support and low Low support and high
126 Which Association Rule would you prefer Low support and low confidence 4 2 3
confidence confidence confidence
127 | In a Rule based classifier, if there is a rule for each combination of attribute values, what do you call that rule set R | Exhaustive | Inclusive | Comprehensive | Mutually exclusive | 4 | 2 | 3
If a set cannot pass a test, its To decrease the efficiency, To improve the efficiency, do
If a set can pass a test, its
128 The apriori property means supersets will also fail the do level-wise generation of level-wise generation of 4 2 3
supersets will fail the same test
same test frequent item sets frequent item sets d.
129 | If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are | Undefined | Not frequent | Frequent | Can not say | 4 | 2 | 3
130 | Clustering is ___________ and is example of ____________ learning | Predictive and supervised | Predictive and unsupervised | Descriptive and supervised | Descriptive and unsupervised | 4 | 2 | 3
To determine association rules from frequent item Only minimum confidence Neither support not Both minimum support and
131 Minimum support is needed 4 2 3
sets needed confidence needed confidence are needed
132 | If {A,B,C,D} is a frequent itemset, candidate rules which is not possible is | C –> A | D –>ABCD | A –> BC | B –> ADC | 4 | 2 | 3
Low support and high Low support and low High support and medium
133 Which Association Rule would you prefer High support and low confidence 4 2 3
confidence confidence confidence
134 | The probability that a person owns a sports car given that they subscribe to automotive magazine is 40%. We also know that 3% of the adult population subscribes to automotive magazine. The probability of a person owning a sports car given that they don’t subscribe to automotive magazine is 30%. Use this information to compute the probability that a person subscribes to automotive magazine given that they own a sports car | 0.0398 | 0.0389 | 0.0368 | 0.0396 | 5 | 3 | 3
135 | This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration | conceptual clustering | K-Means clustering | expectation maximization | agglomerative clustering | 4 | 2 | 3
136 | Classification rules are extracted from _____________ | decision tree | root node | branches | siblings | 4 | 2 | 3
137 | What does K refers in the K-Means algorithm which is a non-hierarchical clustering approach? | Complexity | Fixed value | No of iterations | number of clusters | 4 | 2 | 3
138 | PCA works better if there is 1. A linear structure in the data 2. If the data lies on a curved surface and not on a flat surface 3. If variables are scaled in the same unit | 1 and 2 | 2 and 3 | 1 and 3 | 1,2 and 3 | 1 | 3 | 4
139 | If TP=9 FP=6 FN=26 TN=70 then Error rate will be | 45 percentage | 99 percentage | 28 percentage | 20 percentage | 2 | 3 | 4
140 | Imagine, you are solving a classification problem with highly imbalanced class. The majority class is observed 99% of times in the training data. Your model has 99% accuracy after taking the predictions on test data. Which of the following is true in such a case? 1. Accuracy metric is not a good idea for imbalanced class problems. 2. Accuracy metric is a good idea for imbalanced class problems. 3. Precision and recall metrics are good for imbalanced class problems. 4. Precision and recall metrics aren’t good for imbalanced class problems. | 1 and 3 | 1 and 4 | 2 and 3 | 2 and 4 | 2 | 3 | 4
141 | The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of datasets are not best suited for SVM’s? | Large datasets | Small datasets | Medium sized datasets | Size does not matter | 2 | 1 | 4
142 | How will you counter over-fitting in decision tree? | By pruning the longer rules | By creating new rules | Both ‘By pruning the longer rules’ and ‘By creating new rules’ | None of the options | 4 | 3 | 4
143 | What are two steps of tree pruning work? | Pessimistic pruning and Optimistic pruning | Postpruning and Prepruning | Cost complexity pruning and time complexity pruning | None of the options | 4 | 3 | 4
144 | Which of the following sentences are true? | In pre-pruning a tree is 'pruned' by halting its construction early | A pruning set of class labelled tuples is used to estimate cost complexity | The best pruned tree is the one that minimizes the number of encoding bits | All of the above | 4 | 3 | 4
Assume that you are given a data set and a neural
Fidelity of the decision tree
network model trained on the data set. You
model, which is the fraction Comprehensibility of the
are asked to build a decision tree model with the F1 measure of the decision
Accuracy of the decision tree of instances on which the decision tree model, measured
145 sole purpose of understanding/interpreting tree model on the given data 4 3 4
model on the given data set neural in terms of the size of the
the built neural network model. In such a scenario, set
network and the decision corresponding rule set
which among the following measures would
tree give the same output
you concentrate most on optimising?
146 | Which of the following properties are characteristic of decision trees? (a) High bias (b) High variance (c) Lack of smoothness of prediction surfaces (d) Unbounded parameter set | a and b | a and d | b, c and d | All of the above | 4 | 3 | 4
147 | To control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true? (a) It would, in general, help restrict the size of the trees (b) It has the potential to affect the performance of the resultant regression/classification model (c) It is computationally infeasible | a and b | a and d | b, c and d | All of the above | 4 | 3 | 4
Identify the model which
Identify the best Identify the model which gives
Identify the best partition of the gives performance close to
approximation of the above the best performance using the
Which among the following statements best input space and response per the best greedy
148 by the greedy approach (to greedy approximation 4 3 4
describes our approach to learning decision trees partition to minimise sum approximation performance
identifying the (option (b)) with the smallest
of squares error (option (b)) with the smallest
partitions) partition scheme
partition scheme
149 | Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are four training data points with the following outputs: 8.7, 9.8, 10.5, 11. What were the original responses for data points along the two branches (left & right respectively) and what is the new response after collapsing the node? | 10.8, 13.33, 14.48 | 10.8, 13.33, 12.06 | 7.2, 10, 8.8 | 7.2, 10, 8.6 | 4 | 3 | 4
150 | Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed? (a) The collapsed node helped overcome the effect of one or more noise affected data points in the training set (b) The validation set had one or more noise affected data points in the region corresponding to the collapsed node (c) The validation set did not have any data points along at least one of the collapsed branches (d) The validation set did have data points adversely affected by the collapsed node | a and b | a and d | b, c and d | All of the above | 4 | 3 | 4
151 | Time Complexity of k-means is given by | O(mn) | O(tkn) | O(kn) | O(t2kn) | 4 | 3 | 4
Neural network learning Neural networks can be used for
Neural networks work well Neural networks can be used
Which one of the following is not a major strength of algorithms are guaranteed to applications that require a time
152 with datasets containing for both supervised learning 6 3 4
the neural network approach? converge to an optimal element to be included in the
noisy data and unsupervised clustering
solution data
153 | In Apriori algorithm, if 1 item-sets are 100, then the number of candidate 2 item-sets are | 100 | 200 | 4950 | 5000 | 4 | 3 | 4
154 | Significant Bottleneck in the Apriori algorithm is | Finding frequent itemsets | Pruning | Candidate generation | Number of iterations | 4 | 3 | 4
typically assume an
Machine learning techniques differ from statistical are better able to deal with have trouble with large-sized are not able to explain their
155 underlying distribution for the 4 3 4
techniques in that machine learning methods missing and noisy data datasets behavior
data
156 | The probability that a person owns a sports car given that they subscribe to automotive magazine is 40%. We also know that 3% of the adult population subscribes to automotive magazine. The probability of a person owning a sports car given that they don’t subscribe to automotive magazine is 30%. Use this information to compute the probability that a person subscribes to automotive magazine given that they own a sports car | 0.0368 | 0.0396 | 0.0389 | 0.0398 | 4 | 3 | 4
157 | What is the final resultant cluster size in Divisive algorithm, which is one of the hierarchical clustering approaches? | Zero | Three | singleton | Two | 4 | 3 | 4
2k – 1 candidate association 2k candidate association 2k – 2 candidate
158 Given a frequent itemset L, If |L| = k, then there are 2k -2 candidate association rules 4 3 5
rules rules association rules
159 | A student Grade is a variable F1 which takes a value from A,B,C and D. Which of the following is True in the following case? | Variable F1 is an example of nominal variable | Variable F1 is an example of ordinal variable | It doesn't belong to any of the mentioned categories | It belongs to both ordinal and nominal category | 1 | 2 | 3
What can be major issue in Faster Runtime Compared to Slower Runtime Compared to
160 Low Variance High Variance 1 2 3
Leave-One-Out-Cross-Validation(LOOCV)? K-Fold Cross Validation normal Validation
161 | Imagine a Newly-Born starts to learn walking. It will try to find a suitable policy to learn walking after repeated falling and getting up. Specify what type of machine learning is best suited? | classification | regression | Kmeans algorithm | Reinforcement Learning | 1 | 2 | 3
Semi-Supervised Learning Supervised learning
162 Perceptron Classifier is Unsupervised learning algorithm Soft margin classifier 2 1 2
Algorithm algorithm
163 | Type of dataset available in Supervised Learning is | Unlabeled dataset | Labeled Dataset | CSV file | Excel file | 2 | 2 | 3
164 | Which among the following is the most appropriate kernel that can be used with SVM to separate the classes. | Linear kernel | Gaussian RBF kernel | Polynomial kernel | Option 1 and option 3 | 2 | 2 | 3
165 | The SVMs are less effective when | The data is linearly separable | The data is clean and ready to use | The data is noisy and contains overlapping points | option 1 and option 2 | 2 | 2 | 3
166 Suppose you are using an RBF kernel in SVM with a high gamma value. What does this signify?
A) The model would consider even far-away points from the hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by the distance of points from the hyperplane for modeling
D) Options A and B
2 2 3
167 What is the precision value for the following confusion matrix of binary classification?
A) 0.91
B) 0.09
C) 0.9
D) 0.95
2 3 4
168 Which of the following are components of generalization error?
A) Bias
B) Variance
C) Both of them
D) None of them
2 1 2
169 Which of the following is not a kernel method in SVM?
A) Linear kernel
B) Polynomial kernel
C) RBF kernel
D) Nonlinear kernel
2 2 3
170 During the treatment of cancer patients, the doctor needs to be very careful about which patients need to be given chemotherapy. Which metric should we use to decide which patients should be given chemotherapy?
A) Precision
B) Recall
C) call
D) score
2 3 4
171 Which one of the following is suitable? 1. When the hypothesis space is richer, overfitting is more likely. 2. When the feature space is larger, overfitting is more likely.
A) True, False
B) False, True
C) True, True
D) False, False
2 2 3
172 Which of the following is categorical data?
A) Branch of bank
B) Expenditure in rupees
C) Price of house
D) Weight of a person
2 2 3
The data is noisy and
The soft margin SVM is more preferred than the The data is not noisy and The data is noisy and linearly
173 The data is linearly separable contains overlapping 2 2 3
hard-margin SVM when- linearly separable separable
points
In SVM which has quadratic kernel function of
We can still classify the data We can not classify the data
polynomial degree 2 that has slack variable C as We can not classify the data at Data can be classified correctly
174 correctly for given setting of correctly for given setting of 2 3 4
one hyperparameter. What would happen if we all without any impact of C
hyper parameter C hyper parameter C
use very large value for C
In SVM, RBF kernel with appropriate parameters to The Decision boundary in the The Decision boundary in The Decision boundary in the
The Decision boundary in the
175 perform binary classification where the data is transformed feature space is the transformed feature original feature space is not 2 2 3
original feature space is linear
non-linearly separable. In this scenario non-linear space is linear considered
176 Which of the following is true about SVM? 1. A kernel function maps low-dimensional data to a high-dimensional space. 2. It is a similarity function.
A) 1 is True, 2 is False
B) 1 is False, 2 is True
C) 1 is True, 2 is True
D) 1 is False, 2 is False
2 1 2
177 What is the accuracy in percentage based on the following confusion matrix of a three-class classification?
Confusion Matrix C =
[14 0 0]
[ 1 15 0]
[ 0 0 6]
A) 75%
B) 97%
C) 95%
D) 85%
2 3 4
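Worked check for question 177 (a minimal sketch, not part of the original question bank): accuracy is the sum of the diagonal of the confusion matrix divided by the total number of samples. The function name `accuracy` is an illustrative choice.

```python
# Confusion matrix from question 177 (rows and columns are the three classes).
confusion = [
    [14, 0, 0],
    [1, 15, 0],
    [0, 0, 6],
]

def accuracy(matrix):
    """Fraction of samples that lie on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# 35 of the 36 samples are on the diagonal.
print(f"{accuracy(confusion):.1%}")
```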
178 Which of the following methods is used for multiclass classification?
A) One vs Rest
B) LOOCV
C) All vs One
D) One vs Another
2 1 2
179 What is the precision value for the following confusion matrix of binary classification?
A) 0.91
B) 0.09
C) 0.9
D) 0.95
2 3 4
180 Which of the following is not a kernel method in SVM?
A) Linear kernel
B) Polynomial kernel
C) RBF kernel
D) Nonlinear kernel
2 1 2
181 Based on a survey, it was found that the probability that a person likes to watch serials is 0.25 and the probability that a person likes to watch Netflix series is 0.43. Also, the probability that a person likes to watch both serials and Netflix series is 0.12. What is the probability that a person doesn't like to watch either?
A) 0.32
B) 0.2
C) 0.44
D) 0.56
2 2 3
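Worked check for question 181 (a minimal sketch, not part of the original question bank): by inclusion-exclusion, P(neither) = 1 - (P(A) + P(B) - P(A and B)). The function name `prob_neither` is an illustrative choice.

```python
def prob_neither(p_a, p_b, p_both):
    """Probability that neither event occurs, via inclusion-exclusion."""
    p_either = p_a + p_b - p_both
    return 1.0 - p_either

# Values from the question: serials 0.25, Netflix series 0.43, both 0.12.
result = prob_neither(0.25, 0.43, 0.12)
print(round(result, 2))
```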
182 A machine learning problem involves four attributes plus a class. The attributes have 3, 2, 2, and 2 possible values each. The class has 3 possible values. How many maximum possible different examples are there?
A) 12
B) 24
C) 48
D) 72
2 3 4
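Worked check for question 182 (a minimal sketch, not part of the original question bank): the number of distinct examples is the product of the number of values of every attribute and of the class, assuming each example is one attribute combination paired with one class label.

```python
from math import prod

attribute_values = [3, 2, 2, 2]  # possible values per attribute, from the question
class_values = 3                 # possible class labels, from the question

# Each distinct example is one attribute combination plus one class label.
max_examples = prod(attribute_values) * class_values
print(max_examples)
```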
183 MLE estimates are often undesirable because
A) they are biased
B) they have high variance
C) they are not consistent estimators
D) None of the above
2 1 2
184 Linear Regression is a _______ machine learning algorithm.
A) Supervised
B) Unsupervised
C) Semi-Supervised
D) Can't say
3 1 2
185 In the regression equation Y = 75.65 + 0.50X, the intercept is
A) 0.5
B) 75.65
C) 1
D) indeterminable
3 1 2
186 The difference between the actual Y value and the predicted Y value found using a regression equation is called the
A) slope
B) residual
C) outlier
D) scatter plot
3 2 3
187 The selling price of a house depends on many factors. For example, it depends on the number of bedrooms, number of kitchens, number of bathrooms, the year the house was built, and the square footage of the lot. Given these factors, predicting the selling price of the house is an example of a ____________ task.
A) Binary Classification
B) Multilabel Classification
C) Simple Linear Regression
D) Multiple Linear Regression
3 3 4
188 Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?
A) You will add more features
B) You will remove some features
C) All of the above
D) None of the above
3 2 3
189 Which of the following methods do we use to find the best-fit line for data in Linear Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
3 2 3
190 We have been given a dataset with n records in which we have input attribute x and output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can't say
3 2 3
191 We have been given a dataset with n records in which we have input attribute x and output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. What do you expect will happen with bias and variance as you increase the size of the training data?
A) Bias increases and variance increases
B) Bias decreases and variance increases
C) Bias decreases and variance decreases
D) Bias increases and variance decreases
3 2 3
If X and Y in a regression model are totally the correlation coefficient would the coefficient of the coefficient of determination
192 the SSE would be 0 3 2 3
unrelated, be -1 determination would be 0 would be 1
193 Regarding bias and variance, which of the following statements are true? (Here 'high' and 'low' are relative to the ideal model.)
(i) Models which overfit are more likely to have high bias
(ii) Models which overfit are more likely to have low bias
(iii) Models which overfit are more likely to have high variance
(iv) Models which overfit are more likely to have low variance
A) (i) and (ii)
B) (ii) and (iii)
C) (iii) and (iv)
D) None of these
3 2 3
194 Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
3 3 4
195 Suppose that we have N independent variables (X1, X2, ..., Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best-fit line using least square error on this data. You found that the correlation coefficient of one of its variables (say X1) with Y is 0.95.
A) Relation between X1 and Y is weak
B) Relation between X1 and Y is strong
C) Relation between X1 and Y is neutral
D) Correlation can't judge the relationship
3 3 4
196 In terms of bias and variance, which of the following is true when you fit a degree 2 polynomial?
A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
3 3 4
197 Which of the following statements are true for a design matrix X ∈ R^(n×d) with d > n? (The rows are n sample points and the columns represent d features.)
A) Least-squares linear regression computes the weights w = (X^T X)^(-1) X^T y
B) The sample points are linearly separable
C) X has exactly d − n eigenvectors with eigenvalue zero
D) At least one principal component direction is orthogonal to a hyperplane that contains all the sample points
3 3 4
198 Suppose your model is demonstrating high variance across the different training sets. Which of the following is NOT a valid way to try to reduce the variance?
A) Increase the amount of training data in each training set
B) Improve the optimization algorithm being used for error minimization
C) Decrease the model complexity
D) Reduce the noise in the training data
3 3 3
199 Point out the wrong statement.
A) Regression through the origin yields an equivalent slope if you center the data first
B) Normalizing variables results in the slope being the correlation
C) Least squares is not an estimation tool
D) None of the mentioned
3 3 4
200 Which of the following are components of generalization error?
A) Bias
B) Variance
C) Both of them
D) None of them
3 1 2
201 A problem in multiple regression is?
A) multicollinearity
B) overfitting
C) both multicollinearity & overfitting
D) underfitting
3 1 2
202 How can we best represent 'support' for the following association rule: "If X and Y, then Z"?
A) {X,Y}/(Total number of transactions)
B) {Z}/(Total number of transactions)
C) {Z}/{X,Y}
D) {X,Y,Z}/(Total number of transactions)
3 2 3
203 Choose the correct statement with respect to the 'confidence' metric in association rules
A) It is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
B) A high value of confidence suggests a weak association rule
C) It is the probability that a randomly selected transaction will include all the items in the consequent as well as all the items in the antecedent.
D) Confidence is not measured in terms of (estimated) conditional probability.
3 2 3
204 Which statement is not true?
A) k-means clustering is a linear clustering algorithm.
B) k-means clustering aims to partition n observations into k clusters
C) k-nearest neighbor is the same as k-means
D) k-means is sensitive to outliers
4 1 2
205 In which of the following cases will K-Means clustering give poor results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
A) 1 and 2
B) 2 and 3
C) 2 and 4
D) 1, 2 and 4
4 1 2
206 What is a Decision Tree?
A) Flow-Chart
B) Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
C) Flow-Chart like Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
D) None of the above
4 1 2
207 8 observations are clustered into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2, C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0), (2,5)}
C3: {(5,5), (9,9)}
What will be the cluster centroids if you want to proceed to the second iteration?
A) C1: (4,4), C2: (2,2), C3: (7,7)
B) C1: (6,6), C2: (4,4), C3: (9,9)
C) C1: (2,2), C2: (0,0), C3: (5,5)
D) C1: (4,4), C2: (3,3), C3: (7,7)
4 2 3
208 Choose the correct statement with respect to the 'confidence' metric in association rules
A) It is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
B) A high value of confidence suggests a weak association rule
C) It is the probability that a randomly selected transaction will include all the items in the consequent as well as all the items in the antecedent.
D) Confidence is not measured in terms of (estimated) conditional probability.
4 2 3
209 What are the two steps of tree pruning?
A) Pessimistic pruning and Optimistic pruning
B) Postpruning and Prepruning
C) Cost complexity pruning and time complexity pruning
D) None of the options
4 2 3
210 A database has 5 transactions. Of these, 4 transactions include milk and bread. Further, of the given 4 transactions, 2 transactions include cheese. Find the support percentage for the following association rule: "if milk and bread are purchased, then cheese is also purchased".
A) 0.4
B) 0.6
C) 0.8
D) 0.42
4 2 3
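Worked check for question 210 (a minimal sketch, not part of the original question bank): support is the fraction of all transactions that contain every item in the rule. The transaction contents below are hypothetical, chosen only to match the counts in the question; the fifth basket ("eggs") is an assumption.

```python
# Hypothetical baskets matching the question: 5 transactions, 4 with milk and
# bread, and 2 of those 4 also with cheese.
transactions = [
    {"milk", "bread", "cheese"},
    {"milk", "bread", "cheese"},
    {"milk", "bread"},
    {"milk", "bread"},
    {"eggs"},  # assumed content of the one basket without milk and bread
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

rule_support = support({"milk", "bread", "cheese"}, transactions)
rule_confidence = confidence({"milk", "bread"}, {"cheese"}, transactions)
print(rule_support, rule_confidence)
```

Support uses all 5 transactions as the denominator, while confidence uses only the 4 that contain the antecedent.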
211 Which of the following options is true about the k-NN algorithm?
A) It can be used for classification
B) It can be used for regression
C) It can be used in both classification and regression
D) Not useful in ML algorithm
4 1 2
How to select best hyperparameters in tree based Measure performance over Measure performance over Random selection of hyper
212 Both of these 4 1 2
models? training data validation data parameters
213 What is true about K-Means clustering?
1. K-means is extremely sensitive to cluster center initializations
2. Bad initialization can lead to poor convergence speed
3. Bad initialization can lead to bad overall clustering
A) 1 and 3
B) 1 and 2
C) 2 and 3
D) 1, 2 and 3
4 1 2
Classifiers which perform
Classifiers which form a tree
214 What are tree based classifiers? series of condition checking Both options except none Not possible 4 1 2
with each attribute at one level
with one attribute at a time
Gini index operates on the Gini index performs only binary
215 What is gini index? It is a measure of purity All (1,2 and 3) 4 1 2
categorical target variables split
216 Tree/Rule based classification algorithms generate ... rule to perform the classification.
A) if-then.
B) while.
C) do while
D) switch.
4 1 2
217 Decision Tree is
A) Flow-Chart
B) Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
C) Both a & b
D) Class of instance
4 1 2
Which of the following is true about Manhattan It can be used for continuous It can be used for categorical It can be used for categorical
218 It can be used for constants 4 2 3
distance? variables variables as well as continuous
219 A company has built a kNN classifier that gets 100% accuracy on training data. When they deployed this model on the client side, it was found that the model is not at all accurate. Which of the following might have gone wrong?
Note: The model was successfully deployed and no technical issues were found at the client side except the model performance.
A) It is probably an overfitted model
B) It is probably an underfitted model
C) Can't say
D) Wrong client data
4 2 3
220 Which of the following classifications would best suit the student performance classification systems?
A) If...then... analysis
B) Market-basket analysis
C) Regression analysis
D) Cluster analysis
4 3 4
221 Which statement is true about the K-Means algorithm? Select one:
A) The output attribute must be categorical.
B) All attribute values must be categorical.
C) All attributes must be numeric
D) Attribute values may be either categorical or numeric
4 2 3
222 Which of the following can act as possible termination conditions in K-Means?
1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between iterations. Except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.
A) 1, 3 and 4
B) 1, 2 and 3
C) 1, 2 and 4
D) 1, 2, 3, 4
4 3 4
223 Which of the following statements is true about the k-NN algorithm?
1) k-NN performs much better if all of the data have the same scale
2) k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large
3) k-NN makes no assumptions about the functional form of the problem being solved
A) 1 and 2
B) 1 and 3
C) Only 1
D) 1, 2 and 3
4 3 4
224 In which of the following cases will K-means clustering fail to give good results?
1) Data points with outliers
2) Data points with different densities
3) Data points with nonconvex shapes
A) 1 and 2
B) 2 and 3
C) 1, 2, and 3
D) 1 and 3
4 3 4
225 How will you counter over-fitting in a decision tree?
A) By pruning the longer rules
B) By creating new rules
C) Both 'By pruning the longer rules' and 'By creating new rules'
D) Over-fitting is not possible
4 3 4
226 This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration. Select one:
A) K-Means clustering
B) conceptual clustering
C) expectation maximization
D) agglomerative clustering
4 3 4
227 Which one of the following is the main reason for pruning a Decision Tree?
A) To save computing time during testing
B) To save space for storing the Decision Tree
C) To make the training set error smaller
D) To avoid overfitting the training set
4 3 4
228 You've just finished training a decision tree for spam classification, and it is getting abnormally bad performance on both your training and test sets. You know that your implementation has no bugs, so what could be causing the problem?
A) Your decision trees are too shallow.
B) You need to increase the learning rate.
C) You are overfitting.
D) Incorrect data
4 3 4
229 The K-means algorithm:
A) Requires the dimension of the feature space to be no bigger than the number of samples
B) Has the smallest value of the objective function when K = 1
C) Minimizes the within class variance for a given number of clusters
D) Converges to the global optimum if and only if the initial means are chosen as some of the samples themselves
4 3 4
230 Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering?
1. Single-link
2. Complete-link
3. Average-link
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1, 2 and 3
4 3 4
231 In which of the following cases will K-Means clustering fail to give good results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
A) 1 and 2
B) 2 and 3
C) 2 and 4
D) 1, 2 and 4
4 2 3
232 Hierarchical clustering is slower than non-hierarchical clustering?
A) TRUE
B) FALSE
C) Depends on data
D) Cannot say
4 2 3
233 High entropy means that the partitions in classification are
A) pure
B) not pure
C) useful
D) useless
4 2 3
234 Suppose we would like to perform clustering on spatial data such as the geometrical locations of houses. We wish to produce clusters of many different sizes and shapes. Which of the following methods is the most appropriate?
A) Decision Trees
B) Density-based clustering
C) Model-based clustering
D) K-means clustering
4 3 4
235 The main disadvantage of maximum likelihood methods is that they are _____
A) mathematically less folded
B) mathematically less complex
C) mathematically less complex
D) computationally intense
4 1 2
236 The maximum likelihood method can be used to explore relationships among more diverse sequences, conditions that are not well handled by maximum parsimony methods.
A) TRUE
B) FALSE
C) -
D) -
4 2 3
237 Which statement is not true?
A) k-means clustering is a linear clustering algorithm.
B) k-means clustering aims to partition n observations into k clusters
C) k-nearest neighbor is the same as k-means
D) k-means is sensitive to outliers
4 1 2
238 Why is feature scaling done before applying the K-Means algorithm?
A) In distance calculation it will give the same weights for all features
B) You always get the same clusters whether or not you use feature scaling
C) In Manhattan distance it is an important step but in Euclidean it is not
D) None of these
4 1 2
239 In which of the following cases will K-Means clustering give poor results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
A) 1 and 2
B) 2 and 3
C) 2 and 4
D) 1, 2 and 4
4 1 2
240 What is the naïve assumption in a Naïve Bayes Classifier?
A) All the classes are independent of each other
B) All the features of a class are independent of each other
C) The most probable feature for a class is the most important feature to be considered for classification
D) All the features of a class are conditionally dependent on each other
5 1 2
241 Based on a survey, it was found that the probability that a person likes to watch serials is 0.25 and the probability that a person likes to watch Netflix series is 0.43. Also, the probability that a person likes to watch both serials and Netflix series is 0.12. What is the probability that a person doesn't like to watch either?
A) 0.32
B) 0.2
C) 0.44
D) 0.56
5 2 3
242 What is the actual number of independent parameters which need to be estimated in a P-dimensional Gaussian distribution model?
A) P
B) 2P
C) P(P+1)/2
D) P(P+3)/2
5 1 2
243 Give the correct answer for the following statements.
1. It is important to perform feature normalization before using the Gaussian kernel.
2. The maximum value of the Gaussian kernel is 1.
A) 1 is True, 2 is False
B) 1 is False, 2 is True
C) 1 is True, 2 is True
D) 1 is False, 2 is False
5 3 4
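Worked count for question 242 (a sketch, assuming a full, unconstrained covariance matrix): the mean vector contributes P parameters, and the symmetric P×P covariance matrix contributes P(P+1)/2 independent entries.

```latex
\underbrace{P}_{\text{mean } \mu}
\;+\;
\underbrace{\frac{P(P+1)}{2}}_{\text{symmetric covariance } \Sigma}
\;=\;
\frac{2P + P^2 + P}{2}
\;=\;
\frac{P(P+3)}{2}
```

With a diagonal covariance (an assumption not stated in the question) the count would drop to 2P.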
244 What is the naïve assumption in a Naïve Bayes Classifier?
A) All the classes are independent of each other
B) All the features of a class are independent of each other
C) The most probable feature for a class is the most important feature to be considered for classification
D) All the features of a class are conditionally dependent on each other
5 1 2
245 What is the actual number of independent parameters which need to be estimated in a P-dimensional Gaussian distribution model?
A) P
B) 2P
C) P(P+1)/2
D) P(P+3)/2
5 1 2
246 Which of the following quantities are minimized directly or indirectly during parameter estimation in a Gaussian distribution model?
A) Negative log-likelihood
B) Log-likelihood
C) Cross entropy
D) Residual sum of squares
5 2 3
247 In the Naive Bayes equation P(C / X) = (P(X / C) * P(C)) / P(X), which part is the "likelihood"?
A) P(X/C)
B) P(C/X)
C) P(C)
D) P(X)
5 1 2
248 Consider the following dataset. x, y, z are the features and T is a class (1/0). Classify the test data (0,0,1) as values of x, y, z respectively.
A) 0
B) 1
C) 0.1
D) 0.9
5 3 4
249 Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that. Select one:
A) Y is false when X is known to be false.
B) Y is true when X is known to be true.
C) X is true when Y is known to be true
D) X is false when Y is known to be false.
5 3 4
250 Which of the following statements about Naive Bayes is incorrect?
A) Attributes are equally important.
B) Attributes are statistically dependent of one another given the class value.
C) Attributes are statistically independent of one another given the class value.
D) Attributes can be nominal or numeric
5 2 3
251 How can the entries in the full joint probability distribution be calculated?
A) Using variables
B) Using information
C) Both using variables & information
D) None of the mentioned
5 2 3
How many terms are required for building a bayes
252 1 2 3 4 5 2 3
model?
253 Skewness of Normal distribution is ___________ Negative Positive 0 Undefined 5 1 2
254 The shape of the Normal Curve is ___________ Bell Shaped flat circular spiked 5 1 2
255 The correlation coefficient for two real-valued attributes is -0.85. What does this value tell you?
A) The attributes are not linearly related.
B) As the value of one attribute increases the value of the second attribute also increases
C) As the value of one attribute decreases the value of the second attribute increases
D) The attributes show a linear relationship
5 1 2
256 8 observations are clustered into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2, C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0), (2,5)}
C3: {(5,5), (9,9)}
What will be the cluster centroids if you want to proceed to the second iteration?
A) C1: (4,4), C2: (2,2), C3: (7,7)
B) C1: (6,6), C2: (4,4), C3: (9,9)
C) C1: (2,2), C2: (0,0), C3: (5,5)
D) C1: (4,4), C2: (3,3), C3: (7,7)
5 3 4
257 Which of the following quantities are minimized directly or indirectly during parameter estimation in a Gaussian distribution model?
A) Negative log-likelihood
B) Log-likelihood
C) Cross entropy
D) Residual sum of squares
5 2 3
258 In the Naive Bayes equation P(C / X) = (P(X / C) * P(C)) / P(X), which part is the "likelihood"?
A) P(X/C)
B) P(C/X)
C) P(C)
D) P(X)
5 1 2
259 Consider the following dataset. x, y, z are the features and T is a class (1/0). Classify the test data (0,0,1) as values of x, y, z respectively.
A) 0
B) 1
C) 0.1
D) 0.9
5 3 4
260 Which of the following options is / are correct regarding benefits of an ensemble model?
1. Better performance
2. Generalized models
3. Better interpretability
A) 1 and 3
B) 2 and 3
C) 1, 2 and 3
D) 1 and 2
5 1 2
261 The network that involves backward links from output to the input and hidden layers is called
A) Self organizing maps
B) Perceptrons
C) Recurrent neural network
D) Multi layered perceptron
6 1 2
262 Which of the following parameters can be tuned for finding a good ensemble model in bagging based algorithms?
1. Max number of samples
2. Max features
3. Bootstrapping of samples
4. Bootstrapping of features
A) 1
B) 2
C) 3&4
D) 1,2,3&4
6 1 2
263 What is back propagation?
a) It is another name given to the curvy function in the perceptron
b) It is the transmission of error back through the network to adjust the inputs
c) It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn
d) None of the mentioned
A) a
B) b
C) c
D) b&c
6 1 2
264 In an election for the head of college, N candidates are competing against each other and people are voting for one of the candidates. Voters don't communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the discussed election procedure?
A) Bagging
B) Boosting
C) Stacking
D) Randomization
6 2 3
265 What is the sequence of the following tasks in a perceptron?
1. Initialize weights of perceptron randomly
2. Go to the next batch of dataset
3. If the prediction does not match the output, change the weights
4. For a sample input, compute an output
A) 1, 4, 3, 2
B) 3, 1, 2, 4
C) 4, 3, 2, 1
D) 1, 2, 3, 4
6 2 3
266 In which neural net architecture does weight sharing occur?
A) Convolutional Neural Network
B) Recurrent Neural Network
C) Fully Connected Neural Network
D) Both A and B
6 2 3
267 Which of the following are correct statement(s) about stacking?
1. A machine learning model is trained on predictions of multiple machine learning models
2. A logistic regression will definitely work better in the second stage as compared to other classification methods
3. First stage models are trained on full / partial feature space of training data
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
6 2 3
268 Given above is a description of a neural network. When does a neural network model become a deep learning model?
A) When you add more hidden layers and increase depth of neural network
B) When there is higher dimensionality of data
C) When the problem is an image recognition problem
D) When there is lower dimensionality of data
6 2 3
269 What are the steps for using a gradient descent algorithm?
1) Calculate error between the actual value and the predicted value
2) Reiterate until you find the best weights of network
3) Pass an input through the network and get values from output layer
4) Initialize random weight and bias
5) Go to each neuron which contributes to the error and change its respective values to reduce the error
A) 1, 2, 3, 4, 5
B) 4, 3, 1, 5, 2
C) 3, 2, 1, 5, 4
D) 5, 4, 3, 2, 1
6 3 4
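The five steps listed in question 269 can be sketched as a small training loop (a minimal sketch under stated assumptions, not part of the original question bank): a one-weight linear model stands in for the network, mean squared error is the assumed loss, and `fit_line`, `lr`, `epochs` and the sample data are illustrative names and values. The comments tag each line with the step number it corresponds to.

```python
import random

def fit_line(xs, ys, lr=0.05, epochs=2000, seed=0):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)    # step 4: random weight and bias
    n = len(xs)
    for _ in range(epochs):                          # step 2: repeat until good enough
        preds = [w * x + b for x in xs]              # step 3: forward pass
        errors = [p - y for p, y in zip(preds, ys)]  # step 1: prediction error
        # step 5: move each parameter against its share of the error
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1.
w, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(round(w, 3), round(b, 3))
```

A fixed epoch count replaces a convergence test here only to keep the sketch short.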
270 A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with the constant of proportionality being equal to 2. The inputs are 4, 10, 10 and 30 respectively. What will be the output?
A) 238
B) 76
C) 248
D) 348
6 3 4
271 Which of the following options is / are correct regarding benefits of an ensemble model?
1. Better performance
2. Generalized models
3. Better interpretability
A) 1 and 3
B) 2 and 3
C) 1, 2 and 3
D) 1 and 2
6 1 2
272 The network that involves backward links from output to the input and hidden layers is called
A) Self organizing maps
B) Perceptrons
C) Recurrent neural network
D) Multi layered perceptron
6 1 2
273 Which of the following parameters can be tuned for finding a good ensemble model in bagging based algorithms?
1. Max number of samples
2. Max features
3. Bootstrapping of samples
4. Bootstrapping of features
A) 1
B) 2
C) 3&4
D) 1,2,3&4
6 1 2
274 Increase in size of a convolutional kernel would necessarily increase the performance of a convolutional network.
A) TRUE
B) FALSE
6 1 2
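Worked check for question 270 (a minimal sketch, not part of the original question bank): with a linear transfer function, the output is the constant of proportionality times the weighted sum of the inputs. The function name `linear_neuron` is an illustrative choice.

```python
def linear_neuron(weights, inputs, k):
    """Output of a neuron whose transfer function is f(s) = k * s."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return k * weighted_sum

# Values from the question: weighted sum is 4 + 20 + 30 + 120 = 174.
output = linear_neuron([1, 2, 3, 4], [4, 10, 10, 30], k=2)
print(output)
```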
considers the reduction in considers the reduction in
error when moving from the error when moving from the can only be conceptualized as a
275 The F-test an omnibus test 6 1 2
complete model to the reduced model to the reduction in error
reduced model complete model
276 What is back propagation?
a) It is another name given to the curvy function in the perceptron
b) It is the transmission of error back through the network to adjust the inputs
c) It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn
d) None of the mentioned
A) a
B) b
C) c
D) b&c
6 1 2
277 In an election for the head of college, N candidates are competing against each other and people are voting for one of the candidates. Voters don't communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the discussed election procedure?
A) Bagging
B) Boosting
C) Stacking
D) Randomization
6 1 2
278 Which of the following is NOT supervised learning?
A) PCA
B) Decision tree
C) Linear Regression
D) Naive Bayesian
2,3,4,5 1 2
279 Which of the following algorithms is not an example of an ensemble method?
A) Extra Tree Regressor
B) Random Forest
C) Gradient Boosting
D) Decision Tree
6 2 3
280 What is true about an ensembled classifier?
1. Classifiers that are more "sure" can vote with more conviction
2. Classifiers can be more "sure" about a particular part of the space
3. Most of the time, it performs better than a single classifier
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) All of the above
6 2 3
281 Which of the following options is / are correct regarding benefits of an ensemble model?
1. Better performance
2. Generalized models
3. Better interpretability
A) 1 and 3
B) 2 and 3
C) 1 and 2
D) 1, 2 and 3
6 1 2
282 Which of the following can be true for selecting base learners for an ensemble?
1. Different learners can come from the same algorithm with different hyperparameters
2. Different learners can come from different algorithms
3. Different learners can come from different training spaces
A) 1
B) 2
C) 1 and 3
D) 1, 2 and 3
6 2 3
283 True or False: Ensemble learning can only be applied to supervised learning methods.
A) TRUE
B) FALSE
6 1 2
284 True or False: Ensembles will yield bad results when there is significant diversity among the models.
Note: All individual models have meaningful and good predictions.
A) TRUE
B) FALSE
6 1 2
285 Which of the following is / are true about weak learners used in an ensemble model?
1. They have low variance and they don't usually overfit
2. They have high bias, so they cannot solve hard learning problems
3. They have high variance and they don't usually overfit
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) None of these
6 3 4
286 True or False: An ensemble of classifiers may or may not be more accurate than any of its individual models.
A) TRUE
B) FALSE
6 1 2
287 If you use an ensemble of different base models, is it necessary to tune the hyperparameters of all base models to improve the ensemble performance?
A) Yes
B) No
C) Can't say
6 1 2
288 Generally, an ensemble method works better if the individual base models have ____________?
Note: Suppose each individual base model has accuracy greater than 50%.
A) Less correlation among predictions
B) High correlation among predictions
C) Correlation does not have any impact on ensemble output
D) None of the above
6 3 4
289 In an election, N candidates are competing against each other and people are voting for one of the candidates. Voters don't communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the above-discussed election procedure?
A) Bagging
B) Boosting
C) A or B
D) None of these
6 3 4
((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and artificial intelligence (AI)
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
THIS IS MANDATORY OPTION
((OPTION_E))
This is optional. If optional keep empty so that system will skip this option
((CORRECT_CHOICE)) B
Either A or B or C or D or E
((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Linear in D
((OPTION_B)) Exponential in D
((OPTION_C)) Linear in N
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) -1.66
((OPTION_B)) 2
((OPTION_C)) 3
((OPTION_D)) 4
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Let us say that we have computed the gradient of our cost function and stored it in a vector g. What is the cost of one gradient descent update given the gradient?
((OPTION_A)) O(D)
((OPTION_B)) O(N)
((OPTION_C)) O(ND)
((OPTION_D)) O(ND^2)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
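The O(D) answer above can be seen from a minimal sketch of the update step. The function name, the learning rate and the sample numbers are hypothetical; the point is only that, once g is available, the update visits each of the D parameters exactly once and never looks at the N training examples.

```python
def gradient_descent_step(w, g, learning_rate=0.1):
    """Return w - learning_rate * g, touching each of the D components once."""
    if len(w) != len(g):
        raise ValueError("w and g must have the same length D")
    return [w_j - learning_rate * g_j for w_j, g_j in zip(w, g)]


w = [1.0, -2.0, 0.5]  # hypothetical parameter vector, D = 3
g = [0.5, -1.0, 0.0]  # hypothetical gradient, assumed already computed
print(gradient_descent_step(w, g))
```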
((QUESTION)) You observe the following while fitting a linear regression to the data: As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low (almost what you expect it to be), while the test error is much higher than the train error. What do you think is the main reason behind this behavior? Choose the most probable option.
((OPTION_A)) High variance
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Adding more basis functions in a linear model... (pick the most probable option)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_C)) Serration
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) You are given data about seismic activity in Japan, and you want to predict the magnitude of the next earthquake. This is an example of
((OPTION_C)) Serration
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Classification
((OPTION_B)) Regression
((OPTION_C)) Clustering
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Outcome
((OPTION_B)) Feature
((OPTION_C)) Attribute
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) It may be better to avoid the ROC curve as a metric, as it can suffer from the accuracy paradox.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The expected value or _______ of a random variable is the center of its distribution.
((OPTION_A)) Mode
((OPTION_B)) median
((OPTION_C)) mean
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
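As a small illustration of the expected value as the mean of a distribution, the sketch below computes it for a discrete random variable. The function name and the fair-die example are assumptions chosen for illustration only.

```python
def expected_value(values, probabilities):
    """Expected value (mean) of a discrete random variable."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Weight each outcome by its probability and add everything up.
    return sum(v * p for v, p in zip(values, probabilities))


# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
print(expected_value(faces, [1 / 6] * 6))
```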
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) variance
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The square root of the variance is called the ________ deviation
((OPTION_A)) empirical
((OPTION_B)) mean
((OPTION_C)) continuous
((OPTION_D)) standard
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
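The relation between variance and standard deviation can be checked with Python's standard `statistics` module. This is a minimal sketch on a made-up sample, treated as a complete population so that the population formulas apply.

```python
import math
import statistics

# Hypothetical sample, treated here as the whole population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance
std_dev = math.sqrt(variance)          # square root of the variance

# statistics.pstdev computes the population standard deviation directly.
print(variance, std_dev, statistics.pstdev(data))
```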
((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
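The statement above is false because the relation runs the other way. For a continuous random variable X with density f_X (notation assumed here, not taken from the question), the CDF is the integral of the PDF, and the PDF is the derivative of the CDF wherever that derivative exists:

```latex
F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt,
\qquad
f_X(x) = \frac{d}{dx} F_X(x)
```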
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Consider the results of a medical experiment that aims to predict whether someone is going to develop myopia based on some physical measurements and heredity. In this case, the input dataset consists of the person’s medical characteristics and the target variable is binary: 1 for those who are likely to develop myopia and 0 for those who aren’t. This can be best classified as
((OPTION_A)) Regression
((OPTION_C)) Clustering
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The purpose of a machine learning model is to approximate an unknown function that associates input elements to output ones
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The training set is normally a representation of a global distribution
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 2
((QUESTION)) The model has an excessive capacity and is no longer able to generalize, considering the original dynamics provided by the training set. This problem is called
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) It can associate almost perfectly all the known samples to the corresponding output values, but when an unknown input is presented, the corresponding prediction error can be very high. This problem is called
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
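The behaviour described in the question above (near-perfect fit on known samples, large error on unknown inputs) can be reproduced with a model that simply memorizes its training data. The sketch below uses a 1-nearest-neighbour predictor as a stand-in for such a model. The data, the noise values and all function names are hypothetical.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Predict by copying the label of the closest training input (1-NN)."""
    i = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[i]


def mean_squared_error(expected, predicted):
    """Average squared difference between expected and predicted outputs."""
    return sum((e - p) ** 2 for e, p in zip(expected, predicted)) / len(expected)


# Hypothetical data: the underlying relation is y = x, but the training
# labels carry noise that the memorizing model reproduces exactly.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.0, 1.5, 1.6, 3.4]
test_x = [0.4, 1.4, 2.6]
test_y = [0.4, 1.4, 2.6]

train_pred = [nearest_neighbor_predict(train_x, train_y, x) for x in train_x]
test_pred = [nearest_neighbor_predict(train_x, train_y, x) for x in test_x]

train_mse = mean_squared_error(train_y, train_pred)
test_mse = mean_squared_error(test_y, test_pred)
print("train MSE:", train_mse, "test MSE:", test_mse)
```

The training error is zero by construction, which is why this kind of failure can initially be mistaken for a perfect fit; only the held-out inputs reveal it.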
((QUESTION)) ---------- may prove to be more difficult to discover as it could be initially considered the result of a perfect fitting
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) When working with a supervised scenario, we define a non-negative error measure e_m which takes two arguments and allows us to compute a total error value over the whole dataset. Those two arguments are
((OPTION_A)) expected and predicted output
((OPTION_B)) calculated and predicted output
((OPTION_C)) calculated and measured output
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
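A minimal sketch of such an error measure follows. The squared error is used only as one possible non-negative choice (an assumption, since the question does not name a specific measure), and the function names are hypothetical.

```python
def squared_error(expected, predicted):
    """Non-negative error for a single sample."""
    return (expected - predicted) ** 2


def total_error(expected_outputs, predicted_outputs, error_measure=squared_error):
    """Sum the per-sample error over the whole dataset."""
    if len(expected_outputs) != len(predicted_outputs):
        raise ValueError("both sequences must have the same length")
    return sum(
        error_measure(e, p) for e, p in zip(expected_outputs, predicted_outputs)
    )


# Hypothetical expected (ground-truth) and predicted outputs.
expected_outputs = [1.0, 2.0, 3.0]
predicted_outputs = [1.0, 2.5, 2.0]
print(total_error(expected_outputs, predicted_outputs))
```

Any other non-negative function of the same two arguments, such as the absolute error, could be passed as `error_measure`.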
((QUESTION)) The initial value represents a starting point over the surface of an n-variable function. A generic training algorithm has to find the global minimum or a point quite close to it (there's always a tolerance to avoid an excessive number of iterations and a consequent risk of overfitting). This measure is also called
((OPTION_A)) loss function
((OPTION_B)) predicted output
((OPTION_C)) measured output
((OPTION_D)) mean square error
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same output element
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) An exponential time could lead to computational explosions when the datasets are too large or the optimization starting point is very far from an acceptable minimum. Moreover, it's important to remember the so-called …….
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The first term is called
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The second term is called
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The third term is called
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
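The formula that the three questions above refer to is not reproduced in this copy, so the order of its terms cannot be confirmed here. For reference only, Bayes' theorem in its usual form, with symbols A and B assumed for illustration, relates the three named quantities as follows: P(A|B) is the posterior (a posteriori) probability, P(A) is the prior (a priori) probability, and P(B|A) is the likelihood.

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
\quad\Longrightarrow\quad
P(A \mid B) \propto P(B \mid A)\,P(A)
```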
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have the following data with one real-valued input
variable and one real-valued output variable. What is the leave-one-out
cross-validation mean squared error for linear regression (Y = bX + c)?
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
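The data table for the leave-one-out question above did not survive in this copy. As a minimal sketch of the procedure only, the code below assumes three illustrative points, (0, 2), (2, 2) and (3, 1); these points and the helper names `fit_line` and `loocv_mse` are assumptions, not part of the original question. Each point is held out in turn, a line is fitted to the remaining points, and the squared prediction errors are averaged.

```python
from fractions import Fraction


def fit_line(points):
    """Ordinary least-squares fit of Y = b*X + c; returns (b, c)."""
    n = len(points)
    mean_x = sum(Fraction(x) for x, _ in points) / n
    mean_y = sum(Fraction(y) for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return b, mean_y - b * mean_x


def loocv_mse(points):
    """Mean squared error averaged over all leave-one-out splits."""
    total = Fraction(0)
    for i, (x, y) in enumerate(points):
        # Fit on every point except the i-th, then predict the i-th.
        b, c = fit_line(points[:i] + points[i + 1:])
        total += (y - (b * x + c)) ** 2
    return total / len(points)


# Hypothetical data, NOT taken from the original question.
assumed_points = [(0, 2), (2, 2), (3, 1)]
print(loocv_mse(assumed_points))
```

Under these assumed points the three held-out squared errors are 4, 4/9 and 1, so the mean is 49/27. Exact fractions are used so the result can be compared directly with the listed options.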
((MARKS)) 1
((QUESTION)) Which of the following is/are true about the “Maximum Likelihood
estimate (MLE)”?
1. MLE may not always exist
2. MLE always exists
3. If MLE exists, it (they) may not be unique
4. If MLE exists, it (they) must be unique
((OPTION_A)) 1 and 4
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose a “Linear regression” model perfectly fits the training data
(train error is zero). Which of the following statements is true?
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation.
((OPTION_A)) 1, 2 & 3
((OPTION_B)) 1 & 3
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) Bar chart
((OPTION_C)) Histograms
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 & 2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following offsets do we use in a least-squares line fit? Suppose the horizontal axis is the
independent variable and the vertical axis is the dependent variable.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we have generated the data with the help of polynomial regression of degree 3 (degree 3 will
perfectly fit this data). Now consider the points below and choose the option based on these points.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance
((OPTION_A)) Only 1
((OPTION_B)) 1 & 3
((OPTION_C)) 1 & 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are training a linear regression model. Now consider
these points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) are correct?
((OPTION_A)) Both are False
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we fit “Lasso Regression” to a data set which has 100 features (X1, X2, …, X100). Now, we rescale
one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with
the same regularization parameter.
Which of the following options will be correct?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following metrics can be used for evaluating regression
models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE
((OPTION_A)) 2 and 4
((OPTION_B)) 1 and 2
((OPTION_C)) 2, 3 and 4
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
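To make the metrics listed in the question above concrete, here is a minimal sketch that computes R squared, adjusted R squared, MSE, RMSE and MAE from a list of true values and predictions. The function name `regression_metrics`, the `n_features` parameter and the sample numbers are assumptions for illustration; the F statistic is left out to keep the sketch short.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Return common regression metrics for one set of predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Sum of squared errors and total sum of squares.
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    mse = sse / n
    r2 = 1 - sse / sst
    # Adjusted R squared penalises the number of predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "adj_r2": adj_r2, "mse": mse,
            "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical values, chosen only to exercise the function.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.5], n_features=1)
for name, value in metrics.items():
    print(name, round(value, 5))
```

Adjusted R squared needs more observations than predictors plus one, otherwise the denominator is zero or negative.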
((MARKS)) 1
((QUESTION)) We can also compute the coefficients of linear regression with the help
of an analytical method called “Normal Equation”. Which of the
following is/are true about “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate
((OPTION_A)) 1 and 2
((OPTION_B)) 1 & 3
((OPTION_C)) 2 & 3
((OPTION_D)) 1, 2 & 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
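The three properties in the question above can be seen in a small sketch of the Normal Equation for a single feature plus intercept, where (XᵀX)θ = Xᵀy reduces to a 2×2 linear system. The function name `normal_equation_fit` and the sample data are assumptions for illustration. There is no learning rate and no loop over iterations; with many features the same idea needs a much larger matrix solve, which is where the cost grows.

```python
def normal_equation_fit(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = theta0 + theta1 * x."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular: all x values are identical")
    theta0 = (sxx * sy - sx * sxy) / det
    theta1 = (n * sxy - sx * sy) / det
    return theta0, theta1


# Hypothetical data lying exactly on y = 1 + 2x.
print(normal_equation_fit([0, 1, 2, 3], [1, 3, 5, 7]))
```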
((MARKS)) 1
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the regression line is
defined as:
Y = β0 + β1 X1 + β2 X2 + … + βn Xn
Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y
changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a
positive or negative number).
2. The value of βi is always the same, regardless of values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1
((OPTION_B)) 2
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
((QUESTION)) The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I want to find
the sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) If two variables are correlated, is it necessary that they have a linear
relationship?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y.
Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider the remaining parameters to be the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.
((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the
graph show the residuals for each predicted value. Use this information to
compute the SSE.
((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
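The graph with the residual values is missing from this copy, so the sketch below only illustrates the calculation: SSE is the sum of the squared residuals. The five residuals used here are assumed values, not read from the original figure, and the function name `sum_squared_errors` is likewise illustrative.

```python
def sum_squared_errors(residuals):
    """SSE: square each residual and add the squares up."""
    return sum(r ** 2 for r in residuals)


# Hypothetical residuals, NOT taken from the missing graph.
assumed_residuals = [-0.2, 0.4, -0.8, 1.3, -0.7]
print(round(sum_squared_errors(assumed_residuals), 2))
```

With these assumed residuals the squares are 0.04, 0.16, 0.64, 1.69 and 0.49, which add up to 3.02. Signs do not matter because every residual is squared.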
((MARKS)) 1
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following methods do we use to best fit the data in
Logistic Regression?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) One of the very good methods to analyze the performance of Logistic
Regression is AIC, which is similar to R-Squared in Linear
Regression. Which of the following is true about AIC?
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) LASSO
((OPTION_B)) Ridge
((OPTION_C)) Both
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have been given a fair coin and you want to find out the
odds of getting heads. Which of the following options is true for such a
case?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
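The option texts for the coin question above are missing here, but the quantity it asks about is easy to compute: odds are the probability of success divided by the probability of failure. The sketch below uses an illustrative helper named `odds` (an assumed name) and evaluates it for a fair coin, where the probability of heads is 0.5.

```python
def odds(p):
    """Odds in favour of an event with probability p (0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# A fair coin: heads and tails are equally likely.
print(odds(0.5))
```

For a fair coin the two probabilities are equal, so the odds of heads come out as 1.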
((MARKS)) 1
((QUESTION)) The logit function (given as l(x)) is the log-of-odds function. What
could be the range of the logit function in the domain x = [0,1]?
((OPTION_A)) (-∞, ∞)
((OPTION_B)) (0, 1)
((OPTION_C)) (0, ∞)
((OPTION_D)) (-∞, 0)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
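A quick numerical check can make the range of the logit function in the question above more tangible. The sketch below, using an illustrative helper named `logit`, evaluates log(x / (1 - x)) at points approaching 0 and 1; the specific sample points are arbitrary choices.

```python
import math


def logit(x):
    """Log of odds, defined for 0 < x < 1."""
    if not 0 < x < 1:
        raise ValueError("logit is only finite for 0 < x < 1")
    return math.log(x / (1 - x))


# Values move toward minus infinity near 0 and plus infinity near 1.
for x in (1e-9, 0.1, 0.5, 0.9, 1 - 1e-9):
    print(x, logit(x))
```

The value at 0.5 is exactly 0, and the magnitude grows without bound as x approaches either end of the interval, which is consistent with an unbounded range.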
((MARKS)) 1
((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case
of Logistic Regression this is not required
((OPTION_B)) Linear Regression error values have to be normally distributed, but in the case
of Logistic Regression this is not required
((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be
normally distributed
((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to
be normally distributed
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
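The answer options for the question above are not preserved, so the following is only a sketch of the standard relationship between the functions it names: the logistic function 1 / (1 + e^(-x)) undoes the logit function, which is why it is also called the inverse logit. The function names below are illustrative.

```python
import math


def logit(p):
    """Log of odds for a probability 0 < p < 1."""
    return math.log(p / (1 - p))


def logistic(x):
    """Logistic (sigmoid) function, mapping any real x into (0, 1)."""
    return 1 / (1 + math.exp(-x))


# Round trip: applying logistic after logit recovers the probability.
for p in (0.1, 0.5, 0.9):
    print(p, logistic(logit(p)))
```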
((MARKS)) 2
((QUESTION)) Suppose you applied a Logistic Regression model on given data and
got a training accuracy X and testing accuracy Y. Now, you want to
add a few new features to the same data. Select the option(s) which
is/are correct in such a case.
Note: Consider the remaining parameters to be the same.
((OPTION_E))
((CORRECT_CHOICE)) A&D
((EXPLANATION))
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face with such data is that Logistic Regression takes a very long time to train. What would you do if you wanted to train Logistic Regression on the same data in less time while getting comparatively similar accuracy (it may not be the same)?
((OPTION_A)) Decrease the learning rate and decrease the number of iterations
((OPTION_B)) Decrease the learning rate and increase the number of iterations
((OPTION_C)) Increase the learning rate and increase the number of iterations
((OPTION_D)) Increase the learning rate and decrease the number of iterations
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 2
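Explanation sketch (illustrative, not part of the original question bank): the trade-off between learning rate and iteration count can be seen with plain gradient descent on a tiny, made-up one-feature dataset. The data, the learning rates, and the iteration counts below are all assumptions chosen for the demonstration; on real data a learning rate that is too large can make training diverge instead of speeding it up.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w: float, b: float, xs, ys) -> float:
    """Average binary cross-entropy of the model p = sigmoid(w * x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def train(xs, ys, learning_rate: float, iterations: int):
    """Batch gradient descent starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        residuals = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical, deliberately non-separable toy data.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

slow = train(xs, ys, learning_rate=0.05, iterations=400)
fast = train(xs, ys, learning_rate=0.5, iterations=40)  # 10x fewer passes over the data

print("loss, small step / many iterations:", log_loss(*slow, xs, ys))
print("loss, large step / few iterations: ", log_loss(*fast, xs, ys))
```

Each iteration is one full pass over the data, so cutting the iteration count is what saves time; the larger step size is what lets the shorter run still make comparable progress.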
((QUESTION)) Following is the loss function in logistic regression (Y-axis: loss function; X-axis: log probability) for a two-class classification problem. Which of the following images shows the cost function for y = 1?
Note: Y is the target class.
((OPTION_A)) A
((OPTION_B)) B
((OPTION_C)) BOTH
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
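Explanation sketch (illustrative, not part of the original question bank): the referenced images are not available here, so the code below only tabulates the standard per-example logistic loss, assuming it is written as a function of the predicted probability p of the positive class. For y = 1 the cost is -ln(p), which shrinks toward 0 as p approaches 1; for y = 0 it is -ln(1 - p), the mirror image.

```python
import math


def logistic_cost(p: float, y: int) -> float:
    """Per-example logistic (cross-entropy) loss for predicted probability p."""
    if y == 1:
        return -math.log(p)
    return -math.log(1.0 - p)


# Arbitrary probabilities chosen to show the shape of both curves.
for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"p={p:.2f}  cost(y=1)={logistic_cost(p, 1):.3f}  cost(y=0)={logistic_cost(p, 0):.3f}")
```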
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The odds ratio is
((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.
((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
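Explanation sketch (illustrative, not part of the original question bank): for a logistic regression with a single predictor, the odds ratio for a one-unit change in the predictor equals e raised to that predictor's coefficient. The coefficients b0 and b1 below are hypothetical values picked only to demonstrate the identity.

```python
import math


def predicted_probability(b0: float, b1: float, x: float) -> float:
    """Logistic regression prediction with intercept b0 and slope b1."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def odds(p: float) -> float:
    """Odds of an event: probability it happens over probability it does not."""
    return p / (1.0 - p)


b0, b1 = -1.0, 0.7  # hypothetical coefficients, not taken from any fitted model

odds_before = odds(predicted_probability(b0, b1, 2.0))
odds_after = odds(predicted_probability(b0, b1, 3.0))  # predictor increased by one unit
odds_ratio = odds_after / odds_before

print("odds ratio:", odds_ratio)
print("exp(b1):   ", math.exp(b1))
```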
((QUESTION)) Large values of the log-likelihood statistic indicate:
((OPTION_A)) That there are a greater number of explained vs. unexplained observations.
((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Logistic regression assumes a:
((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.
((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In binary logistic regression:
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
((QUESTION))
((OPTION_A))
((OPTION_B))
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
This sheet is for 3 Mark questions
Sr. No | Question | Image | a | b | c | d | Correct Answer
e.g. 1 | Write down question | img.jpg | Option a | Option b | Option c | Option d | a/b/c/d
1 | Which of the following is a characteristic of the best machine learning method? | | fast | accuracy | scalable | All above | D
2 | What are the different algorithm techniques in Machine Learning? | | Supervised Learning and Semi-supervised Learning | Unsupervised Learning and Transduction | Both A & B | None of the Mentioned | C
3 | ______ can be adopted when it's necessary to categorize a large amount of data with a few complete examples or when there's the need to | | Supervised | Semi-supervised | Reinforcement | Clusters | B
4 | In reinforcement learning, this feedback is usually called ___. | | Overfitting | Overlearning | Reward | None of above | C
5 | In the last decade, many researchers started training bigger and bigger models, built with several different layers; that's why this approach is called _____. | | Deep learning | Machine learning | Reinforcement learning | Unsupervised learning | A
6 | What does learning exactly mean? | | Robots are programmed so that they can | A set of data is used to discover the | Learning is the ability to change | It is a set of data is used to discover the | C
7 | It is necessary to allow the model to develop a generalization ability and avoid a common problem called ______. | | Overfitting | Overlearning | Classification | Regression | A
8 | Techniques that involve the use of both labeled and unlabeled data are called ___. | | Supervised | Semi-supervised | Unsupervised | None of the above | B
9 | There's a growing interest in pattern recognition and associative memories whose structure and functioning are similar to what happens in the neocortex. Such an | | Regression | Accuracy | Modelfree | Scalable | C
10 | ______ showed better performance than other approaches, even without a context-based model | | Machine learning | Deep learning | Reinforcement learning | Supervised learning | B
14 | What is the function of 'Supervised Learning'? | -- | Classifications, Predict time series, Annotate strings | Speech recognition, Regression | Both A & B | None of above | C
15 | Common unsupervised applications include | -- | Object segmentation | Similarity detection | Automatic labeling | All above | D
16 | Reinforcement learning is particularly efficient when ______________. | -- | the environment is not completely deterministic | it's often very dynamic | it's impossible to have a precise error measure | All above | D
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
(1) Regression
(2) Classification
(3) Clustering
(4) Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
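Explanation sketch (illustrative, not part of the original answer key): in y = 2 + 3x1 + 4x2 the coefficient of x1 is the change in y for a one-unit increase in x1 while x2 is held constant. The starting values of x1 and x2 below are arbitrary assumptions; any other pair gives the same difference.

```python
def predict(x1: float, x2: float) -> float:
    """The multiple regression model from the question: y = 2 + 3*x1 + 4*x2."""
    return 2.0 + 3.0 * x1 + 4.0 * x2


x1, x2 = 1.0, 5.0  # arbitrary starting point
change = predict(x1 + 1.0, x2) - predict(x1, x2)  # x2 held constant
print("change in y:", change)
```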
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
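The arithmetic behind question 20 follows from the standard definition R^2 = 1 - SSE/SST. A small sketch, where the function name is an illustrative assumption:

```python
def coefficient_of_determination(sst, sse):
    """Multiple coefficient of determination: R^2 = 1 - SSE / SST."""
    return 1 - sse / sst

r_squared = coefficient_of_determination(sst=200, sse=50)
print(r_squared)  # 0.75
```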
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
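Stratification can be sketched in a few lines of plain Python. This is one possible implementation under simple assumptions (a list of class labels, a fixed test fraction, a seeded shuffle); the function name stratified_split and the sample labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices) keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

labels = ["yes"] * 8 + ["no"] * 4
train_idx, test_idx = stratified_split(labels)
print([labels[i] for i in test_idx])  # ['yes', 'yes', 'no']
```

Both the training and the test set keep the 2:1 ratio of "yes" to "no" that the full data set has.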
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
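The definition in question 30 translates directly into code. A minimal sketch using the standard library; the sample values are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Square root of (sample variance / number of sample instances)."""
    return math.sqrt(statistics.variance(sample) / len(sample))

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(standard_error(sample))  # about 0.756
```

Here statistics.variance computes the sample variance (dividing by n - 1), which is the quantity option (a) refers to.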
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
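The error measures named in questions 28 and 35 can be compared side by side. This is a small sketch with made-up actual and predicted values.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))

actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 4.0]
print(mean_absolute_error(actual, predicted))      # 1.5
print(mean_squared_error(actual, predicted))       # 3.5
print(root_mean_squared_error(actual, predicted))  # about 1.87
```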
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
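The stopping rule described in question 45 can be seen in a one-dimensional K-Means sketch. The function name, the sample points, and the starting means below are illustrative assumptions.

```python
def k_means_1d(points, initial_means, max_iterations=100):
    """Return (means, iterations) for 1-D K-Means with the given starting means."""
    means = list(initial_means)
    for iteration in range(1, max_iterations + 1):
        # Assignment step: attach each point to its nearest mean.
        clusters = [[] for _ in means]
        for point in points:
            nearest = min(range(len(means)), key=lambda i: abs(point - means[i]))
            clusters[nearest].append(point)
        # Update step: recompute each mean (keep the old one if a cluster is empty).
        new_means = [
            sum(cluster) / len(cluster) if cluster else means[i]
            for i, cluster in enumerate(clusters)
        ]
        # Termination: the means are identical to those of the previous iteration.
        if new_means == means:
            return means, iteration
        means = new_means
    return means, max_iterations

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
means, iterations = k_means_1d(points, initial_means=[1.0, 2.0])
print(means, iterations)  # [2.0, 11.0] 3
```

The third pass reproduces the means of the second pass, so the loop stops there.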
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis that fits the training data exactly, i.e.,
overfitting.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove it.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
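Dimensionality reduction algorithms such as PCA are unsupervised and need no target variable; LDA is an
example of a supervised dimensionality reduction algorithm, which does use one.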
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions then you will lose some information of data
most of the times and you won’t be able to interpret the lower dimension data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from
a college.
1) Which of the following statement is true in following case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables that have some order among their categories. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more information with which to fit the training data. Testing accuracy increases only if the added
features are found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and it outputs a new example x with
a prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared error
metric to evaluate the model performance in such a case. The remaining options are used for
classification problems.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True, In case of lasso regression we apply absolute penalty which makes some of the coefficients
zero.
19. Suppose that we have N independent variables (X1, X2, …, XN) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best-fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. You are given two variables V1 and V2 with the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider both statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
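The y = x^2 example from question 21 can be verified directly. This is a minimal sketch; the pearson helper and the symmetric sample values are illustrative assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
squares = [x ** 2 for x in xs]
line = [2 * x + 1 for x in xs]
print(pearson(xs, squares))  # 0.0: y is fully determined by x, yet uncorrelated
print(pearson(xs, line))     # 1.0: an exact increasing linear relation
```

The zero arises because the x values are symmetric around zero, so positive and negative contributions to the covariance cancel.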
22. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
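For a single input feature, the Normal Equation reduces to a 2x2 linear system that can be solved in closed form, with no learning rate and no iteration. The sketch below assumes the model y = a0 + a1*x; the function name and the sample data are illustrative.

```python
def fit_line_normal_equation(xs, ys):
    """Solve (X^T X) beta = X^T y for the model y = a0 + a1 * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sum_x], [sum_x, sum_xx]] and X^T y = [sum_y, sum_xy].
    determinant = n * sum_xx - sum_x * sum_x
    if determinant == 0:
        raise ValueError("X^T X is singular: all x values are identical")
    # Cramer's rule for the 2x2 system.
    a0 = (sum_y * sum_xx - sum_x * sum_xy) / determinant
    a1 = (n * sum_xy - sum_x * sum_y) / determinant
    return a0, a1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
print(fit_line_normal_equation(xs, ys))  # (1.0, 2.0)
```

With many features the same system requires solving with a matrix whose size grows with the number of features, which is why the method becomes slow in that case.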
25. What will happen when you apply a very large penalty in the case of Ridge?
A) Some of the coefficients will become absolute zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient value become zero, but in case of Ridge, the coefficients become
close to zero but not zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
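The contrast between the two penalties in questions 25 and 26 can be illustrated for the simplest case of one feature and no intercept, where both solutions have a closed form. This is a sketch under those assumptions; the function names and data are hypothetical.

```python
def ridge_coefficient(xs, ys, penalty):
    """Minimise 0.5 * sum((y - b*x)^2) + 0.5 * penalty * b^2 for one feature."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)

def lasso_coefficient(xs, ys, penalty):
    """Minimise 0.5 * sum((y - b*x)^2) + penalty * |b| via soft-thresholding."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    shrunk = max(abs(sum_xy) - penalty, 0.0)
    return (shrunk if sum_xy >= 0 else -shrunk) / sum_xx

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
for penalty in (0.0, 10.0, 1000.0):
    print(penalty, ridge_coefficient(xs, ys, penalty), lasso_coefficient(xs, ys, penalty))
```

At a penalty of 1000 the ridge coefficient is small but still positive, while the lasso coefficient is exactly zero.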
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using the maximum likelihood estimate.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since logistic regression is a classification algorithm, its output is not a real-valued quantity, so
mean squared error cannot be used to evaluate it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
In logistic regression, we select as the best model the one with the least AIC.
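Model selection by AIC can be sketched with the standard formula AIC = 2k - 2 ln(L), where k is the number of parameters and L the maximised likelihood. The two candidate models and their log-likelihoods below are made-up numbers for illustration only.

```python
def aic(num_parameters, log_likelihood):
    """Akaike information criterion: AIC = 2k - 2 * ln(L)."""
    return 2 * num_parameters - 2 * log_likelihood

# Hypothetical fitted models: (number of parameters, maximised log-likelihood).
candidates = {"small_model": (3, -120.0), "large_model": (8, -118.5)}
scores = {name: aic(k, ll) for name, (k, ll) in candidates.items()}
best = min(scores, key=scores.get)
print(scores)  # {'small_model': 246.0, 'large_model': 253.0}
print(best)    # small_model
```

The larger model fits slightly better, but the penalty for its extra parameters outweighs the gain, so the smaller model has the lower AIC.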
Solution: A
In case of lasso we apply an absolute penalty; after increasing the penalty in lasso, some of the
coefficients of variables may become zero.
Context: 48-49
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation the P (y =1|x; w) , viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49. In the above question, which function would make p lie between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.
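The odds calculation for the fair coin can be written out as a short sketch; the function name and the second example probability are illustrative.

```python
def odds(probability):
    """Odds = P(success) / P(failure)."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in [0, 1)")
    return probability / (1 - probability)

print(odds(0.5))  # 1.0 for a fair coin
print(odds(0.8))  # about 4: success is four times as likely as failure
```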
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
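The relation in question 53, together with the range discussed in question 51, can be checked numerically. A minimal sketch, with function names chosen for illustration:

```python
import math

def logit(p):
    """Log of the odds; defined for 0 < p < 1."""
    return math.log(p / (1 - p))

def logistic(z):
    """Inverse of logit: maps a real number z into (0, 1)."""
    return 1 / (1 + math.exp(-z))

for p in (0.001, 0.2, 0.5, 0.9):
    z = logit(p)
    # The last column recovers p (up to rounding), so logistic undoes logit.
    print(p, z, logistic(z))
```

Probabilities near 0 give large negative logits and probabilities near 1 give large positive ones, which is why the logit ranges over all real numbers.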
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the training data. Testing accuracy increases only if the added features are found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fitted, where the probability of each
category is predicted against the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the following above figure shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
Since the decision boundary in figure C (the third plot) is not smooth, it is overfitting the data.
1. The training error in the first plot is the maximum compared to the second and third plots.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over independent variable X. A higher degree(Right
graph) polynomial might have a very high accuracy on the train population but is expected to fail badly
on test dataset. But if you see in left graph we will have training error maximum because it underfits the
training data
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
Since more regularization means a larger penalty, it yields a less complex decision boundary, as shown in
the first figure (A).
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that logistic regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data in less time while
getting comparatively similar accuracy (it may not be the same)?
Solution: D
Decreasing the number of iterations while training will surely take less time, but it will not give the
same accuracy; to get similar (though not identical) accuracy, you also need to increase the learning rate.
62. Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for
a two-class classification problem.
Which of the following images shows the cost function for y = 1?
Solution: A
A is the true answer as loss function decreases as the log probability increases
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high level of misclassification penalty, a soft margin cannot exist, as there will be no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVM’s?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
Datasets which have a clear classification boundary will function best with SVM’s.
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
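The effect of gamma can be seen from the RBF kernel itself, K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates it for one nearby and one distant pair of points; the points and gamma values are arbitrary illustrations.

```python
import math

def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2) for equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

near = ([0.0, 0.0], [0.5, 0.0])  # points 0.5 apart
far = ([0.0, 0.0], [3.0, 0.0])   # points 3.0 apart
for gamma in (0.01, 1.0, 10.0):
    print(gamma, rbf_kernel(*near, gamma), rbf_kernel(*far, gamma))
```

With a high gamma the kernel value for the distant pair is practically zero, so only nearby points influence one another; with a low gamma even distant points keep a similarity close to 1.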
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error prone, which means that
you should not trust any specific data point too much. Now suppose that you want to build an SVM model
which has a quadratic kernel function of polynomial degree 2 and uses the slack variable C as one of its
hyperparameters. Based upon that, answer the following question.
What would happen when you use a very large value of C (C->infinity)?
Note: For small C, the model was also classifying all data points correctly
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM hyperparameters so that the effect
would be the same as in the previous questions, i.e., the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here, as it weakens the regularization and lets the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with 4 class classification problem and you want to train a SVM model on the
data for that you are using One-vs-all method. Now answer the below questions?
27. How many times we need to train our SVM model in such case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
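The count in question 27 follows from how one-vs-all recasts the problem: each class gets its own binary target, and one classifier is trained per target. A small sketch with hypothetical labels (no actual SVM is trained here):

```python
def one_vs_all_targets(labels):
    """Return {class: binary target list}, one binary problem per class."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}

labels = ["a", "b", "c", "d", "a", "c"]
targets = one_vs_all_targets(labels)
print(len(targets))  # 4 binary problems, so 4 classifiers to train
print(targets["a"])  # [1, 0, 0, 0, 1, 0]
```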
28. Suppose you have same distribution of classes in the data. Now, say for training 1 time in one vs all
setting the SVM is taking 10 second. How many seconds would it require to train one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
29. Suppose your problem has changed and the data now has only 2 classes. How many times do you think
we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
With only two classes, training the SVM once is enough to give appropriate results.
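The counting in questions 27 to 29 can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and it assumes every binary classifier takes the same time to train.

```python
def one_vs_all_model_count(n_classes: int) -> int:
    """Number of binary classifiers a one-vs-all scheme needs to train."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    # With two classes, a single binary classifier already separates them.
    return 1 if n_classes == 2 else n_classes


def one_vs_all_training_time(n_classes: int, seconds_per_model: float) -> float:
    """Total training time, assuming each binary model takes the same time."""
    return one_vs_all_model_count(n_classes) * seconds_per_model


models_for_four_classes = one_vs_all_model_count(4)        # 4 classifiers
seconds_for_four_classes = one_vs_all_training_time(4, 10)  # 4 * 10 seconds
models_for_two_classes = one_vs_all_model_count(2)         # 1 classifier
```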
Suppose you are using an SVM with a polynomial kernel of degree 2. You have applied it to the data and
found that it fits the data perfectly, that is, training and testing accuracy are both 100%.
30. Now suppose you increase the complexity (the degree of the polynomial of this kernel). What do you
think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question, after increasing the complexity you found that training accuracy was still
100%. What do you think is the reason for that?
1. Since the data is fixed and we are fitting more polynomial terms or parameters, the algorithm starts
memorizing everything in the data
2. Since the data is fixed, the SVM doesn’t need to search in a big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered ways of improving the base
learners’ results.
9. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees as
is practical when building the model. Random Forest is a black-box model, so you lose
interpretability when using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.
13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A single decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees as
is practical when building the model. Random Forest is a black-box model, so you lose
interpretability when using it.
16. True-False: Bagging is suitable for high-variance, low-bias models?
A) TRUE
B) FALSE
Solution: A
Bagging is suitable for high-variance, low-bias models, or in other words for complex models.
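The reason bagging helps high-variance models is that averaging many noisy predictions lowers the variance of the result. The following is a minimal simulation of that effect, not a real bagging implementation: each "model" is just the true value plus independent Gaussian noise (an assumption; real bagged trees are correlated, so the reduction is smaller in practice).

```python
import random
import statistics


def noisy_prediction(rng, true_value=5.0, noise=2.0):
    """Stand-in for one low-bias, high-variance model."""
    return true_value + rng.gauss(0.0, noise)


def averaged_prediction(rng, n_models):
    """Average the predictions of n_models independent noisy models."""
    return statistics.mean([noisy_prediction(rng) for _ in range(n_models)])


rng = random.Random(0)  # fixed seed so the run is repeatable
single = [noisy_prediction(rng) for _ in range(2000)]
averaged = [averaged_prediction(rng, 25) for _ in range(2000)]

var_single = statistics.pvariance(single)      # close to noise**2
var_averaged = statistics.pvariance(averaged)  # roughly noise**2 / 25
```

The mean of both lists stays near the true value, so the bias is unchanged while the variance drops.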
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenarios is gain ratio preferred over information gain?
Solution: A
For high-cardinality attributes, gain ratio is preferred over the information gain technique.
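A small worked example may make the preference concrete. The sketch below uses a made-up eight-row dataset (the `row_id` and `weather` attributes are hypothetical): information gain favors the unique row identifier, while gain ratio, which divides by the entropy of the split itself, favors the attribute with few values.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalized by the split information."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info else 0.0


labels = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
row_id = list(range(8))  # unique per row: high cardinality
weather = ["sun", "sun", "sun", "rain", "rain", "rain", "sun", "sun"]

ig_row_id, ig_weather = information_gain(row_id, labels), information_gain(weather, labels)
gr_row_id, gr_weather = gain_ratio(row_id, labels), gain_ratio(weather, labels)
```

Splitting on `row_id` gives perfectly pure but useless single-row partitions, which is exactly the case gain ratio is meant to penalize.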
20. Suppose you are given the following scenarios for training and validation error for Gradient
Boosting. Which of the following hyperparameters would you choose in such a case?
Scenario  Depth  Training error  Validation error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select scenario 2 because the lower depth
is the better hyperparameter.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
Regression
Classification
Clustering
Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
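The multiple coefficient of determination follows directly from the two sums of squares as R^2 = 1 - SSE/SST. A minimal sketch of that arithmetic (the function name is illustrative only):

```python
def coefficient_of_determination(sst: float, sse: float) -> float:
    """R^2 = 1 - SSE/SST, the share of total variation explained by the model."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    return 1.0 - sse / sst


# Values from the question: SST = 200, SSE = 50.
r_squared = coefficient_of_determination(sst=200, sse=50)  # 1 - 50/200 = 0.75
```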
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
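As a quick check of the definition in question 30, the standard error can be computed as the square root of the sample variance divided by the number of instances. The sample values below are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Square root of (sample variance / number of sample instances)."""
    return math.sqrt(statistics.variance(sample) / len(sample))


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
se = standard_error(sample)
```

`statistics.variance` uses the n - 1 denominator, which is the sample variance the question refers to.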
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
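The error measures named in questions 28 and 35 differ only in how each residual is transformed before averaging. A short sketch on made-up values:

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [3.0, 5.0, 2.0, 7.0]     # hypothetical desired outputs
predicted = [2.0, 5.0, 4.0, 8.0]  # hypothetical model outputs

mae = mean_absolute_error(actual, predicted)       # (1 + 0 + 2 + 1) / 4
mse = mean_squared_error(actual, predicted)        # (1 + 0 + 4 + 1) / 4
rmse = root_mean_squared_error(actual, predicted)
```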
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
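The stopping rule described in question 45 can be seen in a one-dimensional sketch of K-Means. This is a simplified illustration (scalar points, hand-picked starting centroids), not a production implementation.

```python
def k_means_1d(points, centroids, max_iter=100):
    """K-Means on scalars; stops when the means no longer change."""
    for iteration in range(1, max_iter + 1):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]  # keep the centroid of an empty cluster
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            # Means are identical to the previous iteration: terminate.
            return centroids, iteration
        centroids = new_centroids
    return centroids, max_iter


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, iterations = k_means_1d(points, centroids=[1.0, 2.0])
```

Note that the number of clusters (here two starting centroids) has to be supplied up front.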
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis that fits the training data exactly, i.e.
overfitting.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove that column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
A target variable is not required in general: PCA, for example, is unsupervised. LDA, by contrast, is an
example of a supervised dimensionality reduction algorithm.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions then you will lose some information of data
most of the times and you won’t be able to interpret the lower dimension data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take the values A, B, C, D, E and F, and represents the grade of students from
a college.
1) Which of the following statements is true in this case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables that have some order in their categories. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the logistic regression. Testing accuracy increases only if the added
features are found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the sum of squared errors of the model to identify the line
of best fit.
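For a single input variable, the least-squares line has a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the means. A minimal sketch on made-up points that lie exactly on y = 1 + 2x:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """The quantity that least squares minimizes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # hypothetical data on the line y = 1 + 2x
intercept, slope = fit_line(xs, ys)
sse = sum_squared_errors(xs, ys, intercept, slope)
```

Because the points are perfectly linear, the fitted line recovers them and the training error is zero.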
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
A very large penalty means the model is less complex; therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
the prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the sum of squared errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous values as output, we use the mean squared
error metric to evaluate the model performance. The remaining options are used in the case of a
classification problem.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute penalty, which makes some of the coefficients
zero.
19. Suppose that we have N independent variables (X1, X2, ..., Xn) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best-fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. You are given two variables V1 and V2 with the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at these two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider the both of these two statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
have no linear relationship. We can take examples like y=|x| or y=x^2.
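The y = x^2 and y = |x| examples can be verified numerically. The sketch below computes Pearson correlation by hand on a small symmetric set of x values (chosen for illustration); y is fully determined by x in every case, yet only the linear relationship gives a correlation near 1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]  # symmetric around zero
r_square = pearson(xs, [x * x for x in xs])      # y = x^2
r_abs = pearson(xs, [abs(x) for x in xs])        # y = |x|
r_line = pearson(xs, [2 * x + 1 for x in xs])    # y = 2x + 1
```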
22. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
25. What will happen when you apply very large penalty?
A) Some of the coefficient will become absolute zero
B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients become
close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
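The difference between the two penalties is easiest to see with a single feature, where both problems have a closed-form solution. The sketch below is written under stated assumptions: one feature, no intercept, and the objectives given in the docstrings (other texts scale the penalty differently).

```python
import math


def ridge_coefficient(xs, ys, penalty):
    """Minimizes 0.5 * sum((y - w*x)^2) + 0.5 * penalty * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)  # shrinks toward zero, never reaches it


def lasso_coefficient(xs, ys, penalty):
    """Minimizes 0.5 * sum((y - w*x)^2) + penalty * |w|."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - penalty, 0.0)  # soft-thresholding
    return math.copysign(shrunk, sxy) / sxx


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # hypothetical data; the unpenalized coefficient is 2

ridge_small, ridge_large = ridge_coefficient(xs, ys, 1.0), ridge_coefficient(xs, ys, 100.0)
lasso_small, lasso_large = lasso_coefficient(xs, ys, 1.0), lasso_coefficient(xs, ys, 100.0)
```

With the large penalty the ridge coefficient is small but still positive, while the lasso coefficient is exactly zero, which is why lasso can be used for variable selection.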
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement the logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output is not a continuous real value, so
mean squared error cannot be used for evaluating it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select as the best logistic regression model the one with the lowest AIC.
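AIC is defined as 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood, so a lower value is better. A small sketch of the comparison, using made-up log-likelihoods and parameter counts for two hypothetical candidate models:

```python
def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical fits: the larger model fits slightly better but uses more parameters.
candidates = {
    "small model": aic(log_likelihood=-120.0, n_parameters=3),  # 6 + 240 = 246
    "large model": aic(log_likelihood=-118.5, n_parameters=8),  # 16 + 237 = 253
}
best = min(candidates, key=candidates.get)  # the model with the minimum AIC
```

The extra parameters of the larger model are not justified by its small gain in likelihood, so the smaller model is preferred.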
Solution: A
In the case of lasso we apply an absolute penalty; after increasing the penalty in lasso, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation the P (y =1|x; w) , viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49. In the above question, which function do you think would make p lie between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
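The two transformations described above, probability to odds and odds to log-odds, can be written out directly. This is a minimal sketch; the function names are illustrative and p is assumed to lie strictly between 0 and 1.

```python
import math


def odds(p: float) -> float:
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)


def logit(p: float) -> float:
    """Natural log of the odds."""
    return math.log(odds(p))


fair_coin_odds = odds(0.5)   # 0.5 / 0.5
near_zero = logit(0.001)     # large negative value
middle = logit(0.5)          # log(1)
near_one = logit(0.999)      # large positive value
```

As p approaches 0 the logit decreases without bound, and as p approaches 1 it increases without bound.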
A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
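The logistic function is the inverse of the logit: applying one after the other returns the original number. A short round-trip check, with illustrative function names and a handful of arbitrary inputs:

```python
import math


def logistic(z: float) -> float:
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Log-odds of a probability p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


inputs = [-4.0, -0.5, 0.0, 2.0]
probabilities = [logistic(z) for z in inputs]   # each strictly between 0 and 1
round_trip = [logit(p) for p in probabilities]  # recovers the inputs
```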
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the logistic regression. Testing accuracy increases only if the added
features are found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fitted, where the
probability of each category is predicted against the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the following above figure shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
In figure C the decision boundary is not smooth, which means it is overfitting the data.
1. The training error in first plot is maximum as compare to second and third plot.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
polynomial (right graph) might have very high accuracy on the training population but is expected to fail
badly on the test dataset. In the left graph, by contrast, the training error is at its maximum because
the model underfits the training data.
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty and therefore a less complex decision boundary, which is what
the first figure (A) shows.
61. Suppose you are using a logistic regression model on a huge dataset. One problem you may face
on such a dataset is that logistic regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data so that it takes less time
and gives comparatively similar accuracy (it may not be the same)?
Solution: D
Decreasing the number of iterations will surely reduce the training time, but it will not give the same
accuracy. To get similar (though not identical) accuracy, you also need to increase the learning rate.
62. Which of the following images shows the cost function for y = 1?
Following is the loss function in logistic regression (Y axis: loss function, X axis: log probability) for
a two-class classification problem.
Solution: A
A is the correct answer because the loss decreases as the log probability increases.
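The behaviour described in this answer can be checked numerically. The sketch below assumes the usual log loss for a positive example, -log(p), where p is the predicted probability of class 1; the function name and the probability values are illustrative only.

```python
from math import log

def loss_for_positive(p):
    """Log loss for a single example whose true label is y = 1."""
    return -log(p)

# Illustrative predicted probabilities for the positive class.
probabilities = [0.1, 0.3, 0.5, 0.7, 0.9]
losses = [loss_for_positive(p) for p in probabilities]

for p, loss in zip(probabilities, losses):
    print(f"p = {p:.1f}  loss = {loss:.3f}")
```

The printed losses shrink as p grows, which matches a curve that falls toward zero as the predicted probability of the true class approaches 1.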
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No. Logistic regression forms only a linear decision surface, and the examples in the figure are not
linearly separable.
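The figure is not reproduced here, so the sketch below assumes it shows an XOR-style labelling of the four binary points (an assumption, not taken from the source). It searches a small grid of linear boundaries w1*x1 + w2*x2 + b > 0 and finds none that classifies XOR perfectly, while an AND labelling is separated easily. The grid search is only an illustration, not a proof.

```python
from itertools import product

def separable(points, labels, grid):
    """Return True if some (w1, w2, b) on the grid classifies every point correctly."""
    for w1, w2, b in product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(y)
               for (x1, x2), y in zip(points, labels)):
            return True
    return False

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
grid = [i / 2 for i in range(-8, 9)]  # candidate weights from -4.0 to 4.0

xor_ok = separable(points, [0, 1, 1, 0], grid)  # assumed labelling of the figure
and_ok = separable(points, [0, 0, 0, 1], grid)  # a linearly separable contrast case

print("XOR separable on grid:", xor_ok)
print("AND separable on grid:", and_ok)
```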
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high misclassification penalty, the soft margin effectively ceases to exist because there is no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
Datasets which have a clear classification boundary will function best with SVM’s.
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error prone, which means that
you should not trust any specific data point too much. Now suppose you want to build an SVM model
with a quadratic kernel (polynomial of degree 2) that uses the slack penalty C as one of its
hyperparameters. Based on that, answer the following question.
What would happen when you use a very large value of C (C -> infinity)?
Note: For small C, the model was also classifying all data points correctly.
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
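The effect of C in questions 19 and 20 can be illustrated with the standard soft-margin objective, 0.5*||w||^2 + C * (sum of hinge losses). The sketch below is a minimal one-dimensional illustration: the data points and the two candidate boundaries are hypothetical, and the code only compares the two candidates rather than solving the optimisation.

```python
def soft_margin_objective(w, b, C, xs, ys):
    """0.5 * w^2 plus C times the total hinge loss, for 1-D inputs."""
    hinge = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in zip(xs, ys))
    return 0.5 * w * w + C * hinge

# Hypothetical data: the positive point at x = -1 sits close to the negatives.
xs = [-3.0, -2.0, -1.0, 2.0, 3.0]
ys = [-1, -1, 1, 1, 1]

wide = (0.5, 0.0)    # wide margin, misclassifies the point at x = -1
strict = (2.0, 3.0)  # narrow margin, classifies every point correctly

for C in (0.01, 100.0):
    wide_cost = soft_margin_objective(*wide, C, xs, ys)
    strict_cost = soft_margin_objective(*strict, C, xs, ys)
    preferred = "wide" if wide_cost < strict_cost else "strict"
    print(f"C = {C}: wide = {wide_cost:.3f}, strict = {strict_cost:.3f}, preferred = {preferred}")
```

With a small C the wide-margin boundary has the lower cost even though it misclassifies one point; with a large C the boundary that separates every point wins.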
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary, and after training you correctly infer
that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM's hyperparameters so that the effect
is the same as in the previous questions, i.e. the model will not underfit. What would you do?
Solution: A
Increasing the C parameter would be the right thing to do here, since a larger C weakens the
regularization and lets the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose you have the same distribution of classes in the data. Now, say training one model in the
one-vs-all setting takes 10 seconds. How many seconds would it take to train the one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
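The arithmetic behind questions 27 and 28 can be sketched as follows. The class names are hypothetical; the 10 seconds per model comes from the question. One-vs-all trains one binary classifier per class, each treating that class as positive and all others as negative.

```python
classes = ["A", "B", "C", "D"]  # hypothetical labels for the 4-class problem
seconds_per_model = 10          # training time per binary model, from the question

def one_vs_all_tasks(class_names):
    """List the binary problems: (positive class, classes lumped together as negative)."""
    return [(positive, [c for c in class_names if c != positive])
            for positive in class_names]

tasks = one_vs_all_tasks(classes)
total_seconds = len(tasks) * seconds_per_model

for positive, rest in tasks:
    print(f"{positive} vs {rest}")
print("models:", len(tasks), "total seconds:", total_seconds)
```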
29. Suppose your problem has changed. Now the data has only 2 classes. How many times do you think
we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
Suppose you are using an SVM with a polynomial kernel of degree 2. You have applied it
to the data and found that it fits the data perfectly, i.e. training and testing accuracy are both 100%.
30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question after increasing the complexity you found that training accuracy was still
100%. According to you what is the reason behind that?
1. Since data is fixed and we are fitting more polynomial term or parameters so the algorithm starts
memorizing everything in the data
2. Since data is fixed and SVM doesn’t need to search in big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered ways of improving the base
learners' results.
9. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate
the results of these tree. Which of the following is true about individual (Tk) tree in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger number
of trees in model building where possible. Random Forest is a black-box model, so you will lose
interpretability after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these tree. Which of the following is true about individual(Tk) tree in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.
13. Which of the following algorithms do not use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree does not aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger number
of trees in model building where possible. Random Forest is a black-box model, so you will lose
interpretability after using it.
16. True-False: The bagging is suitable for high variance low bias models?
A) TRUE
B) FALSE
Solution: A
The bagging is suitable for high variance low bias models or you can say for complex models.
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenario a gain ratio is preferred over Information Gain?
Solution: A
For attributes with high cardinality, gain ratio is preferred over information gain.
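A small sketch of why this holds: an attribute with a unique value per row (high cardinality) gets the maximum information gain even though it is useless for prediction, while gain ratio divides by the split information and penalises it. The toy dataset and the attribute names `row_id` and `weather` are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def info_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature):
        subset = [y for f, y in zip(feature, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain normalised by the split information of the feature."""
    split_info = entropy(feature)
    return info_gain(feature, labels) / split_info if split_info > 0 else 0.0

labels = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
row_id = list(range(8))  # unique value per row: high cardinality
weather = ["sun", "sun", "sun", "rain", "rain", "rain", "sun", "sun"]

for name, feature in (("row_id", row_id), ("weather", weather)):
    print(f"{name}: gain = {info_gain(feature, labels):.3f}, "
          f"gain ratio = {gain_ratio(feature, labels):.3f}")
```

Information gain ranks `row_id` first, whereas gain ratio ranks `weather` first.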
20. Suppose you have given the following scenario for training and validation error for Gradient
Boosting. Which of the following hyper parameter would you choose in such case?
Scenario  Depth  Training Error  Validation Error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select 2 because the lower depth is the
better hyperparameter.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
THIS IS
MANDATORY
OPTION
THIS IS ALSO
MANDATORY
OPTION
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
((QUESTION)) Suppose you have the following data with one real-value input variable & one real-value output variable. What is the leave-one-out cross-validation mean square error in case of linear regression (Y = bX+c)?
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
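The data table for this question is not reproduced here, so the sketch below uses a hypothetical three-point dataset (X = 0, 2, 3 and Y = 2, 2, 1), chosen only to illustrate the procedure: hold out each point in turn, fit Y = bX + c on the rest by least squares, and average the squared prediction errors. Exact fractions are used so the result can be compared with the answer options.

```python
from fractions import Fraction

def fit_line(xs, ys):
    """Least-squares slope b and intercept c for Y = bX + c."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    c = mean_y - b * mean_x
    return b, c

def loocv_mse(xs, ys):
    """Leave-one-out cross-validation mean squared error."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        b, c = fit_line(train_x, train_y)
        errors.append((ys[i] - (b * xs[i] + c)) ** 2)
    return sum(errors) / len(errors)

xs = [0, 2, 3]  # hypothetical inputs
ys = [2, 2, 1]  # hypothetical outputs
mse = loocv_mse(xs, ys)
print("LOOCV MSE:", mse)
```

For this assumed dataset the three held-out squared errors are 4, 4/9 and 1, giving a mean of 49/27.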
((MARKS)) 1
((QUESTION)) Which of the following is/are true about “Maximum Likelihood estimate (MLE)”?
1. MLE may not always exist
2. MLE always exists
3. If MLE exist, it (they) may not be unique
4. If MLE exist, it (they) must be unique
((OPTION_A)) 1 and 4
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Let’s say, a “Linear regression” model perfectly fits the training data (train error is zero). Now, which of the following statements is true?
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation.
((OPTION_A)) 1,2&3
((OPTION_B)) 1&3
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) Barchart
((OPTION_C)) Histograms
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1&2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following offsets do we use in case of least square line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we have generated the data with the help of polynomial regression of degree 3 (degree 3 will perfectly fit this data). Now consider the points below and choose the option based on these points.
1. Simple Linear regression will have high bias and low variance
2. Simple Linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance
((OPTION_A)) Only 1
((OPTION_B)) 1&3
((OPTION_C)) 1&4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are training a linear regression model. Now consider these points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) are correct?
((OPTION_A)) Both are False
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we fit “Lasso Regression” to a data set which has 100 features (X1, X2, …, X100). Now, we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with the same regularization parameter.
Now, which of the following options will be correct?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following metrics can be used for evaluating regression models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE
((OPTION_A)) 2 and 4
((OPTION_B)) 1 and 2
((OPTION_C)) 2, 3 and 4
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
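As a minimal sketch, most of the metrics listed in this question can be computed directly from the true and predicted values. The function name, the sample values and the number of features are illustrative assumptions; the F statistic is left out to keep the example short.

```python
from math import sqrt

def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, MAE, R squared and adjusted R squared."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(r * r for r in residuals)            # sum of squared errors
    mean_y = sum(y_true) / n
    sst = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {
        "mse": sse / n,
        "rmse": sqrt(sse / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": r2,
        "adjusted_r2": adj_r2,
    }

y_true = [3.0, 5.0, 7.0, 9.0, 11.0]  # illustrative targets
y_pred = [2.5, 5.5, 7.0, 8.5, 11.5]  # illustrative predictions
metrics = regression_metrics(y_true, y_pred, n_features=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Adjusted R squared is never larger than R squared here because it penalises the number of features used.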
((MARKS)) 1
((QUESTION)) We can also compute the coefficient of linear regression with the help of an analytical method called “Normal Equation”. Which of the following is/are true about “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate
((OPTION_A)) 1 and 2
((OPTION_B)) 1&3
((OPTION_C)) 2&3
((OPTION_D)) 1,2&3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
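The sketch below illustrates the three statements for the simplest case, one feature plus an intercept, where the normal equation (X^T X) theta = X^T y is a 2x2 system that can be solved in closed form. There is no learning rate and no iteration; the data values are hypothetical. With many features the matrix X^T X grows, and solving the system becomes the expensive step.

```python
def normal_equation(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = c + b * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sum_x], [sum_x, sum_xx]] and X^T y = [sum_y, sum_xy]
    det = n * sum_xx - sum_x * sum_x
    c = (sum_xx * sum_y - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return c, b

xs = [0.0, 1.0, 2.0, 3.0]  # hypothetical inputs
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
intercept, slope = normal_equation(xs, ys)
print("intercept:", intercept, "slope:", slope)
```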
((MARKS)) 1
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the regression line is defined as:
Y = β0 + β1 X1 + β2 X2 + … + βn Xn
Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a positive or negative number).
2. The value of βi is always the same, regardless of values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1
((OPTION_B)) 2
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
((QUESTION)) Below graphs show two fitted regression lines (A & B) on randomly generated data. Now, I want to find the sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y. Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider the remaining parameters to be the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.
((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the graph show the residuals for each predicted value. Use this information to compute the SSE.
((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
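The SSE is the sum of the squared residuals. The graph is not reproduced here, so the residual values in this sketch are hypothetical and serve only to show the calculation.

```python
# Hypothetical residuals (observed minus predicted); the original graph is not available.
residuals = [-0.2, 0.4, -0.8, 1.3, -0.7]

# Sum of squared errors: square each residual, then add them up.
sse = sum(r ** 2 for r in residuals)
print(f"SSE = {sse:.2f}")
```

For these assumed values the squares are 0.04, 0.16, 0.64, 1.69 and 0.49, which add up to 3.02.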
((MARKS)) 1
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following methods do we use to best fit the data in Logistic Regression?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
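As general background for this question: logistic regression is usually fit by maximum likelihood, that is, by choosing the weights that make the observed labels most probable. The following is a minimal sketch only, assuming a single feature, a small made-up dataset, and plain gradient ascent; the names `fit` and `log_likelihood` are illustrative and not taken from any library.

```python
import math

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    # Sum of log-probabilities the model assigns to the observed labels.
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit(xs, ys, lr=0.1, steps=2000):
    # Gradient ascent on the log-likelihood (hypothetical settings).
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = sum((y - sigmoid(w * x + b)) * x for x, y in zip(xs, ys))
        gb = sum((y - sigmoid(w * x + b)) for x, y in zip(xs, ys))
        w += lr * gw
        b += lr * gb
    return w, b

# Made-up, non-separable data: larger x tends to go with label 1.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0, 0, 1, 0, 1, 1]
w, b = fit(xs, ys)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(w, b, xs, ys))
```

The fitted weights give a higher log-likelihood than the starting point of all zeros, which is the sense in which maximum likelihood "best fits" the data.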
((QUESTION)) One of the very good methods to analyze the performance of Logistic Regression is AIC, which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
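For reference, AIC is defined as 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; among candidate models fit to the same data, the one with the lowest AIC is preferred. A small sketch, assuming hypothetical log-likelihood values for two models:

```python
def aic(log_likelihood, n_params):
    # Akaike Information Criterion: 2k - 2 ln(L).
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits on the same data (values are made up for illustration).
aic_small = aic(log_likelihood=-120.0, n_params=3)
aic_large = aic(log_likelihood=-119.5, n_params=6)
print(aic_small, aic_large)
```

Here the larger model improves the log-likelihood only slightly, so the penalty for its extra parameters leaves the smaller model with the lower AIC.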
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) LASSO
((OPTION_B)) Ridge
((OPTION_C)) Both
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
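The odds of an event are the probability that it happens divided by the probability that it does not. A short sketch of that definition (the function name `odds` is illustrative):

```python
def odds(p):
    # Odds = P(event) / P(not event); undefined when p == 1.
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

fair_coin_odds = odds(0.5)   # heads and tails are equally likely
biased_odds = odds(0.75)     # three times as likely to happen as not
print(fair_coin_odds, biased_odds)
```

For a fair coin the probability of heads is 0.5, so the odds of heads come out to 1.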
((QUESTION)) The logit function (given as l(x)) is the log-of-odds function. What could be the range of the logit function in the domain x = [0, 1]?
((OPTION_A)) (-∞, ∞)
((OPTION_B)) (0, 1)
((OPTION_C)) (0, ∞)
((OPTION_D)) (-∞, 0)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
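The logit is log(x / (1 - x)). As x approaches 0 it decreases without bound, and as x approaches 1 it increases without bound, which is why its range is the whole real line. A quick numerical sketch (the probe values are arbitrary):

```python
import math

def logit(p):
    # Log of the odds; finite only strictly inside (0, 1).
    if not 0.0 < p < 1.0:
        raise ValueError("logit is finite only for 0 < p < 1")
    return math.log(p / (1.0 - p))

for p in (1e-9, 0.25, 0.5, 0.75, 1 - 1e-9):
    print(p, logit(p))
```

The printed values run from large negative through zero at x = 0.5 to large positive, and they are symmetric around 0.5.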
((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not the case
((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is not the case
((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be normally distributed
((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following is true regarding the logistic function for any value "x"?
Note:
Logistic(x): is the logistic function of any number "x"
Logit(x): is the logit function of any number "x"
Logit_inv(x): is the inverse logit function of any number "x"
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
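As background, the logistic function 1 / (1 + e^(-x)) is the inverse of the logit function, so applying one after the other returns the original value. A minimal check, with arbitrary test points:

```python
import math

def logistic(x):
    # Logistic (sigmoid) function.
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # Log-odds; defined for 0 < p < 1.
    return math.log(p / (1.0 - p))

points = (-3.0, 0.0, 2.5)
round_trip = [logit(logistic(x)) for x in points]
print(round_trip)
```

Up to floating-point error, `round_trip` reproduces `points`, which is what it means for Logistic(x) to coincide with Logit_inv(x).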
((QUESTION)) Suppose you applied a Logistic Regression model to a given dataset and got a training accuracy X and a testing accuracy Y. Now you want to add a few new features to the same data. Select the option(s) which is/are correct in such a case.
Note: Consider the remaining parameters to be the same.
((OPTION_E))
((CORRECT_CHOICE)) A&D
((EXPLANATION))
((MARKS)) 1
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face on such huge data is that Logistic Regression will take a very long time to train. What would you do if you want to train logistic regression on the same data in less time while getting comparatively similar accuracy (it may not be the same)?
((OPTION_A)) Decrease the learning rate and decrease the number of iterations
((OPTION_B)) Decrease the learning rate and increase the number of iterations
((OPTION_C)) Increase the learning rate and increase the number of iterations
((OPTION_D)) Increase the learning rate and decrease the number of iterations
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 2
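The trade-off between learning rate and iteration count can be illustrated with plain gradient descent on the logistic loss. This is a sketch under assumptions: a tiny made-up dataset, one feature, and hand-picked settings; on real data a learning rate that is too large can overshoot or diverge.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b, xs, ys):
    # Average negative log-likelihood (logistic loss).
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)

def train(xs, ys, lr, steps):
    # Batch gradient descent starting from zero weights.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0, 0, 1, 0, 1, 1]
slow = mean_loss(*train(xs, ys, lr=0.01, steps=1000), xs, ys)
fast = mean_loss(*train(xs, ys, lr=0.1, steps=100), xs, ys)
print(slow, fast)
```

With ten times the learning rate and one tenth of the iterations, the second run reaches a similar loss while doing far less work.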
((QUESTION)) Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a two-class classification problem.
Note: Y is the target class.
Which of the following images shows the cost function for y = 1?
((OPTION_A)) A
((OPTION_B)) B
((OPTION_C)) BOTH
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The odds ratio is
((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.
((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
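In a logistic model the odds are exp(b0 + b1 * x), so the ratio of the odds after a unit change in the predictor to the original odds is exp(b1), whatever the starting value of x. A short sketch, assuming hypothetical coefficients `b0` and `b1`:

```python
import math

def model_odds(b0, b1, x):
    # Odds of the outcome under a logistic model with one predictor.
    return math.exp(b0 + b1 * x)

b0, b1 = -1.0, 0.7  # hypothetical coefficients, for illustration only
odds_ratio = model_odds(b0, b1, 3.0) / model_odds(b0, b1, 2.0)
print(odds_ratio, math.exp(b1))
```

Both printed numbers agree, which is why exp(b1) is reported as the odds ratio for a predictor.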
((QUESTION)) Large values of the log-likelihood statistic indicate:
((OPTION_A)) That there are a greater number of explained vs. unexplained observations.
((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Logistic regression assumes a:
((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.
((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In binary logistic regression:
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
((QUESTION))
((OPTION_A))
((OPTION_B))
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
((QUESTION))
((OPTION_A))
((OPTION_B))
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and artificial intelligence (AI).
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
((OPTION_A)) Linear in D
((OPTION_B)) Exponential in D
((OPTION_C)) Linear in N
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) -1.66
((OPTION_B)) 2
((OPTION_C)) 3
((OPTION_D)) 4
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Let us say that we have computed the gradient of our cost function and stored it in a vector g. What is the cost of one gradient descent update given the gradient?
((OPTION_A)) O(D)
((OPTION_B)) O(N)
((OPTION_C)) O(ND)
((OPTION_D)) O(ND^2)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
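A minimal sketch of why the update is linear in the number of parameters D, assuming the weights and the gradient g are plain Python lists of length D. The function name `gradient_step` and the learning rate `lr` are illustrative assumptions, not part of the question bank. Once g is available, the update touches each parameter exactly once and never revisits the N training samples.

```python
def gradient_step(w, g, lr=0.1):
    """Return w - lr * g: one multiply and one subtract per parameter, so O(D)."""
    if len(w) != len(g):
        raise ValueError("w and g must both have length D")
    return [wi - lr * gi for wi, gi in zip(w, g)]


# Hypothetical values with D = 3 parameters.
w = [1.0, -2.0, 0.5]
g = [0.5, -1.0, 0.0]
w_new = gradient_step(w, g, lr=0.1)
```

Computing g itself is the part whose cost typically depends on N; the step shown here does not.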
((MARKS)) 1
((QUESTION)) You observe the following while fitting a linear regression to the data: As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low (almost what you expect it to be), while the test error is much higher than the train error. What do you think is the main reason behind this behavior? Choose the most probable option.
((OPTION_A)) High variance
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Adding more basis functions in a linear model... (pick the most probable option)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_C)) Serration
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) You are given data about seismic activity in Japan, and you want to predict the magnitude of the next earthquake. This is an example of
((OPTION_C)) Serration
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Classification
((OPTION_B)) Regression
((OPTION_C)) Clustering
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Outcome
((OPTION_B)) Feature
((OPTION_C)) Attribute
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) It may be better to avoid the ROC curve as a metric because it can suffer from the accuracy paradox.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The expected value or _______ of a random variable is the center of its distribution.
((OPTION_A)) Mode
((OPTION_B)) median
((OPTION_C)) mean
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) variance
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The square root of the variance is called the ________ deviation
((OPTION_A)) empirical
((OPTION_B)) mean
((OPTION_C)) continuous
((OPTION_D)) standard
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
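A small check of the relationship in this question, sketched with Python's standard `statistics` module on a hypothetical data set (the numbers are illustrative, not from the question bank). It uses the population forms `pvariance` and `pstdev`; the sample forms `variance` and `stdev` are related in the same way.

```python
import math
import statistics

# Hypothetical observations, chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation

# The standard deviation is the square root of the variance.
matches = math.isclose(std_dev, math.sqrt(variance))
```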
((MARKS)) 1
((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
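The statement is false because the relationship runs the other way: for a continuous random variable the PDF is the derivative of the CDF. A minimal numerical sketch, assuming an exponential distribution with rate 2.0 as the example; the helper names and the step size `h` are illustrative choices.

```python
import math


def exp_pdf(x, rate=2.0):
    """Density of the exponential distribution for x >= 0."""
    return rate * math.exp(-rate * x)


def exp_cdf(x, rate=2.0):
    """Cumulative distribution of the exponential distribution for x >= 0."""
    return 1.0 - math.exp(-rate * x)


def numeric_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.7
slope_of_cdf = numeric_derivative(exp_cdf, x)  # approximates d/dx CDF
density = exp_pdf(x)                           # exact PDF value at x
```

The two values agree up to the finite-difference error, which is what "the PDF is the derivative of the CDF" predicts.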
((MARKS)) 1
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Consider the results of a medical experiment that aims to predict whether someone is going to develop myopia based on some physical measurements and heredity. In this case, the input dataset consists of the person's medical characteristics and the target variable is binary: 1 for those who are likely to develop myopia and 0 for those who aren't. This can be best classified as
((OPTION_A)) Regression
((OPTION_C)) Clustering
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The purpose of a machine learning model is to approximate an unknown function that associates input elements to output ones
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Training set is normally a representation of a global distribution
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 2
((QUESTION)) The model has an excessive capacity and is no longer able to generalize from the original dynamics provided by the training set. This problem is called
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) It can associate almost perfectly all the known samples to the corresponding output values, but when an unknown input is presented, the corresponding prediction error can be very high. This problem is called
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
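The behaviour described in this question, near-perfect results on known samples and large errors on unseen inputs, is what is commonly called overfitting. A toy sketch, assuming a one-nearest-neighbour "memoriser" as the high-capacity model and a hypothetical noisy data set generated around y = 2x; none of these names or numbers come from the question bank.

```python
def nearest_neighbour_predict(x, train_x, train_y):
    """Predict by copying the target of the closest training input (1-NN)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def mean_squared_error(xs, ys, train_x, train_y):
    """Average squared error of the 1-NN predictor on the pairs (xs, ys)."""
    errors = [(nearest_neighbour_predict(x, train_x, train_y) - y) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Hypothetical data: the underlying rule is y = 2x, but training targets are noisy.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.5, 0.5, 5.5, 4.5, 9.5]
test_x = [0.4, 1.4, 2.4, 3.4]
test_y = [2 * x for x in test_x]

# The memoriser reproduces every training target exactly, noise included.
train_mse = mean_squared_error(train_x, train_y, train_x, train_y)
# On inputs it has not seen, the memorised noise turns into prediction error.
test_mse = mean_squared_error(test_x, test_y, train_x, train_y)
```

A training error of zero alongside a clearly non-zero test error is the usual signal to look for.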
((MARKS)) 1
((QUESTION)) ---------- may prove to be more difficult to discover as it could be initially considered the result of a perfect fitting
((OPTION_A)) Underfitting
((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) When working with a supervised scenario, we define a non-negative error measure e_m which takes two arguments and allows us to compute a total error value over the whole dataset. Those two arguments are
((OPTION_A)) expected and predicted output
((OPTION_B)) calculated and predicted output
((OPTION_C)) calculated and measured output
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Initial value represents a starting point over the surface of an n-variable function. A generic training algorithm has to find the global minimum or a point quite close to it (there's always a tolerance to avoid an excessive number of iterations and a consequent risk of overfitting). This measure is also called
((OPTION_A)) loss function
((OPTION_B)) predicted output
((OPTION_C)) measured output
((OPTION_D)) mean square error
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same output element
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) An exponential time could lead to computational explosions when the datasets are too large or the optimization starting point is very far from an acceptable minimum. Moreover, it's important to remember the so-called …….
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) First term is called
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Second term is called
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Third term is called
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and
artificial intelligence (AI).
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) Linear in D
THIS IS
MANDATORY
OPTION
((OPTION_B)) Exponential in D
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Linear in N
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) -1.66
THIS IS
MANDATORY
OPTION
((OPTION_B)) 2
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 3
This is optional
((OPTION_D)) 4
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update,
given the gradient?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) O(D)
THIS IS
MANDATORY
OPTION
((OPTION_B)) O(N)
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) O(ND)
This is optional
((OPTION_D)) O(ND^2)
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
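A minimal Python sketch of the update step this question refers to, assuming the parameters form a vector w with D entries and the gradient g is already available. The names `gradient_step`, `w`, `g` and `lr` are illustrative and do not come from the question bank. The update touches each of the D coordinates once and never looks at the N training examples, which is why the step itself costs O(D).

```python
def gradient_step(w, g, lr=0.1):
    """Return w - lr * g for two plain Python lists of equal length D."""
    if len(w) != len(g):
        raise ValueError("w and g must have the same length")
    # One multiplication and one subtraction per coordinate: D operations.
    return [w_i - lr * g_i for w_i, g_i in zip(w, g)]


w = [1.0, 2.0, 3.0]   # hypothetical parameter vector (D = 3)
g = [0.5, -1.0, 0.0]  # hypothetical gradient, assumed already computed
print(gradient_step(w, g, lr=0.1))
```

Computing g in the first place is a separate cost that usually depends on N; the sketch only covers the update once g is given.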
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) You observe the following while fitting a linear regression to the data: as
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you expect
it to be), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) High variance
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Adding more basis functions in a linear model... (pick the most probable
option)
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_C)) Serration
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) You are given data about seismic activity in Japan, and you want to
predict the magnitude of the next earthquake. This is an example of
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_C)) Serration
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) Classification
THIS IS
MANDATORY
OPTION
((OPTION_B)) Regression
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Clustering
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) Outcome
THIS IS
MANDATORY
OPTION
((OPTION_B)) Feature
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Attribute
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) It may be better to avoid the ROC curve metric, as it can suffer
from the accuracy paradox.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The expected value or _______ of a random variable is the center of its
distribution.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Mode
THIS IS
MANDATORY
OPTION
((OPTION_B)) median
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) mean
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) variance
THIS IS
MANDATORY
OPTION
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The square root of the variance is called the ________ deviation
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) empirical
THIS IS
MANDATORY
OPTION
((OPTION_B)) mean
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) continuous
This is optional
((OPTION_D)) standard
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH D
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
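A small Python check of the relationship asked about here, using only the standard library. The data values are made up for illustration, and the population variance is assumed (`statistics.pvariance`); with the sample variance the same square-root relationship holds against `statistics.stdev`.

```python
import math
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

variance = statistics.pvariance(data)  # population variance
std_dev = math.sqrt(variance)          # standard deviation = sqrt(variance)

print(variance, std_dev, statistics.pstdev(data))
```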
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
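A numerical sketch of why the statement in this question is reversed: for a continuous random variable the PDF is the derivative of the CDF, not the other way round. The standard normal distribution is assumed here purely as an example, and the helper names are illustrative.

```python
import math


def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def numeric_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# The derivative of the CDF should match the PDF at every point.
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, numeric_derivative(normal_cdf, x), normal_pdf(x))
```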
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Consider the results of a medical experiment that aims to predict whether someone is
going to develop myopia based on some physical measurements and heredity. In this
case, the input dataset consists of the person’s medical characteristics and the target
variable is binary: 1 for those who are likely to develop myopia and 0 for those who
aren’t. This can be best classified as
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Regression
THIS IS
MANDATORY
OPTION
((OPTION_C)) Clustering
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The purpose of a machine learning model is to approximate an unknown function
that associates input elements to output ones
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The training set is normally a representation of a global distribution.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The model has an excessive capacity and is no longer able to
generalize, considering the original dynamics provided by the training set. This
problem is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION
((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_D)) None
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The model can associate almost perfectly all the known samples to the corresponding
output values, but when an unknown input is presented, the corresponding prediction
error can be very high. This problem is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION
((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_D)) None
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
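A toy Python sketch of the behaviour described in this question, under stated assumptions: the "model" is a hypothetical nearest-neighbour memorizer, and the training and test points are made up (noisy samples of y = x). It reproduces every known sample exactly, yet its error on unseen inputs is larger, which is the pattern commonly associated with overfitting.

```python
def fit_memorizer(train_pairs):
    """Store every training pair; predict with the nearest stored input."""
    table = dict(train_pairs)

    def predict(x):
        nearest = min(table, key=lambda known_x: abs(known_x - x))
        return table[nearest]

    return predict


def mean_squared_error(pairs, predict):
    """Average squared difference between expected and predicted outputs."""
    return sum((y - predict(x)) ** 2 for x, y in pairs) / len(pairs)


# Hypothetical data: noisy training samples of y = x, clean unseen test points.
train = [(0.0, 0.0), (1.0, 1.5), (2.0, 1.6), (3.0, 3.4)]
test = [(0.4, 0.4), (1.4, 1.4), (2.6, 2.6)]

model = fit_memorizer(train)
train_error = mean_squared_error(train, model)
test_error = mean_squared_error(test, model)
print(train_error, test_error)
```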
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) __________ may prove to be more difficult to discover, as it could initially be
considered the result of a perfect fitting.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION
((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) Both
This is optional
((OPTION_D)) None
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) When working with a supervised scenario, we define a non-negative error
measure e_m which takes two arguments and allows us to compute a total error
value over the whole dataset. Those two arguments are
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) expected and predicted output
THIS IS
MANDATORY
OPTION
((OPTION_B)) calculated and predicted output
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) calculated and measured output
This is optional
((OPTION_D)) none
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
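A minimal Python sketch of such an error measure, assuming the squared difference as the per-sample measure (the question does not fix a particular one). The function names are illustrative. Each call takes the expected and the predicted output, returns a non-negative value, and the total error is the sum over the whole dataset.

```python
def squared_error(expected, predicted):
    """Non-negative error for one sample (assumed form: squared difference)."""
    return (expected - predicted) ** 2


def total_error(expected_outputs, predicted_outputs):
    """Sum the per-sample error over the whole dataset."""
    if len(expected_outputs) != len(predicted_outputs):
        raise ValueError("both sequences must have the same length")
    return sum(
        squared_error(y, y_hat)
        for y, y_hat in zip(expected_outputs, predicted_outputs)
    )


# Hypothetical expected and predicted outputs.
print(total_error([1.0, 0.0, 2.0], [1.0, 1.0, 0.0]))
```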
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The initial value represents a starting point over the surface of an n-variable function.
A generic training algorithm has to find the global minimum or a point quite close to it
(there's always a tolerance to avoid an excessive number of iterations and a consequent risk
of overfitting). This measure is also called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) loss function
THIS IS
MANDATORY
OPTION
((OPTION_B)) predicted output
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) measured output
This is optional
((OPTION_D)) mean square error
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same
output element
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A)) True
THIS IS
MANDATORY
OPTION
((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) An exponential time could lead to computational explosions when the datasets
are too large or the optimization starting point is very far from an acceptable minimum.
Moreover, it's important to remember the so-called …….
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The first term is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION
((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The second term is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION
((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The third term is called
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION
((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
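The expression these three questions refer to is not reproduced in this text. Assuming it is Bayes' rule in its proportional form, posterior ∝ prior × likelihood, the following Python sketch shows how the three quantities relate. The hypotheses, probabilities and function name are made up for illustration.

```python
def posterior(priors, likelihoods):
    """Combine priors P(h) and likelihoods P(data | h) into posteriors P(h | data)."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())  # normalizing constant P(data)
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical two-hypothesis example.
priors = {"spam": 0.2, "ham": 0.8}       # a priori probabilities
likelihoods = {"spam": 0.6, "ham": 0.1}  # P(observed word | class)

print(posterior(priors, likelihoods))    # a posteriori probabilities
```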
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have the following data with one real-valued input
variable and one real-valued output variable. What is the leave-one-out
cross-validation mean squared error for linear regression (Y = bX + c)?
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
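The data table for the leave-one-out question above did not survive extraction. As a minimal sketch, the procedure can be traced on an assumed three-point dataset, (0, 2), (2, 2), (3, 1); these points are an illustration chosen here, not taken from the source, and they happen to reproduce the 49/27 given as the correct choice. The helper names `fit_line` and `loocv_mse` are hypothetical.

```python
from fractions import Fraction

def fit_line(points):
    """Ordinary least squares fit of y = b*x + c; returns (b, c)."""
    n = len(points)
    mean_x = Fraction(sum(x for x, _ in points), n)
    mean_y = Fraction(sum(y for _, y in points), n)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    c = mean_y - b * mean_x
    return b, c

def loocv_mse(points):
    """Hold out each point in turn, fit on the rest, average the squared errors."""
    squared_errors = []
    for i, (x_out, y_out) in enumerate(points):
        train = points[:i] + points[i + 1:]
        b, c = fit_line(train)
        squared_errors.append((y_out - (b * x_out + c)) ** 2)
    return sum(squared_errors) / len(squared_errors)

# Assumed data (not from the source document).
data = [(0, 2), (2, 2), (3, 1)]
print(loocv_mse(data))  # 49/27
```

The three held-out squared errors are 4, 4/9 and 1, so the mean is (49/9)/3 = 49/27. Exact fractions are used so the result can be compared directly with the answer options.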
((QUESTION)) Which of the following is/are true about the maximum likelihood
estimate (MLE)?
1. The MLE may not always exist
2. The MLE always exists
3. If the MLE exists, it may not be unique
4. If the MLE exists, it must be unique
((OPTION_A)) 1 and 4
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
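Statement 3 of the MLE question above (an MLE need not be unique) can be illustrated with a small sketch. It assumes i.i.d. draws from Uniform[theta, theta + 1], a textbook case chosen here for illustration; the sample values and the function name `uniform_likelihood` are hypothetical and do not come from the source.

```python
def uniform_likelihood(theta, sample):
    """Likelihood of i.i.d. draws from Uniform[theta, theta + 1].

    The density is 1 on the interval and 0 outside it, so the likelihood
    is 1 when every observation lies in [theta, theta + 1] and 0 otherwise.
    """
    inside = all(theta <= x <= theta + 1 for x in sample)
    return 1.0 if inside else 0.0

# Hypothetical sample.
sample = [0.2, 0.5, 0.9]

# Any theta with max(sample) - 1 <= theta <= min(sample) attains the
# maximum likelihood, so the maximizer is a whole interval, not a point.
candidates = [0.0, 0.1, 0.15]
print([uniform_likelihood(t, sample) for t in candidates])  # [1.0, 1.0, 1.0]
print(uniform_likelihood(0.3, sample))                      # 0.0
```

The sketch covers only non-uniqueness; it does not demonstrate the separate case in which no maximizer exists.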
((QUESTION)) Suppose a linear regression model fits the training data perfectly
(training error is zero). Which of the following statements is true?
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) The p-value for the null hypothesis "beta coefficient = 0" is 0.0001
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following assumptions do we make while deriving linear
regression parameters?
1. The true relationship between the dependent variable y and the predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with zero mean and constant standard deviation
((OPTION_A)) 1, 2 & 3
((OPTION_B)) 1 & 3
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) Bar chart
((OPTION_C)) Histograms
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 & 2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following offsets do we use in a least-squares line fit?
Suppose the horizontal axis is the independent variable and the vertical axis is
the dependent variable.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we have generated the data with the help of polynomial
regression of degree 3 (degree 3 will fit this data perfectly). Now consider the
points below and choose the option based on them.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. A polynomial of degree 3 will have low bias and high variance
4. A polynomial of degree 3 will have low bias and low variance
((OPTION_A)) Only 1
((OPTION_B)) 1 & 3
((OPTION_C)) 1 & 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are training a linear regression model. Now consider
these points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) are correct?
((OPTION_A)) Both are false
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we fit Lasso regression to a data set that has 100 features
(X1, X2, ..., X100). Now we rescale one of these features by multiplying it by 10
(say that feature is X1), and then refit Lasso regression with the same
regularization parameter. Which of the following options will be correct?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
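The answer options for the Lasso rescaling question above were lost in extraction, so the sketch below does not map to any particular option. It only illustrates the underlying effect in the simplest setting: one feature, no intercept, and the objective (1/(2n)) * sum((y - b*x)^2) + lam * |b|. That parameterization, the data, and the name `lasso_1d` are assumptions made here for illustration.

```python
def lasso_1d(xs, ys, lam):
    """Closed-form Lasso coefficient for one feature and no intercept.

    Minimizes (1/(2n)) * sum((y - b*x)^2) + lam * |b| by soft-thresholding
    rho = mean(x*y) at lam and dividing by z = mean(x*x).
    """
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n
    z = sum(x * x for x in xs) / n
    if rho > lam:
        return (rho - lam) / z
    if rho < -lam:
        return (rho + lam) / z
    return 0.0  # the penalty outweighs the correlation, so the feature is dropped

# Hypothetical data.
xs = [-1.0, 0.0, 1.0]
ys = [-0.5, 0.0, 0.5]
lam = 0.5

print(lasso_1d(xs, ys, lam))                      # 0.0 (feature excluded)
print(lasso_1d([10 * x for x in xs], ys, lam))    # nonzero (feature kept)
```

Multiplying the feature by 10 multiplies rho by 10 while lam stays fixed, so the same penalty is less likely to shrink that coefficient all the way to zero.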
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) . 1 and 2
THIS IS
MANDATORY
OPTION
((OPTION_B)) 1 and 3
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C)) 2 and 4
This is optional
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH A
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following metrics can be used for evaluating regression models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE
((OPTION_A)) 2 and 4
((OPTION_B)) 1 and 2
((OPTION_C)) 2, 3 and 4
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
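As a rough illustration of the metrics named in the question above, the sketch below computes R squared, adjusted R squared, MSE, RMSE and MAE in plain Python. The helper name `regression_metrics` and the sample values are assumptions made for this example only; the F statistic is left out to keep the sketch short.

```python
import math

def regression_metrics(y_true, y_pred, n_features):
    """Hypothetical helper: common regression metrics for one set of predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R squared penalises the number of predictors
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "adj_r2": adj_r2, "mse": mse,
            "rmse": math.sqrt(mse), "mae": mae}

# Illustrative values only (not taken from the quiz)
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.5], n_features=1)
print(metrics)
```

Adjusted R squared is never larger than R squared for the same fit, which is why it is often preferred when comparing models with different numbers of features.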
((QUESTION)) We can also compute the coefficients of linear regression with the help of an analytical method called “Normal Equation”. Which of the following is/are true about the “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_D)) 1, 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
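A minimal sketch of the normal equation for the single-feature case, written in plain Python. It solves the 2x2 system X^T X theta = X^T y directly, so there is no learning rate and no iteration. The function name `normal_equation_1d` and the sample data are assumptions for illustration.

```python
def normal_equation_1d(xs, ys):
    """Solve X^T X theta = X^T y for the design matrix with columns [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # The system is [[n, sx], [sx, sxx]] @ [b0, b1] = [sy, sxy];
    # invert the 2x2 matrix in closed form (Cramer's rule).
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

# Illustrative data lying exactly on y = 1 + 2x
intercept, slope = normal_equation_1d([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

With many features the same idea requires inverting (or factorising) a features-by-features matrix, which is the step that becomes slow as the number of features grows.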
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the regression line is defined as:
Y = β0 + β1 X1 + β2 X2 + … + βn Xn
Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a positive or negative number).
2. The value of βi is always the same, regardless of the values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1
((OPTION_B)) 2
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
((QUESTION)) The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I want to find the sum of residuals in both cases A and B. Which of the following statements is true about the sum of residuals of A and B?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
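The graphs for the question above are not reproduced here, so the following is only a sketch under an assumption: if a line is an ordinary least-squares fit that includes an intercept, its residuals sum to zero (up to floating-point error) whatever the data look like. The helper `fit_line` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line with intercept: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Illustrative noisy data (not from the quiz)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(xs, ys)
residual_sum = sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))
print(residual_sum)  # very close to zero
```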
((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
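A small sketch supporting the "NO" answer above: the Pearson correlation between x and x squared is high for positive x, even though the relationship is quadratic rather than linear. The function `pearson` is a plain-Python helper written for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = list(range(1, 11))
ys = [x ** 2 for x in xs]  # a nonlinear (quadratic) relationship
r = pearson(xs, ys)
print(r)  # strongly correlated, yet not a straight line
```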
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y. Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider the remaining parameters to be the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.
((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the graph show the residuals for each predicted value. Use this information to compute the SSE.
((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
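The graph for the SSE question is not available in this text, so the residual values below are hypothetical, chosen only to show the calculation: SSE is the sum of the squared residuals.

```python
# Hypothetical residuals (assumed for illustration; the original graph is missing)
residuals = [-0.2, 0.4, -0.8, 1.3, -0.7]

# Sum of squared errors: square each residual, then add them up
sse = sum(r ** 2 for r in residuals)
print(round(sse, 2))
```

For these assumed values the squares are 0.04, 0.16, 0.64, 1.69 and 0.49, which add up to 3.02.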
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following methods do we use to best fit the data in Logistic Regression?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) One of the very good methods to analyze the performance of Logistic Regression is AIC, which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) LASSO
((OPTION_B)) Ridge
((OPTION_C)) Both
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
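For reference, odds are defined as the probability of an event divided by the probability of it not happening. A minimal sketch for the fair-coin case above (the helper name `odds` is an assumption of this example):

```python
def odds(p):
    """Odds of an event with probability p, for 0 <= p < 1."""
    return p / (1 - p)

# A fair coin has P(heads) = 0.5
fair_coin_odds = odds(0.5)
print(fair_coin_odds)
```

So for a fair coin the odds of heads are 1, i.e. heads and tails are equally likely.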
((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could be the range of the logit function in the domain x=[0,1]?
((OPTION_A)) (−∞, ∞)
((OPTION_B)) (0, 1)
((OPTION_C)) (0, ∞)
((OPTION_D)) (−∞, 0)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
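A short sketch of why the logit covers the whole real line: it is log(p / (1 - p)), which is 0 at p = 0.5 and grows without bound in either direction as p approaches 0 or 1. The function is undefined exactly at the endpoints, so the example only evaluates it very close to them.

```python
import math

def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1 - p))

print(logit(0.5))        # zero at the midpoint
print(logit(1e-9))       # large negative value near p = 0
print(logit(1 - 1e-9))   # large positive value near p = 1
```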
((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not required
((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is not required
((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be normally distributed
((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
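The three functions named in the note above are related by a standard identity: the logistic function is the inverse of the logit. The sketch below checks this numerically; the function names are chosen for this example and the test points are arbitrary.

```python
import math

def logistic(x):
    """Logistic (sigmoid) function: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log-odds: maps a probability in (0, 1) back to the real line."""
    return math.log(p / (1 - p))

# Applying logit after logistic recovers the original value
for x in (-3.0, 0.0, 2.5):
    print(x, logit(logistic(x)))
```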
((QUESTION)) Suppose you applied a Logistic Regression model on a given data set and got a training accuracy X and testing accuracy Y. Now, you want to add a few new features to the same data. Select the option(s) which is/are correct in such a case.
Note: Consider the remaining parameters to be the same.
((OPTION_E))
((CORRECT_CHOICE)) A&D
((EXPLANATION))
((MARKS)) 1
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face with such huge data is that Logistic Regression will take a very long time to train. What would you do if you want to train logistic regression on the same data so that it takes less time while giving comparatively similar accuracy (it may not be the same)?
((OPTION_A)) Decrease the learning rate and decrease the number of iterations
((OPTION_B)) Decrease the learning rate and increase the number of iterations
((OPTION_C)) Increase the learning rate and increase the number of iterations
((OPTION_D)) Increase the learning rate and decrease the number of iterations
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 2
((QUESTION)) Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a two-class classification problem.
Note: Y is the target class.
Which of the following images shows the cost function for y = 1?
((OPTION_A)) A
((OPTION_B)) B
((OPTION_C)) BOTH
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
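The images for the question above are not available here. As a sketch, assuming the standard log loss used with logistic regression, the cost for a single example is -log(p) when y = 1 and -log(1 - p) when y = 0, where p is the predicted probability of class 1. For y = 1 the cost therefore falls towards zero as p approaches 1. The helper name `log_loss_single` is hypothetical.

```python
import math

def log_loss_single(y, p):
    """Log loss for one example with true label y (0 or 1) and predicted P(y=1) = p."""
    return -math.log(p) if y == 1 else -math.log(1 - p)

# For y = 1 the loss shrinks as the predicted probability grows
for p in (0.1, 0.5, 0.9):
    print(p, log_loss_single(1, p))
```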
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The odds ratio is
((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.
((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
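To make the definition in option C concrete, here is a sketch with a one-predictor logistic model. The coefficients `b0` and `b1` are hypothetical values picked for the example. The ratio of the odds at x + 1 to the odds at x comes out as exp(b1), independent of x.

```python
import math

b0, b1 = -1.0, 0.7  # hypothetical intercept and slope of a logistic model

def prob(x):
    """Predicted probability from the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(x):
    """Odds of the outcome at predictor value x."""
    p = prob(x)
    return p / (1 - p)

# Odds ratio for a one-unit increase in the predictor
odds_ratio = odds(3.0) / odds(2.0)
print(odds_ratio, math.exp(b1))
```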
((QUESTION)) Large values of the log-likelihood statistic indicate:
((OPTION_A)) That there are a greater number of explained vs. unexplained observations.
((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Logistic regression assumes a:
((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.
((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In binary logistic regression:
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_D)) none
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH B
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
This sheet is for 3 Mark questions
1. Which of the following is a characteristic of the best machine learning method?
a) Fast
b) Accuracy
c) Scalable
d) All of the above
Ans: Solution D
2. What are the different algorithm techniques in Machine Learning?
a) Supervised Learning and Semi-
b) Unsupervised Learning and Transduction
c) Both A & B
d) None of the mentioned
Ans: Solution C
3. ______ can be adopted when it's necessary to categorize a large amount of data with a few
complete examples or when there's the need to
a) Supervised
b) Semi-supervised
c) Reinforcement
d) Clusters
Ans: Solution B
4. In reinforcement learning, this feedback is usually called ___.
a) Overfitting
b) Overlearning
c) Reward
d) None of the above
Ans: Solution C
5. In the last decade, many researchers started training bigger and bigger models, built with
several different layers; that's why this approach is called _____.
a) Deep learning
b) Machine learning
c) Reinforcement learning
d) Unsupervised learning
Ans: Solution A
6. What does learning exactly mean?
a) Robots are programed so that they can
b) A set of data is used to discover the
c) Learning is the ability to change
d) It is a set of data is used to discover the
Ans: Solution C
7. When it is necessary to allow the model to develop a generalization ability and avoid a common
problem called ______.
a) Overfitting
b) Overlearning
c) Classification
d) Regression
Ans: Solution A
8. Techniques that involve the usage of both labeled and unlabeled data are called ___.
a) Supervised
b) Semi-supervised
c) Unsupervised
d) None of the above
Ans: Solution B
9. There's a growing interest in pattern recognition and associative memories whose structure and
functioning are similar to what happens in the neocortex. Such an
a) Regression
b) Accuracy
c) Model-free
d) Scalable
Ans: Solution C
10. ______ showed better performance than other approaches, even without a context-based model.
a) Machine learning
b) Deep learning
c) Reinforcement learning
d) Supervised learning
Ans: Solution B
14. What is the function of ‘Supervised Learning’?
a) Classifications, predict time series, annotate strings
b) Speech recognition, regression
c) Both A & B
d) None of the above
Ans: Solution C
15. Common unsupervised applications include
a) Object segmentation
b) Similarity detection
c) Automatic labeling
d) All of the above
Ans: Solution D
16. Reinforcement learning is particularly efficient when ______.
a) the environment is not completely deterministic
b) it's often very dynamic
c) it's impossible to have a precise error measure
d) All of the above
Ans: Solution D
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
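The effect of a one-unit change in x1 in question 15 can be checked by evaluating the equation twice. This is a minimal sketch; the function name `predict` and the sample input values are illustrative assumptions, not part of the question.

```python
def predict(x1: float, x2: float) -> float:
    """Regression equation from question 15: y = 2 + 3*x1 + 4*x2."""
    return 2 + 3 * x1 + 4 * x2


# Raise x1 by one unit while holding x2 constant (values chosen arbitrarily).
change = predict(x1=6, x2=10) - predict(x1=5, x2=10)
print(change)  # 3
```

The change equals the coefficient of x1, whatever fixed value x2 takes.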
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
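The value in question 20 follows from the standard definition R^2 = 1 - SSE/SST. A short sketch of the arithmetic, where the function name is an illustrative assumption:

```python
def coefficient_of_determination(sst: float, sse: float) -> float:
    """Multiple coefficient of determination: R^2 = 1 - SSE/SST."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    return 1.0 - sse / sst


r2 = coefficient_of_determination(sst=200.0, sse=50.0)
print(r2)  # 0.75
```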
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
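Stratification, as described in question 29, can be sketched as a split that samples each class separately. The function below is a hypothetical helper written for illustration (the name, the `test_fraction` parameter and the seed are assumptions), not a reference implementation.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices), sampling each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


labels = ["a"] * 8 + ["b"] * 4
train, test = stratified_split(labels)
print([labels[i] for i in test])
```

With 8 instances of class "a" and 4 of class "b", the test set receives 2 and 1 respectively, so both sets keep the 2:1 class ratio.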
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
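The definition in question 30 (square root of the sample variance divided by the number of instances) can be written out as a small sketch. The sample data below are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sqrt(sample variance / n)."""
    n = len(sample)
    # statistics.variance uses the n - 1 denominator (sample variance).
    return math.sqrt(statistics.variance(sample) / n)


data = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(data)
print(se)
```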
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
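The error measures named in questions 28 and 35 differ only in how each difference is treated before averaging. A minimal sketch, using made-up values for the actual and predicted outputs:

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between computed and desired outputs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between predicted and actual outputs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [3, 5, 2]
predicted = [2, 5, 4]
print(mean_absolute_error(actual, predicted))      # 1.0
print(mean_squared_error(actual, predicted))
print(root_mean_squared_error(actual, predicted))
```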
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
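The stopping rule in question 45 can be illustrated with a one-dimensional K-Means loop that ends when the recomputed means equal those of the previous iteration. This is a simplified sketch; the function name, the data and the initial centers are illustrative assumptions.

```python
def k_means_1d(points, centers):
    """Run K-Means on 1-D data until the cluster means stop changing."""
    centers = list(centers)
    while True:
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)

        # Update step: recompute each mean (keep the old center if a cluster is empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]

        # Terminate when the means are identical to the previous iteration.
        if new_centers == centers:
            return centers, clusters
        centers = new_centers


final_centers, final_clusters = k_means_1d([1, 2, 3, 10, 11, 12], [1.0, 3.0])
print(final_centers)   # [2.0, 11.0]
print(final_clusters)  # [[1, 2, 3], [10, 11, 12]]
```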
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True or False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis that fits the training data exactly, i.e.
overfitting.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove that column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
Unsupervised methods such as PCA need no target variable. LDA, by contrast, is an example of a
supervised dimensionality reduction algorithm.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions then you will lose some information of data
most of the times and you won’t be able to interpret the lower dimension data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data.
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take the values A, B, C, D, E and F, and represents the grade of students from
a college.
Which of the following statements is true in this case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables whose categories have some order. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
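The one-vs-all idea in the question above can be sketched by showing how an n-class label vector is turned into n binary problems, one per class. Fitting the n logistic regression models is omitted here; the helper name and the example labels are illustrative assumptions.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = the rest)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


labels = ["cat", "dog", "bird", "dog"]
targets = one_vs_all_targets(labels)
print(len(targets))    # 3 binary problems for 3 classes
print(targets["dog"])  # [0, 1, 0, 1]
```

Each binary target vector would be used to fit one model, and a new example would be assigned to the class whose model reports the highest probability.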
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model increases the training accuracy because the model has more
information with which to fit the training data. Testing accuracy increases only if the added
features are found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
a prediction h_θ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared error
metric to evaluate the model performance. The remaining options are used in the case of a
classification problem.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute-value penalty, which makes some of the coefficients
exactly zero.
19. Suppose that we have N independent variables (X1, X2, …, XN) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best-fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. Suppose you are given two variables V1 and V2 that have the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider the both of these two statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between two variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it only means that they
have no linear relationship. We can take examples like y=|x| or y=x^2.
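The examples y=|x| and y=x^2 can be checked numerically. The sketch below computes the Pearson coefficient by hand on a small symmetric set of x values chosen for illustration; the function and variable names are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [-3, -2, -1, 0, 1, 2, 3]
r_square = pearson(xs, [x ** 2 for x in xs])   # y = x^2
r_abs = pearson(xs, [abs(x) for x in xs])      # y = |x|
r_line = pearson(xs, [2 * x + 1 for x in xs])  # y = 2x + 1
print(r_square, r_abs, r_line)
```

Both nonlinear cases give a coefficient of (numerically) zero even though y is fully determined by x, while the straight line gives a coefficient of 1.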
22. True- False: Overfitting is more likely when you have huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
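For a single feature plus an intercept, the Normal Equation reduces to a 2x2 linear system that can be solved directly, with no learning rate and no iteration. The sketch below assumes the model Y = A0 + A1*X; the function name and data are illustrative.

```python
def fit_line_normal_equation(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = a0 + a1 * x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]; invert via Cramer's rule.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x values must not all be identical")
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1


a0, a1 = fit_line_normal_equation([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)  # 1.0 2.0
```

With many features the same solve requires inverting a matrix whose size grows with the number of features, which is why the method becomes slow in that case.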
25. What will happen when you apply a very large penalty in the case of Ridge?
A) Some of the coefficients will become absolute zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients become
close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
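The different behaviour of the two penalties can be seen in a simplified special case. The sketch below assumes an orthonormal design (X^T X = I), with ridge minimising 0.5*||y - Xb||^2 + 0.5*lam*||b||^2 and lasso minimising 0.5*||y - Xb||^2 + lam*||b||_1; under those assumptions each coefficient has a closed form in terms of its least-squares value. The function names are illustrative.

```python
import math


def ridge_shrink(beta_ols: float, lam: float) -> float:
    """Ridge under an orthonormal design: scale the OLS coefficient toward zero."""
    return beta_ols / (1.0 + lam)


def lasso_shrink(beta_ols: float, lam: float) -> float:
    """Lasso under an orthonormal design: soft-threshold the OLS coefficient."""
    return math.copysign(max(abs(beta_ols) - lam, 0.0), beta_ols)


for lam in (0.1, 1.0, 10.0):
    print(lam, ridge_shrink(0.5, lam), lasso_shrink(0.5, lam))
```

Ridge only rescales the coefficient, so it stays nonzero for any finite penalty, whereas lasso sets it to exactly zero once the penalty exceeds its magnitude.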
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.
44. Which of the following evaluation metrics cannot be applied to logistic regression
output to compare with the target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since logistic regression is a classification algorithm, its target is a class label rather than a
real value, so mean squared error is not used for evaluating it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select the logistic regression model that has the lowest AIC.
Solution: A
In lasso we apply an absolute-value penalty; as the penalty is increased, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation the P (y =1|x; w) , viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49. In the above question, which function would make p lie between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.
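The odds calculation for the fair coin can be written out directly. A minimal sketch, with an illustrative function name:

```python
def odds(p: float) -> float:
    """Odds = probability of success / probability of failure."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


print(odds(0.5))  # 1.0
```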
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
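The odds and logit definitions used in the two questions above can be checked numerically. A minimal
sketch; the function names are chosen here for illustration.

```python
import math

def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds; defined for 0 < p < 1."""
    return math.log(odds(p))

print(odds(0.5))  # fair coin: 0.5 / 0.5

# Probabilities near 0 give large negative logits, probabilities near 1 large positive ones.
for p in (0.001, 0.5, 0.999):
    print(p, odds(p), logit(p))
```

The odds stay positive while the logit takes both signs, which matches the ranges (0, ∞) and (-∞, ∞)
given in the explanation.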
A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution: A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
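That the logistic function is the inverse of the logit can be verified with a round trip in both
directions. A minimal sketch using only the standard library:

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds: maps a probability in (0, 1) to a real number."""
    return math.log(p / (1.0 - p))

# logistic(logit(p)) returns p, and logit(logistic(z)) returns z, up to rounding.
for p in (0.1, 0.5, 0.9):
    print(p, logistic(logit(p)))
for z in (-3.0, 0.0, 2.5):
    print(z, logit(logistic(z)))
```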
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the data. Testing accuracy, however, increases only if the added
features are found to be significant.
56. Choose which of the following options is true regarding the One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fit, where the probability
of each category is predicted against the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the following statement(s) is true about the β0 and β1 values of the two logistic models
(Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the above figures shows a decision boundary that is overfitting the training
data?
A) A
B) B
C) C
D) None of these
Solution: C
Since the decision boundary in figure C is not smooth, it is overfitting the data.
1. The training error in the first plot is the maximum compared to the second and third plots.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
polynomial (right graph) might have very high accuracy on the training population but is expected to
fail badly on the test dataset. The left graph has the maximum training error because it underfits the
training data.
60. Suppose the above decision boundaries were generated for different values of regularization.
Which of the above decision boundaries shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty, which means a less complex decision boundary, as shown in
the first figure (A).
Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic Regression will take a very long time to train.
61. What would you do if you want to train logistic regression on the same data in less time while
getting comparatively similar accuracy (it may not be the same)?
Solution: D
If you decrease the number of iterations while training, it will surely take less time, but it will not
give the same accuracy. To get similar (though not exactly the same) accuracy, you need to increase the
learning rate.
62. Which of the following images shows the cost function for y = 1?
The following is the loss function in logistic regression (Y-axis: loss function; X-axis: log
probability) for a two-class classification problem.
Solution: A
A is the correct answer, as the loss function decreases as the log probability increases.
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high misclassification penalty, the soft margin effectively ceases to exist, as there is no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
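Because training cost grows at least quadratically with the number of samples, large datasets are
poorly suited to SVMs. Datasets that have a clear classification boundary work best with SVMs.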
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
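The role of gamma can be illustrated with the RBF kernel itself, K(x, z) = exp(-gamma * ||x - z||^2).
The sketch below is an assumption-laden illustration rather than an SVM fit: it only shows how quickly
the similarity between two points (for example, a training point and a support vector) decays with
distance for different gamma values. The points are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF (Gaussian) kernel between two points given as equal-length tuples."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

origin = (0.0, 0.0)
near = (1.0, 0.0)   # hypothetical nearby point
far = (3.0, 0.0)    # hypothetical distant point

for gamma in (0.01, 1.0, 10.0):
    print(gamma, rbf_kernel(origin, near, gamma), rbf_kernel(origin, far, gamma))
```

With a high gamma the kernel value for the distant point is practically zero, so only nearby points
have any influence; with a low gamma even distant points remain influential.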
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error-prone, which means that
you should not trust any specific data point too much. Now suppose you want to build an SVM model
that has a quadratic kernel function (polynomial of degree 2) and uses the slack variable C as one of
its hyperparameters. Based on that, answer the following question.
What would happen when you use a very large value of C (C -> infinity)?
Note: For small C, the model was also classifying all data points correctly.
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next
time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM’s hyperparameters so that the
effect is the same as in the previous questions, i.e. the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here: a larger C penalizes misclassified
points more heavily (weaker regularization), so the model fits the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on
the data using the One-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose the classes are equally distributed in the data. Say that training once in the one-vs-all
setting takes the SVM 10 seconds. How many seconds would it take to train the one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
29. Suppose your problem has changed and the data now has only 2 classes. How many times do we need
to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
Suppose you are using an SVM with a polynomial kernel of degree 2. You applied it to the data and
found that it fits the data perfectly, i.e. training and testing accuracy are both 100%.
30. Now suppose you increase the complexity (the degree of the polynomial of this kernel). What do
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question after increasing the complexity you found that training accuracy was still
100%. According to you what is the reason behind that?
1. Since data is fixed and we are fitting more polynomial term or parameters so the algorithm starts
memorizing everything in the data
2. Since data is fixed and SVM doesn’t need to search in big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered as improving the base
learners’ results.
9. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random
Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees as
possible in model building. Random Forest is a black-box model, so you lose interpretability after
using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want as many trees as
possible in model building. Random Forest is a black-box model, so you lose interpretability after
using it.
16. True-False: The bagging is suitable for high variance low bias models?
A) TRUE
B) FALSE
Solution: A
The bagging is suitable for high variance low bias models or you can say for complex models.
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenario a gain ratio is preferred over Information Gain?
Solution: A
For high-cardinality problems, gain ratio is preferred over the Information Gain technique.
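The reason can be sketched on a toy dataset. Information gain rewards a feature with one distinct value
per row (such as an ID column) because every split is pure, while gain ratio divides the gain by the
split information and so penalizes the many-valued feature. The labels and the two features below are
hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain divided by the split information (entropy of the feature itself)."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info else 0.0

labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))  # high cardinality: one distinct value per row
weather = ["sun", "sun", "rain", "rain", "sun", "rain", "sun", "sun"]

for name, feature in (("row_id", row_id), ("weather", weather)):
    print(name, information_gain(feature, labels), gain_ratio(feature, labels))
```

On this data, information gain ranks row_id first, whereas gain ratio ranks weather first.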
20. Suppose you are given the following scenarios of training and validation error for Gradient
Boosting. Which of the following hyperparameter settings would you choose in such a case?
Scenario  Depth  Training Error  Validation Error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select 2 because a lower depth is the
better hyperparameter choice.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn to inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
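The effect of a one-unit change in x1 can be checked directly from the model y = 2 + 3x1 + 4x2. The
starting values of x1 and x2 below are arbitrary; any values give the same difference.

```python
def predict(x1, x2):
    """The multiple regression model from the question: y = 2 + 3*x1 + 4*x2."""
    return 2 + 3 * x1 + 4 * x2

x1, x2 = 5, 7  # arbitrary starting point

# Increase x1 by one unit while holding x2 constant.
change = predict(x1 + 1, x2) - predict(x1, x2)
print(change)  # 3, the coefficient of x1
```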
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
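The multiple coefficient of determination follows from R^2 = 1 - SSE/SST (equivalently SSR/SST, with
SSR = SST - SSE). A worked computation with the values from the question:

```python
sst = 200.0  # total sum of squares
sse = 50.0   # sum of squares due to error

ssr = sst - sse                # sum of squares due to regression
r_squared = 1.0 - sse / sst    # same value as ssr / sst

print(ssr)        # 150.0
print(r_squared)  # 0.75
```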
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
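The definition can be sketched with the standard library, where statistics.variance computes the sample
variance (n - 1 denominator). The sample values are hypothetical.

```python
import math
import statistics

sample = [4.0, 8.0, 6.0, 5.0, 7.0]  # hypothetical sample

sample_variance = statistics.variance(sample)               # sample variance
standard_error = math.sqrt(sample_variance / len(sample))   # sqrt(variance / n)

print(sample_variance)
print(standard_error)
```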
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
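The error measures named in the options can be computed side by side. A minimal sketch on hypothetical
actual and predicted values:

```python
import math

actual = [3.0, 5.0, 2.0, 7.0]      # hypothetical target values
predicted = [2.0, 5.0, 4.0, 8.0]   # hypothetical model outputs

errors = [p - a for p, a in zip(predicted, actual)]

mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
mse = sum(e * e for e in errors) / len(errors)    # mean squared error
rmse = math.sqrt(mse)                             # root mean squared error

print(mae, mse, rmse)
```

Mean absolute error averages the positive (absolute) differences, mean squared error averages the
squared differences, and root mean squared error is the square root of the latter.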
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
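The termination rule described in the question can be sketched with a one-dimensional K-Means loop that
stops as soon as the recomputed means equal the means from the previous iteration. The data points and
the initial means are hypothetical, and the function name is chosen here for illustration.

```python
def k_means_1d(points, means):
    """K-Means on 1-D data; returns the final means and the number of iterations run."""
    iterations = 0
    while True:
        iterations += 1
        # Assignment step: put each point in the cluster of its nearest mean.
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[nearest].append(p)
        # Update step: recompute each mean (an empty cluster keeps its old mean).
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        # Termination: the means are identical to those of the previous iteration.
        if new_means == means:
            return means, iterations
        means = new_means

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_means, iterations = k_means_1d(points, [1.0, 2.0])
print(final_means, iterations)
```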
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis that fits the training data exactly,
i.e. overfitting.
3. Which of the following techniques would perform better for reducing the dimensions of a
dataset?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove that column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
A target variable is not required in general (PCA, for example, is unsupervised); LDA is an example
of a supervised dimensionality reduction algorithm.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions, you will lose some information from the data
most of the time, and you won’t be able to interpret the lower-dimensional data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from
a college.
1) Which of the following statement is true in following case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables that have some order in their categories. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more information with which to fit the data. Testing accuracy, however, increases only if the
added features are found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
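The effect of a large penalty can be sketched with the closed-form ridge solution for the simplest
case, a single feature and no intercept, where the slope is sum(x*y) / (sum(x*x) + penalty). This
simplified setting and the data are assumptions made for the illustration.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form ridge slope for a one-feature model without an intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # hypothetical data generated by y = 2x

for penalty in (0.0, 10.0, 1000.0):
    print(penalty, ridge_slope(xs, ys, penalty))
```

With no penalty the true slope of 2 is recovered; as the penalty grows the slope is pulled toward zero,
so the fitted model systematically underestimates the relationship, i.e. the bias is high.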
11. Suppose you have trained a logistic regression classifier, and on a new example x it outputs
a prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared error metric
to evaluate the model performance. The remaining options are used in case of a
classification problem.
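As a small illustration of the metric named in this answer, the sketch below computes the mean squared error for a handful of made-up predictions. The function name and the sample values are assumptions chosen for the example, not part of the original question.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical continuous targets and predictions.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
mse = mean_squared_error(y_true, y_pred)  # (1 + 0 + 4) / 3
```

A perfect fit gives an error of exactly zero, which is the situation described in the root-mean-square training error question later in this unit.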
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute-value (L1) penalty, which makes some of the
coefficients exactly zero.
19. Suppose that we have N independent variables (X1, X2, …, XN) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best-fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. Suppose you are given two variables V1 and V2 that have the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1; we need to
consider both statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2 (with x symmetric about 0).
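The point can be checked numerically. The sketch below is a plain-Python Pearson correlation applied to y = x^2 and y = |x| on x values that are symmetric about zero; the sample values are an assumption chosen for the illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [-3, -2, -1, 0, 1, 2, 3]                   # symmetric about 0
r_square = pearson(xs, [x ** 2 for x in xs])    # y fully determined by x
r_abs = pearson(xs, [abs(x) for x in xs])       # y fully determined by x
r_line = pearson(xs, [2 * x + 1 for x in xs])   # a genuinely linear relation
```

Both nonlinear relations give a correlation of (numerically) zero even though y is a deterministic function of x, while the linear relation gives a value of 1.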
22. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficients of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
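A minimal sketch of the Normal Equation, θ = (XᵀX)⁻¹ Xᵀy, for a line with one feature plus an intercept is shown below. The function name and data are assumptions made for the illustration. There is no learning rate and no iteration, but the matrix inversion is what becomes expensive when the number of features is very large.

```python
def normal_equation_line(xs, ys):
    """Solve (X^T X) theta = X^T y for y = intercept + slope * x.

    X has a column of ones and a column of x values, so X^T X is 2x2:
        [[n,      sum(x)  ],
         [sum(x), sum(x^2)]]
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular (all x values are identical)")
    # Multiply the explicit 2x2 inverse by X^T y.
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = normal_equation_line([0, 1, 2, 3], [1, 3, 5, 7])
```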
25. What will happen when you apply very large penalty?
A) Some of the coefficient will become absolute zero
B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become zero, but in case of Ridge the coefficients become
close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
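The contrast between the two penalties in the last two answers can be seen in the simplest possible setting: a single feature with no intercept, where both problems have a closed-form solution. The sketch below is an illustration under that simplifying assumption, with made-up data and hypothetical function names; it is not how a general multi-feature solver works.

```python
import math


def ridge_coef(xs, ys, penalty):
    """Minimizer of 0.5 * sum((y - w*x)^2) + 0.5 * penalty * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def lasso_coef(xs, ys, penalty):
    """Minimizer of 0.5 * sum((y - w*x)^2) + penalty * |w| (soft thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - penalty, 0.0)
    return math.copysign(shrunk, sxy) / sxx


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]            # unpenalized solution is w = 2
w_ridge = ridge_coef(xs, ys, penalty=1000.0)   # small but nonzero
w_lasso = lasso_coef(xs, ys, penalty=1000.0)   # exactly zero
```

With a very large penalty the ridge coefficient shrinks toward zero without reaching it, whereas the lasso coefficient is set exactly to zero once the penalty exceeds |Σxy|.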
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
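A quick numerical check of this sensitivity, using made-up data: the sketch below fits a least-squares line to five points on y = 2x and then to the same points with the last y value replaced by an outlier.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [1, 2, 3, 4, 5]
slope_clean, _ = fit_line(xs, [2, 4, 6, 8, 10])     # points on y = 2x
slope_outlier, _ = fit_line(xs, [2, 4, 6, 8, 40])   # last point is an outlier
```

A single outlier moves the slope from 2 to 8 here, because squared errors give large residuals a disproportionate influence on the fit.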
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression uses maximum likelihood estimation to train the model.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output is not a continuous real value, so
mean squared error is not used for evaluating it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select the logistic regression model which has the least AIC.
Solution: A
In case of lasso we apply an absolute penalty; after increasing the penalty in lasso, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider the following model for logistic regression: P(y=1|x, w) = g(w0 + w1x)
where g(z) is the logistic function.
In the above equation the P (y =1|x; w) , viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
49. In the above question, which function would make p lie between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.
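The definition above translates directly into a one-line function. The function name is an assumption made for the illustration.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


fair_coin_odds = odds(0.5)   # success and failure equally likely
biased_odds = odds(0.75)     # success three times as likely as failure
```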
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
A) Linear Regression error values have to be normally distributed, but in case of Logistic Regression
this is not the case
B) Logistic Regression error values have to be normally distributed, but in case of Linear Regression
this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
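The relationship between the two functions can be verified numerically. The sketch below defines the logit as the log of the odds and the logistic function as 1 / (1 + e^(-z)), then checks that applying one after the other returns the original probability.

```python
import math


def logit(p):
    """Log of the odds; defined for 0 < p < 1, takes values in (-inf, inf)."""
    return math.log(p / (1 - p))


def logistic(z):
    """Logistic (sigmoid) function; maps any real z into (0, 1)."""
    return 1 / (1 + math.exp(-z))


p = 0.2
round_trip = logistic(logit(p))   # recovers p, so logistic is the inverse of logit
low, mid, high = logit(0.001), logit(0.5), logit(0.999)
```

The values `low`, `mid` and `high` show the logit moving from large negative through zero to large positive as p goes from near 0 to near 1.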
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the logistic regression. Testing accuracy, however, increases only if
the added features are found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fit, where the
probability of each category is predicted against the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the following above figure shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
Since in the third figure (C) the decision boundary is not smooth, it is overfitting the data.
1. The training error in the first plot is the maximum compared to the second and third plots.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
(right graph) polynomial might have very high accuracy on the training population but is expected to
fail badly on the test dataset. The left graph, however, has the maximum training error because it
underfits the training data.
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
Since more regularization means a larger penalty and hence a less complex decision boundary, the
answer is the first figure, A.
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data so that it takes less
time and gives comparatively similar accuracy (it may not be the same)?
Solution: D
Decreasing the number of iterations will surely reduce the training time, but it will not give the
same accuracy. To get similar (though not identical) accuracy, you also need to increase the learning rate.
62. Which of the following image is showing the cost function for y =1.
Following is the loss function in logistic regression(Y-axis loss function and x axis log probability) for
two class classification problem.
Solution: A
A is the true answer as loss function decreases as the log probability increases
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
Datasets which have a clear classification boundary will function best with SVM’s.
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error prone, which means that
you should not trust any specific data point too much. Now suppose you want to build an SVM model
which has a quadratic kernel function of polynomial degree 2 and uses the slack variable C as one of
its hyperparameters. Based upon that, answer the following question.
What would happen when you use a very large value of C (C->infinity)?
Note: For small C, the model was also classifying all data points correctly
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary and, after training, you correctly
infer that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM’s hyperparameters so that the
effect would be the same as in the previous questions, i.e. the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here, as it weakens the regularization and
lets the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on
the data using the One-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
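The count follows from how One-vs-all builds its training problems: one binary target vector per class, each marking that class against the rest. The sketch below shows this with hypothetical labels; the helper name and the label values are assumptions made for the illustration.

```python
def one_vs_rest_targets(labels):
    """Build one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}


# Hypothetical labels for a 4-class problem.
labels = ["a", "b", "c", "d", "a", "c"]
targets = one_vs_rest_targets(labels)

n_models = len(targets)          # one binary classifier per class
total_seconds = n_models * 10    # if each binary fit takes 10 seconds
```

Each entry of `targets` would be handed to a separate binary SVM, which is why four classes require four training runs and, at 10 seconds per run, 40 seconds end to end.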
28. Suppose you have same distribution of classes in the data. Now, say for training 1 time in one vs all
setting the SVM is taking 10 second. How many seconds would it require to train one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
29. Suppose your problem has changed now and the data has only 2 classes. How many times do you
think we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
Suppose you are using an SVM with a polynomial kernel of degree 2. Now suppose you have applied it
to data and found that it fits the data perfectly, i.e. training and testing accuracy are both 100%.
30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question after increasing the complexity you found that training accuracy was still
100%. According to you what is the reason behind that?
1. Since data is fixed and we are fitting more polynomial term or parameters so the algorithm starts
memorizing everything in the data
2. Since data is fixed and SVM doesn’t need to search in big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered as improving the base
learners' results.
9. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a
fraction of the features for building the individual trees.
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger
number of trees in model building if possible. Random Forest is a black box model, so you will lose
interpretability after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a
fraction of the features for building the individual trees.
13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger
number of trees in model building if possible. Random Forest is a black box model, so you will lose
interpretability after using it.
16. True-False: Bagging is suitable for high-variance, low-bias models?
A) TRUE
B) FALSE
Solution: A
Bagging is suitable for high-variance, low-bias models, or in other words for complex models.
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenarios is gain ratio preferred over Information Gain?
Solution: A
For high-cardinality problems, gain ratio is preferred over the Information Gain technique.
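The reason can be illustrated with a small, made-up dataset. The sketch below computes information gain and gain ratio (information gain divided by the split information, as in C4.5) for a unique row identifier and for a two-valued feature. All names and data are assumptions chosen for the example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on every feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the split itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0
    return information_gain(feature, labels) / split_info


labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))   # high cardinality: one distinct value per row
weather = ["sun", "sun", "rain", "rain", "sun", "rain", "sun", "sun"]

ig_id, ig_weather = information_gain(row_id, labels), information_gain(weather, labels)
gr_id, gr_weather = gain_ratio(row_id, labels), gain_ratio(weather, labels)
```

Information gain ranks the useless row identifier highest because every one-row group is pure, whereas gain ratio penalizes its large split information and ranks the two-valued feature higher.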
20. Suppose you are given the following scenarios of training and validation error for Gradient
Boosting. Which of the following hyperparameters would you choose in such a case?
Scenario  Depth  Training Error  Validation Error
1         2      100             110
2         4      90              105
3         6      50              100
4         8      45              105
5         10     30              150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select 2 because its lower depth makes
it the better hyperparameter.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
((MARKS)) 1
((CORRECT_CHOICE)) A
((MARKS)) 1
((QUESTION)) Suppose you have the following data with one real-value input variable
& one real-value output variable. What is leave-one out cross validation
mean square error in case of linear regression (Y = bX+c)?
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((CORRECT_CHOICE)) D
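The procedure asked about in the leave-one-out question above can be sketched as follows. The data figure for that question did not survive in this copy, so the three points used here are an assumption chosen purely for illustration, and the helper names are hypothetical. Exact fractions are used so the result can be compared with the fractional options.

```python
from fractions import Fraction


def fit_line(points):
    """Least-squares slope and intercept for (x, y) pairs with distinct x values."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_mse(points):
    """Hold out each point in turn, fit on the rest, average the squared errors."""
    total = Fraction(0)
    for i, (x_held, y_held) in enumerate(points):
        train = points[:i] + points[i + 1:]
        slope, intercept = fit_line(train)
        total += (y_held - (slope * x_held + intercept)) ** 2
    return total / len(points)


# Assumed illustrative data (not taken from the source): (0, 2), (2, 2), (3, 1).
points = [(Fraction(0), Fraction(2)), (Fraction(2), Fraction(2)), (Fraction(3), Fraction(1))]
error = loocv_mse(points)
```

For these assumed points the three held-out squared errors are 4, 4/9 and 1, so the leave-one-out mean square error is 49/27.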
((MARKS)) 1
((QUESTION)) Which of the following is/are true about “Maximum Likelihood estimate (MLE)”?
1. MLE may not always exist
2. MLE always exists
3. If MLE exist, it (they) may not be unique
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((CORRECT_CHOICE)) C
((MARKS)) 1
((QUESTION)) Let’s say, a “Linear regression” model perfectly fits the training data
(train error is zero). Now, Which of the following statement is true?
((CORRECT_CHOICE)) C
((MARKS)) 1
((CORRECT_CHOICE)) A
((MARKS)) 1
((CORRECT_CHOICE)) A
((MARKS)) 1
((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
((CORRECT_CHOICE)) A
((MARKS)) 1
((QUESTION)) Which of the following assumptions do we make while deriving linear regression param
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation.
((OPTION_A)) 1,2&3
((OPTION_B)) 1&3
((CORRECT_CHOICE)) C
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((OPTION_B)) Bar chart
((OPTION_C)) Histograms
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 & 2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following offsets do we use in a least-squares line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we have generated the data with the help of a polynomial regression of degree 3 (a degree-3 polynomial will fit this data perfectly). Now consider the points below and choose the option based on these points.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. A polynomial of degree 3 will have low bias and high variance
4. A polynomial of degree 3 will have low bias and low variance
((OPTION_A)) Only 1
((OPTION_B)) 1 & 3
((OPTION_C)) 1 & 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
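The bias half of the comparison above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the cubic generating function, noise level, and seed are hypothetical values chosen only for illustration. It compares the training error of a straight line and a degree-3 polynomial on cubic data. It says nothing about variance, which would require refitting on resampled data.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the run is repeatable
x = np.linspace(-2.0, 2.0, 50)
# Hypothetical cubic ground truth plus small Gaussian noise (illustrative values).
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(0.0, 0.1, size=x.shape)

def train_mse(degree):
    """Fit a polynomial of the given degree by least squares; return training MSE."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.mean(residuals**2))

mse_linear = train_mse(1)   # straight line: too simple for cubic data
mse_cubic = train_mse(3)    # matches the degree used to generate the data
print(mse_linear, mse_cubic)
```

The straight line leaves a systematic error that more data would not remove, which is what "high bias" refers to in statement 1.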
((QUESTION)) Suppose you are training a linear regression model. Now consider these points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) are correct?
((OPTION_A)) Both are False
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we fit “Lasso Regression” to a data set that has 100 features (X1, X2, …, X100). Now, we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with the same regularization parameter.
Now, which of the following options will be correct?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following statement(s) can be true after adding a variable to a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 4
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
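The behaviour of the two statistics when a variable is added can be tried out directly. This is a minimal sketch, assuming NumPy and an ordinary least-squares fit with an intercept; the helper name `r2_and_adjusted` and the simulated data are hypothetical. It uses the usual definition, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), with n observations and p predictors.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with intercept on predictors X."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    ss_res = float(np.sum((y - A @ beta) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
y = 3.0 * x1 + rng.normal(size=n)     # hypothetical data: y depends on x1 only
noise = rng.normal(size=n)            # an irrelevant extra predictor

r2_a, adj_a = r2_and_adjusted(x1.reshape(-1, 1), y)
r2_b, adj_b = r2_and_adjusted(np.column_stack([x1, noise]), y)
print("one predictor :", r2_a, adj_a)
print("two predictors:", r2_b, adj_b)
```

On the training data, R² cannot go down when a predictor is added to an OLS fit, while adjusted R² may move in either direction because of the penalty on p.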
((QUESTION)) Which of the following metrics can be used for evaluating regression models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE
((OPTION_A)) 2 and 4
((OPTION_B)) 1 and 2
((OPTION_C)) 2, 3 and 4
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
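For reference, several of the metrics listed above can be computed in a few lines. This is a minimal sketch in plain Python; the function name `regression_metrics` and the sample values are hypothetical and used only for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - (mse * n) / ss_tot                     # 1 - SS_res / SS_tot
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}

# Illustrative values only.
m = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(m)
```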
((QUESTION)) We can also compute the coefficients of linear regression with the help of an analytical method called “Normal Equation”. Which of the following is/are true about “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate
((OPTION_A)) 1 and 2
((OPTION_B)) 1 & 3
((OPTION_C)) 2 & 3
((OPTION_D)) 1, 2 & 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
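The three properties in the question above can be seen in a short sketch of the normal equation, θ = (XᵀX)⁻¹Xᵀy. This assumes NumPy; the helper name `normal_equation` and the toy data are hypothetical. There is no learning rate and no loop, and the cost of solving the linear system grows quickly with the number of features.

```python
import numpy as np

def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y; X must already contain an intercept column."""
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])   # columns: intercept, x
y = 1.0 + 2.0 * x                           # noiseless line, for illustration
theta = normal_equation(X, y)
print(theta)
```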
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the regression line is defined as:
Y = β0 + β1 X1 + β2 X2 + … + βn Xn
Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a positive or negative number).
2. The value of βi is always the same, regardless of the values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1
((OPTION_B)) 2
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
((QUESTION)) The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I want to find the sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
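The graphs for this question are not reproduced here, but a general property of least-squares lines is relevant: when the line is fitted by ordinary least squares with an intercept, the residuals sum to zero up to rounding. This is a minimal sketch in plain Python on hypothetical data; `fit_line` is an illustrative helper, not part of the question.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Illustrative data only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
total = sum(residuals)
print(total)   # very close to 0, whatever the data
```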
((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
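A small numerical case illustrates the point of the question above: two variables can have a high Pearson correlation while their relationship is not linear. This is a minimal sketch in plain Python; the data (y = x² on positive integers) and the helper name `pearson` are chosen for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 2 for x in xs]      # exactly quadratic, not linear
r = pearson(xs, ys)
print(r)                       # high, but below 1
```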
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y. Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider the remaining parameters to be the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.
((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the graph show the residuals for each predicted value. Use this information to compute the SSE.
((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following methods do we use to best fit the data in Logistic Regression?
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) One of the very good methods to analyze the performance of Logistic Regression is AIC, which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) LASSO
((OPTION_B)) Ridge
((OPTION_C)) Both
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
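The odds of an event with probability p are p / (1 − p). The following minimal sketch in plain Python evaluates this for a fair coin; the function name `odds` is an illustrative choice.

```python
def odds(p):
    """Odds in favour of an event with probability p (0 <= p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

fair = odds(0.5)     # fair coin: heads and tails equally likely
print(fair)          # 1.0, i.e. odds of 1 to 1
```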
((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could be the range of the logit function in the domain x = [0,1]?
((OPTION_A)) (−∞, ∞)
((OPTION_B)) (0, 1)
((OPTION_C)) (0, ∞)
((OPTION_D)) (−∞, 0)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
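The range of the logit can be explored numerically. This is a minimal sketch in plain Python using l(p) = log(p / (1 − p)); the sample probabilities are arbitrary. The function is undefined at exactly p = 0 and p = 1, where it tends to −∞ and +∞ respectively.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Values near 0 give large negative results, values near 1 large positive ones.
values = [logit(p) for p in (0.001, 0.5, 0.999)]
print(values)
```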
((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not the case
((OPTION_B)) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not the case
((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be normally distributed
((OPTION_D)) Both Linear Regression and Logistic Regression error values do not have to be normally distributed
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 2
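The relationship between the three functions named in the note above can be checked numerically. This is a minimal sketch in plain Python, assuming the standard definitions Logistic(x) = 1 / (1 + e^(−x)) and Logit(p) = log(p / (1 − p)); applying the logit to the logistic output recovers x, so the logistic function acts as the inverse of the logit.

```python
import math

def logistic(x):
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log of the odds, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Round-trip error of logit(logistic(x)) for a few arbitrary inputs.
checks = [abs(logit(logistic(x)) - x) for x in (-3.0, -0.5, 0.0, 2.0)]
print(max(checks))   # tiny: only floating-point rounding remains
```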
((QUESTION)) Suppose you applied a Logistic Regression model on a given data set and got a training accuracy X and testing accuracy Y. Now, you want to add a few new features to the same data. Select the option(s) which is/are correct in such a case.
Note: Consider the remaining parameters to be the same.
((OPTION_E))
((CORRECT_CHOICE)) A&D
((EXPLANATION))
((MARKS)) 1
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data in less time while getting comparatively similar accuracy (it may not be the same)?
((OPTION_A)) Decrease the learning rate and decrease the number of iterations
((OPTION_B)) Decrease the learning rate and increase the number of iterations
((OPTION_C)) Increase the learning rate and increase the number of iterations
((OPTION_D)) Increase the learning rate and decrease the number of iterations
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 2
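The trade-off between learning rate and iteration count can be tried on a toy problem. This is a minimal sketch in plain Python, assuming batch gradient descent on the mean log-loss of a one-feature logistic model; the data set, the two settings, and the helper names are hypothetical. A larger step with fewer iterations reaches a similar loss here, although a learning rate that is too large can make gradient descent diverge.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr, iters):
    """Batch gradient descent on mean log-loss; returns (weight, bias)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iters):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def log_loss(xs, ys, w, b):
    """Mean negative log-likelihood of the labels under the model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)

# Small, non-separable toy data (illustrative only).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

loss_slow = log_loss(xs, ys, *train(xs, ys, lr=0.1, iters=1000))
loss_fast = log_loss(xs, ys, *train(xs, ys, lr=1.0, iters=100))
print(loss_slow, loss_fast)   # ten times fewer iterations, similar loss
```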
((QUESTION)) Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a two-class classification problem.
Which of the following images shows the cost function for y = 1?
Note: Y is the target class
((OPTION_A)) A
((OPTION_B)) B
((OPTION_C)) BOTH
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
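The images for this question are not reproduced here, but the shape of the per-example cost can be computed. This is a minimal sketch in plain Python, assuming the usual log-loss: the cost is −log(p) when y = 1 and −log(1 − p) when y = 0, where p is the predicted probability of class 1. For y = 1 the cost falls toward 0 as p approaches 1 and grows without bound as p approaches 0.

```python
import math

def cost(p, y):
    """Log-loss for one example with predicted probability p of class 1."""
    return -math.log(p) if y == 1 else -math.log(1.0 - p)

# Cost for the target class y = 1 at a few arbitrary predicted probabilities.
costs_y1 = [cost(p, 1) for p in (0.1, 0.5, 0.9)]
print(costs_y1)   # decreasing as p moves toward 1
```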
((OPTION_C)) Predict any categorical variable from several other categorical variables.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The odds ratio is
((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.
((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
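In a logistic regression model the odds are exp(β0 + β1·x), so the ratio of the odds after a one-unit increase in the predictor to the original odds is exp(β1), whatever the starting value of x. This is a minimal sketch in plain Python; the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def model_odds(b0, b1, x):
    """Odds implied by a one-predictor logistic model: exp(b0 + b1 * x)."""
    return math.exp(b0 + b1 * x)

b0, b1 = -1.0, 0.7   # hypothetical coefficients

# Odds after a unit change in x divided by the original odds, at several x.
ratios = [model_odds(b0, b1, x + 1.0) / model_odds(b0, b1, x)
          for x in (-2.0, 0.0, 3.5)]
print(ratios, math.exp(b1))   # every ratio matches exp(b1)
```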
((QUESTION)) Large values of the log-likelihood statistic indicate:
((OPTION_A)) That there are a greater number of explained vs. unexplained observations.
((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Logistic regression assumes a:
((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.
((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In binary logistic regression:
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO
((OPTION_A))
THIS IS
MANDATORY
OPTION
((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CH
OICE)) Either A
or B or C or D or
E
((EXPLANATION
)) This is also
optional
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
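The effect of a single coefficient in question 15 can be checked numerically. The sketch below evaluates the model y = 2 + 3x1 + 4x2 from the question; the specific input values (10, 11, 5, 6) are arbitrary and chosen only for illustration.

```python
def predict(x1, x2):
    """Evaluate the example model y = 2 + 3*x1 + 4*x2 from question 15."""
    return 2 + 3 * x1 + 4 * x2


# Hold x2 fixed and raise x1 by one unit (input values are arbitrary).
change_from_x1 = predict(x1=11, x2=5) - predict(x1=10, x2=5)

# Hold x1 fixed and raise x2 by one unit.
change_from_x2 = predict(x1=10, x2=6) - predict(x1=10, x2=5)

print(change_from_x1)  # 3
print(change_from_x2)  # 4
```

Each coefficient is the change in y per unit change in its own variable while the other variable is held constant.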
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
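The arithmetic behind question 20 can be written out as a short sketch. It assumes the usual definition of the multiple coefficient of determination, R^2 = 1 - SSE/SST; the function name r_squared is illustrative only.

```python
def r_squared(sst, sse):
    """Multiple coefficient of determination: R^2 = 1 - SSE / SST."""
    return 1 - sse / sst


# Values from question 20: SST = 200 and SSE = 50.
r2 = r_squared(sst=200, sse=50)
print(r2)  # 0.75
```

Equivalently, SSR = SST - SSE = 150, and R^2 = SSR / SST = 150 / 200 = 0.75.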
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
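The definition in question 30 can be sketched as follows. The sample values are made up for illustration; statistics.variance from the Python standard library computes the sample variance (dividing by n - 1).

```python
import math
import statistics


def standard_error(sample):
    """Square root of (sample variance / number of sample instances)."""
    return math.sqrt(statistics.variance(sample) / len(sample))


# Illustrative sample only; not taken from the question bank.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(sample)
print(round(se, 4))
```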
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
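The error measures named in questions 28 and 35 differ only in how each difference is treated before averaging. A minimal sketch, using made-up actual and predicted values:

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute (positive) differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


# Illustrative values only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)       # 1.0
mse = mean_squared_error(actual, predicted)        # 1.5
rmse = root_mean_squared_error(actual, predicted)  # sqrt(1.5)
print(mae, mse, rmse)
```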
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
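The termination rule described in question 45 can be illustrated with a small one-dimensional K-Means sketch. The data, the starting centers and the function name k_means_1d are assumptions made for this example; real implementations usually work on multi-dimensional data and use a tolerance rather than exact equality.

```python
def k_means_1d(values, centers, max_iter=100):
    """K-Means on 1-D data; stops when the cluster means stop changing."""
    for iteration in range(1, max_iter + 1):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: recompute each mean (keep the old center if a cluster is empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # Terminate when the means are identical to the previous iteration's means.
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centers, iterations = k_means_1d(values, centers=[1.0, 2.0])
print(final_centers, iterations)  # [2.0, 11.0] 3
```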
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train on.
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it is easier to find a hypothesis that fits the training data exactly, i.e.
overfitting.
3. Which of the following techniques would perform better for reducing the dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove that column.
4. It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
Unsupervised methods such as PCA need no target variable. LDA, by contrast, is an example of a
supervised dimensionality reduction algorithm, which does use one.
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
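The idea of projecting data onto principal components can be sketched in a reduced setting: 2-D points projected onto their first principal component. This is only an illustration with made-up points. It relies on the closed-form angle of the largest-variance axis of a 2x2 covariance matrix, whereas practical work normally uses a library PCA and keeps two components for plotting.

```python
import math


def first_principal_component(points):
    """Return (mean, unit direction) of the first principal component of 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Angle of the largest-variance axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    return (mean_x, mean_y), (math.cos(theta), math.sin(theta))


def project(points, mean, direction):
    """Coordinates of the centered points along the given unit direction."""
    return [
        (p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]
        for p in points
    ]


# Illustrative points lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
mean, direction = first_principal_component(points)
scores = project(points, mean, direction)
print(direction, scores)
```

The list scores is a one-dimensional representation of the original two-dimensional data.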
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions, you will usually lose some of the information in the
data, and you won’t be able to interpret the lower-dimensional features.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from
a college.
1) Which of the following statements is true in this case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables that have some order in their categories. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose you applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more information with which to fit the logistic regression. Testing accuracy increases only if the
added feature is found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large, the model is less complex, and therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier, and for a new example x it outputs
the prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives a continuous output, we use the mean squared
error metric to evaluate the model performance. The remaining options are used in the case of a
classification problem.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute-value penalty, which makes some of the coefficients
zero.
19. Suppose that we have N independent variables (X1, X2, … Xn) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. Suppose you are given two variables V1 and V2 with the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider both statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
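The examples y=|x| and y=x^2 can be checked directly. The sketch below computes the Pearson correlation by hand on a small symmetric set of x values chosen for illustration; the linear case y = 2x + 1 is added only as a contrast.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Symmetric x values (illustrative); y is fully determined by x in every case.
xs = [-3, -2, -1, 0, 1, 2, 3]
r_square = pearson(xs, [x ** 2 for x in xs])    # approximately 0
r_abs = pearson(xs, [abs(x) for x in xs])       # approximately 0
r_line = pearson(xs, [2 * x + 1 for x in xs])   # approximately 1

print(r_square, r_abs, r_line)
```

In the first two cases y depends entirely on x, yet the coefficient is about zero because the dependence is not linear.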
22. True-False: Overfitting is more likely when you have a huge amount of data to train on.
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
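For a model with one feature, Y = A0 + A1X, the Normal Equation (X^T X) theta = X^T y reduces to a 2x2 linear system that can be solved directly, with no learning rate and no iteration. The sketch below assumes this single-feature case and uses made-up data; with many features, the cost of solving the system grows quickly, which is the point of statement 2.

```python
def normal_equation_simple(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = a0 + a1 * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sum_x], [sum_x, sum_xx]] and X^T y = [sum_y, sum_xy].
    det = n * sum_xx - sum_x * sum_x
    a0 = (sum_xx * sum_y - sum_x * sum_xy) / det
    a1 = (n * sum_xy - sum_x * sum_y) / det
    return a0, a1


# Illustrative data generated from y = 1 + 2x, so the fit is exact.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a0, a1 = normal_equation_simple(xs, ys)
print(a0, a1)  # 1.0 2.0
```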
25. What will happen when you apply very large penalty?
A) Some of the coefficient will become absolute zero
B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients
become close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
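The difference between the two penalties can be seen in the simplest possible setting: a model with a single weight w and no intercept. The sketch below uses the closed-form minimizers for that case. The objective scaling (a factor of 0.5 on the squared error), the data and the function names are assumptions made for illustration.

```python
def ridge_coefficient(xs, ys, penalty):
    """Minimise 0.5*sum((y - w*x)^2) + 0.5*penalty*w^2 over a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def lasso_coefficient(xs, ys, penalty):
    """Minimise 0.5*sum((y - w*x)^2) + penalty*abs(w) over a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Soft-thresholding: shrink towards zero and clip at exactly zero.
    if sxy > penalty:
        return (sxy - penalty) / sxx
    if sxy < -penalty:
        return (sxy + penalty) / sxx
    return 0.0


# Illustrative data from y = 2x; the unpenalised weight is 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

ridge_large = ridge_coefficient(xs, ys, penalty=100.0)  # small but non-zero
lasso_large = lasso_coefficient(xs, ys, penalty=100.0)  # exactly 0.0
print(ridge_large, lasso_large)
```

With a large penalty the ridge weight shrinks towards zero without reaching it, while the lasso weight is set exactly to zero.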
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement the logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since logistic regression is a classification algorithm, its output is not a continuous real value, so
mean squared error is not used to evaluate it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select the logistic regression model that has the lowest AIC.
Solution: A
In lasso we apply an absolute-value penalty; as the penalty is increased, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.
In the above equation, P(y = 1 | x; w) is viewed as a function of x that we can obtain by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For real values of x in the range −∞ to +∞, the logistic function gives an output in the
interval (0, 1).
49. In the above question, which function would keep p between (0, 1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.
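The definition can be checked with a minimal sketch. The helper name odds is hypothetical and is
used here only for illustration:

```python
def odds(p: float) -> float:
    """Return the odds p / (1 - p) for a success probability p with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


fair_coin = odds(0.5)  # success and failure are equally likely
print(fair_coin)       # 1.0
```

Odds below 1 mean failure is more likely than success, and odds above 1 mean the opposite.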
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
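As a rough numerical illustration of this range, the sketch below evaluates the log of odds for
probabilities close to 0 and close to 1. The helper name logit is an assumption made for this
example, and the endpoints 0 and 1 themselves are excluded because the log of odds is not finite there:

```python
import math


def logit(p: float) -> float:
    """Log of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Probabilities near 0 give large negative values, near 1 large positive values.
for p in (1e-9, 0.1, 0.5, 0.9, 1 - 1e-9):
    print(p, logit(p))
```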
A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
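A minimal round-trip check of this relationship, assuming the usual definitions logit(p) = log(p / (1 - p))
and logistic(z) = 1 / (1 + e^(-z)); the function names are chosen for this sketch:

```python
import math


def logit(p: float) -> float:
    """Log of odds, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic(z: float) -> float:
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))


# Applying logistic after logit recovers the original probability,
# which is what "Logistic(x) = Logit_inv(x)" means.
for p in (0.01, 0.3, 0.5, 0.99):
    print(p, logistic(logit(p)))
```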
Suppose you are given two scatter plots “a” and “b” for two classes (blue for the positive and red for
the negative class). In scatter plot “a”, you correctly classified all data points using logistic regression
(the black line is the decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
data to consider when fitting the logistic regression. Testing accuracy, however, increases only if
the feature is found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fit, where the probability
of each category is predicted against the rest of the categories combined.
57. Below are two different logistic models with different values for β0 and β1.
Which of the following statement(s) is true about the β0 and β1 values of the two logistic models
(Green, Black)?
Solution: B
Context 58-60
Below are three scatter plots (A, B, C from left to right) with hand-drawn decision boundaries for
logistic regression.
58. Which of the above figures shows a decision boundary that is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
In the third figure (C), the decision boundary is not smooth, which means it is overfitting the data.
1. The training error in the first plot is the maximum compared with the second and third plots.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
polynomial (right graph) might have very high accuracy on the training population but is expected to
fail badly on the test dataset. In the left graph, the training error is the maximum because the model
underfits the training data.
60. Suppose the above decision boundaries were generated for different values of regularization.
Which of the above decision boundaries shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty, which means a less complex decision boundary, as shown in
the first figure, A.
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data so that it takes less time
and gives comparatively similar accuracy (may not be the same)?
Solution: D
If you decrease the number of iterations while training, it will surely take less time but will not give
the same accuracy. To get similar (though not exactly the same) accuracy, you need to increase the
learning rate.
62. Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability)
for a two-class classification problem.
Which of the following images shows the cost function for y = 1?
Solution: A
A is the correct answer, as the loss function decreases as the log probability increases.
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No. Logistic regression only forms a linear decision surface, but the examples in the figure are not
linearly separable.
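The figure is not reproduced here, so the sketch below assumes an XOR-style labelling of the four
binary points, which is the classic case that no linear boundary can separate. It searches a small
grid of candidate weights (the grid and the function name are choices made for this illustration)
and compares the result with an AND labelling, which is linearly separable:

```python
from itertools import product

# Assumed labellings on binary inputs (X1, X2); the XOR one stands in for the missing figure.
xor_data = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
and_data = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}


def linearly_separable(data, grid):
    """Return True if some (w0, w1, w2) on the grid classifies every point correctly."""
    for w0, w1, w2 in product(grid, repeat=3):
        if all((w0 + w1 * x1 + w2 * x2 > 0) == bool(y) for (x1, x2), y in data.items()):
            return True
    return False


grid = [i / 2 for i in range(-8, 9)]  # candidate weights from -4.0 to 4.0 in steps of 0.5
print(linearly_separable(and_data, grid))  # True
print(linearly_separable(xor_data, grid))  # False
```

The grid search is only an illustration; the XOR result also follows algebraically, because the four
inequalities a separating line would have to satisfy contradict each other.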
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, the rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high misclassification penalty, a soft margin cannot exist, as there is no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
Datasets which have a clear classification boundary will function best with SVM’s.
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error-prone, which means that
you should not trust any specific data point too much. Now suppose you want to build an SVM model
with a quadratic kernel function (polynomial degree 2) that uses the slack variable C as one of its
hyperparameters. Based on that, answer the following question.
What would happen when you use a very large value of C (C -> infinity)?
Note: For small C, the model was also classifying all data points correctly.
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you correctly
infer that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer to the previous question. What do you think is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
Better model will lower the bias and increase the variance
25. In the above question, suppose you want to change one of the SVM’s hyperparameters so that the effect
would be the same as in the previous questions, i.e. the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here, as it weakens the regularization and
lets the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose you have the same distribution of classes in the data. Now, say that training once in the
one-vs-all setting takes the SVM 10 seconds. How many seconds would it require to train the one-vs-all
method end to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
29. Suppose your problem has changed and the data now has only 2 classes. How
many times do we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
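The arithmetic behind questions 27 to 29 can be sketched as follows. The function name is
hypothetical, and the sketch assumes every binary fit takes the same time (10 seconds, as stated in
question 28):

```python
def one_vs_all_cost(n_classes: int, seconds_per_fit: float):
    """Return (number of binary classifiers, total training time) for one-vs-all."""
    # With only 2 classes a single binary classifier is enough.
    n_models = 1 if n_classes == 2 else n_classes
    return n_models, n_models * seconds_per_fit


print(one_vs_all_cost(4, 10))  # (4, 40)
print(one_vs_all_cost(2, 10))  # (1, 10)
```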
Suppose you are using an SVM with a linear kernel of polynomial degree 2. Now suppose that you have
applied it to data and found that it fits the data perfectly, meaning that training and testing accuracy
are both 100%.
30. Now suppose that you increase the complexity (or the degree of the polynomial of this kernel). What do
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question after increasing the complexity you found that training accuracy was still
100%. According to you what is the reason behind that?
1. Since data is fixed and we are fitting more polynomial term or parameters so the algorithm starts
memorizing everything in the data
2. Since data is fixed and SVM doesn’t need to search in big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered as improving the base
learners’ results.
9. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want
a larger number of trees in model building if possible. Random Forest is a black-box model, so you will
lose interpretability after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
13. Which of the following algorithms doesn’t use learning rate as one of its hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want
a larger number of trees in model building if possible. Random Forest is a black-box model, so you will
lose interpretability after using it.
16. True-False: The bagging is suitable for high variance low bias models?
A) TRUE
B) FALSE
Solution: A
The bagging is suitable for high variance low bias models or you can say for complex models.
17. To apply bagging to regression trees which of the following is/are true in such case?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenario a gain ratio is preferred over Information Gain?
Solution: A
For high-cardinality problems, gain ratio is preferred over the Information Gain technique.
20. Suppose you are given the following scenarios for training and validation error for Gradient
Boosting. Which of the following hyperparameters would you choose in such a case?
1 2 100 110
2 4 90 105
3 6 50 100
4 8 45 105
5 10 30 150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select 2 because the lower depth is the
better hyperparameter.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
MCQ-ML
Solution: (B)
Ordinal variables are variables whose categories have some order. For
example, grade A should be considered a higher grade than grade B.
A) PCA
B) K-Means
Solution: (A)
A deterministic algorithm is one whose output does not change on
different runs. PCA would give the same result if we run it again, but k-means would not.
3) [True or False] A Pearson correlation between two variables is zero, but
their values can still be related to each other.
A) TRUE
B) FALSE
Solution: (A)
Consider Y = X² with X distributed symmetrically about zero. They are not only associated, but one is
a function of the other, and the Pearson correlation between them is 0.
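A small numerical check of this example, using a hand-written Pearson helper (the function name and
the sample values are assumptions made for this sketch). X is chosen symmetric about zero, which is
what makes the covariance with X squared vanish:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


xs = [-3, -2, -1, 0, 1, 2, 3]  # symmetric about zero
ys = [x ** 2 for x in xs]      # Y is fully determined by X
print(pearson(xs, ys))         # 0.0
```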
4) Which of the following statement(s) is / are true for Gradient Descent (GD) and
Stochastic Gradient Descent (SGD)?
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (A)
In SGD, each iteration uses a batch that generally contains
a random sample of the data, whereas in GD each iteration uses all of the training
observations.
5) Which of the following hyper parameter(s), when increased may cause random
forest to over fit the data?
1. Number of Trees
2. Depth of Tree
3. Learning Rate
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (B)
Usually, increasing the depth of the trees will cause overfitting. Learning
rate is not a hyperparameter in random forest. Increasing the number of trees does not cause
overfitting.
6) Imagine, you are working with “Analytics Vidhya” and you want to develop a
machine learning algorithm which predicts the number of views on the articles.
Your analysis is based on features like author name, number of articles written by
the same author on Analytics Vidhya in past and a few other features. Which of
the following evaluation metric would you choose in that case?
A) Only 1
B) Only 2
C) Only 3
D) 1 and 3
E) 2 and 3
F) 1 and 2
Solution:(A)
You can think of the number of article views as a continuous target variable, which
falls under regression problems. So, mean squared error will be used as the evaluation
metric.
7) Given below are three images (1,2,3). Which of the following option is correct
for these images?
A)
B)
C)
A) 1 is tanh, 2 is ReLU and 3 is SIGMOID activation functions.
Solution: (D)
8) Below are the 8 actual values of target variable in the train file.
[0,0,0,1,1,1,1,1]
So the answer is A.
9) Let’s say, you are working with categorical feature(s) and you have not looked
at the distribution of the categorical variable in the test data.
You want to apply one-hot encoding (OHE) on the categorical feature(s). What
challenges may you face if you have applied OHE on a categorical variable of the train
dataset?
A) All categories of categorical variable are not present in the test dataset.
D) Both A and B
E) None of these
Solution: (D)
Both are true. OHE will fail to encode categories that are present
in test but not in train, so this could be one of the main challenges while applying OHE. The
challenge given in option B is also true: you need to be more careful while applying OHE if
the frequency distribution isn’t the same in train and test.
10) Skip gram model is one of the best models used in Word2vec algorithm for
words embedding. Which one of the following models depict the skip gram
model?
A) A
B) B
C) Both A and B
D) None of these
Solution: (B)
Both models (model1 and model2) are used in the Word2vec algorithm. Model1
represents the CBOW model, whereas model2 represents the skip-gram model.
11) Let’s say you are using activation function X in the hidden layers of a neural
network. At a particular neuron, for a given input, you get the output “-0.0001”.
Which of the following activation functions could X represent?
A) ReLU
B) tanh
C) SIGMOID
D) None of these
Solution: (B)
The function is tanh because its output range is (-1, 1).
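A minimal sketch comparing the three candidate activations on one small negative input. The helper
names relu and sigmoid and the input value are choices made for this illustration:

```python
import math


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


z = -0.0001
print(relu(z))       # 0.0, ReLU never returns a negative value
print(sigmoid(z))    # slightly below 0.5, sigmoid stays inside (0, 1)
print(math.tanh(z))  # roughly -0.0001, tanh covers (-1, 1)
```

Of the three, only tanh can produce a negative output such as -0.0001.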
12) [True or False] LogLoss evaluation metric can have negative values.
A) TRUE
B) FALSE
13) Which of the following statements is/are true about “Type-1” and “Type-2”
errors?
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 1 and 3
F) 2 and 3
Solution: (E)
In statistical hypothesis testing, a type I error is the incorrect rejection of a true null
hypothesis (a “false positive”), while a type II error is incorrectly retaining a false null
hypothesis (a “false negative”).
14) Which of the following is/are one of the important step(s) to pre-process the
text in NLP based projects?
1. Stemming
2. Stop word removal
3. Object Standardization
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Stemming is a rudimentary rule-based process of stripping the suffixes (“ing”, “ly”, “es”,
“s” etc) from a word.
Stop words are words that are not relevant to the context of the data, for
example is/am/are.
Object Standardization is also a good way to pre-process the text.
15) Suppose you want to project high dimensional data into lower dimensions.
The two most famous dimensionality reduction algorithms used here are PCA and
t-SNE. Let’s say you have applied both algorithms respectively on data “X” and
you got the datasets “X_projected_PCA” , “X_projected_tSNE”.
Solution: (B)
t-SNE algorithm considers nearest neighbour points to reduce the dimensionality of the
data. So, after using t-SNE we can think that reduced dimensions will also have
interpretation in nearest neighbour space. But in the case of PCA it is not the case.
Context: 16-17
Given below are three scatter plots for two features (Image 1, 2 & 3 from left to
right).
16) In the above images, which of the following is/are examples of multi-collinear
features?
A) Features in Image 1
B) Features in Image 2
C) Features in Image 3
Solution: (D)
In Image 1, the features have a high positive correlation, whereas in Image 2 the features have a high
negative correlation, so in both images the pairs of features are
examples of multicollinear features.
A) Only 1
B)Only 2
C) Only 3
D) Either 1 or 3
E) Either 2 or 3
Solution: (E)
You cannot remove both features, because after removing both you will
lose all of the information. So you should either remove only one feature or use
a regularization algorithm such as L1 or L2.
18) Adding a non-important feature to a linear regression model may result in.
1. Increase in R-square
2. Decrease in R-square
A) Only 1 is correct
B) Only 2 is correct
C) Either 1 or 2
D) None of these
Solution: (A)
After adding a feature to the feature space, whether that feature is important or unimportant,
R-squared increases (or, at worst, stays the same).
19) Suppose, you are given three variables X, Y and Z. The Pearson correlation
coefficients for (X, Y), (Y, Z) and (X, Z) are C1, C2 & C3 respectively.
Now, you have added 2 to all values of X (i.e. new values become X+2), subtracted 2
from all values of Y (i.e. new values are Y-2) and Z remains the same. The new
coefficients for (X,Y), (Y,Z) and (X,Z) are given by D1, D2 & D3 respectively. How do
the values of D1, D2 & D3 relate to C1, C2 & C3?
E) D1 = C1, D2 = C2, D3 = C3
F) Cannot be determined
Solution: (E)
Correlation between the features won’t change if you add a constant to or subtract
a constant from the values of a feature.
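The shift invariance can be checked numerically with a hand-written Pearson helper. The function
name and the sample values for X, Y and Z are made up for this sketch:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


X = [1, 2, 3, 4, 5]
Y = [2, 1, 4, 3, 5]
Z = [5, 3, 4, 1, 2]

X_new = [x + 2 for x in X]  # add 2 to every value of X
Y_new = [y - 2 for y in Y]  # subtract 2 from every value of Y

C = (pearson(X, Y), pearson(Y, Z), pearson(X, Z))
D = (pearson(X_new, Y_new), pearson(Y_new, Z), pearson(X_new, Z))
print(C)
print(D)  # matches C up to floating-point rounding
```

Shifting a variable moves its mean by the same amount, so every deviation from the mean, and
therefore the correlation, is unchanged.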
20) Imagine you are solving a classification problem with highly imbalanced
classes. The majority class is observed 99% of the time in the training data.
Your model has 99% accuracy after taking the predictions on test data. Which of
the following is true in such a case?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
21) In ensemble learning, you aggregate the predictions for weak learners, so that
an ensemble of these models will give a better prediction than prediction of
individual models.
Which of the following statements is / are true for weak learners used in ensemble
model?
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) Only 1
E) Only 2
Solution: (A)
Weak learners are sure about particular part of a problem. So, they usually don’t overfit
which means that weak learners have low variance and high bias.
22) Which of the following options is/are true for K-fold cross-validation?
1. Increase in K will result in higher time required to cross validate the result.
2. Higher values of K will result in higher confidence on the cross-validation
result as compared to lower value of K.
3. If K=N, then it is called Leave one out cross validation, where N is the
number of observations.
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3
Solution: (D)
Larger k value means less bias towards overestimating the true expected error (as
training folds will be closer to the total dataset) and higher running time (as you are
getting closer to the limit case: Leave-One-Out CV). We also need to consider the
variance between the k folds accuracy while selecting the k.
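To make the K = N case concrete, here is a minimal sketch of a K-fold index generator. The function
name and the contiguous, unshuffled folds are simplifying assumptions for illustration:

```python
def k_fold_indices(n: int, k: int):
    """Yield (train_idx, test_idx) for k contiguous folds over n observations."""
    # Spread any remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


splits = list(k_fold_indices(n=6, k=6))  # K = N, i.e. leave-one-out
print(len(splits), [len(test) for _, test in splits])  # 6 [1, 1, 1, 1, 1, 1]
```

Each of the K iterations trains a fresh model, which is why the running time grows with K.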
Time taken by an algorithm for training (on a model with max_depth 2) 4-fold is 10
seconds and for the prediction on remaining 1-fold is 2 seconds.
23) Which of the following option is true for overall execution time for 5-fold cross
validation with 10 different values of “max_depth”?
D) Can’t estimate
Solution: (D)
Each iteration for depth “2” in 5-fold cross-validation will take 10 seconds for training and 2
seconds for testing. So, 5 folds will take 12*5 = 60 seconds. Since we are searching over
10 depth values, the algorithm would take 60*10 = 600 seconds. But training and
testing a model with depth greater than 2 will take more time than depth “2”, so the overall
time would be greater than 600 seconds.
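The same arithmetic as a short sketch (the variable names are chosen for this example; the timings
come from the question):

```python
train_s, predict_s = 10, 2  # per-fold timings at max_depth = 2
folds, n_depths = 5, 10

per_depth = folds * (train_s + predict_s)  # 5 * 12 = 60 seconds
lower_bound = per_depth * n_depths         # 60 * 10 = 600 seconds
print(per_depth, lower_bound)              # 60 600
```

The 600 seconds is only a lower bound, since deeper trees take longer to train and evaluate than
depth 2.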
24) In previous question, if you train the same algorithm for tuning 2 hyper
parameters say “max_depth” and “learning_rate”.
You want to select the right value against “max_depth” (from given 10 depth
values) and learning rate (from given 5 different learning rates). In such cases,
which of the following will represent the overall time?
A) 1000-1500 seconds
B) 1500-3000 seconds
D) None of these
25) Given below is a scenario for training error TE and Validation error VE for a
machine learning algorithm M1. You want to choose a hyperparameter (H) based
on TE and VE.
H TE VE
1 105 90
2 200 85
3 250 96
4 105 85
5 300 100
Which value of H will you choose based on the above table?
A) 1
B) 2
C) 3
D) 4
E) 5
26) What would you do in PCA to get the same projection as SVD?
C) Not possible
D) None of these
Solution: (A)
When the data has a zero mean vector, PCA will have the same projections as SVD;
otherwise you have to centre the data first before taking the SVD.
Assume there is a black box algorithm, which takes training data with multiple
observations (t1, t2, t3,…….. tn) and a new observation (q1). The black box
outputs the nearest neighbor of q1 (say ti) and its corresponding class label ci.
You can also think that this black box algorithm is same as 1-NN (1-nearest
neighbor).
A) TRUE
B) FALSE
Solution: (A)
In the first step, you pass an observation (q1) to the black box algorithm, which
returns the nearest observation and its class.
In the second step, you throw that nearest observation out of the training data and input
the observation (q1) again. The black box algorithm will again return the nearest
observation and its class. Repeating this procedure k times yields the k nearest neighbors.
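The procedure described in this solution can be sketched in code. The function names `one_nn` and `k_nn_via_one_nn` and the toy data are assumptions made for illustration; the black box is modelled here as a plain Euclidean nearest-neighbour lookup.

```python
import math

def one_nn(train, query):
    """Black box: return the (point, label) pair closest to the query."""
    return min(train, key=lambda item: math.dist(item[0], query))

def k_nn_via_one_nn(train, query, k):
    """Find the k nearest neighbours by calling the 1-NN black box k times."""
    remaining = list(train)  # copy so the caller's data is untouched
    neighbours = []
    for _ in range(min(k, len(remaining))):
        nearest = one_nn(remaining, query)
        neighbours.append(nearest)
        remaining.remove(nearest)  # throw out the neighbour just found
    return neighbours

# Hypothetical training data: ((x, y), class label).
train = [((0.0, 0.0), "a"), ((1.0, 1.0), "b"), ((5.0, 5.0), "a"), ((2.0, 2.0), "b")]
result = k_nn_via_one_nn(train, (0.9, 0.9), 3)
print(result)
```

Each call removes the neighbour it just returned, so the next call returns the next-closest observation.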
28) Instead of using 1-NN black box we want to use the j-NN (j>1) algorithm as
black box. Which of the following option is correct for finding k-NN using j-NN?
A) 1
B) 2
C) 3
29) Suppose you are given 7 Scatter plots 1-7 (left to right) and you want to
compare Pearson correlation coefficients between variables of each scatterplot.
1. 1<2<3<4
2. 1>2>3 > 4
3. 7<6<5<4
4. 7>6>5>4
A) 1 and 3
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: (B)
From image 1 to 4 the correlation is decreasing (in absolute value). From image 4 to 7
the correlation is increasing in absolute value, but the values are negative (for example, 0, -0.3, -0.7, -0.99).
30) You can evaluate the performance of a binary class classification problem
using different metrics such as accuracy, log-loss, F-Score. Let’s say, you are
using the log-loss function as evaluation metric.
1. If a classifier is confident about an incorrect classification, then log-loss will
penalise it heavily.
2. If, for a particular observation, the classifier assigns a very small probability to the
correct class, then the corresponding contribution to the log-loss will be very large.
3. The lower the log-loss, the better the model.
A) 1 and 3
B) 2 and 3
C) 1 and 2
D) 1,2 and 3
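The behaviour of log-loss referred to in the statements above can be illustrated numerically. This is a minimal sketch: `log_loss_contribution` is a hypothetical helper, and the probabilities are made-up values.

```python
import math

def log_loss_contribution(p_correct):
    """Log-loss contribution of one observation, given the probability
    the classifier assigned to the correct class."""
    return -math.log(p_correct)

confident_right = log_loss_contribution(0.99)  # confident and correct
unsure = log_loss_contribution(0.5)            # undecided
confident_wrong = log_loss_contribution(0.01)  # confident and incorrect

print(confident_right, unsure, confident_wrong)
```

The contribution grows without bound as the probability given to the correct class approaches 0, so a confident mistake costs far more than an undecided prediction.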
Note: Visual distance between the points in the image represents the actual
distance.
31) Which of the following is the leave-one-out cross-validation accuracy for 3-NN
(3-nearest neighbor)?
A) 0
B) 0.4
C) 0.8
D) 1
Solution: (C)
In leave-one-out cross validation, we select (n-1) observations for training and 1
observation for validation. Treat each point in turn as the validation point and find
the 3 points nearest to it. Repeating this procedure for all points gives the correct
classification for every positive-class point in the above figure, but the negative-class
points are misclassified. Hence you get 80% accuracy.
32) Which of the following value of K will have least leave-one-out cross validation
accuracy?
A) 1NN
B) 3NN
C) 4NN
Solution: (A)
Every point is always misclassified by 1-NN, which means that you get 0% accuracy.
33) Suppose you are given the below data and you want to apply a logistic
regression model for classifying it in two given classes.
Which of the following option is correct when you increase the value of C from zero to a
very large value?
Solution: (B)
By looking at the image, we see that using x2 alone is enough to classify the data
efficiently. So w1 will become 0 first. As the regularization parameter increases further,
w2 will come closer and closer to 0.
34) Suppose we have a dataset which can be trained with 100% accuracy with help
of a decision tree of depth 6. Now consider the points below and choose the
option based on these points.
Note: All other hyper parameters are same and other factors are not affected.
A) Only 1
B) Only 2
C) Both 1 and 2
Solution: (A)
A decision tree of depth 4 fitted to such data is likely to underfit it. In the case of
underfitting you have high bias and low variance.
35) Which of the following options can be used to get global minima in k-Means
Algorithm?
A) 2 and 3
B) 1 and 3
C) 1 and 2
D) All of above
Solution: (D)
All of the options can be tuned to find the global minima.
36) Imagine you are working on a project which is a binary classification problem.
You trained a model on training dataset and get the below confusion matrix on
validation dataset.
Based on the above confusion matrix, choose which option(s) below will give you
correct predictions?
1. Accuracy is ~0.91
2. Misclassification rate is ~ 0.91
3. False positive rate is ~0.95
4. True positive rate is ~0.95
A) 1 and 3
B) 2 and 4
C) 1 and 4
D) 2 and 3
Solution: (C)
The true positive rate is the fraction of actual positives that you predict correctly, so
the true positive rate would be 100/105 = 0.95. It is also known as “Sensitivity” or “Recall”.
37) For which of the following hyperparameters, higher value is better for decision
tree algorithm?
A)1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
E) Can’t say
Solution: (E)
For all three options, increasing the value of the parameter does not necessarily
improve performance. For example, with a very high tree depth the resulting tree may
overfit the data and not generalize well. On the other hand, with a very low value the
tree may underfit the data. So we can’t say for sure that “higher is better”.
Context 38-39
Imagine, you have a 28 * 28 image and you run a 3 * 3 convolution neural network
on it with the input depth of 3 and output depth of 8.
38) What is the dimension of output feature map when you are using the given
parameters.
39) What is the dimensions of output feature map when you are using following
parameters.
40) Suppose, we were plotting the visualization for different values of C (Penalty
parameter) in SVM algorithm. Due to some reason, we forgot to tag the C values
with visualizations. In that case, which of the following option best explains the C
values for the images below (1,2,3 left to right, so C values are C1 for image1, C2
for image2 and C3 for image3 ) in case of rbf kernel.
A) C1 = C2 = C3
B) C1 > C2 > C3
C) C1 < C2 < C3
D) None of these
Solution: (C)
C is the penalty parameter of the error term. It controls the trade-off between a smooth
decision boundary and classifying the training points correctly. For large values of C, the
optimization will choose a smaller-margin hyperplane.
1. Classification
2. Clustering
3. Reinforcement Learning
4. Regression
Options:
B. 2 Only
C. 1 and 2
D. 1 and 3
E. 2 and 3
F. 1, 2 and 3
H. 1, 2, 3 and 4
Solution: (E)
Generally, movie recommendation systems cluster the users in a finite number of similar
groups based on their previous activities and profile. Then, at a fundamental level,
people in the same cluster are made similar recommendations.
In some scenarios, this can also be approached as a classification problem for assigning
the most appropriate movie class to the user of a specific group of users. Also, a movie
recommendation system can be viewed as a reinforcement learning problem where it
learns by its previous recommendations and improves the future recommendations.
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 3
E. 1, 2 and 4
F. 1, 2, 3 and 4
Solution: (E)
Sentiment analysis at the fundamental level is the task of classifying the sentiments
represented in an image, text or speech into a set of defined sentiment classes like
happy, sad, excited, positive, negative, etc. It can also be viewed as a regression
problem for assigning a sentiment score of say 1 to 10 for a corresponding image, text or
speech.
A. True
B. False
Solution: (A)
Decision trees can also be used to form clusters in the data, but clustering often generates
natural clusters and is not dependent on any objective function.
Q4. Which of the following is the most appropriate strategy for data cleaning
before performing clustering analysis, given less than desirable number of data
points:
Options:
A. 1 only
B. 2 only
C. 1 and 2
Solution: (A)
Removal of outliers is not recommended if the data points are few in number. In this
scenario, capping and flooring of variables is the most appropriate strategy.
Q5. What is the minimum no. of variables/ features required to perform clustering?
A. 0
B. 1
C. 2
D. 3
Solution: (B)
Q6. Is it expected to get the same clustering results from two runs of K-Means
clustering?
A. Yes
B. No
Solution: (B)
The K-Means clustering algorithm converges on local minima, which might also
correspond to the global minima in some cases but not always. Therefore, it’s advised to
run the K-Means algorithm multiple times before drawing inferences about the clusters.
However, note that it’s possible to get the same clustering results from K-Means by
setting the same seed value for each run. That works simply by making the
algorithm choose the same set of random numbers for each run.
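The effect of fixing the seed can be sketched with a tiny one-dimensional K-Means. The function `k_means_1d`, its parameters and the data points are assumptions made for illustration, not part of any particular library.

```python
import random

def k_means_1d(points, k, seed, iterations=20):
    """Tiny 1-D K-Means with initial centroids drawn at random from the data."""
    rng = random.Random(seed)  # all randomness comes from this seeded generator
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.0, 9.2, 8.8]
run_a = k_means_1d(points, 3, seed=42)
run_b = k_means_1d(points, 3, seed=42)
print(run_a == run_b)
```

Two runs with the same seed start from the same initial centroids and therefore produce identical results; runs with different seeds may or may not agree.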
A. Yes
B. No
C. Can’t say
D. None of these
Solution: (A)
When the K-Means algorithm has reached the local or global minima, it will not alter the
assignment of data points to clusters for two successive iterations.
Options:
A. 1, 3 and 4
B. 1, 2 and 3
C. 1, 2 and 4
Solution: (D)
All four conditions can be used as possible termination condition in K-Means clustering:
1. This condition limits the runtime of the clustering algorithm, but in some cases the
quality of the clustering will be poor because of an insufficient number of
iterations.
2. Except for cases with a bad local minimum, this produces a good clustering, but
runtimes may be unacceptably long.
3. This also ensures that the algorithm has converged at the minima.
4. Terminate when RSS falls below a threshold. This criterion ensures that the
clustering is of a desired quality after termination. Practically, it’s a good practice
to combine it with a bound on the number of iterations to guarantee termination.
Q9. Which of the following clustering algorithms suffers from the problem of
convergence at local optima?
Options:
A. 1 only
B. 2 and 3
C. 2 and 4
D. 1 and 3
E. 1,2 and 4
Solution: (D)
Out of the options given, only the K-Means clustering algorithm and the EM clustering
algorithm have the drawback of converging at local minima.
Solution: (A)
Out of all the options, K-Means clustering algorithm is most sensitive to outliers as it
uses the mean of cluster data points to find the cluster center.
Q11. After performing K-Means Clustering analysis on a dataset, you observed the
following dendrogram. Which of the following conclusion can be drawn from the
dendrogram?
D. The above dendrogram interpretation is not possible for K-Means clustering analysis
Solution: (D)
A dendrogram is not possible for K-Means clustering analysis. However, one can create
a clustergram based on K-Means clustering analysis.
Options:
A. 1 only
B. 1 and 2
C. 1 and 4
D. 3 only
E. 2 and 4
Solution: (F)
Creating an input feature for cluster ids as ordinal variable or creating an input feature
for cluster centroids as a continuous variable might not convey any relevant information
to the regression model for multidimensional data. But for clustering in a single
dimension, all of the given methods are expected to convey meaningful information to
the regression model. For example, to cluster people in two groups based on their hair
length, storing clustering ID as ordinal variable and cluster centroids as continuous
variables will convey meaningful information.
Q13. What could be the possible reason(s) for producing two different
dendrograms using agglomerative clustering algorithm for the same dataset?
C. No. of variables used
D. B and C only
Solution: (E)
A change in any of the proximity function, the no. of data points or the no. of variables
will lead to different clustering results and hence different dendrograms.
Q14. In the figure below, if you draw a horizontal line at y=2 on the y-axis, what will
be the number of clusters formed?
A. 1
B. 2
C. 3
D. 4
Solution: (B)
Since the number of vertical lines intersecting the red horizontal line at y=2 in the
dendrogram is 2, two clusters will be formed.
Q15. What is the most appropriate no. of clusters for the data points represented
by the following dendrogram:
A. 2
B. 4
C. 6
D. 8
Solution: (B)
The no. of clusters that best depicts the different groups can be chosen by
observing the dendrogram. The best choice of the no. of clusters is the no. of vertical
lines in the dendrogram cut by a horizontal line that can traverse the maximum
distance vertically without intersecting a cluster.
In the above example, the best choice of no. of clusters will be 4, as the red horizontal
line in the dendrogram below covers the maximum vertical distance AB.
Q16. In which of the following cases will K-Means clustering fail to give good
results?
Options:
A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
E. 1, 2, 3 and 4
Solution: (D)
The K-Means clustering algorithm fails to give good results when the data contains outliers,
when the density spread of data points across the data space is different, and when the
data points follow non-convex shapes.
Q17. Which of the following metrics, do we have for finding dissimilarity between
two clusters in hierarchical clustering?
1. Single-link
2. Complete-link
3. Average-link
Options:
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3
Solution: (D)
All of the three methods i.e. single link, complete link and average link can be used for
finding dissimilarity between two clusters in hierarchical clustering.
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of them
Solution: (A)
Clustering analysis is not negatively affected by heteroscedasticity but the results are
negatively impacted by multicollinearity of features/ variables used in clustering as the
correlated feature/ variable will carry extra weight on the distance calculation than
desired.
Which of the following clustering representations and dendrogram depicts the use
of MIN or Single link proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (A)
For the single link or MIN version of hierarchical clustering, the proximity of two clusters
is defined to be the minimum of the distance between any two points in the different
clusters. For instance, from the table, we see that the distance between points 3 and 6 is
0.11, and that is the height at which they are joined into one cluster in the dendrogram.
As another example, the distance between clusters {3, 6} and {2, 5} is given by
dist({3, 6}, {2, 5}) = min(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5))
= min(0.1483, 0.2540, 0.2843, 0.3921) = 0.1483.
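The single link calculation in this solution can be reproduced directly. The sketch below uses only the four pairwise distances quoted in the solution; the dictionary layout and the helper name `single_link` are illustrative choices.

```python
# Pairwise distances quoted in the solution, keyed by (point, point).
dist = {(3, 2): 0.1483, (6, 2): 0.2540, (3, 5): 0.2843, (6, 5): 0.3921}

def single_link(cluster_a, cluster_b):
    """MIN proximity: the smallest distance between points of the two clusters."""
    return min(dist[(a, b)] for a in cluster_a for b in cluster_b)

d = single_link([3, 6], [2, 5])
print(d)
```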
Which of the following clustering representations and dendrogram depicts the use
of MAX or Complete link proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (B)
For the complete link or MAX version of hierarchical clustering, the proximity of two clusters
is defined to be the maximum of the distance between any two points in the different
clusters. Similarly, here points 3 and 6 are merged first. However, {3, 6} is merged with
{4}, instead of {2, 5}. This is because the dist({3, 6}, {4}) = max(dist(3, 4), dist(6, 4)) =
max(0.1513, 0.2216) = 0.2216, which is smaller than dist({3, 6}, {2, 5}) = max(dist(3, 2),
dist(6, 2), dist(3, 5), dist(6, 5)) = max(0.1483, 0.2540, 0.2843, 0.3921) = 0.3921 and
dist({3, 6}, {1}) = max(dist(3, 1), dist(6, 1)) = max(0.2218, 0.2347) = 0.2347.
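The comparison in this solution can be checked with the distances it quotes. This is a small sketch; the dictionary layout and the helper name `complete_link` are illustrative.

```python
# Pairwise distances quoted in the solution, keyed by (point, point).
dist = {
    (3, 4): 0.1513, (6, 4): 0.2216,
    (3, 2): 0.1483, (6, 2): 0.2540, (3, 5): 0.2843, (6, 5): 0.3921,
    (3, 1): 0.2218, (6, 1): 0.2347,
}

def complete_link(cluster_a, cluster_b):
    """MAX proximity: the largest distance between points of the two clusters."""
    return max(dist[(a, b)] for a in cluster_a for b in cluster_b)

candidates = {"{4}": [4], "{2, 5}": [2, 5], "{1}": [1]}
proximities = {name: complete_link([3, 6], members)
               for name, members in candidates.items()}
closest = min(proximities, key=proximities.get)
print(proximities, closest)
```

The cluster with the smallest MAX distance to {3, 6} is the one merged next, which here is {4}.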
Which of the following clustering representations and dendrogram depicts the use
of Group average proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (C)
For the group average version of hierarchical clustering, the proximity of two clusters is
defined to be the average of the pairwise proximities between all pairs of points in the
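different clusters. This is an intermediate approach between MIN and MAX. This is
expressed by the following equation:
$$\mathrm{proximity}(C_i, C_j) = \frac{\sum_{x \in C_i,\; y \in C_j} \mathrm{proximity}(x, y)}{m_i \cdot m_j}$$
where m_i and m_j are the numbers of points in clusters C_i and C_j.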
Here are the distances between some clusters: dist({3, 6, 4}, {1}) = (0.2218 + 0.3688 +
0.2347)/(3 ∗ 1) = 0.2751, dist({2, 5}, {1}) = (0.2357 + 0.3421)/(2 ∗ 1) = 0.2889, and
dist({3, 6, 4}, {2, 5}) = (0.1483 + 0.2843 + 0.2540 + 0.3921 + 0.2042 + 0.2932)/(3 ∗ 2) = 0.2627.
Because dist({3, 6, 4}, {2, 5}) is smaller than dist({3, 6, 4}, {1}) and dist({2, 5}, {1}), these
two clusters are merged at the fourth stage.
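The group average figures can be recomputed from the pairwise distances listed in this solution. In the sketch below, the assignment of each value to a pair of points is inferred from the order of the terms, and the helper name `group_average` is illustrative. With the listed values, the last average evaluates to about 0.2627.

```python
# Pairwise distances listed in the solution, keyed by (point, point).
dist = {
    (3, 1): 0.2218, (4, 1): 0.3688, (6, 1): 0.2347,
    (2, 1): 0.2357, (5, 1): 0.3421,
    (3, 2): 0.1483, (3, 5): 0.2843, (6, 2): 0.2540,
    (6, 5): 0.3921, (4, 2): 0.2042, (4, 5): 0.2932,
}

def group_average(cluster_a, cluster_b):
    """Average of all pairwise distances between the two clusters."""
    total = sum(dist[(a, b)] for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

d_364_1 = group_average([3, 6, 4], [1])
d_25_1 = group_average([2, 5], [1])
d_364_25 = group_average([3, 6, 4], [2, 5])
print(round(d_364_1, 4), round(d_25_1, 4), round(d_364_25, 4))
```

The divisor is the product of the two cluster sizes, so the last distance is divided by 3 ∗ 2 = 6 pairs.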
Which of the following clustering representations and dendrogram depicts the use
of Ward’s method proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (D)
Ward’s method is a centroid method. A centroid method calculates the proximity between
two clusters by calculating the distance between the centroids of the clusters. For Ward’s
method, the proximity between two clusters is defined as the increase in the squared
error that results when the two clusters are merged. When Ward’s method is applied
to the sample data set of six points, the resulting clustering is somewhat different from
those produced by MIN, MAX, and group average.
Q23. What should be the best choice of no. of clusters based on the following
results:
A. 1
B. 2
C. 3
D. 4
Solution: (C)
The silhouette coefficient is a measure of how similar an object is to its own cluster
compared to other clusters. Number of clusters for which silhouette coefficient is highest
represents the best choice of the number of clusters.
Q24. Which of the following is/are valid iterative strategy for treating missing
values before clustering analysis?
Solution: (C)
All of the mentioned techniques are valid for treating missing values before clustering
analysis but only imputation with EM algorithm is iterative in its functioning.
Q25. The K-Means algorithm has some limitations. One of them is that it makes hard
assignments of points to clusters (a point either belongs completely to a cluster or
does not belong to it at all).
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of these
Solution: (C)
Both Gaussian mixture models and Fuzzy K-means allow soft assignments.
Q26. Assume you want to cluster 7 observations into 3 clusters using the K-Means
clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the following
observations:
What will be the cluster centroids if you want to proceed to the second iteration?
D. None of these
Solution: (A)
Q27. Assume you want to cluster 7 observations into 3 clusters using the K-Means
clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the following
observations:
What will be the Manhattan distance of observation (9, 9) from cluster centroid
C1 in the second iteration?
A. 10
B. 5*sqrt(2)
C. 13*sqrt(2)
D. None of these
Solution: (A)
Manhattan distance between centroid C1 i.e. (4, 4) and (9, 9) = (9-4) + (9-4) = 10
Q28. If two variables V1 and V2, are used for clustering. Which of the following are
true for K means clustering with k =3?
Options:
A. 1 only
B. 2 only
C. 1 and 2
Solution: (A)
If the correlation between the variables V1 and V2 is 1, then all the data points will be in
a straight line. Hence, all the three cluster centroids will form a straight line as well.
Q29. Feature scaling is an important step before applying the K-Means algorithm. What
is the reason behind this?
A. In distance calculation it will give the same weights for all features
B. You always get the same clusters, whether or not you use feature scaling
D. None of these
Solution: (A)
Feature scaling ensures that all the features get same weight in the clustering analysis.
Consider a scenario of clustering people based on their weights (in kg) with range 55-110
and heights (in feet) with range 5.6 to 6.4. In this case, the clusters produced
without scaling can be very misleading as the range of weight is much higher than that of
height. Therefore, it’s necessary to bring them to the same scale so that they have equal
weightage on the clustering result.
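A small numeric sketch shows why the unscaled distances are misleading. The three people below are hypothetical, the feature ranges are the ones quoted in the explanation, and min-max scaling is just one possible scaling choice.

```python
import math

# Hypothetical people: (weight in kg, height on the 5.6-6.4 scale from the text).
a = (55.0, 6.4)
b = (60.0, 5.6)    # similar weight to a, opposite end of the height range
c = (110.0, 6.4)   # same height as a, opposite end of the weight range

def min_max(value, low, high):
    """Rescale a value to the range [0, 1]."""
    return (value - low) / (high - low)

def scale(person):
    weight, height = person
    return (min_max(weight, 55.0, 110.0), min_max(height, 5.6, 6.4))

raw_ab = math.dist(a, b)
raw_ac = math.dist(a, c)
scaled_ab = math.dist(scale(a), scale(b))
scaled_ac = math.dist(scale(a), scale(c))
print(raw_ab, raw_ac, scaled_ab, scaled_ac)
```

Before scaling, b looks far closer to a than c does, because the weight axis dominates; after scaling, spanning the full height range counts about as much as spanning the full weight range.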
Q30. Which of the following methods is used for finding the optimal number of clusters
in the K-Means algorithm?
A. Elbow method
B. Manhattan method
C. Euclidean method
E. None of these
Solution: (A)
Out of the given options, only elbow method is used for finding the optimal number of
clusters. The elbow method looks at the percentage of variance explained as a function
of the number of clusters: One should choose a number of clusters so that adding
another cluster doesn’t give much better modeling of the data.
Options:
A. 1 and 3
B. 1 and 2
C. 2 and 3
D. 1, 2 and 3
Solution: (D)
All three of the given statements are true. K-means is extremely sensitive to cluster
center initialization. Also, bad initialization can lead to poor convergence speed as well
as bad overall clustering.
Q32. Which of the following can be applied to get good results for K-means
algorithm corresponding to global minima?
Options:
A. 2 and 3
B. 1 and 3
C. 1 and 2
D. All of above
Solution: (D)
All of these are standard practices that are used in order to obtain good clustering
results.
Q33. What should be the best choice for number of clusters based on the
following results:
A. 5
B. 6
C. 14
D. Greater than 14
Solution: (B)
Based on the above results, the best choice of number of clusters using elbow method is
6.
Q34. What should be the best choice for number of clusters based on the
following results:
A. 2
B. 4
C. 6
D. 8
Solution: (C)
Q35. Which of the following sequences is correct for a K-Means algorithm using
Forgy method of initialization?
Options:
A. 1, 2, 3, 5, 4
B. 1, 3, 2, 4, 5
C. 2, 1, 3, 4, 5
D. None of these
Solution: (A)
The methods used for initialization in K means are Forgy and Random Partition. The
Forgy method randomly chooses k observations from the data set and uses these as the
initial means. The Random Partition method first randomly assigns a cluster to each
observation and then proceeds to the update step, thus computing the initial mean to be
the centroid of the cluster’s randomly assigned points.
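The two initialization methods described in this solution can be sketched as follows. The function names, the sample points and the empty-cluster fallback are assumptions made for illustration.

```python
import random

def forgy_init(points, k, rng):
    """Forgy: pick k observations at random and use them as the initial means."""
    return rng.sample(points, k)

def random_partition_init(points, k, rng):
    """Random Partition: assign each point a random cluster, then use each
    cluster's centroid as its initial mean."""
    labels = [rng.randrange(k) for _ in points]
    means = []
    for cluster in range(k):
        members = [p for p, label in zip(points, labels) if label == cluster]
        if not members:  # guard (an assumption): empty cluster falls back to a random point
            members = [rng.choice(points)]
        dims = len(members[0])
        means.append(tuple(sum(m[d] for m in members) / len(members)
                           for d in range(dims)))
    return means

rng = random.Random(0)
points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
forgy_means = forgy_init(points, 2, rng)
partition_means = random_partition_init(points, 2, rng)
print(forgy_means, partition_means)
```

Forgy means are always actual observations, whereas Random Partition means are averages and usually do not coincide with any single observation.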
Q36. If you are using Multinomial mixture models with the
expectation-maximization algorithm for clustering a set of data points into two clusters,
which of the assumptions are important:
Solution: (C)
In the EM algorithm for clustering, it’s essential to choose the same no. of clusters to
classify the data points into as the no. of different distributions they are expected to be
generated from, and the distributions must also be of the same type.
Q37. Which of the following is/are not true about Centroid based K-Means
clustering algorithm and Distribution based expectation-maximization clustering
algorithm:
3. Both have strong assumptions that the data points must fulfill
4. Both are sensitive to outliers
5. Expectation maximization algorithm is a special case of K-Means
6. Both require prior knowledge of the no. of desired clusters
7. The results produced by both are non-reproducible.
Options:
A. 1 only
B. 5 only
C. 1 and 3
D. 6 and 7
E. 4, 6 and 7
Solution: (B)
All of the above statements are true except the 5th; instead, K-Means is a special case
of the EM algorithm in which only the centroids of the cluster distributions are calculated at
each iteration.
Q38. Which of the following is/are not true about DBSCAN clustering algorithm:
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
E. 1 and 5
F. 1, 3 and 5
Solution: (D)
DBSCAN can form a cluster of any arbitrary shape and does not have strong
assumptions for the distribution of data points in the dataspace.
DBSCAN has a low time complexity of order O(n log n) only.
Q39. Which of the following are the high and low bounds for the existence of
F-Score?
A. [0,1]
B. (0,1)
C. [-1,1]
Solution: (A)
The lowest and highest possible values of the F score are 0 and 1, with 1 representing that
every data point is assigned to the correct cluster and 0 representing that the precision
and/or recall of the clustering analysis are both 0. In clustering analysis, a high value of
the F score is desired.
Q40. Following are the results observed for clustering 6000 data points into 3
clusters: A, B and C:
A. 3
B. 4
C. 5
D. 6
Solution: (D)
1) [True or False] The k-NN algorithm does more computation at test time than at
train time.
A) TRUE
B) FALSE
Solution: A
The training phase of the algorithm consists only of storing the feature vectors and class
labels of the training samples.
In the testing phase, a test point is classified by assigning the label which is most
frequent among the k training samples nearest to that query point, hence the higher
computation.
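The split between a cheap training phase and an expensive testing phase can be seen in a minimal k-NN sketch. The class `KNNClassifier` and the toy data are assumptions made for illustration, not a reference implementation.

```python
import math
from collections import Counter

class KNNClassifier:
    def __init__(self, k):
        self.k = k

    def fit(self, features, labels):
        # "Training" only stores the data; nothing is computed here.
        self.features = list(features)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # All the work happens at test time: one distance per stored sample.
        ranked = sorted(zip(self.features, self.labels),
                        key=lambda pair: math.dist(pair[0], query))
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]

model = KNNClassifier(k=3).fit(
    [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)],
    ["+", "+", "+", "-", "-", "-"],
)
print(model.predict((0.5, 0.5)), model.predict((5.5, 5.5)))
```

Every call to `predict` computes a distance to each stored training sample, which is why the cost grows with the size of the training set.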
2) In the image below, which would be the best value for k assuming that the
algorithm you are using is k-Nearest Neighbor.
A) 3
B) 10
C) 20
D) 50
Solution: B
Validation error is the least when the value of k is 10, so it is best to use this value of k.
A) Manhattan
B) Minkowski
C) Tanimoto
D) Jaccard
E) Mahalanobis
F) All can be used
Solution: F
All of these distance metrics can be used as a distance metric for k-NN.
Solution: C
We can also use k-NN for regression problems. In this case the prediction can be based
on the mean or the median of the k-most similar instances.
1. k-NN performs much better if all of the data have the same scale
2. k-NN works well with a small number of input variables (p), but struggles when
the number of inputs is very large
3. k-NN makes no assumptions about the functional form of the problem being
solved
A) 1 and 2
B) 1 and 3
C) Only 1
D) All of the above
Solution: D
6) Which of the following machine learning algorithm can be used for imputing
missing values of both categorical and continuous variables?
A) K-NN
B) Linear Regression
C) Logistic Regression
Solution: A
k-NN algorithm can be used for imputing missing value of both categorical and
continuous variables.
Solution: A
Manhattan Distance is designed for calculating the distance between real valued
features.
1. Hamming Distance
2. Euclidean Distance
3. Manhattan Distance
A) 1
B) 2
C) 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: A
Both Euclidean and Manhattan distances are used in the case of continuous variables,
whereas Hamming distance is used in the case of categorical variables.
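Hamming distance simply counts the attributes on which two records disagree, which is why it suits categorical variables. The helper below and its sample records are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))

# Hypothetical categorical records: (colour, size, shape).
d = hamming_distance(("red", "small", "round"), ("red", "large", "square"))
print(d)
```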
9) Which of the following will be the Euclidean Distance between the two data points
A(1,3) and B(2,3)?
A) 1
B) 2
C) 4
D) 8
Solution: A
10) Which of the following will be the Manhattan Distance between the two data points
A(1,3) and B(2,3)?
A) 1
B) 2
C) 4
D) 8
Solution: A
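Both answers for the points A(1,3) and B(2,3) can be verified with a short calculation; the variable names are illustrative.

```python
import math

a = (1, 3)
b = (2, 3)

# Euclidean: square root of the summed squared differences.
euclidean = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Manhattan: sum of the absolute differences.
manhattan = sum(abs(p - q) for p, q in zip(a, b))

print(euclidean, manhattan)
```

The two metrics agree here because the points differ along a single coordinate.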
Context: 11-12
Suppose you are given the following data, where x and y are the 2 input variables and
Class is the dependent variable.
11) Suppose you want to predict the class of a new data point x=1 and y=1 using
Euclidean distance in 3-NN. Which class does this data point belong to?
A) + Class
B) – Class
C) Can’t say
D) None of these
Solution: A
All three nearest points are of the + class, so this point will be classified as + class.
12) In the previous question, suppose you now want to use 7-NN instead of 3-NN.
Which class will x=1 and y=1 belong to?
A) + Class
B) – Class
C) Can’t say
Solution: B
Now this point will be classified as – class because 4 – class points and 3 + class
points lie in the nearest circle.
Context 13-14:
Suppose you are given the following 2-class data, where “+” represents the positive class
and “” represents the negative class.
13) Which of the following value of k in k-NN would minimize the leave one out
cross validation accuracy?
A) 3
B) 5
C) Both have same
D) None of these
Solution: B
5-NN will have least leave one out cross validation error.
14) Which of the following would be the leave one out cross validation accuracy for
k=5?
A) 2/14
B) 4/14
C) 6/14
D) 8/14
E) None of the above
Solution: E
In 5-NN we will have 10/14 leave one out cross validation accuracy.
15) Which of the following will be true about k in k-NN in terms of Bias?
Solution: A
A large K means a simple model, and a simple model is always considered to have high bias.
16) Which of the following will be true about k in k-NN in terms of variance?
Solution: B
Your task is to tag both distances by looking at the following two graphs. Which of
the following options is true about the graphs below?
Left is the graphical depiction of how euclidean distance works, whereas right one is of
Manhattan distance.
18) When you find noise in data which of the following option would you consider
in k-NN?
Solution: A
To be more sure of which classifications you make, you can try increasing the value of k.
19) In k-NN it is very likely to overfit due to the curse of dimensionality. Which of
the following option would you consider to handle such problem?
1. Dimensionality Reduction
2. Feature selection
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
In such a case you can use either a dimensionality reduction algorithm or a
feature selection algorithm.
20) Below are two statements. Which of the following options is true about
them?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
21) Suppose you are given the following images (1 left, 2 middle and 3 right).
Your task is to find the value of k in k-NN in each image, where k1 is for the 1st,
k2 for the 2nd and k3 for the 3rd figure.
A) k1 > k2> k3
B) k1<k2
C) k1 = k2 = k3
D) None of these
Solution: D
22) Which of the following values of k in the following graph would give the least
leave one out cross validation accuracy?
A) 1
B) 2
C) 3
D) 5
Solution: B
If you keep the value of k as 2, it gives the lowest cross validation accuracy. You can try
this out yourself.
23) A company has built a kNN classifier that gets 100% accuracy on training
data. When they deployed this model on the client side, it was found that the
model is not at all accurate. Which of the following might have gone wrong?
Note: The model was deployed successfully and no technical issues were found on the
client side except the model performance.
Solution: A
24) You are given the following 2 statements. Which of these options is/are
true in the case of k-NN?
1. In case of very large value of k, we may include points from other classes into the
neighborhood.
2. In case of too small value of k the algorithm is very sensitive to noise
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
Solution: D
Option A: This is not always true. You have to ensure that the value of k is neither too
high nor too low.
Option B: This statement is not true. The decision boundary can be a bit jagged.
A) TRUE
B) FALSE
Solution: A
27) In k-NN what will happen when you increase/decrease the value of k?
Solution: A
28) Following are two statements given for the k-NN algorithm. Which of the
statement(s) is/are true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
Context 29-30:
Suppose, you have trained a k-NN model and now you want to get the prediction on test
data. Before getting the prediction suppose you want to calculate the time taken by k-NN
for predicting the class for test data.
Note: Calculating the distance between 2 observations will take D time.
29) What would be the time taken by 1-NN if there are N (very large) observations
in the test data?
A) N*D
B) N*D*2
C) (N*D)/2
D) None of these
Solution: A
30) What would be the relation between the time taken by 1-NN,2-NN,3-NN.
Solution: C
The training time for any value of k in kNN algorithm is the same.
Bias-Variance tradeoff
The bias is an error from erroneous assumptions in the learning
algorithm. High bias can cause an algorithm to miss the relevant
relations between features and target outputs. In other words, a model
with high bias pays very little attention to the training data and
oversimplifies the model.
The variance is an error from sensitivity to small fluctuations in the
training set. High variance can cause an algorithm to model the
random noise in the training data, rather than the intended outputs.
In other words, a model with high variance pays a lot of attention to
the training data and does not generalize to data which it hasn’t
seen before.
Answer: (c) It would probably result in a decision tree that scores well
on the training set but badly on a test set
It is usual to make only binary splits because multiway splits break
the data into small subsets too quickly. This causes a bias towards
splitting predictors with many classes since they are more likely to
produce relatively pure child nodes, which results in overfitting.
Ans: Solution A
2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.
Ans: Solution B
Ans: Solution B
Ans: Solution A
Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.
Ans: Solution C
Regression,
Classification
Clustering
Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4
Ans : Solution D
Ans : Solution C
Ans : Solution B
11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.
Ans : Solution C
12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.
Ans : Solution C
13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above
Ans : Solution B
14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above
Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units
Ans : Solution A
Ans : Solution B
17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above
Ans : Solution A
Ans : Solution D
Ans : Solution C
20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above
Ans : Solution C
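A short worked check for question 20, using the standard definition R^2 = 1 - SSE/SST. The function name below is an illustrative assumption.

```python
def coefficient_of_determination(sst, sse):
    """Multiple coefficient of determination: R^2 = 1 - SSE / SST."""
    return 1.0 - sse / sst

r_squared = coefficient_of_determination(sst=200.0, sse=50.0)
print(r_squared)  # 0.75
```

With SST = 200 and SSE = 50 this gives 0.75, i.e. option c).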
Ans : Solution B
Ans : Solution B
Ans : Solution C
Ans : Solution D
26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.
Ans : Solution A
Ans : Solution A
28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error
Ans : Solution C
29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping
Ans : Solution B
30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.
Ans : Solution A
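A small sketch of the standard error computation from question 30: the square root of the sample variance divided by the number of sample instances. The sample values are made up for illustration.

```python
import math
import statistics

# Hypothetical sample; any list of numbers would do.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_variance = statistics.variance(sample)      # uses the n - 1 denominator
standard_error = math.sqrt(sample_variance / len(sample))
print(sample_variance, standard_error)
```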
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation
Ans : Solution D
Ans : Solution A
33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.
Ans : Solution B
34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.
Ans : Solution C
35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error
Ans : Solution A
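The error measures named in questions 28 and 35 can be compared side by side. This is a minimal sketch with made-up values; the function names are illustrative assumptions.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference between computed and desired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average squared difference between predicted and actual output."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))

actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

print(mean_absolute_error(actual, predicted))       # 1.0
print(mean_squared_error(actual, predicted))        # 1.5
print(root_mean_squared_error(actual, predicted))
```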
36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse
Ans : Solution A
Ans : Solution B
Ans : Solution C
39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary
Ans : Solution D
40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression
Ans : Solution B
41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B
Ans : Solution B
43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering
Ans : Solution C
44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution A
45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization
Ans : Solution C
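The K-Means stopping rule in question 45 (stop when the means no longer change between iterations) can be sketched in one dimension. The function name k_means_1d, the data and the initial means are assumptions chosen for illustration.

```python
def k_means_1d(values, initial_means, max_iterations=100):
    """1-D K-Means that stops when the means equal those of the previous iteration."""
    means = list(initial_means)
    for iteration in range(1, max_iterations + 1):
        # Assignment step: put each value in the cluster with the nearest mean.
        clusters = [[] for _ in means]
        for value in values:
            nearest = min(range(len(means)), key=lambda i: abs(value - means[i]))
            clusters[nearest].append(value)
        # Update step: recompute each mean (keep the old mean for an empty cluster).
        new_means = [sum(c) / len(c) if c else means[i] for i, c in enumerate(clusters)]
        if new_means == means:
            return means, iteration
        means = new_means
    return means, max_iterations

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_means, iterations = k_means_1d(values, initial_means=[1.0, 2.0])
print(final_means, iterations)
```

On this toy data the means settle at 2.0 and 11.0, and the loop exits on the iteration in which the recomputed means match the previous ones.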
46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.
Ans : Solution B
UNIT –II
1. True-False: Overfitting is more likely when you have a huge amount of data to train?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
3.Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove such a column.
4.It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
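Most dimensionality reduction algorithms (for example PCA) are unsupervised and need no target
variable. LDA is an example of a supervised dimensionality reduction algorithm.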
5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above
Ans D
8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions, you will most of the time lose some information
in the data, and you won’t be able to interpret the lower-dimensional data.
10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct
11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
12. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from
a college.
1) Which of the following statement is true in following case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these
Solution: (B)
Ordinal variables are variables that have some order in their categories. For example, grade
A should be considered a higher grade than grade B.
1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A
3. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider remaining parameters are same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more data to consider when fitting the logistic regression. Testing accuracy increases only if
the added feature is found to be significant.
6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.
11. Suppose you have trained a logistic regression classifier and it outputs a new example x with
a prediction hθ(x) = 0.2. This means
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Our estimate for P(y=1 | x)
Our estimate for P(y=0 | x)
Ans Solution: B
A) TRUE
B) FALSE
Solution: (A)
True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous output values, we use the mean squared
error metric to evaluate the model performance. The remaining options are used in the case of a
classification problem.
17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In the case of lasso regression we apply an absolute penalty, which makes some of the
coefficients zero.
19. Suppose that we have N independent variables (X1,X2… Xn) and dependent variable is Y.
Now Imagine that you are applying linear regression by fitting the best fit line using least square
error on this data.
You found that the correlation coefficient for one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.
20. Suppose you are given two variables V1 and V2 that have the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider the both of these two statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.
21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
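A minimal sketch of the y=|x| and y=x^2 examples, assuming x values that are symmetric around zero (the specific values are an illustrative choice). The Pearson coefficient is computed by hand so the block has no dependencies.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

r_square = pearson(xs, [x ** 2 for x in xs])       # y = x^2
r_abs = pearson(xs, [abs(x) for x in xs])          # y = |x|
r_linear = pearson(xs, [2.0 * x + 1.0 for x in xs])  # y = 2x + 1, for contrast

print(r_square, r_abs, r_linear)
```

Both nonlinear relationships give a coefficient of (numerically) zero even though y is fully determined by x, while the linear one gives a value of 1 up to rounding.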
22. True-False: Overfitting is more likely when you have a huge amount of data to train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.
23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
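A minimal sketch of the Normal Equation, theta = (X^T X)^(-1) X^T y, written out for the one-feature case with an intercept so that the 2x2 inverse can be done by hand. The function name and the toy data (generated from y = 1 + 2x) are assumptions for illustration. There is no learning rate and no iteration; the cost of the matrix inverse is what grows when the number of features is large.

```python
def normal_equation_simple(xs, ys):
    """Solve (X^T X) theta = X^T y for X = [1, x], returning (intercept, slope)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sum_x], [sum_x, sum_xx]] and X^T y = [sum_y, sum_xy].
    determinant = n * sum_xx - sum_x * sum_x
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / determinant
    slope = (n * sum_xy - sum_x * sum_y) / determinant
    return intercept, slope

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # exactly y = 1 + 2x

intercept, slope = normal_equation_simple(xs, ys)
print(intercept, slope)  # 1.0 2.0
```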
25. What will happen when you apply very large penalty?
A) Some of the coefficients will become absolute zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso, some of the coefficient values become zero, but in the case of Ridge, the coefficients
become close to zero but not exactly zero.
26. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
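The difference between the two penalties can be seen in a one-coefficient sketch. It assumes the simplified objectives written in the comments, where b stands for the unpenalized least-squares coefficient; this is an illustration of the shrinkage behaviour, not a full Ridge or Lasso solver, and the function names are hypothetical.

```python
def ridge_shrink(b, penalty):
    """Minimizer of 0.5 * (beta - b)**2 + 0.5 * penalty * beta**2."""
    return b / (1.0 + penalty)

def lasso_shrink(b, penalty):
    """Minimizer of 0.5 * (beta - b)**2 + penalty * abs(beta) (soft thresholding)."""
    if b > penalty:
        return b - penalty
    if b < -penalty:
        return b + penalty
    return 0.0

b = 0.8  # hypothetical unpenalized coefficient
for penalty in (0.1, 1.0, 1000.0):
    print(penalty, ridge_shrink(b, penalty), lasso_shrink(b, penalty))
```

Under these assumptions the ridge coefficient keeps shrinking toward zero without reaching it, while the lasso coefficient is exactly zero once the penalty is at least |b|.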
27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?
31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.
33. What do you expect will happen with bias and variance as you increase the size of training
data?
34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important. Which
of the following thing would you observe in such case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.
41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True. A neural network is a universal approximator, so it can implement a logistic regression
algorithm.
43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression uses maximum likelihood estimation for training.
44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output cannot be a real-valued
quantity, so mean squared error cannot be used for evaluating it.
45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select the logistic regression model that has the least AIC.
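A short sketch of model selection by AIC, using the standard formula AIC = 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximized log-likelihood. The model names and log-likelihood values below are invented purely for illustration.

```python
def aic(num_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * num_parameters - 2.0 * log_likelihood

# Hypothetical candidates: (number of parameters, maximized log-likelihood).
candidates = {
    "model_a": (3, -120.0),
    "model_b": (5, -118.5),
    "model_c": (8, -118.0),
}

scores = {name: aic(k, ll) for name, (k, ll) in candidates.items()}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

Here the larger models fit slightly better but are penalized for their extra parameters, so the smallest AIC belongs to the simplest model.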
Solution: A
In the case of lasso we apply an absolute penalty; after increasing the penalty in lasso, some of the
coefficients of the variables may become zero.
Context: 48-49
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation, consider P(y=1|x; w), viewed as a function of x, that we can get by changing the
parameters w.
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x in the range of real numbers from −∞ to +∞, the logistic function gives an output
in (0,1).
49 In above question what do you think which function would make p between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.
51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
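The odds and logit values discussed in questions 50 and 51 can be checked numerically. A minimal sketch, with illustrative function names:

```python
import math

def odds(p):
    """Probability of success divided by probability of failure."""
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

print(odds(0.5), logit(0.5))   # fair coin: odds 1.0, logit 0.0
for p in (0.001, 0.25, 0.75, 0.999):
    print(p, odds(p), logit(p))
```

Probabilities near 0 give large negative logits and probabilities near 1 give large positive ones, consistent with a range of (-∞, ∞).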
A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
Solution:A
53. Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
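That the logistic function is the inverse of the logit can be verified with a round trip. A small sketch; the helper names are illustrative.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def logistic(z):
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))

probabilities = [0.05, 0.2, 0.5, 0.9]
round_trip = [logistic(logit(p)) for p in probabilities]
print(round_trip)
```

Each probability is recovered up to floating-point rounding, which is what Logistic(x) = Logit_inv(x) means.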
Suppose you have given the two scatter plot “a” and “b” for two classes( blue for positive and red for
negative class). In scatter plot “a”, you correctly classified all data points using logistic regression ( black
line is a decision boundary).
A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these
Solution: A
55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.
Solution: A and D
Adding more features to the model will increase the training accuracy because the model has more
data to consider when fitting the logistic regression. Testing accuracy increases only if the added feature is found to be significant.
56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.
If there are n classes, then n separate logistic regression models have to be fit, where the probability
of each category is predicted against the rest of the categories combined.
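A sketch of the relabeling step behind One-Vs-All: each class gets its own binary target vector, and one binary classifier would then be fit per vector. Only the relabeling is shown; the function name and the labels are made up for illustration.

```python
def one_vs_rest_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = all the rest)."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}

labels = ["cat", "dog", "bird", "dog", "cat"]
targets = one_vs_rest_targets(labels)

for class_name, binary_target in targets.items():
    print(class_name, binary_target)
```

Three classes produce three binary problems, matching the "n models for n classes" answer.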
57. Below are two different logistic models with different values for β0 and β1.
Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?
Solution: B
Context 58-60
Below are the three scatter plot(A,B,C left to right) and hand drawn decision boundaries for logistic
regression.
58. Which of the following above figure shows that the decision boundary is overfitting the training
data?
A) A
B) B
C) C
D)None of these
Solution: C
In figure C the decision boundary is not smooth, which means it is overfitting the data.
1. The training error in the first plot is the maximum compared to the second and third plots.
2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).
3. The second model is more robust than first and third because it will perform best on unseen
data.
5. All will perform same because we have not seen the testing data.
A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5
Solution: C
The trend in the graphs looks like a quadratic trend over the independent variable X. A higher-degree
polynomial (right graph) might have very high accuracy on the training population but is expected to fail
badly on the test dataset. In the left graph the training error is the maximum because the model underfits
the training data.
60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?
A) A
B) B
C) C
D) All have equal regularization
Solution: A
More regularization means a larger penalty, which means a less complex decision boundary, as shown in
the first figure, A.
61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data so that it takes less time and
gives comparatively similar accuracy (may not be the same)?
Solution: D
If you decrease the number of iterations, training will surely take less time but will not give the
same accuracy. To get similar (though not identical) accuracy, you also need to increase the learning rate.
62. Following is the loss function in logistic regression (Y-axis loss function and X-axis log probability) for
a two-class classification problem.
Which of the following images shows the cost function for y = 1?
Solution: A
A is the true answer as loss function decreases as the log probability increases
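The shape of the cost for y = 1 can be checked numerically, assuming the usual logistic regression log loss, where the cost for a positive example is -log(p) and p is the predicted probability of class 1. The probabilities below are arbitrary illustrative values.

```python
import math

def cost_for_positive_example(p):
    """Log loss contribution when the true label is y = 1: -log(p)."""
    return -math.log(p)

probabilities = [0.1, 0.5, 0.9, 0.99]
costs = [cost_for_positive_example(p) for p in probabilities]
print(costs)
```

The cost falls steadily as the predicted probability (and therefore its log) rises, and it approaches zero as p approaches 1.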
A) 1
B) 2
C) 3
D) 4
Solution: C
There are three local minima present in the graph
64. Can a Logistic Regression classifier do a perfect classification on the below data?
Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).
A) TRUE
B) FALSE
C) Can’t say
D) None of these
Solution: B
No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
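The figure for question 64 is not reproduced here, so the sketch below assumes the data is the XOR pattern over binary X1 and X2, a standard example that is not linearly separable. It brute-forces a grid of linear decision rules of the form w1*x1 + w2*x2 + b > 0 and reports the best number of points any of them classifies correctly. The grid range is an arbitrary choice.

```python
from itertools import product

# Assumed data: XOR labels over two binary inputs.
xor_points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def correct_count(w1, w2, b):
    """Number of XOR points classified correctly by the rule w1*x1 + w2*x2 + b > 0."""
    return sum(
        1
        for (x1, x2), label in xor_points
        if (1 if w1 * x1 + w2 * x2 + b > 0 else 0) == label
    )

grid = [i / 2 for i in range(-6, 7)]  # -3.0 to 3.0 in steps of 0.5
best = max(correct_count(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(best, "of", len(xor_points))
```

No linear rule in the grid gets all four points right, which matches the explanation that a linear decision surface cannot separate such data.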
UNIT IV
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
Ans Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
Ans Solution: D
SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.
Ans Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
Ans Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Question Context:8– 9
Suppose you are using a Linear SVM classifier with 2 class classification problem. Now you have been
given the following data in which some points are circled red that are representing support vectors.
8. If you remove any one of the red points from the data, will the decision boundary
change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.
9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?
A) True
B) False
Solution: B
On the other hand, rest of the points in the data won’t affect the decision boundary much.
Solution: B
Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.
11. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.
13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVM’s?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
Datasets which have a clear classification boundary will function best with SVM’s.
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.
15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.
17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.
For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.
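The effect of gamma can be seen directly in the RBF kernel value, assuming the common form k(x, x') = exp(-gamma * ||x - x'||^2). The distances and gamma values below are arbitrary illustrative choices.

```python
import math

def rbf_kernel(distance, gamma):
    """RBF kernel value for two points separated by the given Euclidean distance."""
    return math.exp(-gamma * distance ** 2)

for gamma in (0.1, 1.0, 10.0):
    values = [rbf_kernel(d, gamma) for d in (0.0, 0.5, 2.0)]
    print(gamma, values)
```

With a high gamma the kernel value drops to almost zero at moderate distances, so only nearby points have influence. With a low gamma, distant points still contribute noticeably.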
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
19. Suppose you are building an SVM model on data X. The data X can be error prone, which means that
you should not trust any specific data point too much. Now suppose that you want to build an SVM model
which has a quadratic kernel function of polynomial degree 2 and uses the slack variable C as one of its
hyperparameters. Based upon that, answer the following question.
What would happen when you use a very large value of C (C->infinity)?
Note: For small C, the model was also classifying all data points correctly
A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.
20. What would happen when you use very small C (C~0)?
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.
21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?
Solution: D
SVMs are highly versatile models that can be used for a wide range of real-world problems, from
regression to clustering and handwriting recognition.
Question Context: 23 – 25
Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.
23. Which of the following options would you be more likely to consider when iterating on the SVM next time?
Solution: C
The best option here would be to create more features for the model.
24. Suppose you gave the correct answer to the previous question. What do you think is actually
happening?
A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
A better-fitting model will lower the bias and increase the variance.
25. In the above question, suppose you want to change one of the SVM's hyperparameters so that the effect
would be the same as in the previous questions, i.e. the model will not underfit?
Solution: A
Increasing the C parameter would be the right thing to do here, as it weakens the regularization and
lets the model fit the training data more closely.
26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.
27. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.
28. Suppose the classes are equally distributed in the data. Say that training one classifier in the
one-vs-all setting takes 10 seconds. How many seconds would it take to train the one-vs-all method end
to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
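The counting in questions 27 and 28 can be sketched as follows: one-vs-all turns a K-class problem into K binary problems, each trained once. The helper name `one_vs_rest_targets` and the toy labels are hypothetical; the 10 seconds per fit is the figure given in question 28.

```python
def one_vs_rest_targets(labels):
    """Build one binary target vector (+1 / -1) per class."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}

# Toy 4-class label vector (illustrative only).
labels = ["a", "b", "c", "d", "a", "c"]
targets = one_vs_rest_targets(labels)

seconds_per_fit = 10                      # from question 28
n_fits = len(targets)                     # one binary SVM per class
total_seconds = n_fits * seconds_per_fit  # end-to-end training time
print(n_fits, total_seconds)
```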
29. Suppose your problem has changed and the data now has only 2 classes. How many times do we
need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
Training the SVM only one time would give you appropriate results
Suppose you are using an SVM with a polynomial kernel of degree 2. You have applied
it to the data and found that it fits the data perfectly, i.e. training and testing accuracy are both 100%.
30. Now suppose you increase the complexity (the degree of the polynomial kernel). What do
you think will happen?
Solution: A
Increasing the complexity of the model would make the algorithm overfit the data.
31. In the previous question, after increasing the complexity you found that the training accuracy was still
100%. What, in your view, is the reason for that?
1. Since data is fixed and we are fitting more polynomial term or parameters so the algorithm starts
memorizing everything in the data
2. Since data is fixed and SVM doesn’t need to search in big hypothesis space
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
UNIT V
1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a) Decision Tree
b) Regression
c) Classification
d) Random Forest
Ans D
a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above
Ans C
a. True
b. False
Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.
Regression
Classification
Clustering
Reinforcement Learning
Options:
a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4
Ans D
6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:
Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above
Ans A
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: C
Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: B
In boosting, the individual weak learners are not independent of each other because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered ways of improving the
base learners' results.
9. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
1. Individual tree is built on a subset of the features
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Ans Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
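A minimal sketch of that sampling step, following the simplified per-tree description above: rows are drawn with replacement (a bootstrap sample) and a subset of feature indices is drawn without replacement. The function name `bootstrap_sample` and all sizes are illustrative assumptions; note that libraries such as scikit-learn draw the feature subset at each split rather than once per tree.

```python
import random

def bootstrap_sample(n_rows, n_features, max_features, rng):
    """Pick the row indices and feature indices one tree would see."""
    # Rows: sampled WITH replacement, so some rows repeat and some are left out.
    rows = [rng.randrange(n_rows) for _ in range(n_rows)]
    # Features: a random subset, sampled WITHOUT replacement.
    features = sorted(rng.sample(range(n_features), max_features))
    return rows, features

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
for k in range(3):
    rows, features = bootstrap_sample(n_rows=8, n_features=5, max_features=2, rng=rng)
    print("tree", k, "rows", rows, "features", features)
```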
10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Ans Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger
number of trees in model building if possible. Random Forest is a black-box model; you lose
interpretability after using it.
11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
Solution: E
12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.
13. Which of the following algorithms do not use learning rate as one of their hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
14. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.
15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, we would want a larger
number of trees in model building if possible. Random Forest is a black-box model; you lose
interpretability after using it.
16. True-False: Bagging is suitable for high-variance, low-bias models?
A) TRUE
B) FALSE
Solution: A
Bagging is suitable for high-variance, low-bias models, or in other words for complex models.
17. To apply bagging to regression trees, which of the following is/are true?
Solution: D
Solution: B
We always consider the validation results to compare with the test result.
19. In which of the following scenarios is gain ratio preferred over information gain?
Solution: A
For high-cardinality attributes, gain ratio is preferred over information gain.
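To see why, the sketch below compares information gain and gain ratio (information gain divided by the split information, i.e. the entropy of the attribute itself) on a toy dataset. The attribute names `row_id` and `weather` and the four-row dataset are made up for illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(attribute, labels):
    """Entropy of the labels minus the weighted entropy after splitting."""
    n = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(attribute, labels):
    """Information gain normalized by the split information."""
    split_info = entropy(attribute)
    return information_gain(attribute, labels) / split_info if split_info > 0 else 0.0

labels = ["yes", "yes", "no", "no"]
row_id = [1, 2, 3, 4]                     # high cardinality: unique per row
weather = ["sun", "sun", "rain", "rain"]  # low cardinality, equally predictive

for name, attr in (("row_id", row_id), ("weather", weather)):
    print(name, information_gain(attr, labels), gain_ratio(attr, labels))
```

Both attributes reach the same information gain, but the gain ratio penalizes `row_id` for splitting the data into many tiny branches.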
20. Suppose you are given the following training and validation errors for Gradient
Boosting. Which of the following hyperparameter settings would you choose in such a case?
Scenario  Depth  Training Error  Validation Error
1  2  100  110
2  4  90  105
3  6  50  100
4  8  45  105
5  10  30  150
A) 1
B) 2
C) 3
D) 4
Solution: B
Scenarios 2 and 4 have the same validation error, but we would select scenario 2 because its lower depth
is the better hyperparameter choice.
21. Which of the following is/are not true about DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be in a distance threshold to a core point
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
Solution: D
DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
UNIVERSITY OF OSLO
Faculty of Mathematics and Natural Sciences
Make sure that your copy of this examination paper is complete before answering.
The exam text consists of problems 1-30 (multiple choice questions) to be answered on
the form that is enclosed in the appendix and problems 31-33 which are answered on
the usual sheets. Problems 1-30 have a total weight of 60%, while problems 31-33 have a
weight of 40%.
You can use the right column of the text as a draft. The form in the appendix is the one to be
handed in (remember to include your candidate number).
Problem 1
Hill climbing
A Is a population-based optimization algorithm
B Results depend on the starting points
C Can only be done when a solution has a finite number of neighbors
D Has less randomness than greedy search
Problem 2
Strategy parameters
A Adapt mutation using a fixed strategy schedule
B Improve the chances of finding a better solution in the short term
C Improve the chances of finding the global optimum
D Adapt mutation by adjusting the normal distribution spread
Problem 3
Evolution strategies have
A Random parents selection
B Uniform mutation
C Recombination by partially mapped crossover
D Fitness proportional survivor selection
Problem 4
The crossover operators used in binary representations can also be used in
A Integer representations
B Real-valued representations
C Permutation representations
D Tree representations
Problem 5
Permutation representation works with
A Swap mutation
B Creep mutation
C Scramble mutation
D Insert mutation
Problem 6
Adding an offset to all fitness values affects selection pressure in
A Fitness proportional selection
B Ranking selection
C Tournament selection
D 𝜇 + 𝜆 selection
Problem 7
One can improve results on multi-modal problems by
A Ensuring that the initial population is well distributed
B Reducing the population size
C Reducing the fitness of individuals that are close to others
D Increasing the selection pressure
Problem 8
Pareto dominance
A Is hard to combine with tournament selection
B Can be used to sort points according to multiple objectives
C Reduces the objective functions to a scalar value
D A solution dominates another if it is as good in every way and better in at least one
Problem 9
Running multiple times is necessary to measure the performance of
A An exhaustive search
B An evolution strategy
C Training a multi-layer perceptron
D Training a self-organizing map
Problem 10
Machine learning
A Should be distinguished from self-learning
B Is applicable to classification problems
C A number of different biology-inspired methods could be used for machine learning
D Is learning automatically from examples
Problem 11
Machine learning
A Can be applied to analyze new data
B Is an alternative to artificial intelligence
C Can be used at design time and/or at run time
D Is always learning from scratch and not adaptation of a previously learned system
Problem 12
Machine learning algorithms
A Supervised learning is good for clustering problems
B Reinforcement learning is about learning behavior based on reward
C Unsupervised learning does not require target values
D Selecting among the above learning methods is independent of the problem to be solved
Problem 13
Swarm intelligence algorithms
A Are inspired by interaction in nature between living beings in motion
B Are focused on centralized control
C Simple local rules are often applicable
D It is difficult to predict the global behavior of the system
Problem 14
Particle Swarm Optimization (PSO)
A Is a population based algorithm
B Particles are selected for survival based on their fitness
C Velocity and position of each solution are updated
D Updates are also based on neighbor particles
Problem 15
Cartesian Genetic Programming (CGP)
A Has less restrictions than Genetic Programming
B Can be used for evolving digital circuits
C The level-back parameter indicates the number of previous columns a node can connect to
D Crossover is always used
Problem 16
Classification
A Concerns finding decision boundaries that can be used to separate out different classes
B Evolvable hardware is not applicable for classification
C Non-linear decision boundaries can solve more complex problems than linear boundaries (straight lines)
D A test set is more relevant for testing generalization than the training set
Problem 17
Biological neural networks
A The outputs from a neuron are pulses of fixed strength (height) and duration
B The output from the neuron is called a synapse
C Synapses can be inhibitory or excitatory
D Learning takes place in the dendrites
Problem 18
Which function does the following multi-layer perceptron realize:
A NAND
B NOR
C AND
D XOR
Problem 19
Multilayer perceptron network
A Usually, the weights are initially set to small random values
B A hard limiting activation function is often used
C The weights can only be updated after all the training vectors have been presented
D Multiple layers of neurons allow for less complex decision boundaries than a single layer
Problem 20
Support Vector Machines (SVMs)
A Support vectors are used for computing hyperplanes
B Is a method for minimizing the margin to hyperplanes
C Nonlinear problems are handled with mapping inputs to lower-dimensional space
D Kernel functions are used for transforming data
Problem 21
Which separation line would SVM most likely choose?
A
B
C
D
Problem 22
Soft margins in SVMs
A Reduce misclassifications during training
B Allow some of the training data to be misclassified by introducing slack variables
C Reduce the problem of training data overfitting
D Are not useful if any training data is mislabeled
Problem 23
Ensemble learning
A A combination of classifiers are applied for classification
B Classifiers should be trained to be slightly different
C In bagging, each training sample (data point) is used only once for each iteration
D Minority voting is used if there is disagreement
Problem 24
Principal component analysis (PCA)
A Finds the directions with the most variation in the data
B Is useful for visualizing data
C Dimensions are increased when applying PCA
D Eigenvalues and eigenvectors are computed from the covariance matrix
Problem 25
Unsupervised learning
A Categorizes training vectors by identifying similarities between them
B Can use the same error functions as supervised learning
C Collaborative learning methods are often applied between classes
D The data applied is unlabeled
Problem 26
k-means
A Automatically finds the number of clusters
B Each cluster center is moved to the mean of data points assigned to it for each iteration
C A too small number of clusters may lead to overfitting
D The algorithm has converged when the change in cluster assignment is less than a threshold
Problem 27
Self-Organizing Feature Map
A Includes both a competition and collaboration part
B Two or more weight layers are often used
C Training data that are similar excite neurons that are near to each other
D Represents a clustering technique
Problem 28
Self-Organizing Feature Map learning
A Increased network size leads to increased generalization
B Weights of the winner neuron (and its neighborhood) are updated
C The number of weights being modified for each training vector is increased throughout learning
D A neighborhood function is used to compute the distance to the winner neuron
Problem 29
Reinforcement learning
A Works best with smaller state spaces
B Keeps a log of all individual actions taken by the agent
C Requires the agent to know the rewards for every action
D Models learning behavior in animals
Problem 30
Reinforcement learning discount factor
A Is specified in the interval −1,0
B Is used to account for uncertainties about future rewards
C Develops exponentially with time
D Adjusts the balance between shortsightedness and farsightedness
Problem 31 (6%)
In a few sentences, sketch how you could modify a hill climbing algorithm in order to
improve the chances of finding the global optimum.
Problem 32 (10%)
If you were to design an evolutionary algorithm to optimize the following problems, what
kind of genetic representation (genotype) would you choose, and why? (Maximum two
sentences for each)
a) Finding the best route for delivering a set of packages to different addresses
b) Optimize parameters of a physical structure like an antenna with a given shape
c) Design of a digital circuit
Problem 33 (24%)
SiO, the student welfare organization, would like to have a system for sorting utensils after
washing. You are going to help them design a camera-based classifier system for sorting
knives, forks, spoons and teaspoons into separate bins. You have a machine vision library
available that lets you identify where there is a utensil in the camera images, and it extracts a
large number of features for each identified object that we can use as inputs.
(a) (4%) What class of learning algorithm would be best to use in this case, supervised,
unsupervised or reinforcement learning? Justify your answer.
(b) (4%) We would like to make a system for distinguishing the utensils using a multi-layer
perceptron network. How many output neurons should the network have, and what would
each of them represent?
(c) (8%) Sketch the steps in the forward and backward phase of the multi-layer perceptron
algorithm (backpropagation). Use words and not equations.
(d) (4%) What are the different approaches to how often weights are updated during training?
(e) (4%) How would you find out when to stop the training?
Appendix
Problem A B C D
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
Appendix
Problem A B C D
1 Ο
2 Ο Ο
3 Ο
4 Ο Ο
5 Ο Ο Ο
6 Ο
7 Ο Ο
8 Ο Ο
9 Ο Ο Ο
10 Ο Ο Ο
11 Ο Ο
12 Ο Ο
13 Ο Ο Ο
14 Ο Ο Ο
15 Ο Ο
16 Ο Ο Ο
17 Ο Ο
18 Ο
19 Ο
20 Ο Ο
21 Ο
22 Ο Ο
23 Ο Ο
24 Ο Ο Ο
25 Ο Ο
26 Ο Ο
27 Ο Ο Ο
28 Ο Ο
29 Ο Ο
30 Ο Ο Ο
Student name
3. (4 points) Check all the binary classifiers that are able to correctly separate the training data
(circles vs. triangles) given in Figure 1.
Logistic regression
SVM with linear kernel
SVM with RBF kernel
Decision tree
3-nearest-neighbor classifier (with Euclidean distance).
MA2823 2 / 12 Dec. 16, 2016
[Figure 1: training data, circles vs. triangles (plot not reproduced).]
Solution:
• Logistic regression and linear SVM: linear decision functions, hence no.
• 3-NN: the 3 nearest neighbors of any point in our training set are 1 of the same
class and 2 of the opposite class, hence 3-NN will be systematically wrong.
• DT: yes, you can partition the space with lines orthogonal to the axes in such a way
that every sample ends up in a different region.
Short questions
4. (1 point) In a Bayesian learning framework, what is a posterior?
Solution: The updated probability p(θ|D) of a model, after having seen the data.
9. (2 points) A data scientist runs a principal component analysis on their data and tells you
that the percentage of variance explained by the first 3 components is 80 %. How is this
percentage of variance explained computed?
Solution: The overall variance is computed as the sum of the variances of all variables
(i.e. the sum of the diagonal terms of the covariance matrix). The variance explained
(or accounted for) by one PC is the variance of this PC (i.e. the diagonal term on the
corresponding entry of the covariance matrix of the data projected onto its PCs). The
variance explained by the first 3 components is the sum of the first three values on the
diagonal of the covariance matrix of the data projected onto its PCs. Dividing this sum by the
overall variance gives the percentage of variance explained.
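The computation described in this solution can be sketched with NumPy: the variances of the principal components are the eigenvalues of the covariance matrix, and their sum equals its trace (the overall variance). The synthetic data and the function name `explained_variance_ratio` are assumptions made for illustration.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of the total variance carried by each principal component."""
    cov = np.cov(X, rowvar=False)            # p x p covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # PC variances, largest first
    return eigvals / eigvals.sum()           # eigvals.sum() == trace(cov)

# Synthetic data: 200 samples, 5 features with very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

ratio = explained_variance_ratio(X)
print("variance explained by the first 3 components:", ratio[:3].sum())
```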
10. Assume you are given data $\{(x^1, y^1), \dots, (x^n, y^n)\}$ where $x^i \in \mathcal{X}$ and $y^i \in \mathbb{R}$. You are
planning to train an SVM. You define a kernel $k$ and obtain, on your training data, the kernel
matrix $K$ presented in Figure 2, where $K_{ij} = k(x^i, x^j)$.
(a) (1 point) What is the issue here?
Solution: Diagonal dominance: the kernel is equivalent to the identity matrix and
the SVM won’t learn.
Solution: Normalize the kernel matrix by $K_{ij} \leftarrow \frac{K_{ij}}{\sqrt{K_{ii} K_{jj}}}$, or manipulate a coefficient
of your kernel to obtain non-zero off-diagonal terms.
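A minimal sketch of that normalization, dividing each entry K[i][j] by sqrt(K[i][i] * K[j][j]) so that every diagonal entry becomes 1. The function name `normalize_kernel` and the 2 x 2 matrix are illustrative assumptions, not the matrix of Figure 2.

```python
import math

def normalize_kernel(K):
    """Return the matrix with entries K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]

# Toy kernel matrix with unequal diagonal entries (made up for illustration).
K = [[4.0, 1.0],
     [1.0, 9.0]]
K_norm = normalize_kernel(K)
print(K_norm)
```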
11. (2 points) Assume we are given data $\{(x^1, y^1), \dots, (x^n, y^n)\}$ where $x^i \in \mathbb{R}^p$ and $y^i \in \mathbb{R}$,
and a parameter $\lambda > 0$. We denote by $X$ the $n \times p$ matrix of row vectors $x^1, \dots, x^n$ and
$y = (y^1, \dots, y^n)$. We are also given a graph structure on the features, where vertices are
features and edges connect related features. We denote by $E$ the set of edges of this graph.
The graph-Laplacian-regularized linear regression estimator is defined as:
$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 + \lambda \sum_{(u,v) \in E} (\beta_u - \beta_v)^2.$$
What does the regularizer $\sum_{(u,v) \in E} (\beta_u - \beta_v)^2$ enforce?
12. Consider a data set described using 1 000 features in total. The labels have been generated
using the first 50 features. Another 50 features are exact copies of these features. The 900
remaining features are uninformative. Assume we have 100 000 training data points.
(a) (2 points) How many features will a filtering approach select?
Problems
13. Perceptron. Consider the following Boolean function:
x1 x2 y = ¬x1 ∨ x2
0 0 1
0 1 1
1 0 0
1 1 1
(a) (2 points) Can this function be represented by a perceptron? Explain your answer.
[Figure: the four inputs in the (x1, x2) plane; (1, 0) is labeled - and the other three points are labeled +.]
(b) (4 points) If yes, draw a perceptron that represents it. Otherwise, build a multilayer
neural network that will.
w0 = 1, w1 = −1, w2 = 2
Its output is given by: 1 if w0 + w1 x1 + w2 x2 > 0 and 0 otherwise.
This is one of many possible solutions. w0 , w1 , w2 must give the equation of a line
that separates (1, 0) from (0, 0), (0, 1) and (1, 1).
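The weights given in this solution can be checked against the truth table with a few lines of Python. The function name `perceptron` is a hypothetical helper; the weights and the threshold rule are the ones stated above.

```python
def perceptron(x1, x2, w0=1, w1=-1, w2=2):
    """Output 1 if w0 + w1*x1 + w2*x2 > 0, and 0 otherwise."""
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

# Truth table of y = (not x1) or x2.
truth_table = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 1}

for (x1, x2), y in truth_table.items():
    print((x1, x2), "expected", y, "got", perceptron(x1, x2))
```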
$$\frac{\partial l(\theta_{jk})}{\partial \theta_{jk}} = 0,$$
we obtain
$$\frac{1}{\theta_{jk}} \sum_{i=1}^{n} I_{ik} x^i_j - \frac{1}{1 - \theta_{jk}} \sum_{i=1}^{n} I_{ik} (1 - x^i_j) = 0.$$
Finally,
$$\hat{\theta}_{jk} = \frac{n_{jk}}{n_k}.$$
For a data point $x = (x_1, \dots, x_p)$, we can write the Naive Bayes decision rule as:
$$f(x) = \arg\max_{k=1,\dots,K} \left( \frac{P(Y = y_k) P(x \mid Y = y_k)}{\sum_{l=1}^{K} P(Y = y_l) P(x \mid Y = y_l)} \right).$$
Why?
(d) (1 point) Given a data point x, how can you calculate P (X = x) given the parameters
estimated by Naive Bayes?
Solution: Compute $P(X = x)$ as $\sum_k P(X = x \mid Y = y_k) P(Y = y_k)$.
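As a small worked example of this sum, the sketch below assumes binary features with Bernoulli parameters theta[k][j] = P(x_j = 1 | Y = y_k) and class priors P(Y = y_k). All numbers and function names are made up for illustration.

```python
from itertools import product

def likelihood(x, theta_k):
    """P(X = x | Y = y_k) under the Naive Bayes independence assumption."""
    p = 1.0
    for xj, t in zip(x, theta_k):
        p *= t if xj == 1 else 1.0 - t
    return p

def marginal(x, priors, theta):
    """P(X = x) = sum over k of P(X = x | Y = y_k) * P(Y = y_k)."""
    return sum(pk * likelihood(x, theta_k) for pk, theta_k in zip(priors, theta))

priors = [0.6, 0.4]       # hypothetical P(Y = y_k)
theta = [[0.9, 0.2],      # hypothetical theta[k][j] for class 0
         [0.3, 0.7]]      # and for class 1

print(marginal((1, 0), priors, theta))
# Sanity check: the marginal sums to 1 over all binary inputs.
print(sum(marginal(x, priors, theta) for x in product((0, 1), repeat=2)))
```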
15. Virtual high-throughput screening. Figure 3 presents the performance of several algorithms
applied to the problem of classifying molecules in two classes: those that inhibit
Human Respiratory Syncytial Virus (HRSV), and those that do not. HRSV is the most frequent
cause of respiratory tract infections in small children, with a worldwide estimated
prevalence of about 34 million cases per year among children under 5 years of age.
(a) (1 point) Which method gives the best performance?
(b) (2 points) The goal of this study is to develop an algorithm that can be used to suggest,
among a large collection of several million molecules, those that should be experimentally
tested for activity against HRSV. Compounds that are active against HRSV are
good leads from which to develop new medical treatments against infections caused by
this virus. In this context, is it preferable to have a high sensitivity or a high specificity?
Which part of the ROC curve is the most interesting?
Solution: We want a low false positive rate (so as to ensure there are mostly promising
compounds among those that will be selected for further development; therapeutic
development is costly), i.e. high specificity. We’re interested in the left part
of the curve: what sensitivity can we get for a fixed specificity?
Figure 3: ROC curves for several algorithms classifying molecules according to their action on
HRSV, computed on a test set. Sensitivity = True Positive Rate. Specificity = 1 - False Positive
Rate. VS-RF : Random Forest. SVM : Support Vector Machine. GP : Gaussian Process. LDA :
Linear Discriminant Analysis. kNN : k-Nearest Neighbors. Source: M. Hao, Y. Li, Y. Wang, and S.
Zhang, Int. J. Mol. Sci. 2011, 12(2), 1259-1280.
(c) (1 point) In this study, the authors have represented the molecules based on 777 descriptors.
Those descriptors include the number of oxygen atoms, the molecular weights,
the number of rotatable bonds, or the estimated solubility of the molecule. They have
fewer samples (216) than descriptors. What is the danger here?
Solution: Overfitting.
16. Kernel ridge regression. Assume we are given data $\{(x^1, y^1), \dots, (x^n, y^n)\}$ where $x^i \in \mathbb{R}^p$
is centered and $y^i \in \mathbb{R}$, and a parameter $\lambda > 0$. We denote by $X$ the $n \times p$ matrix of row
vectors $x^1, \dots, x^n$ and $y = (y^1, \dots, y^n)$. The ridge regression estimator is defined as:
(b) (2 points) Write down the value of the prediction for a data point x0 ∈ Rp , as a function
of X, y and λ.
Solution:
$$\hat{y} = \hat{\beta}^\top x_0 = y^\top (X X^\top + \lambda I)^{-1} X x_0.$$
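(c) (2 points) Let us now replace all data points with their image in a Hilbert space $\mathcal{H}$: $x$
is replaced by $\phi(x)$, where $\phi : \mathbb{R}^p \to \mathcal{H}$. Let us define $K$ as the $n \times n$ matrix with
entries $K_{ij} = \langle \phi(x^i), \phi(x^j) \rangle_{\mathcal{H}}$, and $\kappa$ as the n-dimensional vector with entries
$\kappa_i = \langle \phi(x^i), \phi(x_0) \rangle_{\mathcal{H}}$.
We are now solving the following optimization problem:
$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \|y - \Phi\beta\|_2^2 + \lambda \|\beta\|_2^2,$$
Solution:
$$\hat{y} = \hat{\beta}^\top x_0 = y^\top (K + \lambda I)^{-1} \kappa.$$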
(d) (2 points) Could the kernel trick be applied in a similar fashion to the $\ell_1$-regularized
linear regression (Lasso)?
Solution: No, because unlike $\|w\|_2$, $\|w\|_1$ cannot be expressed as a dot product.
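The two prediction formulas of parts (b) and (c) can be checked numerically. The sketch below, on random data, compares the usual primal ridge solution with the dual form $y^\top (X X^\top + \lambda I)^{-1} X x_0$, and then evaluates the kernelized form with a linear kernel ($K = X X^\top$, $\kappa = X x_0$), which should give the same number. Function names and data are assumptions made for illustration.

```python
import numpy as np

def ridge_predict_primal(X, y, x0, lam):
    """beta = (X^T X + lam I)^{-1} X^T y, prediction beta^T x0."""
    p = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    return float(beta @ x0)

def ridge_predict_dual(X, y, x0, lam):
    """Prediction y^T (X X^T + lam I)^{-1} X x0."""
    n = X.shape[0]
    return float(y @ np.linalg.solve(X @ X.T + lam * np.eye(n), X @ x0))

def kernel_ridge_predict(K, kappa, y, lam):
    """Prediction y^T (K + lam I)^{-1} kappa, using only kernel values."""
    n = K.shape[0]
    return float(y @ np.linalg.solve(K + lam * np.eye(n), kappa))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))   # n = 6 samples, p = 3 features
y = rng.normal(size=6)
x0 = rng.normal(size=3)
lam = 0.5

print(ridge_predict_primal(X, y, x0, lam))
print(ridge_predict_dual(X, y, x0, lam))
print(kernel_ridge_predict(X @ X.T, X @ x0, y, lam))  # linear kernel
```

The dual form only ever touches the data through inner products, which is what makes the substitution of a kernel possible.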
17. Quadratic SVM. We are given the 2-dimensional training data D shown in Figure 4 for a
binary classification problem (circles vs. triangles). Assume we are using an SVM with a
quadratic kernel. Let C be the cost parameter of the SVM.
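Assuming $D = \{x^i, y^i\}_{i=1,\dots,n}$ with $x \in \mathbb{R}^2$ and $y \in \{-1, +1\}$, recall that the SVM is solving
the following optimization problem:
$$\arg\min_{w \in \mathbb{R}^p,\, b \in \mathbb{R}} \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{such that}$$
$$y^i \left( \langle w, \phi(x^i) \rangle + b \right) \ge 1 - \xi_i \quad \text{for all } i = 1, \dots, n,$$
$$\xi_i \ge 0 \quad \text{for all } i = 1, \dots, n,$$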
[Figure 4: training data D, panels (a) and (b); plots not reproduced.]
Large C means the classifier makes few errors. Quadratic SVM means the decision
boundary is an ellipsoid.
(b) (2 points) On Figure 4 (b), draw the decision boundary for a very small value of C. Justify
your answer here.
Small C means the classifier has a large margin. Quadratic SVM means the decision
boundary is an ellipsoid.
(c) (2 points) Which of the two (large C or small C) do you expect to generalize better and
why?
Solution: Small C. The two triangles near the circles are most likely noise/outliers.
Solution: [Figure: the data points grouped into Cluster 1 and Cluster 2; plot not reproduced.]
(b) (2 points) Does your solution change after another iteration of the k-means algorithm?
Solution: No.
Figure 5: Data for Question 18.
Bonus questions
19. (1 point) In scikit-learn, what is the difference between the methods predict and
predict_proba for classifiers?
Solution: predict returns a class prediction, while predict_proba returns the probabilities
of belonging to each of the classes.
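A short sketch of the difference, using `LogisticRegression` as an example classifier on a tiny made-up dataset (the choice of classifier and the data are assumptions; any scikit-learn classifier that implements `predict_proba` behaves the same way).

```python
from sklearn.linear_model import LogisticRegression

# Tiny 1-D toy dataset with two classes.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

clf = LogisticRegression().fit(X, y)
queries = [[0.5], [2.5]]

labels = clf.predict(queries)       # one class label per query point
probs = clf.predict_proba(queries)  # one row per query, one column per class

print(labels)
print(probs)
# Columns of predict_proba follow the order of clf.classes_.
print(clf.classes_)
```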
20. (1 point) Which feature(s) can you use to represent months in such a way that December is
equally distant from January and November using the Euclidean distance?
Solution: Map to a circle and use cosine and sine of the angle, i.e. use 2 features $\cos(\pi k / 6)$
and $\sin(\pi k / 6)$.
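The claim can be checked directly, taking k as the month number (1 for January up to 12 for December, an assumption about the intended indexing). The helper name `month_features` is hypothetical.

```python
import math

def month_features(k):
    """Map month number k (1..12) to a point on the unit circle."""
    angle = math.pi * k / 6
    return (math.cos(angle), math.sin(angle))

december = month_features(12)
january = month_features(1)
november = month_features(11)

# Euclidean distances from December to its two neighbors on the circle.
print(math.dist(december, january), math.dist(december, november))
```

With the raw month number as a single feature, December would instead be 11 units from January and 1 unit from November.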
Family name: Vision and Machine-Learning
Given name: 1/28/2011
Multiple-Choice Questionnaire
Group B
No documents authorized.
There can be several right answers to a question.
Marking-scheme: 2 points if all right answers are selected, 1 point in case of a
right but incomplete answer, 0 point if a wrong answer is selected.
Question N.1:
Large scale visual search. The number of visual words (VW) and their
structure are parameters, if we want to perform large scale search for particular
objects/buildings/scenes taken from different viewpoints. Select the statements
which are correct.
Possible answers:
a. A small number of VW (between 1000 and 4000) gives excellent performance.
b. A very large number of VW (between 200k and 1M) gives excellent performance.
c. An average number of VW (around 20k) combined with a refinement based
on a short binary signature gives excellent performance.
d. A hierarchically structured visual vocabulary improves the performance in
terms of search accuracy.
Question N.2:
Image features. Given scale invariant interest points and SIFT descriptors
which are normalized in the direction of the dominant gradient orientation. Select
the properties which are correct.
Possible answers:
a. These descriptors allow matching images taken at different distances.
b. These descriptors are invariant to image rotation and translation.
c. These descriptors are invariant to affine transformations.
d. The detected regions indicate the local characteristic scale of the image.
Question N.3:
Image features. The Harris detector extracts interest points for a given image.
Select the properties which are correct.
Possible answers:
a. The detector is based on the auto-correlation matrix.
b. The detector selects the characteristic scale.
c. The detector finds discriminant points.
d. The detector is invariant to rotation.
Question N.4:
Bag-of-features models for category-level classification. Image classification
is one task in category recognition. Select the statements which are true
in the context of image classification.
Possible answers:
a. The PASCAL dataset is a standard to compare the performance of different
algorithms for image classification.
b. Image classification allows to localize objects in the image.
c. When training an image classifier, we use positive and negative training images.
d. The number of visual words used in the context of image classification is in
general very high (between 100k and 1M visual words).
Question N.5:
Bag-of-features models for category-level classification. The spatial
pyramid kernel can be used for image classification. Select the statements which
are correct.
Possible answers:
a. The spatial pyramid kernel captures coarsely the global spatial layout of the
image.
b. The spatial pyramid kernel works well for classifying scenes.
c. The spatial pyramid kernel is well adapted for classifying images with objects
in arbitrary positions.
d. The spatial pyramid kernel is invariant to image rotation.
Question N.6:
Camera geometry and image alignment. Two images, I and I′, are captured
by two cameras with internal calibration matrices K and K′. The two
cameras have the same camera center and are related by a pure rotation given
by matrix R (the mosaicking scenario). What is the form of the homography H
relating the two images in terms of K, K′ and R?
Hint: Start from the perspective projection equation, which has the form x = (1/z) K[R t]X,
where x (3-vector) and X (4-vector) are the image and scene points in homogeneous
coordinates, respectively. Assume the two cameras have the same camera center at
the origin, t1 = t2 = 0, and use the fact that the scene point X can be written as
X = [X̃; 1] (X̃ stacked on top of 1), where X̃ (3-vector) is in non-homogeneous coordinates.
Possible answers:
a. H = K′RK
b. H = RKR⁻¹
c. H = RK′R⁻¹
d. H = K′RK⁻¹
Question N.7:
Camera geometry and image alignment. What is the minimal number of
point-to-point correspondences to compute (i) a homography and (ii) a 2D affine
geometric transformation?
Possible answers:
a. 1 correspondence for affine transformation, 2 correspondences for homography.
b. 2 correspondences for affine transformation, 3 correspondences for homography.
c. 3 correspondences for affine transformation, 4 correspondences for homography.
d. 4 correspondences for affine transformation, 5 correspondences for homography.
Question N.8:
Large scale visual search. N SIFT descriptors are indexed using a randomized
KD-tree discussed in the lecture. What is the complexity (in terms of N) of
finding an approximate nearest neighbor to a query SIFT descriptor?
Possible answers:
a. N²
b. N
c. log N
d. log log N
Question N.9:
Unsupervised learning. The k-means algorithm is a :
Possible answers:
a. supervised learning algorithm.
b. unsupervised learning algorithm.
c. semi-supervised learning algorithm.
d. weakly supervised learning algorithm.
Question N.10:
Unsupervised learning. Let k(·, ·) be a positive definite kernel defining a
similarity measure. The spectral clustering algorithm relies upon the singular value
decomposition (SVD) of:
Possible answers:
a. K = [k(x_i, x_j)]_{1≤i,j≤n}
c. L = D − K, with D = diag(deg(x_1), ..., deg(x_n)) and deg(x_i) = Σ_{j=1}^{n} k(x_i, x_j) for i = 1, ..., n
Question N.11:
Supervised learning. The support vector machine uses the loss function
Possible answers:
a. ℓ(y, f) = max(0, 1 − yf)
c. ℓ(y, f) = (y − f)²
d. ℓ(y, f) = |y − f|
Question N.12:
Supervised learning. For competitive performance (and competitive gener-
alization error), the C parameter of the support vector machine should be:
Possible answers:
a. kept fixed to C = 1 regardless of the training data at hand
b. optimized on the test set, which is eventually used for evaluating the true performance
of the learning algorithm
c. optimized on the training set
d. optimized through a cross-validation loop on the training set
Question N.13:
Category-level localization. A linear SVM classifier used in combination
with the sliding-window object detector
Possible answers:
a. is fast because of the cascade structure
b. is fast because it can be expressed in the form of a dot-product f(x) = wᵀx + b.
Question N.14:
Category-level localization. Pictorial structure models are often used to
model objects in terms of parts and relations between parts. The graph of a
pictorial structure model
Possible answers:
a. has nodes corresponding to object parts and edges corresponding to part
relations.
b. has nodes corresponding to part relations and edges corresponding to object
parts.
c. has an associated energy function which can always be optimized in polynomial
time.
d. typically has a tree or a star structure for efficiency reasons.
Question N.15:
Motion and human actions. Optical flow estimation is problematic
Possible answers:
a. in homogeneous image areas.
b. in textured image areas.
c. at image edges.
d. at the boundaries of moving objects.
Question N.16:
Motion and human actions. Movie scripts can be used as a source of readily
available supervision. They can provide
Possible answers:
a. spatial supervision for objects in the video.
b. noisy temporal supervision.
c. reliable temporal supervision.
d. complete description of a video.
Multiple-Choice Questionnaire
Group B
               a b c d
Question n.1
Question n.2
Question n.3
Question n.4
Question n.5
Question n.6
Question n.7
Question n.8
Question n.9
Question n.10
Question n.11
Question n.12
Question n.13
Question n.14
Question n.15
Question n.16
Multiple-Choice Questionnaire
Group B
               a b c d
Question n.1   X X
Question n.2   X X X
Question n.3   X X X
Question n.4   X X
Question n.5   X X
Question n.6   X
Question n.7   X
Question n.8   X
Question n.9   X
Question n.10  X
Question n.11  X
Question n.12  X
Question n.13  X X
Question n.14  X X
Question n.15  X X X
Question n.16  X
1. The process of forming general concept definitions from examples of concepts to be learned.
A. Deduction
B. abduction
C. induction
D. conjunction
A. facts.
B. concepts.
C. procedures.
D. principles.
A. validation data
B. training data
C. test data
D. hidden data
A. hidden attribute.
B. output attribute.
C. input attribute.
D. categorical attribute.
5. Supervised learning differs from unsupervised clustering in that supervised learning requires
6. A regression model in which more than one independent variable is used to predict the dependent
variable is called
A. regression
B. correlation
C. multicollinearity
D. none of the above
8. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
A. increase by 3 units
B. decrease by 3 units
C. increase by 4 units
D. decrease by 4 units
10. A measure of goodness of fit for the estimated regression equation is the
13. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of determination is
A. 0.25
B. 4.00
C. 0.75
D. none of the above
14. A nearest neighbor approach is best used
A. supervised learning
B. unsupervised clustering
C. data query
1. What is the average weekly salary of all female employees under forty years of age? (C)
2. Develop a profile for credit card customers likely to carry an average monthly balance of more
than $1000.00. (A)
3. Determine the characteristics of a successful used car salesperson. (A)
4. What attribute similarities group customers holding one or several insurance policies? (A)
5. Do meaningful attribute relationships exist in a database containing information about credit
card customers? (B)
6. Do single men play more golf than married men? (C)
7. Determine whether a credit card transaction is valid or fraudulent (A)
A. predictive variable
B. independent variable
C. estimated variable
D. dependent variable
20. Which statement is true about neural network and linear regression models?
A. detect outliers
B. determine a best set of input attributes for supervised learning
C. evaluate the likely performance of a supervised learner model
D. determine if meaningful relationships can be found in a dataset
E. All of a,b,c, and d are common uses of unsupervised clustering.
22. The average positive difference between computed and desired outcome values.
23. Selecting data so as to assure that each class is properly represented in both the training and
test set.
A. cross validation
B. stratification
C. verification
D. bootstrapping
24. The standard error is defined as the square root of this computation.
A. training
B. test
C. verification
D. validation
27. The correlation between the number of years an employee has worked for a company and
the salary of the employee is 0.75. What can be said about employee salary and years
worked?
28. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell
you?
29. The average squared difference between classifier predicted output and actual output.
30. Simple regression assumes a __________ relationship between the input attribute and
output attribute.
A. linear
B. quadratic
C. reciprocal
D. inverse
31. Regression trees are often used to model _______ data.
A. linear
B. nonlinear
C. categorical
D. symmetrical
33. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
A. linear, numeric
B. linear, binary
C. nonlinear, numeric
D. nonlinear, binary
34. This technique associates a conditional probability value with each data instance.
A. linear regression
B. logistic regression
C. simple regression
D. multiple linear regression
35. This supervised learning technique can process both numeric and categorical input attributes.
A. linear regression
B. Bayes classifier
C. logistic regression
D. backpropagation learning
A. agglomerative clustering
B. expectation maximization
C. conceptual clustering
D. K-Means clustering
38. This clustering algorithm initially assumes that each data instance represents a single cluster.
A. agglomerative clustering
B. conceptual clustering
C. K-Means clustering
D. expectation maximization
39. This unsupervised clustering algorithm terminates when mean values computed for the
current iteration of the algorithm are identical to the computed mean values for the previous
iteration.
A. agglomerative clustering
B. conceptual clustering
C. K-Means clustering
D. expectation maximization
40. Machine learning techniques differ from statistical techniques in that machine learning
methods
• The exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet.
• Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a
brief explanation. All short answer sections can be successfully answered in a few sentences AT MOST.
• For multiple-choice questions, fill in the bubbles for ALL CORRECT CHOICES (in some cases, there may be
more than one). For a question with p points and k choices, every false positive will incur a penalty of p/(k − 1)
points.
• For short answer questions, unnecessarily long explanations and extraneous data will be penalized.
Please try to be terse and precise and do the side calculations on the scratch papers provided.
• Please draw a bounding box around your answer in the Short Answers section. A missed answer without
a bounding box will not be regraded.
First name
Last name
SID
Q1. [23 pts] True/False
(a) [1 pt] Solving a non linear separation problem with a hard margin Kernelized SVM (Gaussian RBF Kernel)
might lead to overfitting.
True
False
(b) [1 pt] In SVMs, the sum of the Lagrange multipliers corresponding to the positive examples is equal to the sum
of the Lagrange multipliers corresponding to the negative examples.
True
False
(c) [1 pt] SVMs directly give us the posterior probabilities P (y = 1|x) and P (y = −1|x).
True False
(e) [1 pt] In the discriminative approach to solving classification problems, we model the conditional probability
of the labels given the observations.
True
False
(f ) [1 pt] In a two class classification problem, a point on the Bayes optimal decision boundary x∗ always satisfies
P (y = 1|x∗ ) = P (y = 0|x∗ ).
True False
(g) [1 pt] Any linear combination of the components of a multivariate Gaussian is a univariate Gaussian.
True
False
(h) [1 pt] For any two random variables X ∼ N(µ1, σ1²) and Y ∼ N(µ2, σ2²), X + Y ∼ N(µ1 + µ2, σ1² + σ2²).
True False
(i) [1 pt] Stanford and Berkeley students are trying to solve the same logistic regression problem for a dataset.
The Stanford group claims that their initialization point will lead to a much better optimum than Berkeley’s
initialization point. Stanford is correct.
True False
(j) [1 pt] In logistic regression, we model the odds ratio (p/(1 − p)) as a linear function.
True False
(k) [1 pt] Random forests can be used to classify infinite dimensional data.
True
False
(l) [1 pt] In boosting we start with a Gaussian weight distribution over the training samples.
True False
(m) [1 pt] In Adaboost, the error of each hypothesis is calculated by the ratio of misclassified examples to the total
number of examples.
True False
(n) [1 pt] When k = 1 and N → ∞, the kNN classification rate is bounded above by twice the Bayes error rate.
True
False
(o) [1 pt] A single layer neural network with a sigmoid activation for binary classification with the cross entropy
loss is exactly equivalent to logistic regression.
True
False
(p) [1 pt] The loss function for LeNet5 (the convolutional neural network by LeCun et al.) is convex.
True False
(q) [1 pt] Convolution is a linear operation i.e. (αf1 + βf2 ) ∗ g = αf1 ∗ g + βf2 ∗ g.
True
False
(r) [1 pt] The k-means algorithm does coordinate descent on a non-convex objective function.
True
False
(s) [1 pt] A 1-NN classifier has higher variance than a 3-NN classifier.
True
False
(t) [1 pt] The single link agglomerative clustering algorithm groups two clusters on the basis of the maximum
distance between points in the two clusters.
True False
(u) [1 pt] The largest eigenvector of the covariance matrix is the direction of minimum variance in the data.
True False
(w) [1 pt] The non-zero eigenvalues of AAᵀ and AᵀA are the same.
True
False
Q2. [36 pts] Multiple Choice Questions
(a) [4 pts] In linear regression, we model P(y|x) ∼ N(wᵀx + w0, σ²). The irreducible error in this model is ______.
σ²
E[(y − E[y|x])|x]
(b) [4 pts] Let S1 and S2 be the set of support vectors and w1 and w2 be the learnt weight vectors for a linearly
separable problem using hard and soft margin linear SVMs respectively. Which of the following are correct?
(c) [4 pts] Ordinary least-squares regression is equivalent to assuming that each data point is generated according
to a linear function of the input plus zero-mean, constant-variance Gaussian noise. In many systems, however,
the noise variance is itself a positive linear function of the input (which is assumed to be non-negative, i.e.,
x ≥ 0). Which of the following families of probability models correctly describes this situation in the univariate
case?
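$P(y|x) = \frac{1}{\sigma\sqrt{2\pi x}} \exp\left(-\frac{(y - (w_0 + w_1 x))^2}{2x\sigma^2}\right)$
$P(y|x) = \frac{1}{\sigma\sqrt{2\pi x}} \exp\left(-\frac{(y - (w_0 + (w_1 + \sigma)x))^2}{2\sigma^2}\right)$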
(f ) [4 pts] Let A be a symmetric matrix and S be the matrix containing its eigenvectors as column vectors, and D
a diagonal matrix containing the corresponding eigenvalues on the diagonal. Which of the following are true:
AS = SD SA = DS
AS = DS AS = DSᵀ
(g) [4 pts] Consider the following dataset: A = (0, 2), B = (0, 1) and C = (1, 0). The k-means algorithm is
initialized with centers at A and B. Upon convergence, the two centers will be at
(h) [3 pts] Which of the following loss functions are convex?
(i) [3 pts] Consider T1 , a decision stump (tree of depth 2) and T2 , a decision tree that is grown till a maximum
depth of 4. Which of the following is/are correct?
(j) [4 pts] Consider the problem of building decision trees with k-ary splits (split one node into k nodes) and
you are deciding k for each node by calculating the entropy impurity for different values of k and optimizing
simultaneously over the splitting threshold(s) and k. Which of the following is/are true?
The algorithm will always choose k = 2 There will be k −1 thresholds for a k-ary split
Q3. [26 pts] Short Answers
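(a) [5 pts] Given that (x1, x2) are jointly normally distributed with $\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}$ and $\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{bmatrix}$ ($\sigma_{21} = \sigma_{12}$), give
an expression for the mean of the conditional distribution p(x1|x2 = a).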
$$\sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = \frac{1}{1 + e^{-x}} \left(1 - \frac{1}{1 + e^{-x}}\right) = \sigma(x)(1 - \sigma(x))$$
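The identity σ′(x) = σ(x)(1 − σ(x)) can be sanity-checked numerically. The sketch below is only an illustration: the function names and the test points are assumptions, and the central difference serves purely as an independent check on the closed form.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function sigma(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Closed form from the derivation: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Numerical derivative, used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Largest disagreement between the two derivatives on a few test points.
max_gap = max(
    abs(sigmoid_grad(x) - central_difference(sigmoid, x))
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0)
)
print(max_gap)
```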
Biased estimator: θ̂ (the sample estimate) is a biased estimator of θ (the population distribution parameter)
if E[θ̂] ≠ θ.
Here $\hat\theta = x_{(n)}$, and $E[x_{(n)}] = \frac{n}{n+1}\theta \neq \theta$. The steps for finding $E[x_{(n)}]$ are given in the solutions of Homework
2, problem 5(c).
$$\hat\theta_{\text{unbiased}} = \frac{n+1}{n} x_{(n)}$$
$$E[\hat\theta_{\text{unbiased}}] = E\left[\frac{n+1}{n} x_{(n)}\right] = \frac{n+1}{n} E[x_{(n)}] = \frac{n+1}{n} \times \frac{n}{n+1}\theta = \theta$$
(d) [5 pts] Consider the problem of fitting the following function to a dataset of 100 points {(xi , yi )}, i = 1 . . . 100:
y = αcos(x) + βsin(x) + γ
This problem can be solved using the least squares method with a solution of the form:
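$$\begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} = (X^\top X)^{-1} X^\top Y, \qquad X = \begin{bmatrix} \cos(x_1) & \sin(x_1) & 1 \\ \cos(x_2) & \sin(x_2) & 1 \\ \vdots & \vdots & \vdots \\ \cos(x_{100}) & \sin(x_{100}) & 1 \end{bmatrix}, \qquad Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{100} \end{bmatrix}$$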
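As a rough illustration of this solution, the sketch below builds the design matrix with rows [cos(x_i), sin(x_i), 1] and solves the normal equations with NumPy. The ground-truth coefficients, the random inputs, and the noise-free targets are all assumptions made up for the demonstration.

```python
import numpy as np

# Hypothetical ground-truth coefficients, chosen only for this demo.
alpha_true, beta_true, gamma_true = 2.0, -1.0, 0.5

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0 * np.pi, size=100)  # 100 sample inputs
y = alpha_true * np.cos(x) + beta_true * np.sin(x) + gamma_true  # noise-free targets

# Design matrix with rows [cos(x_i), sin(x_i), 1].
X = np.column_stack([np.cos(x), np.sin(x), np.ones_like(x)])

# Solve the normal equations (X^T X) w = X^T Y without forming an explicit inverse.
alpha, beta, gamma = np.linalg.solve(X.T @ X, X.T @ y)
print(alpha, beta, gamma)
```

Solving the linear system directly is usually preferred over computing (XᵀX)⁻¹ explicitly, although both express the same formula.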
(e) [5 pts] Consider the problem of binary classification using the Naive Bayes classifier. You are given two-dimensional
features (X1, X2) and the categorical class conditional distributions in the tables below. The entries in
the tables correspond to P(X1 = x1|Ci) and P(X2 = x2|Ci) respectively. The two classes are equally likely.

X1     C1     C2
−1    0.2    0.3
 0    0.4    0.6
 1    0.4    0.1

X2     C1     C2
−1    0.4    0.1
 0    0.5    0.3
 1    0.1    0.6

Given a data point (−1, 1), calculate the following posterior probabilities:
P (C1 |X1 = −1, X2 = 1) = Using Bayes’ Rule and conditional independence assumption of Naive Bayes
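The calculation can be sketched in a few lines of Python. The dictionary layout and the function name below are illustrative assumptions; the probabilities are the table entries from the question. For the point (−1, 1) the unnormalized scores are 0.5 · 0.2 · 0.1 = 0.01 for C1 and 0.5 · 0.3 · 0.6 = 0.09 for C2.

```python
# Class-conditional tables from the question: P(X1 = v | C) and P(X2 = v | C).
p_x1 = {"C1": {-1: 0.2, 0: 0.4, 1: 0.4}, "C2": {-1: 0.3, 0: 0.6, 1: 0.1}}
p_x2 = {"C1": {-1: 0.4, 0: 0.5, 1: 0.1}, "C2": {-1: 0.1, 0: 0.3, 1: 0.6}}
prior = {"C1": 0.5, "C2": 0.5}  # the two classes are equally likely

def posterior(x1: int, x2: int) -> dict:
    """Naive Bayes posterior P(C | X1 = x1, X2 = x2) for both classes."""
    # Conditional independence: P(x1, x2 | C) = P(x1 | C) * P(x2 | C).
    joint = {c: prior[c] * p_x1[c][x1] * p_x2[c][x2] for c in prior}
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}

post = posterior(-1, 1)
print(post)
```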
CS 189, Spring 2016: Introduction to Machine Learning, Final
• Please do not open the exam before you are instructed to do so.
• The exam is closed book, closed notes except your two-page cheat sheet.
• Electronic devices are forbidden on your person, including cell phones, iPods, headphones, and laptops.
Turn your cell phone off and leave all electronics at the front of the room, or risk getting a zero on
the exam.
• Please write your initials at the top right of each page (e.g., write “JS” if you are Jonathan Shewchuk). Finish
this by the end of your 3 hours.
• Mark your answers on front of each page, not the back. We will not scan the backs of each page, but you may
use them as scratch paper. Do not attach any extra sheets.
• The total number of points is 150. There are 30 multiple choice questions worth 3 points each, and 6 written
questions worth a total of 60 points.
• For multiple-choice questions, fill in the boxes for ALL correct choices: there may be more than one correct
choice, but there is always at least one correct choice. NO partial credit on multiple-choice questions: the
set of all correct answers must be checked.
First name
Last name
SID
Q1. [90 pts] Multiple Choice
Check the boxes for ALL CORRECT CHOICES. Every question should have at least one box checked. NO PARTIAL
CREDIT: the set of all correct answers (only) must be checked.
(1) [3 pts] What strategies can help reduce overfitting in decision trees?
(2) [3 pts] Which of the following are true of convolutional neural networks (CNNs) for image analysis?
Filters in earlier layers tend to include edge detectors
They have more parameters than fully-connected networks with the same number of layers and the same numbers of neurons in each layer
Pooling layers reduce the spatial resolution of the image
A CNN can be trained for unsupervised learning tasks, whereas an ordinary neural net cannot
(4) [3 pts] Which of the following are true about generative models?
They model the joint distribution P(class = C AND sample = x)
The perceptron is a generative model
They can be used for classification
Linear discriminant analysis is a generative model
weights are regularized with the ℓ1 norm
the weights have a Gaussian prior
weights are regularized with the ℓ2 norm
the solution algorithm is simpler
(6) [3 pts] Which of the following methods can achieve zero training error on any linearly separable dataset?
can be applied to every classification algorithm
is commonly used for dimensionality reduction
changes ridge regression so we solve a d × d linear system instead of an n × n system, given n sample points with d features
exploits the fact that in many learning algorithms, the weights can be written as a linear combination of input points
(8) [3 pts] Suppose we train a hard-margin linear SVM on n > 100 data points in ℝ², yielding a hyperplane with
exactly 2 support vectors. If we add one more data point and retrain the classifier, what is the maximum
possible number of support vectors for the new hyperplane (assuming the n + 1 points are linearly separable)?
2
n
3
n + 1
(9) [3 pts] In latent semantic indexing, we compute a low-rank approximation to a term-document matrix. Which
of the following motivate the low-rank reconstruction?
Finding documents that are related to each other, e.g. of a similar genre
The low-rank approximation provides a lossless method for compressing an input matrix
(10) [3 pts] Which of the following are true about subset selection?
Subset selection can substantially decrease the bias of support vector machines
Subset selection can reduce overfitting
Ridge regression frequently eliminates some of the features
Finding the true best subset takes exponential time
(11) [3 pts] In neural networks, nonlinear activation functions such as sigmoid, tanh, and ReLU
speed up the gradient calculation in backpropagation, as compared to linear units
help to learn nonlinear decision boundaries
are applied only to the output units
always output values between 0 and 1
(12) [3 pts] Suppose we are given data comprising points of several different classes. Each class has a different
probability distribution from which the sample points are drawn. We do not have the class labels. We use
k-means clustering to try to guess the classes. Which of the following circumstances would undermine its
effectiveness?
Some of the classes are not normally distributed
The variance of each distribution is small in all directions
Each class has the same mean
You choose k = n, the number of sample points
(13) [3 pts] Which of the following are true of spectral graph partitioning methods?
They find the cut with minimum weight
They minimize a quadratic function subject to one constraint: the partition must be balanced
They use one or more eigenvectors of the Laplacian matrix
The Normalized Cut was invented at Stanford
(14) [3 pts] Which of the following can help to reduce overfitting in an SVM classifier?
(15) [3 pts] Which value of k in the k-nearest neighbors algorithm generates the solid decision boundary depicted
here? There are only 2 classes. (Ignore the dashed line, which is the Bayes decision boundary.)
k = 1
k = 2
k = 10
k = 100
(16) [3 pts] Consider one layer of weights (edges) in a convolutional neural network (CNN) for grayscale images,
connecting one layer of units to the next layer of units. Which type of layer has the fewest parameters to be
learned during training? (Select one.)
(17) [3 pts] In the kernelized perceptron algorithm with learning rate ε = 1, the coefficient a_i corresponding to a
training example x_i represents the weight for K(x_i, x). Suppose we have a two-class classification problem with
y_i ∈ {1, −1}. If y_i = 1, which of the following can be true for a_i?
a_i = −1
a_i = 1
a_i = 0
a_i = 5
(18) [3 pts] Suppose you want to split a graph G into two subgraphs. Let L be G’s Laplacian matrix. Which of the
following could help you find a good split?
The eigenvector corresponding to the second-largest eigenvalue of L
The left singular vector corresponding to the second-largest singular value of L
The eigenvector corresponding to the second-smallest eigenvalue of L
The left singular vector corresponding to the second-smallest singular value of L
(19) [3 pts] Which of the following are properties that a kernel matrix always has?
(20) [3 pts] How does the bias-variance decomposition of a ridge regression estimator compare with that of ordinary
least squares regression? (Select one.)
Ridge has larger bias, larger variance
Ridge has smaller bias, larger variance
Ridge has larger bias, smaller variance
Ridge has smaller bias, smaller variance
(21) [3 pts] Both PCA and Lasso can be used for feature selection. Which of the following statements are true?
Lasso selects a subset (not necessarily a strict subset) of the original features
PCA and Lasso both allow you to specify how many features are chosen
PCA produces features that are linear combinations of the original features
PCA and Lasso are the same if you use the kernel trick
(22) [3 pts] Which of the following are true about forward subset selection?
O(2^d) models must be trained during the algorithm, where d is the number of features
It finds the subset of features that give the lowest test error
It greedily adds the feature that most improves cross-validation accuracy
Forward selection is faster than backward selection if few features are relevant to prediction
(23) [3 pts] You’ve just finished training a random forest for spam classification, and it is getting abnormally bad
performance on your validation set, but good performance on your training set. Your implementation has no
bugs. What could be causing the problem?
Your decision trees are too deep
You have too few trees in your ensemble
You are randomly sampling too many features when you choose a split
Your bagging implementation is randomly sampling sample points without replacement
(24) [3 pts] Consider training a decision tree given a design matrix $X = \begin{bmatrix} 6 & 3 \\ 2 & 7 \\ 9 & 6 \\ 4 & 2 \end{bmatrix}$ and labels $y = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}$. Let f1 denote
feature 1, corresponding to the first column of X, and let f2 denote feature 2, corresponding to the second
column. Which of the following splits at the root node gives the highest information gain? (Select one.)
f1 > 2
f2 > 3
f1 > 4
f2 > 6
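One way to compare the four candidate splits is to compute the information gain of each directly. The sketch below is an illustration only: the helper names are assumptions, and entropy is measured in bits. The data are the four sample points and labels from the question.

```python
import math

# Design matrix rows (f1, f2) and labels, as given in the question.
X = [(6, 3), (2, 7), (9, 6), (4, 2)]
y = [1, 0, 1, 0]

def entropy(labels):
    """Binary entropy in bits of a list of 0/1 labels (0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def info_gain(feature: int, threshold: float) -> float:
    """Entropy at the root minus the weighted entropy after the split."""
    right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
    left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
    after = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
    return entropy(y) - after

gains = {
    "f1 > 2": info_gain(0, 2),
    "f1 > 4": info_gain(0, 4),
    "f2 > 3": info_gain(1, 3),
    "f2 > 6": info_gain(1, 6),
}
best = max(gains, key=gains.get)
print(gains, best)
```

The split f1 > 4 separates the points with f1 ∈ {2, 4} (both labeled 0) from those with f1 ∈ {6, 9} (both labeled 1), so both children are pure and the gain equals the full root entropy of 1 bit.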
(25) [3 pts] In terms of the bias-variance decomposition, a 1-nearest neighbor classifier has ______ than a
3-nearest neighbor classifier.
(26) [3 pts] Which of the following are true about bagging?
Bagging is ineffective with logistic regression, because all of the learners learn exactly the same decision boundary
If we use decision trees that have one sample point per leaf, bagging never gives lower training error than one ordinary decision tree
(27) [3 pts] An advantage of searching for an approximate nearest neighbor, rather than the exact nearest neighbor,
is that
it sometimes makes exhaustive search much faster
the nearest neighbor classifier is sometimes much more accurate
(28) [3 pts] In the derivation of the spectral graph partitioning algorithm, we relax a combinatorial optimization
problem to a continuous optimization problem. This relaxation has the following effects.
The combinatorial problem requires an exact bisection of the graph, but the continuous algorithm can produce (after rounding) partitions that aren’t perfectly balanced
The combinatorial problem requires finding eigenvectors, whereas the continuous problem requires only matrix multiplication
The combinatorial problem cannot be modified to accommodate vertices that have different masses, whereas the continuous problem can
The combinatorial problem is NP-hard, but the continuous problem can be solved in polynomial time
determines how strongly the dendrites of the neuron stimulate axons of neighboring neurons
is more analogous to the output of a unit in a neural net than the output voltage of the neuron
only changes very slowly, taking a period of several seconds to make large adjustments
can sometimes exceed 30,000 action potentials per second
(30) [3 pts] In algorithms that use the kernel trick, the Gaussian kernel
gives a regression function or predictor function that is a linear combination of Gaussians centered at the sample points
is equivalent to lifting the d-dimensional sample points to points in a space whose dimension is exponential in d
is less prone to oscillating than polynomials, assuming the variance of the Gaussians is large
has good properties in theory but is rarely used in practice
(31) 3 bonus points! The following Berkeley professors were cited in this semester’s lectures (possibly self-cited)
for specific research contributions they made to machine learning.
Q2. [8 pts] Feature Selection
A newly employed former CS 189/289A student trains the latest Deep Learning classifier and obtains state-of-the-art
accuracy. However, the classifier uses too many features! The boss is overwhelmed and asks for a model with fewer
features.
Let’s try to identify the most important features. Start with a simple dataset in ℝ².
(1) [4 pts] Describe the training error of a Bayes optimal classifier that can see only the first feature of the data.
Describe the training error of a Bayes optimal classifier that can see only the second feature.
The first feature yields a training error of 50% (like random guessing). The second feature offers a training error of
zero.
(2) [4 pts] Based on this toy example, the student decides to fit a classifier on each feature individually, then
rank the features by their classifier’s accuracy, take the best k features, and train a new classifier on those k
features. We call this approach variable ranking. Unfortunately, the classifier trained on the best k features
obtains horrible accuracy, unless k is very close to d, the original number of features!
Construct a toy dataset in ℝ² for which variable ranking fails. In other words, a dataset where a variable is
useless by itself, but potentially useful alongside others. Use + for data points in Class 1, and O for data points
in Class 2.
An XOR Dataset is unpredictable with either feature. (This extends to n-dimensions, with the n-bit parity string.)
Q3. [10 pts] Gradient Descent for k-means Clustering
Recall the loss function for k-means clustering with k clusters, sample points x_1, ..., x_n, and centers µ_1, ..., µ_k:
$$L = \sum_{j=1}^{k} \sum_{x_i \in S_j} \|x_i - \mu_j\|^2,$$
where S_j refers to the set of data points that are closer to µ_j than to any other cluster mean.
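(1) [4 pts] Instead of updating µ_j by computing the mean, let’s minimize L with batch gradient descent while
holding the sets S_j fixed. Derive the update formula for µ_1 with learning rate (step size) ε.
$$\frac{\partial L}{\partial \mu_1} = \frac{\partial}{\partial \mu_1} \sum_{x_i \in S_1} (x_i - \mu_1)^\top (x_i - \mu_1) = \sum_{x_i \in S_1} 2(\mu_1 - x_i).$$
Stepping against the gradient and absorbing the constant factor 2 into ε, the update is $\mu_1 \leftarrow \mu_1 + \epsilon \sum_{x_i \in S_1} (x_i - \mu_1)$.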
(2) [2 pts] Derive the update formula for µ_1 with stochastic gradient descent on a single sample point x_i. Use
learning rate ε.
µ_1 ← µ_1 + ε(x_i − µ_1) if x_i ∈ S_1, otherwise no change.
(3) [4 pts] In this part, we will connect the batch gradient descent update equation with the standard k-means
algorithm. Recall that in the update step of the standard algorithm, we assign each cluster center to be the
mean (centroid) of the data points closest to that center. It turns out that a particular choice of the learning
rate ε (which may be different for each cluster) makes the two algorithms (batch gradient descent and the
standard k-means algorithm) have identical update steps. Let’s focus on the update for the first cluster, with
center µ_1. Calculate the value of ε so that both algorithms perform the same update for µ_1. (If you do it right,
the answer should be very simple.)
In the standard algorithm, we assign $\mu_1 \leftarrow \sum_{x_i \in S_1} \frac{1}{|S_1|} x_i$.
Comparing to the answer in (1), we set $\sum_{x_i \in S_1} \frac{1}{|S_1|} x_i = \mu_1 + \epsilon \sum_{x_i \in S_1} (x_i - \mu_1)$ and solve for ε.
$$\sum_{x_i \in S_1} \frac{1}{|S_1|} x_i - \sum_{x_i \in S_1} \frac{1}{|S_1|} \mu_1 = \epsilon \sum_{x_i \in S_1} (x_i - \mu_1)$$
$$\sum_{x_i \in S_1} \frac{1}{|S_1|} (x_i - \mu_1) = \epsilon \sum_{x_i \in S_1} (x_i - \mu_1).$$
Thus $\epsilon = \frac{1}{|S_1|}$.
(Note: answers that differ by a constant factor are fine if consistent with answer for (1).)
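The result can be checked numerically: with a step size of 1/|S1|, one batch step of the form µ1 ← µ1 + ε Σ (x_i − µ1) lands exactly on the centroid, whatever the starting center. The points, the starting center, and the function name below are made up for illustration.

```python
import numpy as np

# Hypothetical cluster S1 and current center; values are for illustration only.
S1 = np.array([[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]])
mu1 = np.array([10.0, -7.0])

def batch_step(mu: np.ndarray, points: np.ndarray, eps: float) -> np.ndarray:
    """One batch update mu <- mu + eps * sum_i (x_i - mu), with S1 held fixed."""
    return mu + eps * (points - mu).sum(axis=0)

mu_after = batch_step(mu1, S1, eps=1.0 / len(S1))
centroid = S1.mean(axis=0)
print(mu_after, centroid)
```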
Q4. [10 pts] Kernels
(1) [2 pts] What is the primary motivation for using the kernel trick in machine learning algorithms?
If we want to map sample points to a very high-dimensional feature space, the kernel trick can save us from
having to compute those features explicitly, thereby saving a lot of time.
(Alternative solution: the kernel trick enables the use of infinite-dimensional feature spaces.)
(2) [4 pts] Prove that for every design matrix X ∈ ℝ^{n×d}, the corresponding kernel matrix is positive semidefinite.
For every vector z ∈ ℝ^n,
zᵀKz = zᵀXXᵀz = |Xᵀz|²,
which is clearly nonnegative.
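A small numerical check of this identity, as a sketch only: the random design matrix, the vector z, and the seed are arbitrary assumptions, and K is taken to be the linear kernel matrix XXᵀ used in the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))        # arbitrary design matrix, n = 5, d = 3
K = X @ X.T                        # linear kernel matrix, n x n

z = rng.normal(size=5)
quad_form = z @ K @ z              # z^T K z
sq_norm = np.sum((X.T @ z) ** 2)   # |X^T z|^2

# Smallest eigenvalue of K; nonnegative up to floating-point rounding.
min_eig = np.linalg.eigvalsh(K).min()
print(quad_form, sq_norm, min_eig)
```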
(3) [2 pts] Suppose that a regression algorithm contains the following line of code.
w ← w + XᵀMXXᵀu
Here, X ∈ ℝ^{n×d} is the design matrix, w ∈ ℝ^d is the weight vector, M ∈ ℝ^{n×n} is a matrix unrelated to X,
and u ∈ ℝ^n is a vector unrelated to X. We want to derive a dual version of the algorithm in which we express
the weights w as a linear combination of samples X_i (rows of X) and a dual weight vector a contains the
coefficients of that linear combination. Rewrite the line of code in its dual form so that it updates a correctly
(and so that w does not appear).
a ← a + MXXᵀu
(4) [2 pts] Can this line of code for updating a be kernelized? If so, show how. If not, explain why.
Yes:
a ← a + MKu
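To see that the dual update tracks the primal one, the sketch below draws arbitrary X, M, u and a, sets w = Xᵀa, applies both updates, and compares the results. All values and the seed are illustrative assumptions; K = XXᵀ is the linear kernel matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 6, 3
X = rng.normal(size=(n, d))   # design matrix
M = rng.normal(size=(n, n))   # arbitrary matrix unrelated to X
u = rng.normal(size=n)        # arbitrary vector unrelated to X

a = rng.normal(size=n)        # dual weights
w = X.T @ a                   # primal weights implied by the dual weights

K = X @ X.T                   # kernel matrix; the dual update needs only K

w_new = w + X.T @ M @ X @ X.T @ u   # primal update from the question
a_new = a + M @ K @ u               # kernelized dual update

# The relation w = X^T a should still hold after both updates.
print(np.allclose(w_new, X.T @ a_new))
```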
Q5. [12 pts] Let’s PCA
You are given a design matrix $X = \begin{bmatrix} 6 & -4 \\ -3 & 5 \\ -2 & 6 \\ 7 & -3 \end{bmatrix}$. Let’s use PCA to reduce the dimension from 2 to 1.
(1) [6 pts] Compute the covariance matrix for the sample points. (Warning: Observe that X is not centered.)
Then compute the unit eigenvectors, and the corresponding eigenvalues, of the covariance matrix. Hint: If
you graph the points, you can probably guess the eigenvectors (then verify that they really are eigenvectors).
The covariance matrix (computed from the centered X) is $X^\top X = \begin{bmatrix} 82 & -80 \\ -80 & 82 \end{bmatrix}$.
Its unit eigenvectors are $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$ with eigenvalue 2 and $\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}$ with eigenvalue 162. (Note: either eigenvector
can be replaced with its negation.)
(2) [3 pts] Suppose we use PCA to project the sample points onto a one-dimensional space. What one-dimensional
subspace are we projecting onto? For each of the four sample points in X (not the centered version of X!),
write the coordinate (in principal coordinate space, not in $\mathbb{R}^2$) that the point is projected to.
We are projecting onto the subspace spanned by $[1/\sqrt{2},\ -1/\sqrt{2}]^\top$. (Equivalently, onto the space spanned by $[1,\ -1]^\top$. Equivalently, onto the line x + y = 0.) The projections are $(6, -4) \to 10/\sqrt{2}$, $(-3, 5) \to -8/\sqrt{2}$, $(-2, 6) \to -8/\sqrt{2}$, $(7, -3) \to 10/\sqrt{2}$.
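The PCA arithmetic in parts (1) and (2) can be reproduced in a few lines. This is a minimal sketch in plain Python (no linear algebra library); the covariance matrix is left unnormalized, as in the solution, and the helper names `apply` and `dot` are made up for illustration.

```python
import math

X = [(6, -4), (-3, 5), (-2, 6), (7, -3)]
n = len(X)

# Center the points: the sample mean is (2, 1).
mean = (sum(p[0] for p in X) / n, sum(p[1] for p in X) / n)
Xc = [(p[0] - mean[0], p[1] - mean[1]) for p in X]

# Unnormalized covariance (scatter) matrix of the centered points.
sxx = sum(x * x for x, _ in Xc)
sxy = sum(x * y for x, y in Xc)
syy = sum(y * y for _, y in Xc)
cov = [[sxx, sxy], [sxy, syy]]

def apply(C, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (C[0][0] * v[0] + C[0][1] * v[1], C[1][0] * v[0] + C[1][1] * v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Candidate unit eigenvectors, guessed from a plot of the points.
s = 1 / math.sqrt(2)
v_small = (s, s)
v_large = (s, -s)

# For a unit eigenvector v, v^T C v equals its eigenvalue.
lam_small = dot(v_small, apply(cov, v_small))
lam_large = dot(v_large, apply(cov, v_large))

# Principal coordinates of the original (uncentered) points.
projections = [dot(p, v_large) for p in X]
```

The principal direction is the eigenvector with the larger eigenvalue, so the projections use `v_large`.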
(3) [3 pts] Given a design matrix X that is taller than it is wide, prove that every right singular vector of X with
singular value σ is an eigenvector of the covariance matrix with eigenvalue $\sigma^2$.
If v is a right singular vector of X, then there is a singular value decomposition $X = U D V^\top$ such that v is a column
of V. Here each of U and V has orthonormal columns, V is square, and D is square and diagonal. The covariance
matrix is $X^\top X = V D U^\top U D V^\top = V D^2 V^\top$. This is an eigendecomposition of $X^\top X$, so each singular vector in V
with singular value σ is an eigenvector of $X^\top X$ with eigenvalue $\sigma^2$.
Q6. [10 pts] Trees
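[Figure: two depictions of the same k-d tree over 17 numbered sample points. Left: the plane partitioned into rectangular boxes, with the query point marked by a bold X. Right: the tree itself.]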
(1) [5 pts] Above, we have two depictions of the same k-d tree, which we have built to solve nearest neighbor
queries. Each node of the tree at right represents a rectangular box at left, and also stores one of the sample
points that lie inside that box. (The root node represents the whole plane R2 .) If a treenode stores sample point
i, then the line passing through point i (in the diagram at left) determines which boxes the child treenodes
represent.
Simulate running an exact 1-nearest neighbor query, where the bold X is the query point. Recall that the query
algorithm visits the treenodes in a smart order, and keeps track of the nearest point it has seen so far.
• Write down the numbers of all the sample points that serve as the “nearest point seen so far” sometime
while the query algorithm is running, in the order they are encountered.
• Circle all the subtrees in the k-d tree at upper right that are never visited during this query. (This is why
k-d tree search is usually faster than exhaustive search.)
(2) [5 pts] We are building a decision tree for a 2-class classification problem. We have n training points, each having
d real-valued features. At each node of the tree, we try every possible univariate split (i.e. for each feature, we
try every possible splitting value for that feature) and choose the split that maximizes the information gain.
Explain why it is possible to build the tree in O(ndh) time, where h is the depth of the tree’s deepest node.
Your explanation should include an analysis of the time to choose one node’s split. Assume that we can radix
sort real numbers in linear time.
Consider choosing the split at a node whose box contains n′ sample points. For each of the d features, we can radix sort
the sample points by that feature in O(n′) time, or O(n′d) time for all d features. For one feature, we can compute the entropy
for the first split (separating the first sample in the sorted list from the others) in O(n′) time, then walk through the list and
update the entropy for each successive split in O(1) time, for a total of O(n′) time per feature. So it takes O(n′d) time
overall to choose a split.
Each sample point participates in at most h treenodes, so each sample point contributes at most O(dh) to the running
time, for a total running time of O(ndh).
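The constant-time update per candidate split is the key step in the argument above. Here is a minimal sketch of the sweep for a two-class problem with 0/1 labels; the function names are hypothetical, and Python's comparison sort stands in for the linear-time radix sort the question assumes.

```python
import math

def entropy(pos, neg):
    """Binary entropy (in bits) of a node holding pos/neg class counts."""
    total = pos + neg
    h = 0.0
    for c in (pos, neg):
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h

def best_split_one_feature(values, labels):
    """Return (threshold, information gain) of the best split on one feature."""
    # Comparison sort for simplicity; the O(n') bound assumes radix sort.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(values)
    total_pos = sum(labels)
    total_neg = n - total_pos
    parent = entropy(total_pos, total_neg)

    left_pos = left_neg = 0
    best = (None, 0.0)
    for rank in range(n - 1):
        i = order[rank]
        # O(1) update: move one point from the right side to the left side.
        if labels[i]:
            left_pos += 1
        else:
            left_neg += 1
        nxt = values[order[rank + 1]]
        if values[i] == nxt:
            continue  # no threshold separates equal feature values
        n_left = rank + 1
        n_right = n - n_left
        child = ((n_left / n) * entropy(left_pos, left_neg)
                 + (n_right / n) * entropy(total_pos - left_pos,
                                           total_neg - left_neg))
        gain = parent - child
        if gain > best[1]:
            best = ((values[i] + nxt) / 2, gain)
    return best

def best_split(points, labels):
    """Try every feature: one sort plus one linear sweep per feature."""
    best = (None, None, 0.0)  # (feature index, threshold, gain)
    for j in range(len(points[0])):
        thr, gain = best_split_one_feature([p[j] for p in points], labels)
        if thr is not None and gain > best[2]:
            best = (j, thr, gain)
    return best

# Toy data: feature 0 separates the classes perfectly, feature 1 does not.
points = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
labels = [0, 0, 1, 1]
chosen = best_split(points, labels)
```

Only the class counts on each side are maintained during the sweep, so each candidate threshold costs a constant number of arithmetic operations.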
Q7. [10 pts] Self-Driving Cars and Backpropagation
You want to train a neural network to drive a car. Your training data consists of grayscale 64 × 64 pixel images. The
training labels include the human driver’s steering wheel angle in degrees and the human driver’s speed in miles per
hour. Your neural network consists of an input layer with 64 × 64 = 4,096 units, a hidden layer with 2,048 units,
and an output layer with 2 units (one for steering angle, one for speed). You use the ReLU activation function for
the hidden units and no activation function for the outputs (or inputs).
(1) [2 pts] Calculate the number of parameters (weights) in this network. You can leave your answer as an
expression. Be sure to account for the bias terms.
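One way to count the parameters in part (1) is sketched below, assuming fully connected layers with one bias per hidden unit and per output unit (consistent with the 1 components appended to x and h in the notation of part (2)). The variable names are made up for illustration.

```python
# Parameter count for a 4096 -> 2048 -> 2 fully connected network with biases.
n_in, n_hidden, n_out = 64 * 64, 2048, 2

hidden_params = (n_in + 1) * n_hidden    # weights into hidden units, plus one bias each
output_params = (n_hidden + 1) * n_out   # weights into output units, plus one bias each
total_params = hidden_params + output_params

print(total_params)
```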
(2) [3 pts] You train your network with the cost function $J = \frac{1}{2} |y - z|^2$. Use the following notation.
• x is a training image (input) vector with a 1 component appended to the end, y is a training label (input)
vector, and z is the output vector. All vectors are column vectors.
• $r(\gamma) = \max\{0, \gamma\}$ is the ReLU activation function, $r'(\gamma)$ is its derivative (1 if γ > 0, 0 otherwise), and
r(v) is r(·) applied component-wise to a vector.
• g is the vector of hidden unit values before the ReLU activation functions are applied, and h = r(g) is
the vector of hidden unit values after they are applied (but we append a 1 component to the end of h).
• V is the weight matrix mapping the input layer to the hidden layer; g = V x.
• W is the weight matrix mapping the hidden layer to the output layer; z = W h.
Derive $\partial J / \partial W_{ij}$.
$\frac{\partial J}{\partial W_{ij}} = (z - y)^\top \frac{\partial z}{\partial W_{ij}} = (z_i - y_i)\, h_j$
(3) [1 pt] Write ∂J/∂W as an outer product of two vectors. ∂J/∂W is a matrix with the same dimensions as W;
it’s just like a gradient, except that W and ∂J/∂W are matrices rather than vectors.
$\frac{\partial J}{\partial W} = (z - y)\, h^\top$
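$\frac{\partial J}{\partial V_{ij}} = (z - y)^\top \frac{\partial z}{\partial V_{ij}}$
$= (z - y)^\top W \frac{\partial h}{\partial V_{ij}}$
$= (z - y)^\top W\, [0, \ldots, r'(g_i)\, x_j, \ldots, 0]^\top$
$= ((z - y)^\top W)_i \; r'(g_i) \; x_j$.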
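The two gradient formulas can be checked against finite differences. The sketch below uses a tiny hypothetical network (2 inputs plus the appended 1, 3 hidden units, 2 outputs) with made-up weights; it follows the notation above, including the 1 appended to h, so the last column of W multiplies the bias component and is skipped when backpropagating to V.

```python
def forward(V, W, x):
    g = [sum(v * xj for v, xj in zip(row, x)) for row in V]
    h = [max(0.0, gi) for gi in g] + [1.0]           # ReLU, then append the 1 component
    z = [sum(w * hj for w, hj in zip(row, h)) for row in W]
    return g, h, z

def cost(V, W, x, y):
    _, _, z = forward(V, W, x)
    return 0.5 * sum((yi - zi) ** 2 for yi, zi in zip(y, z))

def gradients(V, W, x, y):
    g, h, z = forward(V, W, x)
    delta = [zi - yi for zi, yi in zip(z, y)]        # z - y
    dW = [[di * hj for hj in h] for di in delta]     # outer product (z - y) h^T
    # back[i] = ((z - y)^T W)_i for hidden unit i (bias column of W excluded)
    back = [sum(delta[k] * W[k][i] for k in range(len(W)))
            for i in range(len(V))]
    dV = [[back[i] * (1.0 if g[i] > 0 else 0.0) * xj for xj in x]
          for i in range(len(V))]
    return dW, dV

# Illustrative values only.
x = [0.5, -1.0, 1.0]                                 # last component is the appended 1
y = [1.0, -0.5]
V = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.2], [-0.5, 0.6, 0.9]]
W = [[0.3, -0.8, 0.5, 0.1], [-0.6, 0.2, 0.4, -0.3]]

dW, dV = gradients(V, W, x, y)

def numeric(param, i, j, eps=1e-6):
    """Central finite difference of the cost with respect to param[i][j]."""
    old = param[i][j]
    param[i][j] = old + eps
    up = cost(V, W, x, y)
    param[i][j] = old - eps
    down = cost(V, W, x, y)
    param[i][j] = old
    return (up - down) / (2 * eps)

max_err_W = max(abs(dW[i][j] - numeric(W, i, j))
                for i in range(len(W)) for j in range(len(W[0])))
max_err_V = max(abs(dV[i][j] - numeric(V, i, j))
                for i in range(len(V)) for j in range(len(V[0])))
```

The finite-difference check is only meaningful away from the ReLU kink at g = 0, which is why the example weights keep every pre-activation clearly nonzero.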
10-601 Machine Learning, Midterm Exam
Good luck!
10-601 Machine Learning Midterm Exam October 18, 2012
Solution:
False
(b) [1 point] When a decision tree is grown to full depth, it is more likely to fit the noise in the data.
True False
Solution:
True
(c) [1 point] When the hypothesis space is richer, overfitting is more likely.
True False
Solution:
True
(d) [1 point] When the feature space is larger, overfitting is more likely.
True False
Solution:
True
(e) [1 point] We can use gradient descent to learn a Gaussian Mixture Model.
True False
Solution:
True
Short Questions.
(f) [3 points] Can you represent the following boolean function with a single logistic threshold unit
(i.e., a single unit from a neural network)? If yes, show the weights. If not, explain why not in 1-2
sentences.
A B f(A,B)
1 1 0
0 0 0
1 0 1
0 1 0
Solution:
Yes, you can represent this function with a single logistic threshold unit, since it is linearly
separable. Here is one example.
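One possible choice of weights is sketched below. The values w0 = -0.5, wA = 1, wB = -1 are an assumption chosen for illustration (any weights that put only the input (1, 0) on the positive side of the boundary would do), and the unit is taken to output 1 when its logistic activation exceeds 0.5.

```python
import math

def logistic_unit(a, b, w0=-0.5, wa=1.0, wb=-1.0):
    """Return 1 when sigmoid(w0 + wa*a + wb*b) exceeds 0.5, else 0."""
    activation = 1.0 / (1.0 + math.exp(-(w0 + wa * a + wb * b)))
    return 1 if activation > 0.5 else 0

# Truth table from the question: f(A, B) is 1 only for A = 1, B = 0.
truth_table = {(1, 1): 0, (0, 0): 0, (1, 0): 1, (0, 1): 0}
outputs = {inputs: logistic_unit(*inputs) for inputs in truth_table}
```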
(g) [3 points] Suppose we clustered a set of N data points using two different clustering algorithms:
k-means and Gaussian mixtures. In both cases we obtained 5 clusters and in both cases the centers
of the clusters are exactly the same. Can 3 points that are assigned to different clusters in the k-
means solution be assigned to the same cluster in the Gaussian mixture solution? If no, explain. If
so, sketch an example or explain in 1-2 sentences.
Solution:
Yes. k-means assigns each data point to a unique cluster based on its distance to the cluster
center. Gaussian mixture clustering gives a soft (probabilistic) assignment to each data point.
Therefore, even if the cluster centers are identical in both methods, if the Gaussian mixture
components have large variances (components are spread around their center), points on the edges
between clusters may be given different assignments in the Gaussian mixture solution.
Solution:
Lower variance
(i) [3 points] As the number of training examples goes to infinity, your model trained on that data
will have:
A. Lower bias B. Higher bias C. Same bias
Solution:
Same bias
(j) [3 points] Suppose you are given an EM algorithm that finds maximum likelihood estimates for a
model with latent variables. You are asked to modify the algorithm so that it finds MAP estimates
instead. Which step or steps do you need to modify:
A. Expectation B. Maximization C. No modification necessary D. Both
Solution:
Maximization
(a) [3 points] We have decided to use a neural network to solve this problem. We have two choices:
either to train a separate neural network for each of the diseases or to train a single neural network
with one output neuron for each disease, but with a shared hidden layer. Which method do you
prefer? Justify your answer.
Solution:
1- Neural network with a shared hidden layer can capture dependencies between diseases.
It can be shown that in some cases, when there is a dependency between the output nodes,
having a shared node in the hidden layer can improve the accuracy.
2- If there is no dependency between diseases (output neurons), then we would prefer to have
a separate neural network for each disease.
(b) [3 points] Some patient features are expensive to collect (e.g., brain scans) whereas others are not
(e.g., temperature). Therefore, we have decided to first ask our classification algorithm to predict
whether a patient has a disease, and if the classifier is 80% confident that the patient has a disease,
then we will do additional examinations to collect additional patient features. In this case, which
classification methods do you recommend: neural networks, decision tree, or naive Bayes? Justify
your answer in one or two sentences.
Solution:
We expect students to explain how each of these learning techniques can be used to output
a confidence value (any of these techniques can be modified to provide a confidence value).
In addition, Naive Bayes is preferable to the other methods since we can still use it for classification
when the values of some of the features are unknown.
We gave partial credit to those who mentioned neural networks because of their non-linear
decision boundary, or decision trees since they give an interpretable answer.
(c) Assume that we use a logistic regression learning algorithm to train a classifier for each disease.
The classifier is trained to obtain MAP estimates for the logistic regression weights W . Our MAP
estimator optimizes the objective
$W \leftarrow \arg\max_W \ln \left[ P(W) \prod_l P(Y^l \mid X^l, W) \right]$
where l refers to the lth training example. We adopt a Gaussian prior with zero mean for the
weights $W = \langle w_1 \ldots w_n \rangle$, making the above objective equivalent to:
$W \leftarrow \arg\max_W \; -C \sum_i w_i^2 + \sum_l \ln P(Y^l \mid X^l, W)$
Note C here is a constant, and we re-run our learning algorithm with different values of C. Please
answer each of these true/false questions, and explain/justify your answer in no more than 2
sentences.
i. [2 points] The average log-probability of the training data can never increase as we increase C.
True False
Solution:
True. As we increase C, we give more weight to constraining the predictor, which makes it
less flexible in fitting the training data (an over-constrained predictor is unable to fit the
training data).
ii. [2 points] If we start with C = 0, the average log-probability of test data will likely decrease as
we increase C.
True False
Solution:
False. As we increase the value of C (starting from C = 0), we keep our predictor from
overfitting the training data, and thus we expect the accuracy of our predictor to increase on the
test data.
iii. [2 points] If we start with a very large value of C, the average log-probability of test data can
never decrease as we increase C.
True False
Solution:
False. Similar to the previous parts, if we over-constrain the predictor (by choosing a very large
value of C), then it won’t be able to fit the training data, which makes it perform worse
on the test data.
[Figure 1: panels (a) and (b), training data plotted over features X1 and X2.]
i. [2 points] Figure 1(a) illustrates a subset of our training data when we have only two features:
X1 and X2 . Draw the decision boundary for the logistic regression that we explained in part
(c).
Solution:
The decision boundary for logistic regression is linear. One candidate solution which
classifies all the data correctly is shown in Figure 1. We will accept other possible solutions
since the decision boundary depends on the value of C (it is possible for the trained classifier
to misclassify a few of the training data if we choose a large value of C).
ii. [3 points] Now assume that we add a new data point as it is shown in Figure 1(b). How does
it change the decision boundary that you drew in Figure 1(a)? Answer this by drawing both
the old and the new boundary.
Solution:
We expect the decision boundary to move a little toward the new data point.
(e) [3 points] Assume that we record information of all the patients who visit UPMC every day. However,
for many of these patients we don’t know if they have any of the diseases. Can we still improve
the accuracy of our classifier using these data? If yes, explain how, and if no, justify your answer.
Solution:
Yes, by using EM. In the class, we showed how EM can improve the accuracy of our classifier
using both labeled and unlabeled data. For more details, please look at
http://www.cs.cmu.edu/~tom/10601_fall2012/slides/GrMod3_10_9_2012.pdf, page 6.
Question 3. Regression
Consider real-valued variables X and Y. The Y variable is generated, conditional on X, from the
following process:
$\epsilon \sim N(0, \sigma^2)$
$Y = aX + \epsilon$
where every ε is an independent variable, called a noise term, which is drawn from a Gaussian
distribution with mean 0 and standard deviation σ. This is a one-feature linear regression model, where a
is the only weight parameter. The conditional probability of Y has distribution $p(Y \mid X, a) \sim N(aX, \sigma^2)$,
so it can be written as
$p(Y \mid X, a) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} (Y - aX)^2 \right)$
The following questions are all about this model.
MLE estimation
(a) [3 points] Assume we have a training dataset of n pairs (Xi , Yi ) for i = 1..n, and σ is known.
Which ones of the following equations correctly represent the maximum likelihood problem for
estimating a? Say yes or no to each one. More than one of them should have the answer “yes.”
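[Solution: no] $\arg\max_a \sum_i \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} (Y_i - aX_i)^2 \right)$
[Solution: yes] $\arg\max_a \prod_i \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} (Y_i - aX_i)^2 \right)$
[Solution: no] $\arg\max_a \sum_i \exp\left( -\frac{1}{2\sigma^2} (Y_i - aX_i)^2 \right)$
[Solution: yes] $\arg\max_a \prod_i \exp\left( -\frac{1}{2\sigma^2} (Y_i - aX_i)^2 \right)$
[Solution: no] $\arg\max_a \frac{1}{2} \sum_i (Y_i - aX_i)^2$
[Solution: yes] $\arg\min_a \frac{1}{2} \sum_i (Y_i - aX_i)^2$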
(b) [7 points] Derive the maximum likelihood estimate of the parameter a in terms of the training
example Xi ’s and Yi ’s. We recommend you start with the simplest form of the problem you found
above.
Solution:
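Use $F(a) = \frac{1}{2} \sum_i (Y_i - aX_i)^2$ and minimize F. Then
$0 = \frac{\partial}{\partial a} \left[ \frac{1}{2} \sum_i (Y_i - aX_i)^2 \right]$ (2)
$= \sum_i (Y_i - aX_i)(-X_i)$ (3)
$= \sum_i \left( aX_i^2 - X_i Y_i \right)$ (4)
$a = \frac{\sum_i X_i Y_i}{\sum_i X_i^2}$ (5)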
Partial credit: 1 point for writing a correct objective, 1 point for taking the derivative, 1 point
for getting the chain rule correct, 1 point for a reasonable attempt at solving for a. 6 points for
correct up to a sign error.
Many people got $\sum y_i / \sum x_i$ as the answer, by erroneously cancelling $x_i$ on top and bottom.
4 points for this answer when it is clear this cancelling caused the problem. If they explicitly
derived $\sum x_i y_i / \sum x_i^2$ along the way, 6 points. If it is completely unclear where $\sum y_i / \sum x_i$
came from, sometimes worth only 3 points (based on the partial credit rules above).
Some people wrote a gradient descent rule. We intended to ask for a closed-form maximum
likelihood estimate, not an algorithm to get it. (Yes, it is true that lectures never said there
exists a closed-form solution for linear regression MLE. But there is. In fact, there is a closed-
form solution even for multiple features, via linear algebra.) But we gave 4 points for getting
the rule correct; 3 points for correct with a sign error.
For gradient descent/ascent, signs are tricky. If you are using the log-likelihood, thus
maximization, you want gradient ascent, and thus add the gradient. If instead you’re doing the
minimization problem, and using gradient descent, you need to subtract the gradient. Either way,
it comes out to $a \leftarrow a + \eta \sum_i (y_i - a x_i) x_i$. Interpretation: $\sum_i (y_i - a x_i) x_i$ is the correlation of
the data against the residual. In the case of positive x, y, if the data still correlates with the residual,
that means predictions are too low, so you want to increase a.
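As a quick numerical illustration, the closed-form estimate and the gradient rule can be compared on a small data set. This is a minimal sketch: the data, the step size η = 0.05, and the iteration count are made up, and the step size must be small relative to 1/Σx² for the iteration to converge.

```python
def mle_slope(xs, ys):
    """Closed-form MLE: a = sum(x*y) / sum(x^2)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def gradient_ascent_slope(xs, ys, eta=0.05, steps=200, a=0.0):
    """Iterate a <- a + eta * sum((y - a*x) * x)."""
    for _ in range(steps):
        a += eta * sum((y - a * x) * x for x, y in zip(xs, ys))
    return a

# Illustrative data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]

a_closed = mle_slope(xs, ys)
a_iter = gradient_ascent_slope(xs, ys)
```

With these numbers each step shrinks the distance to the closed-form answer by a factor of 1 − η Σx² = 0.3, so the two estimates agree to machine precision long before 200 steps.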
Here is a lovely book chapter by Tufte (1974) on one-feature linear regression:
http://www.edwardtufte.com/tufte/dapp/chapter3.html
MAP estimation
Let’s put a prior on a. Assume $a \sim N(0, \lambda^2)$, so
$p(a \mid \lambda) = \frac{1}{\sqrt{2\pi}\,\lambda} \exp\left( -\frac{1}{2\lambda^2} a^2 \right)$
The posterior probability of a is
(c) [3 points] Under the following conditions, how do the prior and conditional likelihood curves
change? Do $a_{MLE}$ and $a_{MAP}$ become closer together, or further apart?
(d) [7 points] Assume σ = 1, and a fixed prior parameter λ. Solve for the MAP estimate of a,
$\arg\max_a \left[ \ln p(Y_1..Y_n \mid X_1..X_n, a) + \ln p(a \mid \lambda) \right]$
Solution:
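$\frac{\partial}{\partial a} \left[ \log p(Y \mid X, a) + \log p(a \mid \lambda) \right] = \frac{\partial \ell}{\partial a} + \frac{\partial \log p(a \mid \lambda)}{\partial a}$ (6)
To stay sane, let’s look at it as maximization, not minimization. (It’s easy to get signs wrong by
trying to use the squared error minimization form from before.) Since σ = 1, the log-likelihood
and its derivative are
$\ell(a) = \log \left[ \prod_i \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} (Y_i - aX_i)^2 \right) \right]$ (7)
$\ell(a) = -\log Z - \frac{1}{2} \sum_i (Y_i - aX_i)^2$ (8)
$\frac{\partial \ell}{\partial a} = -\sum_i (Y_i - aX_i)(-X_i)$ (9)
$= \sum_i (Y_i - aX_i) X_i$ (10)
$= \sum_i \left( X_i Y_i - aX_i^2 \right)$ (11)
$\frac{\partial \log p(a)}{\partial a} = \frac{\partial}{\partial a} \left[ -\log(\sqrt{2\pi}\,\lambda) - \frac{1}{2\lambda^2} a^2 \right]$ (12)
$= -\frac{a}{\lambda^2}$ (13)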
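The full partial is the sum of that and the log-likelihood which we did before.
$0 = \frac{\partial \ell}{\partial a} + \frac{\partial \log p(a)}{\partial a}$ (14)
$0 = \left( \sum_i X_i Y_i - aX_i^2 \right) - \frac{a}{\lambda^2}$ (15)
$a = \frac{\sum_i X_i Y_i}{\left( \sum_i X_i^2 \right) + 1/\lambda^2}$ (16)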
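The MAP formula can be sanity-checked by confirming that it maximizes the log posterior and that it approaches the MLE as the prior becomes flat. A minimal sketch, assuming σ = 1 as in the question; the data and the values of λ are made up for illustration.

```python
def map_slope(xs, ys, lam):
    """MAP estimate with sigma = 1 and prior a ~ N(0, lam^2)."""
    return (sum(x * y for x, y in zip(xs, ys))
            / (sum(x * x for x in xs) + 1.0 / lam ** 2))

def log_posterior(a, xs, ys, lam):
    """Log posterior up to an additive constant (sigma = 1)."""
    log_lik = -0.5 * sum((y - a * x) ** 2 for x, y in zip(xs, ys))
    log_prior = -a * a / (2 * lam ** 2)
    return log_lik + log_prior

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]

a_map = map_slope(xs, ys, lam=1.0)        # 27.9 / (14 + 1)
a_flat = map_slope(xs, ys, lam=1e6)       # nearly flat prior, close to the MLE 27.9 / 14
```

A small λ (a tight prior) pulls the estimate toward 0, while a large λ leaves it close to the maximum likelihood estimate.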
Partial credit: 1 point for writing out the log posterior, and/or doing some derivative. 1 point
for getting the derivative correct.
For full solution: deduct a point for a sign error. (There are many potential places for flipping
signs.) Deduct a point for having $n/\lambda^2$: this results from wrapping a sum around the log-prior.
(Only the log-likelihood has a $\sum_i$ around it, since it’s the probability of drawing each data point.
The parameter a is drawn only once.)
Some people didn’t set σ = 1 and kept σ to the end. We simply gave credit if substituting
σ = 1 gave the right answer; a few people may have derived the wrong answer but we didn’t
carefully check all these cases.
People who did gradient descent rules were graded similarly as before: 4 points if correct,
deduct one for sign error.
[Figure: Bayes net over the variables X11, X12, X13, X21, X22, X31, X32, X33.]
(a) [2 points] From the rule we covered in lecture, is there any variable(s) conditionally independent
of X33 given X11 and X12 ? If so, list all.
Solution:
X21
(b) [2 points] From the rule we covered in lecture, is there any variable(s) conditionally independent
of X33 given X22 ? If so, list all.
Solution:
Everything but X22 , X33 .
(c) [3 points] Write the joint probability P(X11, X12, X13, X21, X22, X31, X32, X33) factored according
to the Bayes net. How many parameters are necessary to define the conditional probability
distributions for this Bayesian network?
Solution:
P(X11, X12, X13, X21, X22, X31, X32, X33)
= P(X11) P(X12) P(X13) P(X21 | X11, X12) P(X22 | X13) P(X31 | X21, X22) P(X32 | X21, X22) P(X33 | X22)
19 parameters are necessary: each Boolean variable needs one parameter per configuration of its parents, giving 1 + 1 + 1 + 4 + 2 + 4 + 4 + 2 = 19.
(d) [2 points] Write an expression for P (X13 = 0, X22 = 1, X33 = 0) in terms of the conditional proba-
bility distributions given in your answer to part (c). Show your work.
Solution:
P (X13 = 0)P (X22 = 1|X13 = 0)P (X33 = 0|X22 = 1)
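The expression in part (d) can be verified by brute force: summing the factored joint over the five unobserved variables should leave exactly the three factors above. The sketch below assumes Boolean variables and uses made-up conditional probability table entries; only the graph structure comes from the factorization in part (c).

```python
from itertools import product

# Hypothetical CPT entries: probability that each variable equals 1
# given its parents' values. The numbers are made up for illustration.
p_x11, p_x12, p_x13 = 0.3, 0.6, 0.8
p_x21 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9}   # given (X11, X12)
p_x22 = {0: 0.2, 1: 0.75}                                      # given X13
p_x31 = {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.6, (1, 1): 0.8}   # given (X21, X22)
p_x32 = {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.9, (1, 1): 0.1}   # given (X21, X22)
p_x33 = {0: 0.35, 1: 0.65}                                     # given X22

def bern(p_one, value):
    """P(V = value) for a Boolean V with P(V = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(x11, x12, x13, x21, x22, x31, x32, x33):
    """Joint probability, factored according to the Bayes net."""
    return (bern(p_x11, x11) * bern(p_x12, x12) * bern(p_x13, x13)
            * bern(p_x21[(x11, x12)], x21) * bern(p_x22[x13], x22)
            * bern(p_x31[(x21, x22)], x31) * bern(p_x32[(x21, x22)], x32)
            * bern(p_x33[x22], x33))

# Brute force: sum the joint over the five unobserved variables.
brute = sum(joint(x11, x12, 0, x21, 1, x31, x32, 0)
            for x11, x12, x21, x31, x32 in product((0, 1), repeat=5))

# Closed form: P(X13 = 0) P(X22 = 1 | X13 = 0) P(X33 = 0 | X22 = 1).
closed = bern(p_x13, 0) * bern(p_x22[0], 1) * bern(p_x33[1], 0)
```

The other factors disappear because each of them sums to 1 once its own variable is marginalized out.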
(e) [3 points] From your answer to (d), can you say X13 and X33 are independent? Why?
Solution:
No. Conditional independence doesn’t imply marginal independence.
(f) [3 points] Can you say the same thing when X22 = 1? In other words, can you say X13 and X33
are independent given X22 = 1? Why?
Solution:
Yes. X22 is the only parent of X33 and X13 is a nondescendant of X33 , so by the rule in the
lecture we can say they are independent given X22 = 1
(g) [2 points] Replace X21 and X22 by a single new variable X2 whose value is a pair of boolean
values, defined as: X2 = ⟨X21, X22⟩. Draw the new Bayes net B′ after the change.
Solution:
X2 = (X21 , X22 )
(h) [3 points] Do all the conditional independences in B hold in the new network B′? If not, write one
that is true in B but not in B′. Consider only the variables present in both B and B′.
Solution:
No. For instance, X32 is not conditionally independent of X33 given X22 anymore.
* Note: We noticed the problem description was a bit ambiguous, so we also accepted yes as a
correct answer.
(a) Consider the training set accuracy and test set accuracy curves plotted above, during decision tree
learning, as the number of nodes in the decision tree grows. This decision tree is being used to
learn a function f : X → Y , where training and test set examples are drawn independently at
random from an underlying distribution P (X), after which the trainer provides a noise-free label
Y . Note error = 1 - accuracy. Please answer each of these true/false questions, and explain/justify
your answer in 1 or 2 sentences.
i. [2 points] T or F: Training error at each point on this curve provides an unbiased estimate of
true error.
Solution:
False. Training error is an optimistically biased estimate of true error, because the hypoth-
esis was chosen based on its fit to the training data.
ii. [1 point] T or F: Test error at each point on this curve provides an unbiased estimate of true
error.
Solution:
True. The expected value of test error (taken over different draws of random test sets) is
equal to true error.
iii. [1 point] T or F: Training accuracy minus test accuracy provides an unbiased estimate of the
degree of overfitting.
Solution:
True. We defined overfitting as test error minus training error, which is equal to training
accuracy minus test accuracy.
iv. [1 point] T or F: Each time we draw a different test set from P (X) the test accuracy curve may
vary from what we see here.
Solution:
True. Of course each random draw from P (X) may vary from another draw.
v. [1 point] T or F: The variance in test accuracy will increase as we increase the number of test
examples.
Solution:
False. The variance in test accuracy will decrease as we increase the size of the test set.
Solution:
The tree with 10 nodes. This has the highest test accuracy of any of the trees, and hence
the highest expected true accuracy.
ii. [2 points] What is the amount of overfitting in the tree you selected?
Solution:
overfitting = training accuracy minus test accuracy = 0.77 - 0.74 = 0.03
Let us consider the above plot of training and test error from the perspective of agnostic PAC
bounds. Consider the agnostic PAC bound we discussed in class:
$m \geq \frac{1}{2\epsilon^2} \left( \ln |H| + \ln(1/\delta) \right)$
where ε is defined to be the difference between $error_{true}(h)$ and $error_{train}(h)$ for any hypothesis h
output by the learner.
iii. [2 points] State in one carefully worded sentence what the above PAC bound guarantees about
the two curves in our decision tree plot above.
Solution:
If we train on m examples drawn at random from P(X), then with probability (1 − δ) the
overfitting (difference between training and true accuracy) for each hypothesis in the plot
will be less than or equal to ε. Note that the true accuracy is the expected value of the test
accuracy, taken over different randomly drawn test sets.
iv. [2 points] Assume we used 200 training examples to produce the above decision tree plot.
If we wish to reduce the overfitting to half of what we observe there, how many training
examples would you suggest we use? Justify your answer in terms of the agnostic PAC bound,
in no more than two sentences.
Solution:
The bound shows that m grows as $\frac{1}{2\epsilon^2}$. Therefore if we wish to halve ε, it will suffice to
increase m by a factor of 4. We should use 200 × 4 = 800 training examples.
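The factor of 4 can be read directly off the bound. A minimal sketch follows; the hypothesis-space size |H| = 1000, δ = 0.05, and the two values of ε are made-up numbers, since the exam does not specify them.

```python
import math

def pac_bound(h_size, delta, eps):
    """Right-hand side of m >= (ln|H| + ln(1/delta)) / (2 eps^2)."""
    return (math.log(h_size) + math.log(1.0 / delta)) / (2.0 * eps ** 2)

def pac_sample_size(h_size, delta, eps):
    """Smallest integer m satisfying the bound."""
    return math.ceil(pac_bound(h_size, delta, eps))

# Hypothetical values for illustration only.
m_eps = pac_bound(1000, 0.05, 0.10)
m_half_eps = pac_bound(1000, 0.05, 0.05)
ratio = m_half_eps / m_eps      # halving eps multiplies the bound by 4
```

The ratio does not depend on |H| or δ, because both appear only in the numerator.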
v. [2 points] Give a one sentence explanation of why you are not certain that your recommended
number of training examples will reduce overfitting by exactly one half.
Solution:
There are several reasons, including the following. 1. Our PAC theory result gives a
bound, not an equality, so 800 examples might decrease overfitting by more than half. 2.
The "observed" overfitting is actually the test set accuracy, which is only an estimate of
true accuracy, so it may vary from true accuracy and our "observed" overfitting will vary
accordingly.
(c) You decide to estimate the probability θ that a particular coin will turn up heads, by flipping it
10 times. You notice that if you repeat this experiment, each time obtaining a new set of 10 coin flips,
you get different resulting estimates. You repeat the experiment N = 20 times, obtaining estimates
$\hat{\theta}^1, \hat{\theta}^2, \ldots, \hat{\theta}^{20}$. You calculate the variance in these estimates as
$var = \frac{1}{N} \sum_{i=1}^{N} (\hat{\theta}^i - \theta^{mean})^2$
Solution:
We should expect the MAP estimate to produce a smaller value for var, because using the
Beta prior is equivalent to adding in a fixed set of "hallucinated" training examples that
will not vary from experiment to experiment.
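A small simulation illustrates the effect. This is a sketch under stated assumptions: the true bias θ = 0.6 and the Beta(3, 3) prior are made up, and the MAP estimate is taken to be (heads + α − 1) / (flips + α + β − 2), which amounts to adding α − 1 hallucinated heads and β − 1 hallucinated tails to every experiment.

```python
import random

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(0)                      # fixed seed for repeatability
theta, flips, experiments = 0.6, 10, 20     # hypothetical true bias; 10 flips, N = 20
alpha = beta = 3                            # hypothetical Beta prior

mle_estimates, map_estimates = [], []
for _ in range(experiments):
    heads = sum(rng.random() < theta for _ in range(flips))
    mle_estimates.append(heads / flips)
    map_estimates.append((heads + alpha - 1) / (flips + alpha + beta - 2))

var_mle = variance(mle_estimates)
var_map = variance(map_estimates)
shrink = (flips / (flips + alpha + beta - 2)) ** 2   # exact ratio var_map / var_mle
```

Because the MAP estimate is an affine function of the MLE with slope flips / (flips + α + β − 2) < 1, its variance is smaller by exactly the square of that slope, whatever the random draws turn out to be.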
Sample questions for “Fundamentals of Machine Learning 2018”
Teacher: Mohammad Emtiyaz Khan
• In the final exam, no electronic devices are allowed except a calculator. Make
sure that your calculator is only a calculator and cannot be used for any other
purpose.
• For derivations, clearly explain your derivation step by step. In the final
exam you will be marked for steps as well as for the end result.
• We will denote the output data vector by y which is a vector that contains
all $y_n$, and the feature matrix by X which is a matrix containing features $x_n^T$
as rows. Also, $\tilde{x}_n = [1, x_n^T]^T$.
1 Multiple-Choice/Numerical Questions
1. Choose the options that are correct regarding machine learning (ML) and
artificial intelligence (AI),
Answer: (D)
1 1 1
Answer: 1
1 1 1
Answer: 2
12 8 −36
Answer: 2
Answer: C
(A) Linear in D.
(B) Polynomial in D.
(C) Exponential in D.
(D) Linear in N .
Answer: C,D
(B) It can be applied to non-continuous functions.
(C) It is easy to implement.
(D) It runs reasonably fast for multiple linear regression.
Answer: A,B,C.
Answer: A,C,D
11. Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?
(A) O(D)
(B) O(N )
(C) O(N D)
(D) O(ND²)
Answer: (A)
12. Let us say that we are fitting a one-parameter model to the data, i.e. $y_n \approx \beta_0$.
The average of $y_1, y_2, \ldots, y_N$ is 1. We start gradient descent at $\beta_0^{(0)} = 0$ and
set the step-size to 0.5. What is the value of $\beta_0$ after 3 iterations, i.e., the
value of $\beta_0^{(3)}$?
Answer: 0.875 (deviation 0.01)
13. Let us say that we are fitting a one-parameter model to the data, i.e. $y_n \approx \beta_0$.
The average of $y_1, y_2, \ldots, y_N$ is 1. We start gradient descent at $\beta_0^{(0)} = 10$ and
set the step-size to 0.5. What is the value of $\beta_0$ after 3 iterations, i.e., the
value of $\beta_0^{(3)}$?
Answer: 2.125 (deviation 0.01)
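Both answers can be reproduced by running the update by hand. The sketch below assumes the cost is the mean squared error with a factor of one half, L(β0) = (1/2N) Σ (y_n − β0)², whose gradient is β0 minus the average of the y_n; the questions above do not state the cost, so this is an assumption that happens to match the given answers.

```python
def gradient_descent_mean(y_mean, beta0, step, iterations):
    """Gradient descent on L(b) = (1/2N) * sum((y_n - b)^2); dL/db = b - mean(y)."""
    history = [beta0]
    for _ in range(iterations):
        beta0 = beta0 - step * (beta0 - y_mean)
        history.append(beta0)
    return history

from_zero = gradient_descent_mean(1.0, 0.0, 0.5, 3)
from_ten = gradient_descent_mean(1.0, 10.0, 0.5, 3)
```

With step size 0.5, each iteration halves the distance to the average, so the iterates are 0, 0.5, 0.75, 0.875 in the first case and 10, 5.5, 3.25, 2.125 in the second.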
14. Computational complexity of Gradient descent is,
(A) linear in D
(B) linear in N
(C) polynomial in D
(D) dependent on the number of iterations
Answer: C
15. Generalization error measures how well an algorithm performs on unseen data.
The test error obtained using cross-validation is an estimate of the general-
ization error. Is this estimate unbiased?
Answer: (No)
(A) linear in K
(B) quadratic in K
(C) cubic in K
(D) exponential in K
Answer: A
17. You observe the following while fitting a linear regression to the data: As
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you expect
it to be), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.
Answer: A
18. Adding more basis functions in a linear model... (pick the most probable
option)
(D) Doesn’t affect bias and variance
Answer: A
2 Multiple-output regression
Suppose we have N regression training-pairs, but instead of one output for each
input vector $x_n \in \mathbb{R}^D$, we now have 2 outputs $y_n = [y_{n1}, y_{n2}]$ where each $y_{n1}$ and
$y_{n2}$ are real numbers. For each output $y_{n1}$, we wish to fit a separate linear model:
where $\beta_1$ and $\beta_2$ are vectors of $\beta_{1d}$ and $\beta_{2d}$ respectively, for d = 0, 1, 2, . . . , D, and
$\tilde{x}_n^T = [1\ x_n^T]$.
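Our goal is to estimate $\beta_1$ and $\beta_2$ for which we choose to minimize the following
cost function:
$L(\beta_1, \beta_2) := \sum_{n=1}^{N} \left[ \frac{1}{2} \left( y_{n1} - \beta_1^T \tilde{x}_n \right)^2 + \frac{1}{2} \left( y_{n2} - \beta_2^T \tilde{x}_n \right)^2 \right] + \lambda_1 \sum_{d=0}^{D} \beta_{1d}^2 + \lambda_2 \sum_{d=0}^{D} \beta_{2d}^2$. (6)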
Answer:
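(A) $\frac{\partial L}{\partial \beta_1} := -\sum_{n=1}^{N} \left[ \left( y_{n1} - \beta_1^T \tilde{x}_n \right) \tilde{x}_n \right] + 2 \lambda_1 \beta_1$, same for $\beta_2$.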
(B) The number of parameters is equal to 30 and the number of data points is
equal to 40. It is good to regularize, but just a mild regularization will do
since the number of parameters is still less than number of data points.
(C) Yes, we expect this to be the case because, if the data points are i.i.d., then
we might need less regularization.
(D) Same as gradient descent (please put an exact number here for the final
exam).
3 Eigenvalues
Given a real-valued matrix X, show that all the non-zero eigenvalues of $XX^T$ and
$X^TX$ are the same.
Answer: To prove this, you can use the SVD $X = USV^T$. Then $XX^T = US^2U^T$
and $X^TX = VS^2V^T$. The non-zero eigenvalues are the same, although the
number of eigenvalues is different.
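The same fact also follows without the SVD. As a short alternative sketch (not the argument given in the answer above), take any eigenpair $(\lambda, v)$ of $X^TX$ with $\lambda \neq 0$; then $Xv$ is an eigenvector of $XX^T$ with the same eigenvalue:

```latex
X^T X v = \lambda v,\quad \lambda \neq 0
\;\Longrightarrow\;
X X^T (X v) = X (X^T X v) = \lambda\, (X v),
\qquad
\|X v\|^2 = v^T X^T X v = \lambda \|v\|^2 \neq 0 .
```

The second identity shows that $Xv$ is not the zero vector, so it is a genuine eigenvector. Swapping the roles of $X$ and $X^T$ gives the converse direction.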
Consider the following artificial neural network with the nonlinear transformation
znm = σ(anm ) (see figure below). Here, n is the data index and m is the index of
hidden units. There are two binary outputs yn1 and yn2 taking values in {0, 1}.
Suppose you have N = 200 data points but M = 200 hidden units for each layer.
What problem(s) are you likely to encounter when training such a network? How
would you solve the problem(s)?
Answer: Overfitting. There are multiple ways to tackle this problem as discussed
in the lecture.
1. Which of the following would be more appropriate to be replaced with question mark in the following figure?
a) Data Analysis
b) Data Science
c) Descriptive Analytics
d) None of the mentioned
Answer: b
Explanation: Data Science is a multidisciplinary field that involves the extraction of knowledge from large volumes of data that are
structured or unstructured.
Answer: a
Explanation: Accounting programs are prototypical examples of data processing applications.
Answer: d
Explanation: A data scientist is a job title for an employee or business intelligence (BI) consultant who excels at analyzing data,
particularly large amounts of data.
4. Which of the following is the most important language for Data Science?
a) Java
b) Ruby
c) R
d) None of the mentioned
Answer: c
Explanation: R is free software for statistical computing and analysis.
Answer: b
Explanation: Data formatting is the organization of information according to preset specifications.
6. Which of the following approach should be used to ask Data Analysis question?
a) Find only one solution for particular problem
b) Find out the question which is to be answered
c) Find out answer from dataset without asking question
d) None of the mentioned
Answer: b
Explanation: Data analysis has multiple facets and approaches.
Answer: d
Explanation: Data visualization is the presentation of data in a pictorial or graphical format.
Answer: b
Explanation: Hacker is an expert at programming and solving problems with a computer.
Answer: b
Explanation: Processing includes merging, summarizing and subsetting data.
Answer: b
Explanation: Raw data may only need to be processed once.
This set of Data Science Multiple Choice Questions & Answers (MCQs) focuses on “ToolBox Overview”.
1. Which of the following principle is incorrectly represented in the below figure?
a) Show Comparisons
b) Integrate Evidence
c) Describe Evidence
d) None of the mentioned
Answer: d
Explanation: Principles of Analytical graphs are sequentially shown in the stepwise manner.
Answer: a
Explanation: The Method of Least Squares is a procedure to determine the best fit line to data.
Answer: c
Explanation: Six Principles of Analytical Graphs are useful for data analysis.
Answer: d
Explanation: EDA stands for Exploratory Data Analysis.
Answer: a
Explanation: Simple linear regression handles exactly one predictor; more than one predictor requires multiple linear regression.
6. Which of the following technique comes under practical machine learning?
a) Bagging
b) Boosting
c) Forecasting
d) None of the mentioned
Answer: b
Explanation: Boosting is an approach to machine learning based on the idea of creating a highly accurate predictor.
7. Data Products shown in the below figure is built using which programming language?
a) S
b) Python
c) R
d) Java
Answer: c
Explanation: Products mentioned in the figure are web application frameworks written in R.
Answer: a
Explanation: Bagging is used in statistical classification and regression.
Answer: b
Explanation: Raw data is data that has not been processed for use.
Answer: b
Explanation: Standard normal RVs are often labelled as Z.
1. Which of the following CLI command can also be used to rename files?
a) rm
b) mv
c) rm -r
d) none of the mentioned
Answer: b
Explanation: mv stands for move.
Answer: b
Explanation: CLI stands for Command Line Interface.
3. Which of the following commands changes directory to one level above your current directory?
a) cd
b) cd .
c) cd ..
d) none of the mentioned
Answer: c
Explanation: cd stands for change directory; cd .. moves up to the parent directory.
Answer: a
Explanation: rm can be used to remove files and directories.
Answer: b
Explanation: Depending on the command, there can be zero or more flags and arguments.
Answer: b
Explanation: Version control is also known as revision control.
Answer: a
Explanation: Git is a free and open source distributed version control system designed to handle everything from small to very large
projects with speed and efficiency.
8. Which of the following command line environment is used for interacting with Git?
a) GitHub
b) Git Bash
c) Git Boot
d) All of the mentioned
Answer: b
Explanation: Git for Windows provides a BASH emulation used to run Git from the command line.
9. Which of the following web hosting service use Git control system?
a) GitHub
b) Open Hash
c) Git Bash
d) None of the mentioned
Answer: a
Explanation: GitHub is a Web-based Git repository hosting service, which offers all of the distributed revision control and source
code management (SCM) functionality of Git.
Answer: a
Explanation: -r flag should be used for copying the content.
Answer: a
Explanation: You should do this before committing.
Answer: b
Explanation: CLI stands for Command Line Interface.
3. Which of the following command updates tracking for files that are modified?
a) git add .
b) git add -u
c) git add -A
d) none of the mentioned
Answer: b
Explanation: The git add command adds a change in the working directory to the staging area.
Answer: a
Explanation: GitHub can store a remote copy of your repository.
Answer: a
Explanation: The git branch command is your general-purpose branch administration tool.
7. Which of the following is the correct way of creating GitHub repository in to well labelled commits?
a) Fork another user’s repository
b) Pop another user’s repository
c) Zip another user’s repository
d) None of the mentioned
Answer: a
Explanation: A fork is a copy of a repository.
Answer: a
Explanation: In Git, there are two main ways to integrate changes from one branch into another: the merge and the rebase.
Answer: a
Explanation: A branch in Git is simply a lightweight movable pointer to one of these commits.
10. The git branch command can be used to determine which branch you are currently on.
a) True
b) False
Answer: a
Explanation: Run with no arguments, git branch lists the local branches and marks the current one with an asterisk.
1. Which of the following principle characteristic is odd man out in the below figure?
a) Principle 1
b) Principle 2
c) Principle 3
d) Principle 4
Answer: c
Explanation: Multivariate Data is the only characteristic related to Principle 3.
Answer: b
Explanation: Descriptive analysis describe a set of data.
3. Which of the following allows you to find relationships you didn’t know about?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned
Answer: b
Explanation: In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often
with visual methods.
Answer: a
Explanation: This only updates your local repository.
Answer: a
Explanation: Exploratory analyses are usually not the final way.
6. Which of the following uses data on some object to predict values for other object?
a) Inferential
b) Exploratory
c) Predictive
d) None of the mentioned
Answer: c
Explanation: A prediction is a forecast, but not only about the weather.
Answer: a
Explanation: Inference is the act or process of deriving logical conclusions from premises known or assumed to be true.
8. Which of the following model is usually a gold standard for data analysis?
a) Inferential
b) Descriptive
c) Causal
d) All of the mentioned
Answer: c
Explanation: A causal model is an abstract model that describes the causal mechanisms of a system.
9. Which of the following analysis should come in place of question mark in the below figure?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned
Answer: a
Explanation: Inferential statistics is concerned with making predictions or inferences about a population from observations and
analyses of a sample.
Answer: b
Explanation: Descriptive analysis is commonly applied to census data.
1. Which of the following type of data science question is missing in the figure?
a) Correlative
b) Exploratory
c) Relative
d) None of the mentioned
Answer: b
Explanation: Exploratory analysis is used to find relationships you didn’t know about.
Answer: b
Explanation: Inference depends heavily on the sampling scheme.
3. Which of the following uses a relatively small amount of data to estimate something about a bigger population?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned
Answer: a
Explanation: Inferential statistics is concerned with making predictions or inferences about a population from observations and
analyses of a sample.
4. Which of the following analysis helps out to find the effect of variable change?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned
Answer: c
Explanation: Causal Analysis provides the real reason why things happen and hence allows focused change activity.
Answer: c
Explanation: Statistical inference is the process of deducing properties of an underlying distribution by analysis of data.
Answer: b
Explanation: A correlation is a measure or degree of relationship between two variables.
7. Which of the following is more applicable to the below figure?
a) Descriptive
b) Causal
c) Predictive
d) None of the mentioned
Answer: a
Explanation: Google trends helps to describe the set of data.
Answer: c
Explanation: Equations are based on physical/engineering science.
Answer: d
Explanation: Mechanistic analysis are hard to infer except for simple simulations.
Answer: a
Explanation: Prediction is very hard, especially about the future.
a) Large Data
b) Big Data
c) Dark Data
d) None of the mentioned
Answer: b
Explanation: Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.
2. Point out the correct statement.
a) Machine learning focuses on prediction, based on known properties learned from the training data
b) Data Cleaning focuses on prediction, based on known properties learned from the training data
c) Representing data in a form which both mere mortals can understand and get valuable insights is as much a science as much as it is art
d) None of the mentioned
Answer: d
Explanation: Visualization is becoming a very important aspect.
3. Which of the following characteristic of big data is relatively more concerned to data science?
a) Velocity
b) Variety
c) Volume
d) None of the mentioned
Answer: b
Explanation: Big data enables organizations to store, manage, and manipulate vast amounts of disparate data at the right speed and at
the right time.
4. Which of the following analytical capabilities are provided by information management company?
a) Stream Computing
b) Content Management
c) Information Integration
d) All of the mentioned
Answer: d
Explanation: With stream computing, store less, analyze more and make better decisions faster.
Answer: c
Explanation: Big Data is actually a concept providing an opportunity to find new insight into your existing data as well as guidelines to
capture and analyze your future data.
6. Which of the following step is performed by data scientist after acquiring the data?
a) Data Cleansing
b) Data Integration
c) Data Replication
d) All of the mentioned
Answer: a
Explanation: Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or removing) corrupt or
inaccurate records from a record set, table, or database.
Answer: a
Explanation: IBM data scientists break big data into four dimensions: volume, variety, velocity and veracity.
8. Which of the following focuses on the discovery of (previously) unknown properties on the data?
a) Data mining
b) Big Data
c) Data wrangling
d) Machine Learning
Answer: a
Explanation: Data mining focuses on discovering previously unknown properties in data. Data munging, or data wrangling, is by
contrast loosely the process of manually converting or mapping data from one “raw” form into another format that allows for more
convenient consumption of the data with the help of semi-automated tools.
9. Which of the following language should be replaced with the question mark in the below figure?
a) Java
b) PHP
c) COBOL
d) None of the mentioned
Answer: a
Explanation: Java is used for processing data in Big data Analytics.
10. Beyond volume, variety and velocity, veracity is another issue of big data.
a) True
b) False
Answer: a
Explanation: Data veracity refers to the uncertainty or imprecision of data.
1. Which of the following design term is perfectly applicable to the below figure?
a) Correlation
b) Confounding
c) Causation
d) None of the mentioned
Answer: b
Explanation: Confounding can be dealt with either at the study design stage, or at the analysis stage.
Answer: a
Explanation: Usually the random component of data is measurement error.
3. Which of the following is the top most important thing in data science?
a) answer
b) question
c) data
d) none of the mentioned
Answer: b
Explanation: The second most important is the data.
4. Which of the following approach should be used if you can’t fix the variable?
a) randomize it
b) non stratify it
c) generalize it
d) none of the mentioned
Answer: a
Explanation: If you can’t fix the variable, stratify it; if you can’t stratify it, randomize it.
Answer: a
Explanation: Randomized studies are usually used to identify causation.
Answer: d
Explanation: Experiments on causal relationships investigate the effect of one or more variables on one or more outcome variables.
Answer: d
Explanation: Data dredging is sometimes referred to as “data fishing”.
8. Which of the following data mining technique is used to uncover patterns in data?
a) Data bagging
b) Data booting
c) Data merging
d) Data Dredging
Answer: d
Explanation: Data dredging, also called data snooping, refers to the practice of misusing data mining techniques to show
misleading scientific ‘research’.
9. Which of the following figure correctly shows approximate order of difficulty?
a)
b)
c)
d) All of the mentioned
Answer: a
Explanation: Predictive analysis is the practice of extracting information from existing data sets.
Answer: b
Explanation: If X predicts Y, it does not mean X causes Y.
Answer: a
Explanation: Operands can also appear in a reversed order.
Answer: a
Explanation: Timedeltas can be both positive and negative.
Answer: c
Explanation: NaT are skipped during evaluation.
4. Which of the following can be converted to other ‘frequencies’ by astyping to a specific timedelta type?
a) Timedelta Series
b) TimedeltaIndex
c) Timedelta
d) All of the mentioned
Answer: d
Explanation: These operations yield Series and propagate NaT -> nan.
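A minimal sketch of frequency conversion, assuming a recent pandas. It uses division by a Timedelta rather than astype, because the set of units astype accepts has changed between pandas versions; the sample values are made up for illustration:

```python
import pandas as pd

# A timedelta Series containing one missing value (NaT).
td = pd.Series(pd.to_timedelta(["1 days", "2 days 12:00:00", pd.NaT]))

# Dividing by a Timedelta converts to that "frequency" as floats;
# NaT propagates to NaN.
hours = td / pd.Timedelta(hours=1)

# Multiplying by an integer keeps the timedelta dtype.
doubled = td * 2

print(hours)
print(doubled)
```

Here the first two entries become 24.0 and 60.0 hours, and the missing entry becomes NaN.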
Answer: b
Explanation: Dividing or multiplying a timedelta64[ns] Series by an integer or integer Series yields another timedelta64[ns] dtypes
Series.
6. Which of the following is used to generate an index with time delta?
a) TimeIndex
b) TimedeltaIndex
c) LeadIndex
d) None of the mentioned
Answer: b
Explanation: Using TimedeltaIndex you can pass string-like, Timedelta, timedelta, or np.timedelta64 objects.
7. Combination of TimedeltaIndex with DatetimeIndex allow certain combination operations that are NaT preserving.
a) True
b) False
Answer: a
Explanation: You can also convert indices to yield another index.
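The two questions above can be illustrated together. This is a sketch with made-up values, assuming a recent pandas: it builds a TimedeltaIndex from mixed inputs and then adds one to a DatetimeIndex that contains NaT.

```python
import datetime

import numpy as np
import pandas as pd

# TimedeltaIndex accepts string-like, Timedelta, timedelta and np.timedelta64 values.
idx = pd.TimedeltaIndex([
    "1 days",
    pd.Timedelta(hours=12),
    datetime.timedelta(minutes=30),
    np.timedelta64(2, "D"),
])

# Element-wise addition with a DatetimeIndex preserves NaT.
dates = pd.DatetimeIndex(["2024-01-01", pd.NaT])
shifted = dates + pd.TimedeltaIndex(["1 days", "2 days"])

print(idx)
print(shifted)
```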
8. Using _________ on categorical data will produce similar output to a Series or DataFrame of type string.
a) .desc()
b) .describe()
c) .rank()
d) none of the mentioned
Answer: b
Explanation: Categorical data has a categories and an ordered property.
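A short sketch of the behaviour in question 8, using a made-up three-element Series:

```python
import pandas as pd

cat = pd.Series(["a", "b", "a"], dtype="category")
text = pd.Series(["a", "b", "a"])

cat_summary = cat.describe()     # count, unique, top, freq
text_summary = text.describe()   # same four entries for a string Series

print(cat_summary)
```

Both summaries report count, unique, top and freq, which is why the output for categorical data looks like that of string data.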
Answer: a
Explanation: Renaming categories is done with Series.cat.rename_categories(); older pandas versions also allowed assigning new values to the Series.cat.categories property.
Answer: a
Explanation: Categoricals are pandas data type.
1. The plot method on Series and DataFrame is just a simple wrapper around ____________
a) gplt.plot()
b) plt.plot()
c) plt.plotgraph()
d) none of the mentioned
Answer: b
Explanation: If the index consists of dates, it calls gcf().autofmt_xdate() to try to format the x-axis nicely.
2. Point out the correct combination with regards to kind keyword for graph plotting.
a) ‘hist’ for histogram
b) ‘box’ for boxplot
c) ‘area’ for area plots
d) all of the mentioned
Answer: d
Explanation: The kind keyword argument of plot() accepts a handful of values for plots other than the default Line plot.
Answer: a
Explanation: bar can also be used for barplot.
4. You can create a scatter plot matrix using the __________ method in pandas.plotting (pandas.tools.plotting in older releases).
a) sca_matrix
b) scatter_matrix
c) DataFrame.plot
d) all of the mentioned
Answer: b
Explanation: You can create density plots using the Series/DataFrame.plot.
5. Point out the wrong combination with regards to kind keyword for graph plotting.
a) ‘scatter’ for scatter plots
b) ‘kde’ for hexagonal bin plots
c) ‘pie’ for pie plots
d) none of the mentioned
Answer: b
Explanation: kde is used for density plots.
6. Which of the following plots are used to check if a data set or time series is random?
a) Lag
b) Random
c) Lead
d) None of the mentioned
Answer: a
Explanation: Random data should not exhibit any structure in the lag plot.
Answer: a
Explanation: There are several plotting functions in pandas.plotting (pandas.tools.plotting in older releases).
8. Which of the following plots are often used for checking randomness in time series?
a) Autocausation
b) Autorank
c) Autocorrelation
d) None of the mentioned
Answer: c
Explanation: If the time series is random, such autocorrelations should be near zero for any and all time-lag separations.
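The idea behind the autocorrelation plot can be checked without drawing anything. The sketch below uses Series.autocorr on simulated data (the seed, length and the random-walk comparison are illustrative assumptions, not part of the question):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

noise = pd.Series(rng.normal(size=5000))   # independent draws: a random series
walk = noise.cumsum()                      # random walk: strongly dependent on its past

r_noise = noise.autocorr(lag=1)            # expected to be near zero
r_walk = walk.autocorr(lag=1)              # expected to be near one

print(round(r_noise, 3), round(r_walk, 3))
```

An autocorrelation plot shows the same statistic for every lag at once; for the noise series all of the values would stay close to zero.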
Answer: c
Explanation: Resulting plots and histograms are what constitutes the bootstrap plot.
Answer: a
Explanation: Curves belonging to samples of the same class will usually be closer together and form larger structures.
1. Which of the following is used to compute the percent change over a given number of periods?
a) pct_change
b) percent_change
c) per_change
d) none of the mentioned
Answer: a
Explanation: Series and DataFrame both have a pct_change method; Panel, which also had one, has since been removed from pandas.
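A small sketch of pct_change on a made-up price series:

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 121.0])

one_period = prices.pct_change()            # NaN, then 10% per step
two_periods = prices.pct_change(periods=2)  # NaN, NaN, then 21% over two steps

print(one_period)
print(two_periods)
```

The first entries are NaN because there is no earlier value to compare against.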
Answer: c
Explanation: Pandas represents timestamps in nanosecond resolution.
3. Which of the following object has a method cov to compute covariance between series?
a) Series
b) DataFrame
c) Panel
d) None of the mentioned
Answer: a
Explanation: DataFrame has a method cov to compute pairwise covariances among the series in the DataFrame, also excluding
NA/null values.
4. Which of the following specifies the required minimum number of observations for each column pair in order to have a valid result?
a) min_periods
b) max_periods
c) minimum_periods
d) all of the mentioned
Answer: a
Explanation: DataFrame.cov also supports an optional min_periods.
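The two questions on cov and min_periods can be sketched as follows; the column names and numbers are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0],
                   "y": [2.0, 4.0, 6.0, 8.0]})

pair_cov = df["x"].cov(df["y"])      # Series.cov: covariance between two Series
cov_matrix = df.cov()                # DataFrame.cov: pairwise covariance matrix

# With only 4 observations per pair, asking for at least 5 yields NaN everywhere.
too_strict = df.cov(min_periods=5)

print(pair_cov)
print(cov_matrix)
print(too_strict)
```

The sample covariance of x and y here is 10/3, and it appears both as the Series result and as the off-diagonal entry of the matrix.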
6. Which of the following is implemented on DataFrame to compute the correlation between like-labeled Series contained in different
DataFrame objects?
a) corrwith
b) corwith
c) corwit
d) none of the mentioned
Answer: a
Explanation: A score close to 1 means their tastes are very similar.
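A minimal corrwith sketch with two made-up DataFrames that share column labels:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]})
df2 = pd.DataFrame({"a": [2, 4, 6], "b": [3, 2, 1]})

# Correlates column "a" with "a" and column "b" with "b".
result = df1.corrwith(df2)

print(result)
```

Column a is perfectly positively correlated across the two frames and column b perfectly negatively correlated.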
Answer: b
Explanation: The binary operators take two Series or DataFrames.
8. Which of the following method produces a data ranking with ties being assigned the mean of the ranks for the group?
a) rank
b) dense_rank
c) partition_rank
d) none of the mentioned
Answer: a
Explanation: rank is also a DataFrame method.
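A short sketch of how rank treats ties, using made-up values. The default method is "average"; "dense" is shown only for comparison:

```python
import pandas as pd

scores = pd.Series([10, 20, 20, 30])

ranks = scores.rank()                 # ties share the mean of their ranks
dense = scores.rank(method="dense")   # ties share a rank, with no gaps afterwards

print(ranks.tolist())
print(dense.tolist())
```

The two tied values would occupy ranks 2 and 3, so each is assigned 2.5 under the default method.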
Answer: a
Explanation: reindex_like silently inserts NaNs and the dtype changes accordingly.
Answer: a
Explanation: Non-numeric columns will be automatically excluded from the correlation calculation.
Answer: d
Explanation: The passed index is a list of axis labels.
Answer: b
Explanation: If data is a dict, if index is passed the values in data corresponding to the labels in the index will be pulled out.
3. The result of an operation between unaligned Series will have the ________ of the indexes involved.
a) intersection
b) union
c) total
d) all of the mentioned
Answer: b
Explanation: If a label is not found in one Series or the other, the result will be marked as missing NaN.
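A sketch of the alignment behaviour with two made-up Series whose indexes only partly overlap:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

total = s1 + s2   # result index is the union: a, b, c, d

print(total)
```

Labels b and c exist in both Series and are summed; a and d exist in only one, so they come out as NaN.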
Answer: d
Explanation: DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.
Answer: a
Explanation: A Series is like a fixed-size dict in that you can get and set values by index label.
6. Which of the following takes a dict of dicts or a dict of array-like sequences and returns a DataFrame?
a) DataFrame.from_items
b) DataFrame.from_records
c) DataFrame.from_dict
d) All of the mentioned
Answer: c
Explanation: DataFrame.from_dict operates like the DataFrame constructor except for the orient parameter which is ‘columns’ by
default.
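A sketch of DataFrame.from_dict with both kinds of input and both orientations; the keys and values are made up:

```python
import pandas as pd

data = {"a": [1, 2], "b": [3, 4]}
nested = {"a": {"x": 1, "y": 2}, "b": {"x": 3, "y": 4}}

by_col = pd.DataFrame.from_dict(data)                  # keys become columns (default)
by_row = pd.DataFrame.from_dict(data, orient="index")  # keys become row labels
from_nested = pd.DataFrame.from_dict(nested)           # inner keys become the index

print(by_col)
print(by_row)
print(from_nested)
```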
Answer: a
Explanation: The axis labels are collectively referred to as the index.
8. Which of the following works analogously to the form of the dict constructor?
a) DataFrame.from_items
b) DataFrame.from_records
c) DataFrame.from_dict
d) All of the mentioned
Answer: a
Explanation: DataFrame.from_records takes a list of tuples or an ndarray with structured dtype.
9. Which of the following operation works with the same syntax as the analogous dict operations?
a) Getting columns
b) Setting columns
c) Deleting columns
d) All of the mentioned
Answer: d
Explanation: You can treat a DataFrame semantically like a dict of like-indexed Series objects.
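The dict-like column operations can be sketched as below; the frame and column names are made up:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

col = df["a"]                 # getting a column, like dict lookup
df["c"] = df["a"] + df["b"]   # setting a column, like dict assignment
del df["a"]                   # deleting a column, like dict deletion
popped = df.pop("b")          # pop also works, as on a dict

print(df)
```

After these steps only column c remains in the frame.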
Answer: a
Explanation: If no index is passed, one will be created having values [0, …, len(data) – 1].
1. All pandas data structures are ___ mutable but not always _______mutable.
a) size, value
b) semantic, size
c) value, size
d) none of the mentioned
Answer: c
Explanation: The length of a Series cannot be changed.
Answer: d
Explanation: Some elements may be close to one another according to one distance and farther away according to another.
Answer: a
Explanation: You can read data from a CSV file using the read_csv function.
4. Which of the following object you get after reading CSV file?
a) DataFrame
b) Character Vector
c) Panel
d) All of the mentioned
Answer: a
Explanation: You get columns out of a DataFrame the same way you get elements out of a dictionary.
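A minimal read_csv sketch. To stay self-contained it reads from an in-memory string instead of a file; the CSV content is made up:

```python
import io

import pandas as pd

csv_text = "name,score\nada,90\nlin,85\n"

# read_csv accepts a path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

scores = df["score"]   # columns come out like dictionary entries

print(type(df).__name__)
print(scores.tolist())
```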
Answer: c
Explanation: Panel is generally 3D labeled.
Answer: a
Explanation: NumPy is the fundamental package for scientific computing with Python.
7. Panel is a container for Series, and DataFrame is a container for DataFrame objects.
a) True
b) False
Answer: b
Explanation: DataFrame is a container for Series, and Panel is a container for DataFrame objects.
Answer: c
Explanation: Bokeh is a Python interactive visualization library for large datasets that natively uses the latest web technologies.
9. Which of the following is a foundational exploratory visualization package for the R language in pandas ecosystem?
a) yhat
b) Seaborn
c) Vincent
d) None of the mentioned
Answer: a
Explanation: It has great support for pandas data objects.
10. Pandas consist of static and moving window linear and panel regression.
a) True
b) False
Answer: a
Explanation: Time series and cross-sectional data are special cases of panel data.
1. Quandl API for Python wraps the ________ REST API to return Pandas DataFrames with time series indexes.
a) Quandl
b) PyDatastream
c) PyData
d) None of the mentioned
Answer: a
Explanation: PyDatastream is a Python interface to the Thomson Dataworks Enterprise (DWE/Datastream) SOAP API to return
indexed pandas DataFrames or panels with financial data.
2. Point out the correct statement.
a) Statsmodels provides powerful statistics, econometrics, analysis and modeling functionality that is out of pandas’ scope
b) Vintage leverages pandas objects as the underlying data container for computation
c) Bokeh is a Python interactive visualization library for small datasets
d) All of the mentioned
Answer: a
Explanation: Bokeh’s goal is to provide elegant, concise construction of novel graphics in the style of D3.
3. Which of the following library is used to retrieve and acquire statistical data and metadata disseminated in SDMX 2.1?
a) pandaSDMX
b) fredapi
c) geopandas
d) all of the mentioned
Answer: a
Explanation: Geopandas extends pandas data objects to include geographic information which supports geometric operations.
4. Which of the following provides a standard API for doing computations with MongoDB?
a) Blaze
b) Geopandas
c) FRED
d) All of the mentioned
Answer: a
Explanation: If your work entails maps and geographical coordinates, and you love pandas, you should take a close look at
Geopandas.
Answer: c
Explanation: Spyder is a cross-platform Qt-based open-source Python IDE.
6. Which of the following makes use of pandas and returns data in a Series or DataFrame?
a) pandaSDMX
b) fredapi
c) OutPy
d) none of the mentioned
Answer: b
Explanation: The fredapi module requires a FRED API key that you can obtain for free on the FRED website.
Answer: b
Explanation: Spyder shows both column-wise min/max and global min/max coloring.
9. The ________ project builds on top of pandas and matplotlib to provide easy plotting of data.
a) yhat
b) Seaborn
c) Vincent
d) None of the mentioned
Answer: b
Explanation: Seaborn has great support for pandas data objects.
10. xarray (formerly xray) brings the labeled data power of pandas to the physical sciences.
a) True
b) False
Answer: a
Explanation: It aims to provide a pandas-like and pandas-compatible toolkit for analytics on multi-dimensional arrays.
1. Which of the following is the base layer for all of the sparse indexed data structures?
a) SArray
b) SparseArray
c) PyArray
d) None of the mentioned
Answer: b
Explanation: SparseArray is a 1-dimensional ndarray-like object storing only values distinct from the fill_value.
Answer: d
Explanation: The to_sparse method takes a kind argument and a fill_value.
Answer: d
Explanation: SparseArray can be converted back to a regular ndarray by calling to_dense.
4. Which of the following list-like data structure is used for managing a dynamic collection of SparseArrays?
a) SparseList
b) GeoList
c) SparseSeries
d) All of the mentioned
Answer: a
Explanation: To create one, simply call the SparseList constructor with a fill_value.
Answer: a
Explanation: The two key SparseList methods are append and to_array; append can accept scalar values or any 1-dimensional sequence.
6. Which of the following method is used for transforming a SparseSeries indexed by a MultiIndex to a scipy.sparse.coo_matrix?
a) SparseSeries.to_coo()
b) Series.to_coo()
c) SparseSeries.to_cooser()
d) None of the mentioned
Answer: a
Explanation: This is an experimental API for transforming between sparse pandas and scipy.sparse structures.
7. The integer format tracks only the locations and sizes of blocks of data.
a) True
b) False
Answer: b
Explanation: The block format tracks only the locations and sizes of blocks of data.
8. Which of the following is used for testing for membership in the list of column names?
a) in
b) out
c) elseif
d) none of the mentioned
Answer: a
Explanation: For DataFrames, likewise, in applies to the column axis.
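A short sketch of membership testing with in, using a made-up frame and Series. Note that in checks labels, not stored values:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
s = pd.Series([10, 20], index=["x", "y"])

has_col_a = "a" in df     # True: "a" is a column name
has_value_1 = 1 in df     # False: 1 is a value, not a column name
has_label_x = "x" in s    # True: "x" is an index label
has_value_10 = 10 in s    # False: 10 is a value, not an index label

print(has_col_a, has_value_1, has_label_x, has_value_10)
```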
9. Which of the following indexing capabilities is used as a concise means of selecting data from a pandas object?
a) In
b) ix
c) ipy
d) none of the mentioned
Answer: b
Explanation: ix and reindex are 100% equivalent. Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0; .loc and .iloc are its replacements.
10. Pandas follow the NumPy convention of raising an error when you try to convert something to a bool.
a) True
b) False
Answer: a
Explanation: This happens in an if or when using the boolean operations, and, or, or not.
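A sketch of the error and the usual way around it, using a made-up boolean Series:

```python
import pandas as pd

flags = pd.Series([True, False])

outcome = "no error"
try:
    if flags:                  # the truth value of a whole Series is ambiguous
        outcome = "truthy"
except ValueError:
    outcome = "ValueError"

# State the intent explicitly instead.
any_true = bool(flags.any())
all_true = bool(flags.all())

print(outcome, any_true, all_true)
```

Reducing the Series with any() or all() yields a single boolean that can safely be used in an if statement.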
1. Which of the following block information is odd man out?
a) Subsetting
b) Raw data
c) Ready for analysis
d) None of the mentioned
Answer: b
Explanation: Characteristics mentioned in the diagram are traits of processed data.
Answer: a
Explanation: Data belongs to the set of items.
3. Data that summarize all observations in a category are called __________ data.
a) frequency
b) summarized
c) raw
d) none of the mentioned
Answer: b
Explanation: The summary could be the sum of the observations, the number of occurrences, their mean value, and so on.
Answer: d
Explanation: Raw data refers to data that have not been changed since acquisition.
Answer: a
Explanation: Primary data is also referred to as raw data.
6. Which of the following data is put into a formula to produce commonly accepted results?
a) Raw
b) Processed
c) Synchronized
d) All of the Mentioned
Answer: b
Explanation: Raw data came from direct measurements.
Answer: b
Explanation: There are many other techniques applied to raw data.
Answer: b
Explanation: Although raw data has the potential to become “information,” extraction, organization, and sometimes analysis and
formatting for presentation are required for that to occur.
9. Which type of data is generated by POS terminal in a busy supermarket each day?
a) Source
b) Processed
c) Synchronized
d) All of the mentioned
Answer: a
Explanation: Raw data is sometimes referred to as source data.
10. Following figure represents correct sequence of steps in performing data analysis.
a) True
b) False
Answer: a
Explanation: Data analysis is not a goal in itself; the goal is to enable the business to make better decisions.
1. Which of the following is an example of raw data?
a) complicated JSON from Facebook API
b) complicated JSON from Twitter API
c) unformatted Excel file
d) all of the mentioned
Answer: d
Explanation: All of these are raw data; tidy data is obtained only after running a processing script.
Answer: c
Explanation: Data cleansing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set,
table, or database.
Answer: b
Explanation: The summary could be the sum of the observations, the number of occurrences, their mean value, and so on.
Answer: a
Explanation: tidyr is used for tidy data with spread and gather functions.
Answer: d
Explanation: The tidy data standard has been designed to simplify the development of data analysis tools that work well together.
Answer: a
Explanation: The principles of tidy data provide a standard way to organize data values within a dataset.
8. Which of the following is the most common problem with messy data?
a) Column headers are values
b) Variables are stored in both rows and columns
c) A single observational unit is stored in multiple tables
d) All of the mentioned
Answer: d
Explanation: Real datasets can, and often do, violate the three precepts of tidy data in almost every way imaginable.
Answer: c
Explanation: tidyr does less reframing than reshape2.
Answer: a
Explanation: Data analysis is not a goal in itself; the goal is to enable the business to make better decisions.
Answer: c
Explanation: This reads data in to the RAM.
Answer: c
Explanation: write.xlsx write out an excel file with similar argument.
Answer: d
Explanation: More parameters are required for loading the data.
4. Which of the following will set the character that represents missing value?
a) na.quote
b) na.strings
c) nrows
d) all of the mentioned
Answer: b
Explanation: na.strings takes a character vector.
Answer: b
Explanation: data.table is written in C.
Answer: a
Explanation: The read.xlsx and read.xlsx2 functions are part of the xlsx package.
7. Which of the following can be used to view all the tables in memory?
a) tables
b) alltable
c) table
d) none of the mentioned
Answer: a
Explanation: The table function is a very basic, but essential, function to master while performing interactive data analyses.
Answer: a
Explanation: xmlApply and xmlSApply are simple wrappers for the lapply and sapply functions.
Answer: a
Explanation: The jsonlite package is a JSON generator optimized for the web.
Answer: a
Explanation: This package contains meta information and index.
Answer: a
Explanation: HDF5 is used for storing large datasets.
3. Which of the following is used to extract data from HTML code of websites?
a) Webscraping
b) Webdredging
c) Webcleaning
d) All of the mentioned
Answer: a
Explanation: Webscraping is a great way to get data.
4. Which of the following functions is used to read data off web pages?
a) read.web
b) readLines
c) read.Line
d) all of the mentioned
Answer: b
Explanation: The readLines function will extract the web page data.
Answer: b
Explanation: hdf5 can be used for reading from and writing to disk in R.
6. Which of the following package is used for reading HTML and XML data?
a) httr
b) http
c) httx
d) all of the mentioned
Answer: a
Explanation: httr contains tools for Working with URLs and HTTP.
7. httr package does not work well with facebook and twitter API.
a) True
b) False
Answer: b
Explanation: Most modern APIs use something like oauth.
Answer: d
Explanation: Authentication is necessary for issuing a request.
Answer: a
Explanation: SPSS is a comprehensive and flexible statistical analysis and data management solution.
10. Which of the following package is used for reading GIS data?
a) rgdal
b) rgeos
c) raster
d) all of the mentioned
Answer: d
Explanation: A geographic information system is a system designed to capture, store, manipulate, analyze, manage, and present all
types of spatial or geographical data.
1. Which of the following function gives information about top level data?
a) head
b) tail
c) summary
d) none of the mentioned
Answer: a
Explanation: The function head is very useful for working with lists, tables, data frames and even functions.
Answer: d
Explanation: Both head and tail function do not work on strings.
Answer: d
Explanation: In R, missing values are represented by the symbol NA.
Answer: c
Explanation: Common variables are not used to apply transforms.
Answer: d
Explanation: Many common transforms can be applied to the data with R.
Answer: b
Explanation: Each variable forms a column in tidy data.
Answer: a
Explanation: Use acast or dcast depending on whether you want vector/matrix/array output or data frame output.
Answer: a
Explanation: Join is faster in plyr package.
10. mutate function is used for casting as multi dimensional arrays.
a) True
b) False
Answer: b
Explanation: mutate is used for adding new variables.
1. Which of the following function is good for the automatic splitting of names?
a) split
b) strsplit
c) autsplit
d) none of the mentioned
Answer: b
Explanation: strsplit splits a character string or vector of character strings using a regular expression or a literal string.
Answer: a
Explanation: sub and gsub are used for fixing character vectors.
Answer: a
Explanation: It translates character to lowercase.
Answer: c
Explanation: A dot in function name can mean any of the following: nothing at all; a separator between method and class in S3
method.
Answer: a
Explanation: Variables with character values should be made more descriptive.
6. Which of the following is used for specifying character class with metacharacter?
a) []
b) {}
c) /+
d) All of the mentioned
Answer: a
Explanation: You can list a set of characters to accept at a given point in the match.
Answer: a
Explanation: Regular expressions have rich set of metacharacters.
Answer: b
Explanation: * and + are metacharacters for repetition of data.
9. Which of the following function is used for searching text strings by means of regular expression?
a) grepd
b) grepl
c) gepexpr
d) all of the mentioned
Answer: b
Explanation: grep, grepl, regexpr, gregexpr and regexec search for matches to argument pattern within each element of a character
vector.
Answer: a
Explanation: To merge two data frames horizontally, use the merge function.
1. Which of the following graphic device information is odd man out in the below figure?
a) quartz
b) windows
c) unix
d) x11
Answer: c
Explanation: unix keyword does not exist with regards to graphics device.
Answer: a
Explanation: On Windows, the screen device is launched with the windows function.
Answer: d
Explanation: When the plot() function is invoked, R sends the data corresponding to the plot over, and the graphics device generates
the plot.
4. Which of the following file format is graphic device only for windows?
a) pdf
b) svg
c) win.metafile
d) all of the mentioned
Answer: c
Explanation: Exporting graphics to a Windows MetaFile can be achieved via the win.metafile.
Answer: b
Explanation: The windows function cannot be used on Mac.
6. Which of the following systems most often doesn’t have a postscript viewer?
a) Windows
b) Linux
c) Mac
d) All of the mentioned
Answer: a
Explanation: postscript is older format but it resizes well.
Answer: b
Explanation: There are two basic types of file devices: vector and bitmap.
Answer: b
Explanation: You can change the active graphics device with dev.set.
10. The most familiar place for a plot to be “sent” is screen device.
a) True
b) False
Answer: a
Explanation: On Linux, the screen device is launched with x11 function.
1. Which of the following function has parameters shown in the below figure?
a) par
b) bar
c) base
d) all of the mentioned
Answer: a
Explanation: R makes it easy to combine multiple plots into one overall graph, using either the par( ) or layout( ) function.
Answer: a
Explanation: Bitmap formats are good for plots with a large number of points, natural scenes or web based plots.
3. Which of the following will copy the plot from one device to another?
a) dev.copy
b) dev.copypdf
c) dev.device
d) all of the mentioned
Answer: a
Explanation: Copying a plot to another device can be useful because some plots require a lot of code and it can be a pain to type all
that in again for a different device.
4. Which of the following is used to change active graphic device?
a) dev.set
b) dev.int
c) dev.win
d) all of the mentioned
Answer: a
Explanation: You can change the active graphics device with dev.set(<integer>) where <integer> is the number associated with the
graphics device you want to switch to.
Answer: d
Explanation: For file devices, there are vector and bitmap formats.
Answer: a
Explanation: The principal components are equal to the right singular vectors if you first scale the variables.
Answer: a
Explanation: Copying a plot is not an exact operation, so the result may not be identical to the original.
Answer: b
Explanation: svg stands for scalable vector graphics.
Answer: d
Explanation: PC’s may mix real patterns.
1. Which of the following block information is odd man out in the below figure?
a) Scatterplots
b) 5 number summary
c) 2D Graph
d) None of the mentioned
Answer: b
Explanation: 5 number summary is one dimensional graph.
a) Scatterplot
b) Barplot
c) Overlaying
d) None of the mentioned
Answer: b
Explanation: A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle.
Answer: d
Explanation: points and axis are other well known annotation function.
Answer: b
Explanation: Use grid on to display the major grid lines.
5. Point out the wrong statement.
a) Plots are created with multiple functions only
b) Plots are created with both single and multiple function calls
c) Annotation in plot is not especially intuitive
d) None of the mentioned
Answer: a
Explanation: Plots can also be created with a single function call.
6. Which of the following parameter defines line type such as dashed and dotted?
a) lty
b) pch
c) lwd
d) all of the mentioned
Answer: a
Explanation: lwd is used for line width.
Answer: a
Explanation: The graphics package contains plotting functions.
8. Which of the following argument specifies margin size with regards to par function?
a) las
b) bg
c) mar
d) all of the mentioned
Answer: c
Explanation: par function is used to specify global parameters.
Answer: a
Explanation: The base plotting system is highly flexible.
10. Base graphics are used most commonly for creating 2D graphics.
a) True
b) False
Answer: a
Explanation: Base graphics is a very powerful system for creating 2D graphics.
1. Which of the following clustering type has characteristic shown in the below figure?
a) Partitional
b) Hierarchical
c) Naive bayes
d) None of the mentioned
Answer: b
Explanation: Hierarchical clustering groups data over a variety of scales by creating a cluster tree or dendrogram.
Answer: d
Explanation: Some elements may be close to one another according to one distance and farther away according to another.
Answer: b
Explanation: Hierarchical clustering is an agglomerative approach.
Answer: d
Explanation: K-means clustering follows partitioning approach.
Answer: c
Explanation: k-nearest neighbor has nothing to do with k-means.
Answer: d
Explanation: You should choose a distance/similarity that makes sense for your problem.
Answer: a
Explanation: Hierarchical clustering is deterministic.
Answer: a
Explanation: K-means requires a number of clusters.
Answer: b
Explanation: Hierarchical clustering requires a defined distance as well.
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
a) Exploratory
b) Inferential
c) Causal
d) None of the mentioned
Answer: a
Explanation: Making plots of the data reveals various interesting features.
2. Which of the following dimension type graph is shown in the below figure?
a) one-dimensional
b) two-dimensional
c) three-dimensional
d) none of the mentioned
Answer: b
Explanation: A two-dimensional graph is a set of points in two-dimensional space.
Answer: d
Explanation: A picture can tell better story than data.
Answer: c
Explanation: A large number of exploratory graphs are made.
Answer: a
Explanation: coplot is used for two dimensional representation.
6. Which of the following graph can be used for simple summarization of data?
a) Scatterplot
b) Overlaying
c) Barplot
d) All of the mentioned
Answer: c
Explanation: A bar chart or bar graph is a chart that presents grouped data with rectangular bars with lengths proportional to the
values that they represent.
Answer: c
Explanation: The mode is the value that appears most often in a set of data.
Answer: a
Explanation: lattice is an add-on package that implements Trellis graphics.
Answer: a
Explanation: There are many ways to create a 3D spinning plot as well.
Answer: b
Explanation: More transparency is achieved with reproducibility.
Answer: a
Explanation: Data replication occurs when the same data is stored on multiple storage devices.
Answer: d
Explanation: Reproducibility addresses the most “downstream” aspect of the research process.
Answer: b
Explanation: Evidence-based data analysis is a deterministic statistical machine.
Answer: b
Explanation: Replication is particularly important in studies that can impact broad policy or regulatory decisions.
Answer: d
Explanation: Different problems require different approaches and expertise.
Answer: b
Explanation: Reproducibility has nothing to do with validity of data analysis.
Answer: d
Explanation: The data set may depend on your goal.
9. Which of the following gives reviewers an important tool without dramatically increasing the burden?
a) Quality research
b) Replication research
c) Reproducible research
d) None of the mentioned
Answer: c
Explanation: Reproducible research is important, but does not necessarily solve the critical question of whether a data analysis is
trustworthy.
Answer: b
Explanation: Complicated analyses should not be trusted.
1. Which of the following is suitable for knitr?
a) Reports
b) Data preprocessing documents
c) Technical manuals
d) All of the mentioned
Answer: d
Explanation: knitr is good for reports, data preprocessing documents, manuals, and short to medium-length technical documents.
Answer: a
Explanation: Global option relating to echo have values TRUE and FALSE.
Answer: a
Explanation: Code has to be written to set the global options.
4. Which of the following global options are available for figures in knitr?
a) fig.height
b) fig.size
c) fig.breadth
d) all of the mentioned
Answer: a
Explanation: fig.height has numeric value.
Answer: a
Explanation: Workflow R Markdown is a format for writing reproducible, dynamic reports with R.
Answer: a
Explanation: knitr converts a markdown document into html by default.
Answer: a
Explanation: knitr is not good for documents that require precise formatting.
9. The document produced by knitr has which of the following extensions?
a) .md
b) .rmd
c) .html
d) none of the mentioned
Answer: a
Explanation: knitr takes an R Markdown (.Rmd) document as input and produces a markdown (.md) document.
10. Code chunks begin with ```{r} and end with ```.
a) True
b) False
Answer: a
Explanation: Code chunks can have names.
Answer: c
Explanation: Data science workflow is a non-linear, iterative process.
Answer: a
Explanation: Literate Statistical Practice is a programming methodology.
Answer: b
Explanation: Literate Statistical Programming can be done with knitr.
Answer: c
Explanation: R is a language and environment for statistical computing and graphics.
5. What is one way in which the knitr system differs from Sweave?
a) knitr allows for the use of markdown instead of LaTeX
b) knitr is written in python instead of R
c) knitr lacks features like caching of code chunks
d) none of the mentioned
Answer: a
Explanation: knitr is an engine for dynamic report generation with R.
6. Which of the following is useful way to put text, code, data, output all in one document?
a) Literate statistical programming
b) Object oriented programming
c) Descriptive programming
d) All of the mentioned
Answer: a
Explanation: Object-oriented programming is a programming language model organized around objects rather than “actions” and data
rather than logic.
7. Some chunks have to be re-computed every time you re-knit the file.
a) True
b) False
Answer: b
Explanation: By default, all chunks have to be re-computed every time you re-knit the file.
8. Which of the following tool can be used for integrating text and code in one document?
a) knitr
b) ggplot2
c) NumPy
d) None of the mentioned
Answer: a
Explanation: knitr is a way to write LaTeX, HTML, and Markdown with R code interlaced.
9. Which of the following should be set on chunk by chunk basis to store results of computation?
a) cache=TRUE
b) cache=FALSE
c) caching=TRUE
d) none of the mentioned
Answer: a
Explanation: After the first run, the results are loaded from the cache.
Answer: b
Explanation: Dependencies are not checked explicitly in caching caveats.
1. The original idea of Literate Statistical Practice comes from _______________
a) Don Knuth
b) Don Cutting
c) Douglas Cutting
d) All of the mentioned
Answer: a
Explanation: Literate programs are tangled to produce machine readable documents.
Answer: a
Explanation: Analysis code is divided into code chunks and text.
Answer: a
Explanation: Programming language is also required for literate programming.
Answer: c
Explanation: R is a language and environment for statistical computing and graphics.
Answer: a
Explanation: Save data in NON proprietary formats to make work reproducible.
Answer: a
Explanation: Code and text are in one place.
Answer: a
Explanation: knitr is available on CRAN.
Answer: b
Explanation: It can be exported to pdf and html.
10. Literate program code is live-automatic “regression test” when building a document.
a) True
b) False
Answer: a
Explanation: Data and results are automatically updated to reflect external changes.
1. Which of the following is the probability calculus of beliefs, given that beliefs follow certain rules?
a) Bayesian probability
b) Frequency probability
c) Frequency inference
d) Bayesian inference
Answer: a
Explanation: Data scientists tend to fall within shades of gray of these and various other schools of inference.
Answer: a
Explanation: Frequency probability is the long run proportion of times an event occurs in independent, identically distributed
repetitions.
Answer: d
Explanation: The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible
values.
4. Which of the following random variables takes on only a countable number of possibilities?
a) Discrete
b) Non Discrete
c) Continuous
d) All of the mentioned
Answer: a
Explanation: Continuous random variable can take any value on some subset of the real line.
Answer: b
Explanation: There are two types of random variable: continuous and discrete.
Answer: b
Explanation: Random variable is also known as stochastic variable.
Answer: b
Explanation: Frequency inference uses frequency interpretations of probabilities to control error rates.
Answer: a
Explanation: A probability mass function evaluated at a value corresponds to the probability that a random variable takes that value.
Answer: a
Explanation: pdf stands for probability density function.
10. Statistical inference is the process of drawing formal conclusions from data.
a) True
b) False
Answer: a
Explanation: Statistical inference requires navigating the set of assumptions and tools.
1. The expected value or _______ of a random variable is the center of its distribution.
a) mode
b) median
c) mean
d) bayesian inference
Answer: c
Explanation: A probability model connects the data to the population using assumptions.
Answer: d
Explanation: Every cumulative distribution function F is non-decreasing and right-continuous.
Answer: a
Explanation: Densities with a higher variance are more spread out than densities with a lower variance.
Answer: d
Explanation: Standard Deviation (SD) is the measure of spread of the numbers in a set of data from its mean value.
Answer: c
Explanation: R can approximate quantiles for you for common distributions.
Answer: a
Explanation: Chebyshev’s inequality is also spelled as Tchebysheff’s inequality.
7. For continuous random variables, the CDF is the derivative of the PDF.
a) True
b) False
Answer: b
Explanation: For continuous random variables, the PDF is the derivative of the CDF.
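The relationship can be checked numerically. The sketch below is illustrative only: it assumes a unit-rate exponential distribution (not mentioned in the question) and compares a central-difference derivative of its CDF with its PDF. The function names are hypothetical.

```python
import math

def exp_cdf(x: float) -> float:
    """CDF of a unit-rate exponential distribution (assumed example distribution)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def exp_pdf(x: float) -> float:
    """PDF of the same distribution."""
    return math.exp(-x) if x >= 0 else 0.0

def numeric_derivative(f, x: float, h: float = 1e-5) -> float:
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    x = 1.3
    # The two printed values should agree to several decimal places.
    print(numeric_derivative(exp_cdf, x), exp_pdf(x))
```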
8. Chebyshev’s inequality states that the probability of a “Six Sigma” event is less than ___________
a) 10%
b) 20%
c) 30%
d) 3%
Answer: d
Explanation: If a bell curve is assumed, the probability of a “six sigma” event is on the order of one ten millionth of a percent.
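The 3% figure follows from Chebyshev's bound P(|X - mu| >= k*sigma) <= 1/k^2 with k = 6, which gives 1/36, roughly 2.8%. A minimal sketch comparing that distribution-free bound with the exact two-sided tail under a normal assumption; the function names are hypothetical:

```python
import math

def chebyshev_bound(k: float) -> float:
    """Upper bound on P(|X - mu| >= k*sigma) for any distribution with finite variance."""
    return 1.0 / (k * k)

def normal_two_sided_tail(k: float) -> float:
    """Exact P(|Z| >= k) for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))

if __name__ == "__main__":
    k = 6
    print(f"Chebyshev bound: {chebyshev_bound(k):.4%}")
    print(f"Normal tail:     {normal_two_sided_tail(k):.3e}")
```

The bound is deliberately loose because it has to hold for every distribution, not only the bell curve.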
9. Which of the following random variables are the default model for random samples?
a) iid
b) id
c) pmd
d) all of the mentioned
Answer: a
Explanation: Random variables are said to be iid if they are independent and identically distributed.
10. Cumulative distribution functions are used to specify the distribution of multivariate random variables.
a) True
b) False
Answer: a
Explanation: In the case of a continuous distribution, it gives the area under the probability density function from minus infinity to x.
Answer: d
Explanation: Causal is not directly related to goal of statistical modelling.
Answer: b
Explanation: Poisson distribution is used for modeling unbounded count data.
Answer: c
Explanation: Poisson distribution is used to model counts.
Answer: a
Explanation: The normal distribution is symmetric and peaked about its mean.
6. Which of the following form the basis for frequency interpretation of probabilities?
a) Asymptotics
b) Symptotics
c) Asymmetry
d) All of the mentioned
Answer: a
Explanation: Asymptotics is the term for the behavior of statistics as the sample size tends to infinity.
Answer: a
Explanation: The Bernoulli distribution arises as the result of a binary outcome.
Answer: b
Explanation: LLN stands for law of large numbers.
9. Which of the following theorem states that the distribution of averages of iid variables, properly normalized, becomes that of a standard
normal as the sample size increases?
a) Central Limit Theorem
b) Central Mean Theorem
c) Centroid Limit Theorem
d) All of the mentioned
Answer: a
Explanation: The Central Limit Theorem (CLT) is one of the most important theorems in statistics.
10. The binomial random variables are obtained as the sum of iid Gaussian trials.
a) True
b) False
Answer: b
Explanation: The binomial random variables are obtained as the sum of iid Bernoulli trials.
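As an illustration, the distribution of a sum of n iid Bernoulli(p) trials can be built by repeated convolution and compared with the closed-form binomial probabilities. This is a sketch with hypothetical function names, not part of the original quiz.

```python
from math import comb

def convolve(p, q):
    """Distribution of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sum_of_bernoullis_pmf(n: int, p: float):
    """PMF of the sum of n iid Bernoulli(p) trials, built one trial at a time."""
    pmf = [1.0]  # the sum of zero trials equals 0 with probability 1
    for _ in range(n):
        pmf = convolve(pmf, [1.0 - p, p])
    return pmf

def binomial_pmf(n: int, p: float):
    """Closed-form binomial probabilities for k = 0..n."""
    return [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

if __name__ == "__main__":
    for a, b in zip(sum_of_bernoullis_pmf(5, 0.3), binomial_pmf(5, 0.3)):
        print(f"{a:.6f}  {b:.6f}")
```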
Answer: a
Explanation: The mean of the Chi-squared is its degrees of freedom.
Answer: d
Explanation: Consistency is neither necessary nor sufficient for one estimator to be better than another.
Answer: a
Explanation: Gosset’s distribution is indexed by a degrees of freedom.
4. The _________ of a collection of data is the joint density evaluated as a function of the parameters with the data fixed.
a) probability
b) likelihood
c) poisson distribution
d) all of the mentioned
Answer: b
Explanation: Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter.
Answer: d
Explanation: Likelihood is the hypothetical probability that an event that has already occurred would yield a specific outcome.
Answer: a
Explanation: The CLT applies in an endless variety of settings.
8. The beta distribution is the default prior for parameters between ____________
a) 0 and 10
b) 1 and 2
c) 0 and 1
d) None of the mentioned
Answer: c
Explanation: Bayesian statistics posits a prior on the parameter of interest.
9. Which of the following mean is a mixture of the MLE and the prior mean?
a) interior
b) exterior
c) posterior
d) all of the mentioned
Answer: c
Explanation: MLE stands for maximum likelihood estimate.
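A small sketch of why the posterior mean is a mixture, assuming a Beta(alpha, beta) prior and a binomial likelihood with x successes in n trials (the binomial setting is an assumption, not stated in the question). The posterior mean (x + alpha) / (n + alpha + beta) can be rewritten as a weighted average of the MLE x/n and the prior mean alpha / (alpha + beta). Function names are hypothetical.

```python
def posterior_mean(x: int, n: int, alpha: float, beta: float) -> float:
    """Mean of the Beta(alpha + x, beta + n - x) posterior."""
    return (x + alpha) / (n + alpha + beta)

def mixture_form(x: int, n: int, alpha: float, beta: float) -> float:
    """The same quantity written as delta * MLE + (1 - delta) * prior mean."""
    delta = n / (n + alpha + beta)        # weight on the data grows with n
    mle = x / n                           # maximum likelihood estimate
    prior_mean = alpha / (alpha + beta)   # mean of the Beta prior
    return delta * mle + (1.0 - delta) * prior_mean

if __name__ == "__main__":
    print(posterior_mean(7, 10, 2.0, 2.0), mixture_form(7, 10, 2.0, 2.0))
```

As n grows, delta approaches 1 and the data dominate the prior.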
10. Usually replacing the standard error by its estimated value does change the CLT.
a) True
b) False
Answer: b
Explanation: Usually replacing the standard error by its estimated value doesn’t change the CLT.
1. Which of the following testing is concerned with making decisions using data?
a) Probability
b) Hypothesis
c) Causal
d) None of the mentioned
Answer: b
Explanation: The null hypothesis is assumed true and statistical evidence is required to reject it in favor of a research or alternative
hypothesis.
Answer: d
Explanation: Power of a one sided test is greater than the power of the associated two sided test.
3. Which of the following value is the most common measure of “statistical significance”?
a) P
b) A
c) L
d) All of the mentioned
Answer: a
Explanation: The P-value is the probability under the null hypothesis of obtaining evidence as extreme as or more extreme than the
evidence actually observed.
Answer: d
Explanation: A false positive is an error in some evaluation process in which a condition tested for is mistakenly found to have been
detected.
Answer: a
Explanation: FDR stands for false discovery rate.
Answer: a
Explanation: Bonferroni correction is easy to calculate.
7. The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size.
a) True
b) False
Answer: a
Explanation: If the sample sizes are the same the pooled variance estimate is the average of the group variances.
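A minimal sketch of the pooled variance estimate, assuming the usual two-group formula in which each group variance is weighted by its degrees of freedom; the function name is hypothetical.

```python
from statistics import variance

def pooled_variance(x, y) -> float:
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    nx, ny = len(x), len(y)
    return ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)

if __name__ == "__main__":
    # Equal sample sizes: the result is the plain average of the two variances.
    print(pooled_variance([1, 2, 3, 4], [2, 4, 6, 8]))
    # Unequal sample sizes: the result sits closer to the larger group's variance.
    print(pooled_variance([0, 2], list(range(10))))
```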
8. Which of the following tool is used for constructing confidence intervals and calculating standard errors for difficult statistics?
a) baggyer
b) bootstrap
c) jacknife
d) none of the mentioned
Answer: b
Explanation: The bootstrap procedure follows from the so called bootstrap principle.
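The bootstrap principle can be sketched as follows: resample the observed data with replacement, recompute the statistic on each resample, and use the spread of those values as a standard error. The function name, the default of 2000 resamples and the fixed seed are assumptions made for the illustration.

```python
import random
from statistics import mean, stdev

def bootstrap_se(data, estimator, n_resamples: int = 2000, seed: int = 0) -> float:
    """Bootstrap standard error: the standard deviation of the estimator over resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    estimates = [estimator(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return stdev(estimates)

if __name__ == "__main__":
    sample = [2, 4, 4, 5, 7, 9, 10, 12]
    print("bootstrap SE of the mean:", bootstrap_se(sample, mean))
    print("textbook SE of the mean: ", stdev(sample) / len(sample) ** 0.5)
```

For the mean the two numbers should be close; the bootstrap is mainly useful for statistics, such as the median, that have no simple standard error formula.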
9. Which of the following tool is used for estimating standard errors and the bias of estimators?
a) knitr
b) jackknife
c) ggplot2
d) all of the mentioned
Answer: b
Explanation: jackknife involves resampling data.
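A minimal sketch of the jackknife, assuming the standard leave-one-out formulas for bias and standard error; the function name is hypothetical. For the sample mean the bias estimate is zero and the standard error matches s / sqrt(n), which makes a convenient sanity check.

```python
from math import sqrt
from statistics import mean, stdev

def jackknife(data, estimator):
    """Return (bias estimate, standard error estimate) from leave-one-out replicates."""
    n = len(data)
    theta_hat = estimator(data)
    # Recompute the estimator n times, each time deleting one observation.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    theta_bar = mean(leave_one_out)
    bias = (n - 1) * (theta_bar - theta_hat)
    se = sqrt((n - 1) / n * sum((t - theta_bar) ** 2 for t in leave_one_out))
    return bias, se

if __name__ == "__main__":
    sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
    print(jackknife(sample, mean))
    print(stdev(sample) / sqrt(len(sample)))
```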
10. Power is the probability of rejecting the null hypothesis when it is true.
a) True
b) False
Answer: b
Explanation: Power is the probability of rejecting the null hypothesis when it is false.
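As a worked illustration, the power of a one-sided z-test of H0: mu = mu0 against Ha: mu > mu0 with known sigma can be computed directly. The setting, the function name and the numbers are assumptions chosen for the sketch. When the true mean equals mu0, the rejection probability reduces to alpha, the type I error rate.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0: float, mua: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Probability of rejecting H0: mu = mu0 in favor of mu > mu0 when the true mean is mua."""
    z = NormalDist()
    critical = z.inv_cdf(1.0 - alpha)          # rejection threshold on the z scale
    shift = (mua - mu0) / (sigma / sqrt(n))    # true mean in standard-error units
    return 1.0 - z.cdf(critical - shift)

if __name__ == "__main__":
    print(power_one_sided_z(30, 32, 4, 16))  # true mean above mu0
    print(power_one_sided_z(30, 30, 4, 16))  # true mean equal to mu0
```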
1. Which of the following function can be replaced with the question mark in the below figure?
a) boxplot
b) lplot
c) levelplot
d) all of the mentioned
Answer: c
Explanation: levelplot is used for plotting “image” data.
Answer: d
Explanation: The process of centering and scaling the data is called “normalizing” the data.
Answer: b
Explanation: Correlation is a statistical technique that can show whether and how strongly pairs of variables are related.
4. Normalized data are centered at ___ and have units equal to standard deviations of the original data.
a) 0
b) 5
c) 1
d) 10
Answer: a
Explanation: In statistics and applications of statistics, normalization can have a range of meanings.
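In the sense used by this question, normalizing means subtracting the mean and dividing by the standard deviation. A short sketch with a hypothetical function name shows that the result has mean 0 and standard deviation 1:

```python
from statistics import mean, stdev

def normalize(values):
    """Center at the sample mean and scale by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

if __name__ == "__main__":
    z = normalize([12.0, 15.0, 9.0, 20.0, 14.0])
    print(mean(z), stdev(z))
```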
Answer: c
Explanation: Least squares is an estimation tool.
Answer: a
Explanation: Residuals can be thought of as the outcome with the linear association of the predictor removed.
Answer: a
Explanation: Maximizing the likelihood is the same as minimizing -2 log likelihood.
8. Which of the following refers to the circumstance in which the variability of a variable is unequal across the range of values of a
second variable that predicts it?
a) Heterogeneity
b) Heteroskedasticity
c) Heteroelasticty
d) None of the mentioned
Answer: b
Explanation: Heteroskedasticity has serious consequences for the OLS estimator.
9. Which of the following outcome is odd man out in the below figure?
a) R Squared
b) Kappa
c) RMSE
d) All of the mentioned
Answer: b
Explanation: Kappa is a measure for categorical outcomes.
Answer: b
Explanation: Residuals are useful for investigating poor model fit.
Answer: b
Explanation: The complementary part of the total variation is called unexplained or residual.
Answer: d
Explanation: In statistics, explained variation measures the proportion to which a mathematical model accounts for the variation of a
given data set.
Answer: d
Explanation: Linear models are the single most important applied statistical and machine learning technique.
Answer: c
Explanation: Outliers can conform to the regression relationship.
Answer: d
Explanation: Linearity refers to a mathematical relationship or function that can be graphically represented as a straight line.
6. Which of the following can be useful for diagnosing data entry errors?
a) hat values
b) dffit
c) resid
d) all of the mentioned
Answer: a
Explanation: resid returns the ordinary residuals.
7. Multivariate regression estimates are exactly those having removed the linear relationship of the other variables from both the regressor
and response.
a) True
b) False
Answer: a
Explanation: Multivariate Data Analysis refers to any statistical technique used to analyze data that arises from more than one
variable.
Answer: c
Explanation: Patterns in your residual plots generally indicate some poor aspect of model fit.
Answer: c
Explanation: rstandard stands for standardized residuals.
10. The least squares estimate for the coefficient of a multivariate regression model is exactly regression through the origin with the linear
relationships.
a) True
b) False
Answer: b
Explanation: Multivariate regression adjusts a coefficient for the linear impact of the other variables.
Answer: d
Explanation: Generalized linear models involve three components.
Answer: d
Explanation: GLM is a flexible generalization of ordinary linear regression that allows for response variables that have error
distribution models other than a normal distribution.
4. Collection of exchangeable binary outcomes for the same covariate data are called _______ outcomes.
a) random
b) direct
c) binomial
d) none of the mentioned
Answer: c
Explanation: The multivariate regression model for binary outcomes gives odds ratios, not risk ratios.
Answer: c
Explanation: Adding cubic terms makes it twice continuously differentiable at the knot points.
Answer: d
Explanation: The Poisson distribution is a useful model for counts and rates.
7. Principal components or factor analytic models on covariates are often useful for reducing complex covariate spaces.
a) True
b) False
Answer: a
Explanation: The space of models explodes quickly as you add interactions and polynomial terms.
Answer: a
Explanation: Bernoulli trial is a random experiment with exactly two possible outcomes.
9. Which of the following analysis is a statistical process for estimating the relationships among variables?
a) Causal
b) Regression
c) Multivariate
d) All of the mentioned
Answer: b
Explanation: Regression models provide the scientist with a powerful tool, allowing predictions about past, present, or future events
to be made with information about past or present events.
10. Linear models are the most useful applied statistical technique.
a) True
b) False
Answer: b
Explanation: Linear models do have limitations.
1. Which of the following can be used to generate balanced cross–validation groupings from a set of data?
a) createFolds
b) createSample
c) createResample
d) none of the mentioned
Answer: a
Explanation: createResample can be used to make simple bootstrap samples.
Answer: a
Explanation: Simple random sampling of time series is probably not the best way to resample time series data.
3. Which of the following function can be used to maximize the minimum dissimilarities?
a) sumDiss
b) minDiss
c) avgDiss
d) all of the mentioned
Answer: b
Explanation: minDiss can be used to maximize the minimum dissimilarities, while sumDiss can be used to maximize the total dissimilarities.
4. Which of the following function can create the indices for time series type of splitting?
a) newTimeSlices
b) createTimeSlices
c) binTimeSlices
d) none of the mentioned
Answer: b
Explanation: Rolling forecasting origin techniques are associated with time series type of splitting.
6. Which of the following can be used to create sub–samples using a maximum dissimilarity approach?
a) minDissim
b) maxDissim
c) inmaxDissim
d) all of the mentioned
Answer: b
Explanation: Splitting is based on the predictors.
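The idea behind maximum dissimilarity sampling can be sketched as a greedy loop: starting from an initial set, repeatedly add the pool sample whose minimum distance to the already selected samples is largest. The code below is an illustrative Python sketch with hypothetical names; it does not reproduce caret's maxDissim function or its arguments.

```python
def min_dissimilarity(candidate, selected, dist):
    """Smallest distance from a candidate to any already selected sample."""
    return min(dist(candidate, s) for s in selected)

def max_dissim_sample(start, pool, n, dist):
    """Greedily pick n pool samples, each maximizing its minimum distance to the chosen set."""
    selected = list(start)
    remaining = list(pool)
    picked = []
    for _ in range(min(n, len(remaining))):
        best = max(remaining, key=lambda c: min_dissimilarity(c, selected, dist))
        remaining.remove(best)
        selected.append(best)
        picked.append(best)
    return picked

if __name__ == "__main__":
    # One-dimensional predictors with absolute difference as the dissimilarity.
    print(max_dissim_sample([0.0], [1.0, 5.0, 9.0, 10.0], 2, lambda a, b: abs(a - b)))
```

Swapping the objective from the minimum to the sum of the distances would give the "total dissimilarity" variant mentioned in the earlier question.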
Answer: b
Explanation: caret uses the proxy package.
8. Which of the following function can be used to create balanced splits of the data?
a) newDataPartition
b) createDataPartition
c) renameDataPartition
d) none of the mentioned
Answer: b
Explanation: If the y argument to this function is a factor, the random sampling occurs within each class and should preserve the
overall class distribution of the data.
Answer: d
Explanation: There are many different modeling functions in R.
Answer: a
Explanation: The caret package is a set of functions that attempt to streamline the process for creating predictive models.
1. Which of the following function is a wrapper for different lattice plots to visualize the data?
a) levelplot
b) featurePlot
c) plotsample
d) none of the mentioned
Answer: b
Explanation: featurePlot is used for data visualization in caret.
Answer: a
Explanation: In some situations, the data generating mechanism can create predictors that only have a single unique value.
3. Which of the following function can be used to identify near zero-variance variables?
a) zeroVar
b) nearVar
c) nearZeroVar
d) all of the mentioned
Answer: c
Explanation: The saveMetrics argument can be used to show the details and usually defaults to FALSE.
4. Which of the following function can be used to flag predictors for removal?
a) searchCorrelation
b) findCausation
c) findCorrelation
d) none of the mentioned
Answer: c
Explanation: Some models thrive on correlated predictors.
Answer: b
Explanation: For each linear combination, it will incrementally remove columns from the matrix and test to see if the dependencies
have been resolved.
6. Which of the following can be used to impute data sets based only on information in the training set?
a) postProcess
b) preProcess
c) process
d) all of the mentioned
Answer: b
Explanation: This can be done with K-nearest neighbors.
7. The function preProcess estimates the required parameters for each operation.
a) True
b) False
Answer: a
Explanation: predict.preProcess is used to apply them to specific data sets.
8. Which of the following can also be used to find new variables that are linear combinations of the original set with independent
components?
a) ICA
b) SCA
c) PCA
d) None of the mentioned
Answer: a
Explanation: ICA stands for independent component analysis.
Answer: b
Explanation: By default, the distances are logged.
10. The preProcess class can be used for many operations on predictors.
a) True
b) False
Answer: a
Explanation: Operations include centering and scaling.
Answer: b
Explanation: The earth package is an implementation of Jerome Friedman’s Multivariate Adaptive Regression Splines.
Answer: c
Explanation: An argument, nonpara, is used to pick the model fitting technique.
3. Which of the following curve analysis is conducted on each predictor for classification?
a) NOC
b) ROC
c) COC
d) All of the mentioned
Answer: b
Explanation: For two class problems, a series of cutoffs is applied to the predictor data to predict the class.
Answer: a
Explanation: GCV change value can also be tracked.
Answer: a
Explanation: The larger the difference between the class centroid and the overall center of the data, the larger the separation between
the classes.
6. Which of the following models includes a backwards elimination feature selection routine?
a) MCV
b) MARS
c) MCRS
d) All of the mentioned
Answer: b
Explanation: MARS stands for Multivariate Adaptive Regression Splines.
7. The advantage of using a model-based approach is that it is more closely tied to the model performance.
a) True
b) False
Answer: a
Explanation: Model-based approach is able to incorporate the correlation structure between the predictors into the importance
calculation.
8. Which of the following model sums the importance over each boosting iteration?
a) Boosted trees
b) Bagged trees
c) Partial least squares
d) None of the mentioned
Answer: a
Explanation: gbm package can be used here.
Answer: a
Explanation: All measures of importance are scaled to have a maximum value of 100.
10. For most classification models, each predictor will have a separate variable importance for each class.
a) True
b) False
Answer: a
Explanation: The exceptions are classification trees, bagged trees and boosted trees.
Answer: a
Explanation: Out of Sample Error is also called generalization error.
Answer: a
Explanation: Evaluation is done last.
Answer: b
Explanation: Garbage in should be equal to garbage out.
Answer: d
Explanation: Perfect in sample prediction can be built.
Answer: d
Explanation: There is always a trade-off in prediction accuracy.
Answer: b
Explanation: True positive means correctly identified.
Answer: d
Explanation: Interpretability also matters during prediction.
Answer: a
Explanation: Out of sample error is given more importance.
Answer: a
Explanation: Backtesting is the process of applying a trading strategy or analytical method to historical data to see how accurately the
strategy or method would have predicted actual results.
Answer: d
Explanation: Cross-validation is also used to pick type of prediction function to be used.
Answer: c
Explanation: False positive means incorrectly identified.
Answer: d
Explanation: Sensitivity and specificity are statistical measures of the performance of a binary classification test, also known in
statistics as classification function.
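A minimal Python sketch of the two measures named above, assuming binary labels coded as 0/1 with 1 as the positive class. The function name and the sample labels are hypothetical and only illustrate the definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)).

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of actual positives that are detected
    specificity = tn / (tn + fp)  # share of actual negatives that are detected
    return sensitivity, specificity


# Hypothetical labels: 4 actual positives, 4 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

The sketch assumes both classes are present; with no actual positives or no actual negatives the corresponding ratio is undefined.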
Answer: d
Explanation: Random sampling with replacement is the bootstrap.
Answer: c
Explanation: RMSE stands for Root Mean Squared Error.
Answer: b
Explanation: For k-fold cross-validation, a larger value of k implies less bias.
Answer: a
Explanation: repeatedcv stands for repeated cross-validation.
9. Which of the following can be used to create the most common graph types?
a) qplot
b) quickplot
c) plot
d) all of the mentioned
Answer: a
Explanation: qplot() is short for a quick plot.
Answer: a
Explanation: Larger k value implies more variance.
Answer: b
Explanation: Predicting with trees is easy to interpret.
Answer: a
Explanation: Training and testing data must be processed in same way.
3. Which of the following method options is provided by train function for bagging?
a) bagEarth
b) treebag
c) bagFDA
d) all of the mentioned
Answer: d
Explanation: Bagging can be done using bag function as well.
Answer: a
Explanation: Random forest is one of the top-performing algorithms in prediction.
Answer: d
Explanation: Prediction with regression gives poor performance in non linear settings.
6. Which of the following libraries is used for boosting generalized additive models?
a) gamBoost
b) gbm
c) ada
d) all of the mentioned
Answer: a
Explanation: Boosting can be used with any subset of classifier.
7. The principal components are equal to left singular values if you first scale the variables.
a) True
b) False
Answer: b
Explanation: The principal components are equal to the right singular vectors, not the left ones, if you first scale the variables (subtract the mean and divide by the standard deviation).
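The relationship between PCA and the singular value decomposition can be checked numerically. The sketch below is an illustration in Python with NumPy rather than the R workflow the quiz refers to; the random data, seed, and variable names are assumptions. It scales the columns, takes the SVD, and compares the right singular vectors with the eigenvectors of the covariance matrix of the scaled data (directions are only defined up to sign).

```python
import numpy as np

# Hypothetical data: 50 samples, 3 correlated variables.
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.2]])
X = rng.normal(size=(50, 3)) @ mix

# Scale first: subtract the mean and divide by the standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the scaled data; rows of Vt are the right singular vectors.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# PCA via eigen-decomposition of the covariance matrix of the scaled data.
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Compare directions up to sign, and variances with the squared singular values.
same_directions = np.allclose(np.abs(eigvecs.T), np.abs(Vt), atol=1e-8)
same_variances = np.allclose(eigvals, s**2 / (Z.shape[0] - 1))
print(same_directions, same_variances)
```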
8. Which of the following is statistical boosting based on additive logistic regression?
a) gamBoost
b) gbm
c) ada
d) mboost
Answer: a
Explanation: mboost is used for model based boosting.
Answer: b
Explanation: R has multiple boosting libraries.
Answer: b
Explanation: PCA is most useful for linear type models.
Answer: a
Explanation: Regularized regression does not perform as well as random forest.
Answer: c
Explanation: Model-based approaches are reasonably accurate on real problems.
3. Which of the following methods are present in caret for regularized regression?
a) ridge
b) lasso
c) relaxo
d) all of the mentioned
Answer: d
Explanation: In caret, one can tune over the number of predictors to retain instead of defined values for the penalty.
Answer: c
Explanation: You can combine classifier by averaging.
Answer: d
Explanation: The cl_predict function in the clue package provides unsupervised prediction.
7. Model based prediction considers relatively easy version for covariance matrix.
a) True
b) False
Answer: b
Explanation: Model based prediction considers relatively easy version for covariance matrix.
8. Which of the following is used to assist the quantitative trader in the development?
a) quantmod
b) quantile
c) quantity
d) mboost
Answer: a
Explanation: Quandl package is similar to quantmod.
Answer: b
Explanation: Forecasting is the process of making predictions of the future based on past and present data and analysis of trends.
Answer: b
Explanation: Predictive analytics goes beyond forecasting.
1. Which of the following projects is used for calling R products from the web?
a) OpenCPU
b) OpenDisk
c) OpenMem
d) All of the mentioned
Answer: a
Explanation: OpenCPU is complementary to OpenCPU.
Answer: c
Explanation: Time to create data products is less using shiny.
Answer: a
Explanation: Shiny applications are automatically “live” in the same way that spreadsheets are live.
Answer: d
Explanation: shiny allows users to upload files.
Answer: d
Explanation: A shiny project is a directory containing at least two parts.
6. Which of the following function can interrupt execution and can be called continuously?
a) browser()
b) browse()
c) search()
d) all of the mentioned
Answer: a
Explanation: Debugging shiny apps can be difficult.
7. runApp() will run the shiny and open the browser window.
a) True
b) False
Answer: a
Explanation: The chart is rendered within the browser using Flash.
8. Which of the following function is for single checkbox widget?
a) checkboxInput
b) dateInput
c) singleboxInput
d) all of the mentioned
Answer: a
Explanation: Shiny comes with a family of pre-built widgets, each created with a transparently named R function.
Answer: d
Explanation: Shiny apps have two components: a user-interface script and a server script.
Answer: b
Explanation: All of the styled elements are handled through ui.R.
Answer: b
Explanation: D3 is a JavaScript library for visualizing data with HTML, SVG, and CSS.
Answer: b
Explanation: Slidify is customizable and extendable.
Answer: a
Explanation: Devtools should be installed in advance.
4. Which of the following will be used to compose the content of the presentation?
a) ui.RMD
b) index.RMD
c) server.RMD
d) all of the mentioned
Answer: b
Explanation: index.RMD is an R markdown document.
Answer: a
Explanation: Slidify allows mathematical formulas as well.
6. Which of the following statements generates an HTML slide deck from index.Rmd?
a) slidify("index.Rmd")
b) lib.slidify("index.Rmd")
c) slidifylib("index.Rmd")
d) all of the mentioned
Answer: a
Explanation: It is a static file, which means that you can open it in your browser locally and it should display fine.
Answer: b
Explanation: The first part of index.Rmd is YAML code.
Answer: a
Explanation: Slidify is not on CRAN.
Answer: d
Explanation: Many interactive elements can be added to slidify.
10. MathJax is a cross-browser JavaScript library that displays mathematical notation in web browsers.
a) True
b) False
Answer: a
Explanation: MathJax uses MathML.
Answer: a
Explanation: googleVis allows users to create interactive charts based on data frames.
Answer: a
Explanation: The plot command does not open a graphics device in the traditional way.
3. Which of the following create a Google Gadget based on a Google Visualization Object?
a) createGadget
b) createGoogleGadget
c) newGoogleGadget
d) all of the mentioned
Answer: b
Explanation: createGoogleGadget returns a Google Gadget XML string.
4. Which of the following reads a data.frame and creates text output referring to the Google Visualization API?
a) gvisAnnotatedLine
b) gvisTimeLine
c) gvisAnnotatedTimeLine
d) none of the mentioned
Answer: c
Explanation: An annotated time line is an interactive time series line chart with optional annotations.
Answer: d
Explanation: This can be included into a web page, or as a stand-alone page.
Answer: b
Explanation: gvisLineChart is used for creating line charts.
Answer: a
Explanation: The chart is rendered within the browser using flash.
8. The actual chart of gvisBarChart is rendered by the web browser using _________ or VML.
a) JPEG
b) SVG
c) PDF
d) All of the mentioned
Answer: b
Explanation: gvisBarChart reads a data frame.
Answer: c
Explanation: gvisGeoChart is used for interactive maps.
10. gvisAnnotationChart charts are interactive time series line charts that support annotations.
a) True
b) False
Answer: a
Explanation: Unlike the gvisAnnotatedTimeLine, which uses flash, annotation charts are SVG/VML and should be preferred
whenever possible.
Answer: d
Explanation: NumPy is the fundamental package for scientific computing with Python.
Answer: c
Explanation: SciPy provides a lot of scientific routines that work on top of NumPy.
3. The ________ function returns its argument with a modified shape, whereas the ________ method modifies the array itself.
a) reshape, resize
b) resize, reshape
c) reshape2, resize
d) all of the mentioned
Answer: a
Explanation: If a dimension is given as -1 in a reshaping operation, the other dimensions are automatically calculated.
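The difference between the two operations can be seen in a short sketch. It assumes NumPy is imported as np; the array contents are arbitrary. reshape returns an array with the new shape and leaves the original untouched, while the resize method changes the array it is called on. A dimension given as -1 is inferred from the remaining ones.

```python
import numpy as np

a = np.arange(12)

# reshape returns a reshaped array; -1 lets NumPy infer the missing dimension.
b = a.reshape(3, -1)
print(a.shape)   # a keeps its original shape
print(b.shape)   # 12 elements in 3 rows gives 4 columns

# resize is a method that modifies the array in place and returns None.
c = np.arange(12)
returned = c.resize((2, 6))
print(c.shape, returned)
```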
4. To create sequences of numbers, NumPy provides a function __________ analogous to range that returns arrays instead of lists.
a) arange
b) aspace
c) aline
d) all of the mentioned
Answer: a
Explanation: When arange is used with floating point arguments, it is generally not possible to predict the number of elements
obtained.
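A brief sketch of the point about floating point arguments, assuming NumPy is imported as np. With integer arguments the number of elements from arange is easy to predict; when a fixed number of floating point values is wanted, np.linspace takes the count directly instead of a step. The specific ranges below are arbitrary.

```python
import numpy as np

# Integer arguments: start, stop (exclusive), step.
ints = np.arange(10, 30, 5)
print(ints)

# For floats, asking for a number of points avoids depending on how the
# step accumulates under finite floating point precision.
pts = np.linspace(0.0, 2.0, 9)   # 9 evenly spaced values, both endpoints included
print(len(pts), pts[0], pts[-1])
```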
Answer: d
Explanation: The number of axes is called rank.
Answer: b
Explanation: column_stack is equivalent to vstack only for 1D arrays.
Answer: a
Explanation: numpy.array is not the same as the Standard Python Library class array.array.
8. Which of the following method creates a new array object that looks at the same data?
a) view
b) copy
c) paste
d) all of the mentioned
Answer: a
Explanation: The copy method makes a complete copy of the array and its data.
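The contrast between view and copy can be sketched as follows, assuming NumPy is imported as np and using arbitrary values. A view is a new array object that shares the original's data, so writes through the view are visible in the original; a copy owns its own data.

```python
import numpy as np

a = np.arange(6)

v = a.view()   # new array object looking at the same data
c = a.copy()   # new array object with its own data

v[0] = 100     # visible through a, because v shares a's buffer
c[1] = -1      # not visible through a

print(a)
print(v.base is a)     # the view's base is the original array
print(c.base is None)  # the copy owns its data
```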
9. Which of the following function can be used to combine different vectors so as to obtain the result for each n-uplet?
a) iid_
b) ix_
c) ixd_
d) all of the mentioned
Answer: b
Explanation: Length of the 1D boolean array must coincide with the length of the dimension (or axis) you want to slice.
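As an illustration of what ix_ does, the sketch below (assuming NumPy is imported as np, with arbitrary example vectors) evaluates a + b * c for every triplet drawn from three vectors. ix_ reshapes each vector so that broadcasting produces one result per combination.

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

# Each returned array has length-1 axes everywhere except its own axis.
ax, bx, cx = np.ix_(a, b, c)
print(ax.shape, bx.shape, cx.shape)

# Broadcasting yields one entry for every (i, j, k) triplet.
result = ax + bx * cx
print(result.shape)
print(result[3, 2, 4] == a[3] + b[2] * c[4])
```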
10. ndarray.dataitemSize is the buffer containing the actual elements of the array.
a) True
b) False
Answer: b
Explanation: ndarray.data is the buffer containing the actual elements of the array.
1. Which of the following sets the size of the buffer used in ufuncs?
a) bufsize(size)
b) setsize(size)
c) setbufsize(size)
d) all of the mentioned
Answer: c
Explanation: Adjusting the size of the buffer may therefore alter the speed at which ufunc calculations of various sorts are completed.
Answer: b
Explanation: ufunc instances can also be produced using the frompyfunc factory function.
3. Which of the following attribute should be used while checking for type combination input and output?
a) .types
b) .type
c) .class
d) all of the mentioned
Answer: a
Explanation: Universal functions in NumPy are flexible enough to have mixed type signatures.
4. Which of the following returns an array of ones with the same shape and type as a given array?
a) all_like
b) ones_like
c) one_alike
d) all of the mentioned
Answer: b
Explanation: The optional output arguments of the function can be used to help you save memory for large calculations.
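A short sketch of both ideas in this question, assuming NumPy is imported as np and using an arbitrary float32 array. ones_like copies the shape and dtype of its argument, and the out argument of a ufunc writes the result into an existing array instead of allocating a new one.

```python
import numpy as np

x = np.array([[1.5, 2.5], [3.5, 4.5]], dtype=np.float32)

# Same shape and dtype as x, filled with ones.
ones = np.ones_like(x)
print(ones.shape, ones.dtype)

# Preallocate the output once and let the ufunc fill it.
out = np.empty_like(x)
res = np.multiply(x, 2, out=out)
print(res is out)   # the ufunc returns the array passed as out
print(out)
```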
Answer: c
Explanation: The output of the ufunc is not necessarily an ndarray, if all input arguments are not ndarrays.
6. Which of the following sets the floating-point error callback function or log object?
a) seterr
b) seterrcall
c) setterstack
d) all of the mentioned
Answer: b
Explanation: seterr sets how floating-point errors are handled.
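The two NumPy functions involved here work together: np.seterrcall registers the callback, and np.seterr decides which error categories are routed to it. The sketch below assumes NumPy is imported as np; the handler name and the messages list are hypothetical. It restores the previous settings afterwards so the change stays local.

```python
import numpy as np

messages = []

def on_fp_error(kind, flag):
    """Hypothetical handler: record the error description passed by NumPy."""
    messages.append(kind)

old_handler = np.seterrcall(on_fp_error)   # register the callback
old_settings = np.seterr(divide="call")    # route divide-by-zero to the callback
try:
    np.array([1.0, 2.0]) / 0.0             # triggers the callback
finally:
    np.seterr(**old_settings)              # restore previous error handling
    np.seterrcall(old_handler)             # restore previous callback

print(messages)
```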
Answer: b
Explanation: All ufuncs can take output arguments. If necessary, output will be cast to the data-type of the provided output array.
8. ___________ decompose the elements of x into mantissa and twos exponent.
a) trunc
b) fmod
c) frexp
d) ldexp
Answer: c
Explanation: The fmod function returns the element-wise remainder of division.
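A small sketch of frexp, assuming NumPy is imported as np and using arbitrary inputs. frexp splits each element x into a mantissa m and an integer exponent e with x = m * 2**e, and ldexp performs the inverse operation.

```python
import numpy as np

x = np.array([8.0, 0.75, 10.0])

# Decompose into mantissa and power-of-two exponent.
mantissa, exponent = np.frexp(x)
print(mantissa)   # each nonzero mantissa has magnitude in [0.5, 1)
print(exponent)

# ldexp rebuilds the original values as mantissa * 2**exponent.
rebuilt = np.ldexp(mantissa, exponent)
print(np.array_equal(rebuilt, x))
```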
Answer: a
Explanation: The iscomplex function returns a bool array that is True where the input element is complex.
10. The array object returned by __array_prepare__ is passed to the ufunc for computation.
a) True
b) False
Answer: a
Explanation: If the class has an __array_wrap__ method, the returned ndarray result will be passed to that method just before passing
control back to the caller.
1. When talking to a speech recognition program, the program divides each second of your speech into 100 separate __________
a) Codes
b) Phonemes
c) Samples
d) Words
Answer: c
Explanation: None.
2. Which term is used for describing the judgmental or commonsense part of problem solving?
a) Heuristic
b) Critical
c) Value based
d) Analytical
Answer: a
Explanation: None.
3. Which stage of the manufacturing process has been described as “the mapping of function onto form”?
a) Design
b) Distribution
c) Project management
d) Field service
Answer: a
Explanation: None.
Answer: a
Explanation: None.
Answer: c
Explanation: None.
7. PROLOG is an AI programming language, which solves problems with a form of symbolic logic known as predicate calculus. It was
developed in 1972 at the University of Marseilles by a team of specialists. Can you name the person who headed this team?
a) Alain Colmerauer
b) Niklaus Wirth
c) Seymour Papert
d) John McCarthy
Answer: a
Explanation: None.
8. Programming a robot by physically moving it through the trajectory you want it to follow is called __________
a) contact sensing control
b) continuous-path control
c) robot vision control
d) pick-and-place control
Answer: b
Explanation: None.
Answer: b
Explanation: None.
Answer: b
Explanation: None.
11. ART (Automatic Reasoning Tool) is designed to be used on __________
a) LISP machines
b) Personal computers
c) Microcomputers
d) All of the mentioned
Answer: a
Explanation: None.
Answer: c
Explanation: None.
13. Shaping teaching techniques to fit the learning patterns of individual students is the goal of __________
a) decision support
b) automatic programming
c) intelligent computer-assisted instruction
d) expert systems
Answer: c
Explanation: None.
14. Which of the following functions returns t if the object is a symbol in LISP?
a) (* <object>)
b) (symbolp <object>)
c) (nonnumeric <object>)
d) (constantp <object>)
Answer: b
Explanation: None.
15. The symbols used in describing the syntax of a programming language are __________
a) ()
b) {}
c) “”
d) <>
Answer: d
Explanation: None.
Answer: d
Explanation: None.
2. Which company offers the LISP machine considered “the most powerful symbolic processor available”?
a) LMI
b) Symbolics
c) Xerox
d) Texas Instruments
Answer: b
Explanation: None.
3. Which of the following is considered a pivotal event in the history of Artificial Intelligence?
a) 1949, Donald O. Hebb, The Organization of Behavior
b) 1950, Computing Machinery and Intelligence
c) 1956, Dartmouth University Conference Organized by John McCarthy
d) 1961, Computer and Computer Sense
Answer: c
Explanation: None.
Answer: c
Explanation: None.
Answer: c
Explanation: None.
Answer: c
Explanation: None.
7. Which of the following have people traditionally done better than computers?
a) recognizing relative importance
b) finding similarities
c) resolving ambiguity
d) all of the mentioned
Answer: c
Explanation: None.
9. Which type of actuator generates a good deal of power but tends to be messy?
a) electric
b) hydraulic
c) pneumatic
d) both hydraulic & pneumatic
Answer: b
Explanation: None.
10. Research scientists all over the world are taking steps towards building computers with circuits patterned after the complex
interconnections existing among the human brain’s nerve cells. What name is given to such type of computers?
a) Intelligent computers
b) Supercomputers
c) Neural network computers
d) Smart computers
Answer: c
Explanation: None.
Answer: b
Explanation: None.
Answer: d
Explanation: None.
13. The Cedar, BBN Butterfly, Cosmic Cube and Hypercube machine can be characterized as _____________
a) SISD
b) MIMD
c) SIMD
d) MISD
Answer: b
Explanation: None.
14. A series of AI systems, developed by Pat Langley to explore the role of heuristics in scientific discovery is ________
a) RAMD
b) BACON
c) MIT
d) DU
Answer: b
Explanation: None.
1. Nils Nilsson headed a team at SRI that created a mobile robot named _____________
a) Robotics
b) Dedalus
c) Shakey
d) Vax
Answer: c
Explanation: None.
2. An Artificial Intelligence technique that allows computers to understand associations and relationships between objects and events is
called _____________
a) heuristic processing
b) cognitive science
c) relative symbolism
d) pattern matching
Answer: c
Explanation: None.
3. The new organization established to implement the Fifth Generation Project is called _____________
a) ICOT (Institute for New Generation Computer Technology)
b) MITI (Ministry of International Trade and Industry)
c) MCC (Microelectronics and Computer Technology Corporation)
d) SCP (Strategic Computing Program)
Answer: a
Explanation: None.
Answer: b
Explanation: None.
5. What is the name of the computer program that simulates the thought processes of human beings?
a) Human logic
b) Expert reason
c) Expert system
d) Personal information
Answer: c
Explanation: None.
6. What is the name of the computer program that contains the distilled knowledge of an expert?
a) Database management system
b) Management information System
c) Expert system
d) Artificial intelligence
Answer: c
Explanation: None.
7. Claude Shannon described the operation of electronic switching circuits with a system of mathematical logic called _____________
a) LISP
b) XLISP
c) Neural networking
d) Boolean algebra
Answer: d
Explanation: None.
Answer: c
Explanation: None.
9. What is the term used for describing the judgmental or commonsense part of problem solving?
a) Heuristic
b) Critical
c) Value based
d) Analytical
Answer: a
Explanation: None.
10. What was originally called the “imitation game” by its creator?
a) The Turing Test
b) LISP
c) The Logic Theorist
d) Cybernetics
Answer: a
Explanation: None.
11. Decision support programs are designed to help managers make _____________
a) budget projections
b) visual presentations
c) business decisions
d) vacation schedules
Answer: c
Explanation: None.
12. Programming a robot by physically moving it through the trajectory you want it to follow is called _____________
a) contact sensing control
b) continuous-path control
c) robot vision control
d) pick-and-place control
Answer: b
Explanation: None
Answer: d
Explanation: None.
Answer: a
Explanation: None.
Answer: d
Explanation: None.
5. An expert system differs from a database program in that only an expert system _____________
a) contains declarative knowledge
b) contains procedural knowledge
c) features the retrieval of stored information
d) expects users to draw their own conclusions
Answer: b
Explanation: None.
Answer: a
Explanation: None.
Answer: d
Explanation: None.
8. Which of the following are examples of software development tools?
a) debuggers
b) editors
c) assemblers, compilers and interpreters
d) all of the mentioned
Answer: d
Explanation: None.
Answer: d
Explanation: None.
Answer: d
Explanation: None.
Answer: a
Explanation: Machine learning is the autonomous acquisition of knowledge through the use of computer programs.
2. Which of the following is not a factor that affects the performance of a learner system?
a) Representation scheme used
b) Training scenario
c) Type of feedback
d) Good data structures
Answer: d
Explanation: Good data structures are not among the factors that affect the performance of a learner system.
Answer: d
Explanation: The different learning methods do not include introduction.
Answer: c
Explanation: In language understanding, the levels of knowledge do not include empirical knowledge.
Answer: d
Explanation: The categories that make up a model of language do not include structural units.
Answer: a
Explanation: A top-down parser begins by hypothesizing a sentence (the symbol S) and successively predicting lower level
constituents until individual preterminal symbols are written.
Answer: d
Explanation: p → ¬q is not a Horn clause.
Answer: d
Explanation: The action ‘STACK(A,B)’ of a robot arm specifies placing block A on block B.
Answer: c
Explanation: The three required terms are one conditional probability and two unconditional probabilities.
Answer: d
Explanation: Bayes rule can be used to answer the probabilistic queries conditioned on one piece of evidence.
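The Bayes' rule computation referred to above, with its one conditional and two unconditional probabilities, can be sketched in a few lines of Python. The function name and the numbers are hypothetical and serve only to show how the three terms combine: P(H | E) = P(E | H) P(H) / P(E).

```python
import math


def bayes_posterior(p_e_given_h, p_h, p_e):
    """Return P(H | E) from P(E | H), P(H) and P(E)."""
    return p_e_given_h * p_h / p_e


# Hypothetical values: P(E | H) = 0.8, P(H) = 0.1, P(E) = 0.2.
posterior = bayes_posterior(0.8, 0.1, 0.2)
print(math.isclose(posterior, 0.4))
```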
Answer: a
Explanation: A Bayesian network provides a complete description of the domain.
5. How can the entries in the full joint probability distribution be calculated?
a) Using variables
b) Using information
c) Both Using variables & information
d) None of the mentioned
Answer: b
Explanation: Every entry in the full joint probability distribution can be calculated from the information in the network.
Answer: b
Explanation: If a bayesian network is a representation of the joint distribution, then it can solve any query, by summing all the
relevant joint entries.
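How joint entries follow from the network, and how a query is answered by summing them, can be sketched with a two-node example. The network (Rain → WetGrass) and all probabilities below are hypothetical; each joint entry is the product of every node's probability given its parents.

```python
import math

# Hypothetical conditional probability tables for Rain -> WetGrass.
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},    # P(Wet | Rain = True)
    False: {True: 0.1, False: 0.9},   # P(Wet | Rain = False)
}


def joint(rain, wet):
    """One entry of the full joint distribution: P(Rain) * P(Wet | Rain)."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]


# Answer a query by summing the relevant joint entries: P(Wet = True).
p_wet = sum(joint(rain, True) for rain in (True, False))

# All joint entries together must sum to 1.
total = sum(joint(r, w) for r in (True, False) for w in (True, False))
print(p_wet, total)
```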
Answer: a
Explanation: The compactness of the bayesian network is an example of a very general property of a locally structured system.
Answer: c
Explanation: Local structure is usually associated with linear rather than exponential growth in complexity.
9. Which condition is used to influence a variable directly by all the others?
a) Partially connected
b) Fully connected
c) Local connected
d) None of the mentioned
Answer: b
Explanation: None.
10. What is the consequence between a node and its predecessors while creating bayesian network?
a) Functionally dependent
b) Dependent
c) Conditionally independent
d) Both Conditionally dependent & Dependent
Answer: c
Explanation: The semantics used to derive a method for constructing Bayesian networks lead to the consequence that a node can be
conditionally independent of its predecessors.
1. A _________ is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including
chance event outcomes, resource costs, and utility.
a) Decision tree
b) Graphs
c) Trees
d) Neural Networks
Answer: a
Explanation: Refer to the definition of a decision tree.
Answer: a
Explanation: None.
Answer: c
Explanation: Refer to the definition of a decision tree.
Answer: a
Explanation: None.
Answer: d
Explanation: None.
Answer: b
Explanation: None.
Answer: c
Explanation: None.
Answer: d
Explanation: None.
Answer: d
Explanation: None
Ans : D
Ans : D
Explanation: ML is a field of AI consisting of learning algorithms that improve their performance (P) at executing some task (T)
over time with experience (E).
3. p → ¬q is not a?
A. hack clause
B. horn clause
C. structural clause
D. system clause
Ans : B
A. STACK(A,B)
B. LIST(A,B)
C. QUEUE(A,B)
D. ARRAY(A,B)
Ans : A
Explanation: The action 'STACK(A,B)' of a robot arm specifies placing block A on block B.
5. A __________ begins by hypothesizing a sentence (the symbol S) and successively predicting lower level constituents until individual
preterminal symbols are written.
A. bottom-up parser
B. top parser
C. top-down parser
D. bottom parser
Ans : C
Explanation: A top-down parser begins by hypothesizing a sentence (the symbol S) and successively predicting lower level
constituents until individual preterminal symbols are written.
6. The categories that make up a model of language do not include ________.
A. System Unit
B. structural units.
C. data units
D. empirical units
Ans : B
Explanation: The categories that make up a model of language do not include structural units.
A. Introduction
B. Analogy
C. Deduction
D. Memorization
Ans : A
8. Training a model with all the data in one single batch is known as?
A. Batch learning
B. Offline learning
C. Both A and B
D. None of the above
Ans : C
Explanation: In some end-to-end machine learning systems, the model must be trained in one go using all of the available training
data. This kind of learning method or algorithm is called batch or offline learning.
Ans : A
Explanation: The following are various ML methods based on some broad categories : Based on human supervision, Unsupervised
Learning, Semi-supervised Learning and Reinforcement Learning
10. In Model based learning methods, an iterative process takes place on the ML models that are built based on various model parameters,
called ?
A. mini-batches
B. optimizedparameters
C. hyperparameters
D. superparameters
Ans : C
Explanation: In Model based learning methods, an iterative process takes place on the ML models that are built based on various
model parameters, called hyperparameters.
ELL784/EEL709: Introduction to Machine Learning
Minor Test I, Form: A (please write this Form ID on the cover page of your answer script)
Maximum marks: 20
1. Consider a binary classification problem. Suppose I have trained a model on a linearly separable
training set, and now I get a new labeled data point which is correctly classified by the model, and far
away from the decision boundary. If I now add this new point to my earlier training set and re-train,
in which cases is the learnt decision boundary likely to change?
(a) When my model is a perceptron.
(b) When my model is logistic regression.
(c) When my model is an SVM.
(d) When my model is Gaussian discriminant analysis.
2. When doing least-squares regression with regularisation (assuming that the optimisation can be done
exactly), increasing the value of the regularisation parameter λ
(a) will never decrease the training error.
(b) will never increase the training error.
(c) will never decrease the testing error.
(d) will never increase the testing error.
(e) may either increase or decrease the training error.
(f) may either increase or decrease the testing error.
3. Which of the following points would Bayesians and frequentists disagree on?
(a) The use of a non-Gaussian noise model in probabilistic regression.
(b) The use of probabilistic modelling for regression.
(c) The use of prior distributions on the parameters in a probabilistic model.
(d) The use of class priors in Gaussian Discriminant Analysis.
(e) The idea of assuming a probability distribution over models.
4. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are
relative to the ideal model.)
(a) Models which overfit have a high bias.
(b) Models which overfit have a low bias.
(c) Models which underfit have a high variance.
(d) Models which underfit have a low variance.
5. Which of the following are characteristics of data sampled from a Gaussian distribution?
(a) The sample mean systematically underestimates the true mean.
(b) The sample variance systematically overestimates the true variance.
(c) Both the sample mean and variance are unbiased estimators of the true values.
6. Suppose your model is overfitting. Which of the following is NOT a valid way to try and reduce the
overfitting?
(a) Increase the amount of training data.
(b) Improve the optimisation algorithm being used for error minimisation.
(c) Decrease the model complexity.
(d) Reduce the noise in the training data.
7. You are reviewing papers for the World’s Fanciest Machine Learning Conference, and you see
submissions with the following claims. Which ones would you consider accepting?
(a) My method achieves a training error lower than all previous methods!
(b) My method achieves a test error lower than all previous methods! (Footnote: When
regularisation parameter λ is chosen so as to minimise test error.)
(c) My method achieves a test error lower than all previous methods! (Footnote: When
regularisation parameter λ is chosen so as to minimise cross-validation error.)
(d) My method achieves a cross-validation error lower than all previous methods! (Footnote:
When regularisation parameter λ is chosen so as to minimise cross-validation error.)
Suppose we set up a simple probabilistic model for this as follows: θ is the prior probability of a train
being late; p is the probability of a late prediction from Method 1 if the train is on time (also called
the False Positive Rate (FPR)); and q is the probability of a late prediction from Method 1 if the train
is in fact late (also called the True Positive Rate (TPR)).
(a) Write down the joint likelihood of the data for Method 1, as a function of the three model parameters
θ, p, and q. Obtain maximum likelihood estimates for each of these parameters. [4]
(b) Suppose the loss matrix for this prediction task is defined as follows:
Using the parameter estimates computed above, obtain the expected loss for Method 1 as a function
of K. [2]
(c) Obtain the expected loss for Method 2 as well (you can compute its FPR and TPR directly, without
doing the maximum likelihood derivations again); which is the preferable method? What is the critical
value of K at which this preference changes? [4]
Answer Key for Exam A
Section 1. Multiple choice questions
Each question may have any number of correct answers, including zero. List all choices you believe to be
correct (1 mark for each correct answer, -0.5 for each incorrect answer). No justification is required.
1. Consider a binary classification problem. Suppose I have trained a model on a linearly separable
training set, and now I get a new labeled data point which is correctly classified by the model, and far
away from the decision boundary. If I now add this new point to my earlier training set and re-train,
in which cases is the learnt decision boundary likely to change?
(a) When my model is a perceptron.
(b) When my model is logistic regression.
(c) When my model is an SVM.
(d) When my model is Gaussian discriminant analysis.
2. When doing least-squares regression with regularisation (assuming that the optimisation can be done
exactly), increasing the value of the regularisation parameter λ
(a) will never decrease the training error.
(b) will never increase the training error.
(c) will never decrease the testing error.
(d) will never increase the testing error.
(e) may either increase or decrease the training error.
(f) may either increase or decrease the testing error.
3. Which of the following points would Bayesians and frequentists disagree on?
(a) The use of a non-Gaussian noise model in probabilistic regression.
(b) The use of probabilistic modelling for regression.
(c) The use of prior distributions on the parameters in a probabilistic model.
(d) The use of class priors in Gaussian Discriminant Analysis.
(e) The idea of assuming a probability distribution over models.
4. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are
relative to the ideal model.)
(a) Models which overfit have a high bias.
(b) Models which overfit have a low bias.
(c) Models which underfit have a high variance.
(d) Models which underfit have a low variance.
5. Which of the following are characteristics of data sampled from a Gaussian distribution?
(a) The sample mean systematically underestimates the true mean.
(b) The sample variance systematically overestimates the true variance.
(c) Both the sample mean and variance are unbiased estimators of the true values.
6. Suppose your model is overfitting. Which of the following is NOT a valid way to try and reduce the
overfitting?
(a) Increase the amount of training data.
(b) Improve the optimisation algorithm being used for error minimisation.
(c) Decrease the model complexity.
(d) Reduce the noise in the training data.
7. You are reviewing papers for the World’s Fanciest Machine Learning Conference, and you see submissions
with the following claims. Which ones would you consider accepting?
(a) My method achieves a training error lower than all previous methods!
(b) My method achieves a test error lower than all previous methods! (Footnote: When regularisation
parameter λ is chosen so as to minimise test error.)
(c) My method achieves a test error lower than all previous methods! (Footnote: When regularisation
parameter λ is chosen so as to minimise cross-validation error.)
(d) My method achieves a cross-validation error lower than all previous methods! (Footnote:
When regularisation parameter λ is chosen so as to minimise cross-validation error.)
Suppose we set up a simple probabilistic model for this as follows: θ is the prior probability of a train
being late; p is the probability of a late prediction from Method 1 if the train is on time (also called
the False Positive Rate (FPR)); and q is the probability of a late prediction from Method 1 if the train
is in fact late (also called the True Positive Rate (TPR)).
(a) Write down the joint likelihood of the data for Method 1, as a function of the three model parameters
θ, p, and q. Obtain maximum likelihood estimates for each of these parameters. [4]
(b) Suppose the loss matrix for this prediction task is defined as follows:
                     Actually on time   Actually late
Predicted on time           0                 1
Predicted late              K                 0
Using the parameter estimates computed above, obtain the expected loss for Method 1 as a function
of K. [2]
(c) Obtain the expected loss for Method 2 as well (you can compute its FPR and TPR directly, without
doing the maximum likelihood derivations again); which is the preferable method? What is the critical
value of K at which this preference changes? [4]
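For Method 2:
$\hat{p}_{ML} = \frac{68}{150}$
$\hat{q}_{ML} = \frac{278}{350}$
$E[L] = \frac{350}{500} \times \frac{72}{350} + K \times \frac{150}{500} \times \frac{68}{150} = \frac{72}{500} + K \, \frac{68}{500} = \frac{72 + 68K}{500}$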
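The arithmetic for Method 2 can be reproduced with a short sketch. The counts used (150 on-time trains, 350 late trains, 68 false alarms, 72 missed late trains) are read off the fractions above; the function name and argument names are illustrative assumptions, not part of the original solution.

```python
from fractions import Fraction

def expected_loss(n_on_time, n_late, false_alarms, misses, K):
    """Expected loss under the loss matrix above:
    cost 1 for predicting 'on time' when late, cost K for predicting 'late' when on time."""
    total = n_on_time + n_late
    theta = Fraction(n_late, total)           # estimated P(late)
    p = Fraction(false_alarms, n_on_time)     # estimated FPR
    q = Fraction(n_late - misses, n_late)     # estimated TPR
    return theta * (1 - q) * 1 + (1 - theta) * p * K

# Method 2 with K = 1 gives (72 + 68) / 500
loss_k1 = expected_loss(150, 350, 68, 72, 1)
```

Exact fractions are used so the result can be compared term by term with the hand derivation.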
The preferable method is the one with the lower expected loss, which depends on the value of K. Let
the critical value be KC , then we have
Clustering VS Classification
MCQ
1. What is the relation between the distance between clusters and the corresponding class
discriminability?
a. proportional
b. inversely-proportional
c. no-relation
Ans: (a)
Ans: (d)
Ans: (b)
Ans: (c)
Ans: (b)
6. Unsupervised classification can be termed as
a. distance measurement
b. dimensionality reduction
c. clustering
d. none of the above
Ans: (d)
Ans: (c)
Linear Algebra
MCQ
1. Which of the properties are true for matrix multiplication
a. Distributive
b. Commutative
c. both a and b
d. neither a nor b
Ans: (a)
2. Which of the operations can be valid with two matrices of different sizes?
a. addition
b. subtraction
c. multiplication
d. Division
Ans: (c)
Ans: (c)
Ans: (a)
Ans: (d)
Ans: (d)
Ans: (d)
Eigenvalues and Eigenvectors
MCQ
1. The eigenvalues of the matrix $\begin{pmatrix} 2 & 7 \\ -1 & -6 \end{pmatrix}$ are
a. 3 and 0
b. -2 and 7
c. -5 and 1
d. 3 and -5
Ans: (c)
2. The eigenvalues of $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ are
a. -1, 1 and 2
b. 1, 1 and -2
c. -1, -1 and 2
d. 1, 1 and 2
Ans: (c)
3. The eigenvectors of $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ are
a. (1 1 1), (1 0 1) and (1 1 0)
b. (1 1 -1), (1 0 -1) and (1 1 0)
c. (-1 1 -1), (1 0 1) and (1 1 0)
d. (1 1 1), (-1 0 1) and (-1 1 0)
Ans: (d)
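The answers to questions 1 to 3 can be checked by hand or with a small sketch like the one below. It uses the characteristic polynomial for the 2 x 2 matrix and verifies A v = λ v directly for the 3 x 3 matrix; the helper names are illustrative assumptions.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenpair(A, lam, v):
    """True if A v equals lam * v exactly (integer arithmetic)."""
    return matvec(A, v) == [lam * x for x in v]

# Question 1: eigenvalues from lambda^2 - trace*lambda + det = 0
A2 = [[2, 7], [-1, -6]]
trace = A2[0][0] + A2[1][1]
det = A2[0][0] * A2[1][1] - A2[0][1] * A2[1][0]
disc = trace ** 2 - 4 * det
eigs_2x2 = sorted([(trace - disc ** 0.5) / 2, (trace + disc ** 0.5) / 2])

# Questions 2 and 3: eigenvalue 2 with (1 1 1), eigenvalue -1 with the other two vectors
A3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
checks = [
    is_eigenpair(A3, 2, [1, 1, 1]),
    is_eigenpair(A3, -1, [-1, 0, 1]),
    is_eigenpair(A3, -1, [-1, 1, 0]),
]
```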
Ans: (c)
Ans: (c)
Ans: (a)
Vector Spaces
MCQ
1. Which of these is a vector space?
a. {(x y z w)′ ∈ R4 |x + y − z + w = 0}
b. {(x y z)′ ∈ R3 |x + y + z = 0}
c. {(x y z)′ ∈ R3 | x² + y² + z² = 1}
d. {$\begin{pmatrix} a & 1 \\ b & c \end{pmatrix}$ | a, b, c ∈ R}
Ans: (a)
Ans: (d)
Ans: (d)
4. What is the dimension of the subspace $H = \left\{ \begin{pmatrix} a - 3b + 6c \\ 5a + 4d \\ b - 2c - d \\ 5d \end{pmatrix} : a, b, c, d \in \mathbb{R} \right\}$?
a. 1
b. 2
c. 3
d. 4
Ans: (c)
5. What is the rank of the matrix $\begin{pmatrix} 2 & -1 & 1 & -6 & 8 \\ 1 & -2 & -4 & 3 & -2 \\ -7 & 8 & 10 & 3 & -10 \\ 4 & -5 & -7 & 0 & 4 \end{pmatrix}$?
a. 2
b. 3
c. 4
d. 5
Ans: (a)
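Questions 4 and 5 both reduce to a rank computation. The following is a minimal sketch using Gaussian elimination with exact fractions; the function name is an illustrative assumption. For question 4, the rows passed in are the coefficient vectors of a, b, c and d in the definition of H.

```python
from fractions import Fraction

def rank_by_elimination(rows):
    """Rank of a matrix via forward elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        # find a pivot at or below the current rank row
        pivot = next((r for r in range(rank, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

# Question 5: the 4 x 5 matrix
rank_q5 = rank_by_elimination([
    [2, -1, 1, -6, 8],
    [1, -2, -4, 3, -2],
    [-7, 8, 10, 3, -10],
    [4, -5, -7, 0, 4],
])

# Question 4: spanning vectors of H (coefficients of a, b, c, d)
dim_H = rank_by_elimination([
    [1, 5, 0, 0],
    [-3, 0, 1, 0],
    [6, 0, -2, 0],
    [0, 4, -1, 5],
])
```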
6. If v1, v2, v3, v4 are in $\mathbb{R}^4$ and v3 is not a linear combination of v1, v2, v4, then {v1, v2, v3, v4}
must be linearly independent.
a. True
b. False
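7. The vectors $x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $x_2 = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}$, $x_3 = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}$ are: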
a. Linearly dependent
b. Linearly independent
8. The vectors $x_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, $x_2 = \begin{pmatrix} -5 \\ 3 \end{pmatrix}$ are:
a. Linearly dependent
b. Linearly independent
Ans: (b).
Rank and SVD
MCQ
1. The number of non-zero rows in an echelon form is called?
Ans: (b)
2. Let A and B be arbitrary m x n matrices. Then which one of the following statements is true?
a. 𝑟𝑎𝑛𝑘(𝐴 + 𝐵) ≤ 𝑟𝑎𝑛𝑘(𝐴) + 𝑟𝑎𝑛𝑘(𝐵)
b. 𝑟𝑎𝑛𝑘(𝐴 + 𝐵) < 𝑟𝑎𝑛𝑘(𝐴) + 𝑟𝑎𝑛𝑘(𝐵)
c. 𝑟𝑎𝑛𝑘(𝐴 + 𝐵) ≥ 𝑟𝑎𝑛𝑘(𝐴) + 𝑟𝑎𝑛𝑘(𝐵)
d. 𝑟𝑎𝑛𝑘(𝐴 + 𝐵) > 𝑟𝑎𝑛𝑘(𝐴) + 𝑟𝑎𝑛𝑘(𝐵)
Ans: (a)
3. The rank of the matrix $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ is
a. 0
b. 2
c. 1
d. 3
Ans: (a)
4. The rank of $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$ is
a. 3
b. 2
c. 1
d. 0
Ans: (c)
5. Consider the following two statements:
I. The maximum number of linearly independent column vectors of a matrix A is called the rank of A.
II. If A is an n x n square matrix, it will be nonsingular if rank A = n.
With reference to the above statements, which of the following applies?
Ans: (b)
6. The rank of a 3 x 3 matrix C (= AB), found by multiplying a non-zero column matrix A of size
3 x 1 and a non-zero row matrix B of size 1 x 3, is
a. 0
b. 1
c. 2
d. 3
Ans: (b)
7. Find the singular values of the matrix $B = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$
a. 2 and 4
b. 3 and 4
c. 2 and 3
d. 3 and 1
Ans: (d)
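The singular values of B are the square roots of the eigenvalues of $B^T B$. A short NumPy sketch (assuming NumPy is available; the variable names are illustrative) computes them both ways:

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# Direct route: singular values, returned in descending order
singular_values = np.linalg.svd(B, compute_uv=False)

# Indirect route: square roots of the eigenvalues of B^T B (ascending order)
sqrt_eigs = np.sqrt(np.linalg.eigvalsh(B.T @ B))
```

Both routes should agree up to ordering and floating-point rounding.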
a. $AA^T$
b. $A^TA$
c. $AA^{-1}$
d. A*A
Ans: (a)
Ans: (a)
Normal Distribution and Decision Boundary I
MCQ
1. Three components of Bayes decision rule are class prior, likelihood and …
a. Evidence
b. Instance
c. Confidence
d. Salience
Ans: (a)
Ans: (a)
Ans: (d)
4. When the value of a data point is equal to the mean of the distribution to which it belongs, the
Gaussian function attains its … value
a. Minimum
b. Maximum
c. Zero
d. None of the above
Ans: (b)
Ans: (a)
6. Property of correlation coefficient is
a. −1 ≤ 𝜌𝑥𝑦 ≤ 1
b. −0.5 ≤ 𝜌𝑥𝑦 ≤ 1
c. −1 ≤ 𝜌𝑥𝑦 ≤ 1.5
d. −0.5 ≤ 𝜌𝑥𝑦 ≤ 0.5
Ans: (a)
Ans: (b)
Ans: (a)
Ans: (a)
Normal Distribution and Decision Boundary II
MCQ
1. If the covariance matrix is strictly diagonal with equal variance then the iso-contour lines (data
scatter) of the data resembles
a. Concentric circle
b. Ellipse
c. Oriented Ellipse
d. None of the above
Ans: (a)
Ans: (c)
Ans: (a)
Ans: (b)
Ans: (b)
6. For spiral data the decision boundary will be
a. Linear
b. Non-linear
c. Does not exist
Ans: (b)
7. In a 2-class problem, if the discriminant function satisfies 𝑔1 (𝑥) = 𝑔2 (𝑥) then, the data point
lies
a. On the DB
b. Class 1’s side
c. Class 2’s side
d. None of the above
Ans: (a)
Bayes Theorem
MCQ
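1. $P(\vec{X}) \, P(w_i \mid \vec{X}) =$
a. $P(1 - \vec{X}) \, P(w_i \mid \vec{X})$
b. $P(\vec{X}) \, P(1 - w_i \mid \vec{X})$
c. $P(\vec{X} \mid w_i) \, P(w_i)$
d. $P(\vec{X} - w_i) \, P(w_i \mid \vec{X})$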
Ans: (c)
Ans: (a)
Ans: (b)
4. When the covariance term in the Mahalanobis distance becomes the identity matrix, the distance is equivalent
to
a. Euclidean distance
b. Manhattan distance
c. City block distance
d. Geodesic distance
Ans: (a)
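The reduction to the Euclidean distance follows because the inverse of the identity is the identity, so $(x - \mu)^T \Sigma^{-1} (x - \mu)$ becomes $(x - \mu)^T (x - \mu)$. A minimal NumPy sketch, with an illustrative helper name and made-up example points:

```python
import numpy as np

def mahalanobis(x, mu, cov):
    """Mahalanobis distance sqrt((x - mu)^T cov^{-1} (x - mu))."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

x = [3.0, 4.0]    # example point (assumption for illustration)
mu = [0.0, 0.0]   # example mean (assumption for illustration)

d_identity = mahalanobis(x, mu, np.eye(2))
d_euclidean = float(np.linalg.norm(np.array(x) - np.array(mu)))
```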
Ans: (d)
6. Bayes error is the ….. bound of probability of classification error.
a. Lower
b. Upper
Ans: (a)
7. Bayes decision rule is the theoretically …….. classifier that minimizes the probability of classification
error.
a. Best
b. Worst
c. Average
Ans: (a)
Linear Discriminant Function and Perceptron Learning
MCQ
1. A perceptron is:
a. a single McCulloch-Pitts neuron
b. an autoassociative neural network
c. a double layer autoassociative neural network
d. All the above
Ans: (a)
2. Perceptron is used as a classifier for
a. Linearly separable data
b. Non-linearly separable data
c. Linearly non-separable data
d. Any data
Ans: (a)
3. A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with the
constant of proportionality being equal to 2. The inputs are 4, 10, 5 and 20 respectively.
The output will be:
a. 238
b. 76
c. 119
d. 178
Ans: (a)
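The answer to question 3 is the weighted sum of the inputs scaled by the constant of proportionality. A minimal sketch (the function name is an illustrative assumption):

```python
def linear_neuron(weights, inputs, k):
    """Output of a neuron with a linear transfer function f(a) = k * a."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return k * activation

# weights 1, 2, 3, 4; inputs 4, 10, 5, 20; constant of proportionality 2
output = linear_neuron([1, 2, 3, 4], [4, 10, 5, 20], 2)
```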
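[Figure: neuron with inputs $u_1$ and $u_2$ and output f(a)]
$f(a) = \begin{cases} 1 & \text{for } a > 0 \\ 0 & \text{for } a = 0 \\ -1 & \text{for } a < 0 \end{cases}$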
Ans: (a)
Ans: (a)
7. Consider a perceptron for which the training sample is $u \in \mathbb{R}^2$ and the actual output is $x \in \{0, 1\}$. Let
the desired output be 0 when elements of class A={(2,4),(3,2),(3,4)} are applied as input,
and let it be 1 for the class B={(1,0),(1,2),(2,1)}. Let the learning rate η be 0.5 and the initial
connection weights be $w_0 = 0$, $w_1 = 1$, $w_2 = 1$. Answer the following questions:
A. Will the perceptron convergence procedure terminate if the input patterns from class
A and B are repeatedly applied, choosing a very small learning rate?
a. Yes
b. No
c. Can’t say
Ans: (a), since the classes are linearly separable.
B. Now add the sample (5,2) to class B. What is your answer now, i.e., will it converge or
not?
a. Yes
b. No
c. Can’t say
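Ans: (b). After adding this sample, the classes are no longer linearly separable: the point (3,2) of class A is the midpoint of (1,2) and (5,2), which both belong to class B.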
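Both parts of question 7 can be explored with a small simulation. This is a sketch under stated assumptions: the output is taken to be 1 when the activation is positive and 0 otherwise, the bias input is fixed at 1 (weight $w_0$), and training stops after a fixed number of epochs. None of these details are specified in the question, and the function names are illustrative.

```python
def step(a):
    """Threshold activation (assumed): 1 if a > 0, else 0."""
    return 1 if a > 0 else 0

def train_perceptron(samples, w, eta=0.5, max_epochs=1000):
    """Perceptron rule w <- w + eta * (target - output) * x.
    samples: list of ((u1, u2), target). Returns (weights, converged)."""
    w = list(w)
    for _ in range(max_epochs):
        mistakes = 0
        for (u1, u2), target in samples:
            x = (1.0, u1, u2)  # leading 1.0 is the bias input for w0
            out = step(sum(wi * xi for wi, xi in zip(w, x)))
            if out != target:
                w = [wi + eta * (target - out) * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:      # a full pass without errors: converged
            return w, True
    return w, False

class_a = [((2, 4), 0), ((3, 2), 0), ((3, 4), 0)]
class_b = [((1, 0), 1), ((1, 2), 1), ((2, 1), 1)]

# Part A: linearly separable classes
_, converged_a = train_perceptron(class_a + class_b, [0.0, 1.0, 1.0])

# Part B: adding (5, 2) to class B makes the classes non-separable
_, converged_b = train_perceptron(class_a + class_b + [((5, 2), 1)], [0.0, 1.0, 1.0])
```

In part B the loop always runs to `max_epochs`, because no weight vector can classify (1,2), (5,2) and their midpoint (3,2) correctly at the same time.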
Linear and Non-Linear Decision Boundaries
MCQ
1. Decision Boundary in case of same covariance matrix, with identical diagonal elements is :
a. Linear
b. Non-Linear
c. None of the above
Ans: (a)
2. Decision Boundary in case of diagonal covariance matrix, with identical diagonal elements is
given by $W^T (X - X_0) = 0$, where $W$ is given by:
a. $(\mu_k - \mu_l) / \sigma^2$
b. $(\mu_k + \mu_l) / \sigma^2$
c. $(\mu_k^2 + \mu_l^2) / \sigma^2$
d. $(\mu_k + \mu_l) / \sigma$
Ans: (a)
3. Decision Boundary in case of arbitrary covariance matrix but identical for all class is :
a. Linear
b. Non-Linear
c. None of the above
Ans: (a)
4. Decision Boundary in case of arbitrary covariance matrix but identical for all class is given by
$W^T (X - X_0) = 0$, where $W$ is given by:
a. $(\mu_k - \mu_l) / \sigma^2$
b. $\Sigma^{-1} (\mu_k - \mu_l)$
c. $(\mu_k^2 + \mu_l^2) / \sigma^2$
d. $\Sigma^{-1} (\mu_k^2 - \mu_l^2)$
Ans: (b)
Ans: (b)
6. Discriminant function in case of arbitrary covariance matrix and all parameters are class
dependent is given by $X^T W_i X + w_i^T X + w_{i0} = 0$, where $W_i$ is given by:
a. $-\frac{1}{2} \Sigma_i^{-1}$
b. $\Sigma_i^{-1} \mu_i$
c. $-\frac{1}{2} \Sigma_i^{-1} \mu_i$
d. $-\frac{1}{4} \Sigma_i^{-1}$
Ans: (a)
PCA
MCQ
1. The tool used to obtain a PCA is
a. LU Decomposition
b. QR Decomposition
c. SVD
d. Cholesky Decomposition
Ans: (c)
Ans: (b)
b. $\sum_{k=1}^{N} (x_k - \mu)^T (x_k - \mu)$
c. $\sum_{k=1}^{N} (\mu - x_k)(\mu - x_k)^T$
d. $\sum_{k=1}^{N} (\mu - x_k)^T (\mu - x_k)$
Ans: (a)
Ans: (b)
5. The vectors which correspond to the vanishing singular values of a matrix that span the null
space of the matrix are:
a. Right singular vectors
b. Left singular vectors
c. All the singular vectors
d. None
Ans: (a)
6. If 𝑆 is the scatter of the data in the original domain, then the scatter of the transformed feature
vectors is given by
a. 𝑆 𝑇
b. 𝑆
c. 𝑊𝑆𝑊 𝑇
d. 𝑊 𝑇 𝑆𝑊
Ans: (d)
Ans: (a)
8. The following linear transform does not have a fixed set of basis vectors:
a. DCT
b. DFT
c. DWT
d. PCA
Ans: (d)
b. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_k - \mu_i)^T (x_k - \mu_i)$
c. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_i - \mu_k)(x_i - \mu_k)^T$
d. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_i - \mu_k)^T (x_i - \mu_k)$
Ans: (a)
10. The Between Class scatter matrix is given by:
a. $\sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T$
b. $\sum_{i=1}^{c} N_i (\mu_i - \mu)^T (\mu_i - \mu)$
c. $\sum_{i=1}^{c} N_i (\mu - \mu_i)(\mu - \mu_i)^T$
d. $\sum_{i=1}^{c} N_i (\mu - \mu_i)^T (\mu - \mu_i)$
Ans: (a)
Ans: (a)
Linear Discriminant Analysis
MCQ
1. Linear Discriminant Analysis is
a. Unsupervised Learning
b. Supervised Learning
c. Semi-supervised Learning
d. None of the above
Ans: (b)
Ans: (b)
Ans: (a)
4. The upper bound of the number of non-zero eigenvalues of $S_w^{-1} S_B$ is (C = No. of Classes)
a. C - 1
b. C + 1
c. C
d. None of the above
Ans: (a)
5. If $S_w$ is singular and N<D, its rank is at most (N is total number of samples, D dimension of data, C
is number of classes)
a. N + C
b. N
c. C
d. N - C
Ans: (d)
6. If $S_w$ is singular and N<D the alternative solution is to use (N is total number of samples, D
dimension of data)
a. EM
b. PCA
c. ML
d. Any one of the above
Ans: (b)
GMM
MCQ
1. A method to estimate the parameters of a distribution is
a. Maximum Likelihood
b. Linear Programming
c. Dynamic Programming
d. Convex Optimization
Ans: (a)
Ans: (c)
Ans: (a)
Ans: (b)
Ans: (d)
6. For Gaussian mixture models, parameters are estimated using a closed form solution by
a. Expectation Minimization
b. Expectation Maximization
c. Maximum Likelihood
d. None of the above
Ans: (b)
Ans: (b,c)
Ans: c
(1) Let $\{X_1, \ldots, X_n\}$ be i.i.d. samples from $N(\mu, \sigma^2)$, with $\sigma > 0$, and let
$\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. Then, which of the following statements is true?
(a) $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2 = \sum_{i=1}^{n} (X_i - \mu)^2$.
(b) $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2 \le \sum_{i=1}^{n} (X_i - \mu)^2$.
(c) $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2 > \sum_{i=1}^{n} (X_i - \mu)^2$.
(d) An inequality/equality relating $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2$ and $\sum_{i=1}^{n} (X_i - \mu)^2$ does not
always hold.
Answer:
(2) Consider a Bayesian estimation problem, with data $\{X_1, \ldots, X_n\}$ i.i.d. from $N(\theta, 1)$,
and a $N(0, 1)$ prior. Letting $S_n = \sum_{i=1}^{n} X_i$, the posterior mean is
(a) $\frac{S_n}{n}$ (b) $\frac{S_n}{n+1}$
(c) $\frac{n S_n}{n+1}$ (d) $\frac{S_n + 1}{n+2}$
Answer:
(3) Let $X \sim \mathrm{Unif}[0, \theta]$. Then, the maximum likelihood estimate of θ, given i.i.d. samples
$\{X_1, \ldots, X_n\}$ is
(a) $\sum_{i=1}^{n} \frac{S_n}{n}$.
(b) $\min_{i=1,\ldots,n} X_i$.
(c) $\max_{i=1,\ldots,n} X_i$. (d) $\frac{1}{2} (\max_{i=1,\ldots,n} X_i - \min_{i=1,\ldots,n} X_i)$.
Answer:
(4) Suppose that we are trying to fit a linear model and a 10th-degree polynomial to data coming
from a cubic function, corrupted by standard Gaussian noise. Let M1 and M2 denote
the models corresponding to the linear and 10th-degree polynomial fits. Then,
(a) Bias(M1 ) ≤ Bias(M2 ), Variance(M1 ) ≤ Variance(M2 ).
(b) Bias(M1 ) ≤ Bias(M2 ), Variance(M1 ) ≥ Variance(M2 ).
(c) Bias(M1 ) ≥ Bias(M2 ), Variance(M1 ) ≤ Variance(M2 ).
(d) Bias(M1 ) ≥ Bias(M2 ), Variance(M1 ) ≥ Variance(M2 ).
Answer:
(5) Consider a regression problem, with scalar input X ∈ R, and target Y ∈ R. Suppose
(X, Y) is bivariate normal with non-zero means, positive variances, and non-zero correlation.
Then, the optimal predictor, for the square loss, as a function of X is
(a) Quadratic. (b) Constant.
(c) Linear. (d) None of the above.
Answer:
Ordination – generalities
1. The primary objective of an ordination of multivariate data is to display the objects in a
diagram where similar objects are together and objects with different characteristics are far apart.
– True, False.
2. Ecologists use multivariate ordination methods such as PCA because the data they want to
display are multivariate. – True, False.
3. An ordination method is a statistical test. – True, False.
Principal component analysis (PCA) – computation
4. Principal component analysis (PCA) can be used with variables of any mathematical types:
quantitative, qualitative, or a mixture of these types. – True, False.
5. Principal component analysis (PCA) requires quantitative multivariate data. – True, False.
6. The sum of the PCA eigenvalues is equal to the sum of the variances of the variables. – True,
False.
7. Variances and covariances can be computed for variables of any mathematical types:
quantitative, qualitative, or a mixture of these types. – True, False.
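Statement 6 follows from the fact that the sum of the eigenvalues of a covariance matrix equals its trace, and the trace is the sum of the variances. A minimal NumPy sketch on synthetic data (the data, seed and variable names are assumptions made only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# 50 objects, 4 quantitative variables with different spreads (synthetic)
Y = rng.normal(size=(50, 4)) * np.array([1.0, 2.0, 0.5, 3.0])

S = np.cov(Y, rowvar=False)                  # covariance matrix of the variables
eigenvalues = np.linalg.eigvalsh(S)          # PCA eigenvalues
total_variance = Y.var(axis=0, ddof=1).sum() # sum of the variable variances
```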
Variable transformation
8. For variables with physical dimensions (e.g. kg), their variances also have physical
dimensions. – True, False.
9. The variables subjected to PCA must all have the same physical dimensions. – True, False.
10. When the variables have different physical dimensions, they must be made dimensionless by
standardization or ranging before PCA. – True, False.
11. Tables of environmental variables that have different physical dimensions must be
standardized before PCA. – True, False.
12. PCA ordination diagrams are easier to interpret when the distributions of the variables are
symmetrical. – True, False.
13. For community composition data, the Hellinger and chord transformations are appropriate
before PCA. – True, False.
PCA biplots
14. PCA biplots are graphs in which objects and variables (descriptors) are represented together.
– True, False.
15. In PCA, distance biplots (scaling 1) correctly represent the positions of the objects with
respect to one another, projected in 2 dimensions. – True, False.
16. In PCA, correlation biplots (scaling 2) correctly represent the angular relationships among the
variables, projected in 2 dimensions. – True, False.
17. Groups of similar sites can be identified on distance biplots (scaling 1). – True, False.
18. Intercorrelated groups of species can be identified on correlation biplots (scaling 2). – True,
False.
Equilibrium circle of descriptors
19. An equilibrium circle of descriptors can be drawn on PCA distance biplots (scaling 1). –
True, False.
20. An equilibrium circle of descriptors can be drawn on PCA correlation biplots (scaling 2). –
True, False.
Meaningful components, algorithms
21. The most meaningful and interpretable principal components are those that have the largest
eigenvalues. – True, False.
22. The broken-stick model is often used as a null model against which one can assess the
eigenvalues, in order to determine the most important eigenvalues and how many PCA axes one
should examine and plot. – True, False.
23. Eigen decomposition, singular value decomposition (SVD) and iterative search of
eigenvalues and eigenvectors are three different ways of computing PCA. They produce the same
PCA results. – True, False.
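The equivalence claimed in statement 23 can be illustrated for two of the three routes. The sketch below compares an eigen decomposition of the covariance matrix with an SVD of the centred data, assuming NumPy and synthetic data; principal axes are compared up to sign, since either method may flip the direction of an axis.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(size=(30, 3))        # synthetic data: 30 objects, 3 variables
Yc = Y - Y.mean(axis=0)             # centre each variable
n = Yc.shape[0]

# Route 1: eigen decomposition of the covariance matrix (sorted descending)
eigvals, eigvecs = np.linalg.eigh(Yc.T @ Yc / (n - 1))
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Route 2: SVD of the centred data
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
svd_eigvals = s ** 2 / (n - 1)

# Principal axes agree up to sign: |dot products| form the identity matrix
same_axes = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(3), atol=1e-8)
```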
4. False
5. True
6. True
7. False
8. True
9. True
10. True
11. True
12. True
13. True
14. True
15. True
16. True
17. True
18. True
19. True
20. False
21. True
22. True
23. True
Sample questions for “Fundamentals of Machine Learning 2018”
Teacher: Mohammad Emtiyaz Khan
• In the final exam, no electronic devices are allowed except a calculator. Make
sure that your calculator is only a calculator and cannot be used for any other
purpose.
• For derivations, clearly explain your derivation step by step. In the final
exam you will be marked for steps as well as for the end result.
• We will denote the output data vector by $y$, which is a vector that contains
all $y_n$, and the feature matrix by $X$, which is a matrix containing features $x_n^T$
as rows. Also, $\tilde{x}_n = [1, x_n^T]^T$.
1 Multiple-Choice/Numerical Questions
1. Choose the options that are correct regarding machine learning (ML) and
artificial intelligence (AI),
Answer: (D)
1 1 1
Answer: 1
1 1 1
Answer: 2
12 8 −36
Answer: 2
Answer: C
(A) Linear in D.
(B) Polynomial in D.
(C) Exponential in D.
(D) Linear in N .
Answer: C,D
(B) It can be applied to non-continuous functions.
(C) It is easy to implement.
(D) It runs reasonably fast for multiple linear regression.
Answer: A,B,C.
Answer: A,C,D
11. Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?
(A) O(D)
(B) O(N )
(C) O(N D)
(D) O(N D2 )
Answer: (A)
12. Let us say that we are fitting a one-parameter model to the data, i.e. $y_n \approx \beta_0$.
The average of $y_1, y_2, \ldots, y_N$ is 1. We start gradient descent at $\beta_0^{(0)} = 0$ and
set the step-size to 0.5. What is the value of $\beta_0$ after 3 iterations, i.e., the
value of $\beta_0^{(3)}$?
Answer: 0.875 (deviation 0.01)
13. Let us say that we are fitting a one-parameter model to the data, i.e. $y_n \approx \beta_0$.
The average of $y_1, y_2, \ldots, y_N$ is 1. We start gradient descent at $\beta_0^{(0)} = 10$ and
set the step-size to 0.5. What is the value of $\beta_0$ after 3 iterations, i.e., the
value of $\beta_0^{(3)}$?
Answer: 2.125 (deviation 0.01)
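The answers to questions 12 and 13 can be reproduced with a few lines. This sketch assumes the cost is the mean squared error $\frac{1}{2N} \sum_n (y_n - \beta_0)^2$, whose gradient is $\beta_0 - \bar{y}$; the questions do not restate the cost, so this normalisation is an assumption, chosen because it matches the stated answers. The function name is illustrative.

```python
def gradient_descent_constant_model(y_mean, beta_start, step_size, n_iters):
    """Gradient descent for the model y_n ~ beta0 under the (assumed) MSE cost.
    The gradient of (1/2N) * sum((y_n - beta0)^2) is beta0 - mean(y)."""
    beta = beta_start
    for _ in range(n_iters):
        gradient = beta - y_mean
        beta = beta - step_size * gradient
    return beta

beta_q12 = gradient_descent_constant_model(1.0, 0.0, 0.5, 3)   # 0 -> 0.5 -> 0.75 -> 0.875
beta_q13 = gradient_descent_constant_model(1.0, 10.0, 0.5, 3)  # 10 -> 5.5 -> 3.25 -> 2.125
```

With step-size 0.5 each iteration halves the distance to the mean, which is why both sequences approach 1.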
14. Computational complexity of Gradient descent is,
(A) linear in D
(B) linear in N
(C) polynomial in D
(D) dependent on the number of iterations
Answer: C
15. Generalization error measures how well an algorithm performs on unseen data.
The test error obtained using cross-validation is an estimate of the generalization
error. Is this estimate unbiased?
Answer: (No)
(A) linear in K
(B) quadratic in K
(C) cubic in K
(D) exponential in K
Answer: A
17. You observe the following while fitting a linear regression to the data: As
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you expect
it to be), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.
Answer: A
18. Adding more basis functions in a linear model... (pick the most probable
option)
(D) Doesn’t affect bias and variance
Answer: A
2 Multiple-output regression
Suppose we have N regression training-pairs, but instead of one output for each
input vector $x_n \in \mathbb{R}^D$, we now have 2 outputs $y_n = [y_{n1}, y_{n2}]$, where $y_{n1}$ and
$y_{n2}$ are real numbers. For each output, we wish to fit a separate linear model:
$y_{n1} \approx \beta_1^T \tilde{x}_n$, $y_{n2} \approx \beta_2^T \tilde{x}_n$,
where $\beta_1$ and $\beta_2$ are vectors of $\beta_{1d}$ and $\beta_{2d}$ respectively, for d = 0, 1, 2, . . . , D, and
$\tilde{x}_n^T = [1 \; x_n^T]$.
Our goal is to estimate β 1 and β 2 for which we choose to minimize the following
cost function:
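$L(\beta_1, \beta_2) := \sum_{n=1}^{N} \left[ \frac{1}{2} \left( y_{n1} - \beta_1^T \tilde{x}_n \right)^2 + \frac{1}{2} \left( y_{n2} - \beta_2^T \tilde{x}_n \right)^2 \right] + \lambda_1 \sum_{d=0}^{D} \beta_{1d}^2 + \lambda_2 \sum_{d=0}^{D} \beta_{2d}^2.$ (6)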
Answer:
(A) $\frac{\partial L}{\partial \beta_1} = -\sum_{n=1}^{N} \left( y_{n1} - \beta_1^T \tilde{x}_n \right) \tilde{x}_n + 2 \lambda_1 \beta_1$, and similarly for $\beta_2$.
(B) The number of parameters is equal to 30 and the number of data points is
equal to 40. It is good to regularize, but just a mild regularization will do
since the number of parameters is still less than number of data points.
(C) Yes, we expect this to be the case because, if the data points are i.i.d., then
we might need less regularization.
(D) Same as gradient descent (please put an exact number here for the final
exam).
3 Eigenvalues
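Given a real-valued matrix $X$, show that all the non-zero eigenvalues of $XX^T$ and
$X^T X$ are the same.
Answer: To prove this, you can use the SVD $X = USV^T$. Then $XX^T = U S S^T U^T$
and $X^T X = V S^T S V^T$. Both $SS^T$ and $S^T S$ are diagonal, and their non-zero entries are
the squared singular values of $X$, so the non-zero eigenvalues are the same, although the
total number of eigenvalues differs when $X$ is not square.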
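The claim can also be checked numerically. The sketch below uses a random 5 x 3 matrix (an arbitrary example, assuming NumPy), so $XX^T$ has five eigenvalues and $X^T X$ has three; the tolerance used to decide what counts as zero is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                # arbitrary real-valued example matrix

eig_big = np.linalg.eigvalsh(X @ X.T)      # 5 eigenvalues, ascending
eig_small = np.linalg.eigvalsh(X.T @ X)    # 3 eigenvalues, ascending

# Discard the numerically zero eigenvalues of the larger matrix
nonzero_big = eig_big[eig_big > 1e-10]

# Both sets should match the squared singular values of X
squared_singular = np.sort(np.linalg.svd(X, compute_uv=False) ** 2)
```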
Consider the following artificial neural network with the nonlinear transformation
$z_{nm} = \sigma(a_{nm})$ (see figure below). Here, n is the data index and m is the index of
hidden units. There are two binary outputs $y_{n1}$ and $y_{n2}$ taking values in {0, 1}.
Suppose you have N = 200 data points but M = 200 hidden units for each layer.
What problem(s) are you likely to encounter when training such a network? How
would you solve the problem(s)?
Answer: Overfitting. There are multiple ways to tackle this problem as discussed
in the lecture.