Machine Learning

This document contains a 29 question multiple choice quiz on machine learning concepts. The questions cover topics such as adaptive system management, Bayesian classifiers, algorithms, bias, background knowledge, case-based learning, classification, binary attributes, classification accuracy, clusters, black boxes, data mining, discovery, DNA, hybrid systems, Euclidean distance, hidden knowledge, heterogeneous databases, enumeration, heuristics, hybrid learning, Kohonen self-organizing maps, and incremental learning. Each question is followed by 4 possible answers with one correct answer identified.

Uploaded by

Dip

MCQ for unit 1: Introduction to Machine learning

1) Adaptive system management is

A) It uses machine-learning techniques. Here programs can learn
from past experience and adapt themselves to new situations.
B) Computational procedure that takes some value as input and
produces some value as output.
C) Science of making machines perform tasks that would require
intelligence when performed by humans.
D) None of these

Answer: A

2) A Bayesian classifier is

A) A class of learning algorithms that tries to find an optimum
classification of a set of examples using probability
theory.
B) Any mechanism employed by a learning system to constrain the
search space of a hypothesis.
C) An approach to the design of learning algorithms that is
inspired by the fact that when people encounter new situations,
they often explain them by reference to familiar experiences,
adapting the explanations to fit the new situation.
D) None of these

Answer: A

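3) An algorithm is

A) It uses machine-learning techniques. Here programs can learn
from past experience and adapt themselves to new situations.
B) Computational procedure that takes some value as input and
produces some value as output.
C) Science of making machines perform tasks that would require
intelligence when performed by humans.
D) None of these

Answer: B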

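4) Bias is

A) A class of learning algorithms that tries to find an optimum
classification of a set of examples using probability
theory.
B) Any mechanism employed by a learning system to constrain the
search space of a hypothesis.
C) An approach to the design of learning algorithms that is
inspired by the fact that when people encounter new situations,
they often explain them by reference to familiar experiences,
adapting the explanations to fit the new situation.
D) None of these

Answer: B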

5) Background knowledge refers to

A) Additional knowledge used by a learning algorithm to
facilitate the learning process.
B) A neural network that makes use of a hidden layer.
C) It is a form of automatic learning.
D) None of these

Answer: A

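6) Case-based learning is

A) A class of learning algorithms that tries to find an optimum
classification of a set of examples using probability
theory.
B) Any mechanism employed by a learning system to constrain the
search space of a hypothesis.
C) An approach to the design of learning algorithms that is
inspired by the fact that when people encounter new situations,
they often explain them by reference to familiar experiences,
adapting the explanations to fit the new situation.
D) None of these

Answer: C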

7) Classification is

A) A subdivision of a set of examples into a number of classes.
B) A measure of the accuracy of the classification of a
concept that is given by a certain theory.
C) The task of assigning a classification to a set of examples
D) None of these

Answer: A

8) A binary attribute is

A) An attribute that takes only two values. In general, these
values will be 0 and 1, and they can be coded as one bit.
B) The natural environment of a certain species.
C) Systems that can be used without knowledge of internal
operations.
D) None of these

Answer: A

9) Classification accuracy is

A) A subdivision of a set of examples into a number of classes
B) A measure of the accuracy of the classification of a concept
that is given by a certain theory.
C) The task of assigning a classification to a set of examples
D) None of these

Answer: B
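
Note: as an illustration of the idea of classification accuracy, here is a minimal sketch that measures accuracy as the fraction of examples classified correctly. The function name and the sample labels are assumptions made for this example, not part of the quiz.

```python
def classification_accuracy(predicted, actual):
    """Return the fraction of examples whose predicted class matches the actual class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one example is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical labels for five examples.
actual = ["spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "spam", "spam", "ham"]

accuracy = classification_accuracy(predicted, actual)
print(accuracy)  # 4 of the 5 predictions match
```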

10) A biotope is

A) An attribute that takes only two values. In general, these
values will be 0 and 1, and they can be coded as one bit.
B) The natural environment of a certain species
C) Systems that can be used without knowledge of internal
operations
D) None of these

Answer: B

11) A cluster is

A) Group of similar objects that differ significantly from
other objects
B) Operations on a database to transform or simplify data in
order to prepare it for a machine-learning algorithm
C) Symbolic representation of facts or ideas from which
information can potentially be extracted
D) None of these

Answer: A

12) Black boxes are

A) An attribute that takes only two values. In general, these
values will be 0 and 1, and they can be coded as one bit.
B) The natural environment of a certain species
C) Systems that can be used without knowledge of internal
operations
D) None of these

Answer: C

13) A definition of a concept is ------ if it recognizes all the
instances of that concept

A) Complete
B) Consistent
C) Constant
D) None of these

Answer: A

14) Data mining is

A) The actual discovery phase of a knowledge discovery process

B) The stage of selecting the right data for a KDD process
C) A subject-oriented integrated time variant non-volatile
collection of data in support of management
D) None of these

Answer: A

15) A definition of a concept is ------ if it does not classify
any negative examples as coming within the concept

A) Complete
B) Consistent
C) Constant
D) None of these

Answer: B
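
Note: the terms "complete" and "consistent" from questions 13 and 15 can be checked mechanically. The sketch below assumes the usual reading: a definition is complete if it accepts every positive instance, and consistent if it accepts no negative example. The concept ("even number") and the three candidate definitions are hypothetical.

```python
def is_complete(definition, positives):
    """Complete: the definition recognizes every positive instance."""
    return all(definition(x) for x in positives)


def is_consistent(definition, negatives):
    """Consistent: the definition accepts no negative example."""
    return not any(definition(x) for x in negatives)


# Hypothetical concept: "even number".
positives = [2, 4, 6, 8]
negatives = [1, 3, 5, 7]


def too_narrow(x):
    return x % 4 == 0  # misses 2 and 6


def too_broad(x):
    return True  # accepts every number, including the negatives


def exact(x):
    return x % 2 == 0


for name, definition in [("too_narrow", too_narrow),
                         ("too_broad", too_broad),
                         ("exact", exact)]:
    print(name,
          "complete:", is_complete(definition, positives),
          "consistent:", is_consistent(definition, negatives))
```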

16) Data selection is

A) The actual discovery phase of a knowledge discovery process

B) The stage of selecting the right data for a KDD process
C) A subject-oriented integrated time variant non-volatile
collection of data in support of management
D) None of these

Answer: B

17) Classification task refers to

A) A subdivision of a set of examples into a number of classes
B) A measure of the accuracy of the classification of a
concept that is given by a certain theory.
C) The task of assigning a classification to a set of examples
D) None of these

Answer: C

18) DNA (Deoxyribonucleic acid)

A) It is hidden within a database and can only be recovered if
one is given certain clues (an example is encrypted
information).
B) The process of extracting implicit, previously unknown, and
potentially useful information from data
C) An extremely complex molecule that occurs in human
chromosomes and that carries genetic information in the form of
genes.
D) None of these

Answer: C

19) Hybrid is

A) Combining different types of method or information

B) Approach to the design of learning algorithms that is
structured along the lines of the theory of evolution.
C) Decision support systems that contain an information base
filled with the knowledge of an expert formulated in terms of
if-then rules.
D) None of these

Answer: A

20) Discovery is

A) It is hidden within a database and can only be recovered if
one is given certain clues (an example is encrypted
information).
B) The process of extracting implicit, previously unknown, and
potentially useful information from data.
C) An extremely complex molecule that occurs in human
chromosomes and that carries genetic information in the form of
genes.
D) None of these

Answer: B

21) Euclidean distance measure is

A) A stage of the KDD process in which new data is added to the

existing selection.
B) The process of finding a solution for a problem simply by
enumerating all possible solutions according to some pre-defined
order and then testing them
C) The distance between two points as calculated using the
Pythagoras theorem.
D) None of these

Answer: C
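
Note: the Euclidean distance of question 21 generalizes the Pythagorean theorem to any number of dimensions. A minimal sketch follows; the function name and the sample points are assumptions for illustration.

```python
import math


def euclidean_distance(p, q):
    """Square root of the summed squared coordinate differences between p and q."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# A 3-4-5 right triangle in two dimensions.
d2 = euclidean_distance((0, 0), (3, 4))
# The same formula in three dimensions.
d3 = euclidean_distance((1, 2, 3), (4, 6, 3))
print(d2, d3)
```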

22) Hidden knowledge refers to

A) A set of databases from different vendors, possibly using
different database paradigms
B) An approach to a problem that is not guaranteed to work but
performs well in most cases
C) Information that is hidden in a database and that cannot be
recovered by a simple SQL query.
D) None of these

Answer: C

23) Enrichment is

A) A stage of the KDD process in which new data is added to the

existing selection
B) The process of finding a solution for a problem simply by
enumerating all possible solutions according to some pre-defined
order and then testing them
C) The distance between two points as calculated using the
Pythagoras theorem.
D) None of these

Answer: A

24) Heterogeneous databases refers to

A) A set of databases from different vendors, possibly using
different database paradigms
B) An approach to a problem that is not guaranteed to work but
performs well in most cases.
C) Information that is hidden in a database and that cannot be
recovered by a simple SQL query.
D) None of these

Answer: A

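25) Enumeration refers to

A) A stage of the KDD process in which new data is added to the
existing selection.
B) The process of finding a solution for a problem simply by
enumerating all possible solutions according to some pre-defined
order and then testing them
C) The distance between two points as calculated using the
Pythagoras theorem.
D) None of these

Answer: B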

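26) A heuristic is

A) A set of databases from different vendors, possibly using
different database paradigms
B) An approach to a problem that is not guaranteed to work but
performs well in most cases
C) Information that is hidden in a database and that cannot be
recovered by a simple SQL query.
D) None of these

Answer: B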

27) Hybrid learning is

A) Machine-learning involving different techniques
B) The learning algorithm analyzes the examples on a
systematic basis and makes incremental adjustments to the theory
that is learned
C) Learning by generalizing from examples
D) None of these

Answer: A

28) Kohonen self-organizing map refers to

A) The process of finding the right formal representation of a
certain body of knowledge in order to represent it in a
knowledge-based system
B) It automatically maps an external signal space into a
system's internal representational space. They are useful in the
performance of classification tasks
C) A process where an individual learns how to carry out a
certain task when making a transition from a situation in which
the task cannot be carried out to a situation in which the same
task under the same circumstances can be carried out.
D) None of these

Answer: B

29) Incremental learning refers to

A) Machine-learning involving different techniques
B) The learning algorithm analyzes the examples on a
systematic basis and makes incremental adjustments to the theory
that is learned
C) Learning by generalizing from examples
D) None of these

Answer: B

30) Knowledge engineering is

A) The process of finding the right formal representation of a

certain body of knowledge in order to represent it in a
knowledge-based system
B) It automatically maps an external signal space into a
system's internal representational space. They are useful in the
performance of classification tasks.
C) A process where an individual learns how to carry out a
certain task when making a transition from a situation in which
the task cannot be carried out to a situation in which the same
task under the same circumstances can be carried out.
D) None of these

Answer: A

31) Information content is

A) The amount of information within data as opposed to the
amount of redundancy or noise.
B) One of the defining aspects of a data warehouse
C) Restriction that requires data in one column of a database
table to be a subset of another column.
D) None of these

Answer: A

32) Inductive learning is

A) Machine-learning involving different techniques
B) The learning algorithm analyzes the examples on a
systematic basis and makes incremental adjustments to the theory
that is learned
C) Learning by generalizing from examples
D) None of these

Answer: C

33) An inclusion dependency is

A) The amount of information within data as opposed to the
amount of redundancy or noise
B) One of the defining aspects of a data warehouse
C) Restriction that requires data in one column of a database
table to be a subset of another column
D) None of these

Answer: C
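
Note: an inclusion dependency can be tested by checking that every value in one column also occurs in another column. The sketch below is an illustration only; the column names and values are hypothetical.

```python
def satisfies_inclusion_dependency(child_values, parent_values):
    """True if every value in the child column also appears in the parent column."""
    return set(child_values) <= set(parent_values)


# Hypothetical columns: customers.id and orders.customer_id.
customer_ids = [1, 2, 3]
order_customer_ids = [1, 1, 3]
orphaned_order_customer_ids = [1, 4]  # 4 has no matching customer

ok = satisfies_inclusion_dependency(order_customer_ids, customer_ids)
broken = satisfies_inclusion_dependency(orphaned_order_customer_ids, customer_ids)
print(ok, broken)
```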

34) KDD (Knowledge Discovery in Databases) refers to

A) Non-trivial extraction of implicit, previously unknown, and
potentially useful information from data
B) Set of columns in a database table that can be used to
identify each record within this table uniquely.
C) Collection of interesting and useful patterns in a database
D) None of these

Answer: A

35) Learning is

A) The process of finding the right formal representation of a

certain body of knowledge in order to represent it in a
knowledge-based system
B) It automatically maps an external signal space into a
system's internal representational space. They are useful in the
performance of classification tasks.
C) A process where an individual learns how to carry out a
certain task when making a transition from a situation in which
the task cannot be carried out to a situation in which the same
task under the same circumstances can be carried out.
D) None of these

Answer: C

36) Naive prediction is

A) A class of learning algorithms that try to derive a Prolog
program from examples.
B) A table with n independent attributes can be seen as an
n-dimensional space.
C) A prediction made using an extremely simple method, such as
always predicting the same output.
D) None of these

Answer: C
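
Note: a naive prediction in the sense of question 36 can serve as a baseline for judging real models. The sketch below always predicts the most common training label; the helper name and the sample labels are assumptions for illustration.

```python
from collections import Counter


def naive_predictor(training_labels):
    """Build a predictor that ignores its input and always returns the most common label."""
    most_common_label, _count = Counter(training_labels).most_common(1)[0]
    return lambda example: most_common_label


# Hypothetical training labels: "no" appears three times, "yes" twice.
labels = ["no", "no", "yes", "no", "yes"]
predict = naive_predictor(labels)

prediction = predict({"age": 30})  # the input is ignored
baseline_accuracy = sum(predict(None) == label for label in labels) / len(labels)
print(prediction, baseline_accuracy)
```

A learned model is usually only interesting if it beats this kind of baseline.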

37) Learning algorithm refers to

A) An algorithm that can learn

B) A sub-discipline of computer science that deals with the
design and implementation of learning algorithms.
C) A machine-learning approach that abstracts from the actual
strategy of an individual algorithm and can therefore be applied
to any other form of machine learning.
D) None of these

Answer: A

38) Knowledge refers to

A) Non-trivial extraction of implicit, previously unknown, and
potentially useful information from data
B) Set of columns in a database table that can be used to
identify each record within this table uniquely
C) Collection of interesting and useful patterns in a database
D) None of these

Answer: C

39) Node is

A) A component of a network
B) In the context of KDD and data mining, this refers to random
errors in a database table.
C) One of the defining aspects of a data warehouse
D) None of these

Answer: A

40) Machine learning is

A) An algorithm that can learn

B) A sub-discipline of computer science that deals with the
design and implementation of learning algorithms
C) An approach that abstracts from the actual strategy of an
individual algorithm and can therefore be applied to any other
form of machine learning.
D) None of these

Answer: B

41) Projection pursuit is

A) The result of the application of a theory or a rule in a
specific case
B) One of several possible keys within a database table that
is chosen by the designer as the primary means of accessing the
data in the table.
C) Discipline in statistics that studies ways to find the most
interesting projections of multi-dimensional spaces
D) None of these

Answer: C

42) Inductive logic programming is

A) A class of learning algorithms that try to derive a Prolog

program from examples
B) A table with n independent attributes can be seen as an
n-dimensional space
C) A prediction made using an extremely simple method, such as
always predicting the same output
D) None of these

Answer: A

43) Statistical significance is

A) The science of collecting, organizing, and applying

numerical facts
B) Measure of the probability that a certain hypothesis is
incorrect given certain observations.
C) One of the defining aspects of a data warehouse, which is
specially built around all the existing applications of the
operational data
D) None of these

Answer: B

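44) Multi-dimensional knowledge is

A) A class of learning algorithms that try to derive a Prolog
program from examples
B) A table with n independent attributes can be seen as an
n-dimensional space
C) A prediction made using an extremely simple method, such as
always predicting the same output
D) None of these

Answer: B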

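45) Prediction is

A) The result of the application of a theory or a rule in a
specific case
B) One of several possible keys within a database table that
is chosen by the designer as the primary means of accessing the
data in the table.
C) Discipline in statistics that studies ways to find the most
interesting projections of multi-dimensional spaces
D) None of these

Answer: A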

46) Query tools are

A) A reference to the speed of an algorithm, which is

quadratically dependent on the size of the data
B) Attributes of a database table that can take only numerical
values.
C) Tools designed to query a database.
D) None of these

Answer: C

47) Operational database is

A) A measure of the desired maximal complexity of data mining

algorithms
B) A database containing volatile data used for the daily
operation of an organization
C) Relational database management system
D) None of these

Answer: B

48) Which of the following is/are the Data mining tasks?

(a) Regression
(b) Classification
(c) Clustering
(d) Inference of association rules
(e) All (a), (b), (c) and (d) above.

Answer: E
Explanation: Regression, classification, clustering, and
inference of association rules are all data mining tasks.

49) In a data warehouse, if D1 and D2 are two conformed

dimensions, then

(a) D1 may be an exact replica of D2

(b) D1 may be at a rolled up level of granularity compared to
D2
(c) Columns of D1 may be a subset of D2 and vice versa
(d) Rows of D1 may be a subset of D2 and vice versa
(e) All (a), (b), (c) and (d) above.

Answer: A
Explanation: In a data warehouse, if D1 and D2 are two
conformed dimensions, then D1 may be an exact replica of D2.

50) Which of the following is not an ETL tool?

(a) Informatica
(b) Oracle warehouse builder
(c) Datastage
(d) Visual studio
(e) DT/studio.

Answer: D
Explanation: Visual Studio is not an ETL tool.

51) ...................... is an essential process where
intelligent methods are applied to extract data patterns.
A) Data warehousing
B) Data mining
C) Text mining
D) Data selection

Answer: B) Data mining

52) Data mining can also be applied to other forms such
as ................

i) Data streams
ii) Sequence data
iii) Networked data
iv) Text data
v) Spatial data

A) i, ii, iii and v only

B) ii, iii, iv and v only
C) i, iii, iv and v only
D) All i, ii, iii, iv and v

Answer: D) All i, ii, iii, iv and v

53) Which of the following is not a data mining functionality?

A) Characterization and Discrimination

B) Classification and regression
C) Selection and interpretation
D) Clustering and Analysis

Answer: C) Selection and interpretation

54) ............................. is a summarization of the
general characteristics or features of a target class of data.

A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection

Answer: A) Data Characterization

55) ............................. is a comparison of the general
features of the target class data objects against the general
features of objects from one or multiple contrasting classes.

A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection

Answer: C) Data discrimination

56) Strategic value of data mining is ......................

A) cost-sensitive
B) work-sensitive
C) time-sensitive
D) technical-sensitive

Answer: C) time-sensitive

57) ............................. is the process of finding a
model that describes and distinguishes data classes or concepts.

A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection

Answer: B) Data Classification

58) The various aspects of data mining methodologies
are ...................

i) Mining various and new kinds of knowledge
ii) Mining knowledge in multidimensional space
iii) Pattern evaluation and pattern- or constraint-guided
mining
iv) Handling uncertainty, noise, or incompleteness of data

A) i, ii and iv only
B) ii, iii and iv only
C) i, ii and iii only
D) All i, ii, iii and iv

Answer: D) All i, ii, iii and iv

59) The full form of KDD is ..................

A) Knowledge Database
B) Knowledge Discovery in Databases
C) Knowledge Data House
D) Knowledge Data Definition

Answer: B) Knowledge Discovery in Databases

60) The output of KDD is .............

A) Data
B) Information
C) Query
D) Useful information

Answer: D) Useful information

61) The full form of OLAP is

A) Online Analytical Processing

B) Online Advanced Processing
C) Online Advanced Preparation
D) Online Analytical Performance

Answer: A) Online Analytical Processing

62) ......................... is a subject-oriented, integrated,
time-variant, nonvolatile collection of data in support of
management decisions.

A) Data Mining
B) Data Warehousing
C) Document Mining
D) Text Mining

Answer: B) Data Warehousing

63) The data is stored, retrieved and updated
in ....................

A) OLAP
B) OLTP
C) SMTP
D) FTP

Answer: B) OLTP

64) An .................. system is market-oriented and is used
for data analysis by knowledge workers, including managers,
executives, and analysts.

A) OLAP
B) OLTP
C) Both of the above
D) None of the above

Answer: A) OLAP

65) ........................ is a good alternative to the star
schema.

A) Star schema
B) Snowflake schema
C) Fact constellation
D) Star-snowflake schema

Answer: C) Fact constellation

66) The ............................ exposes the information
being captured, stored, and managed by operational systems.

A) top-down view
B) data warehouse view
C) data source view
D) business query view

Answer: C) data source view

67) The type of relationship in star schema is ...............

A) many to many
B) one to one
C) one to many
D) many to one

Answer: C) one to many

68) The .................. allows the selection of the relevant
information necessary for the data warehouse.

A) top-down view
B) data warehouse view
C) data source view
D) business query view

Answer: A) top-down view

69) Which of the following is not a component of a data
warehouse?

A) Metadata
B) Current detail data
C) Lightly summarized data
D) Component Key

Answer: D) Component Key

70) Which of the following is not a kind of data warehouse
application?

A) Information processing
B) Analytical processing
C) Data mining
D) Transaction processing

Answer: D) Transaction processing

71) Data warehouse architecture is based
on .......................

A) DBMS
B) RDBMS
C) Sybase
D) SQL Server

Answer: B) RDBMS

72) .......................... supports basic OLAP operations,
including slice and dice, drill-down, roll-up and pivoting.

A) Information processing
B) Analytical processing
C) Data mining
D) Transaction processing

Answer: B) Analytical processing
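
Note: two of the OLAP operations named in question 72, slice and roll-up, can be imitated on a tiny in-memory fact table. This is only a sketch under stated assumptions: the rows, the dimension order (year, region, product), and the helper names are all hypothetical, and real OLAP engines work very differently at scale.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales).
facts = [
    (2023, "North", "pen", 10),
    (2023, "South", "pen", 5),
    (2023, "North", "ink", 7),
    (2024, "North", "pen", 12),
    (2024, "South", "ink", 3),
]


def roll_up(rows, keep):
    """Sum sales over every dimension whose index is not listed in keep."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in keep)
        totals[key] += row[3]
    return dict(totals)


def slice_rows(rows, dim, value):
    """Slice: keep only the rows where one dimension equals a fixed value."""
    return [row for row in rows if row[dim] == value]


by_year = roll_up(facts, keep=[0])  # roll up region and product
north_by_product = roll_up(slice_rows(facts, 1, "North"), keep=[2])
print(by_year)
print(north_by_product)
```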

73) The core of the multidimensional model is
the ....................... , which consists of a large set of
facts and a number of dimensions.

A) Multidimensional cube
B) Dimensions cube
C) Data cube
D) Data model

Answer: C) Data cube

74) The data from the operational environment
enter ........................ of data warehouse.

A) Current detail data

B) Older detail data
C) Lightly Summarized data
D) Highly summarized data

Answer: A) Current detail data

75) A data warehouse is ......................

A) updated by end users
B) full of numerous naming conventions and formats
C) organized around important subject areas
D) limited to current data

Answer: C) organized around important subject areas

76) Business Intelligence and data warehousing are used
for ..............

A) Forecasting
B) Data Mining
C) Analysis of large volumes of product sales data
D) All of the above

Answer: D) All of the above

77) A data warehouse contains ................ data that is never
found in the operational environment.

A) normalized
B) informational
C) summary
D) denormalized

Answer: C) summary

78) ................... are responsible for running queries and
reports against data warehouse tables.

A) Hardware
B) Software
C) End users
D) Middle ware

Answer: C) End users

79) The biggest drawback of the level indicator in the classic
star schema is that it limits ............

A) flexibility
B) quantify
C) qualify
D) ability

Answer: A) flexibility

80) ............................. are designed to overcome any
limitations placed on the warehouse by the nature of the
relational data model.

A) Operational database
B) Relational database
C) Multidimensional database
D) Data repository

Answer: C) Multidimensional database

81) Which of the following is the most important when deciding
on the data structure of a data mart?

(a) XML data exchange standards

(b) Data access tools to be used
(c) Metadata naming conventions
(d) Extract, Transform, and Load (ETL) tool to be used
(e) All (a), (b), (c) and (d) above.

Answer: B
Explanation: The data access tools to be used are the most
important consideration when deciding on the data structure of a
data mart.

82) The process of removing the deficiencies and loopholes in
the data is called

(a) Aggregation of data
(b) Extracting of data
(c) Cleaning up of data.
(d) Loading of data
(e) Compression of data.

Answer: C
Explanation: The process of removing the deficiencies and
loopholes in the data is called cleaning up of data.

83) Which one manages both current and historic transactions?

(a) OLTP
(b) OLAP
(c) Spread sheet
(d) XML
(e) All (a), (b), (c) and (d) above.

Answer: B
Explanation: Online Analytical Processing (OLAP) manages both
current and historic transactions.

84) Which of the following is the collection of data objects
that are similar to one another within the same group?

(a) Partitioning
(b) Grid
(c) Cluster
(d) Table
(e) Data source.

Answer: C
Explanation: Cluster is the collection of data objects that are
similar to one another within the same group.

85) Which of the following employs data mining techniques to
analyze the intent of a user query and provide additional
generalized or associated information relevant to the query?

(a) Iceberg query method
(b) Data analyzer
(c) Intelligent query answering
(d) DBA
(e) Query parser.

Answer: C
Explanation: Intelligent query answering employs data
mining techniques to analyze the intent of a user query and
provide additional generalized or associated information
relevant to the query.

86) Which of the following processes includes data cleaning, data
integration, data selection, data transformation, data mining,
pattern evaluation and knowledge presentation?

(a) KDD process
(b) ETL process
(c) KTL process
(d) MDX process
(e) None of the above.

Answer: A
Explanation: The KDD process includes data cleaning, data
integration, data selection, data transformation, data mining,
pattern evaluation, and knowledge presentation.

87) At which level can we create dimensional models?

(a) Business requirements level

(b) Architecture models level
(c) Detailed models level
(d) Implementation level
(e) Testing level.

Answer: B
Explanation: Dimensional models can be created at Architecture
models level.

88) Which of the following is not related to dimension table
attributes?

(a) Verbose
(b) Descriptive
(c) Equally unavailable
(d) Complete
(e) Indexed.

Answer: C
Explanation: Equally unavailable is not related to dimension
table attributes.

89) Data warehouse bus matrix is a combination of

(a) Dimensions and data marts

(b) Dimensions and facts
(c) Facts and data marts
(d) Dimensions and detailed facts
(e) All (a), (b), (c) and (d) above.

Answer: A
Explanation: Data warehouse bus matrix is a combination of
Dimensions and data marts.

90) Which of the following is not a managing issue in the
modeling process?

(a) Content of primary units column

(b) Document each candidate data source
(c) Do regions report to zones
(d) Walk through business scenarios
(e) Ensure that the transaction edit flat is used for analysis.

Answer: E
Explanation: Ensuring that the transaction edit flat is used for
analysis is not a managing issue in the modeling process.

91) Data modeling technique used for data marts is

(a) Dimensional modeling

(b) ER – model
(c) Extended ER – model
(d) Physical model
(e) Logical model.

Answer: A
Explanation: Data modeling technique used for data marts is
Dimensional modeling.

92) A warehouse architect is trying to determine what data must
be included in the warehouse. A meeting has been arranged with a
business analyst to understand the data requirements. Which of
the following should be included in the agenda?

(a) Number of users

(b) Corporate objectives
(c) Database design
(d) Routine reporting
(e) Budget.

Answer: D
Explanation: Routine reporting should be included in the
agenda.

93) An OLAP tool provides for

(a) Multidimensional analysis

(b) Roll-up and drill-down
(c) Slicing and dicing
(d) Rotation
(e) Setting up only relations.

Answer: C
Explanation: An OLAP tool provides for Slicing and dicing.

94) The synonym for data mining is

(a) Data warehouse
(b) Knowledge discovery in database
(c) ETL
(d) Business intelligence
(e) OLAP.

Answer: B
Explanation: The synonym for data mining is knowledge discovery
in database.

95) Which of the following statements is true?

(a) A fact table describes the transactions stored in a DWH

(b) A fact table describes the granularity of data held in a
DWH
(c) The fact table of a data warehouse is the main store of
descriptions of the transactions stored in a DWH
(d) The fact table of a data warehouse is the main store of all
of the recorded transactions over time
(e) A fact table maintains the old records of the database.

Answer: D
Explanation: The fact table of a data warehouse is the main
store of all of the recorded transactions over time is the
correct statement.

96) The most common kind of queries in a data warehouse is

(a) Inside-out queries
(b) Outside-in queries
(c) Browse queries
(d) Range queries
(e) All (a), (b), (c) and (d) above.

Answer: A
Explanation: The most common kind of queries in a data
warehouse is inside-out queries.

97) Concept description is the basic form of the

(a) Predictive data mining

(b) Descriptive data mining
(c) Data warehouse
(d) Relational data base
(e) Proactive data mining.

Answer: B
Explanation: Concept description is the basic form of
descriptive data mining.

98) The apriori property means

(a) If a set cannot pass a test, all of its supersets will fail
the same test as well
(b) To improve the efficiency the level-wise generation of
frequent item sets
(c) If a set can pass a test, all of its supersets will fail
the same test as well
(d) To decrease the efficiency the level-wise generation of
frequent item sets
(e) All (a), (b), (c) and (d) above.

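Answer: A
Explanation: The Apriori property states that if a set cannot
pass a test, all of its supersets will fail the same test as
well. Level-wise generation of frequent item sets uses this
property to improve efficiency.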
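
Note: the Apriori property is commonly stated as "every superset of an infrequent item set is also infrequent", and level-wise generation of frequent item sets exploits it to discard candidates without counting them. The following is a minimal sketch of that idea, not an optimized implementation; the function name, the baskets, and the support threshold are assumptions for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: size-k candidates are built only from frequent (k-1)-sets."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the item set.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    result = list(current)
    k = 2
    while current:
        frequent_prev = set(current)
        # Join step: unions of frequent (k-1)-sets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (Apriori property): a candidate with an infrequent
        # (k-1)-subset cannot be frequent, so it is dropped without counting.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent_prev
                             for s in combinations(c, k - 1))}
        current = [c for c in candidates if support(c) >= min_support]
        result.extend(current)
        k += 1
    return result


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
found = frequent_itemsets(baskets, min_support=2)
print(sorted(sorted(s) for s in found))
```

In this example {milk, butter} occurs only once, so the three-item candidate {bread, milk, butter} is pruned before its support is ever counted.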

99) Which of the following forms the set of data created to
support a specific short-lived business situation?

(a) Personal data marts

(b) Application models
(c) Downstream systems
(d) Disposable data marts
(e) Data mining models.

Answer: D
Explanation: Disposable data marts form the set of data
created to support a specific short-lived business situation.

100) What are the different types of metadata?

I. Administrative.
II. Business.
III. Operational.

(a) Only (I) above

(b) Both (II) and (III) above
(c) Both (I) and (II) above
(d) Both (I) and (III) above
(e) All (I), (II) and (III) above.

Answer: E
Explanation: The different types of Meta data are
Administrative, Business and Operational.

101) Multiple Regression means

(a) Data are modeled using a straight line
(b) Data are modeled using a curved line
(c) Extension of linear regression involving only one
predictor variable
(d) Extension of linear regression involving more than one
predictor variable
(e) All (a), (b), (c) and (d) above.

Answer: D
Explanation: Multiple regression is an extension of linear
regression involving more than one predictor variable.

102) Which of the following should not be considered for each

dimension attribute?

(a) Attribute name

(b) Rapid changing dimension policy
(c) Attribute definition
(d) Sample data
(e) Cardinality.

Answer: B
Explanation: Rapid changing dimension policy should not be
considered for each dimension attribute.

103) A Business Intelligence system requires data from:

(a) Data warehouse
(b) Operational systems
(c) All possible sources within the organization and possibly
from external sources
(d) Web servers
(e) Database servers.

Answer: A
Explanation: A business Intelligence system requires data from
Data warehouse

104) Data mining application domains are

(a) Biomedical
(b) DNA data analysis
(c) Financial data analysis
(d) Retail industry and telecommunication industry
(e) All (a), (b), (c) and (d) above.

Answer: E
Explanation: Data mining application domains are Biomedical,
DNA data analysis, Financial data analysis and Retail industry
and telecommunication industry

105) The generalization of multidimensional attributes of a
complex object class, performed by examining each
attribute, generalizing each attribute to simple-value data, and
constructing a multidimensional data cube, is called

(a) Object cube

(b) Relational cube
(c) Transactional cube
(d) Tuple
(e) Attribute.

Answer: A
Explanation: The generalization of multidimensional attributes
of a complex object class, performed by examining each
attribute, generalizing each attribute to simple-value data, and
constructing a multidimensional data cube, is called an object
cube.

106) Building a data mart for a business process/department that
is very critical for your organization is which of the following
kinds of project?

(a) High risk high reward
(b) High risk low reward
(c) Low risk low reward
(d) Low risk high reward
(e) Involves high risks.

Answer: A
Explanation: Building a data mart for a business
process/department that is very critical for your organization
is a high risk, high reward project.

107. Which of the following tools will a business intelligence
system have?

(a) OLAP tool

(b) Data mining tool
(c) Reporting tool
(d) Both(a) and (b) above
(e) (a), (b) and (c) above.

Answer: E
Explanation: A business intelligence system will have OLAP, data
mining and reporting tools.

108. A feature F1 can take certain value: A, B, C, D, E, & F and

represents grade of students from a college.

1) Which of the following statement is true in following case?

A) Feature F1 is an example of nominal variable.

B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these

Solution: (B)

Ordinal variables are variables whose categories have a natural
order. For example, grade A should be considered a higher grade
than grade B.
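
A minimal Python sketch of what "ordered categories" means in practice. It assumes A is the best grade and F the worst; the names GRADES, GRADE_RANK and is_higher_grade are illustrative only.

```python
# Assumed ordering for illustration: A is the highest grade, F the lowest.
GRADES = ["A", "B", "C", "D", "E", "F"]
GRADE_RANK = {grade: rank for rank, grade in enumerate(GRADES)}

def is_higher_grade(first, second):
    """Return True if `first` is a better grade than `second`."""
    return GRADE_RANK[first] < GRADE_RANK[second]

students = ["C", "A", "F", "B"]
# Sorting is meaningful only because the categories are ordered.
best_to_worst = sorted(students, key=GRADE_RANK.get)
```

For a nominal variable (for example, eye colour) no such ranking exists, so a comparison like is_higher_grade would have no meaning.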

2) Which of the following is an example of a deterministic

algorithm?

A) PCA
B) K-Means

C) None of the above

Solution: (A)

A deterministic algorithm is one whose output does not change
across runs. PCA gives the same result every time it is run on
the same data, whereas k-means, which starts from randomly
chosen centroids, may not.

3) [True or False] The Pearson correlation between two variables
is zero, but their values can still be related to each other.

A) TRUE

B) FALSE

Solution: (A)

Consider Y = X^2 with X distributed symmetrically around zero.
The two variables are not only associated, one is a function of
the other, and yet the Pearson correlation between them is 0.
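
The claim can be checked with a short Python sketch. The sample values are an assumption chosen to be symmetric around zero, and pearson is a hand-written helper, not a library call.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]   # symmetric around zero (assumption of this example)
ys = [x ** 2 for x in xs]       # Y is fully determined by X
r = pearson(xs, ys)             # linear correlation vanishes
```

Pearson correlation measures only linear association, which is why a perfect quadratic dependence can still score zero.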

4) Which of the following statement(s) is / are true for
Gradient Descent (GD) and Stochastic Gradient Descent (SGD)?

1. In GD and SGD, you update a set of parameters in an

iterative manner to minimize the error function.

2. In SGD, you have to run through all the samples in your

training set for a single update of a parameter in each
iteration.

3. In GD, you either use the entire data or a subset of

training data to update a parameter in each iteration.

A) Only 1

B) Only 2

C) Only 3
D) 1 and 2

E) 2 and 3

F) 1,2 and 3

Solution: (A)

In SGD, each iteration uses a batch that generally contains a
random sample of the data, whereas in GD each iteration uses all
of the training observations.
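
The difference between the two update rules can be sketched as follows. This is a toy example under stated assumptions: the data, learning rate, step count and seed are made up for illustration, and the model is a single weight w fitted to y = w*x.

```python
import random

# Toy data from y = 2x with no noise (values chosen for illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def gradient(w, x, y):
    """Gradient of the squared error (w*x - y)^2 with respect to w."""
    return 2 * (w * x - y) * x

def gd(steps=200, lr=0.01):
    w = 0.0
    for _ in range(steps):
        # GD: average the gradient over ALL training samples per update.
        g = sum(gradient(w, x, y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * g
    return w

def sgd(steps=200, lr=0.01, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # SGD: use ONE randomly chosen sample per update.
        i = rng.randrange(len(xs))
        w -= lr * gradient(w, xs[i], ys[i])
    return w

w_gd, w_sgd = gd(), sgd()
```

Both runs update the parameter iteratively and approach w = 2, which matches statement 1; only the amount of data consumed per update differs.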

5) Which of the following hyper parameter(s), when increased may

cause random forest to over fit the data?

1. Number of Trees

2. Depth of Tree

3. Learning Rate

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 2 and 3

F) 1,2 and 3

Solution: (B)

Usually, increasing the depth of the trees causes overfitting.
Learning rate is not a hyperparameter in random forest.
Increasing the number of trees averages more models together,
which generally reduces variance rather than causing
overfitting.

6) Imagine, you are working with “Analytics Vidhya” and you want
to develop a machine learning algorithm which predicts the
number of views on the articles.
Your analysis is based on features like author name, number of
articles written by the same author on Analytics Vidhya in past
and a few other features. Which of the following evaluation
metric would you choose in that case?

1. Mean Square Error

2. Accuracy

3. F1 Score

A) Only 1

B) Only 2

C) Only 3

D) 1 and 3

E) 2 and 3

F) 1 and 2

Solution:(A)

You can think that the number of views of articles is the

continuous target variable which fall under the regression
problem. So, mean squared error will be used as an evaluation
metrics.

7) Given below are three images (1, 2, 3). Which of the following
options is correct for these images?

[Images 1, 2 and 3: plots of three activation functions, not
reproduced here]

A) 1 is tanh, 2 is ReLU and 3 is SIGMOID activation functions.

B) 1 is SIGMOID, 2 is ReLU and 3 is tanh activation functions.

C) 1 is ReLU, 2 is tanh and 3 is SIGMOID activation functions.

D) 1 is tanh, 2 is SIGMOID and 3 is ReLU activation functions.

Solution: (D)

The range of the SIGMOID function is (0, 1).

The range of the tanh function is (-1, 1).

The range of the ReLU function is [0, infinity).

So option D is the right answer.
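
A quick numerical check of these ranges, as a minimal sketch using only the standard library. The helper names sigmoid and relu and the sample inputs are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

inputs = [-5.0, -0.5, 0.0, 0.5, 5.0]
for x in inputs:
    assert 0.0 < sigmoid(x) < 1.0      # sigmoid stays strictly between 0 and 1
    assert -1.0 < math.tanh(x) < 1.0   # tanh stays strictly between -1 and 1
    assert relu(x) >= 0.0              # ReLU is never negative

# Of the three functions, only tanh can produce negative outputs.
negative_outputs = {
    "sigmoid": any(sigmoid(x) < 0 for x in inputs),
    "tanh": any(math.tanh(x) < 0 for x in inputs),
    "relu": any(relu(x) < 0 for x in inputs),
}
```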

8) Below are the 8 actual values of target variable in the train

file.

[0,0,0,1,1,1,1,1]

What is the entropy of the target variable?

A) -(5/8 log(5/8) + 3/8 log(3/8))

B) 5/8 log(5/8) + 3/8 log(3/8)

C) 3/8 log(5/8) + 5/8 log(3/8)

D) 5/8 log(3/8) – 3/8 log(5/8)

Solution: (A)

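The formula for entropy is

$H = -\sum_{i} p_i \log p_i$

where $p_i$ is the proportion of observations in class $i$. Here
the proportions are 5/8 (class 1) and 3/8 (class 0), so the
answer is A.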
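
The value of option A can be computed directly. This sketch assumes base-2 logarithms (entropy in bits); the question itself does not state the base.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

target = [0, 0, 0, 1, 1, 1, 1, 1]
h = entropy(target)   # -(5/8*log2(5/8) + 3/8*log2(3/8))
```

The result is a little under 1 bit, as expected for a target that is close to, but not exactly, a 50/50 split.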

9) Let’s say, you are working with categorical feature(s) and

you have not looked at the distribution of the categorical
variable in the test data.

You want to apply one hot encoding (OHE) on the categorical

feature(s). What challenges you may face if you have applied OHE
on a categorical variable of train dataset?

A) All categories of categorical variable are not present in the

test dataset.

B) Frequency distribution of categories is different in train as

compared to the test dataset.

C) Train and Test always have same distribution.

D) Both A and B

E) None of these

Solution: (D)

Both are true. OHE will fail to encode categories that are
present in test but not in train, so this can be one of the main
challenges while applying OHE. The challenge given in option B
is also real: you need to be more careful when applying OHE if
the frequency distribution is not the same in train and test.
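
The unseen-category problem can be illustrated with a hand-written encoder. This is a sketch, not a library API: fit_categories and one_hot are hypothetical helpers, and mapping unseen values to an all-zero vector is just one possible policy.

```python
def fit_categories(values):
    """Learn the category list from the training column only."""
    return sorted(set(values))

def one_hot(value, categories):
    """Encode one value; unseen categories become an all-zero vector."""
    return [1 if value == c else 0 for c in categories]

train = ["red", "green", "red", "blue"]
test = ["green", "purple"]          # "purple" never appears in train

categories = fit_categories(train)  # ['blue', 'green', 'red']
encoded_test = [one_hot(v, categories) for v in test]
unseen = sorted(set(test) - set(categories))
```

Checking the unseen list before scoring test data is a cheap way to notice the mismatch early.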

10) Skip gram model is one of the best models used in Word2vec
algorithm for words embedding. Which one of the following models
depict the skip gram model?
A) A

B) B

C) Both A and B

D) None of these

Solution: (B)

Both models (model1 and model2) are used in Word2vec algorithm.

The model1 represent a CBOW model where as Model2 represent the
Skip gram model.

11) Let’s say, you are using activation function X in hidden

layers of neural network. At a particular neuron for any given
input, you get the output as “-0.0001”. Which of the following
activation function could X represent?

A) ReLU

B) tanh

C) SIGMOID
D) None of these

Solution: (B)

The function is tanh because its output range is (-1, 1), so it
can return small negative values. ReLU and SIGMOID never produce
negative outputs.

12) [True or False] LogLoss evaluation metric can have negative

values.

A) TRUE
B) FALSE

Solution: (B)

Log loss cannot have negative values.

13) Which of the following statements is/are true about “Type-1”

and “Type-2” errors?

1. Type1 is known as false positive and Type2 is known as

false negative.

2. Type1 is known as false negative and Type2 is known as

false positive.

3. Type1 error occurs when we reject a null hypothesis when it

is actually true.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 1 and 3

F) 2 and 3

Solution: (E)
In statistical hypothesis testing, a type I error is the
incorrect rejection of a true null hypothesis (a “false
positive”), while a type II error is incorrectly retaining a
false null hypothesis (a “false negative”).

14) Which of the following is/are one of the important step(s)

to pre-process the text in NLP based projects?

1. Stemming

2. Stop word removal

3. Object Standardization

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) 1,2 and 3

Solution: (D)

Stemming is a rudimentary rule-based process of stripping the

suffixes (“ing”, “ly”, “es”, “s” etc) from a word.

Stop words are words that are not relevant to the context of the
data, for example is/am/are.

Object Standardization is also a good way to pre-process the
text.

15) Suppose you want to project high dimensional data into lower
dimensions. The two most famous dimensionality reduction
algorithms used here are PCA and t-SNE. Let’s say you have
applied both algorithms respectively on data “X” and you got the
datasets “X_projected_PCA” , “X_projected_tSNE”.

Which of the following statements is true for “X_projected_PCA”

& “X_projected_tSNE” ?
A) X_projected_PCA will have interpretation in the nearest
neighbour space.

B) X_projected_tSNE will have interpretation in the nearest

neighbour space.

C) Both will have interpretation in the nearest neighbour space.

D) None of them will have interpretation in the nearest

neighbour space.

Solution: (B)

t-SNE algorithm consider nearest neighbour points to reduce the

dimensionality of the data. So, after using t-SNE we can think
that reduced dimensions will also have interpretation in nearest
neighbour space. But in case of PCA it is not the case.

Context: 16-17

Given below are three scatter plots for two features (Image 1, 2
& 3 from left to right).

16) In the above images, which of the following is/are example

of multi-collinear features?

A) Features in Image 1

B) Features in Image 2

C) Features in Image 3

D) Features in Image 1 & 2

E) Features in Image 2 & 3

F) Features in Image 3 & 1

Solution: (D)

In Image 1 the features have a high positive correlation, whereas
in Image 2 they have a high negative correlation, so in both
images the pair of features is an example of multicollinear
features.

17) In previous question, suppose you have identified

multi-collinear features. Which of the following action(s) would
you perform next?

1. Remove both collinear variables.

2. Instead of removing both variables, we can remove only one

variable.

3. Removing correlated variables might lead to loss of

information. In order to retain those variables, we can use
penalized regression models like ridge or lasso regression.

A) Only 1

B)Only 2

C) Only 3

D) Either 1 or 3

E) Either 2 or 3

Solution: (E)

You cannot remove both features, because after removing both you
will lose all of the information. You should either remove only
one feature or use a regularized model such as L1 (lasso) or L2
(ridge) regression.
18) Adding a non-important feature to a linear regression model
may result in.

1. Increase in R-square

2. Decrease in R-square

A) Only 1 is correct

B) Only 2 is correct

C) Either 1 or 2

D) None of these

Solution: (A)

After adding a feature to the feature space, whether that feature
is important or not, R-squared never decreases: it increases or,
at worst, stays the same.

19) Suppose, you are given three variables X, Y and Z. The

Pearson correlation coefficients for (X, Y), (Y, Z) and (X, Z)
are C1, C2 & C3 respectively.

Now you add 2 to all values of X (i.e. new values become X+2),
subtract 2 from all values of Y (i.e. new values are Y-2) and
leave Z unchanged. The new coefficients for (X,Y), (Y,Z) and
(X,Z) are given by D1, D2 & D3 respectively. How do the values
of D1, D2 & D3 relate to C1, C2 & C3?

A) D1= C1, D2 < C2, D3 > C3

B) D1 = C1, D2 > C2, D3 > C3

C) D1 = C1, D2 > C2, D3 < C3

D) D1 = C1, D2 < C2, D3 < C3

E) D1 = C1, D2 = C2, D3 = C3

F) Cannot be determined

Solution: (E)
Correlation between the features won’t change if you add or
subtract a value in the features.
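
A small Python sketch of this invariance. The sample values are arbitrary and pearson is a hand-written helper for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 6.0]

c1 = pearson(x, y)
# Shift X by +2 and Y by -2; deviations from the mean are unchanged.
d1 = pearson([v + 2 for v in x], [v - 2 for v in y])
```

Because the coefficient is built from deviations around the mean, adding a constant to a variable cancels out.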

20) Imagine, you are solving a classification problems with

highly imbalanced class. The majority class is observed 99% of
times in the training data.

Your model has 99% accuracy after taking the predictions on test
data. Which of the following is true in such a case?

1. Accuracy metric is not a good idea for imbalanced class

problems.

2. Accuracy metric is a good idea for imbalanced class

problems.

3. Precision and recall metrics are good for imbalanced class

problems.

4. Precision and recall metrics aren’t good for imbalanced

class problems.

A) 1 and 3

B) 1 and 4

C) 2 and 3

D) 2 and 4

Solution: (A)

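With 99% of observations in the majority class, a model that
always predicts the majority class already reaches about 99%
accuracy, so accuracy says little here. Precision and recall on
the minority class are more informative.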

21) In ensemble learning, you aggregate the predictions for weak

learners, so that an ensemble of these models will give a better
prediction than prediction of individual models.

Which of the following statements is / are true for weak

learners used in ensemble model?

1. They don’t usually overfit.

2. They have high bias, so they cannot solve complex learning
problems

3. They usually overfit.

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) Only 1

E) Only 2

F) None of the above

Solution: (A)

Weak learners are sure about only a particular part of a problem.
So they usually don't overfit, which means that weak learners
have low variance and high bias.

22) Which of the following options is/are true for K-fold

cross-validation?

1. Increase in K will result in higher time required to cross

validate the result.

2. Higher values of K will result in higher confidence on the

cross-validation result as compared to lower value of K.

3. If K=N, then it is called Leave one out cross validation,

where N is the number of observations.

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1,2 and 3
Solution: (D)

Larger k value means less bias towards overestimating the true

expected error (as training folds will be closer to the total
dataset) and higher running time (as you are getting closer to
the limit case: Leave-One-Out CV). We also need to consider the
variance between the k folds accuracy while selecting the k.
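
How K affects the number of models trained can be sketched with a simple fold generator. The function k_fold_indices is a hypothetical helper that uses contiguous, unshuffled folds for clarity.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size

n = 6
three_fold = list(k_fold_indices(n, 3))      # 3 models to train
leave_one_out = list(k_fold_indices(n, n))   # K = N: n models, 1 validation point each
```

Each fold requires one training run, so the cost grows with K and peaks at leave-one-out.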

Question Context 23-24

Cross-validation is an important step in machine learning for

hyper parameter tuning. Let’s say you are tuning a
hyper-parameter “max_depth” for GBM by selecting it from 10
different depth values (values are greater than 2) for tree
based model using 5-fold cross validation.

Time taken by an algorithm for training (on a model with

max_depth 2) 4-fold is 10 seconds and for the prediction on
remaining 1-fold is 2 seconds.

Note: Ignore hardware dependencies from the equation.

23) Which of the following option is true for overall execution

time for 5-fold cross validation with 10 different values of
“max_depth”?

A) Less than 100 seconds

B) 100 – 300 seconds

C) 300 – 600 seconds

D) More than or equal to 600 seconds

E) None of the above

F) Can’t estimate

Solution: (D)

Each iteration for depth “2” in 5-fold cross validation takes 10
seconds for training and 2 seconds for testing, so 5 folds take
12*5 = 60 seconds. Since we are searching over 10 depth values,
the algorithm would take 60*10 = 600 seconds. But training and
testing a model at a depth greater than 2 takes more time than
at depth “2”, so the overall time would be greater than 600
seconds.
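
The arithmetic as a short Python sketch; the variable names are illustrative, the numbers come from the question.

```python
train_seconds = 10      # training on 4 folds at max_depth = 2
predict_seconds = 2     # predicting on the held-out fold
folds = 5
depth_values = 10

per_depth = folds * (train_seconds + predict_seconds)   # seconds for one depth value
lower_bound = per_depth * depth_values                  # seconds for the whole search
```

This is only a lower bound: every candidate depth is greater than 2, so each run costs more than the depth-2 timing used here.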

24) In previous question, if you train the same algorithm for

tuning 2 hyper parameters say “max_depth” and “learning_rate”.

You want to select the right value against “max_depth” (from

given 10 depth values) and learning rate (from given 5 different
learning rates). In such cases, which of the following will
represent the overall time?

A) 1000-1500 seconds

B) 1500-3000 seconds

C) More than or equal to 3000 seconds

D) None of these

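Solution: (C)

Same reasoning as question number 23: 10 depth values and 5
learning rates give 50 combinations, each taking at least 60
seconds, so the overall time is more than 50*60 = 3000 seconds.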

25) Given below is a scenario for training error TE and

Validation error VE for a machine learning algorithm M1. You
want to choose a hyperparameter (H) based on TE and VE.

H    TE    VE
1    105   90
2    200   85
3    250   96
4    105   85
5    300   100

Which value of H will you choose based on the above table?

A) 1
B) 2

C) 3

D) 4

E) 5

Solution: (D)

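Looking at the table, H = 4 gives the lowest validation error
(85, tied with H = 2) together with the lowest training error
(105), so option D is the best choice.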
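
One way to make the choice explicit in Python. The selection rule (lowest validation error, training error as tie-breaker) is an assumption of this sketch, not something the question states.

```python
# (H, training error, validation error) rows from the table above.
rows = [(1, 105, 90), (2, 200, 85), (3, 250, 96), (4, 105, 85), (5, 300, 100)]

# Assumed rule: lowest validation error first, training error as tie-breaker.
best_h, best_te, best_ve = min(rows, key=lambda r: (r[2], r[1]))
```

H = 2 and H = 4 share the lowest validation error, and the tie-breaker picks H = 4 because its training error is lower.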

26) What would you do in PCA to get the same projection as SVD?

A) Transform data to zero mean

B) Transform data to zero median

C) Not possible

D) None of these

Solution: (A)

When the data has a zero mean vector PCA will have same
projections as SVD, otherwise you have to centre the data first
before taking SVD.

Question Context 27-28

Assume there is a black box algorithm, which takes training data

with multiple observations (t1, t2, t3,…….. tn) and a new
observation (q1). The black box outputs the nearest neighbor of
q1 (say ti) and its corresponding class label ci.

You can also think that this black box algorithm is same as 1-NN
(1-nearest neighbor).

27) It is possible to construct a k-NN classification algorithm

based on this black box alone.

Note: Where n (number of training observations) is very large

compared to k.
A) TRUE

B) FALSE

Solution: (A)

In first step, you pass an observation (q1) in the black box

algorithm so this algorithm would return a nearest observation
and its class.

In the second step, you throw the nearest observation out of the
train data and again input the observation (q1). The black box
algorithm will again return the nearest observation and its
class.

You need to repeat this procedure k times and then take a
majority vote over the k returned class labels.
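
The procedure can be sketched in Python as follows. Everything here is illustrative: one_nn stands in for the black box, the data is one-dimensional and made up, and the final majority vote is the usual k-NN decision rule.

```python
from collections import Counter

def one_nn(train, query):
    """Black box: return the (point, label) pair closest to `query`."""
    return min(train, key=lambda item: abs(item[0] - query))

def knn_from_one_nn(train, query, k):
    """Call the 1-NN black box k times, removing each returned neighbor."""
    remaining = list(train)
    labels = []
    for _ in range(k):
        neighbor = one_nn(remaining, query)
        labels.append(neighbor[1])
        remaining.remove(neighbor)   # throw the neighbor out before the next call
    return Counter(labels).most_common(1)[0][0]

# Hypothetical 1-D training data: (feature value, class label).
train = [(1.0, "a"), (1.5, "a"), (2.0, "b"), (5.0, "b"), (6.0, "b")]
prediction = knn_from_one_nn(train, query=1.4, k=3)
```

The three nearest points to 1.4 are 1.5, 1.0 and 2.0, giving labels a, a, b, so the vote returns "a".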

28) Instead of using 1-NN black box we want to use the j-NN
(j>1) algorithm as black box. Which of the following option is
correct for finding k-NN using j-NN?

1. J must be a proper factor of k

2. J > k

3. Not possible

A) 1

B) 2

C) 3

Solution: (A)

Same as question number 27

29) Suppose you are given 7 Scatter plots 1-7 (left to right)
and you want to compare Pearson correlation coefficients between
variables of each scatterplot.

Which of the following is in the right order?

1. 1<2<3<4

2. 1>2>3 > 4

3. 7<6<5<4

4. 7>6>5>4

A) 1 and 3

B) 2 and 3

C) 1 and 4

D) 2 and 4

Solution: (B)

From image 1 to 4 the correlation decreases (in absolute value).
From image 4 to 7 the absolute correlation increases, but the
values are negative (for example, 0, -0.3, -0.7, -0.99).

30) You can evaluate the performance of a binary class

classification problem using different metrics such as accuracy,
log-loss, F-Score. Let’s say, you are using the log-loss
function as evaluation metric.

Which of the following option is / are true for interpretation

of log-loss as an evaluation metric?

1. If a classifier is confident about an incorrect
classification, then log-loss will penalise it heavily.

2. For a particular observation, if the classifier assigns a very
small probability to the correct class, then the corresponding
contribution to the log-loss will be very large.

3. Lower the log-loss, the better is the model.

A) 1 and 3

B) 2 and 3

C) 1 and 2

D) 1,2 and 3

Solution: (D)

Options are self-explanatory.
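
A minimal sketch of binary log-loss that makes the three statements concrete. The function log_loss is hand-written here, and the clipping constant eps is an assumption to avoid log(0).

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary log-loss; p_pred is the predicted probability of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# The true class is 1 in all three cases; only the predicted probability changes.
confident_right = log_loss([1], [0.99])
unsure = log_loss([1], [0.60])
confident_wrong = log_loss([1], [0.01])
```

The loss grows sharply as the probability assigned to the correct class shrinks, and it is never negative.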

Question 31-32

Below are five samples given in the dataset.

Note: Visual distance between the points in the image represents

the actual distance.

31) Which of the following is leave-one-out cross-validation

accuracy for 3-NN (3-nearest neighbor)?

A) 0

B) 0.4

C) 0.8

D) 1

Solution: (C)

In Leave-One-Out cross validation, we will select (n-1)

observations for training and 1 observation of validation.
Consider each point as a cross validation point and then find
the 3 nearest point to this point. So if you repeat this
procedure for all points you will get the correct classification
for all positive class given in the above figure but negative
class will be misclassified. Hence you will get 80% accuracy.

32) Which of the following value of K will have least

leave-one-out cross validation accuracy?

A) 1NN

B) 3NN

C) 4NN

D) All have same leave one out error

Solution: (A)

Every point will be misclassified in 1-NN, which means that you
will get 0% accuracy.

33) Suppose you are given the below data and you want to apply a
logistic regression model for classifying it in two given
classes.
You are using logistic regression with L1 regularization.

Where C is the
regularization parameter and w1 & w2 are the coefficients of x1
and x2.

Which of the following option is correct when you increase the

value of C from zero to a very large value?

A) First w2 becomes zero and then w1 becomes zero

B) First w1 becomes zero and then w2 becomes zero

C) Both becomes zero at the same time

D) Both cannot be zero even after very large value of C

Solution: (B)

By looking at the image, we see that even on just using x2, we

can efficiently perform classification. So at first w1 will
become 0. As regularization parameter increases more, w2 will
come more and more closer to 0.

34) Suppose we have a dataset which can be trained with 100%

accuracy with help of a decision tree of depth 6. Now consider
the points below and choose the option based on these points.

Note: All other hyper parameters are same and other factors are
not affected.

1. Depth 4 will have high bias and low variance

2. Depth 4 will have low bias and low variance

A) Only 1

B) Only 2

C) Both 1 and 2

D) None of the above

Solution: (A)

If you fit a decision tree of depth 4 to such data, it is more
likely to underfit the data. In the case of underfitting you
will have high bias and low variance.

35) Which of the following options can be used to get global

minima in k-Means Algorithm?

1. Try to run algorithm for different centroid initialization

2. Adjust number of iterations

3. Find out the optimal number of clusters

A) 2 and 3

B) 1 and 3

C) 1 and 2

D) All of above

Solution: (D)

All of the options can be tuned to help find the global minimum.

36) Imagine you are working on a project which is a binary

classification problem. You trained a model on training dataset
and get the below confusion matrix on validation dataset.

Based on the above confusion matrix, choose which option(s)

below will give you correct predictions?

1. Accuracy is ~0.91

2. Misclassification rate is ~ 0.91

3. False positive rate is ~0.95

4. True positive rate is ~0.95

A) 1 and 3

B) 2 and 4
C) 1 and 4

D) 2 and 3

Solution: (C)

The Accuracy (correct classification) is (50+100)/165 which is

nearly equal to 0.91.

The true Positive Rate is how many times you are predicting
positive class correctly so true positive rate would be 100/105
= 0.95 also known as “Sensitivity” or “Recall”
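
The metrics can be recomputed in Python. The four cell counts are inferred from the arithmetic in the solution (165 cases, 105 actual positives, 150 correct predictions), since the matrix image itself is not reproduced.

```python
# Counts inferred from the solution text, not read from the original image.
tp, fn = 100, 5
tn, fp = 50, 10

total = tp + fn + tn + fp
accuracy = (tp + tn) / total                 # 150 / 165
misclassification_rate = (fp + fn) / total   # 15 / 165
true_positive_rate = tp / (tp + fn)          # 100 / 105
false_positive_rate = fp / (fp + tn)         # 10 / 60
```

Under these counts the misclassification rate and the false positive rate are both far below the values claimed in statements 2 and 3.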

37) For which of the following hyperparameters, higher value is

better for decision tree algorithm?

1. Number of samples used for split

2. Depth of tree

3. Samples for leaf

A)1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3

E) Can’t say

Solution: (E)

For all three hyperparameters (1, 2 and 3), increasing the value
does not necessarily improve performance. For example, if we
have a very high value of depth of tree, the resulting tree may
overfit the data and would not generalize well. On the other
hand, if we have a very low value, the tree may underfit the
data. So, we can’t say for sure that “higher is better”.

Context 38-39
Imagine, you have a 28 * 28 image and you run a 3 * 3
convolution neural network on it with the input depth of 3 and
output depth of 8.

Note: Stride is 1 and you are using same padding.

38) What is the dimension of output feature map when you are
using the given parameters.

A) 28 width, 28 height and 8 depth

B) 13 width, 13 height and 8 depth

C) 28 width, 13 height and 8 depth

D) 13 width, 28 height and 8 depth

Solution: (A)

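The formula for calculating output size is

output size = (N – F + 2P)/S + 1

where N is the input size, F is the filter size, P is the padding
on each side and S is the stride. With same padding (P = 1 for a
3 * 3 filter) and stride 1, the output is (28 – 3 + 2)/1 + 1 = 28
wide and 28 high, and its depth equals the output depth of 8.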
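
The output-size calculation can be sketched in Python. This version includes a padding term P, which the same-padding setting in the question requires; the function name conv_output_size is illustrative, and the second call uses an assumed setting that the text does not state.

```python
def conv_output_size(n, f, stride, padding):
    """Spatial output size of a convolution: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * padding) // stride + 1

# 28x28 input, 3x3 filter, stride 1, "same" padding (p = 1 for a 3x3 filter).
same_size = conv_output_size(28, 3, stride=1, padding=1)

# For comparison (an assumed setting, not stated in the text):
# no padding and stride 2.
valid_stride2 = conv_output_size(28, 3, stride=2, padding=0)
```

The depth of the output is set by the number of filters, not by this formula.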

39) What are the dimensions of the output feature map when you
are using the following parameters?

A) 28 width, 28 height and 8 depth

B) 13 width, 13 height and 8 depth

C) 28 width, 13 height and 8 depth

D) 13 width, 28 height and 8 depth

Solution: (B)

Same as above
40) Suppose, we were plotting the visualization for different
values of C (Penalty parameter) in SVM algorithm. Due to some
reason, we forgot to tag the C values with visualizations. In
that case, which of the following option best explains the C
values for the images below (1,2,3 left to right, so C values
are C1 for image1, C2 for image2 and C3 for image3 ) in case of
rbf kernel.

A) C1 = C2 = C3

B) C1 > C2 > C3

C) C1 < C2 < C3

D) None of these

Solution: (C)

C is the penalty parameter of the error term. It controls the
trade-off between a smooth decision boundary and classifying the
training points correctly. For large values of C, the
optimization will choose a smaller-margin hyperplane.
1. Which of the following is a widely used and effective machine learning algorithm
based on the idea of bagging?
A. Decision Tree
B. Regression
C. Classification
D. Random Forest
ANSWER: D

2. The most widely used metrics and tools to assess a classification model are:
A. Confusion matrix
B. Cost-sensitive accuracy
C. Area under the ROC curve
D. All of these
ANSWER: D

3. Which of the following is a good test dataset characteristic?

A. Large enough to yield meaningful results
B. Is representative of the dataset as a whole
C. Both A and B
D. None of these
ANSWER: C

4. How do you handle missing or corrupted data in a dataset?

A. Drop missing rows or columns
B. Replace missing values with mean/median/mode
C. Assign a unique category to missing values
D. All of these
ANSWER: D

5. What is the purpose of performing cross-validation?

A. To assess the predictive performance of the models
B. To judge how the trained model performs outside the sample on test data
C. Both A and B
D. None of these
ANSWER: C

6. Statistical significance is
A. The science of collecting, organizing and applying numerical facts
B. Measure of the probability that a certain hypothesis is incorrect given certain
observations
C. One of the defining aspects of a data warehouse, which is specially built around
all the existing applications of the operational data
D. None of these
ANSWER: B
7. Which of the following is an example of feature extraction?
A. Constructing bag of words vector from an email
B. Applying PCA to project large high-dimensional data
C. Removing stopwords in a sentence
D. All of these
ANSWER: D

8. How can you prevent a clustering algorithm from getting stuck in bad local optima?
A. Set the same seed value for each run
B. Use multiple random initializations
C. Both A and B
D. None of these
ANSWER: B

9. Adaptive system management is

A. It uses machine learning techniques; programs can learn from past experience and
adapt themselves to new situations
B. Computational procedure that takes some value as input and produces some value as
output
C. Science of making machines performs tasks that would require intelligence when
performed by humans
D. None of these
ANSWER: A

10. Binary attributes are

A. Attributes that take only two values. In general, these values will be 0 and 1,
and they can be coded as one bit
B. The natural environment of a certain species
C. Systems that can be used without knowledge of internal operations
D. None of these
ANSWER: A

11. Background knowledge refers to

A. Additional acquaintance used by a learning algorithm to facilitate the learning
process
B. Neural network that makes use of a hidden layer
C. It is a form of automatic learning
D. None of these
ANSWER: A

12. Classification is
A. Subdivision of a set of examples into a number of classes
B. Measure of the accuracy, of the classification of a concept that is given by a
certain theory
C. The task of assigning a classification to a set of examples
D. None of these
ANSWER: A

13. Classification accuracy is

A. Subdivision of a set of examples into a number of classes
B. Measure of the accuracy, of the classification of a concept that is given by a
certain theory
C. The task of assigning a classification to a set of examples
D. None of these
ANSWER: B

14. Cluster is
A. Group of similar objects that differ significantly from other objects
B. Operations on a database to transform or simplify data in order to prepare it for
a machine-learning algorithm
C. Symbolic representation of facts or ideas from which information can potentially
be extracted
D. None of these
ANSWER: A

15. Suppose you are given an EM algorithm that finds maximum likelihood estimates for
a model with latent variables. You are asked to modify the algorithm so that it finds MAP
estimates instead. Which step or steps do you need to modify?
A. Expectation
B. Maximization
C. No modification necessary
D. Both A & B
ANSWER: B

16. Compared to the variance of the Maximum Likelihood Estimate (MLE), the variance
of the Maximum A Posteriori (MAP) estimate is ________
A. Higher
B. Same
C. Lower
D. It could be any of the above
ANSWER: C

17. Incremental learning refers to

A. Machine-learning involving different techniques
B. The learning algorithm analyzes the examples on a systematic basis and makes
incremental adjustments to the theory that is learned
C. Learning by generalizing from examples
D. None of these
ANSWER: B

18. Inductive learning is

19. Predicting on whether will it rain or not tomorrow evening at a particular time
is a type of _________ problem.
A. Classification
B. Regression
C. Unsupervised learning
D. All of these
ANSWER: A

20. Machine learning is

A. An algorithm that can learn
B. Sub-discipline of computer science that deals with the design and implementation
of learning algorithms
C. An approach that abstracts from the actual strategy of an individual algorithm and
can therefore be applied to any other form of machine learning.
D. None of these
ANSWER: B

21. A feature F1 can take certain values: A, B, C, D, E, & F and represents grades of
students from a college. Which of the following statements is true in this case?
A. Feature F1 is an example of nominal variable.
B. Feature F1 is an example of ordinal variable.
C. It doesn’t belong to any of the above category.
D. Both of A & B
ANSWER: B

22. If your training loss increases with number of epochs, which of the following could
be a possible issue with the learning process?
A. Regularization is too low and model is overfitting
B. Regularization is too high and model is underfitting
C. Step size is too large
D. Step size is too small
ANSWER: C
23. Given a large dataset of medical records from patients suffering from heart disease,
try to learn whether there might be different clusters of such patients for which we might
tailor separate treatments. What kind of learning problem is this?
A. Supervised learning
B. Unsupervised learning
C. Both A and B
D. None of these
ANSWER: B

24. Multi-dimensional knowledge is

A. A class of learning algorithms that try to derive a Prolog program from examples
B. A table with n independent attributes can be seen as an n-dimensional space
C. A prediction made using an extremely simple method, such as always predicting the
same output
D. None of these
ANSWER: B

25. The mutual information

A. Is symmetric
B. Always non negative
C. Both A and B
D. None of these
ANSWER: C

26. Classifying email as a spam, labeling webpages based on their content, voice
recognition are the example of _____.
A. Supervised learning
B. Unsupervised learning
C. Machine learning
D. Deep learning
ANSWER: A

27. Deep learning is a subfield of machine learning where concerned algorithms are
inspired by the structure and function of the brain called _____.
A. Machine learning
B. Artificial neural networks
C. Deep learning
D. Robotics
ANSWER: B

28. The term "machine learning" was coined by _____.

A. John McCarthy
B. Nicklaus Wirth
C. Joseph Weizenbaum
D. Arthur Samuel
ANSWER: D

29. When the number of output classes is greater than one, there are two main
possibilities to manage a classification problem:
A. One-vs-all, One-vs-one
B. One-vs-one, Many-vs-one
C. One-vs-many, Many-vs-one
D. None of these
ANSWER: A

30. For a neural network, which one of these structural assumptions is the one that
most affects the trade-off between underfitting (i.e. a high bias model) and overfitting
(i.e. a high variance model):
A. The learning rate
B. The number of hidden nodes
C. The initial choice of weights
D. The use of a constant-term unit input
ANSWER: B

31. ___________ refers to a model that can neither model the training data nor
generalize to new data.
A. Good fitting
B. Overfitting
C. Underfitting
D. All of these
ANSWER: C

32. Given two Boolean random variables, A and B, where P(A) = 1/2, P(B) = 1/3, and P(A
| ¬B) = 1/4, what is P(A | B)?
A. 1/6
B. 1/4
C. 3/4
D. 1
ANSWER: D
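
A worked check of question 32 using exact fractions. It relies only on the law of total probability; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(1, 2)
p_b = Fraction(1, 3)
p_a_given_not_b = Fraction(1, 4)

# Law of total probability: P(A) = P(A|B) P(B) + P(A|not B) P(not B)
# Solve for P(A|B).
p_a_given_b = (p_a - p_a_given_not_b * (1 - p_b)) / p_b
```

Substituting the numbers gives (1/2 - 1/6) / (1/3), which is exactly 1.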

33. Suppose your model is overfitting. Which of the following is NOT a valid way to
try and reduce the overfitting?
A. Increase the amount of training data
B. Improve the optimization algorithm being used for error minimization
C. Decrease the model complexity
D. Reduce the noise in the training data
ANSWER: B

34. Predicting on whether will it rain or not tomorrow evening at a particular time
is a type of _________ problem.
A. Classification
B. Regression
C. Unsupervised learning
D. All of these
ANSWER: A

35. Given a large dataset of medical records from patients suffering from heart disease,
try to learn whether there might be different clusters of such patients for which we might
tailor separate treatments. What kind of learning problem is this?
A. Supervised learning
B. Unsupervised learning
C. Both A and B
D. Neither A nor B
ANSWER: B

37. Which of the following is NOT supervised learning?

A. Decision Tree
B. PCA
C. Linear Regression
D. Naive Bayesian
ANSWER: B

38. In 1984, the computer scientist _______ proposed a mathematical approach to
determine whether a problem is learnable by a computer.
A. John McCarthy
B. Niklaus Wirth
C. L. Valiant
D. Arthur Samuel
ANSWER: C

39. In binary classification, which error measure or loss function is used?
A. Non-negative error measure
B. Mean square error
C. Zero-one-loss
D. None of these
ANSWER: C

40. Benefits of Parametric Machine Learning Algorithms:

A. Complex, slow, more training data
B. Simpler, faster, less training data
C. Both A and B
D. Neither A nor B
ANSWER: B

41. Limitations of Parametric Machine Learning Algorithms:

A. Highly Constrained
B. Limited Complexity
C. Poor Fit
D. All of these
ANSWER: D

42. Artificial Neural Networks are an example of:

A. Nonparametric model
B. Parametric models
C. Both A and B
D. None of these
ANSWER: A

43. Benefits of Non-parametric Machine Learning Algorithms:

A. More data, Slower, Overfitting
B. Flexibility, Power, Performance
C. Both A and B
D. Neither A nor B
ANSWER: B

44. Limitations of Non-parametric Machine Learning Algorithms:

A. More data, Slower, Overfitting
B. Flexibility, Power, Performance
C. Both A and B
D. Neither A nor B
ANSWER: A

45. Naive Bayes is an example of:

A. Nonparametric model
B. Parametric models
C. Both A and B
D. Neither A nor B
ANSWER: B

46. Which of the following is a wrong statement about the maximum likelihood approach?
A. This method doesn’t always involve probability calculations
B. It finds a tree that best accounts for the variation in a set of sequences
C. The method is similar to the maximum parsimony method
D. The analysis is performed on each column of a multiple sequence alignment
ANSWER: A

47. The main disadvantage of maximum likelihood methods is that they are _____
A. Mathematically less folded
B. Mathematically less complex
C. Computationally lucid
D. Computationally intense
ANSWER: D

48. Which learning is often preferable to MAP learning?

A. Expectation-maximization
B. Log-likelihood (L)
C. Maximum-likelihood (ML)
D. None of these
ANSWER: C

49. Which measure is used in information theory?

A. Entropy
B. Cross-entropy
C. Conditional entropy
D. All of these
ANSWER: D

50. Which measure uses bits in information theory?

A. Entropy
B. Cross-entropy
C. Conditional entropy
D. All of these
ANSWER: A
MCQ questions for unit 3: Regression

Multiple choice questions

1) True-False: Linear Regression is a supervised machine learning algorithm.

A) TRUE
B) FALSE

Solution: (A)

Yes, linear regression is a supervised learning algorithm because it uses true labels for
training. A supervised learning algorithm should have an input variable (x) and an output variable
(Y) for each example.

2) True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE

Solution: (A)

Linear Regression has dependent variables that have continuous values.

3) True-False: It is possible to design a linear regression algorithm using a neural
network.

A) TRUE
B) FALSE

Solution: (A)

True. A neural network can be used as a universal approximator, so it can definitely
implement a linear regression algorithm.

4) Which of the following methods do we use to find the best fit line for data in Linear
Regression?

C2 General
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B

Solution: (A)

In linear regression, we try to minimize the least square errors of the model to identify the
line of best fit.

5) Which of the following evaluation metrics can be used to evaluate a model while
modeling a continuous output variable?

A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error

Solution: (D)

Since linear regression gives continuous values as output, we use the mean
squared error metric to evaluate the model performance. The remaining options are used in the case
of a classification problem.

6) True-False: Lasso Regularization can be used for variable selection in Linear
Regression.

A) TRUE
B) FALSE

Solution: (A)

True. In the case of lasso regression we apply an absolute penalty, which makes some of the
coefficients zero.

7) Which of the following is true about residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these

Solution: (A)

Residuals refer to the error values of the model. Therefore lower residuals are desired.

8) Suppose that we have N independent variables (X1, X2, ..., XN) and the dependent variable is
Y. Now imagine that you are applying linear regression by fitting the best-fit line using least
square error on this data.

You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.

Which of the following is true for X1?

A) Relation between the X1 and Y is weak

B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship

Solution: (B)

The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1
and Y.

9) You are given two variables V1 and V2 that have the following two
characteristics.

1. If V1 increases then V2 also increases

2. If V1 decreases then V2 behavior is unknown

Looking at these two characteristics, which of the following options is correct
for the Pearson correlation between V1 and V2?

A) Pearson correlation will be close to 1

B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these

Solution: (D)

We cannot comment on the correlation coefficient by using only statement 1. We need to
consider both of these statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.

10) Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right
to conclude that V1 and V2 do not have any relation between them?

A) TRUE
B) FALSE

Solution: (B)

The Pearson correlation coefficient between two variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together linearly. We can take examples like y=|x| or y=x^2.
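
The point about y=|x| and y=x^2 can be checked numerically. The snippet below is a minimal sketch: the helper `pearson` and the sample values are illustrative assumptions, chosen so that x is symmetric around zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]                  # symmetric around zero
r_square = pearson(xs, [x ** 2 for x in xs])   # y = x^2: fully dependent on x
r_abs = pearson(xs, [abs(x) for x in xs])      # y = |x|: fully dependent on x
r_line = pearson(xs, [2 * x + 1 for x in xs])  # y = 2x + 1: linear relationship
```

On this symmetric sample the first two coefficients come out as (numerically) zero although y is a function of x, while the linear case gives a coefficient of 1.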

11) Which of the following offsets do we use in linear regression’s least square line
fit? Suppose horizontal axis is independent variable and vertical axis is dependent
variable.

A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of above

Solution: (A)

We always consider residuals as vertical offsets. We calculate the direct differences
between the actual Y values and the predicted Y values. Perpendicular offsets are useful in the case of PCA.

12) True-False: Overfitting is more likely when you have a huge amount of data to
train on.

A) TRUE
B) FALSE

Solution: (B)

With a small training dataset, it’s easier to find a hypothesis that fits the training data exactly,
i.e. overfitting.

13) We can also compute the coefficient of linear regression with the help of an
analytical method called “Normal Equation”. Which of the following is/are true about
Normal Equation?

1. We don’t have to choose the learning rate

2. It becomes slow when number of features is very large

3. There is no need to iterate

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3

Solution: (D)

Instead of gradient descent, the Normal Equation can also be used to find the coefficients.
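
As an illustration of the three statements, here is a minimal sketch of the Normal Equation for the one-feature model y = a0 + a1*x. The function name `normal_equation_fit` and the sample data are assumptions made for this example. Because the design matrix has only two columns, the 2x2 matrix X^T X is inverted in closed form.

```python
def normal_equation_fit(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = a0 + a1 * x.

    X has a column of ones and a column of x values, so X^T X is 2x2
    and can be inverted directly: no learning rate and no iterations.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of X^T X
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # generated from y = 1 + 2x
a0, a1 = normal_equation_fit(xs, ys)
```

With many features the same idea requires inverting a much larger X^T X matrix, which is why the method becomes slow when the number of features is very large.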

14) Which of the following statement is true about sum of residuals of A and B?

The graphs below show two fitted regression lines (A and B) on randomly generated data. Now, I
want to find the sum of residuals in both cases A and B.

Note:

1. The scale is the same in both graphs for both axes.

2. The X-axis is the independent variable and the Y-axis is the dependent variable.

A) A has higher sum of residuals than B
B) A has lower sum of residuals than B
C) Both have same sum of residuals
D) None of these

Solution: (C)

For a least-squares fit that includes an intercept, the sum of residuals is always zero, therefore both have the same sum of residuals.
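
A short derivation of why the residuals sum to zero, assuming a one-feature model with an intercept (the symbols \beta_0, \beta_1 and r_i are notation introduced here for illustration):

```latex
\begin{aligned}
S(\beta_0, \beta_1) &= \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2 \\
\frac{\partial S}{\partial \beta_0} &= -2 \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right) = 0
\quad\Longrightarrow\quad \sum_{i=1}^{n} r_i = 0,
\qquad r_i = y_i - \beta_0 - \beta_1 x_i
\end{aligned}
```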

Question Context 15-17:

Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with penalty x.

15) Choose the option which describes bias in best manner.

A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these

Solution: (B)

If the penalty is very large it means model is less complex, therefore the bias would be high.

16) What will happen when you apply very large penalty?

A) Some of the coefficients will become exactly zero

B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these

Solution: (B)

In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients
become close to zero but not exactly zero.

17) What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these

Solution: (A)

As already discussed, lasso applies absolute penalty, so some of the coefficients will
become zero.
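
The difference between the two penalties can be sketched in a simplified setting. The code below assumes orthonormal features, with the lasso objective written as (1/2)*squared error + lam*|b| and the ridge objective as (1/2)*squared error + (lam/2)*b^2; under those assumptions each coefficient has a closed-form solution. The function names and the sample coefficients are illustrative.

```python
def lasso_shrink(b_ols, lam):
    """Soft-thresholding: the lasso coefficient under orthonormal features."""
    magnitude = max(abs(b_ols) - lam, 0.0)
    return magnitude if b_ols >= 0 else -magnitude

def ridge_shrink(b_ols, lam):
    """Proportional shrinkage: the ridge coefficient under orthonormal features."""
    return b_ols / (1.0 + lam)

ols_coefficients = [3.0, -0.5, 0.2]   # hypothetical least-squares coefficients
lam = 1.0                             # penalty strength
lasso = [lasso_shrink(b, lam) for b in ols_coefficients]
ridge = [ridge_shrink(b, lam) for b in ols_coefficients]
```

In this sketch the two small coefficients are set exactly to zero by the lasso rule, while the ridge rule only scales every coefficient towards zero.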

18) Which of the following statements is true about outliers in linear regression?

A) Linear regression is sensitive to outliers

B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these

Solution: (A)

The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.

19) Suppose you plotted a scatter plot between the residuals and predicted values in
linear regression and you found that there is a relationship between them. Which of
the following conclusions do you make about this situation?

A) Since there is a relationship, our model is not good

B) Since there is a relationship, our model is good
C) Can’t say
D) None of these

Solution: (A)

There should not be any relationship between predicted values and residuals. If there exists
any relationship between them, it means that the model has not perfectly captured the
information in the data.

Question Context 20-22:

Suppose a degree 3 polynomial regression model fits the data perfectly (the training error is zero).

20) What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these

Solution: (A)

Since a degree 4 polynomial is more complex than the degree 3 model, it is likely to overfit the data while
again fitting the training data perfectly. In such a case the training error will be zero but the test error may not
be zero.

21) What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that degree 2 polynomial will over fit the data
B) There are high chances that degree 2 polynomial will under fit the data
C) Can’t say
D) None of these

Solution: (B)

If a degree 3 polynomial fits the data perfectly, it’s highly likely that a simpler model (degree
2 polynomial) will under fit the data.

22) In terms of bias and variance, which of the following is true when you fit a degree 2
polynomial?

A) Bias will be high, variance will be high

B) Bias will be low, variance will be high

C) Bias will be high, variance will be low
D) Bias will be low, variance will be low

Solution: (C)

Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will be
high and variance will be low.

Question Context 23:

Which of the following is true about the graphs below (A, B, C from left to right) of the cost
function against the number of iterations?

23) Suppose l1, l2 and l3 are the three learning rates for A,B,C respectively. Which of
the following is true about l1,l2 and l3?

A) l2 < l1 < l3

B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these

Solution: (A)

With a high learning rate, the steps are large: the objective function decreases quickly
at first, but it does not find the global minimum and starts increasing after a
few iterations.

With a low learning rate, the steps are small, so the objective function decreases
slowly.
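
The effect of the learning rate on the cost curve can be sketched with gradient descent on a toy objective. The function f(w) = w**2, the starting point and the three learning rates are assumptions made for illustration. On this simple quadratic a too-large rate makes the cost grow from the first step, which is a simplification of the behaviour described above.

```python
def gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize f(w) = w**2 (gradient 2*w) and record the cost per iteration."""
    w = start
    costs = [w ** 2]
    for _ in range(steps):
        w -= learning_rate * 2 * w   # step against the gradient
        costs.append(w ** 2)
    return costs

slow = gradient_descent(0.01)        # small steps: cost falls slowly
good = gradient_descent(0.3)         # moderate steps: cost falls quickly
diverging = gradient_descent(1.1)    # steps too large: cost increases
```

Plotting each list against the iteration index would give cost curves comparable in spirit to the three graphs in the question.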

Question Context 24-25:

24) Now we increase the training set size gradually. As the training set size increases,
what do you expect will happen with the mean training error?

A) Increase
B) Decrease
C) Remain constant
D) Can’t Say

Solution: (D)

Training error may increase or decrease depending on the values that are used to fit the
model. If the values used to train contain more outliers gradually, then the error might just
increase.

25) What do you expect will happen with bias and variance as you increase the size
of training data?

A) Bias increases and Variance increases

B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
E) Can’t Say

Solution: (D)

As we increase the size of the training data, the bias would increase while the variance
would decrease.

Question Context 26:

Consider the following data where one input(X) and one output(Y) is given.

26) What would be the root mean square training error for this data if you run a
Linear Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these

Solution: (C)

We can fit the line perfectly on the given data, so the root mean square training error will be zero.

Question Context 27-28:

Suppose you have been given the following scenario for training and validation error for
Linear Regression.

Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

27) Which of the following scenarios would give you the right hyper parameter?

A) 1
B) 2
C) 3
D) 4

Solution: (B)

Option B would be the better option because it leads to the lowest training as well as validation
error.

28) Suppose you got the tuned hyper parameters from the previous question. Now,
imagine you want to add a variable to the variable space such that this added feature is
important. Which of the following would you observe in such a case?

A) Training Error will decrease and Validation error will increase

B) Training Error will increase and Validation error will increase

C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above

Solution: (D)

If the added feature is important, the training and validation error would decrease.

Question Context 29-30:

Suppose you find that your linear regression model is under
fitting the data.

29) In such situation which of the following options would you consider?

1. I will add more variables

2. I will start introducing polynomial degree variables

3. I will remove some variables

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3

Solution: (A)

In the case of under fitting, you need to introduce more variables into the variable space, or you can add
some polynomial degree variables, to make the model more complex so that it can fit the
data better.

30) Now the situation is the same as in the previous question (under fitting). Which of the
following regularization algorithms would you prefer?

A) L1
B) L2
C) Any
D) None of these

Solution: (D)

I won’t use any regularization methods because regularization is used in case of overfitting.

MCQs ON Logistic Regression

1) True-False: Is Logistic regression a supervised machine learning algorithm?

A) TRUE
B) FALSE

Solution: A

True, logistic regression is a supervised learning algorithm because it uses true labels for
training. A supervised learning algorithm should have input variables (x) and a target
variable (Y) when you train the model.

2) True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE

Solution: B

Logistic regression is a classification algorithm; don’t be confused by the word “regression” in its name.

3) True-False: Is it possible to design a logistic regression algorithm using a Neural
Network Algorithm?

A) TRUE
B) FALSE

Solution: A

True, a neural network is a universal approximator, so it can implement a logistic regression
algorithm.

4) True-False: Is it possible to apply a logistic regression algorithm on a 3-class
classification problem?

A) TRUE
B) FALSE

Solution: A

Yes, we can apply logistic regression to a 3-class classification problem. We can use the One-vs-all
method for 3-class classification in logistic regression.

5) Which of the following methods do we use to best fit the data in Logistic
Regression?

A) Least Square Error

B) Maximum Likelihood
C) Jaccard distance
D) Both A and B

Solution: B

Logistic regression is trained using maximum likelihood estimation.

6) Which of the following evaluation metrics cannot be applied in the case of logistic
regression output to compare with the target?

A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error

Solution: D

Since logistic regression is a classification algorithm, its output is not a continuous real
value, so mean squared error cannot be used to evaluate it.

7) One of the very good methods to analyze the performance of Logistic Regression
is AIC, which is similar to R-Squared in Linear Regression. Which of the following is
true about AIC?

A) We prefer a model with minimum AIC value

B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these

Solution: A

In logistic regression we select the model with the lowest AIC. For more information
refer to this source: http://www4.ncsu.edu/~shu3/Presentation/AIC.pdf

8) [True-False] Standardisation of features is required before training a Logistic
Regression.

A) TRUE
B) FALSE

Solution: B

Standardization isn’t required for logistic regression. The main goal of standardizing
features is to help convergence of the technique used for optimization.

9) Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge

C) Both
D) None of these

Solution: A

In the case of lasso we apply an absolute penalty; after increasing the penalty in
lasso, some of the coefficients of the variables may become zero.

Context: 10-11

Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x)
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w), viewed as a function of x, can be changed by
varying the parameters w.

10) What would be the range of p in such case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For any real value of x from −∞ to +∞, the logistic function gives an
output in the interval (0,1).
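
The range of p can be checked with a small sketch of the model P(y = 1 | x, w) = g(w0 + w1x). The parameter values w0 and w1 and the sample inputs are hypothetical, picked only for illustration.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

w0, w1 = -1.0, 2.0                    # hypothetical parameters
inputs = [-10, -1, 0, 0.5, 1, 10]
probabilities = [logistic(w0 + w1 * x) for x in inputs]
```

Every value in `probabilities` lies strictly between 0 and 1, and the values increase with x because w1 is positive in this example.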

11) In the above question, which function do you think would make p lie between (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

Explanation is same as question number 10

Context: 12-13

Suppose you train a logistic regression classifier and your hypothesis function H is

12) Which of the following figure will represent the decision boundary as given by
above classifier?

Solution: B

Option B would be the right answer. Our line will be represented by y = g(-6+x2), which
is shown in option A and option B. But option B is the right answer because when you
put the value x2 = 6 in the equation you get y = g(0), which means y = 0.5 will be on
the line; if you increase the value of x2 beyond 6 you will get negative values, so the output
will be the region y = 0.

13) If you replace the coefficient of x1 with that of x2, what would be the output figure?

Solution: D

Same explanation as in previous question.

14) Suppose you have been given a fair coin and you want to find out the odds of
getting heads. Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. So
in the case of a fair coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds
would be 1.

15) The logit function(given as l(x)) is the log of odds function. What could be the
range of logit function in the domain x=[0,1]?

A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability
function, which has values from 0 to 1, into an equivalent function with values between 0
and ∞. When we take the natural log of the odds function, we get a range of values from -∞
to ∞.
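
The odds and logit functions described above can be written in a few lines. This is a minimal sketch; the function names and the probability values used to probe them are illustrative assumptions.

```python
import math

def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

def logistic(z):
    """Logistic function, which undoes the logit transformation."""
    return 1.0 / (1.0 + math.exp(-z))

fair_coin_odds = odds(0.5)                  # fair coin: 0.5 / 0.5
low, mid, high = logit(0.001), logit(0.5), logit(0.999)
round_trip = logistic(logit(0.8))           # maps back to the probability
```

Probabilities near 0 give large negative logits and probabilities near 1 give large positive logits, while applying the logistic function to a logit recovers the original probability.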

16) Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally
distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed

Solution:A

Only A is true.

17) Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is an inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B

MCQ For UNIT 2

1) Which of the following statement is true in following case?

A) Feature F1 is an example of nominal variable.

B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these

Solution: (B)

Ordinal variables are variables whose categories have some order. For example,
grade A should be considered a higher grade than grade B.

2) Which of the following is an example of a deterministic algorithm?

A) PCA

B) K-Means

C) None of the above

Solution: (A)

A deterministic algorithm is one whose output does not change across different runs. PCA
would give the same result if we ran it again, but k-means would not.

3) [True or False] The Pearson correlation between two variables is zero, but their
values can still be related to each other.

A) TRUE

B) FALSE

Solution: (A)

Y = X^2. Note that they are not only associated, but one is a function of the other, and the
Pearson correlation between them is 0.

4) Which of the following statement(s) is / are true for Gradient Descent (GD) and
Stochastic Gradient Descent (SGD)?

1. In GD and SGD, you update a set of parameters in an iterative manner to
minimize the error function.

2. In SGD, you have to run through all the samples in your training set for a
single update of a parameter in each iteration.

3. In GD, you either use the entire data or a subset of training data to update a
parameter in each iteration.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 2 and 3

F) 1,2 and 3

Solution: (A)

In SGD, each iteration uses a batch that generally contains a random
sample of the data, whereas in GD each iteration uses all of the training observations.

5) Which of the following hyper parameter(s), when increased, may cause a random
forest to over fit the data?

1. Number of Trees

2. Depth of Tree

3. Learning Rate

A) Only 1

B) Only 2

C) Only 3
D) 1 and 2

E) 2 and 3

F) 1,2 and 3

Solution: (B)

Usually, if we increase the depth of the trees it will cause overfitting. Learning rate is not a
hyperparameter in random forest. An increase in the number of trees will cause under fitting.

6) Imagine, you are working with “Analytics Vidhya” and you want to develop a
machine learning algorithm which predicts the number of views on the articles.

Your analysis is based on features like author name, number of articles written by the
same author on Analytics Vidhya in past and a few other features. Which of the
following evaluation metric would you choose in that case?

1. Mean Square Error

2. Accuracy

3. F1 Score

A) Only 1

B) Only 2

C) Only 3

D) 1 and 3

E) 2 and 3

F) 1 and 2

Solution:(A)

The number of views of an article can be treated as a continuous target variable, which falls
under regression problems. So, mean squared error will be used as the evaluation
metric.

7) Given below are three images (1,2,3). Which of the following options is correct for
these images?

A) 1 is tanh, 2 is ReLU and 3 is SIGMOID activation functions.

B) 1 is SIGMOID, 2 is ReLU and 3 is tanh activation functions.

C) 1 is ReLU, 2 is tanh and 3 is SIGMOID activation functions.

D) 1 is tanh, 2 is SIGMOID and 3 is ReLU activation functions.

Solution: (D)

The range of the SIGMOID function is (0,1).

The range of the tanh function is (-1,1).

The range of the RELU function is [0, infinity).

So Option D is the right answer.
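
The three ranges can be checked numerically. The snippet below is a small sketch; the helper names and the sample inputs are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Sigmoid activation: output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Hyperbolic tangent activation: output strictly between -1 and 1."""
    return math.tanh(z)

def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)

inputs = [-5.0, -0.5, 0.0, 0.5, 5.0]
sigmoid_values = [sigmoid(z) for z in inputs]
tanh_values = [tanh(z) for z in inputs]
relu_values = [relu(z) for z in inputs]
```

Of the three functions, only tanh returns negative values, which is what distinguishes its curve from the sigmoid and ReLU curves.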

8) Below are the 8 actual values of target variable in the train file.

[0,0,0,1,1,1,1,1]

What is the entropy of the target variable?

A) -(5/8 log(5/8) + 3/8 log(3/8))

B) 5/8 log(5/8) + 3/8 log(3/8)

C) 3/8 log(5/8) + 5/8 log(3/8)

D) 5/8 log(3/8) – 3/8 log(5/8)

Solution: (A)

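The formula for entropy is $H = -\sum_i p_i \log p_i$, where $p_i$ is the proportion of class $i$. Here the proportions are 5/8 (ones) and 3/8 (zeros).

So the answer is A.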
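
For reference, the entropy of this target variable can be computed directly. The sketch below assumes base-2 logarithms, so the result is in bits; the function name `entropy` is illustrative.

```python
import math
from collections import Counter

def entropy(labels, base=2):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

target = [0, 0, 0, 1, 1, 1, 1, 1]
h = entropy(target)   # -(5/8 * log2(5/8) + 3/8 * log2(3/8)), about 0.954 bits
```

A target with a single class would give an entropy of zero, and a perfectly balanced binary target would give 1 bit.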

9) Let’s say, you are working with categorical feature(s) and you have not looked at
the distribution of the categorical variable in the test data.

You want to apply one hot encoding (OHE) on the categorical feature(s). What
challenges may you face if you have applied OHE on a categorical variable of the train
dataset?

A) All categories of categorical variable are not present in the test dataset.

B) Frequency distribution of categories is different in train as compared to the test dataset.

C) Train and Test always have same distribution.

D) Both A and B

E) None of these

Solution: (D)

Both are true. OHE will fail to encode the categories which are present in test but not in
train, so this could be one of the main challenges while applying OHE. The challenge given in
option B is also true: you need to be more careful while applying OHE if the frequency distribution
is not the same in train and test.
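
The unseen-category problem can be sketched with a hand-rolled encoder. The functions `fit_categories` and `one_hot` and the colour values are hypothetical. Mapping an unseen category to an all-zero vector is only one possible policy, chosen here for illustration.

```python
def fit_categories(train_values):
    """Learn the column order from the categories seen in training."""
    return sorted(set(train_values))

def one_hot(value, categories):
    """Encode one value; a category unseen in training becomes all zeros."""
    return [1 if value == c else 0 for c in categories]

train = ["red", "green", "red", "blue"]
test = ["green", "purple"]            # "purple" never appears in train
categories = fit_categories(train)
encoded_test = [one_hot(v, categories) for v in test]
```

The encoder has no column for "purple", so that row carries no information about its category. This is the kind of silent failure that is worth checking for before applying OHE.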

10) The skip-gram model is one of the best models used in the Word2vec algorithm for word
embeddings. Which one of the following models depicts the skip-gram model?

A) A

B) B

C) Both A and B

D) None of these

Solution: (B)

Both models (model1 and model2) are used in the Word2vec algorithm. Model1 represents
a CBOW model, whereas Model2 represents the skip-gram model.

11) Let’s say, you are using activation function X in hidden layers of neural network.
At a particular neuron for any given input, you get the output as “-0.0001”. Which of
the following activation function could X represent?

A) ReLU

B) tanh

C) SIGMOID

D) None of these

Solution: (B)

The function is tanh because its output range is (-1, 1).

12) [True or False] LogLoss evaluation metric can have negative values.

A) TRUE
B) FALSE

Solution: (B)

Log loss cannot have negative values.
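
A short sketch shows why: every term of the log loss is the negative logarithm of a probability between 0 and 1, so it cannot be below zero. The function below is an illustrative binary log loss. The clipping constant `eps` is an assumption added to avoid log(0).

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

confident_right = log_loss([1, 0], [0.9, 0.1])   # small loss
confident_wrong = log_loss([1, 0], [0.1, 0.9])   # large loss
```

Both values are non-negative, and the loss grows as confident predictions move away from the true labels.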

13) Which of the following statements is/are true about “Type-1” and “Type-2” errors?

1. Type1 is known as false positive and Type2 is known as false negative.

2. Type1 is known as false negative and Type2 is known as false positive.

3. Type1 error occurs when we reject a null hypothesis when it is actually true.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 1 and 3

F) 2 and 3

Solution: (E)

In statistical hypothesis testing, a type I error is the incorrect rejection of a true null
hypothesis (a “false positive”), while a type II error is incorrectly retaining a false null
hypothesis (a “false negative”).

14) Which of the following is/are important step(s) to pre-process the text
in NLP-based projects?

1. Stemming

2. Stop word removal

3. Object Standardization

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) 1,2 and 3

Solution: (D)

Stemming is a rudimentary rule-based process of stripping the suffixes (“ing”, “ly”, “es”, “s”
etc) from a word.

Stop words are words that are not relevant to the context of the data, for
example is/am/are.

Object Standardization is also a good way to pre-process the text.

15) Suppose you want to project high dimensional data into lower dimensions. The
two most famous dimensionality reduction algorithms used here are PCA and t-SNE.
Let’s say you have applied both algorithms respectively on data “X” and you got the
datasets “X_projected_PCA” , “X_projected_tSNE”.

Which of the following statements is true for “X_projected_PCA” &
“X_projected_tSNE”?

A) X_projected_PCA will have interpretation in the nearest neighbour space.

B) X_projected_tSNE will have interpretation in the nearest neighbour space.

C) Both will have interpretation in the nearest neighbour space.

D) None of them will have interpretation in the nearest neighbour space.

Solution: (B)

The t-SNE algorithm considers nearest neighbour points to reduce the dimensionality of the data.
So, after using t-SNE, the reduced dimensions will also have an interpretation in
nearest neighbour space. But in the case of PCA this is not so.

Context: 16-17

Given below are three scatter plots for two features (Image 1, 2 & 3 from left to right).

16) In the above images, which of the following is/are an example of multi-collinear
features?

A) Features in Image 1

B) Features in Image 2

C) Features in Image 3

D) Features in Image 1 & 2

E) Features in Image 2 & 3

F) Features in Image 3 & 1

Solution: (D)

In Image 1, the features have high positive correlation, whereas in Image 2 the features have high negative
correlation, so in both images the pairs of features are examples of
multicollinear features.

17) In the previous question, suppose you have identified multi-collinear features. Which
of the following action(s) would you perform next?

1. Remove both collinear variables.

2. Instead of removing both variables, we can remove only one variable.

3. Removing correlated variables might lead to loss of information. In order to
retain those variables, we can use penalized regression models like ridge or
lasso regression.

A) Only 1

B) Only 2

C) Only 3

D) Either 1 or 3

E) Either 2 or 3

Solution: (E)

You cannot remove both features, because after removing both you will
lose all of the information. You should either remove only one feature or use a
regularization algorithm like L1 or L2.

18) Adding a non-important feature to a linear regression model may result in:

1. Increase in R-square

2. Decrease in R-square

A) Only 1 is correct

B) Only 2 is correct

C) Either 1 or 2

D) None of these

Solution: (A)

After adding a feature to the feature space, whether that feature is important or unimportant,
the R-squared never decreases (it typically increases).

19) Suppose, you are given three variables X, Y and Z. The Pearson correlation
coefficients for (X, Y), (Y, Z) and (X, Z) are C1, C2 & C3 respectively.

Now, you have added 2 to all values of X (i.e. new values become X+2), subtracted 2
from all values of Y (i.e. new values are Y-2) and Z remains the same. The new
coefficients for (X,Y), (Y,Z) and (X,Z) are given by D1, D2 & D3 respectively. How do
the values of D1, D2 & D3 relate to C1, C2 & C3?

A) D1= C1, D2 < C2, D3 > C3

B) D1 = C1, D2 > C2, D3 > C3

C) D1 = C1, D2 > C2, D3 < C3

D) D1 = C1, D2 < C2, D3 < C3

E) D1 = C1, D2 = C2, D3 = C3

F) Cannot be determined

Solution: (E)

The correlation between features does not change if you add a constant to, or subtract a
constant from, the features.
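
A small check of this invariance, as a sketch: the `pearson` helper and the sample values below are made up for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [3.0, 1.0, 6.0, 5.0, 9.0]
z = [2.0, 8.0, 3.0, 9.0, 4.0]

x_shift = [v + 2 for v in x]   # X + 2
y_shift = [v - 2 for v in y]   # Y - 2

c = (pearson(x, y), pearson(y, z), pearson(x, z))
d = (pearson(x_shift, y_shift), pearson(y_shift, z), pearson(x_shift, z))
print(c)
print(d)  # same values as c: shifting a variable does not change its deviations from the mean
```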

20) Imagine you are solving a classification problem with highly imbalanced classes.
The majority class is observed 99% of the time in the training data.

Your model has 99% accuracy after taking the predictions on test data. Which of the
following is true in such a case?

1. Accuracy metric is not a good idea for imbalanced class problems.

2. Accuracy metric is a good idea for imbalanced class problems.

3. Precision and recall metrics are good for imbalanced class problems.

4. Precision and recall metrics aren’t good for imbalanced class problems.

A) 1 and 3

B) 1 and 4

C) 2 and 3

D) 2 and 4

Solution: (A)

Refer to question number 4 in this article.

21) In ensemble learning, you aggregate the predictions of weak learners, so that an
ensemble of these models will give a better prediction than the predictions of the
individual models.

Which of the following statements is / are true for weak learners used in ensemble
model?

1. They don’t usually overfit.

2. They have high bias, so they cannot solve complex learning problems

3. They usually overfit.

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) Only 1

E) Only 2

F) None of the above

Solution: (A)

Weak learners are good at only a particular part of a problem. So they usually don’t overfit,
which means that weak learners have low variance and high bias.

22) Which of the following options is/are true for K-fold cross-validation?

1. Increase in K will result in higher time required to cross validate the result.

2. Higher values of K will result in higher confidence in the cross-validation
result as compared to lower values of K.

3. If K=N, then it is called Leave one out cross validation, where N is the number
of observations.

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1,2 and 3

Solution: (D)

A larger k means less bias towards overestimating the true expected error (as the training
folds will be closer to the total dataset) and a higher running time (as you are getting closer to
the limit case: Leave-One-Out CV). We also need to consider the variance between the
accuracies of the k folds when selecting k.
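
To make the fold structure concrete, here is a minimal sketch of a K-fold index splitter. The function name `k_fold_indices` is hypothetical, and contiguous, unshuffled folds are an assumption made for brevity.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, validation_indices) for k contiguous folds over n observations."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, validation
        start += size

# K = 3 on 6 observations: three model fits, each validated on 2 points.
three_fold = list(k_fold_indices(6, 3))

# K = N is leave-one-out: N model fits, each validated on a single point.
leave_one_out = list(k_fold_indices(5, 5))
print(len(three_fold), len(leave_one_out))
```

The number of model fits equals K, which is why the running time grows as K increases.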

Question Context 23-24

Cross-validation is an important step in machine learning for hyperparameter tuning.

Let’s say you are tuning a hyperparameter “max_depth” for GBM by selecting it from
10 different depth values (values are greater than 2) for a tree-based model using 5-fold
cross-validation.

The time taken by the algorithm (on a model with max_depth 2) to train on 4 folds is 10
seconds, and the time to predict on the remaining fold is 2 seconds.

Note: Ignore hardware dependencies from the equation.

23) Which of the following options is true for the overall execution time of 5-fold
cross-validation with 10 different values of “max_depth”?

A) Less than 100 seconds

B) 100 – 300 seconds

C) 300 – 600 seconds

D) More than or equal to 600 seconds

E) None of the above

F) Can’t estimate

Solution: (D)

Each fold for depth “2” in 5-fold cross-validation takes 10 seconds for training and 2
seconds for testing. So 5 folds take 12*5 = 60 seconds. Since we are searching over
10 depth values, the algorithm would take at least 60*10 = 600 seconds. Training and testing
a model with a depth greater than 2 takes more time than with depth “2”, so the overall time
would be greater than 600 seconds.
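
The arithmetic above can be written as a small helper. This is only a sketch; the function name `cv_lower_bound_seconds` is hypothetical, and the result is a lower bound because deeper trees take longer than the depth-2 timing used here.

```python
def cv_lower_bound_seconds(train_s, predict_s, folds, n_settings):
    """Lower bound on total cross-validation time for a grid of hyperparameter settings."""
    per_fold = train_s + predict_s          # one train + one predict
    per_setting = per_fold * folds          # every fold is held out once
    return per_setting * n_settings

print(cv_lower_bound_seconds(10, 2, 5, 10))      # 10 depth values -> 600
print(cv_lower_bound_seconds(10, 2, 5, 10 * 5))  # 10 depths x 5 learning rates -> 3000
```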

24) In the previous question, suppose you tune 2 hyperparameters with the same
algorithm, say “max_depth” and “learning_rate”.

You want to select the right value for “max_depth” (from the given 10 depth values)
and for the learning rate (from 5 given learning rates). In such a case, which of the
following will represent the overall time?

A) 1000-1500 seconds

B) 1500-3000 seconds

C) More than or equal to 3000 seconds

D) None of these

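Solution: (C)

Same reasoning as question number 23: there are 10*5 = 50 combinations of “max_depth” and
“learning_rate”, and each takes at least 60 seconds, so the overall time is at least
60*50 = 3000 seconds.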

25) Given below is a scenario for training error TE and Validation error VE for a
machine learning algorithm M1. You want to choose a hyperparameter (H) based on
TE and VE.
H TE VE
1 105 90
2 200 85
3 250 96
4 105 85
5 300 100
Which value of H will you choose based on the above table?

A) 1

B) 2

C) 3

D) 4

E) 5

Solution: (D)

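Looking at the table, H = 4 (option D) is the best choice: it has the lowest validation error
(85, tied with H = 2) and, of those two, the lower training error (105 versus 200).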

26) What would you do in PCA to get the same projection as SVD?

A) Transform data to zero mean

B) Transform data to zero median

C) Not possible

D) None of these

Solution: (A)

When the data has a zero mean vector, PCA will have the same projections as SVD; otherwise
you have to centre the data first before taking the SVD.

Question Context 27-28

Assume there is a black box algorithm, which takes training data with multiple
observations (t1, t2, t3,…….. tn) and a new observation (q1). The black box outputs
the nearest neighbor of q1 (say ti) and its corresponding class label ci.

You can also think that this black box algorithm is same as 1-NN (1-nearest
neighbor).

27) It is possible to construct a k-NN classification algorithm based on this black box
alone.

Note: Where n (number of training observations) is very large compared to k.

A) TRUE

B) FALSE

Solution: (A)

In the first step, you pass the observation (q1) to the black box algorithm, which
returns the nearest observation and its class.

In the second step, you remove that nearest observation from the training data and pass
the observation (q1) again. The black box algorithm will again return the nearest remaining
observation and its class.

You need to repeat this procedure k times, then take a majority vote over the k returned class labels.
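
The procedure above can be sketched as follows. The black box is simulated here by a hypothetical `one_nn` function using squared Euclidean distance, and the tiny training set is made up for illustration.

```python
from collections import Counter

def one_nn(train, query):
    """Black box: return the (point, label) pair in train that is nearest to query."""
    return min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))

def knn_from_black_box(train, query, k):
    """Build k-NN by calling the 1-NN black box k times, removing each hit from the data."""
    remaining = list(train)
    labels = []
    for _ in range(k):
        nearest = one_nn(remaining, query)
        labels.append(nearest[1])
        remaining.remove(nearest)   # throw the found neighbor out before asking again
    return Counter(labels).most_common(1)[0][0]   # majority vote

train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "b"),
    ((0.2, 0.1), "b"),
    ((5.0, 5.0), "a"),
    ((5.1, 5.0), "a"),
]
query = (0.0, 0.05)
print(knn_from_black_box(train, query, 1))  # nearest single point has label "a"
print(knn_from_black_box(train, query, 3))  # three nearest labels are a, b, b -> "b"
```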

28) Instead of using the 1-NN black box, we want to use a j-NN (j>1) algorithm as the black
box. Which of the following options is correct for finding k-NN using j-NN?

1. j must be a proper factor of k

2. j > k

3. Not possible

A) 1

B) 2

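C) 3

Solution: (A)

Same as question number 27: if j is a factor of k, you call the j-NN black box k/j times,
removing the j returned neighbors from the training data after each call.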

29) Suppose you are given 7 Scatter plots 1-7 (left to right) and you want to compare
Pearson correlation coefficients between variables of each scatterplot.

Which of the following is in the right order?

1. 1<2<3<4

2. 1>2>3 > 4

3. 7<6<5<4

4. 7>6>5>4

A) 1 and 3

B) 2 and 3

C) 1 and 4

D) 2 and 4

Solution: (B)

From image 1 to 4 the correlation is decreasing (in absolute value). From image 4 to 7 the
absolute value of the correlation is increasing, but the values are negative (for example, 0, -0.3, -0.7, -0.99).

30) You can evaluate the performance of a binary classification problem using
different metrics such as accuracy, log-loss, F-Score. Let’s say you are using the
log-loss function as the evaluation metric.

Which of the following options is / are true for the interpretation of log-loss as an
evaluation metric?

1. If a classifier is confident about an incorrect classification, then log-loss will penalise
it heavily.

2. If, for a particular observation, the classifier assigns a very small probability to the
correct class, then the corresponding contribution to the log-loss will be very large.

3. Lower the log-loss, the better is the model.

A) 1 and 3

B) 2 and 3

C) 1 and 2

D) 1,2 and 3

Solution: (D)

Options are self-explanatory.
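
A short numeric illustration of the three statements, as a sketch: `log_loss` below is a hand-written binary log-loss with clipping, not a library function, and the probabilities are made up.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary log-loss; p_pred holds predicted probabilities of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# The true class is 1 in all three cases.
confident_right = log_loss([1], [0.99])   # about 0.01
unsure = log_loss([1], [0.5])             # about 0.69
confident_wrong = log_loss([1], [0.01])   # about 4.61, penalised heavily
print(confident_right, unsure, confident_wrong)
```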

Question 31-32

Below are five samples given in the dataset.

Note: Visual distance between the points in the image represents the actual distance.

31) Which of the following is the leave-one-out cross-validation accuracy for 3-NN
(3-nearest neighbor)?

A) 0

B) 0.4

C) 0.8

D) 1

Solution: (C)

In leave-one-out cross-validation, we select (n-1) observations for training and 1
observation for validation. Consider each point in turn as the validation point and find the 3
nearest points to it. If you repeat this procedure for all points, you will get the
correct classification for all positive class points in the above figure, but the negative class
will be misclassified. Hence you will get 80% accuracy.

32) Which of the following values of K will have the least leave-one-out cross-validation
accuracy?

A) 1NN

B) 3NN

C) 4NN

D) All have same leave one out error

Solution: (A)

Each point will always be misclassified by 1-NN, which means that you will get 0%
accuracy.

33) Suppose you are given the below data and you want to apply a logistic regression
model for classifying it in two given classes.
You are using logistic regression with L1 regularization.

Where C is the regularization parameter and w1 & w2 are the coefficients of x1 and x2.

Which of the following options is correct when you increase the value of C from zero to a
very large value?

A) First w2 becomes zero and then w1 becomes zero

B) First w1 becomes zero and then w2 becomes zero

C) Both becomes zero at the same time

D) Both cannot be zero even after very large value of C

Solution: (B)

By looking at the image, we see that even using only x2 we can efficiently perform
classification. So w1 will become 0 first. As the regularization parameter increases further, w2
will come closer and closer to 0.

34) Suppose we have a dataset which can be trained with 100% accuracy with help of
a decision tree of depth 6. Now consider the points below and choose the option
based on these points.

Note: All other hyper parameters are same and other factors are not affected.

1. Depth 4 will have high bias and low variance

2. Depth 4 will have low bias and low variance

A) Only 1

B) Only 2

C) Both 1 and 2

D) None of the above

Solution: (A)

If you fit a decision tree of depth 4 on such data, it is more likely to underfit the data.
In the case of underfitting you will have high bias and low variance.

35) Which of the following options can be used to get global minima in k-Means
Algorithm?

1. Try to run algorithm for different centroid initialization

2. Adjust number of iterations

3. Find out the optimal number of clusters

A) 2 and 3

B) 1 and 3

C) 1 and 2

D) All of above

Solution: (D)

All of the options can be tuned to find the global minima.

36) Imagine you are working on a project which is a binary classification problem.
You trained a model on training dataset and get the below confusion matrix on
validation dataset.

Based on the above confusion matrix, choose which option(s) below will give you
correct predictions?

1. Accuracy is ~0.91

2. Misclassification rate is ~ 0.91

3. False positive rate is ~0.95

4. True positive rate is ~0.95

A) 1 and 3

B) 2 and 4

C) 1 and 4

D) 2 and 3

Solution: (C)

The accuracy (correct classification) is (50+100)/165, which is approximately 0.91.

The true positive rate is the fraction of actual positives that you predict correctly, so the
true positive rate is 100/105 ≈ 0.95. It is also known as “Sensitivity” or “Recall”.
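
The four candidate statements can be checked with a few lines of code. This is a sketch under an assumption: the confusion matrix image is not preserved here, so the false negative count (5) and false positive count (10) are derived from the totals 105 and 165 quoted in the solution.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return common rates computed from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "true_positive_rate": tp / (tp + fn),    # sensitivity / recall
        "false_positive_rate": fp / (fp + tn),
    }

# TP = 100 and TN = 50 come from the solution text; FN and FP are derived (assumption).
m = confusion_metrics(tp=100, tn=50, fp=10, fn=5)
for name, value in m.items():
    print(name, round(value, 2))
```

With these counts only statements 1 and 4 hold, which agrees with option C.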

37) For which of the following hyperparameters is a higher value better for the decision
tree algorithm?

1. Number of samples used for split

2. Depth of tree

3. Samples for leaf

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3

E) Can’t say

Solution: (E)

For all three hyperparameters (1, 2 and 3), increasing the value does not necessarily
improve performance. For example, if we have a very high value of
depth of tree, the resulting tree may overfit the data and would not generalize well. On the
other hand, if we have a very low value, the tree may underfit the data. So we can’t say for
sure that “higher is better”.

Context 38-39

Imagine you have a 28 * 28 image and you run a 3 * 3 convolution on
it with an input depth of 3 and an output depth of 8.

Note: Stride is 1 and you are using same padding.

38) What is the dimension of the output feature map when you are using the given
parameters?

A) 28 width, 28 height and 8 depth

B) 13 width, 13 height and 8 depth

C) 28 width, 13 height and 8 depth

D) 13 width, 28 height and 8 depth

Solution: (A)

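The formula for calculating the output size is

output size = (N – F + 2P)/S + 1

where N is the input size, F is the filter size, P is the padding added on each side and S is
the stride. With same padding and a stride of 1, P = (F – 1)/2 = 1, so the output size is
(28 – 3 + 2)/1 + 1 = 28.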

Read this article to get a better understanding.
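
The output-size calculation can be sketched in code. The helper name `conv_output_size` is hypothetical, and the formula with explicit padding P is the standard one rather than something stated in the quiz. The last call is only one setting that happens to give 13; the actual parameters of question 39 are not preserved in this text, so that call is an assumption.

```python
def conv_output_size(n, f, stride=1, padding=0):
    """Spatial output size of a convolution: floor((n - f + 2 * padding) / stride) + 1."""
    return (n - f + 2 * padding) // stride + 1

filter_size = 3
same_padding = (filter_size - 1) // 2   # 1 pixel on each side for a 3 x 3 filter

print(conv_output_size(28, filter_size, stride=1, padding=same_padding))  # 28
print(conv_output_size(28, filter_size, stride=1, padding=0))             # 26
print(conv_output_size(28, filter_size, stride=2, padding=0))             # 13 (assumed setting)
```

The output depth is set by the number of filters (8 here), independently of the spatial size.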

39) What are the dimensions of the output feature map when you are using the following
parameters?

A) 28 width, 28 height and 8 depth

B) 13 width, 13 height and 8 depth

C) 28 width, 13 height and 8 depth

D) 13 width, 28 height and 8 depth

Solution: (B)

Same as above

40) Suppose we were plotting the visualization for different values of C (penalty
parameter) in the SVM algorithm. For some reason, we forgot to tag the C values with the
visualizations. In that case, which of the following options best explains the C values
for the images below (1, 2, 3 left to right, so the C values are C1 for image 1, C2 for image 2
and C3 for image 3) in the case of an rbf kernel?

A) C1 = C2 = C3

B) C1 > C2 > C3

C) C1 < C2 < C3

D) None of these

Solution: (C)

MCQ questions for unit 4: Naïve Bayes and Support Vector Machine

1. How many terms are required for building a Bayes model?

a) 1
b) 2
c) 3
d) 4
Answer: c
Explanation: The three required terms are a conditional probability and two unconditional
probabilities.
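
As a hedged illustration of those three terms, Bayes' rule combines one conditional probability, P(B|A), with two unconditional ones, P(A) and P(B). The function name and the numbers below are made up for this sketch.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Hypothetical values: P(B|A) = 0.9, P(A) = 0.01, P(B) = 0.05
p_a_given_b = posterior(likelihood=0.9, prior=0.01, evidence=0.05)
print(round(p_a_given_b, 2))  # 0.18
```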
2. What is needed to make probabilistic systems feasible in the world?
a) Reliability
b) Crucial robustness
c) Feasibility
d) None of the mentioned
Answer: b
Explanation: Model-based knowledge provides the crucial robustness needed to
make probabilistic systems feasible in the real world.
3. Where can the Bayes rule be used?
a) Solving queries
b) Increasing complexity
c) Decreasing complexity
d) Answering probabilistic query
Answer: d
Explanation: The Bayes rule can be used to answer probabilistic queries conditioned on
one piece of evidence.
4. What does the Bayesian network provide?
a) Complete description of the domain
b) Partial description of the domain
c) Complete description of the problem
d) None of the mentioned
Answer: a
Explanation: A Bayesian network provides a complete description of the domain.
5. How can the entries in the full joint probability distribution be calculated?
a) Using variables
b) Using information
c) Both Using variables & information
d) None of the mentioned
Answer: b
Explanation: Every entry in the full joint probability distribution can be calculated from
the information in the network.
6. How can the Bayesian network be used to answer any query?
a) Full distribution
b) Joint distribution
c) Partial distribution
d) All of the mentioned
Answer: b
Explanation: If a Bayesian network is a representation of the joint distribution, then it can
answer any query by summing all the relevant joint entries.
7. How can the compactness of the Bayesian network be described?
a) Locally structured
b) Fully structured
c) Partial structure
d) All of the mentioned
Answer: a
Explanation: The compactness of the Bayesian network is an example of a very general
property of a locally structured system.
8. With what is the local structure associated?
a) Hybrid
b) Dependant
c) Linear
d) None of the mentioned
Answer: c
Explanation: Local structure is usually associated with linear rather than exponential
growth in complexity.
9. Which condition is used to influence a variable directly by all the others?
a) Partially connected
b) Fully connected
c) Local connected
d) None of the mentioned
Answer: b
Explanation: None.
10. What is the consequence between a node and its predecessors while creating a Bayesian
network?
a) Functionally dependent
b) Dependant
c) Conditionally independent
d) Both Conditionally dependant & Dependant
Answer: c
Explanation: The semantics used to derive a method for constructing Bayesian networks
led to the consequence that a node can be conditionally independent of its predecessors.

11. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

Generalisation error in statistics is generally the out-of-sample error, which is the measure
of how accurately a model can predict values for previously unseen data.

12. The minimum time complexity for training an SVM is O(n²). According to this fact, what
sizes of datasets are not best suited for SVMs?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

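Because training time grows at least quadratically with the number of samples, large datasets
are not well suited to SVMs. SVMs work best on datasets that have a clear classification boundary.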

13. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned
above in such a way that it maximises your efficiency, reduces error and overfitting.

14. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

15. Suppose you are using RBF kernel in SVM with high Gamma value. What does this
signify?

A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above

Solution: B

The gamma parameter in SVM tuning determines how far the influence of a single training point
reaches: with a high gamma only points close to the decision boundary matter, while with a
low gamma far-away points matter too.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
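
A minimal sketch of why gamma controls the reach of a point, assuming the usual RBF form k(x, z) = exp(-gamma * ||x - z||^2). The helper `rbf_kernel` and the sample points are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel value: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

origin = (0.0, 0.0)
near_point = (0.1, 0.0)
far_point = (2.0, 0.0)

for gamma in (0.1, 10.0):
    near = rbf_kernel(origin, near_point, gamma)
    far = rbf_kernel(origin, far_point, gamma)
    # Low gamma: the far point still has a sizeable kernel value.
    # High gamma: the far point's kernel value is practically zero.
    print(gamma, near, far)
```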

16. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a
low cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more
points correctly. It is also simply referred to as the cost of misclassification.

17. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVMs are highly versatile models that can be used for practically all real world problems
ranging from regression to clustering and handwriting recognition.

18. We usually use feature normalization before using the Gaussian kernel in SVM. What is
true about feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Sometimes, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B
Statements one and two are correct.

19. What is/are true about kernel in SVM?

1. A kernel function maps low dimensional data to a high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

1. Which of the following is a widely used and effective machine learning algorithm based on
the idea of bagging?
a. Decision Tree
b. Regression
c. Classification
d. Random Forest - answer
2. To find the minimum or the maximum of a function, we set the gradient to zero because:
a. The value of the gradient at extrema of a function is always zero - answer
b. Depends on the type of problem
c. Both A and B
d. None of the above
3. The most widely used metrics and tools to assess a classification model are:
a. Confusion matrix
b. Cost-sensitive accuracy
c. Area under the ROC curve
d. All of the above - answer
4. Which of the following is a good test dataset characteristic?
a. Large enough to yield meaningful results
b. Is representative of the dataset as a whole
c. Both A and B - answer
d. None of the above
5. Which of the following is a disadvantage of decision trees?
a. Factor analysis
b. Decision trees are robust to outliers
c. Decision trees are prone to be overfit - answer
d. None of the above
6. How do you handle missing or corrupted data in a dataset?
a. Drop missing rows or columns
b. Replace missing values with mean/median/mode
c. Assign a unique category to missing values
d. All of the above - answer
7. What is the purpose of performing cross-validation?
a. To assess the predictive performance of the models
b. To judge how the trained model performs outside the sample on test data
c. Both A and B - answer
8. Why is second order differencing in time series needed?
a. To remove stationarity
b. To find the maxima or minima at the local point
c. Both A and B - answer
d. None of the above
9. When performing regression or classification, which of the following is the correct way to
preprocess the data?
a. Normalize the data → PCA → training - answer
b. PCA → normalize PCA output → training
c. Normalize the data → PCA → normalize PCA output → training
d. None of the above
10. Which of the following is an example of feature extraction?
a. Constructing bag of words vector from an email
b. Applying PCA to project large high-dimensional data
c. Removing stopwords in a sentence
d. All of the above - answer
11. What is pca.components_ in Sklearn?
a. Set of all eigen vectors for the projection space - answer
b. Matrix of principal components
c. Result of the multiplication matrix
d. None of the above options
12. Which of the following is true about Naive Bayes ?
a. Assumes that all the features in a dataset are equally important
b. Assumes that all the features in a dataset are independent
c. Both A and B - answer
d. None of the above options
13. Which of the following statements about regularization is not correct?
a. Using too large a value of lambda can cause your hypothesis to underfit the data.
b. Using too large a value of lambda can cause your hypothesis to overfit the data.
c. Using a very large value of lambda cannot hurt the performance of your hypothesis.
d. None of the above - answer
14. How can you prevent a clustering algorithm from getting stuck in bad local optima?
a. Set the same seed value for each run
b. Use multiple random initializations - answer
c. Both A and B
d. None of the above
15. Which of the following techniques can be used for normalization in text mining?
a. Stemming
b. Lemmatization
c. Stop Word Removal
d. Both A and B - answer
16. In which of the following cases will K-means clustering fail to give good results? 1) Data
points with outliers 2) Data points with different densities 3) Data points with nonconvex
shapes
a. 1 and 2
b. 2 and 3
c. 1, 2, and 3 - answer
d. 1 and 3
17. Which of the following is a reasonable way to select the number of principal components
"k"?
a. Choose k to be the smallest value so that at least 99% of the variance is retained. - answer
b. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
c. Choose k to be the largest value so that 99% of the variance is retained.
d. Use the elbow method
18. You run gradient descent for 15 iterations with α=0.3 and compute J(theta) after each
iteration. You find that the value of J(theta) decreases quickly and then levels off. Based on
this, which of the following conclusions seems most plausible?
a. Rather than using the current value of α, use a larger value of α (say α=1.0)
b. Rather than using the current value of α, use a smaller value of α (say α=0.1)
c. α=0.3 is an effective choice of learning rate - answer
d. None of the above
19. What is a sentence parser typically used for?
a. It is used to parse sentences to check if they are utf-8 compliant.
b. It is used to parse sentences to derive their most likely syntax tree structures. - answer
c. It is used to parse sentences to assign POS tags to all tokens.
d. It is used to check if sentences can be parsed into meaningful tokens.
20. Suppose you have trained a logistic regression classifier and it outputs, on a new example x,
a prediction hθ(x) = 0.2. This means
a. Our estimate for P(y=1 | x)
b. Our estimate for P(y=0 | x) - answer
c. Our estimate for P(y=1 | x)
d. Our estimate for P(y=0 | x)

1) If you remove any one of the red points from the data, will the
decision boundary change?
A) Yes
B) No
Solution: A
These three examples are positioned such that removing any one of them introduces slack
in the constraints. So the decision boundary would completely change.

21. [True or False] If you remove the non-red circled points from the data, the decision
boundary will change?
A) True
B) False
Solution: B
On the other hand, the rest of the points in the data won’t affect the decision boundary much.

22. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors
B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM
Solution: B
Generalisation error in statistics is generally the out-of-sample error which is the measure
of how accurately a model can predict values for previously unseen data.

23. When the C parameter is set to infinite, which of the following holds true?
A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above
Solution: A
At such a high level of misclassification penalty, soft margin will not hold existence as
there will be no room for error.

24. What do you mean by a hard margin?

A) The SVM allows very low error in classification
B) The SVM allows high amount of error in classification
C) None of the above
Solution: A
A hard margin means that an SVM is very rigid in classification and tries to work
extremely well in the training set, causing overfitting.

25. The minimum time complexity for training an SVM is O(n²). According to this fact, what
sizes of datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Solution: A
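Because training time grows at least quadratically with the number of samples, large datasets
are not well suited to SVMs. SVMs work best on datasets that have a clear classification boundary.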

26. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Solution: D
The SVM effectiveness depends upon how you choose the basic 3 requirements
mentioned above in such a way that it maximises your efficiency, reduces error and
overfitting.
27. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Solution: A
They are the points closest to the hyperplane and the hardest ones to classify. They also
have a direct bearing on the location of the decision surface.

28. SVMs are less effective when:

A) The data is linearly separable
B) The data is clean and ready to use
C) The data is noisy and contains overlapping points
Solution: C
When the data has noise and overlapping points, there is a problem in drawing a clear
hyperplane without misclassifying.

29. Suppose you are using RBF kernel in SVM with high Gamma value. What does this
signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for
modeling
D) None of the above
Solution: B
The gamma parameter in SVM tuning determines how far the influence of a single training
point reaches: with a high gamma only points close to the decision boundary matter, while
with a low gamma far-away points matter too.

For a low gamma, the model will be too constrained and include all points of the training
dataset, without really capturing the shape.
For a higher gamma, the model will capture the shape of the dataset well.

30. The cost parameter in the SVM means:

A) The number of cross-validations to be made
B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above
Solution: C
The cost parameter decides how much an SVM should be allowed to “bend” with the
data. For a low cost, you aim for a smooth decision surface and for a higher cost, you aim
to classify more points correctly. It is also simply referred to as the cost of
misclassification.

31. Suppose you are building an SVM model on data X. The data X can be error prone,
which means that you should not trust any specific data point too much. Now suppose
you want to build an SVM model which has a quadratic kernel function of polynomial
degree 2 that uses the slack variable C as one of its hyperparameters. Based upon that, give
the answer for the following question.
What would happen when you use a very large value of C (C->infinity)?
Note: For small C, the model was also classifying all data points correctly.

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these
Solution: A
For large values of C, the penalty for misclassifying points is very high, so the decision
boundary will perfectly separate the data if possible.

32. What would happen when you use very small C (C~0)?
A) Misclassification would happen
B) Data will be correctly classified
C) Can’t say
D) None of these
Solution: A
The classifier can maximize the margin between most of the points, while misclassifying
a few points, because the penalty is so low.

33. If I am using all features of my dataset and I achieve 100% accuracy on my training set,
but ~70% on validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Solution: C
If we’re achieving 100% training accuracy very easily, we need to check to verify if
we’re overfitting our data.

34. Which of the following are real world applications of the SVM?
A) Text and Hypertext Categorization
B) Image Classification
C) Clustering of News Articles
D) All of the above
Solution: D
SVMs are highly versatile models that can be used for practically all real world problems
ranging from regression to clustering and handwriting recognition.

Question Context: 35 – 37
Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you
correctly infer that your SVM model is underfitting.
35. Which of the following options would you be more likely to consider when iterating on the
SVM next time?
A) You want to increase your data points
B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features
Solution: C
The best option here would be to create more features for the model.

36. Suppose you gave the correct answer in previous question. What do you think that is
actually happening?
1. We are lowering the bias
2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4
Solution: C
A better-fitting model will lower the bias and increase the variance.

37. In the above question, suppose you want to change one of the SVM’s hyperparameters so that
the effect would be the same as in the previous questions, i.e. the model will not underfit. What would you do?
A) We will increase the parameter C
B) We will decrease the parameter C
C) Changing C has no effect
D) None of these
Solution: A
Increasing the C parameter would be the right thing to do here, as a larger C weakens the
regularization and lets the model fit the training data more closely.

38. We usually use feature normalization before using the Gaussian kernel in SVM. What is
true about feature normalization?
1. We do feature normalization so that new feature will dominate other
2. Sometimes, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Solution: B
Statements one and two are correct.

Question Context: 39-41

Suppose you are dealing with a 4-class classification problem and you want to train an SVM
model on the data. For that you are using the One-vs-all method. Now answer the questions
below.
39. How many times do we need to train our SVM model in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: D
For a 4 class problem, you would have to train the SVM at least 4 times if you are using a
one-vs-all method.

40. Suppose you have the same distribution of classes in the data. Now, say training 1 time in
the one-vs-all setting takes the SVM 10 seconds. How many seconds would it require to
train the one-vs-all method end to end?
A) 20
B) 40
C) 60
D) 80
Solution: B
It would take 10×4 = 40 seconds
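
The counts in questions 39 and 40 can be checked with a small sketch. It trains no real SVM; it only builds the one-vs-all relabelling. The label list is hypothetical, and the 10-second figure is the one given in question 40.

```python
# One-vs-all: one binary problem per class (that class vs. all the others).
labels = ["a", "b", "c", "d", "a", "c", "b", "d"]  # hypothetical 4-class labels
classes = sorted(set(labels))

# Each class yields one binary relabelling: 1 for the class, 0 for the rest.
binary_problems = {c: [1 if y == c else 0 for y in labels] for c in classes}

seconds_per_model = 10  # figure taken from question 40
n_models = len(binary_problems)
total_seconds = n_models * seconds_per_model
print(n_models, total_seconds)
```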

41. Suppose your problem has changed and the data now has only 2 classes. How many times
do you think we need to train the SVM in such a case?
A) 1
B) 2
C) 3
D) 4
Solution: A
With only 2 classes, a single binary SVM is enough, so training once gives appropriate results.

1. Support Vector Machine works well with,
a) Linear Scenarios
b) Non-linear Scenarios
c) Both of these
d) None of these

Answer: c) Both of these

2. Which of the following is best for MNIST dataset classification,

a) Naïve Bayes
b) Support Vector Machines
c) Random forest
d) Decision tree

Answer: b) Support Vector Machines

3. Two classes are separated by a margin with two boundaries. The elements that lie on those boundaries are called,
a) Linear Vectors
b) Support Vectors
c) Test Vectors
d) None of these

Answer: b) Support Vectors

4. Scikit-learn supports which kernels,

a) Polynomial kernels
b) Sigmoid kernels
c) Custom kernels
d) All of these

Answer: d) All of these

5. Which of the following is the default kernel used in SVM,

a) Polynomial kernel
b) Sigmoid kernel
c) Custom kernel
d) Radial Basis Function

Answer: d) Radial Basis Function

6. The gamma parameter in RBF determines,

a) Amplitude of the function
b) Altitude of the function
c) Complexity of the function
d) None of these

Answer: a) Amplitude of the function

7. Scikit-learn allows us to create which kernel as a normal python function,
a) Polynomial kernel
b) Custom kernel
c) Sigmoid kernel
d) All of these

Answer: b) Custom kernel

8. To find a trade-off between precision and the number of support vectors, scikit-learn provides
an implementation called,
a) NuSVC
b) BuSVC
c) MuSVC
d) AuSVC

Answer: a) NuSVC

9. The RBF kernel is based on the function:

c)
d) None of these

Answer: a)

10. The polynomial kernel is based on the function:

c)
d) None of these

Answer: b)
11. The sigmoid kernel is based on this function:

c)
d) None of these

Answer: c)
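
For reference, the three kernels named in questions 9 to 11 are usually written as K(x, y) = exp(-gamma * ||x - y||^2) for RBF, K(x, y) = (gamma * <x, y> + r)^d for polynomial, and K(x, y) = tanh(gamma * <x, y> + r) for sigmoid. The sketch below implements these standard forms in plain Python. The parameter names gamma, r and d are assumptions chosen to mirror scikit-learn's gamma, coef0 and degree; they are not taken from the quiz.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x, y, gamma=1.0, r=0.0, d=3):
    """(gamma * <x, y> + r) raised to the power d."""
    return (gamma * dot(x, y) + r) ** d

def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    """tanh(gamma * <x, y> + r)."""
    return math.tanh(gamma * dot(x, y) + r)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))                         # identical points
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0], gamma=1.0, r=1.0, d=2))
```

With the RBF form, a larger gamma makes the kernel value fall off faster with distance.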

12. What is/are true about kernel in SVM,

1. It maps low dimensional data to high dimensional data.
2. It is a similarity function.

a) 1
b) 2
c) Both 1 and 2
d) None of these

Answer: c) Both 1 and 2

13. Which type of classifier is SVM,

a) Discriminative
b) Generative
c) Both
d) None of these

Answer: a) Discriminative

14. SVM is used to solve which type of problems,

a) Classification
b) Regression
c) Clustering
d) Both Classification and Regression

Answer: d) Both Classification and Regression

15. SVM is which type of learning algorithm,

a) Supervised
b) Unsupervised
c) Both
d) None of these
Answer: a) Supervised

16. The goal of SVM is to,

a) Find the optimal separating hyperplane which minimizes the margin of training data.
b) Find the optimal separating hyperplane which maximizes the margin of training data.
c) Both
d) None of these

Answer: b) Find the optimal separating hyperplane which maximizes the margin of training data.

17. The equation for hyperplane is,

c)
d) None of these

Answer: a)
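
As a reminder of the standard form, a separating hyperplane is commonly written as w . x + b = 0, and a point is classified by the sign of w . x + b. The sketch below uses a hypothetical weight vector and bias. The margin width 2 / ||w|| assumes the usual canonical scaling in which the support vectors satisfy |w . x + b| = 1.

```python
import math

w = [3.0, 4.0]   # hypothetical weight vector
b = -5.0         # hypothetical bias term

def decision(x):
    """Value of w . x + b; zero exactly on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x):
    """Class +1 on or above the hyperplane, -1 below it."""
    return 1 if decision(x) >= 0 else -1

# Margin width under the canonical scaling |w . x + b| = 1 at the support vectors.
margin_width = 2.0 / math.sqrt(sum(wi * wi for wi in w))

print(classify([3.0, 4.0]), classify([0.0, 0.0]), margin_width)
```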

18. What is a kernel in SVM?

a) SVM algorithms use a set of mathematical functions that are defined as the kernel
b) SVM algorithms use a set of logarithmic functions that are defined as the kernel
c) SVM algorithms use a set of exponential functions that are defined as the kernel
d) SVM algorithms use a set of algebraic functions that are defined as the kernel

Answer: a) SVM algorithms use a set of mathematical functions that are defined as the kernel

19. Which of the following is false,

a) SVMs are very good when we have no idea about the data.
b) It works well with unstructured and semi structured data.
c) The kernel trick is real strength of SVM.
d) It scales relatively well to low dimensional data.

Answer: d) It scales relatively well to low dimensional data.

20. Which of the following is false,

a) SVM algorithm is suitable for large data sets.
b) It does not perform well when the data has more noise.
c) SVM algorithm is not suitable for large data sets.
d) None of these

Answer: a) SVM algorithm is suitable for large data sets.

1. The Naive Bayes Classifier is a _____ in probability.
A. Technique.
B. Process.
C. Classification.
D. None of these answers are correct.
ANSWER: D

2. How many terms are required for building a Bayes model?

A. 1
B. 2
C. 3
D. 4
ANSWER: C
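
One common reading of the three terms is one conditional probability and two unconditional probabilities, as in Bayes' theorem P(A | B) = P(B | A) * P(A) / P(B). The sketch below applies it to a spam example. All three input probabilities are made-up numbers for illustration.

```python
def posterior(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_word_given_spam = 0.6   # hypothetical conditional probability
p_spam = 0.2              # hypothetical prior (unconditional)
p_word = 0.15             # hypothetical evidence (unconditional)

p_spam_given_word = posterior(p_word_given_spam, p_spam, p_word)
print(round(p_spam_given_word, 3))
```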

3. Where can the Bayes rule be used?

A. Solving queries
B. Increasing complexity
C. Decreasing complexity
D. Answering probabilistic query
ANSWER: D

4. _____ is the mathematical likelihood that something will occur.

A. Classification
B. Probability
C. Naive Bayes Classifier
D. None
ANSWER: B

5. The ______________ distribution is a binary distribution, useful when a feature can be present or absent.

A. Bernoulli
B. multinomial
C. Gaussian
D. None
ANSWER: A

6. Naïve Bayes Algorithm is a ________ learning algorithm.

A. Supervised
B. Reinforcement
C. Unsupervised
D. None of these
ANSWER: A

7. Examples of Naïve Bayes Algorithm is/are

A. Spam filtration
B. Sentimental analysis
C. Classifying articles
D. All of the above
ANSWER: D

8. What is needed to make probabilistic systems feasible in the world?
A. Feasibility
B. Reliability
C. Crucial robustness
D. None of the above
ANSWER: C

9. Probability provides a way of summarizing the ______ that comes from our laziness and
ignorance.
A. Belief
B. Uncertainty
C. Joint probability distributions
D. Randomness
ANSWER: B

10. The entries in the full joint probability distribution can be calculated as
A. Using variables
B. Both Using variables & information
C. Using information
D. All of the above
ANSWER: C

11. Which of the following is correct about the Naive Bayes?

A. Assumes that all the features in a dataset are independent
B. Assumes that all the features in a dataset are equally important
C. None
D. All of the above
ANSWER: D

12. Naïve Bayes algorithm is based on _______ and used for solving classification problems.
A. Bayes Theorem
B. Candidate elimination algorithm
C. EM algorithm
D. None of the above
ANSWER: A

13. Types of Naïve Bayes Model:

A. Bernoulli
B. multinomial
C. Gaussian
D. All of above
ANSWER: D

14. Disadvantages of Naïve Bayes Classifier

A. Naive Bayes assumes that all features are independent or unrelated, so it cannot learn
the relationship between features.
B. It performs well in Multi-class predictions as compared to the other Algorithms.
C. Naïve Bayes is one of the fast and easy ML algorithms to predict a class of datasets.
D. It is the most popular choice for text classification problems.
ANSWER: A

15. The benefit of Naïve Bayes
A. Naïve Bayes is one of the fast and easy ML algorithms to predict a class of datasets.
B. It is the most popular choice for text classification problems.
C. It can be used for Binary as well as Multi-class Classifications.
D. All of the above
ANSWER: D

16. How can SVM be classified

A. It is a model trained using unsupervised learning. It can be used for classification
and regression.
B. It is a model trained using unsupervised learning. It can be used for classification
but not for regression.
C. It is a model trained using supervised learning. It can be used for classification
and regression.
D. It is a model trained using supervised learning. It can be used for classification
but not for regression.
ANSWER: C

17. What do you mean by a hard margin

A. The SVM allows very low error in classification
B. The SVM allows high amount of error in classification
C. None of the above
D. All of above
ANSWER: A

18. The effectiveness of an SVM depends upon:

A. Selection of Kernel
B. Kernel Parameters
C. Soft Margin Parameter C
D. All of the above
ANSWER: D

19. Support vectors are the data points that lie closest to the decision surface.
A. TRUE
B. FALSE
ANSWER: A

20. The SVM’s are less effective when:

A. The data is linearly separable
B. The data is clean and ready to use
C. The data is noisy and contains overlapping points
ANSWER: C

21. The cost parameter in the SVM means:

A. The number of cross-validations to be made
B. The kernel to be used
C. The tradeoff between misclassification and simplicity of the model
D. None of the above
ANSWER: C

22. Which of the following are real world applications of the SVM?
A. Text and Hypertext Categorization
B. Image Classification
C. Clustering of News Articles
D. All of the above
ANSWER:D

23. _________ naive Bayes is useful when working with continuous values whose probabilities
can be modeled using a Gaussian distribution
A. Bernoulli
B. multinomial
C. Gaussian
D. All of above
ANSWER: C

24. A _________ distribution is useful to model feature vectors where each value
represents the number of occurrences of a term or its relative frequency
A. Bernoulli
B. multinomial
C. Gaussian
D. All of above
ANSWER: B

25. Gaussian naive Bayes is limited due to

A. Mean and variance
B. Mean and Median
C. Median and covariance
D. Mean and standard deviation
ANSWER:A

26. The two classes are normally separated by a margin with two boundaries where a few
elements lie. Those elements are called
A. principal components
B. support vectors
C. factors
D. None
ANSWER: B

27. What is/are true about kernel in SVM? 1. Kernel function map low dimensional data to
high dimensional space. 2.It’s a similarity function
A. 1
B. 2
C. 1 and 2
D. None of these
ANSWER: C
28. Support vector machine (SVM) is a _________ classifier
A. Discriminative
B. Generative
ANSWER: A

29. SVM is termed as ________ classifier

A. Maximum margin
B. Minimum margin
ANSWER:A

30. The training examples closest to the separating hyperplane are called as _______
A. Training vector
B. Testing Vector
C. Support margin
D. Support vector
ANSWER:D

31. Which of the following is a type of SVM?

A. Maximum margin classifier
B. Soft margin classifier
C. Support vector regression
D. All of the above
ANSWER: D

32. The goal of the SVM is to __________

A. Find the optimal separating hyperplane which minimizes the margin of training data
B. Find the optimal separating hyperplane which maximizes the margin of training data
ANSWER:B

33. When using R, which of the following package is used for SVM?
A. b1072
B. c1071
C. d2012
D. e1071
ANSWER:D

34. What are the different kernel functions in SVM?

A. Linear Kernel
B. Polynomial kernel
C. Radial basis kernel
D. Sigmoid kernel
E. All of the above
ANSWER:E

35. Which of the following might be valid reasons for preferring an SVM over a neural
network?
A. An SVM can automatically learn to apply a non-linear transformation on the input space;
a neural net cannot.
B. An SVM can effectively map the data to an infinite-dimensional space; a neural net
cannot.
C. An SVM should not get stuck in local minima, unlike a neural net.
D. The transformed (basis function) representation constructed by an SVM is usually
easier to visualise/interpret than for a neural net.
ANSWER: B,C

36. You are given a labeled binary classification data set with N data points and D features.
Suppose that N < D. In training an SVM on this data set, which of the following kernels
is likely to be most appropriate?
A. Linear kernel
B. Quadratic kernel
C. Higher-order polynomial kernel
D. RBF kernel
ANSWER: A
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.

Ans: Solution A

2. What is regression?
a) When the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) When the output variable is a real value, such as “dollars” or “weight”.

Ans: Solution B

3. What is supervised learning?

a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution D
6. What is Reinforcement learning?
a) All data is unlabelled and the algorithms learn the inherent structure from the input data
b) All data is labelled and the algorithms learn to predict the output from the input data
c) It is a framework for learning where an agent interacts with an environment and receives
a reward for each interaction
d) Some data is labelled but most of it is unlabelled and a mixture of supervised and
unsupervised techniques can be used.

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution C

12. Supervised learning differs from unsupervised clustering in that supervised learning requires
a) at least one input attribute.
b) input attributes to be categorical.
c) at least one output attribute.
d) output attributes to be categorical.

Ans : Solution C

13. A regression model in which more than one independent variable is used to predict the
dependent variable is called
a) a simple linear regression model
b) a multiple regression model
c) an independent model
d) none of the above

Ans : Solution B

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C
15. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will
a) increase by 3 units
b) decrease by 3 units
c) increase by 4 units
d) decrease by 4 units

Ans : Solution A

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

17. A measure of goodness of fit for the estimated regression equation is the
a) multiple coefficient of determination
b) mean square due to error
c) mean square due to regression
d) none of the above

Ans : Solution A

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
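
The arithmetic behind questions 19 and 20 can be sketched as follows, using the identity SST = SSR + SSE, so that the multiple coefficient of determination is R^2 = SSR / SST = 1 - SSE / SST. SST and SSE are the values from question 20. The sample size n and predictor count p used for the adjusted R^2 (question 18) are hypothetical, since the quiz does not give them.

```python
sst = 200.0   # total sum of squares, from question 20
sse = 50.0    # sum of squares due to error, from question 20
ssr = sst - sse                      # sum of squares due to regression

r_squared = ssr / sst                # equivalently 1 - sse / sst

# Adjusted R^2 penalizes the number of independent variables in the model.
n = 21        # hypothetical number of observations
p = 2         # hypothetical number of independent variables
adjusted_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)

print(r_squared, round(adjusted_r_squared, 4))
```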

21. A nearest neighbor approach is best used

a) with large-sized datasets.
b) when irrelevant attributes have been removed from the data.
c) when a generalized model of the data is desirable.
d) when an explanation of what has been found is of primary importance.

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

a) classification problems require the output attribute to be numeric.
b) classification problems require the output attribute to be categorical.
c) classification problems do not allow an output attribute.
d) classification problems are designed to predict future outcome.

Ans : Solution B

24. Which statement is true about prediction problems?

a) The output attribute must be categorical.
b) The output attribute must be numeric.
c) The resultant model is designed to determine future outcomes.
d) The resultant model is designed to classify current behavior.

Ans : Solution C

25. Which statement about outliers is true?

a) Outliers should be identified and removed from a dataset.
b) Outliers should be part of the training dataset but should not be present in the test
data.
c) Outliers should be part of the test dataset but should not be present in the training
data.
d) The nature of the problem determines how outliers are used.
Ans : Solution D

26. Which statement is true about neural network and linear regression models?
a) Both models require input attributes to be numeric.
b) Both models require numeric attributes to range between 0 and 1.
c) The output of both models is a categorical attribute value.
d) Both techniques build models whose output is determined by a linear sum of weighted
input attribute values.

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

a) detect outliers
b) determine a best set of input attributes for supervised learning
c) evaluate the likely performance of a supervised learner model
d) determine if meaningful relationships can be found in a dataset

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B

30. The standard error is defined as the square root of this computation.
a) The sample variance divided by the total number of sample instances.
b) The population variance divided by the total number of sample instances.
c) The sample variance divided by the sample mean.
d) The population variance divided by the sample mean.

Ans : Solution A
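
The definition in question 30 can be sketched with the standard library. The sample values are hypothetical. `statistics.variance` computes the sample variance (n - 1 in the denominator).

```python
import math
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample

sample_variance = statistics.variance(sample)          # uses n - 1
standard_error = math.sqrt(sample_variance / len(sample))

print(round(sample_variance, 4), round(standard_error, 4))
```
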
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

a) choose the same training instance several times.
b) choose the same test set instance several times.
c) build models with alternative subsets of the training data several times.
d) test a model with alternative subsets of the test data several times.

Ans : Solution A
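
A minimal sketch of bootstrap sampling follows: instances are drawn with replacement, so the same training instance may be chosen several times. The ten-element training set and the seed are placeholders.

```python
import random

rng = random.Random(0)              # fixed seed so the run is repeatable
training = list(range(10))          # hypothetical training instance ids

# Draw as many instances as the training set holds, with replacement.
bootstrap_sample = rng.choices(training, k=len(training))

# Instances never drawn can serve as a hold-out ("out-of-bag") set.
out_of_bag = [i for i in training if i not in bootstrap_sample]

print(bootstrap_sample, out_of_bag)
```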

33. The correlation between the number of years an employee has worked for a company and the
salary of the employee is 0.75. What can be said about employee salary and years worked?
a) There is no relationship between salary and years worked.
b) Individuals that have worked for the company the longest have higher salaries.
c) Individuals that have worked for the company the longest have lower salaries.
d) The majority of employees have been with the company a long time.
e) The majority of employees have been with the company a short period of time.

Ans : Solution B

34. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?
a) The attributes are not linearly related.
b) As the value of one attribute increases the value of the second attribute also increases.
c) As the value of one attribute decreases the value of the second attribute increases.
d) The attributes show a curvilinear relationship.

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
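
The error measures named in questions 28 and 35 can be compared side by side. The actual and predicted values below are hypothetical.

```python
import math

actual = [3.0, 5.0, 2.0, 7.0]       # hypothetical desired outputs
predicted = [2.0, 5.0, 4.0, 4.0]    # hypothetical model outputs

errors = [a - p for a, p in zip(actual, predicted)]

mae = sum(abs(e) for e in errors) / len(errors)    # mean absolute error
mse = sum(e * e for e in errors) / len(errors)     # mean squared error
rmse = math.sqrt(mse)                              # root mean squared error

print(mae, mse, round(rmse, 4))
```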

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

41. This supervised learning technique can process both numeric and categorical input attributes.
a) linear regression
b) Bayes classifier
c) logistic regression
d) backpropagation learning
Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution D

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

44. This clustering algorithm initially assumes that each data instance represents a single cluster.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization

Ans : Solution A

45. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
a) agglomerative clustering
b) conceptual clustering
c) K-Means clustering
d) expectation maximization

Ans : Solution C
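
The stopping rule described in question 45 can be sketched with a one-dimensional K-Means loop. The data points and starting means are hypothetical, and the function name `kmeans_1d` is made up for this illustration.

```python
def kmeans_1d(points, means):
    """Lloyd-style K-Means in 1-D; stops when the means stop changing."""
    iterations = 0
    while True:
        iterations += 1
        # Assignment step: each point joins the cluster of its nearest mean.
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[nearest].append(p)
        # Update step: recompute each mean (keep the old one if a cluster is empty).
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        # Termination: means identical to the previous iteration.
        if new_means == means:
            return means, iterations
        means = new_means

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]        # hypothetical data
final_means, n_iter = kmeans_1d(points, [1.0, 2.0])
print(final_means, n_iter)
```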

46. Machine learning techniques differ from statistical techniques in that machine learning methods
a) typically assume an underlying distribution for the data.
b) are better able to deal with missing and noisy data.
c) are not able to explain their behavior.
d) have trouble with large-sized datasets.

Ans : Solution B
UNIT –II

1. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Ans Solution: (B)
With a small training dataset, it’s easier to find a hypothesis that fits the training data exactly, i.e.
overfitting.

2. What is pca.components_ in Sklearn?

A. Set of all eigenvectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
Ans A

3. Which of the following techniques would perform better for reducing dimensions of a data
set?
A. Removing columns which have too many missing values
B. Removing columns which have high variance in data
C. Removing columns with dissimilar data trends
D. None of these
Ans Solution: (A)
If a column has too many missing values (say 99%), then we can remove such a column.

4. It is not necessary to have a target variable for applying dimensionality reduction
algorithms.
A. TRUE
B. FALSE
Ans Solution: (A)
Most dimensionality reduction algorithms, such as PCA, are unsupervised and need no target
variable. LDA is an example of a supervised dimensionality reduction algorithm, which does use one.

5. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
Ans Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

6. The most popularly used dimensionality reduction algorithm is Principal Component Analysis
(PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of the above

Ans D

7. PCA works better if:

1. There is a linear structure in the data
2. The data lies on a curved surface and not on a flat surface
3. The variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

8. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in data
4. The features may not carry all information present in data
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
When you get the features in lower dimensions, you will most of the time lose some information
from the data, and you won’t be able to interpret the lower-dimension data.

9. Which of the following option(s) is / are true?

1. You need to initialize parameters in PCA
2. You don’t need to initialize parameters in PCA
3. PCA can be trapped into local minima problem
4. PCA can’t be trapped into local minima problem
A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Ans Solution: (D)
PCA is a deterministic algorithm which doesn’t have parameters to initialize, and it doesn’t have
the local minima problem that most machine learning algorithms have.

10. Which of the following statements is true about t-SNE in comparison to PCA?
A. When the data is huge (in size), t-SNE may fail to produce better results.
B. t-SNE always produces better results regardless of the size of the data
C. PCA always performs better than t-SNE for smaller size data.
D. None of these
Ans Solution: (A)
Option A is correct

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

12. A feature F1 can take certain values: A, B, C, D, E, & F and represents the grades of students from
a college.
Which of the following statements is true in this case?
A) Feature F1 is an example of nominal variable.
B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above categories.
D) Both of these
Solution: (B)
Ordinal variables are variables which have some order in their categories. For example, grade
A should be considered a higher grade than grade B.

13. Which of the following is an example of a deterministic algorithm?

A) PCA
B) K-Means
C) None of the above
Solution: (A)
A deterministic algorithm is one whose output does not change between runs. PCA would
give the same result if we ran it again, but k-means would not.
UNIT –III

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

2. Choose which of the following options is true regarding One-Vs-All method in Logistic
Regression.
A) We need to fit n models in n-class classification problem
B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Ans Solution: A

3. Suppose you applied a Logistic Regression model to some given data and got a training accuracy
X and testing accuracy Y. Now, you want to add a few new features to the same data. Select the
option(s) which is/are correct in such a case.
Note: Consider the remaining parameters to be the same.
A) Training accuracy increases
B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same
Ans Solution: A and D
Adding more features to the model will increase the training accuracy because the model has
more data to consider when fitting the logistic regression. Testing accuracy, however, increases
only if the added feature is found to be significant.

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute penalty; as the penalty in lasso is increased, some of the
coefficients of the variables may become zero.
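
Why lasso can select variables while ridge does not can be sketched in closed form. The sketch assumes an orthonormal design matrix, a simplification made here only so that the solutions have a simple form. Under that assumption, lasso soft-thresholds each least-squares coefficient and ridge rescales it. The coefficient values and the penalty `lam` are hypothetical.

```python
def soft_threshold(beta, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0   # small coefficients are set exactly to zero

ols = [3.0, -0.5, 0.2, -2.0]   # hypothetical least-squares coefficients
lam = 1.0                      # hypothetical penalty strength

lasso = [soft_threshold(b, lam) for b in ols]
ridge = [b / (1.0 + lam) for b in ols]   # shrinks, but never reaches zero

print(lasso)
print(ridge)
```

Under these assumptions the two small coefficients drop out of the lasso model entirely, whereas ridge keeps all four at reduced size.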

5. Which of the following statement is true about outliers in Linear regression?

A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Ans Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.

6. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Ans Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.
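
A minimal sketch of a least-squares fit for a single input variable follows, using the closed-form slope and intercept. The data points are hypothetical. The residuals it prints are the error values referred to in the question on residuals.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]     # hypothetical inputs
ys = [2.0, 4.1, 5.9, 8.2, 9.8]     # hypothetical outputs

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Closed-form least-squares estimates for y = intercept + slope * x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sum_squared_error = sum(r * r for r in residuals)

print(round(slope, 4), round(intercept, 4), round(sum_squared_error, 4))
```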

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

8. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?

A) Since there is a relationship, our model is not good
B) Since there is a relationship, our model is good
C) Can’t say
D) None of these
Ans Solution: (A)
There should not be any relationship between predicted values and residuals. If there exists any
relationship between them, it means that the model has not perfectly captured the information
in the data.

9. Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Ans Solution: (B)
If the penalty is very large it means model is less complex, therefore the bias would be high.

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not so
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not so
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

11. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
a prediction hθ(x) = 0.2. This means
A) Our estimate for P(y=1 | x)
B) Our estimate for P(y=0 | x)
C) Our estimate for P(y=1 | x)
D) Our estimate for P(y=0 | x)
Ans Solution: B
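
In logistic regression the hypothesis output is read as the estimated probability of the positive class, so a prediction of 0.2 means P(y=1 | x) = 0.2 and P(y=0 | x) = 0.8. The sketch below reproduces that with the logistic (sigmoid) function. The linear score `z` is a hypothetical value chosen so that the output is 0.2.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

z = math.log(0.2 / 0.8)   # hypothetical linear score (the log-odds of 0.2)

p_y1 = sigmoid(z)         # estimate for P(y=1 | x)
p_y0 = 1.0 - p_y1         # estimate for P(y=0 | x)

print(round(p_y1, 3), round(p_y0, 3))
```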

12. True-False: Linear Regression is a supervised machine learning algorithm.

A) TRUE
B) FALSE
Solution: (A)
Yes, Linear regression is a supervised learning algorithm because it uses true labels for training.
Supervised learning algorithm should have input variable (x) and an output variable (Y) for each
example.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.
14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

15. Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line
of best fit.

16. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression gives continuous values as output, in such a case we use the mean squared
error metric to evaluate the model performance. The remaining options are used in the case of a
classification problem.

17. True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In lasso regression we apply an absolute-value (L1) penalty, which makes some of the
coefficients zero.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

19. Suppose that we have N independent variables (X1, X2, … XN) and the dependent variable is Y.
Now imagine that you are applying linear regression by fitting the best fit line using least square
error on this data.
You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1 and
Y.

20. Suppose you are given two variables V1 and V2 with the following two characteristics.
1. If V1 increases then V2 also increases
2. If V1 decreases then V2 behavior is unknown
Looking at the above two characteristics, which of the following options is correct for the
Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these

Solution: (D)
We cannot comment on the correlation coefficient by using only statement 1. We need to
consider both statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.

21. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that they
don’t move together. We can take examples like y=|x| or y=x^2.
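The two examples mentioned above can be checked numerically. This is a small sketch that assumes x is sampled symmetrically around zero; with an asymmetric sample the correlation would generally not be zero.

```python
import numpy as np

# x values placed symmetrically around zero (an assumption of this example).
x = np.linspace(-3, 3, 61)

# Pearson correlation between x and two functions that depend on x exactly.
r_abs = np.corrcoef(x, np.abs(x))[0, 1]
r_sq = np.corrcoef(x, x ** 2)[0, 1]

print(r_abs, r_sq)  # both are zero up to floating-point rounding
```

Both targets are fully determined by x, yet the linear correlation vanishes.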

22. True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly i.e.
overfitting.

23. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. There is no need to iterate

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
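A minimal sketch of the Normal Equation, theta = (X^T X)^(-1) X^T y, on synthetic noiseless data. The variable names and the data are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 examples, an intercept column plus 2 random features (synthetic data).
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
true_theta = np.array([1.0, 2.0, -3.0])
y = X @ true_theta  # noiseless, so the coefficients can be recovered exactly

# Normal Equation: solve (X^T X) theta = X^T y in one step.
# No learning rate and no iterations are needed, but solving this d x d system
# costs roughly O(d^3), which is why it slows down when d (features) is large.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)
```

Using `np.linalg.solve` instead of an explicit matrix inverse is a common numerical choice; the result is the same in exact arithmetic.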

Question Context 24-26:

Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x.
24. Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
Solution: (B)
If the penalty is very large, the model becomes less complex, so the bias would be high.

25. What will happen when you apply a very large penalty?
A) Some of the coefficients will become exactly zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In lasso some of the coefficient values become zero, but in the case of Ridge the coefficients
become close to zero but not zero.

26. What will happen when you apply a very large penalty in the case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will become
zero.
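The contrast between Ridge and Lasso in questions 25 and 26 can be sketched in the simplified case of an orthonormal design matrix (an assumption made here so that both solutions have closed forms). The starting coefficients and the penalty value are invented for the example.

```python
import numpy as np

# Ordinary least-squares coefficients (hypothetical values).
b_ols = np.array([3.0, -1.5, 0.4, -0.1])

def ridge_shrink(b, lam):
    # Minimizer of 0.5*||y - Xb||^2 + 0.5*lam*||b||^2 when X^T X = I:
    # every coefficient is scaled toward zero but stays nonzero.
    return b / (1.0 + lam)

def lasso_shrink(b, lam):
    # Minimizer of 0.5*||y - Xb||^2 + lam*||b||_1 when X^T X = I:
    # soft-thresholding, which sets small coefficients exactly to zero.
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

lam = 0.5
ridge = ridge_shrink(b_ols, lam)
lasso = lasso_shrink(b_ols, lam)
print(ridge)
print(lasso)
```

With these numbers the two smallest coefficients are removed by the lasso rule, while the ridge rule keeps all four.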

27. Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
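A small sketch of this sensitivity: the same line is fitted before and after one point is moved far away. The data and the size of the outlier are arbitrary choices for illustration.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0  # points exactly on a line with slope 2

slope_clean = np.polyfit(x, y, 1)[0]

# Shift a single point far from the line to act as an outlier.
y_out = y.copy()
y_out[-1] += 50.0
slope_out = np.polyfit(x, y_out, 1)[0]

print(slope_clean, slope_out)
```

One outlier out of ten points is enough to more than double the fitted slope in this example.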

28. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusions would you draw about this situation?

A) Since there is a relationship, our model is not good
B) Since there is a relationship, our model is good
C) Can’t say
D) None of these
Solution: (A)
There should not be any relationship between predicted values and residuals. If there is any
relationship between them, it means that the model has not fully captured the information
in the data.

Question Context 29-31:

Suppose that you have a dataset D1 and you design a linear regression model with a degree 3
polynomial, and you find that the training and testing error is “0”, or in other words it
perfectly fits the data.
29. What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (A)
Since a degree 4 polynomial is more complex than the degree 3 model (it can overfit the data), it
will again perfectly fit the training data. In such a case the training error will be zero but the
test error may not be zero.

30. What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that degree 2 polynomial will over fit the data
B) There are high chances that degree 2 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (B)
If a degree 3 polynomial fits the data perfectly, it’s highly likely that a simpler model(degree 2
polynomial) might under fit the data.

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
Solution: (C)
Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will be high
and variance will be low.
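The effect of polynomial degree in questions 29 to 31 can be sketched on synthetic data. The cubic target below is an assumption standing in for dataset D1; it is noiseless, so this example only shows training error and does not demonstrate the test-set overfitting of the degree 4 model.

```python
import numpy as np

# Synthetic data generated by a cubic (a stand-in for dataset D1).
x = np.linspace(-2, 2, 20)
y = x ** 3 - x

def train_mse(degree):
    # Fit a polynomial of the given degree and return its training MSE.
    coefs = np.polyfit(x, y, degree)
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))

errors = {d: train_mse(d) for d in (2, 3, 4)}
print(errors)
```

Degrees 3 and 4 reach essentially zero training error, while degree 2 cannot represent the cubic shape and underfits (high bias).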

Question Context 32-33:

We have been given a dataset with n records in which we have input attribute as x and output
attribute as y. Suppose we use a linear regression method to model this data. To test our linear
regressor, we split the data in training set and test set randomly.
32. Now we increase the training set size gradually. As the training set size increases, what do
you expect will happen with the mean training error?

A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the model.
If the values used to train contain more outliers gradually, then the error might just increase.

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
E) Can’t Say
Solution: (D)
As we increase the size of the training data, the bias would increase while the variance would
decrease.

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.

Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better option because it leads to less training as well as validation error.
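The selection rule used in the answer can be written out as a short sketch. The numbers are the five scenarios from the question; the dictionary keys are names chosen for this example.

```python
# Scenarios from the question: learning rate, iterations, training and validation error.
scenarios = [
    {"scenario": 1, "learning_rate": 0.1, "iterations": 1000, "train_error": 100, "val_error": 110},
    {"scenario": 2, "learning_rate": 0.2, "iterations": 600, "train_error": 90, "val_error": 105},
    {"scenario": 3, "learning_rate": 0.3, "iterations": 400, "train_error": 110, "val_error": 110},
    {"scenario": 4, "learning_rate": 0.4, "iterations": 300, "train_error": 120, "val_error": 130},
    {"scenario": 5, "learning_rate": 0.4, "iterations": 250, "train_error": 130, "val_error": 150},
]

# Pick the hyperparameter setting with the lowest validation error.
best = min(scenarios, key=lambda s: s["val_error"])
print(best["scenario"], best["learning_rate"], best["iterations"])
```

Scenario 2 also happens to have the lowest training error, so both criteria agree here.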

36. Suppose you got the tuned hyperparameters from the previous question. Now, imagine
you want to add a variable to the variable space such that this added feature is important. Which
of the following would you observe in such a case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
Solution: (D)
If the added feature is important, the training and validation error would decrease.

Question Context 37-38:

Suppose, you got a situation where you find that your linear regression model is under fitting
the data.
37. In such situation which of the following options would you consider?
1. I will add more variables
2. I will start introducing polynomial degree variables
3. I will remove some variables
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
Solution: (A)
In case of underfitting, you need to introduce more variables in the variable space, or you can add
some polynomial degree variables to make the model more complex so that it can fit the data
better.

38. Now the situation is the same as in the previous question (underfitting). Which of the following
regularization algorithms would you prefer?

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

A) TRUE
B) FALSE
Solution: A
True, logistic regression is a supervised learning algorithm because it uses true labels for
training. A supervised learning algorithm needs input variables (x) and a target variable (Y)
when you train the model.

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don’t be misled by the word “regression” in its name.

41. True-False: Is it possible to design a logistic regression algorithm using a Neural Network
Algorithm?
A) TRUE
B) FALSE
Solution: A
True, a neural network is a universal approximator, so it can implement a logistic regression
algorithm.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification
problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the One-vs-All
method for 3-class classification in logistic regression.

43. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Solution: B
Logistic regression is trained using maximum likelihood estimation.

44. Which of the following evaluation metrics can not be applied in case of logistic regression
output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output is not a continuous real value, so
mean squared error is not used for evaluating it.

45. One of the very good methods to analyze the performance of Logistic Regression is AIC,
which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
Solution: A
We select the logistic regression model that has the lowest AIC.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute penalty; after increasing the penalty in lasso, some of the
coefficients of variables may become zero.

Context: 48-49

Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w), viewed as a function of x, is what we can get by changing the
parameters w.

48. What would be the range of p in such a case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
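A quick numerical check of this range, using the standard definition g(z) = 1 / (1 + e^(-z)). The sample inputs are arbitrary.

```python
import numpy as np

def logistic(z):
    # Standard logistic (sigmoid) function.
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-30.0, -5.0, 0.0, 5.0, 30.0])
p = logistic(z)
print(p)
```

Even for inputs far from zero the outputs stay strictly inside (0, 1), and g(0) is exactly 0.5.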

49. In the above question, which function do you think would keep p between (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for the previous question.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
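The odds and logit definitions used in questions 50 and 51 can be sketched directly. The function names are chosen for this example.

```python
import math

def odds(p):
    # Odds of an event with probability p, for 0 < p < 1.
    return p / (1.0 - p)

def logit(p):
    # Logit is the natural log of the odds.
    return math.log(odds(p))

fair_coin_odds = odds(0.5)
print(fair_coin_odds)
print(logit(0.001), logit(0.5), logit(0.999))
```

As p moves toward 0 the logit becomes arbitrarily negative, and as p moves toward 1 it becomes arbitrarily positive, which is why its range is the whole real line.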

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution: A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is an inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
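The reason behind answer B: if x = log(p / (1 - p)), solving for p gives p = 1 / (1 + e^(-x)), which is the logistic function. A short round-trip check, with function names chosen for this sketch:

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Applying logistic after logit should return the original probability.
probs = (0.1, 0.5, 0.9)
round_trip = [logistic(logit(p)) for p in probs]
print(round_trip)
```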

54. Suppose you are given two scatter plots “a” and “b” for two classes (blue for the positive and red for
the negative class). In scatter plot “a”, you correctly classified all data points using logistic regression
(the black line is the decision boundary).

How will the bias change on using high (infinite) regularisation?

A) Bias will be high
B) Bias will be low
C) Can’t say
D) None of these

Solution: A

Model will become very simple so bias will be very high.

55. Suppose, You applied a Logistic Regression model on a given data and got a training accuracy X
and testing accuracy Y. Now, you want to add a few new features in the same data. Select the
option(s) which is/are correct in such a case.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

Adding more features to the model will increase the training accuracy because the model has more
information with which to fit the logistic regression. Testing accuracy increases only if the feature is
found to be significant.

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fitted, where the probability
of each category is predicted against the rest of the categories combined.

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

Below are three scatter plots (A, B, C left to right) and hand-drawn decision boundaries for logistic
regression.

58. Which of the above figures shows that the decision boundary is overfitting the training
data?

A) A
B) B
C) C
D) None of these

Solution: C

In figure 3 the decision boundary is not smooth, which means it is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the maximum as compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than first and third because it will perform best on unseen
data.

4. The third model is overfitting more as compared to the first and second.

5. All will perform same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

The trend in the graphs looks like a quadratic trend over independent variable X. A higher degree(Right
graph) polynomial might have a very high accuracy on the train population but is expected to fail badly
on test dataset. But if you see in left graph we will have training error maximum because it underfits the
training data

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty, which means a less complex decision boundary, as shown in the
first figure (A).

61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic Regression will take a very long time to train.

What would you do if you want to train logistic regression on the same data so that it takes less time
while giving comparatively similar accuracy (may not be the same)?

A) Decrease the learning rate and decrease the number of iteration

B) Decrease the learning rate and increase the number of iteration
C) Increase the learning rate and increase the number of iteration
D) Increase the learning rate and decrease the number of iteration

Solution: D

If you decrease the number of iterations, training will surely take less time, but it will not give the
same accuracy. To get similar (though not identical) accuracy, you need to increase the learning rate.

62. Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability)
for a two-class classification problem.

Note: Y is the target class

Which of the following images shows the cost function for y = 1?

A) A
B) B
C) Both
D) None of these

Solution: A

A is the true answer as loss function decreases as the log probability increases
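The shape described here can be reproduced with the usual per-example log loss, assuming the standard form -log(p) for y = 1 and -log(1 - p) for y = 0, where p is the predicted probability of class 1. The original figures are not available, so this is only a sketch of the expected curves.

```python
import numpy as np

# Predicted probabilities of the positive class.
p = np.linspace(0.01, 0.99, 50)

loss_y1 = -np.log(p)        # loss when the true label is 1
loss_y0 = -np.log(1.0 - p)  # loss when the true label is 0

print(loss_y1[0], loss_y1[-1])
print(loss_y0[0], loss_y0[-1])
```

For y = 1 the loss falls steadily as the predicted probability rises; for y = 0 it does the opposite.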

63. Suppose, Following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
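The figure is not reproduced here; assuming it shows the XOR pattern over two binary inputs, the claim can be illustrated with a brute-force search over linear boundaries. The grid search is only an illustration, not a proof. The algebraic reason is that a separator would need w0 <= 0, w0 + w1 > 0, w0 + w2 > 0 and w0 + w1 + w2 <= 0, and these four conditions contradict each other.

```python
import itertools

# Assumed data: XOR labels over binary X1 and X2 (the original figure is missing).
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def separates(w0, w1, w2):
    # True if the linear rule "predict 1 when w0 + w1*x1 + w2*x2 > 0"
    # classifies every point correctly.
    for (x1, x2), label in points:
        predicted = 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0
        if predicted != label:
            return False
    return True

# Try every weight combination on a coarse grid from -5 to 5.
grid = [i / 2 for i in range(-10, 11)]
found = any(separates(w0, w1, w2) for w0, w1, w2 in itertools.product(grid, repeat=3))
print(found)
```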

UNIT IV

1. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.

4. Which of the following is true about Naive Bayes?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B

D) None of the above options

Ans Solution: C

5. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7. What is/are true about kernel in SVM?

1. Kernel function map low dimensional data to high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.
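Both statements can be seen in a small sketch with a degree 2 polynomial kernel k(u, v) = (u·v)^2 on 2-dimensional inputs. The explicit feature map `phi` below is the standard one for this kernel; the input vectors are arbitrary.

```python
import numpy as np

def phi(v):
    # Explicit map from 2-D input to the 3-D feature space of the kernel.
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly_kernel(u, v):
    # Degree 2 polynomial kernel computed in the original 2-D space.
    return float(np.dot(u, v)) ** 2

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

k_direct = poly_kernel(u, v)
k_mapped = float(np.dot(phi(u), phi(v)))
print(k_direct, k_mapped)
```

The kernel value is a similarity score between u and v, and it equals the inner product in the higher-dimensional space without that space ever being constructed.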

Question Context:8– 9

Suppose you are using a Linear SVM classifier on a 2-class classification problem. You have been
given the following data in which some points are circled in red; these represent the support vectors.

8. If you remove any one of the red-circled points from the data, will the decision boundary
change?
A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

Generalization error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

11. When the C parameter is set to infinite, which of the following holds true?

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVM’s?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVM’s.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
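As a hedged illustration of what gamma controls, the sketch below evaluates the RBF kernel in its common form k(u, v) = exp(-gamma * ||u - v||^2). The two points and the gamma values are arbitrary.

```python
import numpy as np

def rbf(u, v, gamma):
    # RBF (Gaussian) kernel: similarity decays with squared distance.
    return float(np.exp(-gamma * np.sum((u - v) ** 2)))

a = np.array([0.0, 0.0])
b = np.array([1.0, 1.0])

low = rbf(a, b, gamma=0.01)   # distant points still look similar
high = rbf(a, b, gamma=10.0)  # similarity drops off very quickly
print(low, high)
```

With a high gamma each training point only influences its immediate neighbourhood, so the boundary can follow the training data very closely; with a low gamma the influence reaches far and the boundary is smoother.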

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.

19. Suppose you are building an SVM model on data X. The data X can be error prone, which means that
you should not trust any specific data point too much. Now suppose you want to build an SVM model
with a quadratic kernel function (polynomial of degree 2) that uses the slack variable C as one of its
hyperparameters. Based on that, answer the following question.

What would happen when you use a very large value of C (C -> infinity)?

Note: For small C, the model was also classifying all data points correctly

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.

22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you correctly
infer that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A better model will lower the bias and increase the variance.

25. In the above question, suppose you want to change one of its (SVM) hyperparameters so that the effect
would be the same as in the previous questions, i.e. the model will not underfit. What would you do?

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing the C parameter would be the right thing to do here, as it weakens the regularization and lets
the model fit the training data more closely.

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Some times, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

For a 4 class problem, you would have to train the SVM at least 4 times if you are using a one-vs-all
method.

28. Suppose you have the same distribution of classes in the data. Now, say training one time in the
one-vs-all setting takes the SVM 10 seconds. How many seconds would it require to train the one-vs-all
method end to end?

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29. Suppose your problem has changed now and the data has only 2 classes. How many times do you think
we need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

Suppose you are using an SVM with a linear kernel of polynomial degree 2. Now suppose you have applied
this to the data and found that it fits the data perfectly, that is, training and testing accuracy are both 100%.

30. Now suppose you increase the complexity (the degree of the polynomial of this kernel). What do
you think will happen?

A) Increasing the complexity will overfit the data
B) Increasing the complexity will underfit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

Increasing the complexity of the model would make the algorithm overfit the data.

31. In the previous question, after increasing the complexity you found that the training accuracy was still
100%. What do you think is the reason for that?

1. Since the data is fixed and we are fitting more polynomial terms or parameters, the algorithm starts
memorizing everything in the data
2. Since the data is fixed, the SVM doesn’t need to search in a big hypothesis space

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.
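To see why a kernel can be read as a similarity function, here is a minimal sketch of the Gaussian (RBF) kernel in plain Python. The function name and the value of gamma are assumptions chosen for illustration; the kernel returns 1 for identical points and decays toward 0 as points move apart.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])  # identical points
near = rbf_kernel([1.0, 2.0], [1.5, 2.0])  # close points
far = rbf_kernel([1.0, 2.0], [5.0, 7.0])   # distant points
print(same, near, far)
```

The similarity ordering (same > near > far) holds for any positive gamma; gamma only controls how quickly the similarity falls off with distance.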

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to overfitting
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithms is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6. Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given a less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers

Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7. Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
are built on different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

In boosting, the individual weak learners are not independent of each other, because each tree corrects
the results of the previous tree. Bagging and boosting can both be considered ways of improving the
base learners' results.

9. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

10. Suppose you are using a bagging-based algorithm, say a Random Forest, in model building.
Which of the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

Since Random Forest aggregates the results of different weak learners, we would want as many trees
as possible in model building. Random Forest is a black-box model, so you lose
interpretability after using it.

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random Forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

13. Which of the following algorithms doesn’t use learning rate as one of its hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D

Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging-based algorithm, say a Random Forest, in model building. Which of
the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: Bagging is suitable for high-variance, low-bias models?

A) TRUE
B) FALSE

Solution: A

Bagging is suitable for high-variance, low-bias models, in other words for complex models.

17. When applying bagging to regression trees, which of the following is/are true?

1. We build N regression trees from N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
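The three statements above can be sketched as a small bagging loop. This is a simplified illustration, not a full regression-tree implementation: as a stand-in for a deep tree it uses a 1-nearest-neighbour regressor (an assumption chosen because it is also a high-variance, low-bias learner), and all function names are hypothetical.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) points from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_1nn(sample):
    """Stand-in base learner: predict the target of the closest training x."""
    def predict(x):
        nearest = min(sample, key=lambda point: abs(point[0] - x))
        return nearest[1]
    return predict


def bagged_predict(data, x, n_models=25, seed=0):
    """Train n_models learners on bootstrap samples and average their predictions."""
    rng = random.Random(seed)
    models = [fit_1nn(bootstrap_sample(data, rng)) for _ in range(n_models)]
    predictions = [model(x) for model in models]
    return sum(predictions) / len(predictions)


# Toy (x, y) pairs, invented for illustration only.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8), (4.0, 4.1)]
print(bagged_predict(data, 2.5))
```

Averaging over bootstrap samples smooths out the jumps of the individual learners, which is why bagging helps most when each learner has high variance.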

18. How do we select the best hyperparameters in tree-based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always use the validation results, since they estimate how the model will perform on the test data.

19. In which of the following scenarios is gain ratio preferred over Information Gain?

A) When a categorical variable has a very large number of categories
B) When a categorical variable has a very small number of categories
C) The number of categories is not the reason
D) None of these

Solution: A

In high-cardinality problems, gain ratio is preferred over Information Gain.
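A small worked example may make the reason clearer. The sketch below uses invented toy data and hypothetical function names: a unique row identifier (8 categories) and a 2-category feature both reach the same information gain, but gain ratio divides by the entropy of the split itself and so penalizes the high-cardinality feature.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain divided by the split information (entropy of the feature)."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))                            # one category per row
binary = ["a", "a", "b", "b", "a", "b", "a", "b"]  # two categories

print(information_gain(row_id, labels), information_gain(binary, labels))
print(gain_ratio(row_id, labels), gain_ratio(binary, labels))
```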

20. Suppose you are given the following scenarios for training and validation error for Gradient
Boosting. Which of the following hyperparameters would you choose in such a case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select scenario 2 because the lower depth
is the better hyperparameter.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be within a distance threshold of a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n^3)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

- DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

a) The choice of an appropriate metric will influence the shape of the clusters
b) Hierarchical clustering is also called HCA
c) In general, the merges and splits are determined in a greedy manner
d) All of the mentioned
Answer: d
Explanation: Some elements may be close to one another according to one distance and farther away
according to another.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following functions is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of a number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
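Both points in question 26 (a random start and repeated iterations) show up in a minimal one-dimensional sketch of the usual assign-then-update loop. The data, the function name, and the fixed iteration count are assumptions for illustration; the random initial centroids are the source of non-determinism, which is why a seed is passed in.

```python
import random


def kmeans_1d(points, k, seed, n_iter=20):
    """Minimal 1-D k-means: random initial centroids, then assign/update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial guess
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as its cluster mean; keep it if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
print(kmeans_1d(points, k=2, seed=0))
```

On well-separated data like this, different seeds end at the same centroids; on harder data, different initial guesses can settle in different local optima.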
27.
This sheet is for 1 Mark questions.
Each entry lists: Sr. No, Question, options a to d, Correct Answer, Image.
1. In reinforcement learning if feedback is negative one it is defined as____.
a) Penalty
b) Overlearning
c) Reward
d) None of above
Answer: A

2. According to____ , it’s a key success factor for the survival and evolution of all species.
a) Claude Shannon's theory
b) Gini Index
c) Darwin’s theory
d) None of above
Answer: C

3. How can you avoid overfitting?
a) By using a lot of data
b) By using inductive machine learning
c) By using validation only
d) None of above
Answer: A

4. What are the popular algorithms of Machine Learning?
a) Decision Trees and Neural Networks (back propagation)
b) Probabilistic networks and Nearest Neighbor
c) Support vector machines
d) All
Answer: D

5. What is “Training set”?
a) Training set is used to test the accuracy of the hypotheses generated by the learner.
b) A set of data is used to discover the potentially predictive relationship.
c) Both A & B
d) None of above
Answer: B

6. Common deep learning applications include____
a) Image classification, Real-time visual tracking
b) Autonomous car driving, Logistic optimization
c) Bioinformatics, Speech recognition
d) All above
Answer: D

7. What is the function of “Supervised Learning”?
a) Classifications, Predict time series, Annotate strings
b) Speech recognition, Regression
c) Both A & B
d) None of above
Answer: C

8. Common unsupervised applications include
a) Object segmentation
b) Similarity detection
c) Automatic labeling
d) All above
Answer: D

9. Reinforcement learning is particularly efficient when______________.
a) the environment is not completely deterministic
b) it's often very dynamic
c) it's impossible to have a precise error measure
d) All above
Answer: D

10. If there is only a discrete number of possible outcomes (called categories), the process becomes a______.
a) Regression
b) Classification.
c) Modelfree
d) Categories
Answer: B

11. Which of the following are supervised learning applications
a) Spam detection, Pattern detection, Natural Language Processing
b) Image classification, Real-time visual tracking
c) Autonomous car driving, Logistic optimization
d) Bioinformatics, Speech recognition
Answer: A

12. During the last few years, many ______ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state.
a) Logical
b) Classical
c) Classification
d) None of above
Answer: D

13. Which of the following sentence is correct?
a) Machine learning relates with the study, design and development of the algorithms that give computers the capability to learn without being explicitly programmed.
b) Data mining can be defined as the process in which the unstructured data tries to extract knowledge or unknown interesting patterns.
c) Both A & B
d) None of the above
Answer: C
14. What is “Overfitting” in Machine learning?
a) when a statistical model describes random error or noise instead of underlying relationship “overfitting” occurs.
b) Robots are programed so that they can perform the task based on data they gather from sensors.
c) While involving the process of learning “overfitting” occurs.
d) a set of data is used to discover the potentially predictive relationship
Answer: A

15. What is “Test set”?
a) Test set is used to test the accuracy of the hypotheses generated by the learner.
b) It is a set of data is used to discover the potentially predictive relationship.
c) Both A & B
d) None of above
Answer: A

16. ________is much more difficult because it's necessary to determine a supervised strategy to train a model for each feature and, finally, to predict their value
a) Removing the whole line
b) Creating sub-model to predict those features
c) Using an automatic strategy to input them according to the other known values
d) All above
Answer: B

17. How it's possible to use a different placeholder through the parameter_______.
a) regression
b) classification
c) random_state
d) missing_values
Answer: D

18. If you need a more powerful scaling feature, with a superior control on outliers and the possibility to select a quantile range, there's also the class________.
a) RobustScaler
b) DictVectorizer
c) LabelBinarizer
d) FeatureHasher
Answer: A

19. scikit-learn also provides a class for per-sample normalization, Normalizer. It can apply________to each element of a dataset
a) max, l0 and l1 norms
b) max, l1 and l2 norms
c) max, l2 and l3 norms
d) max, l3 and l4 norms
Answer: B

20. There are also many univariate methods that can be used in order to select the best features according to specific criteria based on________.
a) F-tests and p-values
b) chi-square
c) ANOVA
d) All above
Answer: A

21. Which of the following selects only a subset of features belonging to a certain percentile
a) SelectPercentile
b) FeatureHasher
c) SelectKBest
d) All above
Answer: A

22. ________performs a PCA with non-linearly separable data sets.
a) SparsePCA
b) KernelPCA
c) SVD
d) None of the Mentioned
Answer: B

23. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from a college. Which of the following statement is true in following case?
a) Feature F1 is an example of nominal variable.
b) Feature F1 is an example of ordinal variable.
c) It doesn’t belong to any of the above category.
d) Both of these
Answer: B

24. What would you do in PCA to get the same projection as SVD?
a) Transform data to zero mean
b) Transform data to zero median
c) Not possible
d) None of these
Answer: A
25. What is PCA, KPCA and ICA used for?
a) Principal Components Analysis
b) Kernel based Principal Component Analysis
c) Independent Component Analysis
d) All above
Answer: D

26. Can a model trained for item based similarity also choose from a given set of items?
a) YES
b) NO
Answer: A

27. What are common feature selection methods in regression task?
a) correlation coefficient
b) Greedy algorithms
c) All above
d) None of these
Answer: C

28. The parameter______ allows specifying the percentage of elements to put into the test/training set
a) test_size
b) training_size
c) All above
d) None of these
Answer: C

29. In many classification problems, the target ______ is made up of categorical labels which cannot immediately be processed by any algorithm.
a) random_state
b) dataset
c) test_size
d) All above
Answer: B

30. _______adopts a dictionary-oriented approach, associating to each category label a progressive integer number.
a) LabelEncoder class
b) LabelBinarizer class
c) DictVectorizer
d) FeatureHasher
Answer: A

31. If a Linear regression model fits perfectly, i.e., train error is zero, then _____________________
a) Test error is also always zero
b) Test error is non zero
c) Couldn’t comment on Test error
d) Test error is equal to Train error
Answer: C

32. Which of the following metrics can be used for evaluating regression models?
i) R Squared
ii) Adjusted R Squared
iii) F Statistics
iv) RMSE / MSE / MAE
a) ii and iv
b) i and ii
c) ii, iii and iv
d) i, ii, iii and iv
Answer: D

33. How many coefficients do you need to estimate in a simple linear regression model (One independent variable)?
a) 1
b) 2
c) 3
d) 4
Answer: B

34. In a simple linear regression model (One independent variable), if we change the input variable by 1 unit, how much will the output variable change?
a) by 1
b) no change
c) by intercept
d) by its slope
Answer: D

35. Function used for linear regression in R is __________
a) lm(formula, data)
b) lr(formula, data)
c) lrm(formula, data)
d) regression.linear(formula, data)
Answer: A

36. In syntax of linear model lm(formula,data,..), data refers to ______
a) Matrix
b) Vector
c) Array
d) List
Answer: B

37. In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________
a) (X-intercept, Slope)
b) (Slope, X-Intercept)
c) (Y-Intercept, Slope)
d) (slope, Y-Intercept)
Answer: C

38. Linear Regression is a supervised machine learning algorithm.
a) TRUE
b) FALSE
Answer: A

39. It is possible to design a Linear regression algorithm using a neural network?
a) TRUE
b) FALSE
Answer: A
40. Which of the following methods do we use to find the best fit line for data in Linear Regression?
a) Least Square Error
b) Maximum Likelihood
c) Logarithmic Loss
d) Both A and B
Answer: A

41. Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
a) AUC-ROC
b) Accuracy
c) Logloss
d) Mean-Squared-Error
Answer: D

42. Which of the following is true about Residuals?
a) Lower is better
b) Higher is better
c) A or B depend on the situation
d) None of these
Answer: A

43. Overfitting is more likely when you have huge amount of data to train?
a) TRUE
b) FALSE
Answer: B

44. Which of the following statement is true about outliers in Linear regression?
a) Linear regression is sensitive to outliers
b) Linear regression is not sensitive to outliers
c) Can’t say
d) None of these
Answer: A

45. Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusion do you make about this situation?
a) Since the there is a relationship means our model is not good
b) Since the there is a relationship means our model is good
c) Can’t say
d) None of these
Answer: A

46. Naive Bayes classifiers are a collection ------------------of algorithms
a) Classification
b) Clustering
c) Regression
d) All
Answer: A

47. Naive Bayes classifiers is _______________ Learning
a) Supervised
b) Unsupervised
c) Both
d) None
Answer: A

48. Features being classified is independent of each other in Naïve Bayes Classifier
a) False
b) True
Answer: B

49. Features being classified is __________ of each other in Naïve Bayes Classifier
a) Independent
b) Dependent
c) Partial Dependent
d) None
Answer: A

50. Bayes Theorem is given by where
1. P(H) is the probability of hypothesis H being true.
2. P(E) is the probability of the evidence (regardless of the hypothesis).
3. P(E|H) is the probability of the evidence given that hypothesis is true.
4. P(H|E) is the probability of the hypothesis given that the evidence is there.
a) True
b) False
Answer: A
Image: bayes.jpg

51. In given image, P(H|E) is__________probability.
a) Posterior
b) Prior
Answer: A
Image: bayes.jpg

52. In given image, P(H) is__________probability.
a) Posterior
b) Prior
Answer: B
Image: bayes.jpg

53. Conditional probability is a measure of the probability of an event given that another event has already occurred.
a) True
b) False
Answer: A
54. Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.
a) True
b) False
Answer: A

55. Bernoulli Naïve Bayes Classifier is ___________distribution
a) Continuous
b) Discrete
c) Binary
Answer: C

56. Multinomial Naïve Bayes Classifier is ___________distribution
a) Continuous
b) Discrete
c) Binary
Answer: B

57. Gaussian Naïve Bayes Classifier is ___________distribution
a) Continuous
b) Discrete
c) Binary
Answer: A

58. Binarize parameter in BernoulliNB scikit sets threshold for binarizing of sample features.
a) True
b) False
Answer: A

59. Gaussian distribution when plotted, gives a bell shaped curve which is symmetric about the _______ of the feature values.
a) Mean
b) Variance
c) Discrete
d) Random
Answer: A

60. SVMs directly give us the posterior probabilities P(y = 1|x) and P(y = −1|x)
a) True
b) False
Answer: B

61. Any linear combination of the components of a multivariate Gaussian is a univariate Gaussian.
a) True
b) False
Answer: A

62. Solving a non linear separation problem with a hard margin Kernelized SVM (Gaussian RBF Kernel) might lead to overfitting
a) True
b) False
Answer: A

63. SVM is a ------------------ algorithm
a) Classification
b) Clustering
c) Regression
d) All
Answer: A

64. SVM is a ------------------ learning
a) Supervised
b) Unsupervised
c) Both
d) None
Answer: A

65. The linear SVM classifier works by drawing a straight line between two classes
a) True
b) False
Answer: A

66. Which of the following function provides unsupervised prediction?
a) cl_forecast
b) cl_nowcast
c) cl_precast
d) None of the Mentioned
Answer: D

67. Which of the following is characteristic of best machine learning method?
a) fast
b) accuracy
c) scalable
d) All above
Answer: D

68. What are the different Algorithm techniques in Machine Learning?
a) Supervised Learning and Semi-supervised Learning
b) Unsupervised Learning and Transduction
c) Both A & B
d) None of the Mentioned
Answer: C

69. What is the standard approach to supervised learning?
a) split the set of example into the training set and the test
b) group the set of example into the training set and the test
c) a set of observed instances tries to induce a general rule
d) learns programs from data
Answer: A
70. Which of the following is not Machine Learning?
a) Artificial Intelligence
b) Rule based inference
c) Both A & B
d) None of the Mentioned
Answer: B

71. What is Model Selection in Machine Learning?
a) The process of selecting models among different mathematical models, which are used to describe the same data set
b) when a statistical model describes random error or noise instead of underlying relationship
c) Find interesting directions in data and find novel observations/ database cleaning
d) All above
Answer: A

72. Which are two techniques of Machine Learning?
a) Genetic Programming and Inductive Learning
b) Speech recognition and Regression
c) Both A & B
d) None of the Mentioned
Answer: A

73. Even if there are no actual supervisors ________ learning is also based on feedback provided by the environment
a) Supervised
b) Reinforcement
c) Unsupervised
d) None of the above
Answer: B

74. What does learning exactly mean?
a) Robots are programed so that they can perform the task based on data they gather from sensors.
b) A set of data is used to discover the potentially predictive relationship.
c) Learning is the ability to change according to external stimuli and remembering most of all previous experiences.
d) It is a set of data is used to discover the potentially predictive relationship.
Answer: C

75. When it is necessary to allow the model to develop a generalization ability and avoid a common problem called______.
a) Overfitting
b) Overlearning
c) Classification
d) Regression
Answer: A

76. Techniques involve the usage of both labeled and unlabeled data is called___.
a) Supervised
b) Semi-supervised
c) Unsupervised
d) None of the above
Answer: B

77. In reinforcement learning if feedback is negative one it is defined as____.
a) Penalty
b) Overlearning
c) Reward
d) None of above
Answer: A

78. According to____ , it’s a key success factor for the survival and evolution of all species.
a) Claude Shannon's theory
b) Gini Index
c) Darwin’s theory
d) None of above
Answer: C

79. A supervised scenario is characterized by the concept of a _____.
a) Programmer
b) Teacher
c) Author
d) Farmer
Answer: B

80. overlearning causes due to an excessive ______.
a) Capacity
b) Regression
c) Reinforcement
d) Accuracy
Answer: A

81. Which of the following is an example of a deterministic algorithm?
a) PCA
b) K-Means
c) None of the above
Answer: A

82. Which of the following models include a backwards elimination feature selection routine?
a) MCV
b) MARS
c) MCRS
d) All above
Answer: B

83. Can we extract knowledge without apply feature selection
a) YES
b) NO
Answer: A

84. While using feature selection on the data, is the number of features decreases.
a) NO
b) YES
Answer: B

85. Which of the following are several models for feature extraction
a) regression
b) classification
c) None of the above
Answer: C
86. _____ provides some built-in datasets that can be used for testing purposes.
a) scikit-learn
b) classification
c) regression
d) None of the above
Answer: A

87. While using _____ all labels are turned into sequential numbers.
a) LabelEncoder class
b) LabelBinarizer class
c) DictVectorizer
d) FeatureHasher
Answer: A

88. _______produce sparse matrices of real numbers that can be fed into any machine learning model.
a) DictVectorizer
b) FeatureHasher
c) Both A & B
d) None of the Mentioned
Answer: C

89. scikit-learn offers the class______, which is responsible for filling the holes using a strategy based on the mean, median, or frequency
a) LabelEncoder
b) LabelBinarizer
c) DictVectorizer
d) Imputer
Answer: D

90. Which of the following scale data by removing elements that don't belong to a given range or by considering a maximum absolute value.
a) MinMaxScaler
b) MaxAbsScaler
c) Both A & B
d) None of the Mentioned
Answer: C

91. scikit-learn also provides a class for per-sample normalization,_____
a) Normalizer
b) Imputer
c) Classifier
d) All above
Answer: A

92. ______dataset with many features contains information proportional to the independence of all features and their variance.
a) normalized
b) unnormalized
c) Both A & B
d) None of the Mentioned
Answer: B

93. In order to assess how much information is brought by each component, and the correlation among them, a useful tool is the_____.
a) Concurrent matrix
b) Convergence matrix
c) Supportive matrix
d) Covariance matrix
Answer: D

94. The_____ parameter can assume different values which determine how the data matrix is initially processed.
a) run
b) start
c) init
d) stop
Answer: C

95. ______allows exploiting the natural sparsity of data while extracting principal components.
a) SparsePCA
b) KernelPCA
c) SVD
d) init parameter
Answer: A

96. Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
a) AUC-ROC
b) Accuracy
c) Logloss
d) Mean-Squared-Error
Answer: D

97. Which of the following is true about Residuals?
a) Lower is better
b) Higher is better
c) A or B depend on the situation
d) None of these
Answer: A

98. Overfitting is more likely when you have huge amount of data to train?
a) TRUE
b) FALSE
Answer: B

99. Which of the following statement is true about outliers in Linear regression?
a) Linear regression is sensitive to outliers
b) Linear regression is not sensitive to outliers
c) Can’t say
d) None of these
Answer: A
100. Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusion do you make about this situation?
a) Since the there is a relationship means our model is not good
b) Since the there is a relationship means our model is good
c) Can’t say
d) None of these
Answer: A

101. Let’s say, a “Linear regression” model perfectly fits the training data (train error is zero). Now, Which of the following statement is true?
a) You will always have test error zero
b) You can not have test error zero
c) None of the above
Answer: C

102. In a linear regression problem, we are using “R-squared” to measure goodness-of-fit. We add a feature in linear regression model and retrain the same model. Which of the following option is true?
a) If R Squared increases, this variable is significant.
b) If R Squared decreases, this variable is not significant.
c) Individually R squared cannot tell about variable importance. We can’t say anything about it right now.
d) None of these.
Answer: C

103. Which of the one is true about Heteroskedasticity?
a) Linear Regression with varying error terms
b) Linear Regression with constant error terms
c) Linear Regression with zero error terms
d) None of these
Answer: A

104. Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation
4. The predictor x is non-stochastic and is measured error-free
a) 1,2 and 3.
b) 1,3 and 4.
c) 1 and 3.
d) All of above.
Answer: D

105. To test linear relationship of y(dependent) and x(independent) continuous variables, which of the following plot best suited?
a) Scatter plot
b) Barchart
c) Histograms
d) None of these
Answer: A

106. which of the following step / assumption in regression modeling impacts the trade-off between under-fitting and over-fitting the most.
a) The polynomial degree
b) Whether we learn the weights by matrix inversion or gradient descent
c) The use of a constant-term
Answer: A

107. Can we calculate the skewness of variables based on mean and median?
a) TRUE
b) FALSE
Answer: B

108. Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?
a) Ridge regression uses subset selection of features
b) Lasso regression uses subset selection of features
c) Both use subset selection of features
d) None of above
Answer: B

109. Which of the following statement(s) can be true post adding a variable in a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
a) 1 and 2
b) 1 and 3
c) 2 and 4
d) None of the above
Answer: A
110 How many coefficients do you need to 1 2 Can�t Say B --
estimate in a simple linear regression
model (One independent variable)?
111 In given image, P(H) Posterior Prior B bayes.jpg
is__________probability.
112 Conditional probability is a measure of the True FALSE A --
probability of an event given that another
event has already occurred.
113 Gaussian distribution when plotted, gives Mean Variance Discrete Random A --
a bell shaped curve which is symmetric
about the _______ of the feature values.
114 SVMs directly give us the posterior True FALSE B --
probabilities P(y = 1jx) and P(y = ??1jx)
115 SVM is a ------------------ algorithm� Classification Clustering Regression All A --
116 What is/are true about kernel in SVM?1. 1 2 1 and 2 None of these C --
Kernel function map low dimensional data
to high dimensional space2. It�s a
similarity function
117 Suppose you are building a SVM model on Misclassification would Data will be correctly Can�t say None of these A --
data X. The data X can be error prone happen classified
which means that you should not trust
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of it�s hyper
parameter.What would happen when you
use very small C (C~0)?
118 The cost parameter in the SVM means: The number of cross- The kernel to be used The tradeoff between None of the above C --
validations to be made misclassification and
simplicity of the model
119 Bayes� theorem describes the True FALSE A --
probability of an event, based on prior
knowledge of conditions that might be
related to the event.
120 Bernoulli Na�ve Bayes Classifier is Continuous Discrete Binary C --
___________distribution
121 If you remove the non-red circled points TRUE FALSE B svm.jpg
from the data, the decision boundary will
change?
122 How do you handle missing or corrupted a. Drop missing rows or b. Replace missing values c. Assign a unique d. All of the above D --
data in a dataset? columns with mean/median/mode category to missing values
123 Binarize parameter in BernoulliNB scikit True FALSE A --
sets threshold for binarizing of sample
features.
124 Which of the following statements about A. Attributes are B. Attributes are C. Attributes are D. Attributes can B --
Naive Bayes is incorrect? equally important. statistically dependent of statistically independent of be nominal or numeric
one another given the class one another given the
value. class value.
125 SVMs are less effective when: The data is linearly separable The data is clean and ready The data is noisy and C --
to use contains overlapping
points
126 Naive Bayes classifiers is _______________ Supervised Unsupervised Both None A --
Learning
127 Features being classified are independent False TRUE B --
of each other in Naive Bayes Classifier
128 Features being classified are __________ of Independent Dependent Partial Dependent None A --
each other in Naive Bayes Classifier
129 Bayes Theorem is given by where 1. P(H) True FALSE A bayes.jpg
is the probability of hypothesis H being
true.
2. P(E) is the probability of the
evidence(regardless of the hypothesis).
3. P(E|H) is the probability of the evidence
given that hypothesis is true.
4. P(H|E) is the probability of the
hypothesis given that the evidence is
there.
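The four quantities listed in question 129 combine as P(H|E) = P(E|H) * P(H) / P(E). The following is a minimal sketch of that calculation; the function name and the numeric values are hypothetical and chosen only for illustration, and P(E) is expanded with the law of total probability.

```python
def posterior(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Return P(H|E) from P(H), P(E|H) and P(E|not H) using Bayes' theorem."""
    # P(E) = P(E|H) * P(H) + P(E|not H) * P(not H)
    evidence = (likelihood_e_given_h * prior_h
                + likelihood_e_given_not_h * (1.0 - prior_h))
    return likelihood_e_given_h * prior_h / evidence


# Hypothetical numbers: prior P(H) = 0.01, P(E|H) = 0.9, P(E|not H) = 0.05.
p_h_given_e = posterior(0.01, 0.9, 0.05)
print(round(p_h_given_e, 4))
```

Here P(H) plays the role of the prior and the returned P(H|E) is the posterior, which matches the terminology used in questions 111 and 124.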
130 Any linear combination of the True FALSE A --
components of a multivariate Gaussian is
a univariate Gaussian.

This sheet
is for 2
Mark
questions
S.r No Question a b c d Correct Image
Answer
e.g 1 Write down question Option a Option b Option c Option d a/b/c/d img.jpg
1 A supervised scenario is characterized by Programmer Teacher Author Farmer B
the concept of a _____.
2 Overlearning is caused by an excessive Capacity Regression Reinforcement Accuracy A
______.
3 If there is only a discrete number of Modelfree Categories Prediction None of above B
possible outcomes, they are called _____.
4 What is the standard approach to split the set of example into group the set of example a set of observed learns programs from data A
supervised learning? the training set and the test into the training set and the instances tries to induce a
test general rule
5 Some people are using the term ___ Inference Interference Accuracy None of above A
instead of prediction only to avoid the
weird idea that machine learning is a sort
of modern magic.
6 The term _____ can be freely used, but Accuracy Cluster Regression Prediction D
with the same meaning adopted in
physics or system theory.
7 Which are two techniques of Machine Genetic Programming and Speech recognition and Both A & B None of the Mentioned A
Learning ? Inductive Learning Regression
8 Even if there are no actual supervisors Supervised Reinforcement Unsupervised None of the above B
________ learning is also based on
feedback provided by the environment
9 Common deep learning applications / Real-time visual object Classic approaches Automatic labeling Bio-inspired adaptive B
problems can also be solved using____ identification systems
10 Identify the various approaches for Concept Vs Classification Symbolic Vs Statistical Inductive Vs Analytical All above D
machine learning. Learning Learning Learning
11 What is the function of "Unsupervised Find clusters of the data and Find interesting directions Interesting coordinates All D
Learning"? find low-dimensional in data and find novel and correlations
representations of the data observations/ database
cleaning
12 What are the two methods used for the Platt Calibration and Isotonic Statistics and A
calibration in Supervised Learning? Regression Informal Retrieval
13 What is the standard approach to split the set of example into group the set of example a set of observed learns programs from data A
supervised learning? the training set and the test into the training set and the instances tries to induce a
test general rule
14 Which of the following is not Machine Artificial Intelligence Rule based inference Both A & B None of the Mentioned B
Learning?
15 What is Model Selection in Machine The process of selecting when a statistical model Find interesting directions All above A
Learning? models among different describes random error or in data and find novel
mathematical models, which noise instead of underlying observations/ database
are used to describe the relationship cleaning
same data set
16 _____ provides some built-in datasets that scikit-learn classification regression None of the above A
can be used for testing purposes.
17 While using _____ all labels are LabelEncoder class LabelBinarizer class DictVectorizer FeatureHasher A
turned into sequential numbers.
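The idea behind question 17, turning each label into a sequential number, can be sketched in plain Python. This is only an illustration of the mapping, not scikit-learn's implementation; the helper names are hypothetical, and sorting the distinct labels before numbering them is an assumption made here to keep the mapping deterministic.

```python
def fit_label_mapping(labels):
    """Assign each distinct label a sequential integer (sorted order)."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


def encode(labels, mapping):
    """Replace every label with its integer code."""
    return [mapping[label] for label in labels]


labels = ["red", "green", "blue", "green"]
mapping = fit_label_mapping(labels)   # {'blue': 0, 'green': 1, 'red': 2}
encoded = encode(labels, mapping)     # [2, 1, 0, 1]
print(mapping, encoded)
```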
18 _______produce sparse matrices of real DictVectorizer FeatureHasher Both A & B None of the Mentioned C
numbers that can be fed into any machine
learning model.
19 scikit-learn offers the class______, which is LabelEncoder LabelBinarizer DictVectorizer Imputer D
responsible for filling the holes using a
strategy based on the mean, median, or
frequency
20 Which of the following scale data by MinMaxScaler MaxAbsScaler Both A & B None of the Mentioned C
removing elements that don't belong to a
given range or by considering a maximum
absolute value.
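A rough sketch of the two rescaling rules named in question 20, written in plain Python rather than with the scikit-learn classes. The function names are hypothetical, and the sketch assumes a single non-constant feature column with at least one nonzero value.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values so the minimum becomes `low` and the maximum `high`."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min  # assumed nonzero (non-constant column)
    return [low + (v - v_min) / span * (high - low) for v in values]


def max_abs_scale(values):
    """Divide by the largest absolute value so results lie in [-1, 1]."""
    largest = max(abs(v) for v in values)  # assumed nonzero
    return [v / largest for v in values]


column = [-4.0, 0.0, 2.0, 4.0]
print(min_max_scale(column))  # [0.0, 0.5, 0.75, 1.0]
print(max_abs_scale(column))  # [-1.0, 0.0, 0.5, 1.0]
```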
21 Which of the following models MCV MARS MCRS All above B
includes a backwards elimination feature
selection routine?
22 Can we extract knowledge without applying YES NO A
feature selection?
23 While using feature selection on the data, NO YES B
does the number of features decrease?
24 Which of the following are several models regression classification None of the above C
for feature extraction
25 scikit-learn also provides a class for per- Normalizer Imputer Classifier All above A
sample normalization,_____
26 ______dataset with many features normalized unnormalized Both A & B None of the Mentioned B
contains information proportional to the
independence of all features and their
variance.
27 In order to assess how much information Concurrent matrix Convergence matrix Supportive matrix Covariance matrix D
is brought by each component, and the
correlation among them, a useful tool is
the_____.
28 The _____ parameter can assume different run start init stop C
values which determine how the data
matrix is initially processed.
29 ______allows exploiting the natural SparsePCA KernelPCA SVD init parameter A
sparsity of data while extracting principal
components.
30 Which of the following is an example of a PCA K-Means None of the above A
deterministic algorithm?
31 Let's say, a "Linear regression" model A. You will always have test B. You can not have test C. None of the above c
perfectly fits the training data (train error error zero error zero
is zero). Now, which of the following
statements is true?
32 In a linear regression problem, we are A. If R Squared increases, B. If R Squared decreases, C. Individually R squared D. None of these. c
using "R-squared" to measure this variable is significant. this variable is not cannot tell about variable
goodness-of-fit. We add a feature in linear significant. importance. We can't say
regression model and retrain the same anything about it right now.
model. Which of the following options is
true?
33 Which of the following is true about A. Linear Regression with B. Linear Regression with C. Linear Regression with D. None of these a
Heteroskedasticity? varying error terms constant error terms zero error terms
34 Which of the following assumptions do A. 1,2 and 3. B. 1,3 and 4. C. 1 and 3. D. All of above. d
we make while deriving linear regression
parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation
4. The predictor x is non-stochastic and is measured error-free
35 To test linear relationship of y(dependent) A. Scatter plot B. Barchart C. Histograms D. None of these a
and x(independent) continuous variables,
which of the following plots is best suited?
36 Generally, which of the following A. 1 and 2 B. only 1 C. only 2 D. None of these. b
method(s) is used for predicting
continuous dependent variable?
1. Linear Regression
2. Logistic Regression
37 Suppose you are training a linear A. Both are False B. 1 is False and 2 is True C. 1 is True and 2 is False D. Both are True c
regression model. Now consider these
points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small.
Which of the above statement(s) are correct?
38 Suppose we fit "Lasso Regression" to a A. It is more likely for X1 to B. It is more likely for X1 to C. Can't say D. None of these b
data set, which has 100 features be excluded from the model be included in the model
(X1, X2, ..., X100). Now, we rescale one of
these features by multiplying it by 10 (say
that feature is X1), and then refit Lasso
regression with the same regularization
parameter. Now, which of the following
options will be correct?
39 Which of the following is true about A. Ridge regression uses B. Lasso regression uses C. Both use subset D. None of above b
"Ridge" or "Lasso" regression subset selection of features subset selection of features selection of features
methods in case of feature selection?
40 Which of the following statement(s) can A. 1 and 2 B. 1 and 3 C. 2 and 4 D. None of the above a
be true post adding a variable in a linear
regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
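Statements 1 and 2 in question 40 can be checked with the usual adjusted R-squared formula, 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of samples and p the number of features. The sketch below uses hypothetical R-squared values and sample sizes purely for illustration; the function name is an assumption.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R-squared for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical baseline: 50 samples, 5 features, R^2 = 0.80.
base = adjusted_r2(0.80, 50, 5)

# Adding a useful 6th feature: R^2 rises a lot, adjusted R^2 rises too.
useful = adjusted_r2(0.85, 50, 6)

# Adding a weak 6th feature: R^2 rises slightly, adjusted R^2 falls.
weak = adjusted_r2(0.801, 50, 6)

print(base, useful, weak)
```

On training data, ordinary least squares cannot lower R-squared when a variable is added, which is why only the first two statements can hold.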
41 We can also compute the coefficient of A. 1 and 2 B. 1 and 3. C. 2 and 3. D. 1,2 and 3. d
linear regression with the help of an
analytical method called "Normal
Equation". Which of the following is/are
true about "Normal Equation"?
1. We don't have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate
42 How many coefficients do you need to A. 1 B. 2 C. Can't Say b
estimate in a simple linear regression
model (One independent variable)?
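As a small illustration of question 42, a simple linear regression y = b0 + b1 * x has exactly two coefficients to estimate: the intercept b0 and the slope b1. The sketch below computes both with the closed-form least-squares expressions; the function name and the toy data are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimizing the squared error of y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated from y = 1 + 2x, so the fit should recover both values.
intercept, slope = fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

Because the fitted line is y = b0 + b1 * x, increasing x by one unit changes the prediction by the slope b1.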
43 If two variables are correlated, is it A. Yes B. No b
necessary that they have a linear
relationship?
44 Correlated variables can have zero A. True B. False a
correlation coefficient. True or False?
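A quick way to see why the answer to question 44 is "True": the Pearson coefficient only measures linear association, so a perfectly dependent but symmetric relationship such as y = x^2 on points centred at zero gives a coefficient of zero. The helper below is a hand-written sketch with hypothetical data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]          # y depends entirely on x
r = pearson_r(xs, ys)
print(r)
```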
45 Which of the following option is true A. The relationship is B. The relationship is not C. The relationship is not D. The relationship is d
regarding "Regression" and symmetric between x and y symmetric between x and y symmetric between x and symmetric between x and y
"Correlation"? Note: y is dependent in both. in both. y in case of correlation but in case of correlation but in
variable and x is independent variable. in case of regression it is case of regression it is not
symmetric. symmetric.
46 What is/are true about kernel in SVM? 1 2 1 and 2 None of these c
1. Kernel function maps low dimensional data
to high dimensional space
2. It's a similarity function
47 Suppose you are building a SVM model on Misclassification would Data will be correctly Can't say None of these a
data X. The data X can be error prone happen classified
which means that you should not trust
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its
hyperparameters. What would happen when you
use very small C (C~0)?
48 Suppose you are using a Linear SVM yes no a svm.jpg
classifier with 2 class classification
problem. Now you have been given the
following data in which some points are
circled red that are representing support
vectors. If you remove any
one of the red points from the data, will the
decision boundary change?
49 If you remove the non-red circled points TRUE FALSE b svm.jpg
from the data, the decision boundary will
change?
50 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above a
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
51 Suppose you are building a SVM model on We can still classify data We can not classify data Can't Say None of these a
data X. The data X can be error prone correctly for given setting of correctly for given setting
which means that you should not trust hyper parameter C of hyper parameter C
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its
hyperparameters. What would happen when you
use a very large value of C (C -> infinity)?
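The role of C in questions 47 and 51 is easier to see from the standard soft-margin SVM objective, sketched below. The symbols are assumptions not present in the question bank: w and b define the hyperplane, \phi is the feature map induced by the kernel, and \xi_i are the per-sample slack variables.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i\,\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,
\qquad \xi_i \ge 0,\quad i = 1,\dots,n .
```

With C close to 0 the slack term costs almost nothing, so margin violations and misclassifications are tolerated; as C grows without bound any violation becomes prohibitively expensive and the problem approaches the hard-margin classifier.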
52 SVM can solve linear and non- TRUE FALSE a
linear problems
53 The objective of the support vector TRUE FALSE a
machine algorithm is to find a hyperplane
in an N-dimensional space (N is the
number of features) that distinctly
classifies the data points.
54 Hyperplanes are _____________boundaries usual decision parallel b
that help classify the data points.
55 The _____of the hyperplane depends upon dimension classification reduction a
the number of features.
56 Hyperplanes are decision boundaries that TRUE FALSE a
help classify the data points.
57 SVM algorithms use a set of TRUE FALSE a
mathematical functions that are defined
as the kernel.
58 In SVM, Kernel function is used to map a TRUE FALSE a
lower dimensional data into a higher
dimensional data.
59 In SVR we try to fit the error within a TRUE FALSE a
certain threshold.
60 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above a
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
61 How do you handle missing or corrupted a. Drop missing rows or b. Replace missing values c. Assign a unique d. All of the above d
data in a dataset? columns with mean/median/mode category to missing values
62 What is the purpose of performing cross- a. To assess the predictive b. To judge how the trained c. Both A and B c
validation? performance of the models model performs outside the
sample on test data
63 Which of the following is true about Naive a. Assumes that all the b. Assumes that all the c. Both A and B d. None of the above option c
Bayes ? features in a dataset are features in a dataset are
equally important independent
64 Which of the following statements about A. Attributes are B. Attributes are C. Attributes are D. Attributes can b
Naive Bayes is incorrect? equally important. statistically dependent of statistically independent of be nominal or numeric
one another given the class one another given the
value. class value.
65 Which of the following PCA Decision Tree Naive Bayesian Linear regression a
is not supervised learning?
66 How can you avoid overfitting ? By using a lot of data By using inductive machine By using validation only None of above A --
learning
67 What are the popular algorithms of Decision Trees and Neural Probabilistic networks and Support vector machines All D --
Machine Learning? Networks (back Nearest Neighbor
propagation)
68 What is "Training set"? Training set is used to test A set of data is used to Both A & B None of above B --
the accuracy of the discover the potentially
hypotheses generated by the predictive relationship.
learner.
69 Identify the various approaches for Concept Vs Classification Symbolic Vs Statistical Inductive Vs Analytical All above D --
machine learning. Learning Learning Learning
70 What is the function of "Unsupervised Find clusters of the data and Find interesting directions Interesting coordinates All D --
Learning"? find low-dimensional in data and find novel and correlations
representations of the data observations/ database
cleaning
71 What are the two methods used for the Platt Calibration and Isotonic Statistics and A --
calibration in Supervised Learning? Regression Informal Retrieval
72 ______can be adopted when it's necessary Supervised Semi-supervised Reinforcement Clusters B --
to categorize a large amount of data with
a few complete examples or when there's
the need to impose some constraints to a
clustering algorithm.
73 In reinforcement learning, this feedback is Overfitting Overlearning Reward None of above C --
usually called as___.
74 In the last decade, many researchers Deep learning Machine learning Reinforcement learning Unsupervised learning A --
started training bigger and bigger models,
built with several different layers that's
why this approach is called_____.
75 There's a growing interest in pattern Regression Accuracy Modelfree Scalable C --
recognition and associative memories
whose structure and functioning are
similar to what happens in the neocortex.
Such an approach also allows simpler
algorithms called _____
76 ______ showed better performance than Machine learning Deep learning Reinforcement learning Supervised learning B --
other approaches, even without a context-
based model
77 Common deep learning applications / Real-time visual object Classic approaches Automatic labeling Bio-inspired adaptive B --
problems can also be solved using____ identification systems
78 Some people are using the term ___ Inference Interference Accuracy None of above A --
instead of prediction only to avoid the
weird idea that machine learning is a sort
of modern magic.
79 The term _____ can be freely used, but Accuracy Cluster Regression Prediction D --
with the same meaning adopted in
physics or system theory.
80 If there is only a discrete number of Modelfree Categories Prediction None of above B --
possible outcomes, they are called _____.
81 A feature F1 can take certain value: A, B, Feature F1 is an example of Feature F1 is an example of It doesn't belong to any Both of these B --
C, D, E, & F and represents grade of nominal variable. ordinal variable. of the above category.
students from a college.
Which of the following statement is true in
following case?
82 What would you do in PCA to get the Transform data to zero mean Transform data to zero Not possible None of these A --
same projection as SVD? median
83 What is PCA, KPCA and ICA used for? Principal Components Kernel based Principal Independent Component All above D --
Analysis Component Analysis Analysis
84 Can a model trained for item based YES NO A --
similarity also choose from a given set of
items?
85 What are common feature selection correlation coefficient Greedy algorithms All above None of these C --
methods in regression task?
86 The parameter______ allows specifying test_size training_size All above None of these C --
the percentage of elements to put into the
test/training set
87 In many classification problems, the random_state dataset test_size All above B --
target ______ is made up of categorical
labels which cannot immediately be
processed by any algorithm.
88 _______adopts a dictionary-oriented LabelEncoder class LabelBinarizer class DictVectorizer FeatureHasher A --
approach, associating to each category
label a progressive integer number.
89 ________is much more difficult because it's Removing the whole line Creating sub-model to Using an automatic All above B --
necessary to determine a supervised predict those features strategy to input them
strategy to train a model for each feature according to the other
and, finally, to predict their value known values
90 It's possible to use a different regression classification random_state missing_values D --
placeholder through the
parameter_______.
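To make the placeholder idea in question 90 concrete, here is a plain-Python sketch of mean imputation where the marker for a missing entry is configurable. It is not the scikit-learn class itself; the function name is hypothetical, and the parameter is called `missing_values` only to mirror the answer. In recent scikit-learn releases the corresponding class is `SimpleImputer` in `sklearn.impute`.

```python
def impute_mean(column, missing_values=None):
    """Replace every entry equal to `missing_values` with the mean of the rest.

    Note: equality is used for the comparison, so a NaN placeholder would
    need math.isnan instead; this sketch assumes None or a sentinel number.
    """
    observed = [v for v in column if v != missing_values]
    mean = sum(observed) / len(observed)  # assumes at least one observed value
    return [mean if v == missing_values else v for v in column]


# Here -1 is used as the (hypothetical) placeholder for a missing value.
filled = impute_mean([1.0, -1, 3.0, -1, 5.0], missing_values=-1)
print(filled)  # [1.0, 3.0, 3.0, 3.0, 5.0]
```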
91 If you need a more powerful scaling RobustScaler DictVectorizer LabelBinarizer FeatureHasher A --
feature, with a superior control on outliers
and the possibility to select a quantile
range, there's also the class________.
92 scikit-learn also provides a class for per- max, l0 and l1 norms max, l1 and l2 norms max, l2 and l3 norms max, l3 and l4 norms B --
sample normalization, Normalizer. It can
apply________to each element of a dataset
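The three norms mentioned in question 92 can be illustrated on a single sample. The sketch below is written by hand, not taken from scikit-learn; the function name is hypothetical, and it assumes the sample is not the zero vector.

```python
import math


def normalize_sample(sample, norm="l2"):
    """Scale one sample (a list of numbers) to unit norm."""
    if norm == "l1":
        scale = sum(abs(v) for v in sample)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in sample))
    elif norm == "max":
        scale = max(abs(v) for v in sample)
    else:
        raise ValueError("norm must be 'l1', 'l2' or 'max'")
    return [v / scale for v in sample]


sample = [3.0, -4.0]
print(normalize_sample(sample, "l1"))   # divides by 7
print(normalize_sample(sample, "l2"))   # divides by 5
print(normalize_sample(sample, "max"))  # divides by 4
```

Unlike the scalers, which work per feature column, this normalization is applied to each row independently.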
93 There are also many univariate methods F-tests and p-values chi-square ANOVA All above A --
that can be used in order to select the
best features according to specific criteria
based on________.
94 Which of the following selects only a SelectPercentile FeatureHasher SelectKBest All above A --
subset of features belonging to a certain
percentile
95 ________performs a PCA with non-linearly SparsePCA KernelPCA SVD None of the Mentioned B --
separable data sets.
96 If two variables are correlated, is it Yes No B --
necessary that they have a linear
relationship?
97 Correlated variables can have zero TRUE FALSE A --
correlation coefficient. True or False?
98 Suppose we fit "Lasso Regression" to a It is more likely for X1 to be It is more likely for X1 to be Can't say None of these B --
data set, which has 100 features excluded from the model included in the model
(X1, X2, ..., X100). Now, we rescale one of
these features by multiplying it by 10 (say
that feature is X1), and then refit Lasso
regression with the same regularization
parameter. Now, which of the following
options will be correct?
99 If Linear regression model perfectly fits Test error is also always zero Test error is non zero Couldn't comment on Test error is equal to Train C --
i.e., train error is zero, then Test error error
_____________________
100 Which of the following metrics can be ii and iv i and ii ii, iii and iv i, ii, iii and iv D --
used for evaluating regression models?
i) R Squared
ii) Adjusted R Squared
iii) F Statistics
iv) RMSE / MSE / MAE
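For the error-based metrics in item iv, a minimal sketch in plain Python is given below. The function names and the example values are hypothetical; the definitions are the usual ones (mean of squared errors, its square root, and mean of absolute errors).

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```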
101 In syntax of linear model Matrix Vector Array List B --
lm(formula,data,..), data refers to ______
102 Linear Regression is a supervised TRUE FALSE A --
machine learning algorithm.
103 It is possible to design a Linear regression TRUE FALSE A --
algorithm using a neural network.
104 Which of the following methods do we Least Square Error Maximum Likelihood Logarithmic Loss Both A and B A --
use to find the best fit line for data in
Linear Regression?
105 Suppose you are training a linear Both are False 1 is False and 2 is True 1 is True and 2 is False Both are True C --
regression model. Now consider these
points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small.
Which of the above statement(s) are correct?
106 We can also compute the coefficient of 1 and 2 1 and 3. 2 and 3. 1,2 and 3. D --
linear regression with the help of an
analytical method called "Normal
Equation". Which of the following is/are
true about "Normal Equation"?
1. We don't have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate
107 Which of the following option is true The relationship is The relationship is not The relationship is not The relationship is D --
regarding "Regression" and symmetric between x and y symmetric between x and y symmetric between x and symmetric between x and y
"Correlation"? Note: y is dependent in both. in both. y in case of correlation but in case of correlation but in
variable and x is independent variable. in case of regression it is case of regression it is not
symmetric. symmetric.
108 In a simple linear regression model (One by 1 no change by intercept by its slope D --
independent variable), if we change the
input variable by 1 unit, how much will the output
variable change?
109 Generally, which of the following 1 and 2 only 1 only 2 None of these. B --
method(s) is used for predicting
continuous dependent variable?
1. Linear Regression
2. Logistic Regression
110 How many coefficients do you need to 1 2 3 4 B --
estimate in a simple linear regression
model (One independent variable)?
111 Suppose you are building a SVM model on We can still classify data We can not classify data Can't Say None of these A --
data X. The data X can be error prone correctly for given setting of correctly for given setting
which means that you should not trust hyper parameter C of hyper parameter C
any specific data point too much. Now
think that you want to build a SVM model
which has quadratic kernel function of
polynomial degree 2 that uses Slack
variable C as one of its
hyperparameters. What would happen when you
use a very large value of C (C -> infinity)?
112 SVM can solve linear and non- TRUE FALSE A --
linear problems
113 The objective of the support vector TRUE FALSE A --
machine algorithm is to find a hyperplane
in an N-dimensional space (N is the
number of features) that distinctly
classifies the data points.
114 Hyperplanes are _____________boundaries usual decision parallel B --
that help classify the data points.
115 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above A --
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data
116 SVM is a ------------------ learning Supervised Unsupervised Both None A --
117 The linear SVM classifier works by True FALSE A --
drawing a straight line between two
classes
118 In a real problem, you should check to see TRUE FALSE B --
if the SVM is separable and then include
slack variables if it is not separable.
119 Which of the following are real world Text and Hypertext Image Classification Clustering of News All of the above D --
applications of the SVM? Categorization Articles
120 The _____of the hyperplane depends upon dimension classification reduction A --
the number of features.
121 Hyperplanes are decision boundaries that TRUE FALSE A --
help classify the data points.
122 SVM algorithms use a set of TRUE FALSE A --
mathematical functions that are defined
as the kernel.
123 Naive Bayes classifiers are a collection Classification Clustering Regression All A --
------------------ of algorithms
124 In given image, P(H|E) Posterior Prior A bayes.jpg
is__________probability.
125 Solving a non linear separation problem True FALSE A
with a hard margin Kernelized SVM
(Gaussian RBF Kernel) might lead to
overfitting
126 100 people are at party. Given data gives TRUE FALSE A man.jpg
information about how many wear pink or
not, and if a man or not. Imagine a pink
wearing guest leaves, was it a man?
127 For the given weather data, Calculate 0.4 0.64 0.29 0.75 B weather
probability of playing data.jpg
128 In SVM, Kernel function is used to map a TRUE FALSE A --
lower dimensional data into a higher
dimensional data.
129 In SVR we try to fit the error within a TRUE FALSE A --
certain threshold.
130 When the C parameter is set to infinite, The optimal hyperplane if The soft-margin classifier None of the above A --
which of the following holds true? exists, will be the one that will separate the data
completely separates the
data

This sheet
is for 3
Mark
questions
S.r No Question a b c d Correct Image
Answer
e.g 1 Write down question Option a Option b Option c Option d a/b/c/d img.jpg
1 Which of the following is characteristic of fast accuracy scalable All above D
best machine learning method ?
2 What are the different Algorithm Supervised Learning and Unsupervised Learning and Both A & B None of the Mentioned C
techniques in Machine Learning? Semi-supervised Learning Transduction
3 ______can be adopted when it's necessary Supervised Semi-supervised Reinforcement Clusters B
to categorize a large amount of data with
a few complete examples or when there's
the need to impose some constraints to a
clustering algorithm.
4 In reinforcement learning, this feedback is Overfitting Overlearning Reward None of above C
usually called as___.
5 In the last decade, many researchers Deep learning Machine learning Reinforcement learning Unsupervised learning A
started training bigger and bigger models,
built with several different layers that's
why this approach is called_____.
6 What does learning exactly mean? Robots are programed so A set of data is used to Learning is the ability to It is a set of data is used to C
that they can perform the discover the potentially change according to discover the potentially
task based on data they predictive relationship. external stimuli and predictive relationship.
gather from sensors. remembering most of all
previous experiences.
7 When it is necessary to allow the model to Overfitting Overlearning Classification Regression A
develop a generalization ability and avoid
a common problem called______.
8 Techniques that involve the use of both Supervised Semi-supervised Unsupervised None of the above B
labeled and unlabeled data are called___.
9 There's a growing interest in pattern Regression Accuracy Modelfree Scalable C
recognition and associative memories
whose structure and functioning are
similar to what happens in the neocortex.
Such an approach also allows simpler
algorithms called _____
10 ______ showed better performance than Machine learning Deep learning Reinforcement learning Supervised learning B
other approaches, even without a context-
based model
11 Which of the following sentence is Machine learning relates Data mining can be defined Both A & B None of the above C --
correct? with the study, design and as the process in which the
development of the unstructured data tries to
algorithms that give extract knowledge or
computers the capability to unknown interesting
learn without being explicitly patterns.
programmed.
12 What is "Overfitting" in Machine when a statistical model Robots are programed so While involving the process a set of data is used to A --
learning? describes random error or that they can perform the of learning "overfitting" discover the potentially
noise instead of underlying task based on data they occurs. predictive relationship
relationship "overfitting" gather from sensors.
occurs.
13 What is "Test set"? Test set is used to test the It is a set of data is used to Both A & B None of above A --
accuracy of the hypotheses discover the potentially
generated by the learner. predictive relationship.
14 What is the function of "Supervised Classifications, Predict time Speech recognition, Both A & B None of above C --
Learning"? series, Annotate strings Regression
15 Common unsupervised applications Object segmentation Similarity detection Automatic labeling All above D --
include
16 Reinforcement learning is particularly the environment is not it's often very dynamic it's impossible to have a All above D --
efficient when______________. completely deterministic precise error measure
17 During the last few years, many ______ Logical Classical Classification None of above D --
algorithms have been applied to deep
neural networks to learn the best policy
for playing Atari video games and to teach
an agent how to associate the right action
with an input representing the state.
18 Common deep learning applications Image classification, Autonomous car driving, All above D --
include____ Real-time visual tracking Logistic optimization Bioinformatics,
Speech recognition
19 if there is only a discrete number of Regression Classification. Modelfree Categories B --
possible outcomes (called categories),
the process becomes a______.
20 Which of the following are supervised Spam detection, Image classification, Autonomous car driving, A --
learning applications Pattern detection, Real-time visual tracking Logistic optimization Bioinformatics,
Natural Language Speech recognition
Processing
21 Let's say, you are working with All categories of categorical Frequency distribution of Train and Test always have Both A and B D --
categorical feature(s) and you have not variable are not present in categories is different in same distribution.
looked at the distribution of the the test dataset. train as compared to the
categorical variable in the test data. test dataset.

You want to apply one hot encoding (OHE)

on the categorical feature(s). What
challenges you may face if you have
applied OHE on a categorical variable of
train dataset?
22 Which of the following sentence is FALSE It relates inputs to outputs. It is used for prediction. It may be used for It discovers causal D --
regarding regression? interpretation. relationships.
23 Which of the following method is used to k-Means Density-Based Spatial Spectral Clustering Find All above D --
find the optimal features for cluster Clustering clusters
analysis
24 scikit-learn also provides functions for make_classification() make_regression() make_blobs() All above D --
creating
dummy datasets from scratch:
25 _____which can accept a NumPy make_blobs random_state test_size training_size B --
RandomState generator or an integer
seed.
26 In many classification problems, the 1 2 3 4 B --
target dataset is made up of categorical
labels which cannot immediately be
processed by any algorithm. An encoding
is needed and scikit-learn offers at
least_____valid options
27 In which of the following each categorical LabelEncoder class DictVectorizer LabelBinarizer class FeatureHasher C --
label is first turned into a positive integer
and then transformed into a vector where
only one feature is 1 while all the others
are 0.
28 ______is the most drastic one and should Removing the whole line Creating sub-model to Using an automatic All above A --
be considered only when the dataset is predict those features strategy to input them
quite large, the number of missing according to the other
features is high, and any prediction could known values
be risky.
29 It's possible to specify if the scaling with_mean=True/False with_std=True/False Both A & B None of the Mentioned C --
process must include both mean and
standard deviation using the
parameters________.
30 Which of the following selects the best K SelectPercentile FeatureHasher SelectKBest All above C --
high-score features.
31 How does number of observations 1 and 4 2 and 3 1 and 3 None of these A --
influence overfitting? Choose the correct
answer(s). Note: Rest all parameters are
same. 1. In case of fewer observations, it is
easy to overfit the data. 2. In case of fewer
observations, it is hard to overfit the
data. 3. In case of more observations, it is
easy to overfit the data. 4. In case of more
observations, it is hard to overfit the data.
32 Suppose you have fitted a complex In case of very large lambda; In case of very large In case of very large In case of very large lambda; C --
regression model on a dataset. Now, you bias is low, variance is low lambda; bias is low, lambda; bias is high, bias is high, variance is high
are using Ridge regression with tuning variance is high variance is low
parameter lambda to reduce its
complexity. Choose the option(s) below
which describes relationship of bias and
variance with lambda.
33 What is/are true about ridge regression? 1. 1 and 3 1 and 4 2 and 3 2 and 4 A --
When lambda is 0, model works like linear
regression model 2. When lambda is 0,
model doesn't work like linear
regression model 3. When lambda goes to
infinity, we get very, very small coefficients
approaching 0 4. When lambda goes to
infinity, we get very, very large coefficients
approaching infinity
34 Which of the following method(s) does Ridge regression Lasso Both Ridge and Lasso None of both B --
not have closed form solution for its
coefficients?
35 Function used for linear regression in R lm(formula, data) lr(formula, data) lrm(formula, data) regression.linear(formula, A --
is __________ data)
36 In the mathematical Equation of Linear (X-intercept, Slope) (Slope, X-Intercept) (Y-Intercept, Slope) (slope, Y-Intercept) C --
Regression $Y = \beta_1 + \beta_2 X + \epsilon$, $(\beta_1, \beta_2)$
refers to __________
37 Suppose that we have N independent Relation between the X1 and Relation between the X1 Relation between the X1 Correlation can't judge the B --
variables (X1, X2, ..., Xn) and dependent Y is weak and Y is strong and Y is neutral relationship
variable is Y. Now Imagine that you are
applying linear regression by fitting the
best fit line using least square error on
this data. You found that correlation
coefficient for one of its variables (say
X1) with Y is -0.95. Which of the following
is true for X1?
38 We have been given a dataset with n Increase Decrease Remain constant Can't Say D --
records in which we have input attribute
as x and output attribute as y. Suppose we
use a linear regression method to model
this data. To test our linear regressor, we
split the data in training set and test set
randomly. Now we increase the training
set size gradually. As the training set size
increases, what do you expect will happen
with the mean training error?
39 We have been given a dataset with n Bias increases and Bias decreases and Bias decreases and Bias increases and Variance D --
records in which we have input attribute Variance increases Variance increases Variance decreases decreases
as x and output attribute as y. Suppose we
use a linear regression method to model
this data. To test our linear regressor, we
split the data in training set and test set
randomly. What do you expect will
happen with bias and variance as you
increase the size of training data?
40 Suppose, you got a situation where you 1 and 2 2 and 3 1 and 3 1, 2 and 3 A --
find that your linear regression model is
underfitting the data. In such situation
which of the following options would you
consider? 1. I will add more variables 2. I
will start introducing polynomial degree
variables 3. I will remove some variables
41 Problem: Players will play if weather is TRUE FALSE A weather
sunny. Is this statement correct? data.jpg
42 Multinomial Naïve Bayes Classifier is Continuous Discrete Binary B
___________distribution
43 For the given weather data, Calculate 0.4 0.64 0.36 0.5 C weather
probability of not playing data.jpg
44 Suppose you have trained an SVM with You want to increase your You want to decrease your You will try to calculate You will try to reduce the C --
linear decision boundary after training data points data points more variables features
SVM, you correctly infer that your SVM
model is underfitting. Which of the
following options would you be more likely to
consider iterating SVM next time?
45 The minimum time complexity for training Large datasets Small datasets Medium sized datasets Size does not matter A --
an SVM is O(n²). According to this fact,
what sizes of datasets are not best suited
for SVMs?
46 The effectiveness of an SVM depends Selection of Kernel Kernel Parameters Soft Margin Parameter C All of the above D --
upon:
47 What do you mean by generalization error How far the hyperplane is How accurately the SVM The threshold amount of B --
in terms of the SVM? from the support vectors can predict outcomes for error in an SVM
unseen data
48 What do you mean by a hard margin? The SVM allows very low The SVM allows high None of the above A --
error in classification amount of error in
classification
49 We usually use feature normalization 1 1 and 2 1 and 3 2 and 3 B --
before using the Gaussian kernel in SVM.
What is true about feature normalization?
1. We do feature normalization so that
new feature will dominate other 2. Sometimes,
feature normalization is not
feasible in case of categorical variables 3.
Feature normalization always helps when
we use Gaussian kernel in SVM
50 Support vectors are the data points that TRUE FALSE A --
lie closest to the decision surface.
51 Which of the following PCA Decision Tree Naive Bayesian Linear regression A --
is not supervised learning?
52 Suppose you are using RBF kernel in SVM The model would consider The model would consider The model would not be None of the above B --
with high Gamma value. What does this even far away points from only the points close to the affected by distance of
signify? hyperplane for modeling hyperplane for modeling points from hyperplane for
modeling
53 Gaussian Naïve Bayes Classifier is Continuous Discrete Binary A --
___________distribution
54 If I am using all features of my dataset Underfitting Nothing, the model is Overfitting C --
and I achieve 100% accuracy on my perfect
training set, but ~70% on validation set,
what should I look out for?
55 What is the purpose of performing cross- a. To assess the predictive b. To judge how the trained c. Both A and B C --
validation? performance of the models model performs outside the
sample on test data
56 Which of the following is true about Naive a. Assumes that all the b. Assumes that all the c. Both A and B d. None of the above option C --
Bayes ? features in a dataset are features in a dataset are
equally important independent
57 Suppose you are using a Linear SVM yes no A svm.jpg
classifier with 2 class classification
problem. Now you have been given the
following data in which some points are
circled red that are representing support
vectors. If you remove any one of the
red points from the data, will the
decision boundary change?
58 Linear SVMs have no hyperparameters TRUE FALSE B --
that need to be set by cross-validation
59 For the given weather data, what is the 0.5 0.26 0.73 0.6 D weather
probability that players will play if weather data.jpg
is sunny
60 100 people are at party. Given data gives 0.4 0.2 0.6 0.45 B man.jpg
information about how many wear pink or
not, and if a man or not. Imagine a pink
wearing guest leaves, what is the
probability of being a man
61 Problem: Players will play if weather is TRUE FALSE a weather
sunny. Is this statement correct? data.jpg
62 For the given weather data, Calculate 0.4 0.64 0.29 0.75 b weather
probability of playing data.jpg
63 For the given weather data, Calculate 0.4 0.64 0.36 0.5 c weather
probability of not playing data.jpg
64 For the given weather data, what is the 0.5 0.26 0.73 0.6 d weather
probability that players will play if weather data.jpg
is sunny
65 100 people are at party. Given data gives 0.4 0.2 0.6 0.45 b man.jpg
information about how many wear pink or
not, and if a man or not. Imagine a pink
wearing guest leaves, what is the
probability of being a man
66 100 people are at party. Given data gives TRUE FALSE a man.jpg
information about how many wear pink or
not, and if a man or not. Imagine a pink
wearing guest leaves, was it a man?
67 What do you mean by generalization error How far the hyperplane is How accurately the SVM The threshold amount of b
in terms of the SVM? from the support vectors can predict outcomes for error in an SVM
unseen data
68 What do you mean by a hard margin? The SVM allows very low The SVM allows high None of the above a
error in classification amount of error in
classification
69 The minimum time complexity for training Large datasets Small datasets Medium sized datasets Size does not matter a
an SVM is O(n²). According to this fact,
what sizes of datasets are not best suited
for SVMs?
70 The effectiveness of an SVM depends Selection of Kernel Kernel Parameters Soft Margin Parameter C All of the above d
upon:
71 Support vectors are the data points that TRUE FALSE a
lie closest to the decision surface.
72 SVMs are less effective when: The data is linearly separable The data is clean and ready The data is noisy and c
to use contains overlapping
points
73 Suppose you are using RBF kernel in SVM The model would consider The model would consider The model would not be None of the above b
with high Gamma value. What does this even far away points from only the points close to the affected by distance of
signify? hyperplane for modeling hyperplane for modeling points from hyperplane for
modeling
74 The cost parameter in the SVM means: The number of cross- The kernel to be used The tradeoff between None of the above c
validations to be made misclassification and
simplicity of the model
75 If I am using all features of my dataset Underfitting Nothing, the model is Overfitting c
and I achieve 100% accuracy on my perfect
training set, but ~70% on validation set,
what should I look out for?
76 Which of the following are real world Text and Hypertext Image Classification Clustering of News All of the above d
applications of the SVM? Categorization Articles
77 Suppose you have trained an SVM with You want to increase your You want to decrease your You will try to calculate You will try to reduce the c
linear decision boundary after training data points data points more variables features
SVM, you correctly infer that your SVM
model is underfitting. Which of the
following options would you be more likely to
consider iterating SVM next time?
78 We usually use feature normalization 1 1 and 2 1 and 3 2 and 3 b
before using the Gaussian kernel in SVM.
What is true about feature normalization?
1. We do feature normalization so that
new feature will dominate other 2. Sometimes,
feature normalization is not
feasible in case of categorical variables 3.
Feature normalization always helps when
we use Gaussian kernel in SVM
79 Linear SVMs have no hyperparameters TRUE FALSE b
that need to be set by cross-validation
80 In a real problem, you should check to see TRUE FALSE b
if the SVM is separable and then include
slack variables if it is not separable.
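The ridge questions in the sheet above (32 and 33) describe how the tuning parameter lambda affects the coefficients. A minimal sketch of that behaviour follows, using the closed-form ridge solution with NumPy only. The data, the helper name `ridge_coefficients`, and the omission of an intercept term are assumptions made for illustration.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution (no intercept): (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data, purely illustrative.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # ordinary least squares
w_zero = ridge_coefficients(X, y, 0.0)         # lambda = 0
w_huge = ridge_coefficients(X, y, 1e9)         # very large lambda

print("OLS:         ", w_ols)
print("lambda = 0:  ", w_zero)
print("lambda = 1e9:", w_huge)
```

With lambda set to 0 the ridge solution coincides with ordinary least squares, and with a very large lambda the coefficients shrink towards zero, which is the high-bias, low-variance regime the answer key points to.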
MCQs on Unit 1 (One mark questions)

1. A system which performs following tasks:

Taking Input , Processing input & Providing output, can be called as:
a. Adaptive System
b. Classic System- answer
c. Reinforced Learning
d. None of the above

2. A system which performs following tasks:

Taking Input , Processing input, Providing output & Tuning Parameters through Feedback
from environment can be termed as:
a. Adaptive System- answer
b. Classic System.
c. Non Adaptive System
d. None of the above

3. Classification and Regression techniques fall under the category of

a. Supervised Learning- answer
b. Unsupervised Learning
c. Semi-supervised Learning
d. Reinforcement Learning

4. Supervised Learning algorithms are accompanied by both Input and Expected Output?

a. True- answer
b. False

5. Linear Regression, Random Forest , SVM are examples of

a. Supervised Learning- answer

b. Unsupervised Learning
c. Semi-Supervised Learning
d. Reinforcement Learning

6. Decision Tree algorithm can work on

a. Only Categorical values

b. Only Continuous values
c. Both Categorical and Continuous values- answer
d. None of the above
7. If the input and output variables are continuous in nature, which technique is more preferred?

a. Regression- answer
b. Classification
c. Association Rule mining
d. All of these

8. k-NN algorithm does more computation on ‘test’ time rather than ‘train’ time.

a. True- answer
b. False

9. Which of the following distance metric can be used in k-NN?

a. Manhattan
b. Minkowski
c. Jaccard
d. Mahalanobis
e. All can be used- answer

10. Which of the following machine learning algorithm can be used for imputing missing values
of both categorical and continuous variables?

a. K-NN- answer
b. Linear Regression
c. Logistic Regression
d. Decision Tree

11. Which of the following algorithms is NOT an example of an ensemble learning algorithm?

a. Random Forest
b. Adaboost
c. Gradient Boosting
d. Decision Trees- answer

12. Spam detection, pattern detection, NLP are examples of

a. Semi-supervised learning.
b. Supervised Learning- answer
c. Unsupervised Learning
d. All of these

13. Clustering technique & Association rule mining are examples of

a. Supervised Learning
b. Semi-supervised Learning
c. Unsupervised Learning- answer
d. Reinforcement Learning

14. Unsupervised Learning algorithms are accompanied by both Input and Expected Output?

a. True
b. False (Only Input) - answer

15. K-Means technique is an example of

a. Clustering- answer
b. Classification
c. Regression
d. Association

16. Which of the following is/are types of clustering

a. Centroid-based Clustering
b. Density-based Clustering
c. Hierarchical Clustering
d. All of the above- answer

17. Learning algorithms that use both labelled and unlabelled data can be categorised as

a. Supervised Algorithms
b. Unsupervised Algorithms
c. Semi-supervised Algorithms- answer
d. Reinforcement Learning

18. Reinforcement learning is particularly efficient when the environment is NOT completely deterministic

a. True- answer
b. False

19. When the number of output classes is greater than one, which is / are the possible strategy
used to handle them

a. One-vs-All
b. One-vs-One
c. Both of them- answer
d. None of the above
20. In One-vs-All strategy how many classifiers are trained for n classes

a. 1
b. n- answer
c. n/2
d. None of the above

21. In One-vs-One strategy how many classifiers are trained for n classes

a. 1
b. n
c. n*(n-1)/2- answer
d. n/2

22. When the model isn't able to capture the dynamics shown by the same training set, such a
situation is called

a. Underfitting- answer
b. Overfitting
c. Normal fitting
d. Regularization

23. When the model can associate almost perfectly all the known samples to the corresponding
output values, but when an unknown input is presented, the corresponding prediction error
can be very high, such situation is called as

a. Underfitting
b. Overfitting- answer
c. Normal fitting
d. None of these

24. The formula given below is to calculate_____________

a. Posterior Probability in Naïve Bayes Classifier- answer

b. Prior Probability in Naïve Bayes Classifier
c. Entropy in Decision Tree classifier
d. None of the above

25. The following formula is used to calculate __________

a. Information Gain
b. Entropy- answer
c. Probability of an event
d. None of the above

26. Which algorithm is not a type of Parametric Learning?

a. Logistic Regression
b. Naïve Bayes
c. K-Nearest Neighbors- answer
d. Simple Neural Networks

27. What is Machine learning?

a. The autonomous acquisition of knowledge through the use of computer programs- answer
b. The autonomous acquisition of knowledge through the use of manual programs
c. The selective acquisition of knowledge through the use of computer programs
d. The selective acquisition of knowledge through the use of manual programs

28. Which of the following is not a factor that affects the performance of a learner system?

a. Representation scheme used

b. Training scenario
c. Type of feedback
d. Good data structures- answer

29. Which system is based on static or permanent structures?

a. Adaptive system
b. Non-adaptive system- answer
c. Both
d. None of the above

30. Which is not a type of supervised learning algorithm?

a. K-Nearest Neighbor
b. Decision Tree
c. K-means- answer
d. Linear Regression

31. From following, which are the approaches to Machine Learning?

a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. All of the above- answer

32. In which type of Learning, both features and labels are given to an algorithm?

a. Supervised Learning- answer

b. Unsupervised Learning
c. Reinforcement Learning
d. None of the above

33. In which type of learning, the algorithm maps input variable to output variable?

a. Supervised Learning- answer

b. Unsupervised Learning
c. Reinforcement Learning
d. None of the above

34. Which is not a type of Supervised Learning?

a. Classification
b. Regression
c. Clustering- answer
d. None of the above

35. Which approach should be used for e-mail spam filtering?

a. Classification- answer
b. Clustering
c. Regression
d. Association

36. Which approach should be used to predict sales of a supermarket?

a. Classification
b. Clustering
c. Regression- answer
d. Association

37. In which learning technique, the system discovers patterns from dataset?

a. Supervised Learning
b. Unsupervised Learning- answer
c. Reinforcement Learning
d. None of the above
38. In which type of learning, the problem can be solved without knowing labels?

a. Supervised Learning
b. Unsupervised Learning- answer
c. All of the above
d. None of the above

39. Which type of problem discovers groups of data based on similarities?

a. Clustering- answer
b. Association
c. Regression
d. None of the above

40. Which type of problem discovers rules to describe large data?

a. Clustering
b. Association- answer
c. Regression
d. None of the above

41. From the following, which is best suited to build a game of chess?

a. Supervised Learning
b. Unsupervised Learning
c. Deep Learning- answer
d. None of the above

42. In which type, rewards and punishments are given as a feedback?

a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning- answer
d. None of the above

43. Which approach should be used for automatic labelling?

a. Supervised Learning
b. Unsupervised Learning- answer
c. Reinforcement Learning
d. None of the above

44. From the options, which application you should solve by deep learning for the best
performance?

a. Spam filtering
b. Image classification- answer
c. Sales prediction
d. Automatic labelling
45. A neural network model is said to be inspired by the human brain. Which of the following
statement(s) correctly represents a real neuron?

a. A neuron has a single input and a single output only

b. A neuron has multiple inputs but a single output only
c. A neuron has a single input but multiple outputs
d. All of the above statements are valid- answer

46. What is unsupervised learning?

a. Features of group explicitly stated
b. Number of groups may be known
c. Neither features nor number of groups is known- answer
d. None of the mentioned

47. Which is not a correct statement with respect to Deep Learning?

a. Large computing power is required

b. Less complex than machine learning- answer
c. Difficulty in interpreting the resulting models
d. Requires large amount of labelled data

48. Which algorithm is not a type of Non-parametric learning?

a. Naïve bayes- answer

b. C4.5
c. K-Nearest Neighbor
d. Support Vector Machines

49. In which type, the training data is modelled very well?

a. Underfitting
b. Overfitting- answer
c. Both
d. Not a and b

50. Which model gives poor performance on training data?

a. Underfitting- answer
b. Overfitting
c. Both
d. None of the above
Unit 1: Two marks questions
1. The goal(s) of the supervised learning system is (are) ___________
a. Training a system that must also work with samples never seen before.
b. To allow the model to develop a generalization ability and avoid a common problem
called over fitting
c. Supervisor: to provide the agent with a precise measure of its error
d. All of the above- answer

2. Identify the type of model for the given problem

a. Reinforcement learning
b. Supervised learning- answer
c. Unsupervised learning
d. Semi-supervised learning

3. The goal (s) of Classification techniques is (are) __________

a. Try to find the best separating hyperplane (in this case, it's a linear problem).
b. Reduce the number of misclassifications
c. Increasing the noise-robustness
d. All of these- answer

4. Let D be a training set of n samples, each represented by a vector X of m features, X = (x1, x2, x3, …, xm). Consider c classes: C1, C2, …, Cc.
Bayesian classifier predicts that tuple X belongs to class Ci iff.
a. P(Ci|X) > P(Cj|X) for 1 <= j <= c, j != i. Thus we maximize P(Ci|X) - answer
b. P(Ci|X) < P(Cj|X) for 1 <= j <= c, j != i. Thus we maximize P(Ci|X)
c. P(Ci|X) > P(Cj|X) for 1 <= j <= c, j != i. Thus we maximize P(Cj|X)
d. None of the above
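The decision rule in question 4 can be written compactly. The block below is a standard statement of the maximum a posteriori rule via Bayes' theorem, offered as a sketch. The symbol $\hat{C}$ for the predicted class and the factorised likelihood in the last line (the naive independence assumption) are notation introduced here, not taken from the question.

```latex
\hat{C} \;=\; \arg\max_{1 \le i \le c} P(C_i \mid X)
        \;=\; \arg\max_{1 \le i \le c} \frac{P(X \mid C_i)\, P(C_i)}{P(X)}
        \;=\; \arg\max_{1 \le i \le c} P(X \mid C_i)\, P(C_i),
\qquad
P(X \mid C_i) \;\approx\; \prod_{k=1}^{m} P(x_k \mid C_i).
```

The denominator P(X) is the same for every class, so it can be dropped when comparing posteriors.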

5. The problem of high variance and low bias is called__________

a. Over-fitting- answer
b. Underfitting
c. Normal fitting
d. Best fitting

6. Identify the type of Machine learning approach to solve the given problems:
Decision Support System to predict the decision to play Match or not to play
a. Reinforcement learning
b. Supervised learning- answer
c. Unsupervised learning
d. Semi-supervised learning

7. Identify the type of Machine learning approach to solve the given problems:
Grouping of documents retrieved by Google Search Engine
a. Reinforcement learning
b. Supervised learning
c. Unsupervised learning- answer
d. Semi-supervised learning

8. Identify the type of Machine learning approach to solve the given problems:

System to predict price of product in next year

a. Reinforcement learning
b. Supervised learning- answer
c. Unsupervised learning
d. Semi-supervised learning

9. Identify the type of Machine learning approach to solve the given problems:
System to predict the suitable treatment
a. Reinforcement learning
b. Supervised learning
c. Unsupervised learning
d. Semi-supervised learning

10. Identify the type of Machine learning approach to solve the given problems:
System for Driverless Car
a. Reinforcement learning- answer
b. Supervised learning
c. Unsupervised learning
d. Semi-supervised learning

11. Which is true for AI, ML and DL

a. AI>ML>DL- answer
b. DL>ML>AI
c. ML>AI>DL
d. DL>ML>AI
MCQs on Unit 2: Feature selection (Two marks)

1. For creating Training and Test datasets which statements are true?
a. Both datasets must reflect the original distribution
b. The original dataset must be randomly shuffled before the split phase in order to avoid
correlation between consequent elements
c. Both a and b - answer
d. None of the above

2. SK-Learn provides which function to create train and test data:

a. train_test_split- answer
b. test_train_split
c. TestTrainSplit
d. Split_test_train
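A minimal sketch of `train_test_split` from `sklearn.model_selection`, run on a small made-up array. The data and the choice of `random_state=0` are illustrative assumptions. When neither `test_size` nor `train_size` is given, the function holds out 25% of the samples for testing.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy dataset: 20 samples, 2 features, binary labels (illustrative only).
X = np.arange(40).reshape(20, 2)
y = np.arange(20) % 2

# The samples are shuffled before splitting; random_state makes the
# shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

print(X_train.shape, X_test.shape)
```

Passing `stratify=y` is one way to keep the class proportions of the original dataset in both parts.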

3. In scikit-learn LabelEncoder class:

a. Adopts a dictionary-oriented approach,
b. Associating to each category label a progressive integer number,
c. That is an index of an instance array called classes_

d. All of the above- answer
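The behaviour described in question 3 can be seen in a short sketch. The label values below are made up for illustration.

```python
from sklearn.preprocessing import LabelEncoder

labels = ["Male", "Female", "Female", "Male"]  # hypothetical categorical labels

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ holds the sorted unique labels; each encoded value is an index into it.
classes = list(encoder.classes_)
decoded = list(encoder.inverse_transform(encoded))

print(classes, list(encoded), decoded)
```

Because the integers are only indices into `classes_`, they carry no numeric meaning. That is why a one-hot representation such as `LabelBinarizer` is often preferred for nominal labels.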

4. Scikit-learn class Imputer fills the holes using a strategy based on the:
a. mean
b. median
c. frequency (the most frequent entry)
d. All of the above- answer

5. Consider 3 dimensional dataset given below

x y z
1 NaN 2
2 3 NaN
-1 4 2
SK-Learn Imputer mean strategy will fill missing values with
a. 3, 2
b. 4, 2
c. 3.5, 2 - answer
d. Difficult to tell
6. Consider 3 dimensional dataset given below
x y z
1 NaN 2
2 3 NaN
-1 4 2
SK-Learn Imputer median strategy will fill missing values with
a. 3, 2
b. 4, 2
c. 3.5, 2 - answer
d. Difficult to tell

7. Consider 3 dimensional dataset given below

x y z
1 NaN 2
2 3 NaN
-1 4 2
SK-Learn Imputer most_frequent strategy will fill missing values with
a. 3, 2- answer
b. 4, 2
c. 3.5, 2
d. Difficult to tell
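Questions 5 to 7 can be reproduced in code. The `Imputer` class named in the questions was removed from recent scikit-learn releases, so this sketch assumes its replacement, `sklearn.impute.SimpleImputer`, which offers the same three strategies. The dictionary name `filled` is introduced here for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# The x, y, z dataset from the questions, with NaN marking the missing cells.
data = np.array([[1.0, np.nan, 2.0],
                 [2.0, 3.0, np.nan],
                 [-1.0, 4.0, 2.0]])

filled = {}
for strategy in ("mean", "median", "most_frequent"):
    result = SimpleImputer(strategy=strategy).fit_transform(data)
    # Missing cells: row 0 of column y, and row 1 of column z.
    filled[strategy] = (result[0, 1], result[1, 2])

print(filled)
```

Column y has only the two known values 3 and 4, so its mean and median are both 3.5. For `most_frequent`, a tie is resolved in favour of the smallest value, which gives 3.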

8. Which statement(s) is (are) true for SK-Learn MinMaxScaler ?

a. Works well for cases when the distribution is not Gaussian
b. Works well when the standard deviation is very small
c. It is sensitive to outliers
d. All of these- answer

9. _________________ uses the interquartile range , which makes it robust to outliers.

a. MinMaxScaler
b. Standard Scaler
c. Robust Scaler- answer
d. None of these

10. Consider Q1=31 and Q3=119. The inter quartile range (IQR) will be______
a. 88 - answer
b. -88
c. 150
d. -150
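The arithmetic in question 10, and the idea behind robust scaling in question 9, can be sketched with NumPy. The five sample values are made up so that the quartiles come out as Q1 = 31 and Q3 = 119. The manual formula (x - median) / IQR mirrors what scikit-learn's `RobustScaler` does with its default settings.

```python
import numpy as np

# Hypothetical sample chosen so that Q1 = 31 and Q3 = 119.
values = np.array([10.0, 31.0, 75.0, 119.0, 200.0])

q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1  # interquartile range

# Robust scaling: centre on the median, divide by the IQR.
robust_scaled = (values - median) / iqr

print(q1, q3, iqr)
print(robust_scaled)
```

Only the quartiles enter the scaling, so changing the extreme value 200 to something far larger leaves the scaled values of the other points unchanged.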
MCQs on unit 2 (One mark question)
1) Which of the following contains train_test_split() function

A) sklearn.feature_extraction
B) sklearn.preprocessing
C) sklearn.model_selection- answer
D) sklearn.decomposition

2) Default value of test_size in train_test_split() when both test_size and train_size are none

A) 0.33
B) 0.25 - answer
C) 0.50
D) 0.20

3) The LabelEncoder class, adopts which approach?

A) Dictionary-oriented- answer
B) List-oriented
C) Tree-oriented
D) Map-oriented

4) FeatureHasher class in scikit-learn adopts which hashing technique:

A) SHA256
B) MD5
C) MurmurHash 3- answer
D) BLAKE3

5) Which of the following is best option to handle missing data?

A) Removing the whole line

B) Creating sub-model to predict those features
C) Using an automatic strategy to input them according to the other known values- answer
D) Inserting random values

6) When performing regression or classification, which of the following is the correct way to
preprocess the data?

A) Normalize the data → PCA → training - answer

B) PCA → normalize PCA output → training
C) Normalize the data → PCA → normalize PCA output → training
D) None of the above
7) What is pca.components_ in Sklearn?

A) Set of all eigen vectors for the projection space - answer

B) Matrix of principal components
C) Result of the multiplication matrix
D) None of the above options
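A short sketch of what `pca.components_` holds, using synthetic data. The data and the choice of two components are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data with very different spread along each of three axes.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.1])

pca = PCA(n_components=2).fit(X)

# One row per retained component; each row is a unit-length direction
# (an eigenvector of the data covariance matrix) in the original feature space.
components = pca.components_
gram = components @ components.T

print(components.shape)
print(pca.explained_variance_)
```

The rows are mutually orthogonal and sorted by decreasing explained variance, so projecting onto them keeps the directions of largest total variance.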

8) How do you handle missing or corrupted data in a dataset?

A) Drop missing rows or columns

B) Replace missing values with mean/median/mode
C) Assign a unique category to missing values
D) All of the above - answer

9) The class KernelPCA performs a PCA with which kind of data sets?

A) non-linearly separable data sets - answer
B) linearly separable data sets
C) categorical data sets
D) Heterogeneous data sets

10) Principal component analysis is a method to select only a subset of features which contain
the largest amount of?

A) Total covariance
B) Total variance - answer
C) Total count
D) Mean

11) In the following loss function which parameter controls the level of sparsity?

A) xi
B) c - answer
C) D
D) αi

12) Which parameter determines the number of atoms in scikit-learn DictionaryLearning class?

A) alpha
B) n_jobs
C) n_components - answer
D) tol

13) In KernelPCA the default value for gamma is?

A) 1.0/number of features - answer

B) 2.0/number of features
C) 10/number of features
D) None of above

14) Non negative matrix factorization algorithm optimizes a loss function based on?

A) L1 Norm
B) Frobenius norm - answer
C) linalgnorm
D) matrix norm

15) Which of the following encoding technique is efficient to deal with large number of possible
categories?

A) Effect Encoding
B) Feature Hashing
C) One Hot Encoding
D) Bin counting scheme - answer

16) Which scaling technique scales data without being affected by outliers?

A) Robust Scaling - answer

B) Min Max Scaling
C) Standardized Scaling
D) Z-score Scaling

17) Which feature selection technique uses a recursive approach?

A) Filter Methods
B) Wrapper Methods - answer
C) Embedded Methods
D) Subset Methods

18) From the following which can be applied on dataset with more than one dimension?

A) Mean
B) Standard Deviation
C) Covariance - answer
D) Variance

19) In principal component analysis the sparse loadings can be obtained by imposing which
constraint on regression coefficients:

A) Ridge
B) Lasso - answer
C) Linear
D) Logistic

20) What provides better statistical regularization?

A) Sparse PCA - answer

B) Kernel PCA
C) Non-negative Matrix Factorization
D) Atom Extraction

21) Eigen vector with ____ Eigen value is the principal component of dataset.

A) Lowest
B) Highest - answer
C) Mean
D) Zero
22) Trace is equal to the ___ of the Eigen values.

A) Difference
B) Sum - answer
C) Product
D) Mean

23) In which scaling technique can the upper and lower bounds be specified by the user?

A) Robust Scaling
B) Min Max Scaling - answer
C) Standardized Scaling
D) Z-score Scaling

24) Principal component analysis (PCA) can be used with variables of any mathematical types:
quantitative, qualitative, or a mixture of these types.

A) True
B) False - answer

25) Variances and covariances can be computed for variables of any mathematical types:
quantitative, qualitative, or a mixture of these types.

A) True
B) False - answer
Unit- 3: Regression (One mark)

1. A process by which we estimate the value of dependent variable on the basis of one or more
independent variables is called:
a. Correlation
b. Regression - answer
c. Residual
d. Slope
2. All data points falling along a straight line is called:
a. Linear relationship - answer
b. Non linear relationship
c. Residual
d. Scatter diagram
3. A relationship where the flow of the data points is best represented by a curve is called:
a. Linear relationship
b. Nonlinear relationship - answer
c. Linear positive
d. Linear negative

4. The value we would predict for the dependent variable when the independent variables are all
equal to zero is called:
(a) Slope
(b) Sum of residual
(c) Intercept - answer
(d) Difficult to tell

5. The predicted rate of response of the dependent variable to changes in the independent variable is
called:
(a) Slope - answer
(b) Intercept
(c) Error
(d) Regression equation
6. The slope of the regression line of Y on X is also called the:
(a) Correlation coefficient of X on Y
(b) Correlation coefficient of Y on X
(c) Regression coefficient of X on Y
(d) Regression coefficient of Y on X - answer
8. In simple linear regression, the numbers of unknown constants are:
(a) One
(b) Two - answer
(c) Three
(d) Four
9. In simple regression equation, the numbers of variables involved are:
(a) 0
(b) 1
(c) 2 - answer
(d) 3
10. If the value of any regression coefficient is zero, then two variables are:
(a) Qualitative
(b) Correlation
(c) Dependent
(d) Independent - answer
11. In SK-Learn Linear Regression offers two instance variables, __________ and ____________
a) intercept_ and coef_ - answer
b) Intercept and coef
c) Slope and Intercept
d) Slope and Coef
12. _________ regression imposes an additional shrinkage penalty to the ordinary least squares loss
function to limit its squared L2 norm:

a) Lasso
b) LassoCV
c) Ridge - answer
d) ElasticNet
13. _____________ regressor imposes a penalty on the L1 norm of w to determine a potentially
higher number of null coefficients:

a) Lasso - answer
b) RidgeCV
c) Ridge
d) ElasticNet
14. A Regression approach to avoid the problem of outliers is offered by _______________
a) Linear Regression
b) Logistic Regression
c) RANSAC Regressor - answer
d) Polynomial Regressor

15. Model with high variance and low bias is called_________________

a) Over-fitted model - answer
b) Under-fitted model
c) Best fitted
d) None of the above

16. ________ occurs when our model neither fits the training data nor generalizes on the new data.
a) Over-fitting
b) Under-fitting - answer
c) Best fitting
d) None of the above

17. ________________ is the process of adding information in order to solve an ill-posed problem
or to prevent overfitting

a) Under-fitting
b) Regularization - answer
c) Best fitting
d) None of the above

18. ____________ selects only some features while reducing the coefficients of the others to zero.
This property is known as feature selection

a) Lasso - answer
b) RidgeCV
c) Ridge
d) ElasticNet
19. ______ combines both Lasso and Ridge Regression into one model with two penalty factors, one
proportional to L1 norm and other proportional to L2 norm.
a) LassoCV
b) RidgeCV
c) ElasticNet - answer
d) None of the above
20. ____________minimizes the cost function by gradually updating the weight values.

a. Gradient Descent - answer

b. Perceptron
c. Grid search
d. None of the above
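
The update rule behind question 20 can be sketched as follows. This is a minimal illustration, assuming a one-feature linear model with a mean-squared-error cost; the data, learning rate, and iteration count are illustrative choices and not part of the quiz.

```python
# Batch gradient descent for y ≈ w*x + b with a mean-squared-error cost.
# Data, learning rate and iteration count are illustrative assumptions.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]  # generated from w = 2, b = 1

w, b = 0.0, 0.0
learning_rate = 0.05
n = len(xs)

for _ in range(2000):
    # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
    grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
    grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
    # Move each weight a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 4), round(b, 4))
```

With a small enough learning rate the weights drift toward the values that generated the data; too large a rate would make the updates diverge.
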
21. _______ is a technique that allows using linear models even when the dataset has strong non-
linearities. The idea is to add some extra variables computed from the existing ones, using (in
this case) only polynomial combinations.

a) Linear Regression
b) Logistic Regression
c) RANSAC Regressor
d) Polynomial Regressor - answer
22. The Regression technique that uses sigmoid function is called________________
a) Linear Regression
b) Logistic Regression - answer
c) RANSAC Regressor
d) Polynomial Regressor

23. Confusion Matrix can be used to measure the performance of _______________ model.
a) Linear Regression
b) Logistic Regression - answer
c) RANSAC Regressor
d) Polynomial Regressor
24. The residual is defined as the difference between the:
a) actual value of y and the estimated value of y - answer
b) actual value of x and the estimated value of x
c) actual value of y and the estimated value of x
d) actual value of x and the estimated value of y

25)Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Answer:(A)

26)True- False: Overfitting is more likely when you have a huge amount of data to train.
A) TRUE
B) FALSE
Solution: (B)

27) What will happen when you apply very large penalty in the case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)

28) Generally, which of the following method(s) is used for predicting continuous dependent
variable?
1. Linear Regression 2. Logistic Regression
A) 1 and 2
B)only 1
C)only 2
D)None of these
Solution:(B)

29)Full form of ROC is

A)Regression Operation Characteristics Curve
B)Receiver Operating Characteristics Curve
C)Regression Operating Characteristics Curve
D)Ridge Operation Characteristics Curve
Solution:(B)

30.F score is given by :

A)F=2*(precision+recall)/precision*recall
B)F=(precision+recall)/precision*recall
C)F=2*(precision*recall)/(precision+recall)
D)F=precision+recall
Solution:(C)
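
The harmonic-mean form, F = 2*(precision*recall)/(precision+recall), can be checked numerically. The sketch below uses hypothetical precision and recall values chosen only for illustration; the function name is an assumption.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # convention used here to avoid dividing by zero
    return 2 * (precision * recall) / (precision + recall)


# Hypothetical values, not taken from the quiz.
f1 = f_score(0.5, 1.0)
print(f1)
```

Because it is a harmonic mean, the score is pulled toward the smaller of the two inputs, which is why a model with perfect recall but poor precision still scores modestly.
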

31)Which is L1 regression
A)Lasso
B)Ridge
C)polynomial
D)Isotonic
Answer A

32)Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature
selection?
A) Ridge regression uses subset selection of features
B)Lasso regression uses subset selection of features
C)Both use subset selection of features
D)None of the above
Solution:(B)

33)SSE can never be

(A) larger than SST
(B) smaller than SST
(C)equal to 1
(D)equal to zero
Solution:(A)

34) Which of the following is correct about regularized regression?

a) Can help with bias trade-off
b) Cannot help with model selection
c) Cannot help with variance trade-off
d) All of the mentioned
Solution:(A)

35) Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
Solution: (A)

36) What do you expect will happen with bias and variance as you increase the size of training
data?
A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
Solution: (D)

37)The Pearson correlation between two variables is zero, but their values can still be related
to each other.
A) TRUE
B) FALSE
Solution: (A)

38) Which of the following statement(s) is / are true for Gradient Descent (GD) and Stochastic
Gradient Descent (SGD)?
1. In GD and SGD, you update a set of parameters in an iterative manner to minimize the
error function.
2. In SGD, you have to run through all the samples in your training set for a single update of
a parameter in each iteration.
3. In GD, you either use the entire data or a subset of training data to update a parameter in
each iteration.
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
Solution:(A)

39) When hypothesis tests and confidence limits are to be used, the residuals are assumed
to follow the __________distribution.
A) Formal
B) Mutual
C) Normal
D) Abnormal
Solution:(C)

40)The error due to simplistic assumptions made by the model in fitting the data is called as
A)variance
B)bias
C)MSE
D)none of these
Solution:(B)

41)ROC curves show the trade-off between which parameters

A)TPR and FPR
B)TNR And TPR
C)FPR and TNR
D)FPR and FNR
Solution:(A)
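
The two rates plotted on an ROC curve can be computed from a confusion matrix. The sketch below assumes binary labels (1 = positive, 0 = negative) and uses made-up labels and predictions; the function name is hypothetical.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Illustrative labels and predictions, not from the quiz.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(tpr, fpr)
```

Each classification threshold yields one (FPR, TPR) pair; sweeping the threshold traces out the curve.
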
42)The accuracy of the model can be measured by
A)The area above ROC curve
B)The area under ROC curve
C)All of the above
D)None of the above
Solution:(B)

43) Least square method calculates the best-fitting line for the observed data by minimizing the sum
of the squares of the _______ deviations.
a) Vertical
b) Horizontal
c) Both of these
d) None of these
Solution:(A)
Unit-3 (Two marks)
1. The regression line yhat = 3 + 2x has been fitted to the data points (4,8), (2,5), and (1,2). The
residual sum of squares will be:
a) 10
b) 15
c) 13
d) 22 - answer
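
The arithmetic for question 1 can be reproduced directly from the fitted line and the three data points given in the question; the variable names are illustrative.

```python
# Data points and fitted line yhat = 3 + 2x from the question.
points = [(4, 8), (2, 5), (1, 2)]


def predict(x):
    return 3 + 2 * x


# Residual = actual y minus predicted y.
residuals = [y - predict(x) for x, y in points]
rss = sum(r ** 2 for r in residuals)
print(residuals, rss)
```
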
2. Suppose you have trained a logistic regression classifier and, for a new example x, it outputs the
prediction h_θ(x) = 0.2. This means
a. Our estimate for P(y=1 | x)
b. Our estimate for P(y=0 | x) - answer
c. Our estimate for P(y=1 | x)
d. Our estimate for P(y=0 | x)

3. A regression analysis between sales (in $1000) and advertising (in $100) resulted in the following
least squares line: yhat = 75 +6x. This implies that if advertising is $800, then the predicted amount
of sales (in dollars) is:
a. $4875
b. $123,000 - answer
c. $487,500
d. $12,300
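
The unit conversion in question 3 is the step that is easy to miss: x is measured in hundreds of dollars and yhat in thousands. A short check, using only the numbers given in the question:

```python
advertising_dollars = 800

# x is expressed in units of $100, so $800 corresponds to x = 8.
x = advertising_dollars / 100

# The fitted line yhat = 75 + 6x gives sales in units of $1000.
sales_thousands = 75 + 6 * x
sales_dollars = sales_thousands * 1000
print(sales_dollars)
```
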
4. The value for SSE equals zero. This means that the coefficient of determination (r^2) must equal:
a. 0.0.
b. -1.0.
c. 2.3.
d. 1.0 - answer

5. Below equation shows the loss function of ____________________

a) Logistic Regression Model

b) Linear Regression Model - answer
c) Gaussian Naïve Bayes Model
d) Polynomial Model
6. For the given results of a recently conducted study on the correlation of the number of hours spent
driving with the risk of developing acute backache. The Intercept of the line is_______.

a) 12.58 - answer
b) 10.58
c) 11.85
d) 10.85

7. For the given results of a recently conducted study on the correlation of the number of hours spent
driving with the risk of developing acute backache. The slope of the line is_______.

a) 4.59 - answer
b) 10.58
c) 5.85
d) 10.85

8. For the given vectors of outputs, the mean squared error is ________.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

a) 0.45
b) 0.375 - answer
c) 0.56
d) None of the above
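
The mean squared error for the two vectors in question 8 can be computed by hand or with a few lines of plain Python:

```python
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Average of the squared differences between true and predicted values.
squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse = sum(squared_errors) / len(y_true)
print(squared_errors, mse)
```
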

9)The correct relationship between SST, SSR, and SSE is given by;
a) SSR = SST + SSE
b) SST = SSR + SSE
c) SSE = SSR – SST
d) all of the above
Solution:(B)
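
The decomposition SST = SSR + SSE holds for a least-squares line fitted with an intercept. The sketch below checks it numerically on made-up data, taking SSR as the regression (explained) sum of squares and SSE as the error (residual) sum of squares; the data values are assumptions for illustration only.

```python
# Illustrative data, not from the quiz.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares estimates for slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x
preds = [intercept + slope * x for x in xs]

sst = sum((y - mean_y) ** 2 for y in ys)             # total variation
ssr = sum((p - mean_y) ** 2 for p in preds)          # explained by the line
sse = sum((y - p) ** 2 for y, p in zip(ys, preds))   # left in the residuals
r_squared = 1 - sse / sst
print(sst, ssr + sse, r_squared)
```

The same quantities explain why SSE = 0 forces the coefficient of determination to 1.0, and why SSE cannot exceed SST for such a fit.
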

10)Stochastic gradient descent performs less computation per update than batch gradient descent.
A)True
B)False
Solution:(A)

11)A parameter that is external to model and whose value cannot be estimated from data is called as
A)Hyperparameter
B)Model Parameter
C)Outlier
D)Regularization constant
Solution:(A)

12)Which strategy is used for tuning hyperparameters

A)Gradient Descent
B)Feature Scaling
C)Regularization
D)Grid Search
Solution:(D)

13) Which is another term for true positive rate

A)precision
B)Recall
C)Specificity
D)Fscore
Solution:(B)

14)The most widely used metrics and tools to assess a classification model are:
A)Confusion matrix
B)Cost-sensitive accuracy
C)Area under the ROC curve
D)All of the above
Solution:(D)

15)Regularization term in ridge regression is

A) λ (sum of the absolute value of coefficients)
B) λ (sum of the square of coefficients)
C)λ square
D)None of these
Solution:(B)

16) In practice, Line of best fit or regression line is found when _____________
a) Sum of residuals (Σ(Y – h(X))) is minimum
b) Sum of the absolute value of residuals (Σ|Y-h(X)|) is maximum
c) Sum of the square of residuals ( Σ (Y-h(X))^2) is minimum
d) Sum of the square of residuals ( Σ (Y-h(X))^2) is maximum
Solution:(C)
Unit- 4 : Naïve Bayes and SVM
(one mark)
1. Naive bayes falls under which category-
a. Unsupervised classification learning
b. Supervised classification learning
c. Semi- supervised classification learning
d. Reinforcement learning
Ans - b
2. What machine learning task is the Naive Bayes algorithm used for?
a. dimensionality reduction
b. clustering
c. classification
d. regression
Ans - c
3. Naive Bayes assumption about data is-
a. input is independent, conditional on the output label.
b. input is dependent, conditional on the output label.
c. input is independent, not conditional on the output label.
d. input is dependent, not conditional on the output label.
Ans - a

4. Bayes rule:
a. P(A |B) = P(B|A) .P(B) / P(A)
b. P(A |B) = P(B|A) .P(A) / P(B)
c. P(A |B) = P(B|A) .P(A)
d. P(A |B) = P(B|A) .P(B)
Ans - b

5. Which is not a main type of naive bayes classifier -

a. Bernoulli naive bayes
b. Multinomial naive bayes
c. Gaussian naive bayes
d. Complement Naive bayes
Ans - d

6. Which type of naive bayes classifier is suited for imbalanced datasets -

a. Bernoulli naive bayes
b. Multinomial naive bayes
c. Gaussian naive bayes
d. Complement Naive bayes
Ans - d

7. Which type of naive bayes classifier is best suited for document classification problem -
a. Bernoulli naive bayes
b. Multinomial naive bayes
c. Gaussian naive bayes
d. Complement Naive bayes
Ans - b

8. Which type of naive bayes classifier is usually used for yes/no type boolean predictors -
a. Bernoulli naive bayes
b. Multinomial naive bayes
c. Gaussian naive bayes
d. Complement Naive bayes
Ans - a

9. Naive Bayes is termed as 'Naive' because it assumes-

a. Dependence between every pair of feature in the data.
b. It is multiclass classifier
c. It is not multiclass classifier
d. Independence between every pair of feature in the data.
Ans- d
10. SVM Classifiers and Linear Classifiers are strictly:
a. Probabilistic Binary Linear Classifier
b. Probabilistic Multiclass classifier
c. Non Probabilistic Binary Linear Classifier
d. Non Probabilistic Multiclass classifier
Ans - c
11. SVM falls under which category-
a. Unsupervised classification learning
b. Supervised classification learning
c. Semi- supervised classification learning
d. Reinforcement learning
Ans - b
12. The effectiveness of an SVM depends upon:
a. Selection of Kernel
b. Kernel Parameters
c. Soft Margin Parameter C
d. All of the above
Ans- D
9. Which of the following is true about Naive Bayes?
a. Assumes that all the features in a dataset are equally important
b. Assumes that all the features in a dataset are independent
c. Both A and B - answer
d. None of the above options

(Two marks)
1. One marble jar has several different colored marbles inside of it. It has 1 red, 2 green, 4 blue, and
8 yellow marbles. All the marbles are the same size and shape. If Peter takes out a marble from the
jar without looking, what is the probability that he will NOT choose a yellow marble.
a. 7/15
b. 8/15
c. 7/8
d. 5/8
Ans- a
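
The marble question reduces to counting the non-yellow marbles over the total. A short check with exact fractions, using the counts given in the question:

```python
from fractions import Fraction

# Marble counts from the question.
marbles = {"red": 1, "green": 2, "blue": 4, "yellow": 8}

total = sum(marbles.values())
not_yellow = total - marbles["yellow"]
p_not_yellow = Fraction(not_yellow, total)
print(p_not_yellow)
```
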

2. If we train a Naive Bayes classifier using infinite training data that satisfies all of its modeling
assumptions , then in general, what can we say about the training error (error in training data) and
test error (error in held-out test data)?
a. It may not achieve either zero training error or zero test error
b. It will always achieve zero training error and zero test error.
c. It will always achieve zero training error but may not achieve zero test error.
d. It may not achieve zero training error but will always achieve zero test error.
Ans - a

3. If P(A) = 0.10, P(B) = 0.05, and P(B|A) = 7%, find P(A|B) -

a. 0.35
b. 0.34
c. 0.14
d. 0.15
Ans - c
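
Question 3 is a direct application of Bayes rule, P(A|B) = P(B|A) . P(A) / P(B). A minimal check with the numbers from the question (the function name is illustrative):

```python
def bayes_posterior(p_b_given_a, p_a, p_b):
    """Bayes rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# P(B|A) = 7% = 0.07, P(A) = 0.10, P(B) = 0.05
p_a_given_b = bayes_posterior(0.07, 0.10, 0.05)
print(round(p_a_given_b, 2))
```
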
4. Which method is provided by scikit learn to tackle large scale classification for which full training
set might not fit in memory-
a. Memory_manage method
b. Partial_manage method
c. partial_fit method
d. None of the above
Ans - c
5. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70%
on validation set, what should I look out for?
a. Underfitting
b. Nothing, the model is perfect
c. Overfitting
d. None of the above
Ans- C

6. What is/are true about kernel in SVM?

1. Kernel functions map low dimensional data to high dimensional space
2. It’s a similarity function
a. 1
b. 2
c. 1 and 2
d. None of these
Ans- C

7. The performance of SVM depends on which factors

a. the number of training instances
b. the distribution of the data
c. linear vs. non-linear problems
d. input scale of the features
e. All of the above
Ans - e

8. What do you mean by generalization error in terms of the SVM?

a. How far the hyperplane is from the support vectors
b. How accurately the SVM can predict outcomes for unseen data
c. How much you want to avoid misclassification of each training example
d. How far the influence of a single training example reaches.
Ans- b

9. What does the regularisation parameter tell us in SVM -

10. What does the gamma parameter tell us in SVM -

11. SVMs are less effective when:

a. The data is linearly separable
b. The data is clean and ready to use
c. The data is noisy and contains overlapping points
d. None of the above
Ans- c

12. Which of the following are real world applications of the SVM?
a. Text and Hypertext Categorization
b. Image Classification
c. Clustering of News Articles
d. All of the above
Ans- d

13. What is the kernel trick -

a. Polynomial and exponential kernels calculate the separation line in lower dimensions.
b. Polynomial and exponential kernels calculate the separation line in higher dimensions.
c. Polynomial or exponential kernels calculate the separation line in lower dimensions.
d. Polynomial or exponential kernels calculate the separation line in higher dimensions.
Ans - b

Unit-V (One mark)

2. SK-Learn provides _______ built-in class for Decision Tree Classifier?
a) DTClassifier
b) DecisionTreeClassifier - answer
c) Tree
d) None of the above

3. What approach is taken by Decision Tree for Knowledge Engineering?

a) Inductive - answer
b) Association Rules
c) Statistical
d) Substitutive

4. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?
a. Decision Tree
b. Regression
c. Classification
d. Random Forest - answer
5. In the given formula from the Decision Tree family, what do A and D represent?
Gain(A) = Cross_Entropy(D) – EntropyA(D)

a. Attribute, Decision
b. Attribute, Dataset- answer
c. Probability, Dataset
d. None of the above

6. In the given formula from the Decision Tree family, which of the given statements are true?
Gain(A) = Cross_Entropy(D) – EntropyA(D)

a. Gain(A) should be maximum.

b. The attribute A with highest gain is chosen as the splitting attribute
c. Both a and b-answer
d. None of the above
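
The gain formula in questions 5 and 6 can be sketched in a few lines, assuming the first term is the Shannon entropy of the dataset D and the second is the weighted entropy after splitting on attribute A. The toy rows, attribute names, and function names below are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the dataset minus the weighted entropy after the split."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    split_entropy = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - split_entropy


# Hypothetical toy dataset: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
gains = {a: information_gain(rows, a, "play") for a in ("outlook", "windy")}
best = max(gains, key=gains.get)
print(gains, best)
```

The attribute with the highest gain is the one chosen as the splitting attribute, which is the rule the two questions describe.
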

7. A _________ is a decision support tool that uses a tree-like graph or model of decisions and
their possible consequences, including chance event outcomes, resource costs, and utility.
a. Decision tree- answer
b. Graphs
c. Trees
d. Neural Networks

8. What is a Decision Tree?

a) Flow-Chart
b) Structure in which internal node represents test on an attribute, each branch represents
outcome of test and each leaf node represents class label
c) Flow-Chart & Structure in which internal node represents test on an attribute,
each branch represents outcome of test and each leaf node represents class label- answer
d) None of the mentioned

9. Decision Trees can be used for Classification Tasks.

a) True- answer
b) False

10. The most widely used metrics and tools to assess a classification model are:
a. Confusion matrix
b. Cost-sensitive accuracy
c. Area under the ROC curve
d. All of the above - answer
11. Which of the following is a good test dataset characteristic?
a. Large enough to yield meaningful results
b. Is representative of the dataset as a whole
c. Both A and B - answer
d. None of the above
12. Which of the following is a disadvantage of decision trees?
a. Factor analysis
b. Decision trees are robust to outliers
c. Decision trees are prone to be overfit - answer
d. None of the above
13. What is the purpose of performing cross-validation?
a. To assess the predictive performance of the models
b. To judge how the trained model performs outside the sample on test data
c. Both A and B – answer
d. None of the above

14. Which of the following is/are true about bagging trees?

1.In bagging trees, individual trees are independent of each other

2.Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2- answer
D) None of these

15. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2- answer
C) 1 and 2
D) None of these

16. Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees- answer

17. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1- answer
B) 2
C) 1 and 2
D) None of these
18. True-False: The bagging is suitable for high variance low bias models?

A) TRUE- answer
B) FALSE

19. In which of the following scenario a gain ratio is preferred over Information Gain?

A) When a categorical variable has very large number of category - answer

B) When a categorical variable has very small number of category
C) Number of categories is not the reason
D) None of these

20. In K-means clustering, the distance between each sample and each centroid is computed and the
sample is assigned to the cluster where the distance is minimum. This approach is often called ----

a. Minimizing the inertia of the clusters- answer

b. Minimizing no. of clusters
c. Maximizing the inertia of the clusters
d. None of the above
21. Which statements are true about K-means method of clustering?

1)The process is iterative

2)All the distances are recomputed.

3)The algorithm stops when the centroids become stable and, therefore, the inertia is minimized

4) All of these- answer

22. [True or False] k-NN algorithm does more computation on test time rather than train
time.

A) TRUE - answer
B) FALSE

23. Which of the following statements is true for k-NN classifiers?

A) The classification accuracy is better with larger values of k

B) The decision boundary is smoother with smaller values of k
C) The decision boundary is linear
D) k-NN does not require an explicit training step- answer
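
Questions 22 and 23 both rest on the fact that k-NN defers its work to prediction time. The sketch below is a minimal illustration with Euclidean distance and majority voting; the training points, labels, and function name are made up for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest stored samples."""
    # No model is fitted beforehand: every distance is computed at query time.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Illustrative training set: two well-separated groups.
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (0.5, 0.5)))
print(knn_predict(train_points, train_labels, (5.5, 5.5)))
```

"Training" here amounts to storing the samples, so the cost of sorting distances is paid for every query.
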

Unit-5 (Two marks)

1. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1.Both methods can be used for classification task

2.Random Forest is used for classification whereas Gradient Boosting is used for regression
task
3.Random Forest is used for regression whereas Gradient Boosting is used for Classification
task
4.Both methods can be used for regression task
A) 1
B) 2
C) 3
D) 4
E) 1 and 4 – answer

2. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these tree. Which of the following is true about individual(Tk) tree in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features
3. Individual tree is built on a subset of observations
4. Individual tree is built on full set of observations

A) 1 and 3 - answer
B) 1 and 4
C) 2 and 3
D) 2 and 4

3. Which of the following algorithms do not use learning rate as one of their hyperparameters?

1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4 - answer

4. Which of the following algorithm would you take into the consideration in your final model
building on the basis of performance?

Suppose you have given the following graph which shows the ROC curve for two different
classification algorithms such as Random Forest(Red) and Logistic Regression(Blue)
A) Random Forest - answer
B) Logistic Regression
C) Both of the above
D) None of these

5. Which of the following is true about training and testing error in such case?
Suppose you want to apply AdaBoost algorithm on Data D which has T observations. You
set half the data for training and half for testing initially. Now you want to increase the
number of data points for training T1, T2 … Tn where T1 < T2…. Tn-1 < Tn.

A) The difference between training error and test error increases as number of observations
increases
B) The difference between training error and test error decreases as number of
observations increases- answer
C) The difference between training error and test error will not change
D) None of These

6. In random forest or gradient boosting algorithms, features can be of any type. For example,
it can be a continuous feature or a categorical feature. Which of the following option is true
when you consider these types of features?
A) Only Random forest algorithm handles real valued attributes by discretizing them
B) Only Gradient boosting algorithm handles real valued attributes by discretizing them
C) Both algorithms can handle real valued attributes by discretizing them- answer
D) None of these

7. Consider the following figure for answering the next few questions. In the figure, X1 and X2
are the two features and the data point is represented by dots (-1 is negative class and +1 is a
positive class). And you first split the data based on feature X1(say splitting point is x11)
which is shown in the figure using vertical line. Every value less than x11 will be predicted
as positive class and every value greater than x11 will be predicted as negative class.

How many data points are misclassified in above image?

A) 1- answer
B) 2
C) 3
D) 4
8. Suppose, you are working on a binary classification problem with 3 input features. And you
chose to apply a bagging algorithm(X) on this data. You chose max_features = 2 and the
n_estimators = 3. Now, assume that each estimator has 70% accuracy.
Note: Algorithm X is aggregating the results of individual estimators based on maximum
voting
What will be the maximum accuracy you can get?
A) 70%
B) 80%
C) 90%
D) 100%- answer
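
Question 8 can be verified by constructing a best case: three estimators that are each 70% accurate but never make a mistake on the same sample. The sketch below is a hypothetical construction with 10 samples; the sample count and error positions are assumptions chosen to make the arithmetic exact.

```python
n_samples = 10
truth = [1] * n_samples

# Each hypothetical estimator is wrong on 3 of 10 samples (70% accuracy),
# and no sample is misclassified by more than one estimator.
wrong_indices = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]
predictions = [
    [0 if i in wrong else 1 for i in range(n_samples)] for wrong in wrong_indices
]


def accuracy(pred):
    return sum(p == t for p, t in zip(pred, truth)) / n_samples


# Maximum voting: a sample is predicted 1 when at least 2 of 3 estimators say 1.
majority = [1 if sum(votes) >= 2 else 0 for votes in zip(*predictions)]

print([accuracy(p) for p in predictions], accuracy(majority))
```

When the errors do not overlap, every sample still receives two correct votes, so the ensemble reaches 100%. If the errors coincide instead, voting gains nothing over a single estimator.
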

9. Which of the following is true about the Gradient Boosting trees?

1. In each stage, introduce a new regression tree to compensate the shortcomings of existing
model
2. We can use the gradient descent method to minimize the loss function

A) 1
B) 2
C) 1 and 2- answer
D) None of these

9. In SK-Learn, which of the below parameters are built into the KMeans method?

a. cluster_centers_
b. inertia_
c. n_clusters
d. all of the above

10. In which of the following cases will K-means clustering fail to give good results?
1) Data points with outliers 2) Data points with different densities 3) Data points with
nonconvex shapes
1. 1 and 2
2. 2 and 3
3. 1, 2, and 3 - answer
4. 1 and 3
11. Which of the following is a reasonable way to select the number of clusters "k"?
1. Choose k to be the smallest value so that at least 99% of the variance is retained.
2. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
3. Choose k to be the largest value so that 99% of the variance is retained.
4. Use the elbow method- answer
12. A company has built a kNN classifier that gets 100% accuracy on training data. When they
deployed this model on the client side, it was found that the model is not at all accurate.
Which of the following might have gone wrong?
Note: Model has successfully deployed and no technical issues are found at client side except
the model performance

A) It is probably an overfitted model - answer

B) It is probably a underfitted model
C) Can’t say
D) None of these

13. In k-NN it is very likely to overfit due to the curse of dimensionality. Which of the
following option would you consider to handle such problem?

1. Dimensionality Reduction
2. Feature selection

A) 1
B) 2
C) 1 and 2 - answer
D) None of these

14. In the image below, which would be the best value for k assuming that the algorithm you are
using is k-Nearest Neighbor.

A) 3
B) 10 - answer
C) 20
D) 50

15. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point
2. It has strong assumptions for the distribution of data points in dataspace
3. It has substantially high time complexity of order O(n^3)
4. It does not require prior knowledge of the no. of desired clusters
5. It is robust to outliers

Options:

A. 1 only

B. 2 only
C. 4 only

D. 2 and 3 - answer

Unit-6 (Two marks)

1. After performing K-Means Clustering analysis on a dataset, you observed the following
dendrogram. Which of the following conclusion can be drawn from the dendrogram?

A. There were 28 data points in clustering analysis

B. The best no. of clusters for the analyzed data points is 4

C. The proximity function used is Average-link clustering

D. The above dendrogram interpretation is not possible for K-Means clustering analysis -
answer

3. In the figure below, if you draw a horizontal line on the y-axis at y=2, what will be the number
of clusters formed?

A. 1

B. 2 - answer
C. 3

D. 4

4. What should be the best choice for number of clusters based on the following results:

A. 5

B. 6 - answer

C. 14

D. Greater than 14

5. Which of the following is/are not true about Centroid based K-Means clustering algorithm
and Distribution based expectation-maximization clustering algorithm:

1. Both start with random initializations

2. Both are iterative algorithms
3. Both have strong assumptions that the data points must fulfill
4. Both are sensitive to outliers
5. Expectation maximization algorithm is a special case of K-Means
6. Both require prior knowledge of the no. of desired clusters
7. The results produced by both are non-reproducible.

Options:

A. 1 only
B. 5 only - answer

C. 1 and 3

D. 6 and 7

7. If you are using Multinomial mixture models with the expectation-maximization algorithm for
clustering a set of data points into two clusters, which of the assumptions are important:

A. All the data points follow two Gaussian distribution

B. All the data points follow n Gaussian distribution (n >2)

C. All the data points follow two multinomial distribution - answer

D. All the data points follow n multinomial distribution (n >2)

8. Below is a mathematical representation of a neuron.

The different components of the neuron are denoted as:

• x1, x2,…, xN: These are inputs to the neuron. These can either be the actual observations
from input layer or an intermediate value from one of the hidden layers.
• w1, w2,…,wN: The Weight of each input.
• bi: Is termed as Bias units. These are constant values added to the input of the activation
function corresponding to each weight. It works similar to an intercept term.
• a: Is termed as the activation of the neuron which can be represented as
• and y: is the output of the neuron

Considering the above notations, will a line equation (y = mx + c) fall into the category of a
neuron?

A. Yes- answer
B. No

9. In the graph below, we observe that the error has many “ups and downs”

Should we be worried?

A. Yes, because this means there is a problem with the learning rate of neural network.
B. No, as long as there is a cumulative decrease in both training and validation error,
we don’t need to worry - answer

Unit 6 ( One mark)

1. Which of the following metrics, do we have for finding dissimilarity between two clusters in
hierarchical clustering?

1. Single-link
2. Complete-link
3. Average-link

Options:

A. 1 and 2

B. 1 and 3

C. 2 and 3

D. 1, 2 and 3 - answer

2. Which of the following statement(s) correctly represents a real neuron?

A. A neuron has a single input and a single output only

B. A neuron has multiple inputs but a single output only

C. A neuron has a single input but multiple outputs

D. A neuron has multiple inputs and multiple outputs

E. All of the above statements are valid - answer

3. If you increase the number of hidden layers in a Multi Layer Perceptron, the classification
error of test data always decreases. True or False?

A. True

B. False - answer

4. You are building a neural network where it gets input from the previous layer as well as from
itself.

Which of the following architecture has feedback connections?

A. Recurrent Neural network - answer

B. Convolutional Neural Network

C. Restricted Boltzmann Machine

D. None of these

5. In which neural net architecture, does weight sharing occur?

A. Convolutional neural Network
B. Recurrent Neural Network
C. Fully Connected Neural Network
D. Both A and B - answer

6. In a neural network, which of the following techniques is used to deal with overfitting?
A. Dropout
B. Regularization
C. Batch Normalization
D. All of these - answer

7. What is a dead unit in a neural network?

A. A unit which is not updated during training by any of its neighbours - answer
B. A unit which does not respond completely to any of the training patterns

C. The unit which produces the biggest sum-squared error

D. None of these

8. Suppose a convolutional neural network is trained on the ImageNet dataset (object recognition
dataset). This trained model is then given a completely white image as an input. The output
probabilities for this input would be equal for all classes. True or False?
A. True
B. False - answer

9. For an image recognition problem (recognizing a cat in a photo), which architecture of neural
network would be better suited to solve the problem?
A. Multi Layer Perceptron
B. Convolutional Neural Network - answer
C. Recurrent Neural network
D. Perceptron

10. What are the factors to select the depth of neural network?

1. Type of neural network (eg. MLP, CNN etc)

2. Input data
3. Computation power, i.e. Hardware capabilities and software capabilities
4. Learning Rate
5. The output function to map

A. 1, 2, 4, 5

B. 2, 3, 4, 5

C. 1, 3, 4, 5

D. All of these - answer

11. Movie Recommendation systems are an example of:

1. Classification
2. Clustering
3. Reinforcement Learning
4. Regression

Options:
A. 2 only
B. 1 only
C. 1 and 2
D. 2 and 3 - answer
13. Recommendation systems are used in which of the following applications:
a. Banking
b. Shopping
c. Search Engine
d. All of the above – answer

14. Which of the following are methods of Recommendation Systems-

a. Naïve User based systems,
b. Content based Systems,
c. Model free collaborative filtering
d. All of the above – answer

15. Select correct option related to Hierarchical clustering.

a. Creates sets of clusters
b. Uses a tree data structure (dendrogram)
c. Only b
d. Both a and b- answer

16. Agglomerative clustering is based on __________ approach

a. Top Down
b. Bottom Up- answer
c. Linear
d. Partition

17. For each pair of clusters, which algorithm computes the maximum distance between the clusters
using below formula?

a. Single link
b. Complete link -answer
c. Average link
d. Ward’s Linkage
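
The linkage criteria named in these questions can be sketched as follows. This is an illustrative example, assuming Euclidean distance between points; the function names and the two toy clusters are made up for the demonstration.

```python
import math
from itertools import product


def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distance for every pair with one point from each cluster."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_link(cluster_a, cluster_b):
    """Smallest pairwise distance between the two clusters."""
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_link(cluster_a, cluster_b):
    """Largest pairwise distance between the two clusters."""
    return max(pairwise_distances(cluster_a, cluster_b))


def average_link(cluster_a, cluster_b):
    """Mean of all pairwise distances between the two clusters."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]
print(single_link(cluster_a, cluster_b))
print(complete_link(cluster_a, cluster_b))
print(average_link(cluster_a, cluster_b))
```

Complete link is the one that takes the maximum, which is why it tends to produce compact clusters, while single link takes the minimum.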

18. ___________ is a graphical method for better understanding the agglomeration process. It shows
in a static way how the aggregations are performed, starting from the bottom (where all samples are
separated) up to the top (where the linkage is complete).

a. Flow chart
b. Histo graph
c. Dendrogram –answer
d. Decision tree

19. Which of the following functions are activation functions?

a. ReLU
b. Tanh
c. Sigmoid
d. All of the above- answer
20. Which activation function is used by most of the Deep networks nowadays?
a. ReLU - answer
b. Tanh
c. Sigmoid
d. All of the above
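
For reference, the three activation functions listed in these questions can be written in a few lines. This is a plain-Python sketch using their standard textbook definitions; the function names are chosen here for illustration.

```python
import math


def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def sigmoid(z):
    """Logistic sigmoid: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes any real number into (-1, 1)."""
    return math.tanh(z)


for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), sigmoid(z), tanh(z))
```

ReLU does not saturate for positive inputs, which is one commonly cited reason for its popularity in deep networks.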

21. ___________ are general computers which can learn algorithms to map input
sequences to output sequences
a. CNN
b. RNN- answer
c. Deep Q-Learning
d. All of these
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution C

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution B

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
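
The arithmetic behind questions 18 to 20 can be checked with a short sketch. It assumes the usual decomposition SST = SSR + SSE, so the multiple coefficient of determination is SSR / SST. The sample size n = 21 and predictor count p = 5 used for the adjusted value are hypothetical and are not given in the quiz.

```python
def r_squared(sst, sse):
    """Multiple coefficient of determination: SSR / SST, with SSR = SST - SSE."""
    ssr = sst - sse
    return ssr / sst


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalise R^2 for the number of independent variables in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


r2 = r_squared(sst=200.0, sse=50.0)
print(r2)  # 150 / 200

# Hypothetical n and p, chosen only to show the adjustment lowering the value.
print(adjusted_r_squared(r2, n_samples=21, n_predictors=5))
```

The adjusted value is never larger than the plain one, and the gap grows as more independent variables are added for the same number of observations.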

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B
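
A minimal sketch of stratification, assuming a simple per-class shuffle and split. The function name `stratified_split`, the 25% test fraction, and the toy labels are illustrative choices, not part of the quiz.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices) that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = ["pos"] * 8 + ["neg"] * 4
train_idx, test_idx = stratified_split(labels)
print([labels[i] for i in test_idx])
```

Each class contributes the same fraction of its instances to the test set, so the 2:1 class ratio of the full data is kept in both parts.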

Ans : Solution A
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
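
The error measures named in questions 28 and 35 differ only in how each residual is treated before averaging. The sketch below uses their standard definitions; the function names and the three sample values are invented for illustration.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute (positive) differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the units of the output."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 10.0]
print(mean_absolute_error(actual, predicted))
print(mean_squared_error(actual, predicted))
print(root_mean_squared_error(actual, predicted))
```

Because of the squaring, the single large miss (7 versus 10) dominates the squared measures far more than it dominates the absolute one.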

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution B

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution D

Ans : Solution C

Ans : Solution B
UNIT –II

2. What is pca.components_ in Sklearn?

Set of all eigen vectors for the projection space
Matrix of principal components
Result of the multiplication matrix
None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. There is a linear structure in the data
2. The data lies on a curved surface and not on a flat surface
3. The variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
With lasso we apply an absolute-value (L1) penalty. As the penalty increases, some of the
variable coefficients may become exactly zero.
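
Why lasso can zero out coefficients while ridge does not can be sketched in a simplified setting. The example assumes an orthonormal design matrix and one common scaling of the penalties, under which the lasso solution is the soft-thresholded least-squares coefficient and the ridge solution is a proportional shrinkage. The function names and coefficient values are illustrative.

```python
def soft_threshold(coefficient, penalty):
    """Lasso update under the orthonormal-design assumption."""
    if coefficient > penalty:
        return coefficient - penalty
    if coefficient < -penalty:
        return coefficient + penalty
    return 0.0


def ridge_shrink(coefficient, penalty):
    """Ridge update under the same assumption: shrinks but never reaches zero."""
    return coefficient / (1.0 + penalty)


least_squares = [3.0, -0.5, 0.2]
lasso = [soft_threshold(b, 1.0) for b in least_squares]
ridge = [ridge_shrink(b, 1.0) for b in least_squares]
print(lasso)
print(ridge)
```

Coefficients smaller in magnitude than the penalty are set exactly to zero by the lasso step, which is what makes it usable for variable selection.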

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not so
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not so
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.
14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
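
For a single input, the closed-form (normal equation) solution reduces to two short formulas. The sketch below assumes a model of the form Y = A0 + A1X; the function name `fit_line` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated from y = 1 + 2x, so the fit should recover those values.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

No learning rate or iteration count is involved; the coefficients come from a direct calculation, which is practical when the number of features is small.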

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.
Scenario  Learning Rate  Number of iterations  Training Error  Validation Error
1         0.1            1000                  100             110
2         0.2            600                   90              105
3         0.3            400                   110             110
4         0.4            300                   120             130
5         0.4            250                   130             150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don't be misled by the word "regression" in its name.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification
problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem, for example by using
the One-Vs-All method.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
With lasso we apply an absolute-value (L1) penalty. As the penalty increases, some of the
variable coefficients may become exactly zero.
Context: 48-49

Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w) is viewed as a function of x whose shape we can change by
varying the parameters w.

48. What would be the range of p in such a case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)

49. In the above question, which function would make p lie in (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.

51. The logit function (given as l(x)) is the log of the odds function. What could be the range of the
logit function on the domain x = [0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
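
The two transformations described above can be sketched directly from their definitions. The function names are illustrative, and the probabilities are kept strictly inside (0, 1) because the logit is undefined at the endpoints.

```python
import math


def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))


for p in (0.001, 0.5, 0.999):
    print(p, odds(p), logit(p))
```

A fair coin gives odds of 1 and a logit of 0. Probabilities near 0 push the logit toward minus infinity and probabilities near 1 push it toward plus infinity.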

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not so
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not so
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution:A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is an inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
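
That the logistic function is the inverse of the logit can be checked numerically with a round trip. This sketch uses the standard definitions; the function names and test values are chosen for illustration.

```python
import math


def logit(p):
    """Log of the odds for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def logistic(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


probabilities = (0.1, 0.5, 0.9)
round_trip = [logistic(logit(p)) for p in probabilities]
print(round_trip)
```

Applying the logit and then the logistic function returns the original probability up to floating-point rounding.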

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression has to fit, where the probability of each
category is predicted over the rest of the categories combined.
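
The bookkeeping behind One-Vs-All can be sketched without any model fitting. The example below only builds the n binary target vectors and picks the winning class from per-class scores; the function names, labels, and scores are hypothetical.

```python
def one_vs_rest_targets(labels):
    """Map each class to a binary target vector: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {cls: [1 if label == cls else 0 for label in labels] for cls in classes}


def predict(scores):
    """Pick the class whose binary model gives the highest score."""
    return max(scores, key=scores.get)


labels = ["cat", "dog", "bird", "dog", "cat"]
targets = one_vs_rest_targets(labels)
print(len(targets))  # one binary problem per class
print(targets["dog"])

# Hypothetical probabilities from three separately fitted binary models.
print(predict({"cat": 0.2, "dog": 0.7, "bird": 0.1}))
```

Each target vector would be used to fit one binary logistic regression, so n classes lead to n fitted models.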

57. Below are two different logistic models with different values for β0 and β1.

Which of the
following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

A) A
B) B
C) C
D)None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, that model is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the highest compared with the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has the minimum
training error (zero).

3. The second model is more robust than the first and third because it will perform best on unseen
data.

4. The third model is overfitting more than the first and second.

5. All will perform the same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty and therefore a less complex decision boundary, which is
what the first figure (A) shows.

61. What would you do if you wanted to train logistic regression on the same data in less time while
getting comparable (though possibly not identical) accuracy?

Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic Regression will take a very long time to train.

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Which of the following image is showing the cost function for y =1.

Following is the loss function in logistic regression(Y-axis loss function and x axis log probability) for
two class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the true answer as loss function decreases as the log probability increases

63. Suppose, Following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
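
The figure for this question is not reproduced here, so the sketch below assumes XOR-style labels on the four binary points, which is the classic non-separable case. It searches a small grid of integer weights for a separating line; the grid range, names, and label sets are illustrative. A failed grid search is not a proof in general, but for XOR no separating line exists at all.

```python
from itertools import product

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def linearly_separable(labels, grid=range(-3, 4)):
    """Search small integer weights for a line that classifies every point."""
    for w1, w2, b in product(grid, repeat=3):
        predictions = [1 if w1 * x1 + w2 * x2 + b > 0 else 0 for x1, x2 in POINTS]
        if predictions == labels:
            return True
    return False


and_labels = [0, 0, 0, 1]  # separable, e.g. x1 + x2 - 1 > 0
xor_labels = [0, 1, 1, 0]  # assumed labels for the missing figure

print(linearly_separable(and_labels))
print(linearly_separable(xor_labels))
```

Adding a product feature such as X1*X2 would make the XOR labels separable, but the note in the question restricts the model to X1 and X2.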
UNIT IV

1. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
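
The tradeoff controlled by the cost parameter shows up in the soft-margin objective, sketched here in its common form 0.5 * ||w||^2 + C * (sum of hinge losses). The function name, the one-dimensional data, and the fixed weights are assumptions made for illustration; nothing is being optimised.

```python
def soft_margin_objective(w, b, points, labels, cost):
    """0.5 * ||w||^2 plus cost times the total hinge loss (labels are +1 or -1)."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge_total += max(0.0, 1.0 - y * score)
    return margin_term + cost * hinge_total


points = [(2.0,), (-2.0,), (0.5,)]
labels = [1, -1, -1]  # the last point sits on the wrong side of w = 1, b = 0
w, b = (1.0,), 0.0

print(soft_margin_objective(w, b, points, labels, cost=0.1))
print(soft_margin_objective(w, b, points, labels, cost=10.0))
```

With a low cost the misclassified point adds little to the objective, so a simple boundary is acceptable. With a high cost the same mistake dominates, pushing the optimiser to bend the boundary around it.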

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B - answer

D) None of the above options

Ans Solution: C

5 What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6 The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7 What is/are true about kernel in SVM?

1. Kernel function map low dimensional data to high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.
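
Both statements can be illustrated with a degree-2 polynomial kernel on two-dimensional inputs. The sketch assumes the kernel k(x, z) = (x . z)^2 and the explicit feature map (x1^2, sqrt(2) x1 x2, x2^2); the names and sample vectors are illustrative.

```python
import math


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(x, z):
    """Degree-2 polynomial kernel computed in the original 2-D space."""
    return dot(x, z) ** 2


def feature_map(x):
    """Explicit map of a 2-D point into the 3-D space the kernel works in."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


x, z = (1.0, 2.0), (3.0, 4.0)
print(poly_kernel(x, z))
print(dot(feature_map(x), feature_map(z)))
```

The kernel returns the same similarity score as a dot product in the higher-dimensional space, without ever constructing the mapped vectors.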

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVM’s?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVM’s.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
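
How gamma changes the reach of a single training point can be seen from the kernel value itself. This sketch assumes the common RBF form exp(-gamma * ||x - z||^2); the function name, points, and gamma values are illustrative.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF similarity: 1 for identical points, decaying with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


x = (0.0, 0.0)
neighbour = (1.0, 0.0)

for gamma in (0.1, 1.0, 10.0):
    print(gamma, rbf_kernel(x, neighbour, gamma))
```

With a small gamma, a point one unit away still looks very similar, so each training point influences a wide region. With a large gamma the similarity drops almost to zero at the same distance, so the influence is local and the decision boundary can follow individual points closely.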

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use very large value of C(C->infinity)?

Note: For small C was also classifying all data points correctly

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognitions.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing C parameter would be the right thing to do here, as it will ensure regularized model

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Sometimes, feature normalization is not feasible in the case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29. Suppose your problem has changed and the data now has only 2 classes. How many times would we
need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?

A) Increasing the complexity will over fit the data

B) Increasing the complexity will under fit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. Kernel function map low dimensional data to high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithm is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)
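
The sensitivity of K-means comes from its use of the mean as the cluster centre. A small sketch with one made-up cluster and a single outlier shows how differently the mean and the median react.

```python
from statistics import mean, median

cluster = [1.0, 2.0, 3.0, 4.0]
cluster_with_outlier = cluster + [100.0]

print(mean(cluster), median(cluster))
print(mean(cluster_with_outlier), median(cluster_with_outlier))
```

One extreme value drags the mean far away from the bulk of the points, while the median barely moves. This is why a mean-based centre is more affected by outliers than a median-based one.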

5 Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
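
The two ingredients of bagging, bootstrap sampling and aggregation, can be sketched without any tree-building code. The function names, seed, and toy data are illustrative, and the predictions passed to the vote are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Aggregate class predictions from several learners by the most common vote."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(42)
data = list(range(10))

# Each learner would be trained on its own sample, independently of the others.
samples = [bootstrap_sample(data, rng) for _ in range(3)]
for sample in samples:
    print(sorted(sample))

# Hypothetical predictions from three learners for one test instance.
print(majority_vote(["spam", "ham", "spam"]))
```

Because each sample is drawn independently, the learners can be trained in parallel. Boosting differs in that each learner depends on the errors of the previous ones.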

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

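9. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features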

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random Forest is based on the bagging concept, which considers a fraction of the samples and a
fraction of the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. The number of trees should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random Forest is based on the bagging concept, which considers a fraction of the samples and a
fraction of the features for building the individual trees.

13. Which of the following algorithms do not use learning rate as one of their hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. The number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: Bagging is suitable for high-variance, low-bias models.

A) TRUE
B) FALSE

Solution: A

Bagging is suitable for high-variance, low-bias models, in other words for complex models.

17. When applying bagging to regression trees, which of the following is/are true?

1. We build N regression trees with N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
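
The three statements in question 17 can be sketched in a few lines. This is a minimal illustration, not a production implementation: `fit_full_tree`, `bagged_predictor`, and the toy data are hypothetical names and values introduced here. The base learner stands in for a fully grown one-dimensional regression tree, which on 1-D inputs behaves like a nearest-point lookup (low bias, high variance).

```python
import random
from statistics import mean

def fit_full_tree(points):
    """Stand-in for a fully grown 1-D regression tree (low bias, high variance).

    With splits at midpoints between distinct x values, such a tree predicts
    the mean target of the training x closest to the query.
    """
    by_x = {}
    for x, y in points:
        by_x.setdefault(x, []).append(y)
    leaves = {x: mean(ys) for x, ys in by_x.items()}
    return lambda q: leaves[min(leaves, key=lambda x: abs(x - q))]

def bagged_predictor(points, n_trees, seed=0):
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(points) points with replacement.
        sample = [rng.choice(points) for _ in points]
        trees.append(fit_full_tree(sample))
    # Bagged prediction: average the N individual tree predictions.
    return lambda q: mean(tree(q) for tree in trees)

# Hypothetical noisy data: y = 2x plus an alternating +/-1 disturbance.
data = [(x, 2.0 * x + (1.0 if x % 2 else -1.0)) for x in range(10)]
predict = bagged_predictor(data, n_trees=50)
print(predict(4.0))
```

Each individual tree reproduces the noise in its own bootstrap sample; averaging many of them smooths that noise, which is why bagging suits high-variance base learners.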

18. How do you select the best hyperparameters in tree-based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

Hyperparameters are chosen by comparing results on validation data; performance on the training
data is optimistically biased.

19. In which of the following scenarios is gain ratio preferred over information gain?

A) When a categorical variable has a very large number of categories

B) When a categorical variable has a very small number of categories
C) The number of categories is not the reason
D) None of these

Solution: A

For high-cardinality variables, gain ratio is preferred over information gain.
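
A small worked example may make the point in question 19 concrete. The sketch below uses made-up data and hypothetical helper names (`entropy`, `info_gain_and_ratio`). It compares a row-identifier column (one category per row) with a two-valued column: both get the same information gain, but gain ratio divides by the split information and so penalises the high-cardinality column.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())

def info_gain_and_ratio(attribute, labels):
    total = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(attribute)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

# Hypothetical toy data.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))  # unique value per row (high cardinality)
weather = ["sun", "sun", "rain", "rain", "sun", "rain", "sun", "rain"]

print(info_gain_and_ratio(row_id, labels))   # same gain, low ratio
print(info_gain_and_ratio(weather, labels))  # same gain, high ratio
```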

20. Suppose you are given the following training and validation errors for Gradient
Boosting. Which of the following hyperparameter settings would you choose in such a case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select scenario 2 because the lower
depth is the better hyperparameter choice.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n^3)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

- DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following function is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of a number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
((MARKS)) 1
((QUESTION)) Which of the following steps/assumptions in regression modeling impacts
the trade-off between under-fitting and over-fitting the most?
((OPTION_A)) The polynomial degree
((OPTION_B)) Whether we learn the weights by matrix inversion or gradient descent
((OPTION_C)) The use of a constant term
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Suppose you have the following data with one real-valued input variable and
one real-valued output variable. What is the leave-one-out cross-validation
mean squared error in the case of linear regression (Y = bX + c)?
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
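
The data table for the question above did not survive in this copy, so the following is only a sketch of the leave-one-out procedure itself. The three points used here are an assumption chosen for illustration (they are not taken from the source); with them, the procedure yields 49/27, the keyed answer. `fit_line` and `loocv_mse` are hypothetical helper names, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def fit_line(points):
    """Ordinary least squares for y = b*x + c on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return b, mean_y - b * mean_x

def loocv_mse(points):
    """Hold out each point in turn, fit on the rest, average the squared errors."""
    errors = []
    for i, (x_out, y_out) in enumerate(points):
        b, c = fit_line(points[:i] + points[i + 1:])
        errors.append((y_out - (b * x_out + c)) ** 2)
    return sum(errors) / len(errors)

# Assumed data (not from the source): (X, Y) = (0, 2), (2, 2), (3, 1).
points = [(Fraction(0), Fraction(2)), (Fraction(2), Fraction(2)), (Fraction(3), Fraction(1))]
print(loocv_mse(points))  # 49/27
```

The three held-out squared errors are 4, 4/9 and 1, whose mean is 49/27.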

((MARKS)) 1
((QUESTION)) Which of the following is/are true about the “Maximum Likelihood Estimate
(MLE)”?
1. MLE may not always exist
2. MLE always exists
3. If the MLE exists, it may not be unique
4. If the MLE exists, it must be unique
((OPTION_A)) 1 and 4
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Let’s say a “Linear regression” model perfectly fits the training data
(train error is zero). Which of the following statements is true?
((OPTION_A)) You will always have test error zero
((OPTION_B)) You cannot have test error zero
((OPTION_C)) None of the above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which one of the statements is true regarding residuals in regression
analysis?
((OPTION_A)) Mean of residuals is always zero
((OPTION_B)) Mean of residuals is always less than zero
((OPTION_C)) Mean of residuals is always greater than zero
((OPTION_D)) There is no such rule for residuals.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
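
The zero-mean property of residuals holds for a least squares fit that includes an intercept term. The following sketch checks it numerically on made-up data; `fit_line` and the data values are hypothetical and introduced only for this illustration.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data, roughly y = 2x with some noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Up to floating-point rounding, the residuals sum (and so average) to zero.
print(abs(sum(residuals)) < 1e-9)
```

Because the intercept is chosen as mean(y) - slope * mean(x), the fitted line passes through the point of means, which forces the residuals to cancel.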

((MARKS)) 1
((QUESTION)) Which of the following is true about heteroskedasticity?
((OPTION_A)) Linear Regression with varying error terms
((OPTION_B)) Linear Regression with constant error terms
((OPTION_C)) Linear Regression with zero error terms
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following indicates a fairly strong relationship between X
and Y?
((OPTION_A)) Correlation coefficient = 0.9
((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
((OPTION_C)) The t-statistic for the null hypothesis Beta coefficient = 0 is 30
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with zero mean and constant standard deviation
((OPTION_A)) 1, 2 and 3
((OPTION_B)) 1 and 3
((OPTION_C)) All of the above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) To test the linear relationship between continuous variables y (dependent)
and x (independent), which of the following plots is best suited?
((OPTION_A)) Scatter plot
((OPTION_B)) Bar chart
((OPTION_C)) Histograms
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Generally, which of the following method(s) is/are used for predicting a
continuous dependent variable?
1. Linear Regression
2. Logistic Regression
((OPTION_A)) 1 and 2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 1
((QUESTION)) A correlation between age and health of a person is found to be -1.09. On
the basis of this you would tell the doctors that:
((OPTION_A)) Age is a good predictor of health
((OPTION_B)) Age is a poor predictor of health
((OPTION_C)) None of these
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following offsets do we use in the case of a least squares line fit? Suppose the
horizontal axis is the independent variable and the vertical axis is the dependent variable.
((OPTION_A)) Vertical offset
((OPTION_B)) Perpendicular offset
((OPTION_C)) Both, depending on the situation
((OPTION_D)) Both A and B
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose we have generated the data with the help of polynomial regression of degree 3
(degree 3 will perfectly fit this data). Now consider the points below and choose the option based on them.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance
((OPTION_A)) Only 1
((OPTION_B)) 1 and 3
((OPTION_C)) 1 and 4
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Suppose you are training a linear regression model. Now consider these
points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) are correct?
((OPTION_A)) Both are False
((OPTION_B)) 1 is False and 2 is True
((OPTION_C)) 1 is True and 2 is False
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Suppose we fit “Lasso Regression” to a data set which has 100 features (X1, X2, …, X100).
Now we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit
Lasso regression with the same regularization parameter.
Which of the following options will be correct?
((OPTION_A)) It is more likely for X1 to be excluded from the model
((OPTION_B)) It is more likely for X1 to be included in the model
((OPTION_C)) Can’t say
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following is true about “Ridge” or “Lasso” regression
methods in the case of feature selection?
((OPTION_A)) Ridge regression uses subset selection of features
((OPTION_B)) Lasso regression uses subset selection of features
((OPTION_C)) Both use subset selection of features
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following statement(s) can be true after adding a variable to
a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 4
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
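
Statements 1 and 2 in the question above can both occur because adjusted R-squared charges a penalty for each added predictor. The sketch below applies the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), to illustrative numbers; the R² values, the sample size, and the function name `adjusted_r2` are assumptions made for this example, not results from any real model.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared for a model with an intercept and n_predictors features."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

n = 20  # hypothetical sample size

# Baseline model: 3 predictors, R-squared 0.800.
base = adjusted_r2(0.800, n, 3)

# Adding a nearly useless 4th predictor: R-squared creeps up to 0.801.
weak = adjusted_r2(0.801, n, 4)

# Adding a genuinely useful 4th predictor: R-squared rises to 0.880.
strong = adjusted_r2(0.880, n, 4)

print(base, weak, strong)
```

In the weak case R-squared rises while adjusted R-squared falls (statement 2); in the strong case both rise (statement 1). On the training data, ordinary least squares R-squared never decreases when a variable is added, which rules out statements 3 and 4.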

((MARKS)) 1
((QUESTION)) Which of the following metrics can be used for evaluating regression
models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE
((OPTION_A)) 2 and 4
((OPTION_B)) 1 and 2
((OPTION_C)) 2, 3 and 4
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))

((MARKS)) 1
((QUESTION)) We can also compute the coefficients of linear regression with the help of
an analytical method called “Normal Equation”. Which of the following
is/are true about “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_D)) 1, 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))

((MARKS)) 1
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the
regression line is defined as:

Y = β0 + β1 X1 + β2 X2 + … + βn Xn

Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y changes
by a proportional amount βi ∆Xi, for some constant βi (which in general could be a positive or negative
number).
2. The value of βi is always the same, regardless of the values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_D)) 1, 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))

((MARKS)) 1
((QUESTION)) How many coefficients do you need to estimate in a simple linear
regression model (one independent variable)?
((OPTION_A)) 1
((OPTION_B)) 2
((OPTION_C)) Can’t say
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 2
((QUESTION)) The graphs below show two fitted regression lines (A and B) on randomly generated data.
Now, I want to find the sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?
((OPTION_A)) A has higher than B
((OPTION_B)) A has lower than B
((OPTION_C)) Both have same
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) If two variables are correlated, is it necessary that they have a linear
relationship?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_C)) Both A and B
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Correlated variables can have a zero correlation coefficient. True or
False?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and
testing accuracy Y. Now I want to add a few new features to the data. Select the option(s) which are
correct in such a case.
Note: Assume the remaining parameters are the same.
1. Training accuracy always decreases
2. Training accuracy always increases or remains the same
3. Testing accuracy always decreases
4. Testing accuracy always increases or remains the same
((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the
graph show the residuals for each predicted value. Use this information to compute
the SSE.
((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Suppose the distribution of salaries in a company X has median $35,000,
and the 25th and 75th percentiles are $21,000 and $53,000 respectively.
Would a person with a salary of $1 be considered an outlier?
((OPTION_A)) YES
((OPTION_B)) NO
((OPTION_C)) More information is required
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following options is true regarding “Regression” and
“Correlation”?
Note: y is the dependent variable and x is the independent variable.
((OPTION_A)) The relationship is symmetric between x and y in both.
((OPTION_B)) The relationship is not symmetric between x and y in both.
((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but in
case of regression it is symmetric.
((OPTION_D)) The relationship is symmetric between x and y in case of correlation but in
case of regression it is not symmetric.
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) True-False: Is Logistic regression a supervised machine learning
algorithm?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) True-False: Is Logistic regression mainly used for Regression?

((MARKS)) 1
((QUESTION)) True-False: Is it possible to design a logistic regression algorithm using a
Neural Network Algorithm?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) True-False: Is it possible to apply a logistic regression algorithm on a
3-class classification problem?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following methods do we use to best fit the data in Logistic
Regression?
((OPTION_A)) Least Square Error
((OPTION_B)) Maximum Likelihood
((OPTION_C)) Jaccard distance
((OPTION_D)) Both A and B
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 1
((QUESTION)) One of the very good methods to analyze the performance of Logistic
Regression is AIC, which is similar to R-Squared in Linear Regression.
Which of the following is true about AIC?
((OPTION_A)) We prefer a model with minimum AIC value
((OPTION_B)) We prefer a model with maximum AIC value
((OPTION_C)) Both, depending on the situation
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) 1
((QUESTION)) True-False: Standardisation of features is required before training a
Logistic Regression.
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))

((MARKS)) 1
((QUESTION)) Which of the following algorithms do we use for Variable Selection?
((OPTION_A)) LASSO
((OPTION_B)) Ridge
((OPTION_C)) Both
((OPTION_D)) All of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose you have been given a fair coin and you want to find out the
odds of getting heads. Which of the following options is true for such a
case?
((OPTION_A)) odds will be 0
((OPTION_B)) odds will be 0.5
((OPTION_C)) odds will be 1
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))

((MARKS)) 1
((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could
be the range of the logit function over the domain x = [0, 1]?
((OPTION_A)) (-∞, ∞)
((OPTION_B)) (0, 1)
((OPTION_C)) (0, ∞)
((OPTION_D)) (-∞, 0)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) Which of the following options is true?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of
Logistic Regression this is not required
THIS IS MANDATORY OPTION
((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case of
Linear Regression this is not required
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be
normally distributed
This is optional
((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to be
normally distributed
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1
3 UPTO 10)
((QUESTION)) Which of the following is true regarding the logistic function for any value "x"?
Note:
Logistic(x): the logistic function of any number "x"
Logit(x): the logit function of any number "x"
Logit_inv(x): the inverse logit function of any number "x"

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Logistic(x) = Logit(x)

THIS IS MANDATORY OPTION
((OPTION_B)) Logistic(x) = Logit_inv(x)
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) Logit_inv(x) = Logit(x)
This is optional
((OPTION_D)) None of these
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional
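The relationship tested above can be checked numerically. This is a small sketch, assuming the standard definitions logistic(x) = 1 / (1 + e^(-x)) and logit(p) = log(p / (1 - p)); the sample values are arbitrary.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def logistic(x: float) -> float:
    """Logistic (sigmoid) function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Applying logistic after logit should return the original probability.
probabilities = (0.1, 0.5, 0.9)
round_trip = [logistic(logit(p)) for p in probabilities]
print(round_trip)
```

Because the round trip recovers each input up to floating-point error, the logistic function behaves as the inverse of the logit.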

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 2

3 UPTO 10)
((QUESTION)) Suppose you applied a Logistic Regression model to a given dataset and
got a training accuracy X and a testing accuracy Y. Now you want to add a
few new features to the same data. Select the option(s) which is/are
correct in such a case.
Note: Assume the remaining parameters stay the same.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) Training accuracy increases
THIS IS MANDATORY OPTION
((OPTION_B)) Training accuracy increases or remains the same
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) Testing accuracy decreases
This is optional
((OPTION_D)) Testing accuracy increases or remains the same
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A&D
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) Choose which of the following options is true regarding One-Vs-All
method in Logistic Regression.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) We need to fit n models in n-class classification problem
THIS IS MANDATORY OPTION
((OPTION_B)) We need to fit n-1 models to classify into n classes
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) We need to fit only 1 model to classify into n classes
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
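To make the One-Vs-All count concrete, the sketch below builds one binary target vector per class; each vector would be used to fit one binary logistic regression model. The labels are made-up sample data, and the model fitting itself is left out.

```python
# Hypothetical class labels for five training examples.
labels = ["cat", "dog", "bird", "dog", "cat"]

# One-Vs-All: for every class, relabel the data as "this class" (1) vs "rest" (0).
classes = sorted(set(labels))
binary_targets = {
    cls: [1 if label == cls else 0 for label in labels]
    for cls in classes
}

for cls, target in binary_targets.items():
    print(cls, target)
```

With n distinct classes this produces n binary problems, hence n fitted models.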
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1
3 UPTO 10)
((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One of
the problems you may face with such huge data is that Logistic Regression will
take a very long time to train. What would you do if you wanted to train Logistic
Regression on the same data in less time while getting comparable accuracy (it may
not be the same)?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) Decrease the learning rate and decrease the number of iterations
THIS IS MANDATORY OPTION
((OPTION_B)) Decrease the learning rate and increase the number of iterations
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) Increase the learning rate and increase the number of iterations
This is optional
((OPTION_D)) Increase the learning rate and decrease the number of iterations
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 2

3 UPTO 10)
((QUESTION)) The following is the loss function in logistic regression (y-axis: loss function, x-axis: log probability) for a two-class
classification problem. Which of the following images shows the cost function for y = 1?
Note: Y is the target class

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) A
THIS IS MANDATORY OPTION
((OPTION_B)) B
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) BOTH
This is optional
((OPTION_D)) NON OF THESE
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
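Since the plots are not reproduced here, a short numeric sketch may help. It assumes the usual binary cross-entropy loss, -[y log(p) + (1 - y) log(1 - p)], where p is the predicted probability of class 1; the function name and sample probabilities are illustrative only.

```python
import math


def log_loss_single(y: int, p: float) -> float:
    """Binary cross-entropy for one example with target y and prediction p."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


# Cost for target class y = 1 at increasing predicted probabilities.
costs_y1 = [log_loss_single(1, p) for p in (0.1, 0.5, 0.9)]
print(costs_y1)
```

For y = 1 the cost reduces to -log(p), so it is large when p is near 0 and falls toward 0 as p approaches 1.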

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) Logistic regression is used when you want to:
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) Predict a dichotomous variable from continuous or dichotomous variables.

THIS IS MANDATORY OPTION

((OPTION_B)) Predict a continuous variable from dichotomous variables.
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) Predict any categorical variable from several other categorical variables.

This is optional
((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables

This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1
3 UPTO 10)
((QUESTION)) The odds ratio is

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.

THIS IS MANDATORY OPTION

((OPTION_B)) The probability of an event occurring.

THIS IS ALSO MANDATORY OPTION

((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.

This is optional
((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.

This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E C
((EXPLANATION)) This is also optional
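The definition in the correct option can be illustrated with a one-predictor logistic model, where odds(x) = exp(b0 + b1 * x). The coefficients below are hypothetical values picked for the sketch, not estimates from any dataset.

```python
import math


def odds_at(x: float, b0: float, b1: float) -> float:
    """Odds of the outcome under a logistic model with one predictor."""
    return math.exp(b0 + b1 * x)


# Hypothetical intercept and slope.
b0, b1 = -1.0, 0.7

# Odds after a unit change in the predictor divided by the original odds.
ratio = odds_at(3.0, b0, b1) / odds_at(2.0, b0, b1)
print(ratio, math.exp(b1))
```

The ratio equals exp(b1) regardless of the starting value of x, which is why exponentiated coefficients are reported as odds ratios.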

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) Large values of the log-likelihood statistic indicate:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) That there are a greater number of explained vs. unexplained observations.

THIS IS MANDATORY OPTION

((OPTION_B)) That the statistical model fits the data well.

THIS IS ALSO MANDATORY OPTION

((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.

This is optional
((OPTION_D)) That the statistical model is a poor fit of the data.

This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) Logistic regression assumes a:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.

THIS IS MANDATORY OPTION

((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.

THIS IS ALSO MANDATORY OPTION

((OPTION_C)) Linear relationship between continuous predictor variables.

This is optional
((OPTION_D)) Linear relationship between observations.

This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) In binary logistic regression:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) The dependent variable is continuous.

THIS IS MANDATORY OPTION

((OPTION_B)) The dependent variable is divided into two equal subcategories.

THIS IS ALSO MANDATORY OPTION

((OPTION_C)) The dependent variable consists of two categories.

This is optional
((OPTION_D)) There is no dependent variable.

This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E C
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 1

3 UPTO 10)
((QUESTION)) The correlation coefficient is used to determine
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) A specific value of the y-variable given a specific value of the x-variable
THIS IS MANDATORY OPTION
((OPTION_B)) A specific value of the x-variable given a specific value of the y-variable
THIS IS ALSO MANDATORY OPTION
((OPTION_C)) The strength of the relationship between the x and y variables

This is optional
((OPTION_D)) none
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E C
((EXPLANATION)) This is also optional
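As a small sketch of what the correlation coefficient measures, the code below computes Pearson's r by hand for two made-up datasets. The function name `pearson_r` and the data are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Perfectly increasing and perfectly decreasing linear relationships.
r_perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_perfect, r_negative)
```

The value describes the strength and direction of a linear relationship; it does not by itself predict a specific y for a given x.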

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and
artificial intelligence (AI)
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
ML is an alternate way of programming intelligent machines.
THIS IS MANDATORY OPTION
((OPTION_B))
ML and AI have very different goals
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
ML is a set of techniques that turns a dataset into a software.
This is optional
((OPTION_D))
AI is a software that can emulate the human mind
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Which of the following sentences is FALSE regarding regression?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
It is used for prediction
THIS IS MANDATORY OPTION
((OPTION_B))
It may be used for interpretation
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
It relates inputs to outputs.
This is optional
((OPTION_D))
It discovers causal relationships
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Grid search is
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Linear in D
THIS IS MANDATORY OPTION
((OPTION_B))
Exponential in D
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Linear in N
This is optional
((OPTION_D))
Both B&C
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional
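The question does not define D and N. The sketch below assumes D is the number of parameters being searched and N the number of training examples, and counts how many candidate points a grid with a fixed number of values per parameter contains.

```python
from itertools import product


def grid_size(values_per_param: int, num_params: int) -> int:
    """Count the points of a full grid over num_params parameters."""
    grid = product(range(values_per_param), repeat=num_params)
    return sum(1 for _ in grid)


# Three candidate values per parameter, for D = 1..4 parameters.
sizes = [grid_size(3, d) for d in (1, 2, 3, 4)]
print(sizes)
```

The number of grid points grows as 3^D here. Under the stated assumption, evaluating the cost at each point touches every training example once, which is where the linear dependence on N comes from.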

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Find the incorrect statement regarding the gradient of a continuous and differentiable function
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
is zero at a minimum
THIS IS MANDATORY OPTION
((OPTION_B))
is non-zero at a maximum
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
is zero at a saddle point
This is optional
((OPTION_D))
decreases as you get closer to the minimum
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Consider a linear-regression model with N = 3 and D = 1 with input-output
pairs as follows: y1 = 2, x1 = 1, y2 = 3, x2 = 1, y3 = 3, x3 = 2. What
is the gradient of the mean-square error (MSE) with respect to β1 when β0 = 0
and β1 = 1? Give your answer correct to two decimal digits.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
-1.66
THIS IS MANDATORY OPTION
((OPTION_B))
2
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
3
This is optional
((OPTION_D))
4
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
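A worked check of the listed answer, under two assumptions that the question does not state explicitly: the MSE is defined as (1 / 2N) Σ (y_n - β0 - β1 x_n)², and the first target is y1 = 2, which is the value that reproduces the listed answer of about -1.66.

```python
# Data under the assumption y1 = 2 (see the note above).
xs = [1.0, 1.0, 2.0]
ys = [2.0, 3.0, 3.0]

# Parameter values at which the gradient is evaluated.
beta0, beta1 = 0.0, 1.0
n = len(xs)

# Residuals e_n = y_n - (beta0 + beta1 * x_n).
residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]

# d/d(beta1) of (1 / 2N) * sum(e_n^2) = -(1 / N) * sum(e_n * x_n).
grad_beta1 = -sum(e * x for e, x in zip(residuals, xs)) / n
print(grad_beta1)
```

The residuals are 1, 2 and 1, so the gradient is -(1 + 2 + 2) / 3 = -5/3, which the answer key lists truncated to two decimals.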

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
O(D)
THIS IS MANDATORY OPTION
((OPTION_B))
O(N)
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
O(ND)
This is optional
((OPTION_D))
O(ND^2)
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
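A minimal sketch of the update step itself, assuming plain gradient descent with a fixed step size. The names `gradient_step` and `learning_rate` and the numbers are illustrative.

```python
def gradient_step(beta, g, learning_rate):
    """Return beta - learning_rate * g, one subtraction per parameter."""
    return [b - learning_rate * g_i for b, g_i in zip(beta, g)]


# D = 3 parameters and an already-computed gradient vector g.
beta = [0.0, 1.0, 2.0]
g = [1.0, -2.0, 0.5]
beta_new = gradient_step(beta, g, learning_rate=0.1)
print(beta_new)
```

Once g is available, the update loops over the D parameters exactly once and never touches the N training examples, so its cost is O(D).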

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) You observe the following while fitting a linear regression to the data: As
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you expect
it to be), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
High variance
THIS IS MANDATORY OPTION
((OPTION_B))
High model bias
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
High estimation bias
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Adding more basis functions in a linear model... (pick the most probable
option)
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Decreases model bias
THIS IS MANDATORY OPTION
((OPTION_B))
Decreases estimation bias
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Decreases variance
This is optional
((OPTION_D))
Doesn’t affect bias and variance
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
The problem of finding hidden structure in unlabeled data is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Supervised learning
THIS IS MANDATORY OPTION
((OPTION_B))
Unsupervised learning
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Reinforcement learning
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR
1
3 UPTO 10)
((QUESTION))
Task of inferring a model from labeled training data is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Unsupervised learning
THIS IS MANDATORY OPTION
((OPTION_B))
Supervised learning
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Reinforcement learning
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) A telecommunication company wants to segment its customers into
distinct groups in order to send appropriate subscription offers. This is an
example of
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Supervised learning
THIS IS MANDATORY OPTION
((OPTION_B))
Data extraction
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Serration
This is optional
((OPTION_D))
Unsupervised learning
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Self-organizing maps are an example of
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Unsupervised learning
THIS IS MANDATORY OPTION
((OPTION_B))
Supervised learning
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Reinforcement learning
This is optional
((OPTION_D))
Missing data imputation
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) You are given data about seismic activity in Japan, and you want to
predict the magnitude of the next earthquake. This is an example of
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Supervised learning
THIS IS MANDATORY OPTION
((OPTION_B))
Unsupervised learning
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Serration
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Assume you want to perform supervised learning and to predict the number
of newborns according to the size of the storks’ population
(http://www.brixtonhealth.com/storksBabies.pdf). It is an example of

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A))
Classification
THIS IS MANDATORY OPTION
((OPTION_B))
Regression
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Clustering
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR
1
3 UPTO 10)
((QUESTION)) Discriminating between spam and ham e-mails is a classification task,
true or false?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) In the example of predicting the number of babies based on the storks’
population size, the number of babies is
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Outcome
THIS IS MANDATORY OPTION
((OPTION_B))
Feature
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Attribute
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) It may be better to avoid the metric of ROC curve as it can suffer from
the accuracy paradox.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Which of the following is not involved in data mining?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Knowledge extraction
THIS IS MANDATORY OPTION
((OPTION_B))
Data archaeology
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Data exploration
This is optional
((OPTION_D))
Data transformation
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
The expected value or _______ of a random variable is the center of its distribution.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Mode
THIS IS MANDATORY OPTION
((OPTION_B))
median
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
mean
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E C
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Point out the correct statement.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Some cumulative distribution function F is non-decreasing and right-continuous
THIS IS MANDATORY OPTION
((OPTION_B))
Every cumulative distribution function F is decreasing and right-continuous
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Every cumulative distribution function F is increasing and left-continuous
This is optional
((OPTION_D))
None of the above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
Which of the following properties of a random variable is a measure of spread?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
variance
THIS IS MANDATORY OPTION
((OPTION_B))
standard deviation
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
empirical mean
This is optional
((OPTION_D))
All above
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
The square root of the variance is called the ________ deviation
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
empirical
THIS IS MANDATORY OPTION
((OPTION_B))
mean
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
continuous
This is optional
((OPTION_D))
standard
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E D
((EXPLANATION)) This is also optional
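The relationship in the question above can be verified on a small made-up sample with the standard library. The population (not sample) variance is assumed here.

```python
import math
import statistics

# Illustrative data with mean 5.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population variance: mean of squared deviations from the mean.
variance = statistics.pvariance(data)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)
print(variance, std_dev)
```

The squared deviations sum to 32 over 8 values, giving a variance of 4 and a standard deviation of 2, expressed in the same units as the data.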

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION))
For continuous random variables, the CDF is the derivative of the PDF.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional
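The statement is false because the relationship runs the other way: the PDF is the derivative of the CDF. A numeric sketch for the standard normal distribution, chosen here only as a convenient example, using a central finite difference:

```python
import math


def normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x: float) -> float:
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Central-difference approximation of d/dx CDF at an arbitrary point x0.
x0, h = 0.3, 1e-5
numeric_derivative = (normal_cdf(x0 + h) - normal_cdf(x0 - h)) / (2.0 * h)
print(numeric_derivative, normal_pdf(x0))
```

The two printed values agree to several decimal places, as expected when differentiating the CDF recovers the PDF.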

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Cumulative distribution functions are used to specify the distribution of
multivariate random variables.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Consider the results of a medical experiment that aims to predict whether someone is
going to develop myopia based on some physical measurements and heredity. In this
case, the input dataset consists of the person’s medical characteristics and the target
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Regression
THIS IS MANDATORY OPTION
((OPTION_B))
Decision Tree
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Clustering
This is optional
((OPTION_D))
Association Rule
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) The purpose of a machine learning model is to approximate an unknown function
that associates input elements to output ones
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR
1
3 UPTO 10)
((QUESTION))
Training set is normally a representation of a global distribution
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

2
3 UPTO 10)
((QUESTION)) The model has an excessive capacity and it is no longer able to
generalize considering the original dynamics provided by the training set. This
problem is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A))
Underfitting
THIS IS MANDATORY OPTION
((OPTION_B))
Overfitting
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Both
This is optional
((OPTION_D))
None
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) It can associate almost perfectly all the known samples to the corresponding output
values, but when an unknown input is presented, the corresponding prediction
error can be very high. This problem is called

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A))
Underfitting
THIS IS MANDATORY OPTION
((OPTION_B))
Overfitting
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Both
This is optional
((OPTION_D))
None
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR
1
3 UPTO 10)
((QUESTION)) ---------- may prove to be more difficult to discover as it could be initially considered
the result of a perfect fitting
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
Underfitting
THIS IS MANDATORY OPTION
((OPTION_B))
Overfitting
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Both
This is optional
((OPTION_D))
None
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E B
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) When working with a supervised scenario, we define a non-negative error
measure e_m which takes two arguments and allows us to compute a total error value
over the whole dataset. Those two arguments are
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A))
expected and predicted output
THIS IS MANDATORY OPTION
((OPTION_B))
calculated and predicted output
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
calculated and measured output
This is optional
((OPTION_D))
none
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A

((EXPLANATION)) This is also optional
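A minimal sketch of such an error measure, assuming squared error as the per-sample measure; the question only requires it to be non-negative, so this is one possible choice. The function names and sample values are illustrative.

```python
def squared_error(expected: float, predicted: float) -> float:
    """Non-negative error for one sample: takes expected and predicted output."""
    return (expected - predicted) ** 2


def total_error(expected_values, predicted_values, error_measure=squared_error):
    """Sum the per-sample error measure over the whole dataset."""
    return sum(
        error_measure(e, p) for e, p in zip(expected_values, predicted_values)
    )


# Made-up expected outputs and model predictions.
expected = [1.0, 0.0, 2.0]
predicted = [0.5, 0.0, 3.0]
total = total_error(expected, predicted)
print(total)
```

Any other non-negative function of the expected and predicted outputs, such as absolute error, could be passed as `error_measure` instead.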

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) The initial value represents a starting point over the surface of an n-variable function. A
generic training algorithm has to find the global minimum or a point quite close to it
(there's always a tolerance to avoid an excessive number of iterations and a
consequent risk of overfitting). This measure is also called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
loss function
THIS IS MANDATORY OPTION
((OPTION_B))
predicted output
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
measured output
This is optional
((OPTION_D))
mean square error
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR
1
3 UPTO 10)
((QUESTION)) In 1984, the computer scientist L. Valiant
proposed a mathematical approach to determine whether a problem is learnable by a
computer. The name of this technique is
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A))
Max likelihood
THIS IS MANDATORY OPTION
((OPTION_B))
Zero one loss error
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Probably approximately correct
This is optional
((OPTION_D))
none
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E C
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same
output element
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional

((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR

1
3 UPTO 10)
((QUESTION)) Therefore, learning a concept (parametrically) means minimizing the corresponding loss function
restricted to a specific class, while learning all possible concepts (belonging to the same universe)
means finding the minimum of a global loss function
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
TRUE
THIS IS MANDATORY OPTION
((OPTION_B))
FALSE
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) Either A or B or C or D or E A
((EXPLANATION)) This is also optional
((MARKS)) QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR
1
3 UPTO 10)
((QUESTION)) An exponential time could lead to computational explosions when the datasets are
too large or the optimization starting point is very far from an acceptable minimum. Moreover,
it's important to remember the so-called …….
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
curse of dimensionality
THIS IS MANDATORY OPTION
((OPTION_B))
Hughes phenomenon
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Probably approximately correct
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) A
Either A or B or C or D or E
((EXPLANATION)) This is also optional
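One way to get a feel for the curse of dimensionality is to look at how little of a unit hypercube is covered by the ball inscribed in it as the number of dimensions grows. The sketch below uses the standard closed form for the volume of a d-dimensional ball; the function name `inscribed_ball_fraction` is illustrative and not part of the source material.

```python
import math

def inscribed_ball_fraction(d: int) -> float:
    """Fraction of the unit hypercube [0, 1]^d covered by the inscribed ball of radius 0.5."""
    radius = 0.5
    # Volume of a d-dimensional ball: pi^(d/2) / Gamma(d/2 + 1) * r^d.
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d

for d in (1, 2, 3, 5, 10):
    print(d, inscribed_ball_fraction(d))
```

The fraction shrinks quickly with d, so most of the volume lies away from the center, which is one reason a fixed-size dataset covers a high-dimensional space so poorly.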

((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
((QUESTION)) In many cases, in order to capture the full expressivity, it's necessary to have a very large dataset, and without enough training data the approximation can become problematic. This is called…
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
curse of dimensionality
THIS IS MANDATORY OPTION
((OPTION_B))
Hughes phenomenon
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
Probably approximately correct
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) B
Either A or B or C or D or E
((EXPLANATION)) This is also optional

((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
((QUESTION)) The first term is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
posteriori
THIS IS MANDATORY OPTION
((OPTION_B))
Apriori
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) A
Either A or B or C or D or E
((EXPLANATION)) This is also optional

((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
((QUESTION)) The second term is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
posteriori
THIS IS MANDATORY OPTION
((OPTION_B))
Apriori
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) B
Either A or B or C or D or E
((EXPLANATION)) This is also optional

((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
((QUESTION)) The third term is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A))
posteriori
THIS IS MANDATORY OPTION
((OPTION_B))
Apriori
THIS IS ALSO MANDATORY OPTION
((OPTION_C))
likelihood.
This is optional
((OPTION_D))
This is optional
((OPTION_E)) This is optional. If optional keep empty so that
system will skip this option
((CORRECT_CHOICE)) C
Either A or B or C or D or E
((EXPLANATION)) This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and artificial intelligence (AI)
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) ML is an alternate way of programming intelligent machines.

THIS IS
MANDATORY
OPTION

((OPTION_B)) ML and AI have very different goals

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) ML is a set of techniques that turns a dataset into a software.

This is optional

((OPTION_D)) AI is a software that can emulate the human mind

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following sentences is FALSE regarding regression?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) It is used for prediction

THIS IS
MANDATORY
OPTION

((OPTION_B)) It may be used for interpretation

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) It relates inputs to outputs.

This is optional

((OPTION_D)) It discovers causal relationships

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Grid search is

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Linear in D
THIS IS
MANDATORY
OPTION

((OPTION_B)) Exponential in D
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Linear in N
This is optional

((OPTION_D)) Both B&C

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
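The growth of grid search in D and N can be illustrated with a small sketch. It assumes a full grid with k candidate values per parameter and a loss that scans all N samples at every grid point; the names `grid_points`, `grid_search_cost` and `k` are illustrative and not taken from the question.

```python
from itertools import product

def grid_points(k: int, d: int) -> int:
    """Count the points of a full grid with k candidate values in each of d dimensions."""
    # Enumerate the grid explicitly so the count reflects the real amount of work.
    return sum(1 for _ in product(range(k), repeat=d))

def grid_search_cost(k: int, d: int, n: int) -> int:
    """Number of per-sample loss evaluations: every grid point scans all n samples."""
    return grid_points(k, d) * n

for d in (1, 2, 3):
    print(d, grid_points(10, d), grid_search_cost(10, d, 100))
```

Under these assumptions the number of grid points is k to the power D, while doubling N only doubles the total cost.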
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Find the incorrect statement regarding the gradient of a continuous and differentiable function
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) is zero at a minimum

THIS IS
MANDATORY
OPTION

((OPTION_B)) is non-zero at a maximum

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) is zero at a saddle point

This is optional

((OPTION_D)) decreases as you get closer to the minimum

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Consider a linear-regression model with N = 3 and D = 1 with input-output pairs as follows: y1 = 22, x1 = 1, y2 = 3, x2 = 1, y3 = 3, x3 = 2. What is the gradient of mean-square error (MSE) with respect to β1 when β0 = 0 and β1 = 1? Give your answer correct to two decimal digits.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) -1.66
THIS IS
MANDATORY
OPTION

((OPTION_B)) 2
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 3
This is optional

((OPTION_D)) 4
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
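The gradient can be checked numerically with a short sketch. Two assumptions are made here and neither is stated in the question: the MSE carries a 1/(2N) factor, and y1 is taken as 2 (the printed value 22 does not reproduce any listed option, while 2 gives -5/3, which matches option A up to rounding). The function name `mse_gradient_beta1` is illustrative.

```python
def mse_gradient_beta1(xs, ys, beta0, beta1):
    """Gradient of (1/(2N)) * sum((y - beta0 - beta1 * x)^2) with respect to beta1."""
    n = len(xs)
    return -sum(x * (y - beta0 - beta1 * x) for x, y in zip(xs, ys)) / n

xs = [1.0, 1.0, 2.0]
ys = [2.0, 3.0, 3.0]  # assumption: y1 = 2, see the note above
g = mse_gradient_beta1(xs, ys, beta0=0.0, beta1=1.0)
print(round(g, 2))
```

With a 1/N factor instead of 1/(2N), the same data would give twice this value.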
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Let us say that we have computed the gradient of our cost function and stored it in a vector g. What is the cost of one gradient descent update given the gradient?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) O(D)
THIS IS
MANDATORY
OPTION

((OPTION_B)) O(N)
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) O(ND)
This is optional

((OPTION_D)) O(ND^2)

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
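A minimal sketch of the update step, assuming the parameters and the gradient g are stored as plain lists of length D and that `step_size` is a hypothetical learning rate: once g is available, the update touches each of the D entries exactly once.

```python
def gradient_step(beta, g, step_size):
    """One gradient descent update: beta <- beta - step_size * g (one pass over D entries)."""
    return [b - step_size * gi for b, gi in zip(beta, g)]

beta = [0.0, 1.0, -2.0]   # made-up parameter vector, D = 3
g = [0.5, -1.0, 2.0]      # made-up gradient, already computed
new_beta = gradient_step(beta, g, step_size=0.1)
print(new_beta)
```

The number of training samples N does not appear in this step; it only matters when the gradient itself is computed.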
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) You observe the following while fitting a linear regression to the data: As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low (almost what you expect it to be), while the test error is much higher than the train error. What do you think is the main reason behind this behavior? Choose the most probable option.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) High variance
THIS IS
MANDATORY
OPTION

((OPTION_B)) High model bias

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) High estimation bias

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Adding more basis functions in a linear model... (pick the most probable option)
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Decreases model bias

THIS IS
MANDATORY
OPTION

((OPTION_B)) Decreases estimation bias

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Decreases variance

This is optional

((OPTION_D)) Doesn’t affect bias and variance

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
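The effect of an extra basis function on model bias can be sketched on made-up data. The example below fits two least-squares models, one with the single basis function 1 and one with the basis functions 1 and x; the data and all function names are assumptions chosen for illustration.

```python
def fit_constant(xs, ys):
    """Least-squares fit with the single basis function 1: predict the mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Least-squares fit with the basis functions 1 and x (simple linear regression)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # made-up data generated by y = 1 + 2x
error_constant = mse(fit_constant(xs, ys), xs, ys)
error_line = mse(fit_line(xs, ys), xs, ys)
print(error_constant, error_line)
```

The constant model cannot represent the trend at all, whereas the richer basis fits it; the price of a richer basis is usually a higher variance when data is scarce.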
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The problem of finding hidden structure in unlabeled data is called

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Supervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Unsupervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Reinforcement learning

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Task of inferring a model from labeled training data is called

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Unsupervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) supervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Reinforcement learning

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Some telecommunication company wants to segment their customers into distinct groups in order to send appropriate subscription offers; this is an example of
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Supervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Data extraction

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Serration
This is optional

((OPTION_D)) Unsupervised learning

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Self-organizing maps are an example of

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Unsupervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Supervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Reinforcement learning

This is optional

((OPTION_D)) Missing data imputation

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) You are given data about seismic activity in Japan, and you want to predict the magnitude of the next earthquake; this is an example of
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Supervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Unsupervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Serration
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Assume you want to perform supervised learning and to predict the number of newborns according to the size of the storks’ population (http://www.brixtonhealth.com/storksBabies.pdf); it is an example of
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Classification
THIS IS
MANDATORY
OPTION

((OPTION_B)) Regression
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Clustering
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Discriminating between spam and ham e-mails is a classification task, true or false?
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) In the example of predicting the number of babies based on the storks’ population size, the number of babies is
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Outcome
THIS IS
MANDATORY
OPTION

((OPTION_B)) Feature
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Attribute
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) It may be better to avoid the metric of ROC curve as it can suffer
from accuracy paradox.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True

THIS IS
MANDATORY
OPTION

((OPTION_B)) False

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following is not involved in data mining?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Knowledge extraction

THIS IS
MANDATORY
OPTION

((OPTION_B)) Data archaeology

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Data exploration

This is optional

((OPTION_D)) Data transformation

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The expected value or _______ of a random variable is the center of its
distribution.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Mode
THIS IS
MANDATORY
OPTION

((OPTION_B)) median
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) mean
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Point out the correct statement.

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Some cumulative distribution function F is non-decreasing and right-continuous

THIS IS
MANDATORY
OPTION

((OPTION_B)) Every cumulative distribution function F is decreasing and right-continuous

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Every cumulative distribution function F is increasing and left-continuous

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following of a random variable is a measure of spread

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) variance
THIS IS
MANDATORY
OPTION

((OPTION_B)) standard deviation

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) empirical mean

This is optional

((OPTION_D)) All above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The square root of the variance is called the ________ deviation

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) empirical
THIS IS
MANDATORY
OPTION

((OPTION_B)) mean
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) continuous
This is optional

((OPTION_D)) standard
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
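The relation between the two measures of spread can be checked on a small made-up sample. The sketch below uses the population form of the variance (division by the number of values); the data and function names are illustrative only.

```python
import math

def variance(values):
    """Population variance: mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def standard_deviation(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample with mean 5
print(variance(data), standard_deviation(data))
```

Because of the square root, the standard deviation is expressed in the same units as the data, while the variance is in squared units.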
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
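The statement is false because the relation runs the other way: the PDF is the derivative of the CDF. A quick numerical check, assuming an exponential distribution with rate 1 as a convenient example (the choice of distribution and the helper names are not from the source):

```python
import math

def cdf(x: float) -> float:
    """CDF of the exponential distribution with rate 1."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def pdf(x: float) -> float:
    """PDF of the exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

def numerical_derivative(f, x: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    print(x, numerical_derivative(cdf, x), pdf(x))
```

At each point the finite-difference slope of the CDF agrees with the PDF up to the approximation error.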
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Cumulative distribution functions are used to specify the distribution of multivariate random variables.
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Consider the results of a medical experiment that aims to predict whether someone is going to develop myopia based on some physical measurements and heredity. In this case, the input dataset consists of the person’s medical characteristics and the target variable is binary: 1 for those who are likely to develop myopia and 0 for those who aren’t. This can be best classified as
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Regression
THIS IS
MANDATORY
OPTION

((OPTION_B)) Decision Tree

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Clustering
This is optional

((OPTION_D)) Association Rule

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The purpose of a machine learning model is to approximate an unknown function that associates input elements to output ones
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Training set is normally a representation of a global distribution
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The model has an excessive capacity and it's no longer able to generalize considering the original dynamics provided by the training set. This problem is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Underfitting
THIS IS
MANDATORY
OPTION

((OPTION_B)) Overfitting

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) None
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) It can associate almost perfectly all the known samples to the corresponding output values, but when an unknown input is presented, the corresponding prediction error can be very high. This problem is called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Underfitting
THIS IS MANDATORY OPTION

((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) None
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
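The behaviour described in the question can be reproduced with a deliberately extreme, hypothetical model that simply memorizes its training set. In the sketch below the underlying function, the noise values and the test inputs are all made up for illustration.

```python
def true_function(x: float) -> float:
    """Made-up underlying relation between input and output."""
    return 2.0 * x

# Training outputs carry a fixed, made-up noise of +1 / -1 on top of true_function.
train = {0.0: 1.0, 1.0: 1.0, 2.0: 5.0, 3.0: 5.0}

def memorizer(x: float) -> float:
    """Return the stored output of the closest training input (1-nearest-neighbour lookup)."""
    nearest = min(train, key=lambda known_x: abs(known_x - x))
    return train[nearest]

def mean_squared_error(inputs, targets):
    """Mean squared error of the memorizer on the given input/target pairs."""
    return sum((memorizer(x) - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)

train_error = mean_squared_error(list(train), list(train.values()))
test_inputs = [0.4, 1.4, 2.4]
test_error = mean_squared_error(test_inputs, [true_function(x) for x in test_inputs])
print(train_error, test_error)
```

The training error is zero because the noise itself has been memorized, while the error on unseen inputs is clearly larger.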
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) ---------- may prove to be more difficult to discover as it could be initially considered the result of a perfect fitting
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Underfitting
THIS IS MANDATORY OPTION

((OPTION_B)) Overfitting
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) None
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) When working with a supervised scenario, we define a non-negative error measure e_m which takes two arguments and allows us to compute a total error value over the whole dataset. Those two arguments are
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) expected and predicted output
THIS IS MANDATORY OPTION

((OPTION_B)) calculated and predicted output
THIS IS ALSO MANDATORY OPTION

((OPTION_C)) calculated and measured output
This is optional

((OPTION_D)) none
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option
((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
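A minimal sketch of such an error measure, assuming the squared difference as the per-sample measure (the question does not fix a particular choice; `squared_error` and `total_error` are illustrative names):

```python
def squared_error(expected: float, predicted: float) -> float:
    """A non-negative per-sample error measure taking the expected and the predicted output."""
    return (expected - predicted) ** 2

def total_error(em, expected_outputs, predicted_outputs) -> float:
    """Sum the per-sample measure em over the whole dataset."""
    return sum(em(e, p) for e, p in zip(expected_outputs, predicted_outputs))

expected = [1.0, 2.0, 3.0]    # made-up target values
predicted = [1.0, 2.5, 2.0]   # made-up model outputs
print(total_error(squared_error, expected, predicted))
```

Any other non-negative function of the same two arguments, such as the absolute difference, could be passed in place of `squared_error`.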
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Initial value represents a starting point over the surface of an n-variable function. A generic training algorithm has to find the global minimum or a point quite close to it (there's always a tolerance to avoid an excessive number of iterations and a consequent risk of overfitting). This measure is also called
ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) loss function
THIS IS MANDATORY OPTION

((OPTION_B)) predicted output
THIS IS ALSO MANDATORY OPTION

((OPTION_C)) measured output
This is optional

((OPTION_D)) mean square error
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))

((QUESTION)) We can create an object of an abstract class

((OPTION_A))

((OPTION_B))

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE))

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following steps/assumptions in regression modeling most affects the trade-off between underfitting and overfitting?

((OPTION_A)) The polynomial degree

((OPTION_B)) Whether we learn the weights by matrix inversion or gradient descent

((OPTION_C)) The use of a constant term

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
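As a rough illustration of why the polynomial degree matters most, the sketch below fits polynomials of degree 1, 3 and 8 to the same points and prints the training and held-out errors. The data (a cubic trend plus a small deterministic wiggle standing in for noise), the train/test split, and the helper names `solve`, `polyfit` and `mse` are all assumptions made for this example, not part of the question.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    k = degree + 1
    gram = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(gram, rhs)

def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    preds = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

def target(x):
    # Hypothetical data-generating curve: cubic trend plus a small wiggle.
    return x ** 3 - 0.5 * x + 0.05 * math.sin(37 * x)

all_x = [i / 10 - 1 for i in range(21)]
train_x, test_x = all_x[0::2], all_x[1::2]      # even indices train, odd indices test
train_y = [target(x) for x in train_x]
test_y = [target(x) for x in test_x]

errors = {}
for degree in (1, 3, 8):
    coeffs = polyfit(train_x, train_y, degree)
    errors[degree] = (mse(coeffs, train_x, train_y), mse(coeffs, test_x, test_y))
    print(f"degree {degree}: train MSE = {errors[degree][0]:.5f}, "
          f"test MSE = {errors[degree][1]:.5f}")
```

The training error can only shrink as the degree grows, because each higher-degree model contains the lower-degree ones; the held-out error is what shows whether the extra flexibility is fitting signal or noise. How the weights are computed (option B) changes none of these numbers in principle.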
((MARKS)) 1

((QUESTION)) Suppose you have the following data with one real-valued input variable and one real-valued output variable. What is the leave-one-out cross-validation mean squared error for linear regression (Y = bX + c)?

((OPTION_A)) 10/27

((OPTION_B)) 20/27

((OPTION_C)) 50/27

((OPTION_D)) 49/27

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
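The data for this question did not survive in the text, so the procedure can only be sketched on assumed data. The example below uses three hypothetical points, (0, 2), (2, 2) and (3, 1), chosen for illustration; each point is held out in turn, a line is fitted to the other two, and the squared prediction errors are averaged. Exact fractions are used so the result can be compared with the answer options directly.

```python
from fractions import Fraction

def fit_line(points):
    """Ordinary least squares for y = b*x + c, in exact rational arithmetic."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - b * sx) / n
    return b, c

def loocv_mse(points):
    """Leave-one-out cross-validation mean squared error for a straight-line fit."""
    total = Fraction(0)
    for i, (x, y) in enumerate(points):
        rest = points[:i] + points[i + 1:]   # training set without point i
        b, c = fit_line(rest)
        total += (y - (b * x + c)) ** 2      # squared error on the held-out point
    return total / len(points)

# ASSUMED data (the original table is not available in this text).
points = [(Fraction(0), Fraction(2)),
          (Fraction(2), Fraction(2)),
          (Fraction(3), Fraction(1))]

result = loocv_mse(points)
print(result)  # 49/27
```

For these assumed points the three squared errors are 4, 4/9 and 1, giving (4 + 4/9 + 1) / 3 = 49/27.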
((MARKS)) 1

((QUESTION)) Which of the following is/are true about the "Maximum Likelihood Estimate (MLE)"?
1. MLE may not always exist
2. MLE always exists
3. If MLE exists, it (they) may not be unique
4. If MLE exists, it (they) must be unique

((OPTION_A)) 1 and 4

((OPTION_B)) 2 and 3

((OPTION_C)) 1 and 3

((OPTION_D)) 2 and 4

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
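Statement 3 can be illustrated with a standard textbook case: samples from a uniform distribution on [θ, θ + 1]. The likelihood is 1 whenever every sample lies in that interval and 0 otherwise, so every θ between max(samples) - 1 and min(samples) is a maximizer. The sample values and the function name `likelihood` below are assumptions for this sketch.

```python
def likelihood(theta, samples):
    """Likelihood of Uniform[theta, theta + 1]: 1 if all samples fit, else 0."""
    return 1.0 if all(theta <= s <= theta + 1 for s in samples) else 0.0

# Hypothetical samples (chosen to be exactly representable as floats).
samples = [0.25, 0.5, 0.75]

lo = max(samples) - 1          # smallest theta that still covers the largest sample
hi = min(samples)              # largest theta that still covers the smallest sample
candidates = [lo, (lo + hi) / 2, hi]

values = [likelihood(t, samples) for t in candidates]
outside = likelihood(hi + 0.05, samples)

print(candidates, values)      # several different thetas reach the same maximum
print(outside)                 # a theta outside [lo, hi] has zero likelihood
```

Every candidate in the interval attains the maximum likelihood, so the MLE exists here but is not unique.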
((MARKS)) 1

((QUESTION)) Suppose a "Linear regression" model perfectly fits the training data (train error is zero). Which of the following statements is true?

((OPTION_A)) You will always have zero test error

((OPTION_B)) You cannot have zero test error

((OPTION_C)) None of the above

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which one of the following statements is true regarding residuals in regression analysis?

((OPTION_A)) Mean of residuals is always zero

((OPTION_B)) Mean of residuals is always less than zero

((OPTION_C)) Mean of residuals is always greater than zero

((OPTION_D)) There is no such rule for residuals

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
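The zero-mean property holds for ordinary least squares when the model includes an intercept; the sketch below checks this numerically and, for contrast, shows that a line forced through the origin generally leaves a non-zero mean residual. The data values and the helper names `fit` and `mean_residual` are assumptions for this example.

```python
def fit(xs, ys, intercept=True):
    """Least-squares line y = b*x + c; with intercept=False the line is forced through 0."""
    n = len(xs)
    if intercept:
        mx, my = sum(xs) / n, sum(ys) / n
        b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
        c = my - b * mx
    else:
        b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        c = 0.0
    return b, c

def mean_residual(xs, ys, b, c):
    """Average of the vertical differences between observed and fitted values."""
    return sum(y - (b * x + c) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

with_intercept = mean_residual(xs, ys, *fit(xs, ys, intercept=True))
through_origin = mean_residual(xs, ys, *fit(xs, ys, intercept=False))

print(f"mean residual with intercept:    {with_intercept:.2e}")
print(f"mean residual without intercept: {through_origin:.2e}")
```

With an intercept the mean residual is zero up to floating-point rounding, because the intercept's normal equation is exactly the condition that the residuals sum to zero.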
((MARKS)) 1

((QUESTION)) Which of the following is true about heteroskedasticity?

((OPTION_A)) Linear Regression with varying error terms

((OPTION_B)) Linear Regression with constant error terms

((OPTION_C)) Linear Regression with zero error terms

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following indicates a fairly strong relationship between X and Y?

((OPTION_A)) Correlation coefficient = 0.9

((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001

((OPTION_C)) The t-statistic for the null hypothesis Beta coefficient = 0 is 30

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation

((OPTION_A)) 1, 2 & 3

((OPTION_B)) 1 & 3

((OPTION_C)) All of the above

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) To test for a linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?

((OPTION_A)) Scatter plot

((OPTION_B)) Bar chart

((OPTION_C)) Histograms

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Generally, which of the following method(s) is used for predicting a continuous dependent variable?
1. Linear Regression
2. Logistic Regression

((OPTION_A)) 1 & 2

((OPTION_B)) Only 1

((OPTION_C)) Only 2

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) A correlation between the age and health of a person is found to be -1.09. On the basis of this, you would tell the doctors that:

((OPTION_A)) The age is a good predictor of health

((OPTION_B)) The age is a poor predictor of health

((OPTION_C)) None of these

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
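A Pearson correlation coefficient always lies between -1 and 1, so a reported value of -1.09 points to a calculation error rather than to any conclusion about age and health. The sketch below computes the coefficient from its definition; the ages, the health scores and the function name `pearson_r` are assumptions for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements.
ages = [25, 35, 45, 55, 65]
health = [88, 80, 79, 64, 61]

r = pearson_r(ages, health)
print(f"r = {r:.3f}")

# By the Cauchy-Schwarz inequality |sxy| <= sqrt(sxx * syy), so |r| <= 1 always.
in_range = -1.0 <= r <= 1.0
print("valid correlation:", in_range)
```

Even a perfectly linear decreasing relationship gives exactly -1, never less.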
((MARKS)) 1

((QUESTION)) Which of the following offsets do we use in a least-squares line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.

((OPTION_A)) Vertical offset

((OPTION_B)) Perpendicular offset

((OPTION_C)) Both, depending on the situation

((OPTION_D)) Both a & b

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Suppose we have generated the data with the help of polynomial regression of degree 3 (degree 3 will perfectly fit this data). Now consider the points below and choose the option based on them.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance

((OPTION_A)) Only 1

((OPTION_B)) 1 & 3

((OPTION_C)) 1 & 4

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) . Suppose you are training a linear regression model. Now consider
these points.
ENTER
CONTENT. QTN 1. Overfitting is more likely if we have less data
CAN HAVE 2. Overfitting is more likely when the hypothesis space is small
IMAGES ALSO
Which of the above statement(s) are correct?
((OPTION_A)) Both are False

THIS IS
MANDATORY
OPTION

((OPTION_B)) 1 is False and 2 is True

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 1 is True and 2 is False

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Suppose we fit “Lasso Regression” to a data set which has 100 features (X1, X2, …, X100). Now we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with the same regularization parameter.
Now, which of the following options will be correct?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) It is more likely for X1 to be excluded from the model

THIS IS
MANDATORY
OPTION

((OPTION_B)) It is more likely for X1 to be included in the model

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Can’t say

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
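
One way to see why rescaling makes X1 more likely to be kept is a toy single-feature lasso without an intercept, where the solution has a closed form (soft-thresholding). This is a simplified sketch, not the full 100-feature setting of the question; the data, the penalty value and the function name `lasso_one_feature` are assumptions chosen for illustration.

```python
def lasso_one_feature(x, y, lam):
    """Minimise 0.5 * sum((y_i - b * x_i)**2) + lam * |b| for one feature.

    The minimiser is the soft-thresholded correlation rho = sum(x_i * y_i),
    divided by sum(x_i ** 2).
    """
    rho = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    if rho > lam:
        return (rho - lam) / sxx
    if rho < -lam:
        return (rho + lam) / sxx
    return 0.0  # the penalty dominates, so the feature is dropped


x = [1.0, 2.0, 3.0]   # hypothetical feature values
y = [0.5, 1.0, 1.5]   # hypothetical targets
lam = 10.0            # same regularization parameter in both fits

b_original = lasso_one_feature(x, y, lam)
b_rescaled = lasso_one_feature([10 * xi for xi in x], y, lam)
```

With the original scale the penalty zeroes the coefficient; after multiplying the feature by 10 the same fit needs a coefficient ten times smaller, so the fixed penalty costs less and the coefficient survives.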

((QUESTION)) Which of the following is true about “Ridge” or “Lasso” regression methods in the case of feature selection?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Ridge regression uses subset selection of features

THIS IS
MANDATORY
OPTION

((OPTION_B)) Lasso regression uses subset selection of features

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both use subset selection of features

This is optional

((OPTION_D)) All of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Which of the following statement(s) can be true after adding a variable to a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) 1 and 2

THIS IS
MANDATORY
OPTION

((OPTION_B)) 1 and 3

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 2 and 4

This is optional

((OPTION_D)) none of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
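
The two possible outcomes can be checked with the usual adjusted R-squared formula, 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below uses made-up R² values and sample sizes; the function name `adjusted_r2` is an assumption for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical model with 3 predictors on 30 observations
before = adjusted_r2(0.800, n_samples=30, n_predictors=3)

# Adding a useful 4th variable: R-squared rises a lot, adjusted R-squared rises too
useful = adjusted_r2(0.850, n_samples=30, n_predictors=4)

# Adding a near-useless 4th variable: R-squared barely rises, adjusted R-squared falls
useless = adjusted_r2(0.801, n_samples=30, n_predictors=4)
```

R-squared itself cannot decrease when a variable is added to an ordinary least-squares fit, which is why statements 3 and 4 are ruled out.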

((QUESTION)) Which of the following metrics can be used for evaluating regression models?
1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) 2 and 4

THIS IS MANDATORY OPTION

((OPTION_B)) 1 and 2
THIS IS ALSO MANDATORY OPTION

((OPTION_C)) 2, 3 and 4
This is optional

((OPTION_D)) All of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) We can also compute the coefficients of linear regression with the help of an analytical method called “Normal Equation”. Which of the following is/are true about “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) 1 and 2

THIS IS
MANDATORY
OPTION

((OPTION_B)) 1&3
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 2&3
This is optional

((OPTION_D)) 1,2&3
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
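
A minimal sketch of the Normal Equation for the simplest case (an intercept plus one feature) shows all three properties: the coefficients come from solving (XᵀX)β = Xᵀy directly, with no learning rate and no iteration. The data and the function name `normal_equation_simple` are assumptions for illustration; with p features the same step means solving a p×p system, which is what becomes slow when p is very large.

```python
def normal_equation_simple(xs, ys):
    """Solve (X^T X) beta = X^T y for X = [1, x] using Cramer's rule."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]
    det = n * sxx - sx * sx
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Hypothetical noise-free data generated from y = 1 + 2x
b0, b1 = normal_equation_simple([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```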
((QUESTION)) The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the regression line is defined as:
Y = β0 + β1 X1 + β2 X2 + … + βn Xn
Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a positive or negative number).
2. The value of βi is always the same, regardless of the values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) 1 and 2

THIS IS
MANDATORY
OPTION

((OPTION_B)) 1 and 3

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 2 and 3

This is optional

((OPTION_D)) 1,2 and 3

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) How many coefficients do you need to estimate in a simple linear regression model (one independent variable)?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) 1
THIS IS
MANDATORY
OPTION

((OPTION_B)) 2
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) CAN’T SAY

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 2
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I want to find the sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) A has higher than B

THIS IS
MANDATORY
OPTION

((OPTION_B)) A has lower than B

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both have same

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
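
The reasoning behind "both have same" can be sketched numerically: for a least-squares line that includes an intercept, the residuals sum to zero no matter how well or badly the line fits. The two data sets below are invented stand-ins for the missing graphs A and B, and `ols_residuals` is a hypothetical helper.

```python
def ols_residuals(xs, ys):
    """Residuals of a simple least-squares line fitted with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
residuals_a = ols_residuals(xs, [2.0, 4.1, 5.9, 8.2, 9.8])  # tight fit
residuals_b = ols_residuals(xs, [9.0, 1.0, 7.5, 2.0, 6.0])  # loose fit

sum_a = sum(residuals_a)
sum_b = sum(residuals_b)
```

Both sums are zero up to floating-point rounding, while the sums of squared residuals differ widely between the two fits.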

((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) YES
THIS IS
MANDATORY
OPTION

((OPTION_B)) NO
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both a&b

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Correlated variables can have zero correlation coefficient. True or False?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
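
A standard illustration, sketched here with made-up numbers: y = x² on a symmetric range of x is completely determined by x, yet its Pearson correlation coefficient is zero because the relationship is not linear. The helper name `pearson_r` is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]   # y depends on x exactly, but not linearly
r = pearson_r(xs, ys)
```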
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y. Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider the remaining parameters to be the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Only 2
THIS IS
MANDATORY
OPTION

((OPTION_B)) Only 1
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Only 3
This is optional

((OPTION_D)) All of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the graph show the residuals for each predicted value. Use this information to compute the SSE.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) 3.02

THIS IS
MANDATORY
OPTION

((OPTION_B)) 0.75
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 1.01

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Suppose the distribution of salaries in a company X has median $35,000, and the 25th and 75th percentiles are $21,000 and $53,000 respectively.
Would a person with a salary of $1 be considered an outlier?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) YES

THIS IS
MANDATORY
OPTION

((OPTION_B)) NO
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) More information is required

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
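
The question does not say which outlier rule applies, which is presumably why more information is needed. Purely as an illustration, the sketch below applies one common convention, the 1.5 × IQR fences; the choice of that rule, the multiplier `k` and the function name `tukey_fences` are assumptions, not part of the question.

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper fences under the k * IQR convention (k = 1.5 assumed)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles taken from the question; the rule itself is an assumption
lower, upper = tukey_fences(21_000, 53_000)

salary = 1
flagged = not (lower <= salary <= upper)
```

Under this particular convention the lower fence is negative, so a salary of $1 would not be flagged; a different rule or a different assumed distribution could give a different verdict.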

((QUESTION)) Which of the following options is true regarding “Regression” and “Correlation”?
Note: y is the dependent variable and x is the independent variable.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) The relationship is symmetric between x and y in both.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The relationship is not symmetric between x and y in both.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.
This is optional

((OPTION_D)) The relationship is symmetric between x and y in case of correlation but in case of regression it is not symmetric.
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) True-False: Is Logistic regression a supervised machine learning algorithm?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) TRUE

THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) True-False: Is Logistic regression mainly used for Regression?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) True-False: Is it possible to design a logistic regression algorithm using a Neural Network Algorithm?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification problem?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Which of the following methods do we use to best fit the data in Logistic Regression?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Least Square Error

THIS IS
MANDATORY
OPTION

((OPTION_B)) Maximum Likelihood

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Jaccard distance

This is optional

((OPTION_D)) Both A & B

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) One of the very good methods to analyze the performance of Logistic Regression is AIC, which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) We prefer a model with minimum AIC value

THIS IS
MANDATORY
OPTION

((OPTION_B)) We prefer a model with maximum AIC value

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both but depend on the situation

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) True-False: Standardisation of features is required before training a Logistic Regression model.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Which of the following algorithms do we use for Variable Selection?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) LASSO

THIS IS
MANDATORY
OPTION

((OPTION_B)) Ridge

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) All of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) odds will be 0

THIS IS
MANDATORY
OPTION

((OPTION_B)) odds will be 0.5

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) odds will be 1

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
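
The arithmetic behind the answer, as a small sketch: odds are the probability of the event divided by the probability of its complement, so a fair coin gives 0.5 / 0.5 = 1. The function name `odds` and the second probability are chosen only for illustration.

```python
def odds(p):
    """Odds in favour of an event with probability p (0 <= p < 1)."""
    return p / (1.0 - p)


fair_coin = odds(0.5)      # heads and tails equally likely
biased_coin = odds(0.75)   # hypothetical coin that lands heads 3 times in 4
```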

((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could be the range of the logit function in the domain x = [0,1]?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) (– ∞ , ∞)

THIS IS
MANDATORY
OPTION

((OPTION_B)) (0,1)
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) (0, ∞)
This is optional

((OPTION_D)) (- ∞, 0)
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Which of the following options is true?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not required
THIS IS MANDATORY OPTION

((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is not required
THIS IS ALSO MANDATORY OPTION

((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be normally distributed
This is optional

((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Logistic(x) = Logit(x)

THIS IS
MANDATORY
OPTION

((OPTION_B)) Logistic(x) = Logit_inv(x)

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Logistic(x) = Logit(x)

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 2
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
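
That the logistic function is the inverse of the logit can be checked with a short round trip, sketched below with the usual definitions logit(p) = log(p / (1 - p)) and logistic(z) = 1 / (1 + e^(-z)). The probabilities tested are arbitrary.

```python
import math


def logit(p):
    """Log of the odds; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic(z):
    """Logistic (sigmoid) function; maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


probabilities = [0.1, 0.5, 0.9]
round_trip = [logistic(logit(p)) for p in probabilities]
```

Applying `logistic` to `logit(p)` returns `p` (up to rounding), which is exactly what Logistic(x) = Logit_inv(x) says.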

((QUESTION)) Suppose you applied a Logistic Regression model on given data and got a training accuracy X and testing accuracy Y. Now you want to add a few new features to the same data. Select the option(s) which is/are correct in such a case.
Note: Consider the remaining parameters to be the same.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Training accuracy increases

THIS IS
MANDATORY
OPTION

((OPTION_B)) Training accuracy increases or remains the same

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Testing accuracy decreases

This is optional

((OPTION_D)) Testing accuracy increases or remains the same

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Choose which of the following options is true regarding the One-Vs-All method in Logistic Regression.

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) We need to fit n models in n-class classification problem

THIS IS
MANDATORY
OPTION

((OPTION_B)) We need to fit n-1 models to classify into n classes

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) We need to fit only 1 model to classify into n classes

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
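
A rough sketch of where the "n models" count comes from: One-Vs-All turns an n-class label vector into n binary label vectors, one per class, and each of those would be given to its own binary logistic regression (the model fitting itself is not shown). The labels and the function name `one_vs_all_targets` are invented for illustration.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}


# Hypothetical 3-class problem
labels = ["cat", "dog", "bird", "dog", "cat"]
targets = one_vs_all_targets(labels)
n_models_needed = len(targets)   # one binary classifier per class
```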

((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face on such huge data is that Logistic Regression will take a very long time to train.
What would you do if you want to train logistic regression on the same data in less time while getting comparatively similar accuracy (it may not be the same)?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO
((OPTION_A)) Decrease the learning rate and decrease the number of iteration
THIS IS
MANDATORY
OPTION

((OPTION_B)) Decrease the learning rate and increase the number of iteration
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Increase the learning rate and increase the number of iteration
This is optional

((OPTION_D)) Increase the learning rate and decrease the number of iteration
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 2
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a two-class classification problem.
Note: Y is the target class.
Which of the following images shows the cost function for y = 1?

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) A
THIS IS
MANDATORY
OPTION

((OPTION_B)) B
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) BOTH
This is optional

((OPTION_D)) NONE OF THESE

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Logistic regression is used when you want to:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Predict a dichotomous variable from continuous or dichotomous variables.

THIS IS
MANDATORY
OPTION

((OPTION_B)) Predict a continuous variable from dichotomous variables.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Predict any categorical variable from several other categorical variables.
This is optional

((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) The odds ratio is:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The probability of an event occurring.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.

This is optional

((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)
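
In a logistic regression with log-odds b0 + b1·x, the definition in option C works out to e^(b1), whatever the starting value of x. The sketch below checks this numerically; the coefficients and the function name `odds_at` are hypothetical.

```python
import math


def odds_at(b0, b1, x):
    """Odds of the outcome when the log-odds are b0 + b1 * x."""
    return math.exp(b0 + b1 * x)


b0, b1 = -1.0, 0.7   # hypothetical fitted coefficients

# Odds after a unit change in the predictor divided by the original odds
odds_ratio = odds_at(b0, b1, 3.0) / odds_at(b0, b1, 2.0)
```

The same ratio results from any pair of x values one unit apart, which is why a single odds ratio per predictor is reported.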
((QUESTION)) Large values of the log-likelihood statistic indicate:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) That there are a greater number of explained vs. unexplained observations.

THIS IS
MANDATORY
OPTION

((OPTION_B)) That the statistical model fits the data well.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.

This is optional

((OPTION_D)) That the statistical model is a poor fit of the data.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) Logistic regression assumes a:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.

THIS IS
MANDATORY
OPTION

((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Linear relationship between continuous predictor variables.

This is optional

((OPTION_D)) Linear relationship between observations.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) In binary logistic regression:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) The dependent variable is continuous.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The dependent variable is divided into two equal subcategories.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The dependent variable consists of two categories.

This is optional

((OPTION_D)) There is no dependent variable.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)

((QUESTION)) The correlation coefficient is used to determine:

ENTER CONTENT. QTN CAN HAVE IMAGES ALSO

((OPTION_A)) A specific value of the y-variable given a specific value of the x-variable
THIS IS MANDATORY OPTION

((OPTION_B)) A specific value of the x-variable given a specific value of the y-variable
THIS IS ALSO MANDATORY OPTION

((OPTION_C)) The strength of the relationship between the x and y variables

This is optional

((OPTION_D)) none
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS))
QUESTION IS OF HOW MANY MARKS? (1 OR 2 OR 3 UPTO 10)


((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
This sheet is for 3 Mark questions
S.r No Question Image a b c d Correct Answer
e.g 1 Write down question img.jpg Option a Option b Option c Option d a/b/c/d
1) Which of the following is a characteristic of the best machine learning method?

A) Fast
B) Accuracy
C) Scalable
D) All above

Answer: D

2) What are the different algorithm techniques in Machine Learning?

A) Supervised Learning and Semi-
B) Unsupervised Learning and Transduction
C) Both A & B
D) None of the Mentioned

Answer: C

3) ______ can be adopted when it's necessary to categorize a large amount of data with a few complete examples or when there's the need to

A) Supervised
B) Semi-supervised
C) Reinforcement
D) Clusters

Answer: B

4) In reinforcement learning, this feedback is usually called ___.

A) Overfitting
B) Overlearning
C) Reward
D) None of above

Answer: C

5) In the last decade, many researchers started training bigger and bigger models, built with several different layers; that's why this approach is called _____.

A) Deep learning
B) Machine learning
C) Reinforcement learning
D) Unsupervised learning

Answer: A

6) What does learning exactly mean?

A) Robots are programmed so that they can
B) A set of data is used to discover the
C) Learning is the ability to change
D) It is a set of data is used to discover the

Answer: C

7) It is necessary to allow the model to develop a generalization ability and avoid a common problem called ______.

A) Overfitting
B) Overlearning
C) Classification
D) Regression

Answer: A

8) Techniques that involve the usage of both labeled and unlabeled data are called ___.

A) Supervised
B) Semi-supervised
C) Unsupervised
D) None of the above

Answer: B

9) There's a growing interest in pattern recognition and associative memories whose structure and functioning are similar to what happens in the neocortex. Such an

A) Regression
B) Accuracy
C) Modelfree
D) Scalable

Answer: C

10) ______ showed better performance than other approaches, even without a context-based model.

A) Machine learning
B) Deep learning
C) Reinforcement learning
D) Supervised learning

Answer: B

11) Which of the following sentences is correct?

A) Machine learning relates with the study,
B) Data mining can be defined as the process
C) Both A & B
D) None of the above

Answer: C

12) What is ‘Overfitting’ in Machine learning?

A) when a statistical model describes random error or noise instead of
B) Robots are programmed so that they can perform the task based on data they gather from
C) While involving the process of learning ‘overfitting’ occurs.
D) a set of data is used to discover the potentially predictive relationship

Answer: A

13) What is ‘Test set’?

A) Test set is used to test the accuracy of the hypotheses generated by the learner.
B) It is a set of data used to discover the potentially predictive relationship.
C) Both A & B
D) None of above

Answer: A

14) What is the function of ‘Supervised Learning’?

A) Classifications, Predict time series, Annotate strings
B) Speech recognition, Regression
C) Both A & B
D) None of above

Answer: C

15) Common unsupervised applications include

A) Object segmentation
B) Similarity detection
C) Automatic labeling
D) All above

Answer: D

16) Reinforcement learning is particularly efficient when ______________.

A) the environment is not completely deterministic
B) it's often very dynamic
C) it's impossible to have a precise error measure
D) All above

Answer: D

17) During the last few years, many ______ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state.

A) Logical
B) Classical
C) Classification
D) None of above

Answer: D

18) Common deep learning applications include ____

A) Image classification, Real-time visual tracking
B) Autonomous car driving, Logistic optimization
C) Bioinformatics, Speech recognition
D) All above

Answer: D

19) If there is only a discrete number of possible outcomes (called categories), the process becomes a ______.

A) Regression
B) Classification
C) Modelfree
D) Categories

Answer: B

20) Which of the following are supervised learning applications?

A) Spam detection, Pattern detection, Natural Language Processing
B) Image classification, Real-time visual tracking
C) Autonomous car driving, Logistic optimization
D) Bioinformatics, Speech recognition

Answer: A

21) Let’s say you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges may you face if you have applied OHE on a categorical variable of the train dataset?

A) All categories of categorical variable are not present in the test dataset.
B) Frequency distribution of categories is different in train as compared to the test dataset.
C) Train and Test always have same distribution.
D) Both A and B

Answer: D

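The category-mismatch problem in question 21 can be seen in a few lines of plain Python. This is a minimal sketch, not scikit-learn's encoder: the helper names `fit_categories` and `one_hot` and the colour data are made up for illustration. The categories are learned from the train split only, so a value that appears only in the test split has no column of its own.

```python
def fit_categories(values):
    """Return the sorted list of distinct categories seen in the training data."""
    return sorted(set(values))


def one_hot(value, categories):
    """Encode one value as a 0/1 vector; an unseen category maps to all zeros."""
    return [1 if value == c else 0 for c in categories]


# Hypothetical data: "purple" never appears in the train split.
train = ["red", "green", "blue", "green"]
test = ["green", "purple"]

categories = fit_categories(train)                 # columns come from train only
encoded_test = [one_hot(v, categories) for v in test]

for value, row in zip(test, encoded_test):
    print(value, row)
```

Mapping unseen values to an all-zero row is one possible policy; raising an error is another. scikit-learn's `OneHotEncoder` exposes this choice through its `handle_unknown` parameter.
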
22) Which of the following sentences is FALSE regarding regression?

A) It relates inputs to outputs.
B) It is used for prediction.
C) It may be used for interpretation.
D) It discovers causal relationships.

Answer: D

23) Which of the following methods is used to find the optimal features for cluster analysis?

A) k-Means
B) Density-Based Spatial Clustering
C) Spectral Clustering Find clusters
D) All above

Answer: D

24) scikit-learn also provides functions for creating dummy datasets from scratch:

A) make_classification()
B) make_regression()
C) make_blobs()
D) All above

Answer: D

25) _____ which can accept a NumPy RandomState generator or an integer seed.

A) make_blobs
B) random_state
C) test_size
D) training_size

Answer: B

26) In many classification problems, the target dataset is made up of categorical labels which cannot immediately be processed by any algorithm. An encoding is needed and scikit-learn offers at least _____ valid options.

A) 1
B) 2
C) 3
D) 4

Answer: B

27) In which of the following is each categorical label first turned into a positive integer and then transformed into a vector where only one feature is 1 while all the others are 0?

A) LabelEncoder class
B) DictVectorizer
C) LabelBinarizer class
D) FeatureHasher

Answer: C

28) ______ is the most drastic one and should be considered only when the dataset is quite large, the number of missing features is high, and any prediction could be risky.

A) Removing the whole line
B) Creating sub-model to predict those features
C) Using an automatic strategy to input them according to the other known values
D) All above

Answer: A

29) It's possible to specify if the scaling process must include both mean and standard deviation using the parameters ________.

A) with_mean=True/False
B) with_std=True/False
C) Both A & B
D) None of the Mentioned

Answer: C

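What the two flags in question 29 control can be sketched without any library. The function below is a hypothetical helper, not scikit-learn's implementation: it only borrows the parameter names `with_mean` and `with_std` from the question, and it assumes the population standard deviation (division by n).

```python
from math import sqrt


def standardize(values, with_mean=True, with_std=True):
    """Center and/or scale a list of numbers (hypothetical helper).

    with_mean: subtract the mean of the values.
    with_std:  divide by the population standard deviation.
    """
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    shift = mean if with_mean else 0.0
    # Guard against a zero standard deviation for constant input.
    scale = std if (with_std and std > 0) else 1.0
    return [(v - shift) / scale for v in values]


data = [2.0, 4.0, 6.0]
centered_only = standardize(data, with_std=False)   # mean removed, scale kept
scaled_only = standardize(data, with_mean=False)    # scale changed, mean kept
both = standardize(data)                            # zero mean, unit variance

print(centered_only, scaled_only, both)
```

Because the two flags act independently, either one can be switched off, which is why the answer combines both options.
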
30) Which of the following selects the best K high-score features?

A) SelectPercentile
B) FeatureHasher
C) SelectKBest
D) All above

Answer: C

31) How does the number of observations influence overfitting? Choose the correct answer(s). Note: All other parameters are the same.
1. In case of fewer observations, it is easy to overfit the data.
2. In case of fewer observations, it is hard to overfit the data.
3. In case of more observations, it is easy to overfit the data.
4. In case of more observations, it is hard to overfit the data.

A) 1 and 4
B) 2 and 3
C) 1 and 3
D) None of these

Answer: A

32) Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge regression with tuning parameter lambda to reduce its complexity. Choose the option(s) below which describe the relationship of bias and variance with lambda.

A) In case of very large lambda; bias is low, variance is low
B) In case of very large lambda; bias is low, variance is high
C) In case of very large lambda; bias is high, variance is low
D) In case of very large lambda; bias is high, variance is high

Answer: C

33) What is/are true about ridge regression?
1. When lambda is 0, model works like linear regression model
2. When lambda is 0, model doesn’t work like linear regression model
3. When lambda goes to infinity, we get very, very small coefficients approaching 0
4. When lambda goes to infinity, we get very, very large coefficients approaching infinity

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Answer: A

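Statements 1 and 3 of question 33 can be checked on the simplest possible case: one feature and no intercept, where minimizing the sum of (y - b*x)^2 plus lambda*b^2 gives the closed form b = sum(x*y) / (sum(x*x) + lambda). The sketch below assumes that simplified setting; the function name `ridge_slope` and the data are illustrative only.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge coefficient for one feature and no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

ols = ridge_slope(xs, ys, 0.0)      # lambda = 0: ordinary least squares
shrunk = ridge_slope(xs, ys, 14.0)  # moderate lambda: coefficient shrinks
tiny = ridge_slope(xs, ys, 1e9)     # huge lambda: coefficient close to 0

print(ols, shrunk, tiny)
```

With lambda set to 0 the penalty vanishes and the ordinary least-squares slope is recovered; as lambda grows, the denominator dominates and the coefficient is pulled toward 0, which also explains the higher bias and lower variance asked about in question 32.
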
34) Which of the following method(s) does not have closed form solution for its coefficients?

A) Ridge regression
B) Lasso
C) Both Ridge and Lasso
D) None of both

Answer: B

35) Function used for linear regression in R is __________

A) lm(formula, data)
B) lr(formula, data)
C) lrm(formula, data)
D) regression.linear(formula, data)

Answer: A

36) In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________

A) (X-intercept, Slope)
B) (Slope, X-Intercept)
C) (Y-Intercept, Slope)
D) (slope, Y-Intercept)

Answer: C

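The roles of β1 and β2 in question 36 show up directly when a line is fitted by least squares. The following is a small sketch using the textbook formulas for simple linear regression; the function name `fit_line` and the data points are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = beta1 + beta2 * x; returns (beta1, beta2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta2 = sxy / sxx                  # slope
    beta1 = mean_y - beta2 * mean_x    # Y-intercept: value of y at x = 0
    return beta1, beta2


# Hypothetical points generated from y = 1 + 2x with no noise.
beta1, beta2 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(beta1, beta2)
```

β1 is the value of Y where the line crosses X = 0 and β2 is the change in Y per unit of X, which matches option C.
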
37) Suppose that we have N independent variables (X1, X2 … Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best fit line using least square error on this data. You found that the correlation coefficient for one of its variables (say X1) with Y is -0.95. Which of the following is true for X1?

A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship

Answer: B

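Question 37 turns on the fact that the strength of a linear relation is given by the magnitude of the correlation coefficient, while the sign only gives the direction. A minimal sketch of the Pearson coefficient follows; the data values are invented so that Y falls almost linearly as X1 rises.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: y decreases almost linearly with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 8.1, 6.2, 3.9, 2.0]

r = pearson_r(x1, y)
print(r)  # negative and close to -1: a strong inverse relation
```
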
38) We have been given a dataset with n records in which we have input attribute as x and output attribute as y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data in training set and test set randomly. Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error?

A) Increase
B) Decrease
C) Remain constant
D) Can’t Say

Answer: D

39) We have been given a dataset with n records in which we have input attribute as x and output attribute as y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data in training set and test set randomly. What do you expect will happen with bias and variance as you increase the size of training data?

A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases

Answer: D

40) Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?
1. I will add more variables
2. I will start introducing polynomial degree variables
3. I will remove some variables

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3

Answer: A

41) Problem: Players will play if weather is sunny. Is this statement correct?

Image: weather data.jpg

A) TRUE
B) FALSE

Answer: A

42) Multinomial Naïve Bayes Classifier is ___________ distribution

A) Continuous
B) Discrete
C) Binary

Answer: B

43) For the given weather data, calculate probability of not playing

Image: weather data.jpg

A) 0.4
B) 0.64
C) 0.36
D) 0.5

Answer: C

44) Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer that your SVM model is underfitting. Which of the following options would you most likely consider when iterating on the SVM next time?

A) You want to increase your data points
B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Answer: C

45) The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of datasets are not best suited for SVM’s?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Answer: A

46) The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Answer: D

47) What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors
B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Answer: B

48) What do you mean by a hard margin?

A) The SVM allows very low error in classification
B) The SVM allows high amount of error in classification
C) None of the above

Answer: A

49) We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization?
1. We do feature normalization so that new feature will dominate other
2. Sometimes, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Answer: B

50) Support vectors are the data points that lie closest to the decision surface.

A) TRUE
B) FALSE

Answer: A

51) Which of the following is not supervised learning?

A) PCA
B) Decision Tree
C) Naive Bayesian
D) Linear regression

Answer: A

52) Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above

Answer: B

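The effect of Gamma in question 52 follows from the usual form of the RBF kernel, K(a, b) = exp(-gamma * ||a - b||^2). The sketch below evaluates that formula directly; the points and the two gamma values are arbitrary choices for illustration.

```python
from math import exp


def rbf_kernel(a, b, gamma):
    """RBF (Gaussian) kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-gamma * squared_distance)


origin = [0.0, 0.0]
near_point = [0.1, 0.0]   # squared distance 0.01
far_point = [2.0, 0.0]    # squared distance 4.0

far_low_gamma = rbf_kernel(origin, far_point, gamma=0.1)    # still noticeable
far_high_gamma = rbf_kernel(origin, far_point, gamma=10.0)  # practically zero
near_high_gamma = rbf_kernel(origin, near_point, gamma=10.0)

print(far_low_gamma, far_high_gamma, near_high_gamma)
```

With a high gamma the kernel value for distant points collapses toward zero, so only nearby points keep any influence, in line with option B.
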
53) Gaussian Naïve Bayes Classifier is ___________ distribution

A) Continuous
B) Discrete
C) Binary

Answer: A

54) If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Answer: C

55) What is the purpose of performing cross-validation?

A) To assess the predictive performance of the models
B) To judge how the trained model performs outside the sample on test data
C) Both A and B

Answer: C

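The idea behind question 55 is that every sample is held out exactly once, so the model is always scored on data it did not see during fitting. The following sketch only builds the index splits for k contiguous folds; `k_fold_indices` is a hypothetical helper, and shuffling and model fitting are left out on purpose.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_indices = list(range(start, stop))
        train_indices = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_indices, test_indices
        start = stop


folds = list(k_fold_indices(6, 3))
for train_indices, test_indices in folds:
    print("train:", train_indices, "test:", test_indices)
```

Averaging a score over the held-out folds gives an estimate of out-of-sample performance, which is what separates a model like the one in question 54 (perfect on training data, weaker on validation data) from one that generalizes.
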
56) Which of the following is true about Naive Bayes?

A) Assumes that all the features in a dataset are equally important
B) Assumes that all the features in a dataset are independent
C) Both A and B
D) None of the above option

Answer: C

57) Suppose you are using a Linear SVM classifier with a 2 class classification problem. Now you have been given the following data in which some points are circled red that are representing support vectors. If you remove any one of the red points from the data, will the decision boundary change?

Image: svm.jpg

A) Yes
B) No

Answer: A

58) Linear SVMs have no hyperparameters that need to be set by cross-validation

A) TRUE
B) FALSE

Answer: B

59) For the given weather data, what is the probability that players will play if weather is sunny

Image: weather data.jpg

A) 0.5
B) 0.26
C) 0.73
D) 0.6

Answer: D

60) 100 people are at party. Given data gives information about how many wear pink or not, and if a man or not. Imagine a pink wearing guest leaves, what is the probability of being a man

Image: man.jpg

A) 0.4
B) 0.2
C) 0.6
D) 0.45

Answer: B

61 Problem: Players will play if weather is sunny. Is this statement
weather is correct?
data.jpg TRUE FALSE a
62 For the given weather data, Calculate probability of playingdata.jpg
weather 0.4 0.64 0.29 0.75 b
63 For the given weather data, Calculate probability of not playing
weather data.jpg 0.4 0.64 0.36 0.5 c
64 For the given weather data, what is the probabilityweather
that players will play if weather
data.jpg 0.5 is sunny 0.26 0.73 0.6 d
65 100 people are at party. Given data gives information about how many wear pink
man.jpg 0.4 or not, and 0.2
if a man or not.0.6
Imagine a pink0.45
wearing
b guest leaves, what is the probabilit
66 100 people are at party. Given data gives information about how many wear
man.jpg TRUE pink or not, and if a man or not. Imagine a pink wearing
FALSE a guest leaves, was it a man?

67. What do you mean by generalization error in terms of the SVM?
A) How far the hyperplane is from the support vectors
B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM
Answer: B

68. What do you mean by a hard margin?
A) The SVM allows very low error in classification
B) The SVM allows high amount of error in classification
C) None of the above
Answer: A

69. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of datasets are not best suited for SVM’s?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
Answer: A

70. The effectiveness of an SVM depends upon:
A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above
Answer: D

71. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE
Answer: A

72. The SVM’s are less effective when:
A) The data is linearly separable
B) The data is clean and ready to use
C) The data is noisy and contains overlapping points
Answer: C

73. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
Answer: B

74. The cost parameter in the SVM means:
A) The number of cross-validations to be made
B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above
Answer: C

75. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on validation set, what should I look out for?
A) Underfitting
B) Nothing, the model is perfect
C) Overfitting
Answer: C

76. Which of the following are real world applications of the SVM?
A) Text and Hypertext Categorization
B) Image Classification
C) Clustering of News Articles
D) All of the above
Answer: D

77. Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you correctly infer that your SVM model is underfitting. Which of the following options would you be more likely to consider when iterating on the SVM next time?
A) You want to increase your data points
B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features
Answer: C

78. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization?
1. We do feature normalization so that new feature will dominate other
2. Some times, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
Answer: B

79. Linear SVMs have no hyperparameters that need to be set by cross-validation.
A) TRUE
B) FALSE
Answer: B

80. In a real problem, you should check to see if the SVM is separable and then include slack variables if it is not separable.
A) TRUE
B) FALSE
Answer: B

UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution C

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
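
As a quick arithmetic check for question 20, the multiple coefficient of determination can be computed as SSR / SST, where SSR = SST − SSE. The function name below is illustrative only and does not come from the quiz.

```python
def coefficient_of_determination(sst: float, sse: float) -> float:
    """Return R^2 = SSR / SST, where SSR = SST - SSE."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    ssr = sst - sse  # variation explained by the regression
    return ssr / sst


# Values from question 20: SST = 200, SSE = 50.
print(coefficient_of_determination(200, 50))  # 0.75
```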

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B
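
A minimal sketch of stratification, assuming a simple per-class shuffle and a fixed test fraction. The function name, the 25% fraction, and the toy labels are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with every class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])   # this class's share of the test set
        train_idx.extend(indices[n_test:])  # the rest goes to training
    return sorted(train_idx), sorted(test_idx)


labels = ["yes"] * 8 + ["no"] * 4
train, test = stratified_split(labels)
print([labels[i] for i in train])
print([labels[i] for i in test])
```

Both sets keep the 2:1 ratio of "yes" to "no" found in the full data, which is the property the question describes.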

Ans : Solution A
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
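
The error measures named in questions 28 and 35 can be compared side by side. This is a small sketch on made-up numbers; the sample values are assumptions chosen so the arithmetic is easy to follow.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 7.0]
print(mean_absolute_error(actual, predicted))  # 0.75
print(mean_squared_error(actual, predicted))   # 1.25
print(root_mean_squared_error(actual, predicted))
```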

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution D

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

Ans : Solution C

Ans : Solution B
UNIT –II

2. What is pca.components_ in sklearn?

A) Set of all eigenvectors for the projection space
B) Matrix of principal components
C) Result of the multiplication matrix
D) None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute penalty. As the penalty increases, some of the
coefficients of the variables may become exactly zero.
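
One way to see why lasso can drop variables while ridge does not: in the special case of orthonormal predictors (an assumption made here for simplicity, with the penalty scaled so the formulas below apply), the lasso solution soft-thresholds each least-squares coefficient, while ridge only rescales it. The function names and coefficient values are illustrative.

```python
def lasso_shrink(coef: float, penalty: float) -> float:
    """Soft-thresholding: move coef toward zero by `penalty`, stopping at zero."""
    if coef > penalty:
        return coef - penalty
    if coef < -penalty:
        return coef + penalty
    return 0.0


def ridge_shrink(coef: float, penalty: float) -> float:
    """Proportional shrinkage: the coefficient gets smaller but stays nonzero."""
    return coef / (1.0 + penalty)


ols_coefs = [2.5, -0.3, 0.8]  # hypothetical least-squares coefficients
for penalty in (0.5, 1.0):
    print(penalty,
          [lasso_shrink(c, penalty) for c in ols_coefs],
          [ridge_shrink(c, penalty) for c in ols_coefs])
```

With a penalty of 0.5 the small coefficient is already set to zero by the lasso rule, and with a penalty of 1.0 a second one follows, while every ridge coefficient stays nonzero.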

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression errors values has to be normally distributed but in case of Logistic
Regression it is not the case
B) Logistic Regression errors values has to be normally distributed but in case of Linear
Regression it is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Both Linear Regression and Logistic Regression error values have not to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.
14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.
Scenario  Learning Rate  Number of iterations  Training Error  Validation Error
1         0.1            1000                  100             110
2         0.2            600                   90              105
3         0.3            400                   110             110
4         0.4            300                   120             130
5         0.4            250                   130             150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don’t be confused by the word “regression” in its name.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification

problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the One-vs-All
method for 3-class classification in logistic regression.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute penalty. As the penalty increases, some of the
coefficients of the variables may become exactly zero.
Context: 48-49

Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w) is viewed as a function of x whose shape we can change by
changing the parameters w.

48 What would be the range of p in such case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For real values of x from −∞ to +∞, the logistic function gives an output in the interval (0, 1).
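
A small sketch of the model from the context above, P(y = 1 | x, w) = g(w0 + w1x). The weights used in the loop are arbitrary illustrative values, and the two-branch form of g is a standard trick to avoid floating-point overflow rather than something stated in the quiz.

```python
import math


def logistic(z: float) -> float:
    """Numerically stable logistic function g(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for very negative z
    return ez / (1.0 + ez)


def predict_proba(x: float, w0: float, w1: float) -> float:
    """P(y = 1 | x, w) = g(w0 + w1 * x)."""
    return logistic(w0 + w1 * x)


for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, predict_proba(x, w0=0.0, w1=1.0))
```

Mathematically the output never reaches 0 or 1; in floating point it can round to those values for very large |z|.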

49 In above question what do you think which function would make p between (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.
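
The odds calculation is short enough to check directly. The function name is illustrative.

```python
def odds(p_success: float) -> float:
    """Odds = P(success) / P(failure)."""
    if not 0.0 <= p_success < 1.0:
        raise ValueError("p_success must be in [0, 1)")
    return p_success / (1.0 - p_success)


print(odds(0.5))   # 1.0  (fair coin)
print(odds(0.75))  # 3.0  (three-to-one in favour)
```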

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.

52. Which of the following option is true?

A) Linear Regression errors values has to be normally distributed but in case of Logistic Regression it is
not the case
B) Logistic Regression errors values has to be normally distributed but in case of Linear Regression it is
not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Both Linear Regression and Logistic Regression error values have not to be normally distributed

Solution:A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is a inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
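
The relationship in question 53 can be verified numerically: applying the logistic function to logit(p) returns p. This is a sketch with a few arbitrary probabilities.

```python
import math


def logit(p: float) -> float:
    """Log of the odds: log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic(x: float) -> float:
    """Inverse of the logit: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


for p in (0.1, 0.5, 0.9):
    x = logit(p)
    print(p, x, logistic(x))  # the last column reproduces p up to rounding
```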

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fit, where the probability
of each category is predicted against the rest of the categories combined.
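
A sketch of how the One-vs-All targets are built: one binary label vector per class, hence n models for n classes. The class names are made up for illustration, and fitting the models themselves is left out.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


labels = ["cat", "dog", "bird", "dog", "cat"]
targets = one_vs_all_targets(labels)
print(len(targets))    # 3 binary problems for 3 classes
print(targets["dog"])  # [0, 1, 0, 1, 0]
```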

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

A) A
B) B
C) C
D)None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, it is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the maximum compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty and therefore a less complex decision boundary, which is
what the first figure, A, shows.

61. What would you do if you want to train logistic regression on the same data in less time while
getting comparable (though possibly not identical) accuracy?

Suppose you are using a Logistic Regression model on a huge dataset. One of the problem you may face
on such huge data is that Logistic regression will take very long time to train.

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Which of the following image is showing the cost function for y =1.

Following is the loss function in logistic regression(Y-axis loss function and x axis log probability) for
two class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the true answer as loss function decreases as the log probability increases

63. Suppose, Following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
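
The figure for question 64 is not available here. Assuming it shows an XOR-style labelling of the four binary points (an assumption), a brute-force search over a coarse grid of linear decision rules illustrates the point: no rule on the grid classifies XOR perfectly, while an AND labelling is easy. A grid search is an illustration, not a proof.

```python
from itertools import product

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR = [0, 1, 1, 0]  # assumed labelling; the original figure is not available
AND = [0, 0, 0, 1]  # a linearly separable labelling, for contrast


def separable_on_grid(labels, steps=21, limit=5.0):
    """Return True if some (w0, w1, w2) on the grid classifies every point correctly."""
    grid = [-limit + 2 * limit * i / (steps - 1) for i in range(steps)]
    for w0, w1, w2 in product(grid, repeat=3):
        # Linear decision rule: predict 1 when w0 + w1*x1 + w2*x2 > 0.
        predictions = [1 if w0 + w1 * x1 + w2 * x2 > 0 else 0 for x1, x2 in POINTS]
        if predictions == labels:
            return True
    return False


print(separable_on_grid(AND))  # True
print(separable_on_grid(XOR))  # False
```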
UNIT IV

1. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
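
A sketch of the trade-off described above, assuming the usual soft-margin objective 0.5·‖w‖² + C·Σ hinge loss (the formula is not given in the quiz). The data points and the fixed hyperplane are made up; the code only evaluates the objective and does not train anything.

```python
def soft_margin_objective(w, b, points, labels, cost):
    """0.5 * ||w||^2 + cost * sum of hinge losses; labels must be +1 or -1."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge_total += max(0.0, 1.0 - y * score)  # zero when the point clears the margin
    return margin_term + cost * hinge_total


points = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (0.5, 0.0)]
labels = [1, 1, -1, -1]  # the last point lies on the wrong side of the hyperplane

for cost in (0.1, 100.0):
    print(cost, soft_margin_objective((1.0, 0.0), 0.0, points, labels, cost))
```

With a low cost the single misclassified point barely changes the objective, so a simple boundary is acceptable; with a high cost the same point dominates, pushing the optimizer toward boundaries that classify it correctly.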

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B

D) None of the above options

Ans Solution: C

5 What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6 The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7 What is/are true about kernel in SVM?

1. Kernel function map low dimensional data to high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high misclassification penalty, a soft margin cannot exist, as there is no room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVM’s?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVM’s.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
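
One way to see why a high gamma makes only nearby points matter: under the common RBF parameterisation K(x, z) = exp(−gamma·‖x − z‖²) (an assumption here, since the quiz does not state the formula), the similarity between two points decays with distance, and it decays faster for larger gamma. The points and gamma values are illustrative.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF similarity exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


origin = (0.0, 0.0)
near, far = (1.0, 0.0), (3.0, 0.0)
for gamma in (0.01, 10.0):
    print(gamma, rbf_kernel(origin, near, gamma), rbf_kernel(origin, far, gamma))
```

At gamma = 0.01 both the near and the far point still look similar to the origin; at gamma = 10 the far point's similarity is effectively zero.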

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use very large value of C(C->infinity)?

Note: For small C, the model was also classifying all data points correctly.

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with linear decision boundary after training SVM, you correctly infer
that your SVM model is under fitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing the C parameter would be the right thing to do here: a larger C penalizes misclassification
more heavily (weaker regularization), which reduces underfitting.

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Some times, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Now answer the questions below.

27. How many times we need to train our SVM model in such case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29 Suppose your problem has changed now. Now, data has only 2 classes. What would you think how
many times we need to train SVM in such case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?

A) Increasing the complexity will over fit the data

B) Increasing the complexity will under fit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. Kernel function map low dimensional data to high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithm is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5 Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of tree should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression task

3. Random Forest is used for regression whereas Gradient Boosting is used for classification task

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: The bagging is suitable for high variance low bias models?

A) TRUE
B) FALSE

Solution: A

The bagging is suitable for high variance low bias models or you can say for complex models.

17. To apply bagging to regression trees which of the following is/are true in such case?

1. We build N regression trees with N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory

18. How to select best hyper parameters in tree based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

Hyperparameters are selected by comparing performance on validation data, because performance on the
training data is optimistically biased.

19. In which of the following scenario a gain ratio is preferred over Information Gain?

A) When a categorical variable has very large number of category

B) When a categorical variable has very small number of category
C) Number of categories is not the reason
D) None of these

Solution: A

For high cardinality variables, gain ratio is preferred over the Information Gain technique.
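The preference can be illustrated with a small sketch. The toy data below is hypothetical: a row identifier with one category per row gets the highest information gain, while gain ratio divides by the split information and so penalizes it.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels or category values."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy reduction obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain divided by the split information of the feature."""
    split_info = entropy(feature)  # entropy of the feature's own value distribution
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy data: 8 rows, a unique ID per row and a two-valued attribute.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))                       # high cardinality: one category per row
binary = ["a", "a", "b", "b", "a", "b", "a", "a"]

ig_id, gr_id = information_gain(row_id, labels), gain_ratio(row_id, labels)
ig_bin, gr_bin = information_gain(binary, labels), gain_ratio(binary, labels)
```

Here information gain ranks the row identifier above the two-valued attribute, while gain ratio reverses the ranking.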

20. Suppose you have given the following scenario for training and validation error for Gradient
Boosting. Which of the following hyper parameter would you choose in such case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select 2 because its lower depth makes
it the better hyperparameter choice.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n^3)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

* DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

* DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following function is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
1. Techniques of feature engineering involve:
A. Clean dataset
B. Increase their signal-noise ratio
C. Reduce dimensionality
D. All of these
ANSWER: D

2. Correlated features provide additional pieces of information

A. True
B. False
C. None of these
D. All of these
ANSWER: B

3. Training set is used to test performance of system

A. True
B. False
C. None of these
D. All of these
ANSWER: B

4. The original dataset must be randomly shuffled before the split phase
A. avoid a correlation between consequent elements
B. avoid a sequencing between consequent elements
C. build a relation between consequent elements
D. None of these
ANSWER: A

5. NumPy RandomState generator or an integer seed is used to

A. randomize data
B. reproduce experiments
C. benchmark dataset
D. None of these
ANSWER: B

6. A good ratio of training and testing split is

A. 50-50
B. 60-40
C. 70-20
D. 80-20
ANSWER: D

7. If dataset has categorical data, then following action needs to be taken

A. encode categorical data
B. drop the column containing categorical data
C. Convert it to number
D. keep it as it is, no effect
ANSWER: A.
8. Which of the following is a categorical variable?
A. Gender
B. Object
C. Number
D. Alphabet
ANSWER: A

9. Which class adopts a dictionary-oriented approach, associating each category label with a
progressive integer number that is an index into an instance array called classes_?
A. Hashing
B. Dict method
C. LabelEncoder
D. None of these
ANSWER: C

10. Drawback of label encoder class is

A. All labels are turned into binary numbers
B. All labels are turned into sequential numbers
C. All labels are turned into random numbers
D. All of these
ANSWER: B

11. Label encoder class

A. provides position of data
B. does not concern about semantics
C. preserve semantics
D. all of the above
ANSWER: B

12. One-hot encoding method converts the data into binary

A. True
B. False
C. None of these
D. All of these
ANSWER: A

13. For converting labels we use

A. Label encoder
B. Label Binarizer
C. One hot encoding
D. One label encoding
ANSWER: B
14. In which method, each categorical label is first turned into a positive integer and
then transformed into a vector where only one feature is 1 while all the others are 0
A. LabelBinarizer
B. Encoder
C. One hot encoding
D. None of these
ANSWER: A
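The two encoding steps described in questions 9 to 14 can be sketched in plain Python. This is only an illustration of the idea, not the scikit-learn implementation; the function names and the sample column are hypothetical.

```python
def label_encode(values):
    """Map each distinct label to an integer index into a sorted class list."""
    classes = sorted(set(values))
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values], classes

def one_hot(codes, n_classes):
    """Turn integer codes into vectors with a single 1 and 0 everywhere else."""
    return [[1 if j == code else 0 for j in range(n_classes)] for code in codes]

colors = ["red", "green", "blue", "green"]    # hypothetical categorical column
codes, classes = label_encode(colors)         # sequential integers, one per category
vectors = one_hot(codes, len(classes))        # one binary feature per category
```

The integer codes impose an ordering (blue < green < red) that has no meaning for the data, which is the drawback noted in question 10. The one-hot vectors avoid that ordering.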

15. Methods to manage categorical variable

A. One hot encoding
B. LabelEncoder
C. LabelBinarizer
D. All of these
ANSWER: D

16. DictVectorizer and FeatureHasher produce

A. Sparse Matrices
B. Inverse Matrices
C. Identity Matrices
D. None of these
ANSWER: A

17. While managing missing values, which method are available

A. Removing the whole line
B. Creating sub-model to predict those features
C. Using an automatic strategy to impute them according to the other known values
D. All of these
ANSWER: D

18. Which option should be considered only when the dataset is quite large, the number
of missing features is high, and any prediction could be risky
A. Removing the whole line
B. Creating sub-model to predict those features
C. Using an automatic strategy to impute them according to the other known values
D. All of these
ANSWER: A

19. While managing missing values, which method is said as best choice
A. Removing the whole line
B. Creating sub-model to predict those features
C. Using an automatic strategy to impute them according to the other known values
D. All of these
ANSWER: C
20. What are data preprocessing techniques to handle outliers
A. Winsorize (cap at threshold).
B. Transform to reduce skew (using Box-Cox or similar).
C. Remove outliers if you're certain they are anomalies or measurement errors.
D. All of above
ANSWER: D

21. Which of the following models includes a backwards elimination feature selection
routine?
A. MCV
B. MARS
C. MCRS
D. All of the Mentioned
ANSWER: B

22. Which of the following is a categorical outcome

A. RMSE
B. RSquared
C. Accuracy
D. All of the Mentioned
ANSWER: C

23. What are data preprocessing techniques to handle outliers

A. Winsorize (cap at threshold).
B. Transform to reduce skew (using Box-Cox or similar).
C. Remove outliers if you're certain they are anomalies or measurement errors.
D. All of above
ANSWER: D

24. What are ways of reducing dimensionality

A. Removing collinear features.
B. Performing PCA, ICA, or other forms of algorithmic dimensionality reduction.
C. Combining features with feature engineering.
D. All of above
ANSWER: D

25. If you split your data into train/test splits, is it still possible to overfit your
model?
A. True
B. False
C. None of these
D. All of these
ANSWER: A
26. How do you handle missing or corrupted data in a dataset
A. Drop missing rows or columns
B. Replace missing values with mean/median/mode
C. Assign a unique category to missing values
D. All of the above
ANSWER: D

27. Techniques to perform Feature Scaling are

A. Min-Max Normalization
B. Standardization
C. Both a and b
D. None of above
ANSWER: C

28. The technique that re-scales a feature or observation value to a range between
0 and 1 is known as
A. Mean Normalization
B. Max Normalization
C. Mode Normalization
D. Min-Max Normalization
ANSWER: D
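As a minimal sketch of the two scaling techniques named in questions 27 and 28, the functions below apply min-max normalization and standardization to one feature. The sample values are hypothetical.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

ages = [18, 25, 40, 60]          # hypothetical feature column
scaled = min_max_scale(ages)     # values now lie in [0, 1]
zscores = standardize(ages)      # values now have mean 0 and unit variance
```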

29. Feature Scaling is a technique to standardize the independent features present in the
data in a fixed range
A. True
B. False
C. None of these
D. All of these
ANSWER: A

30. To calculate the distance between centroid and data point which method is used
A. Euclidean Distance
B. Manhattan Distance
C. Minkowski Distance
D. All of the above
ANSWER: D

31. Feature Scaling is a technique to standardize the independent features present in the
data in a fixed range
A. True
B. False
C. None of these
D. All of these
ANSWER: A

32. ------------ performed during the data pre-processing to handle highly varying
magnitudes or values or units
A. Label encoding
B. Feature Scaling
C. Feature extraction
D. Normalization
ANSWER: B

33. Techniques to perform feature scaling are

A. standardization
B. Min Normalization
C. Max Normalization
D. Minmax Normalization and standardization
ANSWER: D

34. Normalization is generally required when we are dealing with attributes on a different
scale
A. True
B. False
C. None of these
D. All of these
ANSWER: A

35. Why do we perform feature scaling?

A. The range of all features should be normalized so that each feature contributes
approximately proportionately to the final distance
B. Gradient descent converges much faster with feature scaling than without it.
C. Both A and B
D. None
ANSWER: C

36. What is the difference between normalization and scaling

A. Normalization divides the value and scaling multiplies the value
B. Normalization converts data in range and scaling multiplies data by weight
C. Both are same
D. In scaling we change the range of your data while in normalization we change the
shape of the distribution of your data
ANSWER: D

37. For normalization, the maximum value and minimum value is

A. 1 and 0
B. 0 and 1
C. 0 to 0
D. 1 and 1
ANSWER: A

38. An unnormalized dataset with many features contains

A. information which is proportional to the independence of all features and their
variance
B. information which is inversely proportional to the independence of all features and
their variance
C. information proportional to the mean of all features and their Covariance
D. None
ANSWER: A

39. A -----------is a useful approach to remove all those elements whose contribution is
under a predefined level
A. correlation threshold
B. covariance threshold
C. variance threshold
D. None of these
ANSWER: C

40. Imagine, you have 1000 input features and 1 target feature in a machine learning problem.
You have to select 100 most important features based on the relationship between input
features and the target feature. Do you think this is an example of dimensionality
reduction?
A. Yes
B. No
C. None of these
D. All of these
ANSWER: A

41. When performing regression or classification, which of the following is the correct
way to preprocess the data
A. Normalize the data → PCA → training
B. PCA → normalize PCA output → training
C. Normalize the data → PCA → normalize PCA output → training
D. None of the above
ANSWER: A

42. What is pca.components_ in Sklearn

A. Set of all eigen vectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
ANSWER: A

43. Which of the following is a reasonable way to select the number of principal components
"k"
A. Choose k to be the smallest value so that at least 99% of the variance is retained.
B. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
C. Choose k to be the largest value so that 99% of the variance is retained.
D. Use the elbow method
ANSWER: A
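The selection rule in option A can be sketched as a cumulative sum over the explained variance ratios of the components, sorted in descending order. The ratios below are hypothetical, and `smallest_k` is an illustrative helper, not a library function.

```python
def smallest_k(explained_variance_ratio, threshold=0.99):
    """Return the smallest k whose first k components retain >= threshold variance."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    return len(explained_variance_ratio)

# Hypothetical ratios, sorted from the largest component to the smallest.
ratios = [0.72, 0.18, 0.07, 0.025, 0.005]
k = smallest_k(ratios)   # first 3 components keep about 97%, first 4 about 99.5%
```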

44. Dimensionality reduction algorithms are one of the possible ways to reduce the
computation time required to build a model.
A. TRUE
B. FALSE
C. None of these
D. All of these
ANSWER: A

45. When a dataset is made up of non-negative elements can we use non-negative matrix
factorization (NNMF) instead of standard PCA
A. Yes
B. No
C. None of these
D. All of these
ANSWER: A

46. PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE
C. None of these
D. All of these
ANSWER: A

47. Which of the following is/are true about PCA? 1. PCA is an unsupervised method. 2.
It searches for the directions in which the data have the largest variance. 3. Maximum number
of principal components <= number of features. 4. All principal components are orthogonal to
each other.
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. All of these
ANSWER: D
48. ------- allows exploiting the natural sparsity of data while extracting principal
components
A. Standard PCA
B. Kernel PCA
C. Sparse PCA
D. All of the above
ANSWER: C

49. ------- performs a PCA with non-linearly separable data sets

A. Standard PCA
B. Kernel PCA
C. Sparse PCA
D. All of the above
ANSWER: B

50. Dictionary learning is a technique which allows rebuilding a sample starting from a
sparse dictionary of atoms
A. True
B. False
C. None of these
D. All of these
ANSWER: A
1. In practice, Line of best fit or regression line is found when _____________
a) Sum of residuals (∑(Y – h(X))) is minimum
b) Sum of the absolute value of residuals (∑|Y-h(X)|) is maximum
c) Sum of the square of residuals (∑(Y-h(X))^2) is minimum
d) Sum of the square of residuals (∑(Y-h(X))^2) is maximum
Answer: c
Explanation: Here we penalize higher error value much more as compared to the smaller one,
such that there is a significant difference between making big errors and small errors,
which makes it easy to differentiate and select the best fit line.
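A minimal sketch of the criterion in option c: the closed-form least squares estimates for a line y = a + b*x, compared against slightly perturbed lines. The data points are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form least squares estimates (intercept a, slope b) for y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

def sse(a, b, xs, ys):
    """Sum of squared residuals of the line y = a + b*x on the data."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]                  # hypothetical inputs
ys = [2.1, 3.9, 6.2, 7.8, 10.1]       # hypothetical outputs
a, b = fit_line(xs, ys)
best = sse(a, b, xs, ys)              # no other line has a smaller value
```

Shifting the intercept or the slope away from the fitted values can only increase the sum of squared residuals.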

2. If a Linear regression model fits perfectly, i.e., train error is zero, then
_____________________
a) Test error is also always zero
b) Test error is non zero
c) Couldn’t comment on Test error
d) Test error is equal to Train error
Answer: c
Explanation: Test Error depends on the test data. If the Test data is an exact representation
of train data then test error is always zero. But this may not be the case.

3. Which of the following metrics can be used for evaluating regression models?

i) R Squared
ii) Adjusted R Squared
iii) F Statistics
iv) RMSE / MSE / MAE
a) ii and iv
b) i and ii
c) ii, iii and iv
d) i, ii, iii and iv
Answer: d
Explanation: These (R Squared, Adjusted R Squared, F Statistics, RMSE / MSE / MAE) are some
metrics which you can use to evaluate your regression model.

4. How many coefficients do you need to estimate in a simple linear regression model (One
independent variable)?
a) 1
b) 2
c) 3
d) 4
Answer: b
Explanation: In simple linear regression, there is one independent variable so 2
coefficients (Y=a+bx+error).
5. In a simple linear regression model (One independent variable), If we change the input
variable by 1 unit. How much output variable will change?
a) by 1
b) no change
c) by intercept
d) by its slope
Answer: d
Explanation: For linear regression Y=a+bx+error. If neglect error then Y=a+bx. If x
increases by 1, then Y = a+b(x+1) which implies Y=a+bx+b. So Y increases by its slope.

6. Function used for linear regression in R is __________

a) lm(formula, data)
b) lr(formula, data)
c) lrm(formula, data)
d) regression.linear(formula, data)
Answer: a
Explanation: lm(formula, data) refers to a linear model in which formula is an object of
the class “formula”, representing the relation between variables. This formula is then
applied to the data to create a relationship model.

7. In syntax of linear model lm(formula,data,..), data refers to ______

a) Matrix
b) Vector
c) Array
d) List
Answer: b
Explanation: Formula is just a symbol to show the relationship and is applied on data which
is a vector. In General, data.frame are used for data.

8. In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to
__________
a) (X-intercept, Slope)
b) (Slope, X-Intercept)
c) (Y-Intercept, Slope)
d) (slope, Y-Intercept)
Answer: c

9. True-False: Linear Regression is a supervised machine learning algorithm.

A) TRUE
B) FALSE

Solution: (A)
Yes, Linear regression is a supervised learning algorithm because it uses true labels for
training. Supervised learning algorithm should have input variable (x) and an output
variable (Y) for each example.

10. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE

Solution: (A)

Linear Regression has dependent variables that have continuous values.

11. True-False: It is possible to design a Linear regression algorithm using a neural
network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely
implement a linear regression algorithm.

12. Which of the following methods do we use to find the best fit line for data in Linear
Regression?

A) Least Square Error

B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B

Solution: (A)

In linear regression, we try to minimize the least square errors of the model to identify
the line of best fit.

13. Which of the following evaluation metrics can be used to evaluate a model while modeling
a continuous output variable?

A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error

Solution: (D)
Since linear regression gives output as continuous values, we use the mean squared error
metric to evaluate the model performance in such a case. The remaining options are used in
case of a classification problem.

14. True-False: Lasso Regularization can be used for variable selection in Linear
Regression.

A) TRUE
B) FALSE

Solution: (A)

True, In case of lasso regression we apply absolute penalty which makes some of the
coefficients zero.

15. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these

Solution: (A)

Residuals refer to the error values of the model. Therefore lower residuals are desired.

16. Suppose that we have N independent variables (X1,X2… Xn) and dependent variable is Y.
Now Imagine that you are applying linear regression by fitting the best fit line using least
square error on this data.

You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.

Which of the following is true for X1?

A) Relation between the X1 and Y is weak

B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship

Solution: (B)

The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between
X1 and Y.
Suppose you are given two variables V1 and V2 with the following two characteristics.

1. If V1 increases then V2 also increases

2. If V1 decreases then V2 behavior is unknown

17. Looking at the above two characteristics, which of the following options is correct for
the Pearson correlation between V1 and V2?

A) Pearson correlation will be close to 1

B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these

Solution: (D)

We cannot comment on the correlation coefficient by using only statement 1. We need to
consider both of these statements. Consider V1 as x and V2 as |x|. The correlation
coefficient would not be close to 1 in such a case.

18. Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to
conclude that V1 and V2 do not have any relation between them?

A) TRUE
B) FALSE

Solution: (B)

Pearson correlation coefficient between 2 variables might be zero even when they have a
relationship between them. If the correlation coefficient is zero, it just means that
they don’t move together linearly. We can take examples like y=|x| or y=x^2.
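The two examples can be checked numerically. In this sketch the inputs are a hypothetical set of values symmetric around zero, so y is fully determined by x yet the Pearson coefficient is zero.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]                 # symmetric around zero
r_abs = pearson(xs, [abs(x) for x in xs])     # y = |x|
r_sq = pearson(xs, [x ** 2 for x in xs])      # y = x^2
r_lin = pearson(xs, [2 * x + 1 for x in xs])  # a linear relation, for contrast
```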

19. Which of the following offsets, do we use in linear regression’s least square line fit?
Suppose horizontal axis is independent variable and vertical axis is dependent variable.

A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of above

Solution: (A)

We always consider residuals as vertical offsets. We calculate the direct differences
between the actual values and the predicted Y values. Perpendicular offsets are useful in
case of PCA.

20. True- False: Overfitting is more likely when you have huge amount of data to train?

A) TRUE
B) FALSE
Solution: (B)

With a small training dataset, it’s easier to find a hypothesis to fit the training data
exactly i.e. overfitting.

21. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about Normal Equation?

1. We don’t have to choose the learning rate

2. It becomes slow when the number of features is very large
3. There is no need to iterate

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3

Solution: (D)

Instead of gradient descent, Normal Equation can also be used to find coefficients.
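A rough sketch of the Normal Equation follows: it solves (X^T X) theta = X^T y directly, with no learning rate and no iteration over training steps. The elimination routine and the data are illustrative assumptions; in practice a linear algebra library would be used. Solving the system costs roughly cubic time in the number of features, which is why it becomes slow when that number is very large.

```python
def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y by Gauss-Jordan elimination with partial pivoting."""
    n_features = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(n_features)]
         + [sum(row[i] * t for row, t in zip(X, y))]
         for i in range(n_features)]
    for col in range(n_features):
        # Move the row with the largest pivot into place for numerical stability.
        pivot = max(range(col, n_features), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(n_features):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][n_features] / A[i][i] for i in range(n_features)]

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2.
# The leading 1 in every row is the intercept column.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [1, 3, 4, 6, 8]
theta = normal_equation(X, y)   # approximately [1, 2, 3]
```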

22. Which of the following statement is true about sum of residuals of A and B?

Below graphs show two fitted regression lines (A & B) on randomly generated data. Now, I
want to find the sum of residuals in both cases A and B.

Note:

Scale is same in both graphs for both axis.

X axis is independent variable and Y-axis is dependent variable.

A) A has higher sum of residuals than B

B) A has lower sum of residual than B
C) Both have same sum of residuals
D) None of these

Solution: (C)

For a least squares line fitted with an intercept, the sum of residuals is always zero, therefore
both have the same sum of residuals.
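The zero-sum property can be checked with a short sketch. It holds for a least squares fit that includes an intercept term; the second fit below drops the intercept to show that the property then no longer has to hold. The data points are hypothetical.

```python
def fit_with_intercept(xs, ys):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

xs = [1, 2, 3, 4, 5, 6]                    # hypothetical inputs
ys = [1.5, 3.7, 2.9, 6.1, 5.4, 8.0]        # hypothetical outputs
intercept, slope = fit_with_intercept(xs, ys)
residual_sum = sum(y - (intercept + slope * x) for x, y in zip(xs, ys))

# Without an intercept the least squares slope is sum(x*y) / sum(x^2),
# and the residuals need not cancel out.
slope_only = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
residual_sum_no_intercept = sum(y - slope_only * x for x, y in zip(xs, ys))
```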

23. Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with penalty x. Choose the option which describes bias in best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these

Solution: (B)

If the penalty is very large, the model is less complex, therefore the bias would be high.

24. What will happen when you apply very large penalty?

A) Some of the coefficient will become absolute zero

B) Some of the coefficient will approach zero but not absolute zero
C) Both A and B depending on the situation
D) None of these

Solution: (B)

In lasso some of the coefficient value become zero, but in case of Ridge, the coefficients
become close to zero but not zero.

25. What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficient will become zero
B) Some of the coefficient will be approaching to zero but not absolute zero
C) Both A and B depending on the situation
D) None of these

Solution: (A)

As already discussed, lasso applies absolute penalty, so some of the coefficients will
become zero.
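The difference between the two penalties can be sketched for the special case of an orthonormal design, where both estimators have a closed form in terms of the ordinary least squares coefficient. That setting, the scaling of the penalty (0.5 * squared error plus lam * |beta| for lasso, plus 0.5 * lam * beta^2 for ridge) and the sample coefficients are assumptions made for illustration.

```python
def ridge_shrink(beta_ols, lam):
    """Ridge solution under an orthonormal design: proportional shrinkage."""
    return beta_ols / (1.0 + lam)

def lasso_shrink(beta_ols, lam):
    """Lasso solution under an orthonormal design: soft thresholding."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    return magnitude if beta_ols >= 0 else -magnitude

ols = [3.0, -0.5, 0.2]   # hypothetical least squares coefficients
lam = 1.0                # penalty strength
ridge = [ridge_shrink(b, lam) for b in ols]   # every coefficient shrinks, none is zero
lasso = [lasso_shrink(b, lam) for b in ols]   # small coefficients become exactly zero
```

Ridge scales every coefficient toward zero without reaching it, while lasso sets any coefficient whose magnitude is below the penalty to exactly zero.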

26. Which of the following statement is true about outliers in Linear regression?

A) Linear regression is sensitive to outliers

B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these

Solution: (A)

The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.

27. Suppose you plotted a scatter plot between the residuals and predicted values in linear
regression and you found that there is a relationship between them. Which of the following
conclusion do you make about this situation?

A) Since there is a relationship, it means our model is not good

B) Since there is a relationship, it means our model is good
C) Can’t say
D) None of these

Solution: (A)

There should not be any relationship between predicted values and residuals. If there exists
any relationship between them, it means that the model has not perfectly captured the
information in the data.

Question Context 28-30:

28. What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these

Solution: (A)

Since the degree 4 polynomial is more complex than the degree 3 model, it will again
perfectly fit the training data and is likely to overfit. In such case training error will be
zero but test error may not be zero.

29. What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that degree 2 polynomial will over fit the data
B) There are high chances that degree 2 polynomial will under fit the data
C) Can’t say
D) None of these

Solution: (B)

If a degree 3 polynomial fits the data perfectly, it’s highly likely that a simpler
model(degree 2 polynomial) might under fit the data.

30. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low

Solution: (C)

Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will
be high and variance will be low.

Question Context 31:

Which of the following is true about below graphs(A,B, C left to right) between the cost
function and Number of iterations?

31. Suppose l1, l2 and l3 are the three learning rates for A,B,C respectively. Which of
the following is true about l1,l2 and l3?

A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these

Solution: (A)

In case of high learning rate, step will be high, the objective function will decrease
quickly initially, but it will not find the global minima and objective function starts
increasing after a few iterations.

In case of low learning rate, the step will be small. So the objective function will decrease
slowly
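The effect of the learning rate can be sketched with gradient descent on a simple quadratic cost. The cost function J(w) = w^2, the starting point and the three rates are assumptions chosen for illustration; on this particular cost, a rate that is too high makes the cost grow from the very first step.

```python
def gradient_descent(learning_rate, steps=20, start=10.0):
    """Minimize the cost J(w) = w^2 and return the cost after every step."""
    w, costs = start, []
    for _ in range(steps):
        w -= learning_rate * 2 * w      # gradient of w^2 is 2w
        costs.append(w * w)
    return costs

slow = gradient_descent(0.01)       # small steps: cost decreases slowly
good = gradient_descent(0.3)        # moderate steps: cost drops quickly
diverging = gradient_descent(1.1)   # steps too large: cost keeps increasing
```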

Question Context 32- 33:

32. Now we increase the training set size gradually. As the training set size increases,
what do you expect will happen with the mean training error?

A) Increase
B) Decrease
C) Remain constant
D) Can’t Say

Solution: (D)
Training error may increase or decrease depending on the values that are used to fit the
model. If the values used to train contain more outliers gradually, then the error might
just increase.

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
E) Can’t Say

Solution: (D)

As we increase the size of the training data, the bias would increase while the variance
would decrease.

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these

Solution: (C)

We can perfectly fit the line on the following data so mean error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for
Linear Regression.
Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

35. Which of the following scenario would give you the right hyper parameter?
A) 1
B) 2
C) 3
D) 4

Solution: (B)

Option B would be the better option because it leads to less training as well as validation
error.

36. Suppose you got the tuned hyper parameters from the previous question. Now, Imagine
you want to add a variable in variable space such that this added feature is important.
Which of the following thing would you observe in such case?

A) Training Error will decrease and Validation error will increase

B) Training Error will increase and Validation error will increase

C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above

Solution: (D)

If the added feature is important, the training and validation error would decrease.

Question Context 37-38:

Suppose, you got a situation where you find that your linear regression model is under
fitting the data.

37. In such situation which of the following options would you consider?

1. I will add more variables

2. I will start introducing polynomial degree variables
3. I will remove some variables

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3

Solution: (A)

In case of under fitting, you need to introduce more variables in the variable space, or you
can add some polynomial degree variables to make the model complex enough to fit the
data better.
38. Now situation is same as written in previous question(under fitting).Which of following
regularization algorithm would you prefer?
A) L1
B) L2
C) Any
D) None of these

Solution: (D)

39. Which of the following step / assumption in regression modeling impacts the trade-off
between under-fitting and over-fitting the most.

A. The polynomial degree

B. Whether we learn the weights by matrix inversion or gradient descent

C. The use of a constant-term

Solution: A

Choosing the right degree of polynomial plays a critical role in fit of regression. If we
choose higher degree of polynomial, chances of overfit increase significantly.

40. Suppose you have the following data with one real-value input variable & one real-value
output variable. What is leave-one out cross validation mean square error in case of linear
regression (Y = bX+c)?

A. 10/27

B. 20/27

C. 50/27

D. 49/27

Solution: D

We need to calculate the residuals for each cross validation point. After fitting the line
with 2 points and leaving 1 point for cross validation.

Leave one out cross validation mean square error = (2^2 +(2/3)^2 +1^2) /3 = 49/27
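The arithmetic can be reproduced with a short sketch. The three data points below are an assumption: the table from the question is not shown in the text, so these points were chosen because they give exactly the residuals 2, 2/3 and 1 used above.

```python
from fractions import Fraction

def fit_line(points):
    """Least squares line y = b*x + c through the given (x, y) points."""
    n = len(points)
    x_bar = sum(Fraction(x) for x, _ in points) / n
    y_bar = sum(Fraction(y) for _, y in points) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in points)
         / sum((x - x_bar) ** 2 for x, _ in points))
    return b, y_bar - b * x_bar

def loocv_mse(points):
    """Leave each point out in turn, fit on the rest, and average the squared errors."""
    squared_errors = []
    for i, (x, y) in enumerate(points):
        b, c = fit_line(points[:i] + points[i + 1:])
        squared_errors.append((y - (b * x + c)) ** 2)
    return sum(squared_errors) / len(points)

# Assumed data (not given in the text): reproduces the residuals 2, 2/3 and 1.
points = [(0, 2), (2, 2), (3, 1)]
mse = loocv_mse(points)
```

Exact fractions are used so the result can be compared directly with 49/27.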

41. Which of the following is/ are true about “Maximum Likelihood estimate (MLE)”?

1. MLE may not always exist

2. MLE always exists
3. If MLE exists, it (they) may not be unique
4. If MLE exists, it (they) must be unique

A. 1 and 4

B. 2 and 3

C. 1 and 3

D. 2 and 4

Solution: C

The MLE may not be a turning point i.e. may not be a point at which the first derivative
of the likelihood (and log-likelihood) function vanishes.

* The MLE may not be unique.

42. Let’s say a “Linear regression” model perfectly fits the training data (train error
is zero). Now, which of the following statements is true?

A. You will always have test error zero

B. You can not have test error zero

C. None of the above

Solution: C

Test error may be zero if there is no noise in the test data. In other words, it will be zero
if the test data is a perfect representative of the train data, but this is not always the case.

43. In a linear regression problem, we are using “R-squared” to measure goodness-of-fit.

We add a feature in linear regression model and retrain the same model.

Which of the following option is true?

A. If R Squared increases, this variable is significant.

B. If R Squared decreases, this variable is not significant.

C. Individually R squared cannot tell about variable importance. We can’t say anything about
it right now.

D. None of these.

Solution: C
“R squared” alone can’t tell whether a variable is significant, because each time we add a
feature “R squared” either increases or stays constant. This is not true of “Adjusted R
squared”, which increases only when the added feature is found to be significant.

44. Which one of the statement is true regarding residuals in regression analysis?

A. Mean of residuals is always zero

B. Mean of residuals is always less than zero

C. Mean of residuals is always greater than zero

D. There is no such rule for residuals.

Solution: A

The sum of residuals in a least-squares regression with an intercept is always zero. If the
sum of residuals is zero, the ‘Mean’ will also be zero.
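
A minimal sketch of this property, assuming an ordinary least-squares fit with an intercept. The data values and the helper name `ols_fit` are made up for illustration.

```python
def ols_fit(xs, ys):
    """Ordinary least squares for Y = a + bX; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical noisy data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = ols_fit(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)

# The sum is zero up to floating-point rounding.
print(abs(residual_sum) < 1e-9)
```

The property relies on the intercept term: a line forced through the origin does not, in general, have residuals that sum to zero.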

45. Which of the following is true about heteroskedasticity?

A. Linear Regression with varying error terms

B. Linear Regression with constant error terms

C. Linear Regression with zero error terms

D. None of these

Solution: A

The presence of non-constant variance in the error terms results in heteroskedasticity.

Generally, non-constant variance arises because of presence of outliers or extreme leverage
values.


46. Which of the following indicates a fairly strong relationship between X and Y?

A. Correlation coefficient = 0.9

B. The p-value for the null hypothesis Beta coefficient =0 is 0.0001

C. The t-statistic for the null hypothesis Beta coefficient=0 is 30

D. None of these
Solution: A

Correlation between variables is 0.9. It signifies that the relationship between variables
is fairly strong.

On the other hand, p-value and t-statistics merely measure how strong is the evidence that
there is non zero association. Even a weak effect can be extremely significant given enough
data.

47. Which of the following assumptions do we make while deriving linear regression
parameters?

1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation
4. The predictor x is non-stochastic and is measured error-free

A. 1,2 and 3.

B. 1,3 and 4.

C. 1 and 3.

D. All of above.

Solution: D

When deriving regression parameters, we make all the four assumptions mentioned above. If
any of the assumptions is violated, the model would be misleading.

48. To test the linear relationship of y (dependent) and x (independent) continuous variables,
which of the following plots is best suited?

A. Scatter plot

B. Barchart

C. Histograms

D. None of these

Solution: A

To test the linear relationship between continuous variables Scatter plot is a good option.
We can find out how one variable is changing w.r.t. another variable. A scatter plot displays
the relationship between two quantitative variables.
49. Generally, which of the following method(s) is used for predicting continuous dependent
variable?

1. Linear Regression
2. Logistic Regression

A. 1 and 2

B. only 1

C. only 2

D. None of these.

Solution: B

Logistic Regression is used for classification problems. The term “regression” is misleading
here.

50. A correlation between age and health of a person is found to be -1.09. On the basis of
this you would tell the doctors that:

A. The age is good predictor of health

B. The age is poor predictor of health

C. None of these

Solution: C

The correlation coefficient lies in the range [-1, 1], so -1.09 is not possible.

51. Which of the following offsets, do we use in case of least square line fit? Suppose
horizontal axis is independent variable and vertical axis is dependent variable.

A. Vertical offset

B. Perpendicular offset

C. Both but depend on situation

D. None of above

Solution: A

We always consider residuals as vertical offsets. Perpendicular offsets are useful in the case
of PCA.
52. Suppose we have generated the data with help of polynomial regression of degree 3 (degree
3 will perfectly fit this data). Now consider below points and choose the option based on
these points.

1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance

A. Only 1

B. 1 and 3

C. 1 and 4

D. 2 and 4

Solution: C

If we fit a polynomial of degree higher than 3, it will overfit the data because the model
becomes more complex. If we fit a polynomial of degree lower than 3, the model is less
complex, so it has high bias and low variance. A degree 3 polynomial has low bias and low
variance.

53. Suppose you are training a linear regression model. Now consider these points.

1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small

Which of the above statement(s) are correct?

A. Both are False

B. 1 is False and 2 is True

C. 1 is True and 2 is False

D. Both are True

Solution: C

1. With a small training dataset, it’s easier to find a hypothesis that fits the training data
exactly, i.e. overfitting.
2. We can see this from the bias-variance trade-off. When the hypothesis space is small, it
has higher bias and lower variance. So with a small hypothesis space, it’s less likely to
find a hypothesis that fits the data exactly, i.e. underfitting is more likely.

54. Suppose we fit “Lasso Regression” to a data set, which has 100 features (X1, X2, …, X100).
Now, we rescale one of these features by multiplying it by 10 (say that feature is X1), and
then refit Lasso regression with the same regularization parameter.

Now, which of the following options will be correct?

A. It is more likely for X1 to be excluded from the model

B. It is more likely for X1 to be included in the model

C. Can’t say

D. None of these

Solution: B

Big feature values ⇒ smaller coefficients ⇒ less lasso penalty ⇒ more likely to be kept
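
The effect can be sketched for the simplest case of a single feature with no intercept, where the lasso solution has a soft-thresholding form. This is an illustration under that assumption, not the 100-feature setting of the question. The objective form 0.5*sum((y - b*x)^2) + lam*|b|, the data and the helper names are all hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_1d(xs, ys, lam):
    """Minimise 0.5*sum((y - b*x)^2) + lam*|b| for one feature, no intercept."""
    xy = sum(x * y for x, y in zip(xs, ys))
    xx = sum(x * x for x in xs)
    return soft_threshold(xy, lam) / xx


# Hypothetical data with a weak relationship between x and y.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-0.5, 0.1, -0.1, 0.5]
lam = 5.0

b_original = lasso_1d(xs, ys, lam)                      # feature is dropped
b_rescaled = lasso_1d([10.0 * x for x in xs], ys, lam)  # feature is kept

print(b_original, b_rescaled)
```

With the same penalty, the rescaled feature gets a small non-zero coefficient while the original one is set to zero, which is the behaviour option B describes.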

55. Which of the following is true about “Ridge” or “Lasso” regression methods in case
of feature selection?

A. Ridge regression uses subset selection of features

B. Lasso regression uses subset selection of features

C. Both use subset selection of features

D. None of above

Solution: B

“Ridge regression” will use all predictors in the final model, whereas “Lasso regression” can
be used for feature selection because coefficient values can be exactly zero.

56. Which of the following statement(s) can be true post adding a variable in a linear
regression model?

1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases

A. 1 and 2

B. 1 and 3

C. 2 and 4

D. None of the above

Solution: A

Each time you add a feature, R squared either increases or stays constant, but this is not
true of Adjusted R squared. If Adjusted R squared increases, the feature would be
significant.
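
A small sketch of why both outcomes are possible, using the standard formula Adjusted R squared = 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of features. The R squared values below are made-up numbers for illustration, not results from any dataset.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R squared for n observations and p features."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


n = 50

# Baseline model: 5 features, hypothetical R squared of 0.80.
base = adjusted_r2(0.80, n, 5)

# Adding a weak 6th feature: R squared barely rises, adjusted value drops.
weak = adjusted_r2(0.801, n, 6)

# Adding a useful 6th feature: both R squared and the adjusted value rise.
useful = adjusted_r2(0.85, n, 6)

print(round(base, 4), round(weak, 4), round(useful, 4))
```

The weak feature corresponds to statement 2 and the useful feature to statement 1.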

57. The following visualization shows the fit of three different models (in blue line) on
same training data. What can you conclude from these visualizations?

1. The training error in first model is higher when compared to second and third model.
2. The best model for this regression problem is the last (third) model, because it has
minimum training error.
3. The second model is more robust than first and third because it will perform better
on unseen data.
4. The third model is overfitting data as compared to first and second model.
5. All models will perform same because we have not seen the test data.

A. 1 and 3

B. 1 and 3

C. 1, 3 and 4

D. Only 5

Solution: C

The trend of the data looks quadratic in the independent variable X. A higher degree
polynomial (right graph) might have very high accuracy on the training population but is
expected to fail badly on the test dataset. The left graph has the maximum training error
because it under-fits the training data.

58. Which of the following metrics can be used for evaluating regression models?

1. R Squared
2. Adjusted R Squared
3. F Statistics
4. RMSE / MSE / MAE

A. 2 and 4.

B. 1 and 2.

C. 2, 3 and 4.

D. All of the above.

Solution: D

These (R Squared, Adjusted R Squared, F Statistics , RMSE / MSE / MAE ) are some metrics
which you can use to evaluate your regression model.
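
As a rough illustration of the error-based metrics in the list, the following sketch computes MSE, RMSE and MAE by hand. The true and predicted values are hypothetical.

```python
import math

# Hypothetical targets and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

errors = [t - p for t, p in zip(y_true, y_pred)]

mse = sum(e * e for e in errors) / len(errors)   # mean squared error
rmse = math.sqrt(mse)                            # root mean squared error
mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error

print(mse, rmse, mae)
```

RMSE is in the same units as the target, and MAE is less affected by a single large error than MSE or RMSE.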

59. We can also compute the coefficient of linear regression with the help of an analytical
method called “Normal Equation”. Which of the following is/are true about “Normal
Equation”?

1. We don’t have to choose the learning rate
2. It becomes slow when number of features is very large
3. No need to iterate

A. 1 and 2

B. 1 and 3.

C. 2 and 3.

D. 1,2 and 3.

Solution: D

Instead of gradient descent, the Normal Equation can also be used to find the coefficients.
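
A minimal sketch of the Normal Equation for simple linear regression, where the system (X^T X) θ = X^T y is only 2 x 2 and can be solved directly. The data and the function name `normal_equation_simple` are assumptions made for illustration.

```python
def normal_equation_simple(xs, ys):
    """Solve (X^T X) theta = X^T y for Y = theta0 + theta1 * X.

    X has a column of ones and a column of xs, so X^T X is a 2 x 2 matrix.
    No learning rate and no iterations are needed.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # X^T X = [[n, sx], [sx, sxx]],  X^T y = [sy, sxy]
    det = n * sxx - sx * sx
    theta0 = (sxx * sy - sx * sxy) / det
    theta1 = (n * sxy - sx * sy) / det
    return theta0, theta1


# Hypothetical data lying exactly on Y = 1 + 2X.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = normal_equation_simple(xs, ys)
print(theta0, theta1)  # 1.0 2.0
```

With many features the matrix X^T X grows with the number of features, and solving the system becomes the expensive step, which is why statement 2 holds.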

60. The expected value of Y is a linear function of the X (X1, X2, …, Xn) variables and the
regression line is defined as:

Y = β0 + β1 X1 + β2 X2 + … + βn Xn

Which of the following statement(s) are true?

1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected
value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in general
could be a positive or negative number).
2. The value of βi is always the same, regardless of values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate
effects.

Note: Features are independent of each other (zero interaction).

A. 1 and 2

B. 1 and 3

C. 2 and 3

D. 1,2 and 3

Solution: D

The expected value of Y is a linear function of the X variables. This means:

1. If Xi changes by an amount ∆Xi, holding other variables fixed, then the expected
value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in
general could be a positive or negative number).
2. The value of βi is always the same, regardless of values of the other X’s.
3. The total effect of the X’s on the expected value of Y is the sum of their separate
effects.

The unexplained variations of Y are independent random variables (in particular, not
“auto correlated” if the variables are time series).
They all have the same variance (“homoscedasticity”).
They are normally distributed.

61. How many coefficients do you need to estimate in a simple linear regression model (One
independent variable)?

A. 1

B. 2

C. Can’t Say

Solution: B

In simple linear regression, there is one independent variable so 2 coefficients (Y=a+bx).

62. Below graphs show two fitted regression lines (A & B) on randomly generated data. Now,
I want to find the sum of residuals in both cases A and B.

Note:

Scale is same in both graphs for both axis.

X axis is independent variable and Y-axis is dependent variable.

Which of the following statement is true about sum of residuals of A and B?

A) A has higher than B

B) A has lower than B

C) Both have same

D) None of these

Solution: C

The sum of residuals is always zero.

63. If two variables are correlated, is it necessary that they have a linear relationship?

A. Yes

B. No

Solution: B

It is not necessary. They could have a non-linear relationship.

64. Correlated variables can have zero correlation coefficient. True or False?

A. True

B. False

Solution: A
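
One way to read this answer is that two variables can be strongly related and still have a zero linear (Pearson) correlation coefficient. The sketch below uses Y = X^2 on values symmetric around zero. The data and the helper name `pearson_r` are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Y is completely determined by X, but the relationship is not linear.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print(r)
```

The positive and negative halves of the parabola cancel out in the covariance, so the coefficient is zero even though Y depends entirely on X.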

65. Suppose I applied a logistic regression model on data and got training accuracy X and
testing accuracy Y. Now I want to add few new features in data. Select option(s) which are
correct in such case.

Note: Consider remaining parameters are same.

1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.

A. Only 2

B. Only 1

C. Only 3

D. Only 4

Solution: A

Adding more features to model will always increase the training accuracy i.e. low bias.
But testing accuracy increases if feature is found to be significant.

66. The graph below represents a regression line predicting Y from X. The values on the
graph show the residual for each predicted value. Use this information to compute the
SSE.

A. 3.02

B. 0.75

C. 1.01

D. None of these

Solution: A

SSE is the sum of the squared errors of prediction, so SSE = (-.2)^2 + (.4)^2 + (-.8)^2
+ (1.3)^2 + (-.7)^2 = 3.02
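
The same sum can be checked in a couple of lines, using the residuals quoted in the solution:

```python
# Residuals read from the solution above.
residuals = [-0.2, 0.4, -0.8, 1.3, -0.7]

# SSE is the sum of squared residuals.
sse = sum(r * r for r in residuals)

print(round(sse, 2))  # 3.02
```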

67. Height and weight are well known to be positively correlated. Ignoring the plot scales
(the variables have been standardized), which of the two scatter plots (plot1, plot2) is
more likely to be a plot showing the values of height (Var1 – X axis) and weight (Var2 –
Y axis).

A. Plot2

B. Plot1

C. Both

D. Can’t say
Solution: A

Plot 2 is definitely a better representation of the association between height and weight.
As individuals get taller, they take up more volume, which leads to an increase in weight,
so a positive relationship is expected. The plot on the right has this positive relationship
while the plot on the left shows a negative relationship.

68. Suppose the distribution of salaries in a company X has median $35,000, and 25th and
75th percentiles are $21,000 and $53,000 respectively.

Would a person with Salary $1 be considered an Outlier?

A. Yes

B. No

C. More information is required

D. None of these.

Solution: C

69. Which of the following option is true regarding “Regression” and “Correlation” ?

Note: y is dependent variable and x is independent variable.

A. The relationship is symmetric between x and y in both.

B. The relationship is not symmetric between x and y in both.

C. The relationship is not symmetric between x and y in case of correlation but in case
of regression it is symmetric.

D. The relationship is symmetric between x and y in case of correlation but in case of
regression it is not symmetric.

Solution: D

Correlation is a statistical metric that measures the linear association between two
variables. It treats y and x symmetrically.
Regression is set up to predict y from x. The relationship is not symmetric.

70. Can we calculate the skewness of variables based on mean and median?

A. True
B. False

Solution: B

The skewness is not directly related to the relationship between the mean and median.

71. Suppose you have n datasets with two continuous variables (y is dependent variable and
x is independent variable). We have calculated summary statistics on these datasets. All
of them give the following result:

Are all the given datasets same?

A. Yes

B. No

C. Can’t Say

Solution: C

To answer this question, you should know about Anscombe’s quartet, a set of four datasets
with nearly identical summary statistics that look very different when plotted.

72. How does number of observations influence overfitting? Choose the correct answer(s).

Note: Rest all parameters are same

1. In case of fewer observations, it is easy to overfit the data.
2. In case of fewer observations, it is hard to overfit the data.
3. In case of more observations, it is easy to overfit the data.
4. In case of more observations, it is hard to overfit the data.

A. 1 and 4

B. 2 and 3

C. 1 and 3

D. None of these

Solution: A

In particular, if we have very few observations, our models can rapidly overfit the data.
Because we have only a few points, as we increase model complexity (such as the order of
the polynomial) it becomes very easy to hit all of our observations.
On the other hand, if we have lots of observations, it is difficult to overfit even with very
complex models, because we have dense observations across our input.

73. Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with tuning parameter lambda to reduce its complexity. Choose the option(s)
below which describes relationship of bias and variance with lambda.

A. In case of very large lambda; bias is low, variance is low

B. In case of very large lambda; bias is low, variance is high

C. In case of very large lambda; bias is high, variance is low

D. In case of very large lambda; bias is high, variance is high

Solution: C

If lambda is very large, the model is less complex. So in this case bias is high and
variance is low.

74. Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with tuning parameter lambda to reduce its complexity. Choose the option(s)
below which describes relationship of bias and variance with lambda.

A. In case of very small lambda; bias is low, variance is low

B. In case of very small lambda; bias is low, variance is high

C. In case of very small lambda; bias is high, variance is low

D. In case of very small lambda; bias is high, variance is high

Solution: B

If lambda is very small it means model is complex. So in this case bias is low and variance
is high because model will overfit the data.

75. What is/are true about ridge regression?

1. When lambda is 0, model works like linear regression model
2. When lambda is 0, model doesn’t work like linear regression model
3. When lambda goes to infinity, we get very, very small coefficients approaching 0
4. When lambda goes to infinity, we get very, very large coefficients approaching infinity

A. 1 and 3
B. 1 and 4

C. 2 and 3

D. 2 and 4

Solution: A

Specifically, we can see that when lambda is 0, we get our least square solution. When lambda
goes to infinity, we get very, very small coefficients approaching 0.
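
Both limits can be seen in a minimal one-feature sketch, assuming no intercept and the penalised objective sum((y - b*x)^2) + lambda*b^2, whose minimiser is b = sum(x*y) / (sum(x^2) + lambda). The data and the name `ridge_1d` are hypothetical.

```python
def ridge_1d(xs, ys, lam):
    """Minimise sum((y - b*x)^2) + lam * b^2 for one feature, no intercept."""
    xy = sum(x * y for x, y in zip(xs, ys))
    xx = sum(x * x for x in xs)
    return xy / (xx + lam)


# Hypothetical data lying exactly on Y = 2X.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

b_ols = ridge_1d(xs, ys, 0.0)    # lambda = 0: the least-squares solution
b_mid = ridge_1d(xs, ys, 14.0)   # moderate lambda: coefficient is shrunk
b_huge = ridge_1d(xs, ys, 1e9)   # very large lambda: coefficient near 0

print(b_ols, b_mid, b_huge)
```

The coefficient shrinks smoothly towards zero as lambda grows, but for any finite lambda it does not become exactly zero.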

76. Out of the three residual plots given below, which of the following represent worse
model(s) compared to others?

Note:

All residuals are standardized.

The plots are between predicted values Vs. residuals

A. 1

B. 2

C. 3

D. 1 and 2

Solution: C

There should not be any relationship between predicted values and residuals. If any
relationship exists between them, the model has not fully captured the information in the data.

77. Which of the following method(s) does not have closed form solution for its coefficients?

A. Ridge regression

B. Lasso

C. Both Ridge and Lasso

D. Neither

Solution: B

The Lasso does not admit a closed-form solution. The L1-penalty makes the solution
non-linear. So we need to approximate the solution.
78. Consider the following dataset

Which bold point, if removed will have the largest effect on fitted regression line as shown
in above figure(dashed)?

A) a

B) b

C) c

D) d

Solution: D

Linear regression is sensitive to outliers in the data. Although c is also an outlier in the
given data space, it is close to the regression line (its residual is small), so it will not
affect the fit much.

79. In a simple linear regression model (one independent variable), if we change the input
variable by 1 unit, how much will the output variable change?

A. By 1

B. No change

C. By intercept

D. By its Slope

Solution: D

Equation for simple linear regression: Y=a+bx. Now if we increase the value of x by 1 then
the value of y would be a+b(x+1) i.e. value of y will get incremented by b.

80. Logistic regression transforms its output into a probability in the range [0, 1]. Which
of the following functions does logistic regression use to map the output into the range
[0, 1]?

A. Sigmoid

B. Mode

C. Square
D. Probit

Solution: A

The sigmoid function is used to map the output to a probability in [0, 1] in logistic regression.
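
A short sketch of the sigmoid, 1 / (1 + e^(-z)), written in a numerically stable form so that very large negative inputs do not overflow. The input values are arbitrary examples.

```python
import math


def sigmoid(z):
    """Logistic sigmoid 1 / (1 + exp(-z)), computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form exp(z) / (1 + exp(z)).
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    print(z, sigmoid(z))
```

The linear score β0 + β1 X can be any real number, and the sigmoid maps it to a value between 0 and 1 that is read as a probability.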

81. Which of the following statements is true about the partial derivative of the cost
function w.r.t. weights / coefficients in linear regression and logistic regression?

A. Both will be different

B. Both will be same

C. Can’t say

D. None of these

Solution: B

82. Suppose, we are using Logistic regression model for n-class classification problem.
In this case, we can use One-vs-rest method. Choose which of the following option is true
regarding this?

A. We need to fit n model in n-class classification problem.

B. We need to fit n-1 models to classify into n classes.

C. We need to fit only 1 model to classify into n classes.

D. None of these.

Solution: A

If there are n classes, then n separate logistic regression models have to be fit, where the
probability of each category is predicted against the rest of the categories combined.

Take an example of 3-class (-1, 0, 1) classification. We then need to train 3 logistic
regression classifiers:

1. -1 vs 0 and 1
2. 0 vs -1 and 1
3. 1 vs 0 and -1
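
The relabelling behind One-vs-rest can be sketched as follows. Only the construction of the binary targets is shown; training the logistic regression models themselves is left out. The label list and the function name `one_vs_rest_targets` are made up for illustration.

```python
def one_vs_rest_targets(labels):
    """Build one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


# Hypothetical labels for the 3-class (-1, 0, 1) example.
labels = [-1, 0, 1, 1, 0, -1]

targets = one_vs_rest_targets(labels)
for cls, binary in targets.items():
    print(cls, "vs rest:", binary)

# One binary classifier would be trained on each of these target vectors.
print(len(targets))  # 3
```

At prediction time, the class whose classifier gives the highest probability is usually chosen.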

83. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about β0 and β1 values of two logistics models
(Green, Black)?
Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A. β1 for Green is greater than Black

B. β1 for Green is lower than Black

C. β1 for both models is same

D. Can’t Say.

Solution: B
Name of Faculty: Dr Roshani Raut
Name of Subject: Machine Learning & Applications
Year: BE
Branch: IT

Q.no | Description Question | Choice A | Choice B | Choice C | Choice D | Unit No | Blooms Taxonomy Level | Difficulty Level (Easy-1/Medium-2/Hard-3)
1 | In multiclass classification number of classes must be | Less than two | Equals to two | Greater than two | option 1 and option 2 | 2 | 1 | 1
2 | Application of machine learning methods to large databases is called | Data Mining. | Artificial Intelligence | Big Data Computing | Internet of Things | 1 | 2 | 1
3 | If machine learning model output involves target variable then that model is called as | Descriptive model | Predictive Model | Reinforcement Learning | All of the above | 1 | 1 | 1
4 | In what type of learning labelled training data is used | Unsupervised Learning | Supervised Learning | Reinforcement Learning | Active Learning | 1 | 1 | 1
5 | In following type of feature selection method we start with empty feature set | Forward Feature selection | Backward Feature selection | Both A and B | None of the above | 1 | 1 | 1
6 | In PCA the number of input dimensions are equal to principal components | True | FALSE | | | 1 | 1 | 1
7 | PCA can be used for projecting and visualizing data in lower dimensions. | True | FALSE | | | 1 | 1 | 1
8 | Which of the following is the best machine learning method? | Accuracy | Scalable | Fast | All of the above | 1 | 1 | 1

9 | What characterizes unlabeled examples in machine learning | There is no prior knowledge | There is no confusing knowledge | There is prior knowledge | There is plenty of confusing knowledge | 1 | 2 | 1
10 | What does dimensionality reduction reduce? | stochastics | collinearity | performance | Entropy | 1 | 1 | 1
11 | Data used to build a data mining model. | Training data | Validation data | test data | hidden data | 1 | 1 | 1
12 | The problem of finding hidden structure in unlabeled data is called… | Supervised learning | Unsupervised learning | Reinforcement learning | None of the above | 1 | 1 | 1
13 | The difference between the actual Y value and the predicted Y value found using a regression equation is called the | slope | residual | outlier | scatter plot | 3 | 1 | 1
14 | Which of the following can only be used when training data are linearly separable? | Linear hard-margin SVM | Linear Logistic Regression | Linear Soft margin SVM | The centroid method | 2 | 1 | 1
15 | Impact of high variance on the training set? | overfitting | underfitting | both underfitting & overfitting | Depends upon the dataset | 2 | 1 | 1
16 | What do you mean by a hard margin? | The SVM allows very low error in classification | The SVM allows high amount of error in classification | Both 1 & 2 | None of the above | 2 | 1 | 1
17 | The effectiveness of an SVM depends upon: | Selection of Kernel | Kernel Parameters | Soft Margin Parameter C | All of the above | 2 | 1 | 1
18 | What is back propagation? | It is another name given to the curvy function in the perceptron | It is the transmission of error back through the network to adjust the inputs | It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn | None of the mentioned | 6 | 2 | 1
19 | What are support vectors? | The only examples necessary to compute f(x) in an SVM. | All the examples that have a non-zero weight αk in a SVM | All of the above | None of the above | 2 | 2 | 1
20 | Neural networks | optimize a convex cost function | always output values between 0 and 1 | can be used for regression as well as classification | All of the above | 3 | 2 | 1
21 | Of the following examples, which would you address using a supervised learning algorithm? | Given email labeled as Spam or not Spam, learn a spam filter | Given a set of news articles found on the web, group them into set of articles about the same story. | Given a database of customer data, automatically discover market segments and group customers into different market segments. | Find the patterns in Market Basket Analysis | 1 | 1 | 2
22 | Dimensionality Reduction Algorithms are one of the possible ways to reduce the computation time required to build a model | TRUE | FALSE | | | 1 | 1 | 2
23 | You are given reviews of few netflix series marked as positive, negative and neutral. Classifying reviews of a new netflix series is an example of | Supervised Learning | Unsupervised Learning | Semisupervised Learning | Reinforcement Learning | 1 | 1 | 2
24 | Which of the following is a good test dataset characteristic? | Large enough to yield meaningful results | Is representative of the dataset as a whole | Both A and B | None of the above | 1 | 2 | 2
25 | Following are the types of supervised learning | Classification | Regression | subgroup discovery | All of the above | 1 | 1 | 2
26 | Type of matrix decomposition model is | Descriptive model | Predictive Model | Logical model | None of the above | 1 | 3 | 2
27 | Following is powerful distance metrics used by Geometric model | Euclidean distance | Manhattan distance | Both A and B | square distance | 1 | 2 | 2
28 | The output of training process in machine learning is | machine learning model | machine learning algorithm | null | accuracy | 1 | 1 | 2
29 | A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from a college. Here feature type is | nominal | ordinal | categorical | boolean | 1 | 3 | 2
30 | PCA is | Forward Feature selection | Backward Feature selection | Feature Extraction | All of the above | 1 | 1 | 2
31 | Dimensionality reduction algorithms are one of the possible ways to reduce the computation time required to build a model. | True | FALSE | | | 1 | 1 | 2
32 | Which of the following techniques would perform better for reducing dimensions of a data set? | Removing columns which have too many missing values | Removing columns which have high variance in data | Removing columns with dissimilar data trends | None of these | 1 | 3 | 2
33 | Supervised learning and unsupervised clustering both require which is correct according to the statement. | output attribute. | hidden attribute. | input attribute. | categorical attribute | 1 | 2 | 2
34 | What characterizes a hyperplane in the geometrical model of machine learning? | A plane with 1 dimensional fewer than number of input attributes | A plane with 2 dimensional fewer than number of input attributes | A plane with 1 dimensional more than number of input attributes | A plane with 2 dimensional more than number of input attributes | 1 | 2 | 2
35 | Like the probabilistic view, the ________ view allows us to associate a probability of membership with each classification. | exemplar | deductive | classical | inductive | 1 | 2 | 2
36 | Database query is used to uncover this type of knowledge. | deep | hidden | shallow | multidimensional | 1 | 2 | 2
37 | A person trained to interact with a human expert in order to capture their knowledge. | knowledge programmer | knowledge developer | knowledge engineer | knowledge extractor | 1 | 2 | 2
38 | Some telecommunication company wants to segment their customers into distinct groups; this is an example of | Supervised learning | Unsupervised learning | Reinforcement learning | Data extraction | 1 | 2 | 2
39 | In the example of predicting number of babies based on stork's population, number of babies is | outcome | feature | observation | attribute | 1 | 3 | 2
40 | Linear Regression is a _______ machine learning algorithm. | Supervised | Unsupervised | Semi-Supervised | Can't say | 3 | 1 | 2
41 | A perceptron adds up all the weighted inputs it receives, and if it exceeds a certain value, it outputs a 1, otherwise it just outputs a 0. | TRUE | False | Sometimes – it can also output intermediate values as well | Can’t say | 2 | 1 | 2
42 | What is the purpose of the Kernel Trick? | To transform the data from nonlinearly separable to linearly separable | To transform the problem from regression to classification | To transform the problem from supervised to unsupervised learning. | All of the above | 2 | 1 | 2
43 | Which of the following can only be used when training data are linearly separable? | Linear hard-margin SVM | Linear Logistic Regression | Linear Soft margin SVM | Parzen windows | 2 | 1 | 2
44 | The firing rate of a neuron | determines how strongly the dendrites of the neuron stimulate axons of neighboring neurons | is more analogous to the output of a unit in a neural net than the output voltage of the neuron | only changes very slowly, taking a period of several seconds to make large adjustments | can sometimes exceed 30,000 action potentials per second | 2 | 1 | 2
45 | Which of the following method(s) do we use to find the best fit line for data in Linear Regression? | Least Square Error | Maximum Likelihood | Logarithmic Loss | Both A and B | 3 | 2 | 2
46 | Which of the following methods do we use to best fit the data in Logistic Regression? | Least Square Error | Maximum Likelihood | Jaccard distance | Both A and B | 3 | 2 | 2
47 | Which of the following evaluation metrics can not be applied in case of logistic regression output to compare with target? | AUC-ROC | Accuracy | Logloss | Mean-Squared-Error | 2 | 2 | 2
48 | Which of the following is an application of NN (Neural Network)? | Sales forecasting | Data validation | Risk management | All of the mentioned | 6 | 2 | 2
49 | Neural Networks are complex ______________ with many parameters. | Linear Functions | Nonlinear Functions | Discrete Functions | Exponential Functions | 6 | 2 | 2
50 | The cost parameter in the SVM means: | The number of cross-validations to be made | The kernel to be used | The tradeoff between misclassification and simplicity of the model | None of the above | 2 | 2 | 2
51 | Lasso can be interpreted as least-squares linear regression where | weights are regularized with the L1 norm | the weights have a Gaussian prior | weights are regularized with the L2 norm | the solution algorithm is simpler | 3 | 2 | 2
52 | The kernel trick | can be applied to every classification algorithm | changes ridge regression so we solve a d × d linear system instead of an n × n system, given n sample points with d features | is commonly used for dimensionality reduction | exploits the fact that in many learning algorithms, the weights can be written as a linear combination of input points | 2 | 2 | 2
53 | How does the bias-variance decomposition of a ridge regression estimator compare with that of ordinary least squares regression? | Ridge has larger bias, larger variance | Ridge has smaller bias, larger variance | Ridge has larger bias, smaller variance | Ridge has smaller bias, smaller variance | 2 | 2 | 2
54 | Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable? | AUC-ROC | Accuracy | Logloss | Mean-Squared-Error | 3 | 3 | 2
Classifiers which perform
Classifiers which form a tree series of condition checking
55 What are tree based classifiers? Both options except none None of the options 4 1 2
with each attribute at one level with one attribute
at a time
56 What is gini index? It is a type of index structure It is a measure of purity Both options except none None of the options 4 1 2
Which of the following sentences are correct in
reference to
Information gain?
57 a. It is biased towards single-valued attributes a and b a and d b, c and d All of the above 4 1 2
b. It is biased towards multi-valued attributes
c. ID3 makes use of information gain
d. The approact used by ID3 is greedy
58 Multivariate split is where the partitioning of tuples is based on a combination of attributes rather than on a single attribute.
A) TRUE
B) FALSE
4 1 2
59 Gain ratio tends to prefer unbalanced splits in which one partition is much smaller than the other.
A) TRUE
B) FALSE
4 1 2
60 The Gini index is not biased towards multivalued attributes.
A) TRUE
B) FALSE
4 1 2
61 Gini index does not favour equal sized partitions.
A) TRUE
B) FALSE
4 1 2
62 When the number of classes is large, the Gini index is not a good choice.
A) TRUE
B) FALSE
4 1 2
63 Attribute selection measures are also known as splitting rules.
A) TRUE
B) FALSE
4 1 2
64 This clustering approach initially assumes that each data instance represents a single cluster.
A) expectation maximization
B) K-Means clustering
C) agglomerative clustering
D) conceptual clustering
4 1 2
65 Which statement is true about the K-Means algorithm?
A) The output attribute must be categorical
B) All attribute values must be categorical
C) All attributes must be numeric
D) Attribute values may be either categorical or numeric
4 1 2
66 The probability of a hypothesis before the presentation of evidence.
A) a priori
B) posterior
C) conditional
D) subjective
5 1 2
67 KDD represents extraction of
A) data
B) knowledge
C) rules
D) model
4 1 2
68 The most general form of distance is
A) Manhattan
B) Euclidean
C) Mean
D) Minkowski
4 1 2
69 With Bayes theorem, the probability of hypothesis H, specified by P(H), is referred to as
A) a conditional probability
B) an a priori probability
C) a bidirectional probability
D) a posterior probability
5 1 2
70 Simple regression assumes a __________ relationship between the input attribute and output attribute.
A) quadratic
B) inverse
C) linear
D) reciprocal
3 1 2
71 Which of the following algorithms comes under classification?
A) Apriori
B) Brute force
C) DBSCAN
D) K-nearest neighbor
4 1 2
72 Hierarchical agglomerative clustering is typically visualized as?
A) Dendrogram
B) Binary trees
C) Block diagram
D) Graph
4 1 2
73 The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent from being considered for counting support.
A) Partitioning
B) Candidate generation
C) Itemset eliminations
D) Pruning
4 1 2
74 The distance between two points calculated using Pythagoras theorem is
A) Supremum distance
B) Euclidean distance
C) Linear distance
D) Manhattan distance
4 1 2
75 Which learning requires self-assessment to identify patterns within data?
A) Unsupervised Learning
B) Supervised Learning
C) Semisupervised Learning
D) Reinforced Learning
1 1 3
76 Select the correct answer for the following statements.
1. Filter methods are much faster compared to wrapper methods.
2. Wrapper methods use statistical methods for evaluation of a subset of features while filter methods use cross validation.
A) Both are True
B) 1 is True and 2 is False
C) Both are False
D) 1 is False and 2 is True
1 2 3
77 The "curse of dimensionality" refers to
A) All the problems that arise when working with data in the lower dimensions, that did not exist in the lower dimensions.
B) All the problems that arise when working with data in the higher dimensions, that did not exist in the lower dimensions.
C) All the problems that arise when working with data in the lower dimensions, that did not exist in the higher dimensions.
D) All the problems that arise when working with data in the higher dimensions, that did not exist in the higher dimensions.
1 2 3
78 In simple terms, machine learning is
A) Training based on historical data
B) Prediction to answer a query
C) Both A and B
D) Automation of complex tasks
1 1 3
79 If a machine learning model's output does not involve a target variable, then that model is called a
A) Descriptive model
B) Predictive Model
C) Reinforcement Learning
D) All of the above
1 1 3
80 The following are descriptive models:
A) Clustering
B) Classification
C) Association rule
D) Both a and c
1 1 3
81 Different learning methods do not include:
A) Memorization
B) Analogy
C) Deduction
D) Introduction
1 3 3
82 A measurable property or parameter of the data set is
A) training data
B) feature
C) test data
D) validation data
1 2 3
83 A feature can be used as a
A) Binary split
B) Predictor
C) Both A and B
D) None of the above
1 1 3
84 It is not necessary to have a target variable for applying dimensionality reduction algorithms.
A) True
B) False
1 1 3
85 The most popularly used dimensionality reduction algorithm is Principal Component Analysis (PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. Maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
A) 1&2
B) 2&3
C) 3&4
D) All of the above
1 3 3
86 Which of the following is a reasonable way to select the number of principal components "k"?
A) Choose k to be the smallest value so that at least 99% of the variance is retained. (answer)
B) Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
C) Choose k to be the largest value so that 99% of the variance is retained.
D) Use the elbow method
1 3 3
87 Which of the following is an example of feature extraction?
A) Constructing a bag of words from an email
B) Applying PCA to project high dimensional data
C) Removing stop words
D) Forward selection
1 3 3
The result of application of Discipline in statistics used to
Value entered in database by
88 Prediction is specific theory or rule in a find projections in Independent of data 1 2 3
expert
specific case multidimensional data
You are given seismic data and you want to predict Supervised learning Unsupervised learning
89 Reinforcement learning Dimensionality reduction 1 3 3
next earthquake , this is an example of
90 In the regression equation Y = 75.65 + 0.50X, the intercept is
A) 0.5
B) 75.65
C) 1
D) indeterminable
3 1 3
91 The selling price of a house depends on many factors. For example, it depends on the number of bedrooms, number of kitchens, number of bathrooms, the year the house was built, and the square footage of the lot. Given these factors, predicting the selling price of the house is an example of a ____________ task.
A) Binary Classification
B) Multilabel Classification
C) Simple Linear Regression
D) Multiple Linear Regression
3 1 3
92 Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?
A) You will add more features
B) You will remove some features
C) All of the above
D) None of the above
3 1 3
93 Which of the following are real world applications of the SVM?
A) Text and Hypertext Categorization
B) Image Classification
C) Clustering of News Articles
D) All of the above
2 1 3
94 How can SVM be classified?
A) It is a model trained using unsupervised learning. It can be used for classification and regression.
B) It is a model trained using unsupervised learning. It can be used for classification but not for regression.
C) It is a model trained using supervised learning. It can be used for classification and regression.
D) It is a model trained using unsupervised learning. It can be used for classification but not for regression.
2 1 3
95 Which of the following can help to reduce overfitting in an SVM classifier?
A) Use of slack variables
B) High-degree polynomial features
C) Normalizing the data
D) Setting a very low learning rate
2 1 3
96 We have been given a dataset with n records in which we have input attribute x and output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
3 2 3
97 We have been given a dataset with n records in which we have input attribute x and output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. What do you expect will happen with bias and variance as you increase the size of the training data?
A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
3 2 3
98 Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.)
(i) Models which overfit are more likely to have high bias
(ii) Models which overfit are more likely to have low bias
(iii) Models which overfit are more likely to have high variance
(iv) Models which overfit are more likely to have low variance
A) (i) and (ii)
B) (ii) and (iii)
C) (iii) and (iv)
D) None of these
3 2 3
99 Which of the following indicates the fundamental of least squares?
A) arithmetic mean should be maximized
B) arithmetic mean should be zero
C) arithmetic mean should be neutralized
D) arithmetic mean should be minimized
3 2 3
100 Having multiple perceptrons can actually solve the XOR problem satisfactorily: this is because each perceptron can partition off a linear part of the space itself, and they can then combine their results.
A) True – this works always, and these multiple perceptrons learn to classify even complex problems
B) False – perceptrons are mathematically incapable of solving linearly inseparable functions, no matter what you do
C) True – perceptrons can do this but are unable to learn to do it – they have to be explicitly hand-coded
D) False – just having a single perceptron is enough
6 2 3
101 Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer that your SVM model is underfitting. Which of the following options would you be most likely to consider when iterating on the SVM next time?
A) You want to increase your data points
B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features
2 2 3
102 What is/are true about the kernel in SVM?
1. The kernel function maps low dimensional data to a high dimensional space
2. It’s a similarity function
A) 1
B) 2
C) 1 and 2
D) None of these
2 2 3
103 You trained a binary classifier model which gives very high accuracy on the training data, but much lower accuracy on validation data. Which of the following is false?
A) This is an instance of overfitting
B) This is an instance of underfitting
C) The training was not well regularized
D) The training and testing examples are sampled from different distributions
2 2 3
104 Suppose that we have N independent variables (X1, X2, …, Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best fit line using least square error on this data. You found that the correlation coefficient for one of its variables (say X1) with Y is 0.95.
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
3 3 3
105 In terms of bias and variance, which of the following is true when you fit a degree 2 polynomial?
A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
3 3 3
106 Which of the following statements are true for a design matrix $X \in \mathbb{R}^{n \times d}$ with d > n? (The rows are n sample points and the columns represent d features.)
A) Least-squares linear regression computes the weights $w = (X^\top X)^{-1} X^\top y$
B) The sample points are linearly separable
C) X has exactly d − n eigenvectors with eigenvalue zero
D) At least one principal component direction is orthogonal to a hyperplane that contains all the sample points
3 3 3
107 Suppose your model is demonstrating high variance across the different training sets. Which of the following is NOT a valid way to try to reduce the variance?
A) Increase the amount of training data in each training set
B) Improve the optimization algorithm being used for error minimization.
C) Decrease the model complexity
D) Reduce the noise in the training data
2 3 3
108 Point out the wrong statement.
A) Regression through the origin yields an equivalent slope if you center the data first
B) Normalizing variables results in the slope being the correlation
C) Least squares is not an estimation tool
D) None of the mentioned
3 3 3
109 Suppose you are using an RBF kernel in SVM with a high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) None of the above
2 3 3
110 We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization?
1. We do feature normalization so that a new feature will dominate the others
2. Sometimes, feature normalization is not feasible in the case of categorical variables
3. Feature normalization always helps when we use the Gaussian kernel in SVM
A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3
2 3 3
111 Wrapper methods are hyper-parameter selection methods that
A) Should be used whenever possible because they are computationally efficient
B) Should be avoided unless there are no other options because they are always prone to overfitting.
C) Are useful mainly when the learning machines are “black boxes”
D) Should be avoided altogether.
2 3 3
112 Which of the following methods cannot achieve zero training error on any linearly separable dataset?
A) Decision tree
B) 15-nearest neighbors
C) Hard-margin SVM
D) Perceptron
2 3 3
113 Suppose we train a hard-margin linear SVM on n > 100 data points in $\mathbb{R}^2$, yielding a hyperplane with exactly 2 support vectors. If we add one more data point and retrain the classifier, what is the maximum possible number of support vectors for the new hyperplane (assuming the n + 1 points are linearly separable)?
A) 2
B) 3
C) n
D) n+1
2 3 3
114 Let S1 and S2 be the sets of support vectors and w1 and w2 be the learnt weight vectors for a linearly separable problem using hard and soft margin linear SVMs respectively. Which of the following are correct?
A) S1 ⊂ S2
B) S1 may not be a subset of S2
C) w1 = w2
D) All of the above
2 3 3
115 Which one of these is not a tree based learner?
A) CART
B) ID3
C) Bayesian classifier
D) Random Forest
4 2 3
116 Which one of these is a tree based learner?
A) Rule based
B) Bayesian Belief Network
C) Bayesian classifier
D) Random Forest
4 2 3
117 What is the approach of the basic algorithm for decision tree induction?
A) Greedy
B) Top Down
C) Procedural
D) Step by Step
4 2 3
118 Which of the following classifications would best suit the student performance classification systems?
A) If...then... Analysis
B) Market-basket analysis
C) Regression analysis
D) Cluster analysis
4 2 3
119 Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?
A) Yes
B) No
4 2 3
120 This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration.
A) K-Means clustering
B) conceptual clustering
C) expectation maximization
D) agglomerative clustering
4 2 3
121 The number of iterations in apriori ___________
A) increases with the size of the data
B) decreases with the increase in size of the data
C) increases with the size of the maximum frequent set
D) decreases with increase in size of the maximum frequent set
4 2 3
Superset of both closed
Frequent item sets is Superset of only closed frequent Superset of only maximal Subset of maximal frequent
122 frequent item sets and 4 2 3
item sets frequent item sets item sets
maximal frequent item sets
123 A good clustering method will produce high quality clusters with
A) high inter class similarity
B) low intra class similarity
C) high intra class similarity
D) no inter class similarity
4 2 3
Both techniques build models
Both models require numeric
Which statement is true about neural network and whose output is determined by The output of both models is Both models require input
124 attributes to range between 0 4 2 3
linear regression models? a linear sum of weighted input a categorical attribute value attributes to be numeric
and 1
attribute values
Outliers should be part of the The nature of the problem Outliers should be part of the
Outliers should be identified
125 Which statement about outliers is true? training dataset but should not determines how outliers are test dataset but should not be 2 2 3
and removed from a dataset
be present in the test data used present in the training data
126 Which Association Rule would you prefer?
A) High support and medium confidence
B) High support and low confidence
C) Low support and high confidence
D) Low support and low confidence
4 2 3
127 In a rule based classifier, if there is a rule for each combination of attribute values, what do you call that rule set R?
A) Exhaustive
B) Inclusive
C) Comprehensive
D) Mutually exclusive
4 2 3
128 The apriori property means
A) If a set cannot pass a test, its supersets will also fail the same test
B) To decrease the efficiency, do level-wise generation of frequent item sets
C) To improve the efficiency, do level-wise generation of frequent item sets
D) If a set can pass a test, its supersets will fail the same test
4 2 3
129 If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are
A) Undefined
B) Not frequent
C) Frequent
D) Can not say
4 2 3
130 Clustering is ___________ and is an example of ____________ learning
A) Predictive and supervised
B) Predictive and unsupervised
C) Descriptive and supervised
D) Descriptive and unsupervised
4 2 3
131 To determine association rules from frequent item sets
A) Only minimum confidence needed
B) Neither support nor confidence needed
C) Both minimum support and confidence are needed
D) Minimum support is needed
4 2 3
132 If {A,B,C,D} is a frequent itemset, the candidate rule which is not possible is
A) C –> A
B) D –> ABCD
C) A –> BC
D) B –> ADC
4 2 3
133 Which Association Rule would you prefer?
A) Low support and high confidence
B) Low support and low confidence
C) High support and medium confidence
D) High support and low confidence
4 2 3
134 The probability that a person owns a sports car given that they subscribe to automotive magazine is 40%. We also know that 3% of the adult population subscribes to automotive magazine. The probability of a person owning a sports car given that they don’t subscribe to automotive magazine is 30%. Use this information to compute the probability that a person subscribes to automotive magazine given that they own a sports car.
A) 0.0398
B) 0.0389
C) 0.0368
D) 0.0396
5 3 3
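The quantity asked for in question 134 is a direct application of Bayes' theorem. The following is a minimal Python sketch that uses only the numbers stated in the question; the function name `posterior` and its parameter names are illustrative choices, not part of the source.

```python
def posterior(p_e_given_h, p_h, p_e_given_not_h):
    """Return P(H | E) via Bayes' theorem for a binary hypothesis H."""
    # Total probability of the evidence E over both cases (H and not H).
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)
    return p_e_given_h * p_h / p_e


# H = subscribes to the magazine, E = owns a sports car.
p_subscribes_given_car = posterior(0.40, 0.03, 0.30)
print(round(p_subscribes_given_car, 4))  # 0.0396
```

The denominator is the law of total probability: 0.40 × 0.03 + 0.30 × 0.97 = 0.303, so the posterior is 0.012 / 0.303.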
135 This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration.
A) conceptual clustering
B) K-Means clustering
C) expectation maximization
D) agglomerative clustering
4 2 3
136 Classification rules are extracted from _____________
A) decision tree
B) root node
C) branches
D) siblings
4 2 3
137 What does K refer to in the K-Means algorithm, which is a non-hierarchical clustering approach?
A) Complexity
B) Fixed value
C) No of iterations
D) number of clusters
4 2 3
138 PCA works better if there is
1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
1 3 4
139 If TP=9, FP=6, FN=26, TN=70, then the error rate will be
A) 45 percent
B) 99 percent
C) 28 percent
D) 20 percent
2 3 4
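The error rate in question 139 is the fraction of misclassified instances, (FP + FN) / (TP + FP + FN + TN). A short Python sketch under that standard definition (the helper name `error_rate` is an assumption for illustration):

```python
def error_rate(tp, fp, fn, tn):
    """Fraction of all predictions that are wrong."""
    return (fp + fn) / (tp + fp + fn + tn)


rate = error_rate(tp=9, fp=6, fn=26, tn=70)
print(f"{rate:.1%}")  # 28.8%
```

With 32 errors out of 111 instances, the result is about 28.8%, which the options round down to 28 percent.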
140 Imagine you are solving a classification problem with highly imbalanced classes. The majority class is observed 99% of the time in the training data. Your model has 99% accuracy after taking the predictions on test data. Which of the following is true in such a case?
1. Accuracy metric is not a good idea for imbalanced class problems.
2. Accuracy metric is a good idea for imbalanced class problems.
3. Precision and recall metrics are good for imbalanced class problems.
4. Precision and recall metrics aren’t good for imbalanced class problems.
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
2 3 4
141 The minimum time complexity for training an SVM is $O(n^2)$. According to this fact, what sizes of datasets are not best suited for SVMs?
A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter
2 1 4
142 How will you counter over-fitting in a decision tree?
A) By pruning the longer rules
B) By creating new rules
C) Both ‘By pruning the longer rules’ and ‘By creating new rules’
D) None of the options
4 3 4
143 What are the two steps of tree pruning work?
A) Pessimistic pruning and Optimistic pruning
B) Postpruning and Prepruning
C) Cost complexity pruning and time complexity pruning
D) None of the options
4 3 4
144 Which of the following sentences are true?
A) In pre-pruning a tree is 'pruned' by halting its construction early
B) A pruning set of class labelled tuples is used to estimate cost complexity
C) The best pruned tree is the one that minimizes the number of encoding bits
D) All of the above
4 3 4
145 Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on optimising?
A) Accuracy of the decision tree model on the given data set
B) F1 measure of the decision tree model on the given data set
C) Fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
D) Comprehensibility of the decision tree model, measured in terms of the size of the corresponding rule set
4 3 4
146 Which of the following properties are characteristic of decision trees?
(a) High bias
(b) High variance
(c) Lack of smoothness of prediction surfaces
(d) Unbounded parameter set
A) a and b
B) a and d
C) b, c and d
D) All of the above
4 3 4
147 To control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true?
(a) It would, in general, help restrict the size of the trees
(b) It has the potential to affect the performance of the resultant regression/classification model
(c) It is computationally infeasible
A) a and b
B) a and d
C) b, c and d
D) All of the above
4 3 4
148 Which among the following statements best describes our approach to learning decision trees?
A) Identify the best partition of the input space and response per partition to minimise sum of squares error
B) Identify the best approximation of the above by the greedy approach (to identifying the partitions)
C) Identify the model which gives performance close to the best greedy approximation performance (option (b)) with the smallest partition scheme
D) Identify the model which gives the best performance using the greedy approximation (option (b)) with the smallest partition scheme
4 3 4
149 Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are four training data points with the following outputs: 8.7, 9.8, 10.5, 11. What were the original responses for data points along the two branches (left & right respectively) and what is the new response after collapsing the node?
A) 10.8, 13.33, 14.48
B) 10.8, 13.33, 12.06
C) 7.2, 10, 8.8
D) 7.2, 10, 8.6
4 3 4
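The arithmetic in question 149 can be checked with a few lines of Python. This sketch assumes the usual regression-tree convention that a leaf's response is the mean of the training outputs that fall into it; the variable names are illustrative.

```python
from statistics import mean

# Training outputs reaching each branch of the node being collapsed.
left = [5, 7, 9.6]
right = [8.7, 9.8, 10.5, 11]

left_response = mean(left)                # response of the left leaf
right_response = mean(right)              # response of the right leaf
collapsed_response = mean(left + right)   # response after merging both leaves

print(round(left_response, 2), round(right_response, 2), round(collapsed_response, 2))
# 7.2 10.0 8.8
```

Note that the collapsed response is the mean over all seven points (61.6 / 7), not the average of the two branch means.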
150 Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed?
(a) The collapsed node helped overcome the effect of one or more noise affected data points in the training set
(b) The validation set had one or more noise affected data points in the region corresponding to the collapsed node
(c) The validation set did not have any data points along at least one of the collapsed branches
(d) The validation set did have data points adversely affected by the collapsed node
A) a and b
B) a and d
C) b, c and d
D) All of the above
4 3 4
151 Time complexity of k-means is given by
A) $O(mn)$
B) $O(tkn)$
C) $O(kn)$
D) $O(t^2kn)$
4 3 4
152 Which one of the following is not a major strength of the neural network approach?
A) Neural network learning algorithms are guaranteed to converge to an optimal solution
B) Neural networks work well with datasets containing noisy data
C) Neural networks can be used for both supervised learning and unsupervised clustering
D) Neural networks can be used for applications that require a time element to be included in the data
6 3 4
153 In the Apriori algorithm, if there are 100 frequent 1-itemsets, then the number of candidate 2-itemsets is
A) 100
B) 200
C) 4950
D) 5000
4 3 4
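For question 153, candidate 2-itemsets are formed by pairing frequent 1-itemsets, so the count is the number of unordered pairs. A small Python sketch, assuming every pair of the 100 frequent items becomes a candidate (the item identifiers 0..99 are placeholders):

```python
from itertools import combinations
from math import comb

frequent_items = range(100)  # placeholder identifiers for the 100 frequent 1-itemsets

# Enumerate every unordered pair explicitly, then compare with the closed form C(100, 2).
candidate_pairs = sum(1 for _ in combinations(frequent_items, 2))
print(candidate_pairs, comb(100, 2))  # 4950 4950
```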
154 Significant Bottleneck in the Apriori algorithm is Finding frequent itemsets Pruning Candidate generation Number of iterations 4 3 4
typically assume an
Machine learning techniques differ from statistical are better able to deal with have trouble with large-sized are not able to explain their
155 underlying distribution for the 4 3 4
techniques in that machine learning methods missing and noisy data datasets behavior
data
156 The probability that a person owns a sports car given that they subscribe to automotive magazine is 40%. We also know that 3% of the adult population subscribes to automotive magazine. The probability of a person owning a sports car given that they don't subscribe to automotive magazine is 30%. Use this information to compute the probability that a person subscribes to automotive magazine given that they own a sports car.
A) 0.0368
B) 0.0396
C) 0.0389
D) 0.0398
4 3 4
157 What is the final resultant cluster size in the Divisive algorithm, which is one of the hierarchical clustering approaches?
A) Zero
B) Three
C) singleton
D) Two
4 3 4
2k – 1 candidate association 2k candidate association 2k – 2 candidate
158 Given a frequent itemset L, If |L| = k, then there are 2k -2 candidate association rules 4 3 5
rules rules association rules
159 A student Grade is a variable F1 which takes a value from A, B, C and D. Which of the following is true in this case?
A) Variable F1 is an example of nominal variable
B) Variable F1 is an example of ordinal variable
C) It doesn't belong to any of the mentioned categories
D) It belongs to both ordinal and nominal category
1 2 3
What can be major issue in Faster Runtime Compared to Slower Runtime Compared to
160 Low Variance High Variance 1 2 3
Leave-One-Out-Cross-Validation(LOOCV)? K-Fold Cross Validation normal Validation
161 Imagine a newborn starting to learn to walk. It will try to find a suitable policy to learn walking after repeated falling and getting up. Specify what type of machine learning is best suited.
A) classification
B) regression
C) Kmeans algorithm
D) Reinforcement Learning
1 2 3
162 Perceptron Classifier is
A) Unsupervised learning algorithm
B) Semi-Supervised Learning Algorithm
C) Supervised learning algorithm
D) Soft margin classifier
2 1 2
163 Type of dataset available in Supervised Learning is Unlabeled dataset Labeled Dataset CSV file Excel file 2 2 3
164 Which among the following is the most appropriate kernel that can be used with SVM to separate the classes?
A) Linear kernel
B) Gaussian RBF kernel
C) Polynomial kernel
D) Option 1 and option 3
2 2 3
165 The SVMs are less effective when
A) The data is linearly separable
B) The data is clean and ready to use
C) The data is noisy and contains overlapping points
D) option 1 and option 2
2 2 3
166 Suppose you are using an RBF kernel in SVM with a high Gamma value. What does this signify?
A) The model would consider even far away points from hyperplane for modeling
B) The model would consider only the points close to the hyperplane for modeling
C) The model would not be affected by distance of points from hyperplane for modeling
D) option 1 and option 2
2 2 3
167 What is the precision value for the following confusion matrix of binary classification?
A) 0.91
B) 0.09
C) 0.9
D) 0.95
2 3 4
168 Which of the following are components of generalization error?
A) Bias
B) Variance
C) Both of them
D) None of them
2 1 2
169 Which of the following is not a kernel method in SVM?
A) Linear Kernel
B) Polynomial Kernel
C) RBF Kernel
D) Nonlinear Kernel
2 2 3
During the treatment of cancer patients, the doctor
needs to be very careful about which patients need
170 to be given chemotherapy. Which metric should we Precision Recall call score 2 3 4
use in order to decide the patients who should be given
chemotherapy?
171 Which one of the following is suitable?
1. When the hypothesis space is richer, overfitting is more likely.
2. When the feature space is larger, overfitting is more likely.
A) True, False
B) False, True
C) True, True
D) False, False
2 2 3
172 Which of the following is categorical data?
A) Branch of Bank
B) Expenditure in rupees
C) Price of house
D) Weight of a person
2 2 3
The data is noisy and
The soft margin SVM is more preferred than the The data is not noisy and The data is noisy and linearly
173 The data is linearly separable contains overlapping 2 2 3
hard-margin SVM when- linearly separable separable
points
174 In an SVM which has a quadratic kernel function of polynomial degree 2 and slack variable C as one hyperparameter, what would happen if we use a very large value for C?
A) We can not classify the data at all
B) We can still classify the data correctly for given setting of hyper parameter C
C) We can not classify the data correctly for given setting of hyper parameter C
D) Data can be classified correctly without any impact of C
2 3 4
In SVM, RBF kernel with appropriate parameters to The Decision boundary in the The Decision boundary in The Decision boundary in the
The Decision boundary in the
175 perform binary classification where the data is transformed feature space is the transformed feature original feature space is not 2 2 3
original feature space is linear
non-linearly separable. In this scenario non-linear space is linear considered
176 Which of the following is true about SVM?
1. The kernel function maps low dimensional data to a high dimensional space.
2. It is a similarity function.
A) 1 is True, 2 is False
B) 1 is False, 2 is True
C) 1 is True, 2 is True
D) 1 is False, 2 is False
2 1 2
177 What is the accuracy in percentage based on the following confusion matrix of three class classification?
Confusion Matrix C =
[14 0 0]
[ 1 15 0]
[ 0 0 6]
A) 75%
B) 97%
C) 95%
D) 85%
2 3 4
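For question 177, accuracy is the sum of the diagonal of the confusion matrix divided by the sum of all entries. A minimal Python sketch using the matrix given in the question (the variable names are illustrative):

```python
# Rows are true classes, columns are predicted classes (either orientation
# gives the same accuracy, since only the diagonal and the total are used).
confusion = [
    [14, 0, 0],
    [1, 15, 0],
    [0, 0, 6],
]

correct = sum(confusion[i][i] for i in range(len(confusion)))  # diagonal entries
total = sum(sum(row) for row in confusion)                     # all instances
accuracy = correct / total
print(f"{accuracy:.1%}")  # 97.2%
```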
178 Which of the following methods is used for multiclass classification?
A) One Vs Rest
B) LOOCV
C) All vs One
D) One vs Another
2 1 2
179 What is the precision value for the following confusion matrix of binary classification?
A) 0.91
B) 0.09
C) 0.9
D) 0.95
2 3 4
180 Which of the following is not a kernel method in SVM?
A) Linear Kernel
B) Polynomial Kernel
C) RBF Kernel
D) Nonlinear Kernel
2 1 2
181 Based on a survey, it was found that the probability that a person likes to watch serials is 0.25 and the probability that a person likes to watch Netflix series is 0.43. Also, the probability that a person likes to watch both serials and Netflix series is 0.12. What is the probability that a person doesn't like to watch either?
A) 0.32
B) 0.2
C) 0.44
D) 0.56
2 2 3
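Question 181 follows from inclusion-exclusion: P(neither) = 1 − [P(serials) + P(Netflix) − P(both)]. A brief Python sketch with the stated probabilities (variable names are illustrative):

```python
p_serials = 0.25
p_netflix = 0.43
p_both = 0.12

# Inclusion-exclusion gives the probability of liking at least one of the two.
p_either = p_serials + p_netflix - p_both
p_neither = 1 - p_either
print(round(p_neither, 2))  # 0.44
```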
182 A machine learning problem involves four attributes plus a class. The attributes have 3, 2, 2, and 2 possible values each. The class has 3 possible values. How many maximum possible different examples are there?
A) 12
B) 24
C) 48
D) 72
2 3 4
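The count in question 182 is the product of the number of values each attribute can take, times the number of class values. A short Python sketch (requires Python 3.8+ for `math.prod`; the variable names are illustrative):

```python
from math import prod

attribute_values = [3, 2, 2, 2]  # possible values per attribute
class_values = 3                 # possible class labels

# Each distinct example is one choice per attribute plus one class label.
max_examples = prod(attribute_values) * class_values
print(max_examples)  # 72
```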
they are not consistent
183 MLE estimates are often undesirable because they are biased they have high variance None of the above 2 1 2
estimators
Linear Regression is a _______ machine learning
184 Supervised Unsupervised Semi-Supervised Can't say 3 1 2
algorithm.
In the regression equation Y = 75.65 + 0.50X, the
185 0.5 75.65 1 indeterminable 3 1 2
intercept is
The difference between the actual Y value and the slope residual outlier scatter plot
186 predicted Y value found using a regression equation 3 2 3
is called the
The selling price of a house depends on many
factors. For example, it depends on the number of
bedrooms, number of kitchen, number of
187 bathrooms, the year the house was built, and the Binary Classification Multilabel Classification Simple Linear Regression Multiple Linear Regression 3 3 4
square footage of the lot. Given these factors,
predicting the selling price of the house is an
example of ____________ task.
188 Suppose you find that your linear regression model is
underfitting the data. In such a situation, which of the
following options would you consider?
A) You will add more features
B) You will remove some features
C) All of the above
D) None of the above
3 2 3
Which of the following method(s) do we use
189 Least Square Error Maximum Likelihood Logarithmic Loss Both A and B 3 2 3
to find the best fit line for data in Linear Regression?
We have been given a dataset with n records in
which we have input attribute as x and output
attribute as y. Suppose we use a linear regression
method to model this data. To test our linear
190 regressor, we split the data in training set and test Increase Decrease Remain constant Can’t Say 3 2 3
set randomly. Now we increase the training set
size gradually. As the training set size increases,
what do you expect will happen to the mean
training error?
191 We have been given a dataset with n records in
which we have input attribute as x and output
attribute as y. Suppose we use a linear regression
method to model this data. To test our linear
regressor, we split the data in training set and test
set randomly. What do you expect will happen
with bias and variance as you increase the size of
training data?
A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
3 2 3
If X and Y in a regression model are totally the correlation coefficient would the coefficient of the coefficient of determination
192 the SSE would be 0 3 2 3
unrelated, be -1 determination would be 0 would be 1
Regarding bias and variance, which of the following
statements are true? (Here ‘high’ and ‘low’ are
relative to the ideal model.)
(i) Models which overfit are more likely to have high
bias
193 (ii) Models which overfit are more likely to have low (i) and (ii) (ii) and (iii) (iii) and (iv) None of these 3 2 3
bias
(iii) Models which overfit are more likely to have high
variance
(iv) Models which overfit are more likely to have low
variance
Which of the following evaluation metrics can be
194 used to evaluate a model while modeling a AUC-ROC Accuracy Logloss Mean-Squared-Error 3 3 4
continuous output variable?
195 Suppose that we have N independent variables
(X1, X2, ..., Xn) and the dependent variable is Y. Now
imagine that you are applying linear regression by
fitting the best-fit line using least square error on this
data. You found that the correlation coefficient for one of
its variables (say X1) with Y is 0.95.
A) Relation between the X1 and Y is weak
B) Relation between the X1 and Y is strong
C) Relation between the X1 and Y is neutral
D) Correlation can’t judge the relationship
3 3 4
196 In terms of bias and variance, which of the following
is true when you fit a degree 2 polynomial?
A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
3 3 4
197 Which of the following statements are true for a
design matrix X ∈ R^{n×d} with d > n? (The rows are n
sample points and the columns represent d
features.)
A) Least-squares linear regression computes the weights w = (X^T X)^{-1} X^T y
B) The sample points are linearly separable
C) X has exactly d − n eigenvectors with eigenvalue zero
D) At least one principal component direction is orthogonal to a hyperplane that contains all the sample points
3 3 4
198 Suppose your model is demonstrating high variance
across the different training sets. Which of the
following is NOT a valid way to try and reduce the
variance?
A) Increase the amount of training data in each training set
B) Improve the optimization algorithm being used for error minimization.
C) Decrease the model complexity
D) Reduce the noise in the training data
3 3 3
199 Point out the wrong statement.
A) Regression through the origin yields an equivalent slope if you center the data first
B) Normalizing variables results in the slope being the correlation
C) Least squares is not an estimation tool
D) None of the mentioned
3 3 4
Which of the following are components of
200 Bias Variance Both of them None of them 3 1 2
generalization Error?
201 Problem in multiple regression is?
A) multicollinearity
B) overfitting
C) both multicollinearity & overfitting
D) underfitting
3 1 2
202 How can we best represent ‘support’ for the
following association rule: “If X and Y, then Z”.
A) {X,Y}/(Total number of transactions)
B) {Z}/(Total number of transactions)
C) {Z}/{X,Y}
D) {X,Y,Z}/(Total number of transactions)
3 2 3
203 Choose the correct statement with respect to
‘confidence’ metric in association rules
A) It is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
B) A high value of confidence suggests a weak association rule
C) It is the probability that a randomly selected transaction will include all the items in the consequent as well as all the items in the antecedent.
D) Confidence is not measured in terms of (estimated) conditional probability.
3 2 3
k-means clustering aims to
k-means clustering is a linear k-nearest neighbor is same as
204 Which statement is not true? partition n observations k-means is sensitive to outliers 4 1 2
clustering algorithm. k-means
into k clusters
In which of the following cases will K-Means clustering
give poor results?
1. Data points with outliers
205 1 and 2 2 and 3 2 and 4 1, 2 and 4 4 1 2
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
206 What is Decision Tree?
A) Flow-Chart
B) Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
C) Flow-Chart like Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
D) None of the above
4 1 2
207 8 observations are clustered into 3 clusters using the
K-Means clustering algorithm. After the first iteration,
clusters C1, C2, C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0), (2,5)}
C3: {(5,5), (9,9)}
What will be the cluster centroids if you want to
proceed to the second iteration?
A) C1: (4,4), C2: (2,2), C3: (7,7)
B) C1: (6,6), C2: (4,4), C3: (9,9)
C) C1: (2,2), C2: (0,0), C3: (5,5)
D) C1: (4,4), C2: (3,3), C3: (7,7)
4 2 3
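In K-Means, the centroid used for the next iteration is the coordinate-wise mean of the points currently assigned to each cluster. A small sketch of that update for the three clusters listed in the question (the helper name `centroid` is hypothetical):

```python
# Cluster assignments after the first iteration, as listed in the question.
clusters = {
    "C1": [(2, 2), (4, 4), (6, 6)],
    "C2": [(0, 4), (4, 0), (2, 5)],
    "C3": [(5, 5), (9, 9)],
}

def centroid(points):
    """Return the coordinate-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

centroids = {name: centroid(points) for name, points in clusters.items()}

for name, point in centroids.items():
    print(name, point)
```

With these inputs the means come out as (4, 4) for C1, (2, 3) for C2 and (7, 7) for C3.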
208 Choose the correct statement with respect to
‘confidence’ metric in association rules
A) It is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
B) A high value of confidence suggests a weak association rule
C) It is the probability that a randomly selected transaction will include all the items in the consequent as well as all the items in the antecedent.
D) Confidence is not measured in terms of (estimated) conditional probability.
4 2 3
209 What are the two steps of tree pruning?
A) Pessimistic pruning and Optimistic pruning
B) Postpruning and Prepruning
C) Cost complexity pruning and time complexity pruning
D) None of the options
4 2 3
A database has 5 transactions. Of these, 4
transactions include milk and bread. Further, of the
given 4 transactions, 2 transactions include cheese.
210 0.4 0.6 0.8 0.42 4 2 3
Find the support percentage for the following
association rule “if milk and bread are purchased,
then cheese is also purchased”.
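Support for a rule counts the transactions that contain every item in the rule (antecedent and consequent together) as a fraction of all transactions; confidence divides the same count by the transactions containing the antecedent. A short sketch using the counts from the question, with illustrative variable names:

```python
# Counts taken from the question.
total_transactions = 5
milk_and_bread = 4          # transactions containing the antecedent
milk_bread_and_cheese = 2   # transactions containing antecedent and consequent

# Support: fraction of all transactions that contain every item in the rule.
support = milk_bread_and_cheese / total_transactions

# Confidence (for comparison): fraction of antecedent transactions
# that also contain the consequent.
confidence = milk_bread_and_cheese / milk_and_bread

print(support, confidence)  # 0.4 0.5
```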
211 Which of the following option is true about k-NN
algorithm?
A) It can be used for classification
B) It can be used for regression
C) It can be used in both classification and regression
D) Not useful in ML algorithm
4 1 2
How to select best hyperparameters in tree based Measure performance over Measure performance over Random selection of hyper
212 Both of these 4 1 2
models? training data validation data parameters
What is true about K-Means clustering?
1. K-means is extremely sensitive to cluster center
initializations
213 1 and 3 1 and 2 2 and 3 1, 2 and 3 4 1 2
2. Bad initialization can lead to Poor convergence
speed
3. Bad initialization can lead to bad overall clustering
Classifiers which perform
Classifiers which form a tree
214 What are tree based classifiers? series of condition checking Both options except none Not possible 4 1 2
with each attribute at one level
with one attribute at a time
Gini index operates on the Gini index performs only binary
215 What is gini index? It is a measure of purity All (1,2 and 3) 4 1 2
categorical target variables split
Tree/Rule based classification algorithms
216 if-then. while. do while switch. 4 1 2
generate ... rule to perform the classification.
217 Decision Tree is
A) Flow-Chart
B) Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
C) Both a & b
D) Class of instance
4 1 2
Which of the following is true about Manhattan It can be used for continuous It can be used for categorical It can be used for categorical
218 It can be used for constants 4 2 3
distance? variables variables as well as continuous
219 A company has built a kNN classifier that gets 100%
accuracy on training data. When they deployed this
model on the client side, it was found that the
model is not at all accurate. Which of the following
things might have gone wrong?
Note: The model was successfully deployed and no
technical issues were found at the client side except the
model performance
A) It is probably an overfitted model
B) It is probably an underfitted model
C) Can’t say
D) Wrong Client data
4 2 3
Which of the following classifications would best suit
220 If...then... analysis Market-basket analysis Regression analysis Cluster analysis 4 3 4
the student performance classification systems?
221 Which statement is true about the K-Means
algorithm? Select one:
A) The output attribute must be categorical.
B) All attribute values must be categorical.
C) All attributes must be numeric
D) Attribute values may be either categorical or numeric
4 2 3
Which of the following can act as possible
termination conditions in K-Means?
1. For a fixed number of iterations.
2. Assignment of observations to clusters does not
222 change between iterations. Except for cases with a 1, 3 and 4 1, 2 and 3 1, 2 and 4 1,2,3,4 4 3 4
bad local minimum.
3. Centroids do not change between successive
iterations.
4. Terminate when RSS falls below a threshold.
Which of the following statement is true about k-NN
algorithm?
1) k-NN performs much better if all of the data have
the same scale
223 2) k-NN works well with a small number of input 1 and 2 1 and 3 Only 1 1,2 and 3 4 3 4
variables (p), but struggles when the number of
inputs is very large
3) k-NN makes no assumptions about the functional
form of the problem being solved
In which of the following cases will K-means
clustering fail to give good results?
224 1) Data points with outliers 1 and 2 2 and 3 1, 2, and 3 1 and 3 4 3 4
2) Data points with different densities
3) Data points with nonconvex shapes
225 How will you counter over-fitting in decision tree?
A) By pruning the longer rules
B) By creating new rules
C) Both ‘By pruning the longer rules’ and ‘By creating new rules’
D) Over-fitting is not possible
4 3 4
This clustering algorithm terminates when mean
values computed for the current iteration of the
226 K-Means clustering conceptual clustering expectation maximization agglomerative clustering 4 3 4
algorithm are identical to the computed mean values
for the previous iteration Select one:
227 Which one of the following is the main reason for
pruning a Decision Tree?
A) To save computing time during testing
B) To save space for storing the Decision Tree
C) To make the training set error smaller
D) To avoid overfitting the training set
4 3 4
228 You've just finished training a decision tree for spam
classification, and it is getting abnormally bad
performance on both your training and test sets. You
know that your implementation has no bugs, so what
could be causing the problem?
A) Your decision trees are too shallow.
B) You need to increase the learning rate.
C) You are overfitting.
D) Incorrect data
4 3 4
229 The K-means algorithm:
A) Requires the dimension of the feature space to be no bigger than the number of samples
B) Has the smallest value of the objective function when K = 1
C) Minimizes the within class variance for a given number of clusters
D) Converges to the global optimum if and only if the initial means are chosen as some of the samples themselves
4 3 4
Which of the following metrics, do we have for
finding dissimilarity between two clusters in
hierarchical clustering?
230 1 and 2 1 and 3 2 and 3 1, 2 and 3 4 3 4
1. Single-link
2. Complete-link
3. Average-link
In which of the following cases will K-Means
clustering fail to give good results?
1. Data points with outliers
231 1 and 2 2 and 3 2 and 4 1, 2 and 4 4 2 3
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
Hierarchical clustering is slower than
232 TRUE FALSE Depends on data Cannot say 4 2 3
non-hierarchical clustering?
High entropy means that the partitions in
233 pure not pure useful useless 4 2 3
classification are
Suppose we would like to perform clustering on
spatial data such as the geometrical locations of
234 houses. We wish to produce clusters of many Decision Trees Density-based clustering Model-based clustering K-means clustering 4 3 4
different sizes and shapes. Which of the following
methods is the most appropriate?
The main disadvantage of maximum likelihood
235 mathematically less folded mathematically less complex mathematically less complex computationally intense 4 1 2
methods is that they are _____
236 The maximum likelihood method can be used to
explore relationships among more diverse
sequences, conditions that are not well handled by
maximum parsimony methods.
A) TRUE
B) FALSE
C) -
D) -
4 2 3
k-means clustering aims to
Which statement is not true? k-means clustering is a linear k-nearest neighbor is same
237 partition n observations into k-means is sensitive to outlier 4 1 2
clustering algorithm. as k-means
k clusters
238 Why is feature scaling done before applying the
K-Means algorithm?
A) In distance calculation it will give the same weights for all features
B) You always get the same clusters whether or not you use feature scaling
C) In Manhattan distance it is an important step but in Euclidean it is not
D) None of these
4 1 2
In which of the following cases will K-Means clustering
give poor results?
1. Data points with outliers
239 1 and 2 2 and 3 2 and 4 1, 2 and 4 4 1 2
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
240 What is the naïve assumption in a Naïve Bayes
Classifier.
A) All the classes are independent of each other
B) All the features of a class are independent of each other
C) The most probable feature for a class is the most important feature to be considered for classification
D) All the features of a class are conditionally dependent on each other
5 1 2
Based on a survey, it was found that the probability
that a person likes to watch serials is 0.25 and the
probability that a person likes to watch Netflix series is
241 0.32 0.2 0.44 0.56 5 2 3
0.43. Also, the probability that a person likes to watch
both serials and Netflix series is 0.12. What is the
probability that a person doesn't like to watch either?
242 What is the actual number of independent
parameters which need to be estimated in a P
dimensional Gaussian distribution model?
A) P
B) 2P
C) P(P+1)/2
D) P(P+3)/2
5 1 2
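A P-dimensional Gaussian with a full covariance matrix has P free parameters in the mean vector and P(P+1)/2 in the symmetric covariance matrix. The sketch below simply adds the two counts; the function name is hypothetical and the full-covariance assumption is ours.

```python
def gaussian_parameter_count(p: int) -> int:
    """Count free parameters of a p-dimensional Gaussian with full covariance."""
    mean_params = p                       # one per dimension
    covariance_params = p * (p + 1) // 2  # upper triangle of a symmetric p x p matrix
    return mean_params + covariance_params

for p in (1, 2, 3, 10):
    # The sum simplifies to p * (p + 3) / 2.
    print(p, gaussian_parameter_count(p), p * (p + 3) // 2)
```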
Give the correct Answer for following statements.
1. It is important to perform feature normalization
243 1 is True, 2 is False 1 is False, 2 is True 1 is True, 2 is True 1 is False, 2 is False 5 3 4
before using the Gaussian kernel.
2. The maximum value of the Gaussian kernel is 1.
244 What is the naïve assumption in a Naïve Bayes
Classifier.
A) All the classes are independent of each other
B) All the features of a class are independent of each other
C) The most probable feature for a class is the most important feature to be considered for classification
D) All the features of a class are conditionally dependent on each other
5 1 2
245 What is the actual number of independent
parameters which need to be estimated in a P
dimensional Gaussian distribution model?
A) P
B) 2P
C) P(P+1)/2
D) P(P+3)/2
5 1 2
Which of the following quantities are minimized
246 directly or indirectly during parameter estimation in Negative Log-likelihood Log-likelihood Cross Entropy Residual Sum of Squares 5 2 3
Gaussian distribution Model?
247 In the Naive Bayes equation P(C / X) = (P(X / C)
* P(C)) / P(X), which part represents the "likelihood"?
A) P(X/C)
B) P(C/X)
C) P(C)
D) P(X)
5 1 2
Consider the following dataset. x,y,z are the features
248 and T is a class(1/0). Classify the test data (0,0,1) as 0 1 0.1 0.9 5 3 4
values of x,y,z respectively.
249 Given a rule of the form IF X THEN Y, rule
confidence is defined as the conditional probability
that Select one:
A) Y is false when X is known to be false.
B) Y is true when X is known to be true.
C) X is true when Y is known to be true
D) X is false when Y is known to be false.
5 3 4
250 Which of the following statements about Naive
Bayes is incorrect?
A) Attributes are equally important.
B) Attributes are statistically dependent of one another given the class value.
C) Attributes are statistically independent of one another given the class value.
D) Attributes can be nominal or numeric
5 2 3
251 How can the entries in the full joint probability distribution
be calculated?
A) Using variables
B) Using information
C) Both Using variables & information
D) None of the mentioned
5 2 3
How many terms are required for building a bayes
252 1 2 3 4 5 2 3
model?
253 Skewness of Normal distribution is ___________ Negative Positive 0 Undefined 5 1 2
254 The shape of the Normal Curve is ___________ Bell Shaped flat circular spiked 5 1 2
255 The correlation coefficient for two real-valued
attributes is –0.85. What does this value tell you?
A) The attributes are not linearly related.
B) As the value of one attribute increases the value of the second attribute also increases
C) As the value of one attribute decreases the value of the second attribute increases
D) The attributes show a linear relationship
5 1 2
256 8 observations are clustered into 3 clusters using the
K-Means clustering algorithm. After the first iteration,
clusters C1, C2, C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0), (2,5)}
C3: {(5,5), (9,9)}
What will be the cluster centroids if you want to
proceed to the second iteration?
A) C1: (4,4), C2: (2,2), C3: (7,7)
B) C1: (6,6), C2: (4,4), C3: (9,9)
C) C1: (2,2), C2: (0,0), C3: (5,5)
D) C1: (4,4), C2: (3,3), C3: (7,7)
5 3 4
Which of the following quantities are minimized
257 directly or indirectly during parameter estimation in Negative Log-likelihood Log-likelihood Cross Entropy Residual Sum of Squares 5 2 3
Gaussian distribution Model?
258 In the Naive Bayes equation P(C / X) = (P(X / C)
* P(C)) / P(X), which part represents the "likelihood"?
A) P(X/C)
B) P(C/X)
C) P(C)
D) P(X)
5 1 2
Consider the following dataset. x,y,z are the features
259 and T is a class(1/0). Classify the test data (0,0,1) as 0 1 0.1 0.9 5 3 4
values of x,y,z respectively.
Which of the following option is / are correct
regarding benefits of ensemble model?
260 1. Better performance 1 and 3 2 and 3 1, 2 and 3 1 and 2 5 1 2
2. Generalized models
3. Better interpretability
The network that involves backward links from
261 Self organizing maps Perceptrons Recurrent neural network Multi layered perceptron 6 1 2
output to the input and hidden layers is called
Which of the following parameters can be tuned for
finding good ensemble model in bagging based
algorithms?
262 1. Max number of samples 1 2 3&4 1,2,3&4 6 1 2
2. Max features
3. Bootstrapping of samples
4. Bootstrapping of features
What is back propagation?
a) It is another name given to the curvy function in
the perceptron
b) It is the transmission of error back through the
263 network to adjust the inputs a b c b&c 6 1 2
c) It is the transmission of error back through the
network to allow weights to be adjusted so that the
network can learn
d) None of the mentioned
In an election for the head of college, N candidates
are competing against each other and people are
voting for either of the candidates. Voters don’t
264 Bagging Boosting Stacking Randomization 6 2 3
communicate with each other while casting their
votes. Which of the following ensemble methods
works similarly to the discussed election procedure?
265 What is the sequence of the following tasks in a
perceptron?
1. Initialize weights of perceptron randomly
2. Go to the next batch of dataset
3. If the prediction does not match the output, change
the weights
4. For a sample input, compute an output
A) 1, 4, 3, 2
B) 3, 1, 2, 4
C) 4, 3, 2, 1
D) 1, 2, 3, 4
6 2 3
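Numbering the four listed tasks in the order they appear (initialize, next batch, update on mismatch, compute output), a perceptron training loop runs them as 1, 4, 3, 2. The sketch below is one minimal way to write that loop; the function names, the learning rate, the step activation and the AND-gate data are all assumptions made for illustration.

```python
import random

def train_perceptron(samples, epochs=200, lr=0.1, seed=0):
    """Train a single perceptron with a step activation on (inputs, target) pairs."""
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    # Task 1: initialize the weights (and bias) randomly.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        # Task 2 is the loop itself: after each update we move to the next sample.
        for inputs, target in samples:
            # Task 4: compute an output for the sample input.
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation >= 0 else 0
            # Task 3: if the prediction does not match the target, change the weights.
            error = target - prediction
            if error != 0:
                weights = [w + lr * error * x for w, x in zip(weights, inputs)]
                bias += lr * error
    return weights, bias

def predict(weights, bias, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0

# Illustrative, linearly separable data: the logical AND of two inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])
```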
266 In which neural net architecture does weight sharing
occur?
A) Convolutional neural Network
B) Recurrent Neural Network
C) Fully Connected Neural Network
D) Both A and B
6 2 3
Which of the following are correct statement(s)
about stacking?
1. A machine learning model is trained on
predictions of multiple machine learning models
2. A Logistic regression will definitely work better in
267 1 and 2 2 and 3 1 and 3 1,2 and 3 6 2 3
the second stage as compared to other classification
methods
3. First stage models are trained on full / partial
feature space of training data

268 Given above is a description of a neural network.
When does a neural network model become a deep
learning model?
A) When you add more hidden layers and increase depth of neural network
B) When there is higher dimensionality of data
C) When the problem is an image recognition problem
D) When there is lower dimensionality of data
6 2 3
What are the steps for using a gradient descent
algorithm?
1)Calculate error between the actual value and the
predicted value
2)Reiterate until you find the best weights of network 1, 2, 3, 4, 5
269 4, 3, 1, 5, 2 3, 2, 1, 5, 4 5, 4, 3, 2, 1 6 3 4
3)Pass an input through the network and get values
from output layer
4)Initialize random weight and bias
5)Go to each neurons which contributes to the error
and change its respective values to reduce the error
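The five steps listed in the question run in the order 4, 3, 1, 5, 2 in a typical gradient descent loop. The sketch below shows that order on the simplest possible "network", a single linear neuron `w * x + b` trained with mean squared error; the data, learning rate and function name are assumptions chosen for illustration.

```python
import random

def gradient_descent(data, lr=0.05, steps=2000, seed=0):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    # Step 4: initialize random weight and bias.
    w = rng.uniform(-1.0, 1.0)
    b = rng.uniform(-1.0, 1.0)
    n = len(data)
    # Step 2: reiterate until the weights are good enough (here: a fixed number of steps).
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            # Step 3: pass the input through the model and read the output.
            predicted = w * x + b
            # Step 1: error between the predicted value and the actual value.
            error = predicted - y
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Step 5: adjust each parameter that contributes to the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Illustrative data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = gradient_descent(data)
print(round(w, 4), round(b, 4))
```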
A 4-input neuron has weights 1, 2, 3 and 4. The
transfer function is linear with the constant of
270 238 76 248 348 6 3 4
proportionality being equal to 2. The inputs are 4,
10, 10 and 30 respectively. What will be the output?
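For a linear transfer function, the neuron's output is the constant of proportionality times the weighted sum of its inputs. A short sketch of the arithmetic for the values given in the question (variable names are illustrative):

```python
weights = [1, 2, 3, 4]
inputs = [4, 10, 10, 30]
proportionality = 2  # constant of the linear transfer function

# Weighted sum: 1*4 + 2*10 + 3*10 + 4*30
weighted_sum = sum(w * x for w, x in zip(weights, inputs))

# Linear transfer function: output = constant * weighted sum
output = proportionality * weighted_sum

print(weighted_sum, output)  # 174 348
```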
Which of the following option is / are correct
regarding benefits of ensemble model?
271 1. Better performance 1 and 3 2 and 3 1, 2 and 3 1 and 2 6 1 2
2. Generalized models
3. Better interpretability
The network that involves backward links from
272 Self organizing maps Perceptrons Recurrent neural network Multi layered perceptron 6 1 2
output to the input and hidden layers is called
Which of the following parameters can be tuned for
finding good ensemble model in bagging based
algorithms?
273 1. Max number of samples 1 2 3&4 1,2,3&4 6 1 2
2. Max features
3. Bootstrapping of samples
4. Bootstrapping of features
Increase in size of a convolutional kernel would
274 necessarily increase the performance of a TRUE FALSE 6 1 2
convolutional network.
considers the reduction in considers the reduction in
error when moving from the error when moving from the can only be conceptualized as a
275 The F-test an omnibus test 6 1 2
complete model to the reduced model to the reduction in error
reduced model complete model
What is back propagation?
a) It is another name given to the curvy function in
the perceptron
b) It is the transmission of error back through the
276 network to adjust the inputs a b c b&c 6 1 2
c) It is the transmission of error back through the
network to allow weights to be adjusted so that the
network can learn
d) None of the mentioned
In an election for the head of college, N candidates
are competing against each other and people are
voting for either of the candidates. Voters don’t
277 Bagging Boosting Stacking Randomization 6 1 2
communicate with each other while casting their
votes. Which of the following ensemble methods
works similarly to the discussed election procedure?
278 Which of the following is NOT supervised learning? PCA Decision tree Linear Regression Naive Bayesian 2,3,4,5 1 2
Which of the following algorithms is not an example of
279 Extra Tree Regressor Random Forest Gradient Boosting Decision Tree 6 2 3
an ensemble method?
What is true about an ensembled classifier?
1. Classifiers that are more “sure” can vote with
more conviction
280 2. Classifiers can be more “sure” about a particular 1 and 2 1 and 3 2 and 3 All of the above 6 2 3
part of the space
3. Most of the times, it performs better than a single
classifier
Which of the following option is / are correct
regarding benefits of ensemble model?
281 1. Better performance 1 and 3 2 and 3 1 and 2 1, 2 and 3 6 1 2
2. Generalized models
3. Better interpretability
Which of the following can be true for selecting base
learners for an ensemble?
1. Different learners can come from same algorithm
with different hyper parameters
282 1 2 1 and 3 1, 2 and 3 6 2 3
2. Different learners can come from different
algorithms
3. Different learners can come from different training
spaces
True or False: Ensemble learning can only be
283 TRUE FALSE 6 1 2
applied to supervised learning methods.
True or False: Ensembles will yield bad results when
there is significant diversity among the models.
284 TRUE FALSE 6 1 2
Note: All individual models have meaningful and
good predictions.
Which of the following is / are true about weak
learners used in ensemble model?
1. They have low variance and they don’t usually
overfit
285 1 and 2 1 and 3 2 and 3 None of these 6 3 4
2. They have high bias, so they can not solve hard
learning problems
3. They have high variance and they don’t usually
overfit
True or False: Ensemble of classifiers may or may
286 not be more accurate than any of its individual TRUE False 6 1 2
model.
If you use an ensemble of different base models, is it
287 necessary to tune the hyper parameters of all base Yes No can’t say 6 1 2
models to improve the ensemble performance?
Generally, an ensemble method works better, if the
individual base models have ____________? Less correlation among High correlation among Correlation does not have any
288 None of the above 6 3 4
Note: Suppose each individual base models have predictions predictions impact on ensemble output
accuracy greater than 50%.
In an election, N candidates are competing against
each other and people are voting for either of the
candidates. Voters don’t communicate with each
other while casting their votes.
289 Which of the following ensemble method works Bagging Boosting A Or B None of these 6 3 4
similar to above-discussed election procedure?

Hint: Persons are like base models of ensemble
method.
Suppose there are 25 base classifiers. Each
classifier has error rates of e = 0.35.
Suppose you are using averaging as ensemble
290 technique. What will be the probabilities that 0.05 0.06 0.07 0.09 6 3 4
ensemble of above 25 classifiers will make a wrong
prediction?
Note: All classifiers are independent of each other
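A common way to work this kind of problem is to treat the ensemble as a majority vote over independent classifiers: it is wrong only when more than half of the base classifiers are wrong, which is a binomial tail probability. The sketch below assumes that majority-vote reading (the question says "averaging"); the function name is hypothetical.

```python
from math import comb

def majority_vote_error(n_classifiers: int, error_rate: float) -> float:
    """Probability that more than half of n independent classifiers are wrong."""
    majority = n_classifiers // 2 + 1  # 13 of 25
    return sum(
        comb(n_classifiers, i)
        * error_rate ** i
        * (1 - error_rate) ** (n_classifiers - i)
        for i in range(majority, n_classifiers + 1)
    )

ensemble_error = majority_vote_error(25, 0.35)
print(round(ensemble_error, 2))
```

Under the independence assumption the ensemble error is far lower than the 0.35 error rate of any single classifier.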
In machine learning, an algorithm (or learning
algorithm) is said to be unstable if a small change in
training data causes a large change in the learned
291 TRUE FALSE 6 2 3
classifiers.
True or False: Bagging of unstable classifiers is a
good idea
Which of the following parameters can be tuned for
finding good ensemble model in bagging based
algorithms?
292 1. Max number of samples 1 and 3 2 and 3 1 and 2 All of above 6 2 3
2. Max features
3. Bootstrapping of samples
4. Bootstrapping of features
293 How is the model capacity affected with dropout rate
(where model capacity means the ability of a neural
network to approximate complex functions)?
A) Model capacity increases in increase in dropout rate
B) Model capacity decreases in increase in dropout rate
C) Model capacity is not affected on increase in dropout rate
D) None of these
6 3 4
True or False: Dropout is computationally expensive
294 TRUE FALSE 6 1 2
technique w.r.t. bagging
Suppose, you want to apply a stepwise forward
selection method for choosing the best models for
an ensemble model. Which of the following is the
correct order of the steps?
Note: You have more than 1000 models predictions
1. Add the models predictions (or in another term
295 1-2-3 1-3-4 2-1-3 None of above 6 3 4
take the average) one by one in the ensemble which
improves the metrics in the validation set.
2. Start with empty ensemble
3. Return the ensemble from the nested set of
ensembles that has maximum performance on the
validation set
296 Suppose, you have 2000 different models with their
predictions and want to ensemble predictions of best
x models. Now, which of the following can be a
possible method to select the best x models for an
ensemble?
A) Step wise forward selection
B) Step wise backward elimination
C) Both
D) None of above
6 2 3
Below are the two ensemble models:
1. E1(M1, M2, M3) and
2. E2(M4, M5, M6)
Above, Mx is the individual base models.
Which of the following are more likely to choose if
following conditions for E1 and E2 are given?
297 E1 E2 Any of E1 and E2 None of these 6 3 4
E1: Individual Models accuracies are high but
models are of the same type or in another term less
diverse
E2: Individual Models accuracies are high but they
are of different types in another term high diverse in
nature
True or False: In boosting, individual base learners
298 TRUE FALSE 6 1 2
can be parallel.
Which of the following is true about bagging?
1. Bagging can be parallel
299 1 and 2 2 and 3 1 and 3 All of these 6 2 3
2. The aim of bagging is to reduce bias not variance
3. Bagging helps in reducing overfitting
300 Suppose you are using stacking with n different
machine learning algorithms with k folds on data.
Which of the following is true about one level (m
base models + 1 stacker) stacking?
Note:
Here, we are working on binary classification
problem
All base models are trained on all features
You are using k folds for base models
A) You will have only k features after the first stage
B) You will have only m features after the first stage
C) You will have k+m features after the first stage
D) You will have k*n features after the first stage
6 3 4
301 Which of the following is the difference between
stacking and blending?
A) Stacking has less stable CV compared to Blending
B) In Blending, you create out of fold prediction
C) Stacking is simpler than Blending
D) None of these
6 2 3
Which of the following can be one of the steps in
stacking?
1. Divide the training data into k folds
302 2. Train k models on each k-1 folds and get the out 1 and 2 2 and 3 1 and 3 All of above 6 2 3
of fold predictions for remaining one fold
3. Divide the test data set in “k” folds and get
individual fold predictions by different algorithms
Which of the following are advantages of
stacking?
303 1) More robust model 1 and 2 2 and 3 1 and 3 All of the above 6 2 3
2) better prediction
3) Lower time of execution
Which of the following are correct statement(s)
about stacking?
1. A machine learning model is trained on predictions
of multiple machine learning models
304 2. A Logistic regression will definitely work better in the 1 and 2 2 and 3 1 and 3 All of above 6 2 3
second stage as compared to other classification
methods
3. First stage models are trained on full / partial feature
space of training data
Which of the following is true about weighted
majority votes?
1. We want to give higher weights to better
performing models
305 1 and 3 2 and 3 1 and 2 1, 2 and 3 6 2 3
2. Inferior models can overrule the best model if
collective weighted votes for inferior models is higher
than best model
3. Voting is special case of weighted voting
306 Which of the following is true about averaging
ensemble?
A) It can only be used in classification problem
B) It can only be used in regression problem
C) It can be used in both classification as well as regression
D) None of these
6 1 2
307) How can we assign the weights to the output of different models in an ensemble?
1. Use an algorithm to return the optimal weights
2. Choose the weights using cross validation
3. Give high weights to more accurate models
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) All of above
6 2 3
308) Suppose you are given ‘n’ predictions on test data by ‘n’ different models (M1, M2, …, Mn) respectively. Which of the following method(s) can be used to combine the predictions of these models?
Note: We are working on a regression problem
1. Median
2. Product
3. Average
4. Weighted sum
5. Minimum and Maximum
6. Generalized mean rule
A) 1, 3 and 4
B) 1, 3 and 6
C) 1, 3, 4 and 6
D) All of above
6 3 4
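Four of the combination rules listed above (median, average, weighted sum, generalized mean) can be sketched for a single test row. The prediction values, the weights and the `generalized_mean` helper are assumptions for illustration only.

```python
import statistics


def generalized_mean(values, p):
    """Power mean of positive values; p = 1 gives the arithmetic mean."""
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


preds = [2.0, 4.0, 9.0]      # hypothetical outputs of M1, M2, M3 for one test row
weights = [0.5, 0.3, 0.2]    # hypothetical weights that sum to 1

median = statistics.median(preds)
average = statistics.mean(preds)
weighted_sum = sum(w * p for w, p in zip(weights, preds))
power_mean = generalized_mean(preds, 2)
```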
309) In an election, N candidates are competing against each other and people are voting for one of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the election procedure described above?
Hint: Persons are like base models of an ensemble method.
A) Bagging
B) Boosting
C) A or B
D) None of these
6 3 4
310) Generally, an ensemble method works better if the individual base models have ____________?
Note: Suppose each individual base model has accuracy greater than 50%.
A) Less correlation among predictions
B) High correlation among predictions
C) Correlation does not have any impact on ensemble output
D) None of the above
6 3 4
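A short calculation shows why correlation among base models matters. The sketch assumes an odd number of binary classifiers whose errors are fully independent, each correct with the same probability; these are idealized assumptions, and `majority_accuracy` is a hypothetical helper.

```python
from math import comb


def majority_accuracy(n_models, p):
    """Accuracy of a majority vote over n independent models,
    each correct with probability p (n is assumed odd)."""
    need = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p ** k * (1 - p) ** (n_models - k)
        for k in range(need, n_models + 1)
    )


independent = majority_accuracy(3, 0.7)  # three models with uncorrelated errors
identical = majority_accuracy(1, 0.7)    # perfectly correlated models behave like one model
```

With three independent models at 70% accuracy the vote is right about 78.4% of the time, while three perfectly correlated copies stay at 70%.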
311) If you use an ensemble of different base models, is it necessary to tune the hyperparameters of all base models to improve the ensemble performance?
A) Yes
B) No
C) Can’t say
6 2 3
312) Support Vector Machine is
A) Logical Model
B) Probabilistic Model
C) Geometric Model
D) None of the above
1 2 3
313) If X and Y in a regression model are totally unrelated,
A) the correlation coefficient would be -1
B) the coefficient of determination would be 0
C) the coefficient of determination would be 1
D) the SSE would be 0
3 2 4
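The effect of a zero linear relationship on the coefficient of determination can be checked numerically. The sketch below fits a least-squares line to a small made-up data set whose sample correlation is exactly zero; the helper names and data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one input variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Return (R^2, SSE) for the least-squares line through the data."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst, sse


xs = [1, 2, 3, 4]
ys = [1, -1, -1, 1]   # chosen so the sample covariance with xs is zero
r2, sse = r_squared(xs, ys)
```

The fitted slope is zero, so the line explains none of the variation: R^2 is 0 and the SSE equals the total sum of squares rather than 0.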
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and
artificial intelligence (AI)
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) ML is an alternate way of programming intelligent machines.

THIS IS
MANDATORY
OPTION

((OPTION_B)) ML and AI have very different goals

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) ML is a set of techniques that turns a dataset into software.

This is optional

((OPTION_D)) AI is software that can emulate the human mind

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION)) This is also optional
((MARKS)) 1
((QUESTION)) Which of the following sentences is FALSE regarding regression?
((OPTION_A)) It is used for prediction
((OPTION_B)) It may be used for interpretation
((OPTION_C)) It relates inputs to outputs.
((OPTION_D)) It discovers causal relationships
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Grid search is
((OPTION_A)) Linear in D
((OPTION_B)) Exponential in D
((OPTION_C)) Linear in N
((OPTION_D)) Both B & C
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
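The cost of grid search can be counted directly. The sketch below assumes that D is the number of parameters being searched and N the number of training examples (the question itself does not define the symbols), and uses a toy linear model with squared loss; `grid_search` is a hypothetical helper.

```python
from itertools import product


def grid_search(data, grid_per_dim, n_dims):
    """Exhaustive search over a grid; returns (best_params, per-sample loss evaluations)."""
    best, best_loss, evals = None, float("inf"), 0
    # product(...) enumerates len(grid_per_dim) ** n_dims candidate settings.
    for params in product(grid_per_dim, repeat=n_dims):
        loss = 0.0
        for x, y in data:  # one pass over all N examples per candidate
            pred = sum(w * xi for w, xi in zip(params, x))
            loss += (y - pred) ** 2
            evals += 1
        if loss < best_loss:
            best, best_loss = params, loss
    return best, evals


data = [((1, 0), 2), ((0, 1), -1), ((1, 1), 1)]   # N = 3 made-up examples, D = 2
grid = [-1, 0, 1, 2]
best, evals = grid_search(data, grid, n_dims=2)
```

The evaluation count is `len(grid) ** D * N`, which grows exponentially with D and linearly with N.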
((MARKS)) 1
((QUESTION)) Find the incorrect statement regarding the gradient of a continuous and differentiable function
((OPTION_A)) is zero at a minimum
((OPTION_B)) is non-zero at a maximum
((OPTION_C)) is zero at a saddle point
((OPTION_D)) decreases as you get closer to the minimum
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
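The behaviour of the gradient at stationary points can be checked numerically. The sketch uses central differences on three made-up quadratic functions; the function names and step size are assumptions for illustration.

```python
def numerical_gradient(f, point, h=1e-6):
    """Central-difference estimate of the gradient of f at point."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def bowl(p):      # minimum at the origin
    return p[0] ** 2 + p[1] ** 2


def hill(p):      # maximum at the origin
    return -(p[0] ** 2 + p[1] ** 2)


def saddle(p):    # saddle point at the origin
    return p[0] ** 2 - p[1] ** 2


origin = [0.0, 0.0]
at_min = numerical_gradient(bowl, origin)
at_max = numerical_gradient(hill, origin)
at_saddle = numerical_gradient(saddle, origin)

# For this bowl the gradient magnitude shrinks on the way to the minimum.
far = abs(numerical_gradient(bowl, [2.0, 0.0])[0])
near = abs(numerical_gradient(bowl, [1.0, 0.0])[0])
```

The gradient vanishes at all three stationary points, including the maximum.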
((MARKS)) 1
((QUESTION)) Consider a linear-regression model with N = 3 and D = 1 with input-output
((OPTION_A)) -1.66
((OPTION_B)) 2
((OPTION_C)) 3
((OPTION_D)) 4
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) O(D)
((OPTION_B)) O(N)
((OPTION_C)) O(ND)
((OPTION_D)) O(ND^2)
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_B)) High model bias
((OPTION_C)) High estimation bias
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Adding more basis functions in a linear model... (pick the most probable option)
((OPTION_A)) Decreases model bias
((OPTION_B)) Decreases estimation bias
((OPTION_C)) Decreases variance
((OPTION_D)) Doesn’t affect bias and variance
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The problem of finding hidden structure in unlabeled data is called
((OPTION_A)) Supervised learning
((OPTION_B)) Unsupervised learning
((OPTION_C)) Reinforcement learning
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The task of inferring a model from labeled training data is called
((OPTION_A)) Unsupervised learning
((OPTION_B)) Supervised learning
((OPTION_C)) Reinforcement learning
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) A telecommunication company wants to segment its customers into distinct groups in order to send appropriate subscription offers. This is an example of
((OPTION_A)) Supervised learning
((OPTION_B)) Data extraction
((OPTION_C)) Serration
((OPTION_D)) Unsupervised learning
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Self-organizing maps are an example of
((OPTION_A)) Unsupervised learning
((OPTION_B)) Supervised learning
((OPTION_C)) Reinforcement learning
((OPTION_D)) Missing data imputation
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) You are given data about seismic activity in Japan, and you want to predict the magnitude of the next earthquake. This is an example of
((OPTION_A)) Supervised learning
((OPTION_B)) Unsupervised learning
((OPTION_C)) Serration
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Assume you want to perform supervised learning and to predict
((OPTION_A)) Classification
((OPTION_B)) Regression
((OPTION_C)) Clustering
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Discriminating between spam and ham e-mails is a classification task, true or false?
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In the example of predicting the number of babies based on the storks’ population size, the number of babies is
((OPTION_A)) Outcome
((OPTION_B)) Feature
((OPTION_C)) Attribute
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) It may be better to avoid the ROC curve as a metric because it can suffer from the accuracy paradox.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
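The accuracy paradox named in the question above concerns plain accuracy on imbalanced data. A minimal sketch with made-up labels (5 positives, 95 negatives) and a classifier that always predicts the negative class:

```python
# Made-up imbalanced labels: 1 = positive class, 0 = negative class.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100   # trivial classifier that never predicts the positive class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)
```

Accuracy looks high even though no positive example is ever found, which is why accuracy, rather than the ROC curve, is the metric usually associated with this paradox.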
((MARKS)) 1
((QUESTION)) Which of the following is not involved in data mining?
((OPTION_A)) Knowledge extraction
((OPTION_B)) Data archaeology
((OPTION_C)) Data exploration
((OPTION_D)) Data transformation
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The expected value or _______ of a random variable is the center of its distribution.
((OPTION_A)) Mode
((OPTION_B)) Median
((OPTION_C)) Mean
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Point out the correct statement.
((OPTION_A)) Some cumulative distribution function F is non-decreasing and right-continuous
((OPTION_B)) Every cumulative distribution function F is decreasing and right-continuous
((OPTION_C)) Every cumulative distribution function F is increasing and left-continuous
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following quantities of a random variable is a measure of spread?
((OPTION_A)) Variance
((OPTION_B)) Standard deviation
((OPTION_C)) Empirical mean
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) The square root of the variance is called the ________ deviation
((OPTION_A)) empirical
((OPTION_B)) mean
((OPTION_C)) continuous
((OPTION_D)) standard
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
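The relation between variance and standard deviation can be confirmed with Python's standard `statistics` module. The data values are made up for the example, and the population versions (`pvariance`, `pstdev`) are used.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up sample

variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # population standard deviation

# The standard deviation is the square root of the variance.
same = math.isclose(math.sqrt(variance), std_dev)
```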
((MARKS)) 1
((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
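The relationship runs the other way: for a continuous random variable the PDF is the derivative of the CDF. A quick numerical check on the standard normal distribution, using `statistics.NormalDist` from the standard library (the evaluation point and step size are arbitrary choices):

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)


def cdf_slope(x, h=1e-5):
    """Central-difference estimate of the derivative of the CDF at x."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)


x = 0.5
slope = cdf_slope(x)      # numerical derivative of the CDF
density = dist.pdf(x)     # PDF evaluated at the same point
```

The two values agree to within the error of the finite-difference approximation.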
((MARKS)) 1
((QUESTION)) Cumulative distribution functions are used to specify the distribution of multivariate random variables.
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Regression
((OPTION_B)) Decision Tree
((OPTION_C)) Clustering
((OPTION_D)) Association Rule
((OPTION_E))
((CORRECT_CHOICE)) B

((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_B)) Overfitting
((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))

((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))

((OPTION_C)) Both
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE))

((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In 1984, the computer scientist L. Valiant
((OPTION_B)) Zero-one loss error
((OPTION_C)) Probably approximately correct
((OPTION_D)) None
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same output element
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Therefore, learning a
((OPTION_A)) True
((OPTION_B)) False
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) Curse of dimensionality
((OPTION_B)) Hughes phenomenon
((OPTION_C)) Probably approximately correct
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) In many cases, in order to capture the full expressivity, it's necessary to have a very large dataset; without enough training data, the approximation can become problematic. This is called…
((OPTION_A)) Curse of dimensionality
((OPTION_B)) Hughes phenomenon
((OPTION_C)) Probably approximately correct
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) First term is called as
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Second term is called as
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Third term is called as
((OPTION_A)) posteriori
((OPTION_B)) Apriori
((OPTION_C)) likelihood.
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
((QUESTION)) We can create an object of an abstract class
((OPTION_A))
((OPTION_B))
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following steps or assumptions in regression modeling most affects the trade-off between under-fitting and over-fitting?
((OPTION_A)) The polynomial degree
((OPTION_B)) Whether we learn the weights by matrix inversion or gradient descent
((OPTION_C)) The use of a constant term
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
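As a rough illustration of why the polynomial degree matters, the sketch below fits a degree-1 line and a degree-5 interpolating polynomial to six noisy samples of y = 2x. The data, the fixed noise values, and the helper names (`fit_line`, `fit_interpolant`, `mse`) are assumptions made for this example only; test error is measured against the noise-free line.

```python
# Noisy samples of the line y = 2x (noise values are fixed, illustrative assumptions).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train_y = [2 * x + e for x, e in zip(train_x, noise)]

# Held-out points, scored against the noise-free line.
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [2 * x for x in test_x]


def fit_line(xs, ys):
    """Degree-1 least-squares fit; returns a prediction function."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Degree-(n-1) polynomial through every training point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for m, xm in enumerate(xs):
                if m != i:
                    term *= (x - xm) / (xi - xm)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_x, train_y)
interp = fit_interpolant(train_x, train_y)

for name, model in [("degree 1", line), ("degree 5", interp)]:
    print(name,
          "train MSE:", mse(model, train_x, train_y),
          "test MSE:", mse(model, test_x, test_y))
```

On this data the degree-5 fit reproduces the training points exactly yet does worse than the line on the held-out points, which is the over-fitting side of the trade-off.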
((MARKS)) 1
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 and 4
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) You will always have test error zero
((OPTION_B)) You cannot have test error zero
((OPTION_C)) None of the above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which one of the following statements is true regarding residuals in regression analysis?
((OPTION_A)) Mean of residuals is always zero
((OPTION_B)) Mean of residuals is always less than zero
((OPTION_C)) Mean of residuals is always greater than zero
((OPTION_D)) There is no such rule for residuals.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
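The zero-mean property of residuals can be checked numerically. The sketch below assumes an ordinary least-squares fit that includes an intercept, which is the setting where the property holds; the data values and variable names are illustrative assumptions. A fit forced through the origin is included for contrast.

```python
# Illustrative data (assumed for this example).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.7]
n = len(xs)

# Ordinary least squares with an intercept.
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / n

# Least squares forced through the origin (no intercept), for contrast.
slope_no_intercept = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
residuals_no_intercept = [y - slope_no_intercept * x for x, y in zip(xs, ys)]
mean_residual_no_intercept = sum(residuals_no_intercept) / n

print("with intercept:   ", mean_residual)
print("without intercept:", mean_residual_no_intercept)
```

With the intercept the mean residual is zero up to floating-point rounding; without it the mean is generally nonzero.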
((MARKS)) 1
((QUESTION)) Which of the following is true about heteroskedasticity?
((OPTION_A)) Linear Regression with varying error terms
((OPTION_B)) Linear Regression with constant error terms
((OPTION_C)) Linear Regression with zero error terms
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following indicates a fairly strong relationship between X and Y?
((OPTION_A)) Correlation coefficient = 0.9
((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
((OPTION_C)) The t-statistic for the null hypothesis Beta coefficient = 0 is 30
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_A)) 1, 2 and 3
((OPTION_B)) 1 and 3
((OPTION_C)) All of above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) To test for a linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?
((OPTION_A)) Scatter plot
((OPTION_B)) Bar chart
((OPTION_C)) Histograms
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Generally, which of the following method(s) is used for predicting a continuous dependent variable?
1. Linear Regression
2. Logistic Regression
((OPTION_A)) 1 and 2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) A correlation between age and health of a person is found to be -1.09. On the basis of this, you would tell the doctors that:
((OPTION_A)) The age is good predictor of health
((OPTION_B)) The age is poor predictor of health
((OPTION_C)) None of these
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) C
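For reference, a short derivation of why a reported value of -1.09 cannot be a valid correlation, assuming the Pearson coefficient is meant (the symbols below are introduced here for illustration). The bound follows from the Cauchy-Schwarz inequality applied to the centered samples:

```latex
r_{xy} = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}
              {\sqrt{\sum_i (x_i-\bar{x})^2}\,\sqrt{\sum_i (y_i-\bar{y})^2}},
\qquad
\Bigl|\sum_i (x_i-\bar{x})(y_i-\bar{y})\Bigr|
\le \sqrt{\sum_i (x_i-\bar{x})^2}\,\sqrt{\sum_i (y_i-\bar{y})^2}
\;\Longrightarrow\; -1 \le r_{xy} \le 1 .
```

A value outside this interval points to a calculation error, so neither "good predictor" nor "poor predictor" can be concluded from it.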

((OPTION_A)) Vertical offset
((OPTION_B)) Perpendicular offset
((OPTION_C)) Both but depend on situation
((OPTION_D)) Both A and B
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_A)) Only 1
((OPTION_B)) 1 and 3
((OPTION_C)) 1 and 4
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_A))
((OPTION_B)) 1 is False and 2 is True
((OPTION_C)) 1 is True and 2 is False
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C

((OPTION_A)) It is more likely for X1 to be excluded from the model
((OPTION_B)) It is more likely for X1 to be included in the model
((OPTION_C)) Can’t say
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?
((OPTION_A)) Ridge regression uses subset selection of features
((OPTION_B)) Lasso regression uses subset selection of features
((OPTION_C)) Both use subset selection of features
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
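The difference between the two penalties can be sketched for the special case of an orthonormal design (XᵀX = I), where both estimators have a closed form per coefficient. The assumptions here are the objective scalings ½‖y − Xβ‖² + λ‖β‖₁ for lasso and ½‖y − Xβ‖² + (λ/2)‖β‖² for ridge; the function names and the example coefficients are hypothetical.

```python
def ridge_shrink(b, lam):
    """Ridge estimate for one coefficient: proportional shrinkage toward zero."""
    return b / (1.0 + lam)


def lasso_shrink(b, lam):
    """Lasso estimate for one coefficient: soft-thresholding at lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical ordinary least-squares coefficients and penalty strength.
ols = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

ridge = [ridge_shrink(b, lam) for b in ols]
lasso = [lasso_shrink(b, lam) for b in ols]

print("ridge:", ridge)  # every coefficient shrinks but stays nonzero
print("lasso:", lasso)  # small coefficients are set exactly to zero
```

Under these assumptions lasso drops the two small coefficients entirely, which is the sense in which it performs feature selection, while ridge only shrinks them.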
((MARKS)) 1
((QUESTION)) Which of the following statement(s) can be true post adding a
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 4
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 2 and 4
((OPTION_B)) 1 and 2
((OPTION_C)) 2, 3 and 4
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_D)) 1, 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D

((OPTION_A)) 1 and 2
((OPTION_B)) 1 and 3
((OPTION_C)) 2 and 3
((OPTION_D)) 1, 2 and 3
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((QUESTION)) How many coefficients do you need to estimate in a simple linear regression model (one independent variable)?
((OPTION_A)) 1
((OPTION_B)) 2
((OPTION_C)) Can’t say
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B

((QUESTION)) Which of the following statements is true about the sum of residuals of A and B?
((OPTION_A)) A has higher than B
((OPTION_B)) A has lower than B
((OPTION_C)) Both have same
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?
((OPTION_A)) Yes
((OPTION_B)) No
((OPTION_C)) Both A and B
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Correlated variables can have zero correlation coefficient. True or False?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
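A small numerical check of this point: two variables can be exactly related and still have a Pearson correlation coefficient of zero when the relationship is not linear. The sketch uses y = x² on values of x placed symmetrically around zero; the data and the helper name `pearson_r` are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


# y is fully determined by x, but the dependence is quadratic, not linear.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print("Pearson r:", r)
```

The coefficient only measures the linear part of a relationship, so it vanishes here even though y depends entirely on x.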

((OPTION_A)) Only 2
((OPTION_B)) Only 1
((OPTION_C)) Only 3
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_A)) 3.02
((OPTION_B)) 0.75
((OPTION_C)) 1.01
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Suppose the distribution of salaries in a company X has a median of $35,000, and the 25th and 75th percentiles are $21,000 and $53,000 respectively. Would a person with a salary of $1 be considered an outlier?
((OPTION_A)) Yes
((OPTION_B)) No
((OPTION_C)) More information is required
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
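The question does not say which outlier definition applies, which is consistent with the keyed answer. As a sketch under one common convention only (Tukey's fences at 1.5 times the interquartile range, an assumption not stated in the question), the quartiles given above lead to the following bounds; the variable names are illustrative.

```python
# Quartiles taken from the question text.
q1 = 21_000.0
q3 = 53_000.0
salary = 1.0

# Tukey's rule (assumed convention): flag values beyond 1.5 * IQR from the quartiles.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

is_outlier = salary < lower_fence or salary > upper_fence

print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("flagged under the 1.5 * IQR rule:", is_outlier)
```

Under this particular rule the lower fence is negative, so a salary of $1 would not be flagged; a different definition (for example one based on standard deviations, which the question does not provide) could give a different verdict.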
((MARKS)) 1
((QUESTION)) Which of the following options is true regarding “Regression” and “Correlation”?
Note: y is the dependent variable and x is the independent variable.
((OPTION_A)) The relationship is symmetric between x and y in both.
((OPTION_B)) The relationship is not symmetric between x and y in both.
((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.
((OPTION_D)) The relationship is symmetric between x and y in case of correlation but in case of regression it is not symmetric.
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
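The symmetry in question can be checked directly: swapping x and y leaves the Pearson correlation unchanged, while the least-squares slope of y on x differs from the slope of x on y. The data and helper names below are assumptions for illustration.

```python
import math


def centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy), the centered sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def correlation(xs, ys):
    sxx, syy, sxy = centered_sums(xs, ys)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


def slope(xs, ys):
    """Least-squares slope when regressing ys on xs."""
    sxx, _, sxy = centered_sums(xs, ys)
    return sxy / sxx


# Illustrative data (assumed for this example).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.7]

print("corr(x, y):", correlation(x, y), "corr(y, x):", correlation(y, x))
print("slope of y on x:", slope(x, y), "slope of x on y:", slope(y, x))
```

Correlation treats the two variables interchangeably, whereas regression depends on which variable is chosen as the response.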
((MARKS)) 1
((QUESTION)) True-False: Is Logistic regression a supervised machine learning algorithm?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) True-False: Is Logistic regression mainly used for Regression?
((OPTION_A)) TRUE
((OPTION_B)) FALSE
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Is it possible to design a logistic regression algorithm

using a Neural Network Algorithm?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Is it possible to apply a logistic regression algorithm on a

3-class Classification problem?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following methods do we use to best fit the data in
Logistic Regression?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Least Square Error

THIS IS
MANDATORY
OPTION

((OPTION_B)) Maximum Likelihood

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Jaccard distance

This is optional

((OPTION_D)) Both A and B

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
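
To illustrate what fitting by maximum likelihood means here, the sketch below fits a one-feature logistic regression by gradient ascent on the average log-likelihood. It is a minimal example under stated assumptions: the data, the learning rate, the step count, and the function names `log_likelihood` and `fit` are all hypothetical, and real libraries typically use more robust optimisers.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Bernoulli log-likelihood of labels ys under the model p = sigmoid(w*x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit(xs, ys, lr=0.1, steps=5000):
    """Gradient ascent on the mean log-likelihood (assumed settings)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w += lr * grad_w
        b += lr * grad_b
    return w, b

# Hypothetical, non-separable data so that the maximum is finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit(xs, ys)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(w, b, xs, ys))
```

The fitted parameters give a higher log-likelihood than the starting point, which is the quantity being maximised.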

((OPTION_A)) We prefer a model with minimum AIC value

THIS IS
MANDATORY
OPTION

((OPTION_B)) We prefer a model with maximum AIC value

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both but depend on the situation

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
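
As a small illustration of preferring the minimum AIC, the sketch below uses the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximised log-likelihood. The two candidate models and their numbers are hypothetical values chosen only to show the comparison.

```python
def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical candidates: (maximised log-likelihood, number of parameters).
candidates = {
    "model_small": (-120.0, 3),
    "model_large": (-118.5, 8),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # the lower AIC is preferred

print(scores)
print(best)
```

Here the larger model fits slightly better but pays a penalty for its extra parameters, so the smaller model has the lower AIC.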

((QUESTION)) True-False: Standardisation of features is required before training a
Logistic Regression model.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following algorithms do we use for Variable Selection?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) LASSO

THIS IS
MANDATORY
OPTION

((OPTION_B)) Ridge

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) All of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
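
Why LASSO can select variables while Ridge cannot is easiest to see in the textbook special case of an orthonormal design, where both estimators have closed forms in terms of the ordinary least-squares coefficient. The sketch below assumes that special case; the function names, coefficients, and penalty value are hypothetical, and the exact constants depend on how the penalty is scaled.

```python
def lasso_shrink(beta_ols, lam):
    """Soft-thresholding: coefficients with |beta| <= lam become exactly zero."""
    magnitude = abs(beta_ols) - lam
    if magnitude <= 0.0:
        return 0.0
    return magnitude if beta_ols > 0 else -magnitude

def ridge_shrink(beta_ols, lam):
    """Proportional shrinkage: coefficients get smaller but stay non-zero."""
    return beta_ols / (1.0 + lam)

# Hypothetical least-squares coefficients and penalty strength.
ols = [2.5, -0.3, 0.1, -1.2]
lam = 0.5

lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print(lasso)  # small coefficients are dropped (set to zero)
print(ridge)  # every coefficient is kept, only shrunk
```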

((QUESTION)) Suppose you have been given a fair coin and you want to find out the
odds of getting heads. Which of the following options is true for such a
case?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) odds will be 0

THIS IS
MANDATORY
OPTION

((OPTION_B)) odds will be 0.5

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) odds will be 1

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH C
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
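
The answer follows from the definition of odds as the probability of the event divided by the probability of its complement. A minimal sketch, with `odds` as an illustrative helper name:

```python
def odds(p):
    """Odds of an event with probability p, defined as p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

fair_coin_heads = odds(0.5)
print(fair_coin_heads)  # 0.5 / 0.5
```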

((QUESTION)) The logit function (given as l(x)) is the log of odds function. What
could be the range of the logit function in the domain x = [0,1]?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) (-∞, ∞)

THIS IS
MANDATORY
OPTION

((OPTION_B)) (0,1)
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) (0, ∞)
This is optional

((OPTION_D)) (-∞, 0)
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
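
The range can be explored numerically. The sketch below defines the logit as the natural log of the odds and evaluates it near both ends of the interval; the function name `logit` and the probe values are illustrative. The function is only finite strictly inside (0, 1) and grows without bound in magnitude towards either endpoint.

```python
import math

def logit(p):
    """Log of the odds, log(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is finite only for 0 < p < 1")
    return math.log(p / (1.0 - p))

for p in (0.001, 0.25, 0.5, 0.75, 0.999):
    print(p, logit(p))  # negative below 0.5, zero at 0.5, positive above
```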

((QUESTION)) Which of the following options is true?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Linear Regression error values have to be normally distributed but in case
of Logistic Regression it is not the case
THIS IS
MANDATORY
OPTION

((OPTION_B)) Logistic Regression error values have to be normally distributed but in case
of Linear Regression it is not the case
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be
normally distributed
This is optional

((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to
be normally distributed
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((OPTION_A)) Logistic(x) = Logit(x)

THIS IS
MANDATORY
OPTION

((OPTION_B)) Logistic(x) = Logit_inv(x)

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Logistic(x) = Logit(x)

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
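
The relation Logistic(x) = Logit_inv(x) can be verified with a round trip: applying the logit to the output of the logistic function should return the original input. A minimal sketch, with illustrative function names and test points:

```python
import math

def logit(p):
    """Log of the odds for a probability 0 < p < 1."""
    return math.log(p / (1.0 - p))

def logistic(x):
    """Logistic (sigmoid) function for any real x."""
    return 1.0 / (1.0 + math.exp(-x))

# Round trip: logit undoes logistic, so logistic is the inverse of logit.
for x in (-5.0, -1.0, 0.0, 0.5, 2.0, 5.0):
    print(x, logit(logistic(x)))
```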

((OPTION_A)) Training accuracy increases

THIS IS
MANDATORY
OPTION

((OPTION_B)) Training accuracy increases or remains the same

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Testing accuracy decreases

This is optional

((OPTION_D)) Testing accuracy increases or remains the same

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A&D
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Choose which of the following options is true regarding the One-Vs-All
method in Logistic Regression.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) We need to fit n models in an n-class classification problem

THIS IS
MANDATORY
OPTION

((OPTION_B)) We need to fit n-1 models to classify into n classes

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) We need to fit only 1 model to classify into n classes

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
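
The count of n models comes from the relabelling step of One-Vs-All: each class gets its own binary problem, "this class" versus "everything else". The sketch below shows only that relabelling step, not the model fitting; the helper name `one_vs_all_targets` and the labels are hypothetical.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = the rest)."""
    classes = sorted(set(labels))
    return {c: [1 if lab == c else 0 for lab in labels] for c in classes}

# Hypothetical 3-class labels.
labels = ["cat", "dog", "bird", "dog", "cat"]
targets = one_vs_all_targets(labels)

# One binary logistic regression would be trained per entry of `targets`.
for cls, binary in targets.items():
    print(cls, binary)
```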

((OPTION_B)) Decrease the learning rate and increase the number of iterations
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Increase the learning rate and increase the number of iterations
This is optional

((OPTION_D)) Increase the learning rate and decrease the number of iterations
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH D
OICE)) Either A
or B or C or D or
E

((OPTION_A)) A
THIS IS
MANDATORY
OPTION

((OPTION_B)) B
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) BOTH
This is optional

((OPTION_D)) NONE OF THESE

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Logistic regression is used when you want to:

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Predict a dichotomous variable from continuous or dichotomous variables.

THIS IS
MANDATORY
OPTION

((OPTION_B)) Predict a continuous variable from dichotomous variables.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Predict any categorical variable from several other categorical
variables.
This is optional

((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH A
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The odds ratio is
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The probability of an event occurring.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.

This is optional

((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH C
OICE)) Either A
or B or C or D or
E
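
In a logistic regression the log-odds are linear in the predictor, so the odds ratio for a one-unit increase equals the exponential of the slope. The sketch below checks this numerically; the intercept and slope are hypothetical values, and `odds_at` is an illustrative helper name.

```python
import math

def odds_at(x, intercept, slope):
    """Odds implied by a logistic model with log-odds = intercept + slope * x."""
    return math.exp(intercept + slope * x)

# Hypothetical fitted coefficients.
intercept, slope = -1.0, 0.7

# Odds after a unit change in the predictor divided by the original odds.
ratio = odds_at(3.0, intercept, slope) / odds_at(2.0, intercept, slope)

print(ratio, math.exp(slope))  # the two values agree
```

A ratio above 1 means the odds of the outcome rise as the predictor increases; a ratio below 1 means they fall.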

((OPTION_A)) That there are a greater number of explained vs. unexplained observations.

THIS IS
MANDATORY
OPTION

((OPTION_B)) That the statistical model fits the data well.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.

This is optional

((OPTION_D)) That the statistical model is a poor fit of the data.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Logistic regression assumes a:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.

THIS IS
MANDATORY
OPTION

((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome
variable.
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Linear relationship between continuous predictor variables.

This is optional

((OPTION_D)) Linear relationship between observations.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) In binary logistic regression:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) The dependent variable is continuous.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The dependent variable is divided into two equal subcategories.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The dependent variable consists of two categories.

This is optional

((OPTION_D)) There is no dependent variable.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH C
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The correlation coefficient is used to determine

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) A specific value of the y-variable given a specific value of the
x-variable
THIS IS
MANDATORY
OPTION

((OPTION_B)) A specific value of the x-variable given a specific value of the
y-variable
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The strength of the relationship between the x and y variables

This is optional

((OPTION_D)) None of these
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH C
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH B
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

11 Machine Data mining

17 During the last few years, many ______ algorithms

19 if there is only a discrete number of possible

24 scikit-learn also provides functions for creating make_classifica make_regressio

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

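Ans : Solution C

Both approaches learn from input attributes; only supervised learning also needs an output attribute.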

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

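Ans : Solution B

Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k is the
number of independent variables.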

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

R² = SSR / SST, the share of the total variation that is explained by the regression.

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C

R² = SSR / SST = (SST − SSE) / SST = (200 − 50) / 200 = 0.75.

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B
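
As a rough illustration of stratification, the sketch below splits a labelled data set so that each class keeps roughly the same share in the training and test sets. The function name `stratified_split`, the 25% test fraction and the sample labels are assumptions made for this example only.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with similar class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    # Split every class separately so none is left out of either set.
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

labels = ["spam"] * 8 + ["ham"] * 4   # hypothetical labels
train_idx, test_idx = stratified_split(labels)
```

With these hypothetical labels, both sets keep the original 2:1 ratio of the two classes.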

Ans : Solution A

31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
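
The error measures named in these questions differ only in how the individual differences are combined. A minimal sketch, assuming plain Python lists of actual and predicted values (the function names and numbers are illustrative, not part of the quiz):

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of the absolute (positive) differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))

actual = [3.0, 5.0, 7.0, 9.0]       # hypothetical desired outputs
predicted = [2.0, 5.0, 9.0, 9.0]    # hypothetical model outputs
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```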

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution D

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

Ans : Solution C

Ans : Solution B

UNIT II

2. What is pca.components_ in Sklearn?

A. Set of all eigenvectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.
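
The idea of projecting onto leading principal components can be sketched without any library for the smallest possible case, reducing 2-D points to 1-D. This is only an illustration under that assumption; the helper names and the sample points are made up for this example, and for real data one would normally rely on a library implementation.

```python
import math

def first_principal_axis(points):
    """Return the mean and the unit direction of maximum variance (2-D only)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(theta), math.sin(theta))

def project(points):
    """Coordinates of the centred points along the first principal axis."""
    (mean_x, mean_y), (ux, uy) = first_principal_axis(points)
    return [(p[0] - mean_x) * ux + (p[1] - mean_y) * uy for p in points]

points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]  # hypothetical data on y = x
_, axis = first_principal_axis(points)
scores = project(points)
```

For points lying on the line y = x, the axis comes out along the diagonal and the 1-D scores keep all of the spread in the data.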

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute penalty; as the penalty is increased, some of the variable
coefficients may become zero.
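
Why an absolute penalty can zero out coefficients is easiest to see in the special case of orthonormal features, where (up to how the penalty is scaled) the lasso solution is a soft-thresholded version of the least-squares coefficients, while ridge only rescales them. The sketch below assumes that special case; the function names and coefficient values are hypothetical.

```python
def lasso_shrink(coef, penalty):
    """Soft-thresholding: shrink toward zero and clip small values to exactly zero."""
    if coef > penalty:
        return coef - penalty
    if coef < -penalty:
        return coef + penalty
    return 0.0

def ridge_shrink(coef, penalty):
    """Ridge rescales a coefficient but never makes a nonzero one exactly zero."""
    return coef / (1.0 + penalty)

ols_coefs = [3.0, -0.4, 0.1]   # hypothetical least-squares coefficients
lasso_coefs = [lasso_shrink(c, 0.5) for c in ols_coefs]
ridge_coefs = [ridge_shrink(c, 0.5) for c in ols_coefs]
```

Here the two small coefficients are removed by the lasso-style shrinkage, which is the variable-selection behaviour the question refers to.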

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depends on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but for Logistic
Regression that is not the case
B) Logistic Regression error values have to be normally distributed, but for Linear
Regression that is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.

14. True-False: Is it possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depends on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
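
For a model with a single input, Y = A0 + A1X, the Normal Equation reduces to two closed-form expressions, so no iterative optimisation is needed. The following is a minimal sketch under that one-feature assumption; `fit_line` and the data values are illustrative only.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = a0 + a1 * x (one-feature normal equation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # hypothetical data lying exactly on y = 1 + 2x
a0, a1 = fit_line(xs, ys)
# Root mean square training error of the fitted line.
rmse = (sum((a0 + a1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5
```

Because the hypothetical points lie exactly on a line, the fitted coefficients recover it and the training error is zero.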

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance, which of the following is true when you fit a degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
A line can be fit perfectly to the given data, so the root mean square training error is zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.

Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don’t be confused by the name regression.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification problem?

A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the One-vs-All
method for 3-class classification in logistic regression.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute penalty; as the penalty is increased, some of the variable
coefficients may become zero.

Context: 48-49

Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w), viewed as a function of x, is the curve that we can get by changing the
parameters w.

48 What would be the range of p in such case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For values of x in the range of real numbers from −∞ to +∞, the logistic function gives an output
in the interval (0,1).
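
The range stated above can be checked numerically. The sketch below evaluates the model P(y = 1 | x, w) = g(w0 + w1x) for a few inputs; the parameter values `w0 = 0.5` and `w1 = 2.0` and the chosen x values are arbitrary assumptions for illustration.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, w0, w1):
    """P(y = 1 | x, w) for a one-feature logistic regression model."""
    return logistic(w0 + w1 * x)

# Hypothetical parameters; probabilities rise with x but stay inside (0, 1).
probs = [predict_proba(x, w0=0.5, w1=2.0) for x in (-10.0, -1.0, 0.0, 1.0, 10.0)]
```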

49 In the above question, which function would make p lie between (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.
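
The same arithmetic as a short sketch; the function name `odds` and the biased-coin probability are assumptions added for comparison.

```python
def odds(p):
    """Odds = P(success) / P(failure)."""
    return p / (1.0 - p)

fair_coin_odds = odds(0.5)      # 0.5 / 0.5
biased_coin_odds = odds(0.75)   # 0.75 / 0.25, a hypothetical biased coin
```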

51. The logit function (given as l(x)) is the log of the odds function. What could be the range of the logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but for Logistic Regression that is
not the case
B) Logistic Regression error values have to be normally distributed, but for Linear Regression that is
not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution:A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is the logistic function of any number “x”

Logit(x): is the logit function of any number “x”

Logit_inv(x): is the inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
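
That the logistic function is the inverse of the logit can be verified with a round trip: applying logit to the output of the logistic function should return the original number. A minimal sketch, with test values chosen arbitrarily:

```python
import math

def logit(p):
    """Log of the odds, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def logistic(x):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

values = [-4.0, -1.0, 0.0, 2.5]   # arbitrary real numbers
round_trip = [logit(logistic(x)) for x in values]
```

The logit is negative for probabilities below 0.5, zero at 0.5 and positive above it, which is why its range covers the whole real line.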

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider that the remaining parameters are the same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fit, where the probability of each
category is predicted over the rest of the categories combined.
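
The One-Vs-All setup described above can be sketched as a relabelling step: for each class, build a binary target that is 1 for that class and 0 for everything else, then fit one binary model per target. Only the relabelling is shown here; `one_vs_all_targets` and the labels are hypothetical.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = the rest)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

labels = ["cat", "dog", "bird", "dog", "cat"]   # hypothetical 3-class labels
targets = one_vs_all_targets(labels)
# One binary logistic regression would be trained on each of these targets.
n_models = len(targets)
```

At prediction time, the class whose model reports the highest probability would typically be chosen.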

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about the β0 and β1 values of the two logistic models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 = 0, β1 = 1 is the X1 curve (black) and β0 = 0, β1 = −1 is the X4 curve (green).

Context 58-60

A) A
B) B
C) C
D)None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, it is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the largest compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has the minimum
training error (zero).

3. The second model is more robust than the first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty and therefore a less complex decision boundary, as shown in the
first figure, A.

61. What would you do if you want to train logistic regression on the same data in less time while
getting comparatively similar accuracy (it may not be the same)?

Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic Regression will take a very long time to train.

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Which of the following image is showing the cost function for y =1.

Following is the loss function in logistic regression(Y-axis loss function and x axis log probability) for
two class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the true answer as loss function decreases as the log probability increases

63. Suppose, Following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms a linear decision surface, but the examples in the figure are not linearly
separable.

UNIT IV

1. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
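
One common way to write this tradeoff is the soft-margin objective 0.5·||w||² + C·Σ max(0, 1 − y·(w·x + b)), where the first term favours a simple, wide-margin boundary and the second counts margin violations. The sketch below evaluates that objective for a fixed boundary; the points, labels and weights are hypothetical, and no training is performed.

```python
def svm_objective(w, b, points, labels, C):
    """Soft-margin SVM objective: margin term plus C times the total hinge loss."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge += max(0.0, 1.0 - y * score)
    return margin_term + C * hinge

points = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
labels = [1, -1, -1]            # the last point sits on the wrong side
w, b = (1.0, 0.0), 0.0          # a fixed, hypothetical decision boundary
low_cost = svm_objective(w, b, points, labels, C=0.1)
high_cost = svm_objective(w, b, points, labels, C=100.0)
```

With a low C the single misclassified point barely changes the objective, while with a high C it dominates, pushing an optimiser toward a boundary that classifies it correctly.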

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVMs are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important
B) Assumes that all the features in a dataset are independent
C) Both A and B
D) None of the above options

Ans Solution: C

5 What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6 The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7 What is/are true about kernel in SVM?

1. A kernel function maps low dimensional data to a high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.
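
Both statements can be illustrated with the RBF (Gaussian) kernel, used here only as an example: it returns a similarity score between two points, and that score corresponds to an inner product in a higher-dimensional feature space that is never computed explicitly. The points and the `gamma` value are assumptions.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel: 1.0 for identical points, approaching 0 as they move apart."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

a, b, c = (0.0, 0.0), (0.1, 0.0), (3.0, 4.0)   # hypothetical points
same = rbf_kernel(a, a, gamma=1.0)
near = rbf_kernel(a, b, gamma=1.0)
far = rbf_kernel(a, c, gamma=1.0)
```

A larger `gamma` makes the similarity fall off faster with distance, so each training point influences a smaller neighbourhood.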

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVMs?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVM’s.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use a very large value of C (C -> infinity)?

Note: For small C, the model was also classifying all data points correctly.

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check whether we’re overfitting our
data.

22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVMs are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training the SVM, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing C parameter would be the right thing to do here, as it will ensure regularized model

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that a new feature will dominate the others

2. Sometimes, feature normalization is not feasible in the case of categorical variables
3. Feature normalization always helps when we use the Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data. For that you are using the One-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29 Suppose your problem has changed now. Now, the data has only 2 classes. How many times do you think
we need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?

A) Increasing the complexity will over fit the data

B) Increasing the complexity will under fit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. A kernel function maps low dimensional data to a high dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithm is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5 Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:

1. Capping and flooring of variables
2. Removal of outliers

Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.
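
The two ingredients of bagging, independent bootstrap samples and aggregation of the learners' outputs, can be sketched as follows. The learners themselves are omitted; the helper names, the seed and the example votes are assumptions for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement; each learner gets its own sample."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Aggregate class predictions from several learners by majority."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)
data = list(range(10))                       # hypothetical row indices
samples = [bootstrap_sample(data, rng) for _ in range(3)]
vote = majority_vote(["yes", "no", "yes"])   # hypothetical outputs of three trees
```

Because each sample is drawn separately, the trees can be trained in any order or in parallel, unlike boosting where each learner depends on the previous ones.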

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of tree should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these tree. Which of the following is true about individual(Tk) tree in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which uses a fraction of the samples and a fraction of
the features for building the individual trees.

13. Which of the following algorithms do not use learning rate as one of their hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree does not aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. The number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: Bagging is suitable for high-variance, low-bias models.

A) TRUE
B) FALSE

Solution: A

Bagging is suitable for high-variance, low-bias models, that is, for complex models.

17. To apply bagging to regression trees which of the following is/are true in such case?

1. We build N regression trees from N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
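
The three statements above can be illustrated with a minimal sketch. It assumes a deliberately tiny base learner (a depth-1 regression tree, or "stump") and a single numeric input; the names fit_stump and bagged_stumps are made up for this illustration and do not come from any library.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold and one mean per side."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample, and average them."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(tree(x) for tree in trees)  # aggregate by averaging


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, 1, 5, 5, 5, 5]
model = bagged_stumps(xs, ys)
print(model(1), model(8))
```

Each stump sees a different bootstrap sample, so the individual predictions vary; averaging them is what reduces the variance.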

18. How do you select the best hyperparameters in tree-based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always select hyperparameters on validation results, since they indicate how the model will do on the test data.

19. In which of the following scenarios is gain ratio preferred over Information Gain?

A) When a categorical variable has a very large number of categories

B) When a categorical variable has a very small number of categories
C) The number of categories is not the reason
D) None of these

Solution: A

For high-cardinality variables, gain ratio is preferred over the Information Gain technique.
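
A small sketch of why this is so, under the usual definitions (information gain as entropy reduction, gain ratio as information gain divided by the entropy of the split itself). The function names and the toy data are illustrative assumptions, not part of the quiz.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def info_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalised by the split information of the feature."""
    split_info = entropy(feature)
    return info_gain(feature, labels) / split_info if split_info > 0 else 0.0


labels = [1, 1, 0, 0, 1, 0, 1, 0]
id_like = list(range(8))                              # a unique value per row
binary = ["a", "a", "b", "b", "a", "b", "a", "b"]     # two categories

print(info_gain(id_like, labels), info_gain(binary, labels))    # both 1.0
print(gain_ratio(id_like, labels), gain_ratio(binary, labels))  # about 0.333 vs 1.0
```

Both features reach the same information gain, but the ID-like feature does so only by splitting the data into eight singletons; gain ratio penalises that, which is why it is preferred when a variable has many categories.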

20. Suppose you are given the following training and validation errors for Gradient
Boosting. Which of the following hyperparameter settings would you choose in such a case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select scenario 2 because the lower depth
is the better hyperparameter setting.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be within a distance threshold of a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n^3)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

• DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

• DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following functions is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of a number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution A

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
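
The arithmetic for this question can be checked with a short sketch, using the standard identities SST = SSR + SSE and R^2 = SSR / SST. The adjusted version is included for comparison; the sample size n and predictor count k passed to it below are made-up values, since the question does not give them.

```python
def r_squared(sst, sse):
    """Multiple coefficient of determination from the sums of squares."""
    ssr = sst - sse  # explained (regression) sum of squares
    return ssr / sst


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


r2 = r_squared(sst=200, sse=50)
print(r2)                                   # 0.75
print(adjusted_r_squared(r2, n=21, k=4))    # 0.6875 (n and k are hypothetical)
```

With SST = 200 and SSE = 50, SSR is 150 and R^2 is 150 / 200 = 0.75. The adjusted value is lower because it penalises the number of independent variables in the model.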

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B

Ans : Solution A
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
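
The error measures named in these questions differ only in how each residual is treated before averaging. A minimal sketch with made-up numbers (the function names are illustrative, not a library API):

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average absolute (positive) difference between actual and predicted."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the units of the output."""
    return sqrt(mean_squared_error(actual, predicted))


actual = [3, 5, 2, 7]
predicted = [2, 5, 4, 4]   # residuals: 1, 0, -2, 3

print(mean_absolute_error(actual, predicted))      # 1.5
print(mean_squared_error(actual, predicted))       # 3.5
print(root_mean_squared_error(actual, predicted))  # about 1.871
```

Squaring weights the residual of 3 much more heavily than the residual of 1, which is why the squared-error measures are more sensitive to large errors than the mean absolute error.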

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution B

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution D

Ans : Solution C

Ans : Solution B
UNIT –II

2. What is pca.components_ in Sklearn?

A. Set of all eigenvectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute-value penalty; as the penalty is increased, some of the
variable coefficients may become exactly zero.
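
A minimal sketch of that effect, under a simplifying assumption: with an orthonormal design matrix, the lasso solution for the objective 0.5*||y - Xb||^2 + lam*||b||_1 is the soft-thresholded least-squares coefficient, and the ridge solution for 0.5*||y - Xb||^2 + 0.5*lam*||b||^2 is a plain rescaling. The coefficient values below are made up.

```python
def soft_threshold(z, lam):
    """Lasso update for one coefficient: shrink towards zero, clip at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_shrink(z, lam):
    """Ridge update for one coefficient: rescale, never exactly zero for z != 0."""
    return z / (1.0 + lam)


ols_coefficients = [3.0, -0.4, 1.2, 0.1]   # hypothetical least-squares estimates
lam = 0.5

lasso = [soft_threshold(z, lam) for z in ols_coefficients]
ridge = [ridge_shrink(z, lam) for z in ols_coefficients]
print(lasso)   # the two small coefficients are set to exactly 0.0
print(ridge)   # every coefficient is shrunk but stays non-zero
```

The zeroed coefficients correspond to variables dropped from the model, which is why lasso, and not ridge, is the answer for variable selection.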

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression they do not
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression they do not
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.
14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
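
For a model of the form Y = A0 + A1*X, the Normal Equation reduces to a closed-form expression, sketched below in plain Python. The function name fit_line and the data points are assumptions chosen for illustration; the data lie exactly on a line, so the training error comes out as zero.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a0 + a1 * x (no gradient descent)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx              # slope
    a0 = mean_y - a1 * mean_x   # intercept
    return a0, a1


xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]               # exactly y = 1 + 2x
a0, a1 = fit_line(xs, ys)
print(a0, a1)                   # 1.0 2.0

rmse = (sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5
print(rmse)                     # 0.0
```

Gradient descent would converge to the same coefficients iteratively; the closed form gets there in one step, at the cost of solving a linear system when there are many features.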

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can fit a line perfectly to the given data, so the mean error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.

Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don't be misled by the word regression in its name.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class classification
problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem by using the One-vs-All
method.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute-value penalty; as the penalty is increased, some of the
variable coefficients may become exactly zero.
Context: 48-49

Consider the following model for logistic regression: P(y = 1 | x; w) = g(w0 + w1*x),
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w) is viewed as a function of x whose shape we can change by
changing the parameters w.

48. What would be the range of p in such a case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For values of x ranging over the real numbers from −∞ to +∞, the logistic function gives an output
in the interval (0, 1).
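
A short sketch of the logistic function g(z) = 1 / (1 + e^(-z)) makes the range easy to check. The weights w0 and w1 below are arbitrary values chosen for illustration.

```python
from math import exp


def logistic(z):
    """Logistic (sigmoid) function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, w0, w1):
    """P(y = 1 | x; w) = g(w0 + w1 * x) for a single input x."""
    return logistic(w0 + w1 * x)


w0, w1 = -1.0, 2.0   # hypothetical parameters
for x in (-5.0, 0.0, 0.5, 5.0):
    print(x, predict_proba(x, w0, w1))
```

Every printed probability lies strictly between 0 and 1, and the value is exactly 0.5 at x = 0.5, where w0 + w1*x equals zero.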

49. In the above question, which function would make p lie in (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds would be 1.

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
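
The chain from probability to odds to logit can be sketched in a few lines. The helper names are illustrative; the probability p is assumed to lie strictly between 0 and 1, since the odds are undefined at p = 1 and the log is undefined at p = 0.

```python
from math import exp, log


def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds; maps (0, 1) onto the whole real line."""
    return log(odds(p))


def inv_logit(z):
    """Inverse of logit, which is the logistic function."""
    return 1.0 / (1.0 + exp(-z))


print(odds(0.5))                 # 1.0 for a fair coin
print(logit(0.5))                # 0.0
print(logit(0.01), logit(0.99))  # large negative and large positive values
print(inv_logit(logit(0.3)))     # recovers 0.3 up to rounding
```

As p approaches 0 the logit heads towards −∞, and as p approaches 1 it heads towards +∞, which matches the stated range. Applying inv_logit undoes logit, so the logistic function and the inverse logit are the same function.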

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression they do not
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression they do not
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution:A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is an inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regressions have to be fit, where the probability of
each category is predicted against the rest of the categories combined.
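
The bookkeeping behind One-vs-All can be sketched without fitting anything: one binary target vector is built per class, and at prediction time the class whose model gives the highest score wins. The helper names and the scores below are made up for illustration; in practice each target vector would be used to train its own logistic regression.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = the rest)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def predict_one_vs_all(scores):
    """Pick the class whose binary model reports the highest score."""
    return max(scores, key=scores.get)


labels = ["cat", "dog", "bird", "dog"]
targets = one_vs_all_targets(labels)
print(len(targets))      # 3 classes, so 3 binary problems
print(targets["dog"])    # [0, 1, 0, 1]

# Hypothetical probabilities from the three fitted binary models for one example.
print(predict_one_vs_all({"cat": 0.2, "dog": 0.7, "bird": 0.1}))   # dog
```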

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about the β0 and β1 values of the two logistic models
(Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

A) A
B) B
C) C
D)None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, the model is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the highest compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty and therefore a less complex decision boundary, as shown in
the first figure, A.

61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face on such huge data is that Logistic Regression will take a very long time to train.

What would you do if you want to train logistic regression on the same data in less time while getting
comparatively similar accuracy (it may not be the same)?

A) Decrease the learning rate and decrease the number of iterations

Solution: D

62. Which of the following images shows the cost function for y = 1?

Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability)
for a two-class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the correct answer, as the loss function decreases as the log probability increases.

63. Suppose the following graph is a cost function for logistic regression.

How many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms a linear decision surface, but the examples in the figure are not
linearly separable.
UNIT IV

1. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
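
The tradeoff can be read directly off the soft-margin objective, 0.5*||w||^2 + C * (sum of hinge losses). The sketch below evaluates that objective for a fixed, hand-picked hyperplane on made-up points; it does not train an SVM, and the function name is an assumption.

```python
def svm_objective(w, b, X, y, C):
    """Soft-margin SVM objective: margin term plus C times the total hinge loss."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge = 0.0
    for x, yi in zip(X, y):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge += max(0.0, 1.0 - yi * score)   # zero when the point clears the margin
    return margin_term + C * hinge


X = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, -1, -1]            # the third point sits on the wrong side
w, b = (1.0, 0.0), 0.0     # hypothetical hyperplane

print(svm_objective(w, b, X, y, C=0.1))    # about 0.65
print(svm_objective(w, b, X, y, C=10.0))   # about 15.5
```

With a low C the single misclassified point barely changes the objective, so a wide, simple margin is acceptable. With a high C the same mistake dominates the objective, pushing the optimiser to bend the boundary until that point is classified correctly.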

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B

D) None of the above options

Ans Solution: C

5. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7. What is/are true about kernels in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVMs.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
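
The role of gamma is visible in the RBF kernel itself, K(a, b) = exp(-gamma * ||a - b||^2). A minimal sketch with made-up points:

```python
from math import exp


def rbf_kernel(a, b, gamma):
    """RBF (Gaussian) kernel: similarity that decays with squared distance."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return exp(-gamma * squared_distance)


origin = (0.0, 0.0)
neighbour = (1.0, 1.0)   # squared distance 2 from the origin

for gamma in (0.1, 1.0, 10.0):
    print(gamma, rbf_kernel(origin, neighbour, gamma))
```

With gamma = 0.1 the two points still look similar (about 0.82), so each training point influences a wide region and the boundary stays smooth. With gamma = 10 the similarity is practically zero, so each point only influences its immediate neighbourhood and the boundary follows the training data closely.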

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use very large value of C(C->infinity)?

Note: For small C, the model was also classifying all data points correctly

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVMs are highly versatile models that can be used for practically all real-world problems, ranging from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing the C parameter would be the right thing to do here, as a larger C weakens the regularization and lets the model fit the training data more closely.

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Sometimes, feature normalization is not feasible in the case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29. Suppose your problem has changed and the data now has only 2 classes. How many times would we
need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?

A) Increasing the complexity will over fit the data

B) Increasing the complexity will under fit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to overfitting
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithm is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6. Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given a less than desirable number of data points?

1. Capping and flooring of variables
2. Removal of outliers

Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a
fraction of the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a
fraction of the features for building the individual trees.

13. Which of the following algorithms doesn’t use learning rate as one of its hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: The bagging is suitable for high variance low bias models?

A) TRUE
B) FALSE

Solution: A

Bagging is suitable for high-variance, low-bias models, or in other words for complex models.

17. When applying bagging to regression trees, which of the following is/are true?

1. We build N regression trees with N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
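
The three statements can be sketched as follows. To keep the example short, a 1-nearest-neighbour predictor stands in for a regression tree as the high-variance, low-bias base learner; that substitution, the names fit_1nn and bagged_regressor, and the toy data are all assumptions.

```python
import random


def fit_1nn(sample):
    """High-variance base learner: predict the y of the nearest training x."""
    def predict(x):
        return min(sample, key=lambda p: abs(p[0] - x))[1]
    return predict


def bagged_regressor(data, n_models, seed=0):
    """Fit n_models base learners on bootstrap samples and average them."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit_1nn(sample))
    # Aggregate by averaging the individual predictions.
    return lambda x: sum(m(x) for m in models) / len(models)


# Hypothetical noisy samples of y = 2x.
noise = [0.3, -0.4, 0.1, 0.5, -0.2, 0.0, -0.5, 0.4, -0.1, 0.2]
data = [(x, 2.0 * x + e) for x, e in zip(range(10), noise)]

predict = bagged_regressor(data, n_models=25)
print(predict(4.5))
```

Each base learner sees a different bootstrap sample, so their individual predictions differ; averaging them is the aggregation step that reduces variance.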

18. How do we select the best hyperparameters in tree-based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always consider the validation results to compare with the test result.

19. In which of the following scenarios is gain ratio preferred over information gain?

A) When a categorical variable has a very large number of categories
B) When a categorical variable has a very small number of categories
C) The number of categories is not the reason
D) None of these

Solution: A

For high-cardinality variables, gain ratio is preferred over the information gain technique.
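
A small worked example may help show why. The sketch below computes information gain and gain ratio for a hypothetical ID-like attribute with one value per row and for a two-valued attribute; the data and the helper names are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_gain_and_ratio(attr, labels):
    """Return (information gain, gain ratio) for splitting labels on attr."""
    n = len(labels)
    groups = {}
    for a, y in zip(attr, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(attr)  # entropy of the partition sizes
    return gain, (gain / split_info if split_info > 0 else 0.0)


labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))                              # 8 distinct values
weather = ["s", "s", "r", "r", "s", "r", "s", "s"]   # 2 distinct values

id_gain, id_ratio = info_gain_and_ratio(row_id, labels)
w_gain, w_ratio = info_gain_and_ratio(weather, labels)
print(id_gain, id_ratio)
print(w_gain, w_ratio)
```

Information gain favours the ID-like attribute because every branch is pure, while gain ratio divides by the split information and so penalises the many-valued split.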

20. Suppose you are given the following training and validation errors for Gradient
Boosting. Which of the following hyperparameter settings would you choose in this case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select 2 because a lower depth is the
better hyperparameter.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n^3)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

- DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following function is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
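
The iterative nature of K-means, and its dependence on the starting centroids, can be sketched on one-dimensional data. The function kmeans_1d and the data are assumptions for illustration; in practice the initial centroids are usually drawn at random, which is where the non-determinism comes from.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:
            break  # converged: centroids no longer move
        centroids = new
    return centroids


# Three well-separated groups, but only k = 2 clusters requested.
points = [0, 1, 10, 11, 20, 21]

result_low = kmeans_1d(points, [0, 1])      # start near the low group
result_high = kmeans_1d(points, [20, 21])   # start near the high group
print(result_low, result_high)
```

Both runs converge after a few iterations, but to different final centroids, so the outcome depends on the initial guess.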
27.
((MARKS)) 1

((QUESTION)) Which of the following steps/assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?

((OPTION_A)) The polynomial degree

((OPTION_B)) Whether we learn the weights by matrix inversion or gradient descent

((OPTION_C)) The use of a constant term

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) 10/27

((OPTION_B)) 20/27

((OPTION_C)) 50/27

((OPTION_D)) 49/27

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) 1 and 4

((OPTION_B)) 2 and 3

((OPTION_C)) 1 and 3

((OPTION_D)) 2 and 4

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) You will always have test error zero

((OPTION_B)) You cannot have test error zero

((OPTION_C)) None of the above

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which one of the statements is true regarding residuals in regression analysis?

((OPTION_A)) Mean of residuals is always zero

((OPTION_B)) Mean of residuals is always less than zero

((OPTION_C)) Mean of residuals is always greater than zero

((OPTION_D)) There is no such rule for residuals.

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
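
As a hedged illustration of the residuals question: for a line fitted by ordinary least squares with an intercept term, the residuals average to zero. The helper fit_simple_ols and the data points below are assumptions, and the property is shown only for this least-squares-with-intercept setting.

```python
def fit_simple_ols(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)
print(mean_residual)  # zero up to floating-point rounding
```

The fit estimates two coefficients, the intercept and the slope, and the intercept is what forces the residuals to balance out.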
((MARKS)) 1

((QUESTION)) Which of the following is true about heteroskedasticity?

((OPTION_A)) Linear Regression with varying error terms

((OPTION_B)) Linear Regression with constant error terms

((OPTION_C)) Linear Regression with zero error terms

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following indicates a fairly strong relationship between X and Y?

((OPTION_A)) Correlation coefficient = 0.9

((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001

((OPTION_C)) The t-statistic for the null hypothesis Beta coefficient = 0 is 30

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) 1, 2 & 3

((OPTION_B)) 1 & 3

((OPTION_C)) All of above

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) To test the linear relationship between y (dependent) and x (independent) continuous variables, which of the following plots is best suited?

((OPTION_A)) Scatter plot

((OPTION_B)) Bar chart

((OPTION_C)) Histograms

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Generally, which of the following method(s) is used for predicting a continuous dependent variable?
1. Linear Regression
2. Logistic Regression

((OPTION_A)) 1 & 2

((OPTION_B)) Only 1

((OPTION_C)) Only 2

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) A correlation between age and health of a person is found to be -1.09. On the basis of this, you would tell the doctors that:

((OPTION_A)) Age is a good predictor of health

((OPTION_B)) Age is a poor predictor of health

((OPTION_C)) None of these

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((OPTION_A)) Vertical offset

((OPTION_B)) Perpendicular offset

((OPTION_C)) Both but depend on situation

((OPTION_D)) Both A & B

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) Only 1

((OPTION_B)) 1 & 3

((OPTION_C)) 1 & 4

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((OPTION_B)) 1 is False and 2 is True

((OPTION_C)) 1 is True and 2 is False

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((OPTION_A)) It is more likely for X1 to be excluded from the model

((OPTION_B)) It is more likely for X1 to be included in the model

((OPTION_C)) Can’t say

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?

((OPTION_A)) Ridge regression uses subset selection of features

((OPTION_B)) Lasso regression uses subset selection of features

((OPTION_C)) Both use subset selection of features

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
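
One way to see why Lasso performs feature selection while Ridge does not is to compare how each penalty shrinks a coefficient. The sketch below assumes an orthonormal design matrix and a particular scaling of the penalty, under which the Lasso solution is a soft-thresholded least-squares coefficient and the Ridge solution is a rescaled one. The function names, coefficients and penalty strength are illustrative assumptions.

```python
def lasso_shrink(b, lam):
    """Soft-thresholding: shrink toward zero and clip small values to 0."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def ridge_shrink(b, lam):
    """Proportional shrinkage: never exactly zero for a non-zero input."""
    return b / (1.0 + lam)


# Hypothetical least-squares coefficients and penalty strength.
ols = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]
print(lasso)
print(ridge)
```

The two small coefficients are set exactly to zero by the Lasso rule, which removes those features from the model, while the Ridge rule only makes them smaller.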
((MARKS)) 1

((QUESTION)) Which of the following statement(s) can be true post adding a

((OPTION_A)) 1 and 2

((OPTION_B)) 1 and 3

((OPTION_C)) 2 and 4

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) 2 and 4

((OPTION_B)) 1 and 2

((OPTION_C)) 2, 3 and 4

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) 1 and 2

((OPTION_B)) 1 & 3

((OPTION_C)) 2 & 3

((OPTION_D)) 1, 2 & 3

((OPTION_E))

((CORRECT_CHOICE)) D

((OPTION_A)) 1 and 2

((OPTION_B)) 1 and 3

((OPTION_C)) 2 and 3

((OPTION_D)) 1, 2 and 3

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) How many coefficients do you need to estimate in a simple linear regression model (one independent variable)?

((OPTION_A)) 1

((OPTION_B)) 2

((OPTION_C)) Can’t say

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

Which of the following statements is true about the sum of residuals of A and B?

((OPTION_A)) A has higher than B

((OPTION_B)) A has lower than B

((OPTION_C)) Both have same

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?

((OPTION_A)) YES

((OPTION_B)) NO

((OPTION_C)) Both A & B

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Correlated variables can have zero correlation coefficient. True or False?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A
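
A quick numeric check of the statement above: two variables can be perfectly related and still have a Pearson correlation of zero when the relationship is not linear. The helper pearson_r and the data are assumptions chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


xs = [-2, -1, 0, 1, 2]
ys_quadratic = [x * x for x in xs]    # y depends entirely on x, non-linearly
ys_linear = [2 * x + 1 for x in xs]   # y is a linear function of x

r_quadratic = pearson_r(xs, ys_quadratic)
r_linear = pearson_r(xs, ys_linear)
print(r_quadratic, r_linear)
```

The quadratic pair gives a coefficient of zero even though y is fully determined by x, while the linear pair gives a coefficient of one.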

((OPTION_A)) Only 2

((OPTION_B)) Only 1

((OPTION_C)) Only 3

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) 3.02

((OPTION_B)) 0.75

((OPTION_C)) 1.01

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Suppose the distribution of salaries in a company X has median $35,000, and 25th and 75th percentiles are $21,000 and $53,000 respectively. Would a person with salary $1 be considered an outlier?

((OPTION_A)) YES

((OPTION_B)) NO

((OPTION_C)) More information is required

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
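
The question does not say which outlier rule is meant, which fits the keyed answer that more information is required. Purely as an illustration, the sketch below assumes the common 1.5 x IQR fence convention, which is not stated in the question, and applies it to the given percentiles.

```python
# Percentiles given in the question.
q1 = 21_000
q3 = 53_000

# ASSUMPTION: the 1.5 * IQR fence rule, one common convention among several.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

salary = 1
is_outlier = salary < lower_fence or salary > upper_fence

print(iqr, lower_fence, upper_fence, is_outlier)
```

Under this assumed rule the lower fence is negative, so a salary of $1 would not be flagged; a different rule or more knowledge of the distribution could lead to a different verdict.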
((MARKS)) 1

((QUESTION)) Which of the following options is true regarding “Regression” and “Correlation”?
Note: y is the dependent variable and x is the independent variable.

((OPTION_A)) The relationship is symmetric between x and y in both.

((OPTION_B)) The relationship is not symmetric between x and y in both.

((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.

((OPTION_D)) The relationship is symmetric between x and y in case of correlation but in case of regression it is not symmetric.

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is Logistic regression a supervised machine learning algorithm?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is Logistic regression mainly used for Regression?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is it possible to design a logistic regression algorithm using a Neural Network Algorithm?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification problem?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following methods do we use to best fit the data in Logistic Regression?

((OPTION_A)) Least Square Error

((OPTION_B)) Maximum Likelihood

((OPTION_C)) Jaccard distance

((OPTION_D)) Both A & B

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
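
Maximum likelihood fitting can be sketched with plain gradient ascent on the log-likelihood. This is a minimal illustration on assumed toy data, with assumed names (log_likelihood, fit_logistic) and assumed step size and iteration count; real libraries use more robust optimisers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w0, w1, xs, ys):
    """Bernoulli log-likelihood of labels ys under the model sigmoid(w0 + w1*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w0 + w1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Gradient ascent on the average log-likelihood."""
    w0 = w1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            error = y - sigmoid(w0 + w1 * x)
            g0 += error
            g1 += error * x
        w0 += lr * g0 / n
        w1 += lr * g1 / n
    return w0, w1


# Hypothetical, non-separable toy data.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w0, w1 = fit_logistic(xs, ys)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(w0, w1, xs, ys))
```

The fitted weights give a higher log-likelihood than the all-zero starting point, which is the sense in which they fit the data better.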
((MARKS)) 1

((OPTION_A)) We prefer a model with minimum AIC value

((OPTION_B)) We prefer a model with maximum AIC value

((OPTION_C)) Both but depend on the situation

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) [True-False] Standardisation of features is required before training a Logistic Regression

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following algorithms do we use for Variable Selection?

((OPTION_A)) LASSO

((OPTION_B)) Ridge

((OPTION_C)) Both

((OPTION_D)) All of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?

((OPTION_A)) odds will be 0

((OPTION_B)) odds will be 0.5

((OPTION_C)) odds will be 1

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could be the range of the logit function in the domain x=[0,1]?

((OPTION_A)) (-∞, ∞)

((OPTION_B)) (0, 1)

((OPTION_C)) (0, ∞)

((OPTION_D)) (-∞, 0)

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
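
The odds and logit definitions used in the two questions above can be checked numerically. The function names and the probe probabilities are assumptions; note that the logit is only defined strictly inside the interval, since the odds are 0 or undefined at the endpoints.

```python
import math


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def logit(p):
    """Log of the odds; defined for 0 < p < 1."""
    return math.log(odds(p))


print(odds(0.5))   # fair coin: heads and tails are equally likely
for p in (0.001, 0.5, 0.999):
    print(p, logit(p))
```

A fair coin has odds of 1 and a logit of 0. As p moves toward 0 or 1 the logit grows without bound in the negative or positive direction, which is why its range is the whole real line.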
((MARKS)) 1

((QUESTION)) Which of the following options is true?

((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not the case

((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is not the case

((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be normally distributed

((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) Logistic(x) = Logit(x)

((OPTION_B)) Logistic(x) = Logit_inv(x)

((OPTION_C)) Logistic(x) = Logit(x)

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
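
The keyed answer, that the logistic function is the inverse of the logit, can be verified with a round trip. The function names and test values below are assumptions for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic(x):
    """Logistic (sigmoid) function, mapping any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Applying one function after the other should return the starting value.
for p in (0.1, 0.3, 0.5, 0.9):
    print(p, logistic(logit(p)))

for x in (-3.0, 0.0, 2.0):
    print(x, logit(logistic(x)))
```

Up to floating-point rounding, each printed pair shows the same number twice, which is what being inverse functions means.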
((MARKS)) 2

((OPTION_A)) Training accuracy increases

((OPTION_B)) Training accuracy increases or remains the same

((OPTION_C)) Testing accuracy decreases

((OPTION_D)) Testing accuracy increases or remains the same

((OPTION_E))

((CORRECT_CHOICE)) A&D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Choose which of the following options is true regarding the One-Vs-All method in Logistic Regression.

((OPTION_A)) We need to fit n models in an n-class classification problem

((OPTION_B)) We need to fit n-1 models to classify into n classes

((OPTION_C)) We need to fit only 1 model to classify into n classes

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_B)) Decrease the learning rate and increase the number of iterations

((OPTION_C)) Increase the learning rate and increase the number of iterations

((OPTION_D)) Increase the learning rate and decrease the number of iterations

((OPTION_E))

((CORRECT_CHOICE)) D

((OPTION_A)) A

((OPTION_B)) B

((OPTION_C)) BOTH

((OPTION_D)) NONE OF THESE

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Logistic regression is used when you want to:

((OPTION_A)) Predict a dichotomous variable from continuous or dichotomous variables.

((OPTION_B)) Predict a continuous variable from dichotomous variables.

((OPTION_C)) Predict any categorical variable from several other categorical variables.

((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The odds ratio is

((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.

((OPTION_B)) The probability of an event occurring.

((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.

((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.

((OPTION_E))

((CORRECT_CHOICE)) C
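
Worked illustration for the odds-ratio question: a small sketch, assuming a one-predictor logistic model with hypothetical coefficients `b0` and `b1`, showing that the ratio of the odds after a one-unit increase in the predictor to the original odds equals exp(b1). Option D, by contrast, describes the odds themselves.

```python
import math


def predicted_probability(b0, b1, x):
    """Logistic model: P(event) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def odds(p):
    """Odds of an event: P(happening) / P(not happening)."""
    return p / (1.0 - p)


# Hypothetical coefficients and predictor value, chosen only for illustration.
b0, b1 = -1.0, 0.7
x = 2.0

odds_before = odds(predicted_probability(b0, b1, x))
odds_after = odds(predicted_probability(b0, b1, x + 1.0))

# Odds after a unit change in the predictor divided by the original odds.
odds_ratio = odds_after / odds_before
print(odds_ratio, math.exp(b1))
```

The result does not depend on the starting value of `x`, which is why a single odds ratio per predictor is reported for a fitted logistic model.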

((OPTION_A)) That there are a greater number of explained vs. unexplained observations.

((OPTION_B)) That the statistical model fits the data well.

((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.

((OPTION_D)) That the statistical model is a poor fit of the data.

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Logistic regression assumes a:

((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.

((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.

((OPTION_C)) Linear relationship between continuous predictor variables.

((OPTION_D)) Linear relationship between observations.

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) In binary logistic regression:

((OPTION_A)) The dependent variable is continuous.

((OPTION_B)) The dependent variable is divided into two equal subcategories.

((OPTION_C)) The dependent variable consists of two categories.

((OPTION_D)) There is no dependent variable.

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The correlation coefficient is used to determine

((OPTION_A)) A specific value of the y-variable given a specific value of the x-variable

((OPTION_B)) A specific value of the x-variable given a specific value of the y-variable

((OPTION_C)) The strength of the relationship between the x and y variables

((OPTION_D)) None

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
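
Worked illustration for the correlation question: a minimal sketch of the Pearson correlation coefficient, assuming that this is the coefficient the question refers to. The two small data sets are hypothetical. The coefficient summarises how strongly x and y move together; it does not return a specific y for a given x.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
strong = pearson_r(xs, [2, 4, 6, 8, 10])  # y is an exact linear function of x
weak = pearson_r(xs, [2, 1, 4, 1, 3])     # y only loosely follows x

print(strong, weak)
```

Values close to +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.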
((MARKS)) 1

((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and artificial intelligence (AI)

((OPTION_A)) ML is an alternate way of programming intelligent machines.

((OPTION_B)) ML and AI have very different goals

((OPTION_C)) ML is a set of techniques that turns a dataset into software.

((OPTION_D)) AI is software that can emulate the human mind

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following sentences is FALSE regarding regression?

((OPTION_A)) It is used for prediction

((OPTION_B)) It may be used for interpretation

((OPTION_C)) It relates inputs to outputs.

((OPTION_D)) It discovers causal relationships

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Grid search is

((OPTION_A)) Linear in D

((OPTION_B)) Exponential in D

((OPTION_C)) Linear in N

((OPTION_D)) Both B&C

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
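
Worked illustration for the grid-search question: a sketch, assuming D is the number of parameters, N the number of data points, and the cost is the mean-square error of a linear model. With M candidate values per parameter the grid holds M^D points (exponential in D), and each cost evaluation makes one pass over the data (linear in N). The helper names and the toy data are hypothetical.

```python
from itertools import product


def make_grid(candidates, num_params):
    """All combinations of candidate values: len(candidates) ** num_params points."""
    return list(product(candidates, repeat=num_params))


def mse(beta, xs, ys):
    """Mean-square error of a linear model; one pass over the N data points."""
    total = 0.0
    for x, y in zip(xs, ys):
        prediction = sum(b * xi for b, xi in zip(beta, x))
        total += (y - prediction) ** 2
    return total / len(xs)


def grid_search(candidates, num_params, xs, ys):
    """Evaluate the cost at every grid point and keep the best one."""
    grid = make_grid(candidates, num_params)
    return min(grid, key=lambda beta: mse(beta, xs, ys))


candidates = [0.0, 1.0, 2.0]
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [1.0, 2.0, 3.0]  # hypothetical data generated by beta = (1, 2)

best = grid_search(candidates, 2, xs, ys)
grid_sizes = [len(make_grid(candidates, d)) for d in (1, 2, 3, 4)]
print(best, grid_sizes)
```

Each extra parameter multiplies the number of grid points by the number of candidate values, which is why grid search quickly becomes impractical as D grows.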
((MARKS)) 1

((QUESTION)) Find the incorrect statement regarding the gradient of a continuous and differentiable function:

((OPTION_A)) is zero at a minimum

((OPTION_B)) is non-zero at a maximum

((OPTION_C)) is zero at a saddle point

((OPTION_D)) decreases as you get closer to the minimum

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
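
Worked illustration for the gradient question: a sketch using a central-difference approximation on three hypothetical test functions, a bowl (minimum at the origin), a hill (maximum at the origin) and a saddle. The gradient vanishes at all three stationary points, which is why the "non-zero at a maximum" option is the incorrect one.

```python
def numerical_gradient(f, point, h=1e-6):
    """Central-difference approximation of the gradient of f at `point`."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def bowl(p):
    """f(x) = x^2, minimum at x = 0."""
    return p[0] ** 2


def hill(p):
    """f(x) = -x^2, maximum at x = 0."""
    return -(p[0] ** 2)


def saddle(p):
    """f(x, y) = x^2 - y^2, saddle point at (0, 0)."""
    return p[0] ** 2 - p[1] ** 2


grad_at_min = numerical_gradient(bowl, [0.0])
grad_at_max = numerical_gradient(hill, [0.0])
grad_at_saddle = numerical_gradient(saddle, [0.0, 0.0])

# For this bowl, the gradient magnitude shrinks as x approaches the minimum.
far = abs(numerical_gradient(bowl, [2.0])[0])
near = abs(numerical_gradient(bowl, [0.5])[0])
print(grad_at_min, grad_at_max, grad_at_saddle, far, near)
```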
((MARKS)) 1

((QUESTION)) Consider a linear-regression model with N = 3 and D = 1 with input-output pairs as follows: y1 = 22, x1 = 1, y2 = 3, x2 = 1, y3 = 3, x3 = 2. What is the gradient of mean-square error (MSE) with respect to β1 when β0 = 0 and β1 = 1? Give your answer correct to two decimal digits.

((OPTION_A)) -1.66

((OPTION_B)) 2

((OPTION_C)) 3

((OPTION_D)) 4

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
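
Worked illustration for the MSE-gradient question: a sketch of the computation under two assumptions that are not stated in the question bank. First, MSE is taken as (1/(2N)) Σ (y_n − β0 − β1·x_n)², so its gradient with respect to β1 is −(1/N) Σ (y_n − β0 − β1·x_n)·x_n. Second, y1 is taken as 2 rather than the printed 22. Under these assumptions the result is −5/3 ≈ −1.67, which agrees with option A up to truncation.

```python
def mse_gradient_beta1(xs, ys, beta0, beta1):
    """d/d(beta1) of (1 / (2N)) * sum((y - beta0 - beta1 * x) ** 2)."""
    n = len(xs)
    return -sum((y - beta0 - beta1 * x) * x for x, y in zip(xs, ys)) / n


xs = [1.0, 1.0, 2.0]
ys = [2.0, 3.0, 3.0]  # assumption: y1 = 2, the value consistent with option A

# Residuals at beta0 = 0, beta1 = 1 are (1, 2, 1), so the gradient is
# -(1*1 + 2*1 + 1*2) / 3 = -5/3.
grad = mse_gradient_beta1(xs, ys, beta0=0.0, beta1=1.0)
print(round(grad, 2))
```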
((MARKS)) 1

((QUESTION)) Let us say that we have computed the gradient of our cost function and stored it in a vector g. What is the cost of one gradient descent update given the gradient?

((OPTION_A)) O(D)

((OPTION_B)) O(N)

((OPTION_C)) O(ND)

((OPTION_D)) O(ND^2)

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
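
Worked illustration for the update-cost question: a minimal sketch of one gradient descent step, assuming the parameters `beta` and the precomputed gradient `g` are both vectors of length D and `step_size` is a hypothetical learning rate. The update touches each of the D entries once and never looks at the N data points, hence O(D).

```python
def gradient_descent_update(beta, g, step_size):
    """One update beta <- beta - step_size * g; a single pass over D entries."""
    return [b - step_size * gi for b, gi in zip(beta, g)]


beta = [1.0, 2.0, 3.0]  # hypothetical current parameters (D = 3)
g = [0.5, -1.0, 0.0]    # hypothetical gradient, already computed
new_beta = gradient_descent_update(beta, g, step_size=0.1)
print(new_beta)
```

Computing g itself is the expensive part (for linear regression it requires a pass over all N points), but that cost is excluded by the wording of the question.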
((MARKS)) 1

((QUESTION)) You observe the following while fitting a linear regression to the data: As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low (almost what you expect it to be), while the test error is much higher than the train error. What do you think is the main reason behind this behavior? Choose the most probable option.

((OPTION_A)) High variance

((OPTION_B)) High model bias

((OPTION_C)) High estimation bias

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Adding more basis functions in a linear model... (pick the most probable option)

((OPTION_A)) Decreases model bias

((OPTION_B)) Decreases estimation bias

((OPTION_C)) Decreases variance

((OPTION_D)) Doesn’t affect bias and variance

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The problem of finding hidden structure in unlabeled data is called

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Supervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Unsupervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Reinforcement learning

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Task of inferring a model from labeled training data is called

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Unsupervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Supervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Reinforcement learning

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Some telecommunication company wants to segment their customers
into distinct groups in order to send appropriate subscription offers,
this is an example of

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Supervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Data extraction

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Serration
This is optional

((OPTION_D)) Unsupervised learning

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
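
The segmentation scenario in the question above can be sketched with a tiny one-dimensional k-means. This is only an illustration under stated assumptions: the usage figures are made up, the function name `kmeans_1d` is hypothetical, and k-means is just one of several unsupervised methods that could be used. The point is that no labels are supplied; the groups emerge from the data alone.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then re-average."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Made-up monthly data usage (GB) for six customers; no labels are given.
usage_gb = [1.0, 1.5, 2.0, 20.0, 22.0, 25.0]
centers, clusters = kmeans_1d(usage_gb, centers=[0.0, 10.0])
print(centers)
print(clusters)
```

With these values the light users and the heavy users end up in separate groups, which is the kind of split a subscription offer could then target.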

((QUESTION)) Self-organizing maps are an example of

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Unsupervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Supervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Reinforcement learning

This is optional

((OPTION_D)) Missing data imputation

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) You are given data about seismic activity in Japan, and you want to
predict the magnitude of the next earthquake. This is an example of
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Supervised learning

THIS IS
MANDATORY
OPTION

((OPTION_B)) Unsupervised learning

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Serration
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Assume you want to perform supervised learning and to predict

((OPTION_A)) Classification
THIS IS
MANDATORY
OPTION

((OPTION_B)) Regression
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Clustering
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Discriminating between spam and ham e-mails is a classification task,
true or false?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) In the example of predicting number of babies based on storks’
population size, number of babies is
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Outcome
THIS IS
MANDATORY
OPTION

((OPTION_B)) Feature
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Attribute
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) It may be better to avoid the metric of ROC curve as it can suffer
from accuracy paradox.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True

THIS IS
MANDATORY
OPTION

((OPTION_B)) False

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
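
The accuracy paradox concerns plain accuracy on imbalanced data, not the ROC curve, which is why the statement above is marked false. A minimal sketch, assuming made-up labels (95 negatives, 5 positives) and a hypothetical classifier that always predicts the negative class:

```python
# Made-up imbalanced labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A useless classifier that always predicts the majority class.
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall (true positive rate): fraction of actual positives that were found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(accuracy)  # 0.95
print(recall)    # 0.0
```

Accuracy looks high although not a single positive case is detected. The true positive rate, one of the two axes of a ROC curve, exposes the problem immediately.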

((QUESTION)) Which of the following is not involved in data mining?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Knowledge extraction

THIS IS
MANDATORY
OPTION

((OPTION_B)) Data archaeology

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Data exploration

This is optional

((OPTION_D)) Data transformation

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The expected value or _______ of a random variable is the center of its
distribution.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Mode
THIS IS
MANDATORY
OPTION

((OPTION_B)) median
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) mean
This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
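
As a small worked example of the expected value (mean) from the question above, the sketch below computes it for a fair six-sided die. The die is an illustrative choice and is not part of the original question.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6 (an illustrative choice).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value E[X] = sum over x of x * P(X = x).
expected_value = sum(face * p for face, p in pmf.items())
print(expected_value)  # 7/2
```

The result, 3.5, sits at the center of the distribution even though no single roll can produce it.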

((QUESTION)) Point out the correct statement.

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Some cumulative distribution function F is non-decreasing and right-continuous

THIS IS
MANDATORY
OPTION

((OPTION_B)) Every cumulative distribution function F is decreasing and right-continuous

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Every cumulative distribution function F is increasing and left-continuous

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following of a random variable is a measure of spread

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) variance
THIS IS
MANDATORY
OPTION

((OPTION_B)) standard deviation

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) empirical mean

This is optional

((OPTION_D)) All above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The square root of the variance is called the ________ deviation
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) empirical
THIS IS
MANDATORY
OPTION

((OPTION_B)) mean
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) continuous
This is optional

((OPTION_D)) standard
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
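
The relationship between variance and standard deviation in the question above can be checked numerically. This is a minimal sketch using Python's standard `statistics` module on a made-up data set that is treated as a full population.

```python
import math
import statistics

# Made-up sample, treated as a complete population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # mean squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print(variance)
print(std_dev)
print(math.isclose(std_dev, math.sqrt(variance)))
```

The mean of this data is 5 and the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the standard deviation is 2.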

((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
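
The statement above is false because the relationship runs the other way: the PDF is the derivative of the CDF, and the CDF is the integral of the PDF. A minimal numerical check, assuming an exponential distribution with rate 1 (chosen only for illustration; the helper names are hypothetical):

```python
import math

def pdf(x):
    """PDF of an exponential distribution with rate 1 (illustrative choice)."""
    return math.exp(-x) if x >= 0 else 0.0

def cdf(x):
    """CDF of the same distribution: the integral of the PDF from 0 to x."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.3
slope_of_cdf = numerical_derivative(cdf, x)
# The slope of the CDF at x matches the PDF at x up to numerical error.
print(abs(slope_of_cdf - pdf(x)) < 1e-6)  # True
```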

((QUESTION)) Cumulative distribution functions are used to specify the distribution of
multivariate random variables.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((OPTION_A)) Regression
THIS IS
MANDATORY
OPTION

((OPTION_B)) Decision Tree

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Clustering
This is optional

((OPTION_D)) Association Rule

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((OPTION_B)) Overfitting

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) None
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE))
Either A or B or C or D or E

((OPTION_C)) Both
This is optional

((OPTION_D)) None
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE))
Either A or B or C or D or E

((OPTION_C)) Both
This is optional

((OPTION_D)) None
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE))
Either A or B or C or D or E

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) In 1984, the computer scientist L. Valiant

((OPTION_B)) Zero one loss error

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Probably approximately correct

This is optional

((OPTION_D)) none
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same
output element
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Therefore, learning a

((OPTION_A)) True
THIS IS
MANDATORY
OPTION

((OPTION_B)) False
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((OPTION_A)) curse of dimensionality

THIS IS
MANDATORY
OPTION

((OPTION_B)) Hughes phenomenon

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Probably approximately correct

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) In many cases, in order to capture the full expressivity, it's
necessary to have a very large dataset and without enough training data, the
approximation can become problematic. This is called…
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) curse of dimensionality

THIS IS
MANDATORY
OPTION

((OPTION_B)) Hughes phenomenon

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Probably approximately correct

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) First term is called as
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION

((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) likelihood.
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Second term is called as
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION

((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) likelihood.
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Third term is called as
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) posteriori
THIS IS
MANDATORY
OPTION

((OPTION_B)) Apriori
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) likelihood.
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A or B or C or D or E

((EXPLANATION))
This is also optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
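
The three answer options in the preceding questions (posteriori, apriori, likelihood) are the usual names for the terms of Bayes' theorem. The expression those questions refer to did not survive in this copy, so which term counts as "first", "second", or "third" cannot be verified here. As a general reference only, with $h$ a hypothesis and $D$ the observed data (symbols chosen for illustration):

```latex
P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)}
```

Here $P(h \mid D)$ is the a posteriori probability, $P(h)$ is the a priori probability, $P(D \mid h)$ is the likelihood, and $P(D)$ is a normalizing constant.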

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))

((QUESTION)) We can create the object of abstract class

((OPTION_A))

((OPTION_B))

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE))

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Choose the option that is incorrect regarding machine learning (ML) and artificial intelligence (AI)

((OPTION_A)) ML is an alternate way of programming intelligent machines.

((OPTION_B)) ML and AI have very different goals

((OPTION_C)) ML is a set of techniques that turn a dataset into software.

((OPTION_D)) AI is software that can emulate the human mind

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following statements is FALSE regarding regression

((OPTION_A)) It is used for prediction

((OPTION_B)) It may be used for interpretation

((OPTION_C)) It relates inputs to outputs.

((OPTION_D)) It discovers causal relationships

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Grid search is

((OPTION_A)) Linear in D

((OPTION_B)) Exponential in D

((OPTION_C)) Linear in N

((OPTION_D)) Both B & C

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
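A minimal sketch of why the number of grid candidates grows exponentially with the number of dimensions D. The helper names `grid_points` and `grid_search` and the three candidate values per axis are assumptions made for illustration, not part of the question bank. If the loss is a sum over N training points, each evaluation costs time linear in N.

```python
from itertools import product


def grid_points(values_per_dim, num_dims):
    """Enumerate every candidate on a regular grid (same values on each axis)."""
    return list(product(values_per_dim, repeat=num_dims))


def grid_search(loss, values_per_dim, num_dims):
    """Return the grid point with the smallest loss (one loss call per point)."""
    return min(grid_points(values_per_dim, num_dims), key=loss)


values = [-1.0, 0.0, 1.0]  # hypothetical candidate values for each parameter

# The number of candidates is len(values) ** D, so it explodes as D grows.
counts = {d: len(grid_points(values, d)) for d in (1, 2, 3, 4)}


def toy_loss(w):
    """Toy loss with its minimum at w = (1, 1, ...)."""
    return sum((wi - 1.0) ** 2 for wi in w)


best = grid_search(toy_loss, values, 2)
```

With three values per axis, the grid holds 3, 9, 27 and 81 points for D = 1 to 4.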
((MARKS)) 1

((QUESTION)) Find the incorrect statement regarding the gradient of a continuous and differentiable function

((OPTION_A)) is zero at a minimum

((OPTION_B)) is non-zero at a maximum

((OPTION_C)) is zero at a saddle point

((OPTION_D)) decreases as you get closer to the minimum

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
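A small numerical check, offered as a sketch rather than a proof, that the gradient vanishes at a minimum, a maximum and a saddle point alike. The three test functions and the helper `numerical_gradient` are assumptions chosen for illustration.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at point with central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def bowl(p):
    """x^2 + y^2: minimum at the origin."""
    return p[0] ** 2 + p[1] ** 2


def hill(p):
    """-(x^2 + y^2): maximum at the origin."""
    return -(p[0] ** 2 + p[1] ** 2)


def saddle(p):
    """x^2 - y^2: saddle point at the origin."""
    return p[0] ** 2 - p[1] ** 2


origin = [0.0, 0.0]
gradients = {f.__name__: numerical_gradient(f, origin) for f in (bowl, hill, saddle)}
away_from_minimum = numerical_gradient(bowl, [1.0, 0.0])
```

All three gradients at the origin come out as (numerically) zero, while the gradient of `bowl` at (1, 0) is close to (2, 0).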
((MARKS)) 1

((QUESTION)) Consider a linear-regression model with N = 3 and D = 1 with input-output

((OPTION_A)) -1.66

((OPTION_B)) 2

((OPTION_C)) 3

((OPTION_D)) 4

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) O(D)

((OPTION_B)) O(N)

((OPTION_C)) O(ND)

((OPTION_D)) O(ND^2)

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_B)) High model bias

((OPTION_C)) High estimation bias

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Adding more basis functions in a linear model... (pick the most probable option)

((OPTION_A)) Decreases model bias

((OPTION_B)) Decreases estimation bias

((OPTION_C)) Decreases variance

((OPTION_D)) Doesn’t affect bias and variance

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The problem of finding hidden structure in unlabeled data is called

((OPTION_A)) Supervised learning

((OPTION_B)) Unsupervised learning

((OPTION_C)) Reinforcement learning

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The task of inferring a model from labeled training data is called

((OPTION_A)) Unsupervised learning

((OPTION_B)) Supervised learning

((OPTION_C)) Reinforcement learning

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) A telecommunication company wants to segment its customers into distinct groups in order to send appropriate subscription offers. This is an example of

((OPTION_A)) Supervised learning

((OPTION_B)) Data extraction

((OPTION_C)) Serration

((OPTION_D)) Unsupervised learning

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Self-organizing maps are an example of

((OPTION_A)) Unsupervised learning

((OPTION_B)) Supervised learning

((OPTION_C)) Reinforcement learning

((OPTION_D)) Missing data imputation

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) You are given data about seismic activity in Japan, and you want to predict the magnitude of the next earthquake. This is an example of

((OPTION_A)) Supervised learning

((OPTION_B)) Unsupervised learning

((OPTION_C)) Serration

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Assume you want to perform supervised learning and to predict

((OPTION_A)) Classification

((OPTION_B)) Regression

((OPTION_C)) Clustering

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Discriminating between spam and ham e-mails is a classification task, true or false?

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) In the example of predicting the number of babies based on the storks’ population size, the number of babies is the

((OPTION_A)) Outcome

((OPTION_B)) Feature

((OPTION_C)) Attribute

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) It may be better to avoid the ROC curve as a metric because it can suffer from the accuracy paradox.

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following is not involved in data mining

((OPTION_A)) Knowledge extraction

((OPTION_B)) Data archaeology

((OPTION_C)) Data exploration

((OPTION_D)) Data transformation

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The expected value or _______ of a random variable is the center of its distribution.

((OPTION_A)) Mode

((OPTION_B)) Median

((OPTION_C)) Mean

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Point out the correct statement.

((OPTION_A)) Some cumulative distribution function F is non-decreasing and right-continuous

((OPTION_B)) Every cumulative distribution function F is decreasing and right-continuous

((OPTION_C)) Every cumulative distribution function F is increasing and left-continuous

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following properties of a random variable is a measure of spread

((OPTION_A)) Variance

((OPTION_B)) Standard deviation

((OPTION_C)) Empirical mean

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The square root of the variance is called the ________ deviation

((OPTION_A)) empirical

((OPTION_B)) mean

((OPTION_C)) continuous

((OPTION_D)) standard

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
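As a quick illustration, the relationship between variance and standard deviation can be checked with Python's standard `statistics` module. The sample values below are made up for the example.

```python
import math
import statistics

# Hypothetical sample, chosen so the numbers come out round.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population variance: mean of squared deviations from the mean.
variance = statistics.pvariance(samples)

# Population standard deviation: square root of the population variance.
std_dev = statistics.pstdev(samples)

matches = math.isclose(std_dev, math.sqrt(variance))
```

For this sample the mean is 5, the population variance is 4 and the standard deviation is 2.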
((MARKS)) 1

((QUESTION)) For continuous random variables, the CDF is the derivative of the PDF.

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
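The relationship runs the other way: the PDF is the derivative of the CDF. A minimal numerical sketch, assuming a standard normal distribution as the example (the helper names are illustrative):

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Largest gap between d/dx CDF and the PDF over a few test points.
max_gap = max(
    abs(derivative(normal_cdf, x) - normal_pdf(x)) for x in (-1.0, 0.0, 0.5, 2.0)
)
```

The gap stays at the level of numerical noise, which is consistent with differentiating the CDF to obtain the PDF.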
((MARKS)) 1

((QUESTION)) Cumulative distribution functions are used to specify the distribution of multivariate random variables.

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) Regression

((OPTION_B)) Decision Tree

((OPTION_C)) Clustering

((OPTION_D)) Association Rule

((OPTION_E))

((CORRECT_CHOICE)) B

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_B)) Overfitting

((OPTION_C)) Both

((OPTION_D)) None

((OPTION_E))

((CORRECT_CHOICE))

((OPTION_C)) Both

((OPTION_D)) None

((OPTION_E))

((CORRECT_CHOICE))

((OPTION_C)) Both

((OPTION_D)) None

((OPTION_E))

((CORRECT_CHOICE))

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) In 1984, the computer scientist L. Valiant

((OPTION_B)) Zero one loss error

((OPTION_C)) Probably approximately correct

((OPTION_D)) none

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) In particular, a concept is a subset of input patterns X which determine the same output element

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Therefore, learning a

((OPTION_A)) True

((OPTION_B)) False

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) Curse of dimensionality

((OPTION_B)) Hughes phenomenon

((OPTION_C)) Probably approximately correct

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) In many cases, in order to capture the full expressivity, it's necessary to have a very large dataset, and without enough training data the approximation can become problematic. This is called…

((OPTION_A)) Curse of dimensionality

((OPTION_B)) Hughes phenomenon

((OPTION_C)) Probably approximately correct

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) First term is called as

((OPTION_A)) posteriori

((OPTION_B)) Apriori

((OPTION_C)) likelihood.

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Second term is called as

((OPTION_A)) posteriori

((OPTION_B)) Apriori

((OPTION_C)) likelihood.

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Third term is called as

((OPTION_A)) posteriori

((OPTION_B)) Apriori

((OPTION_C)) likelihood.

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
((QUESTION)) We can create an object of an abstract class
((OPTION_A))
((OPTION_B))
((OPTION_C))
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE))
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following steps or assumptions in regression modeling most affects the trade-off between underfitting and overfitting?
((OPTION_A)) The polynomial degree
((OPTION_B)) Whether we learn the weights by matrix inversion or gradient descent
((OPTION_C)) The use of a constant term
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
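The role of the polynomial degree can be illustrated with a small sketch. The data below are made up for illustration (a line `y = 2x + 1` plus a fixed alternating noise term), and the helper names `fit_line`, `fit_interpolant`, and `mse` are hypothetical. A degree-1 fit is compared with the degree-5 polynomial that passes through all six training points: the high-degree model reaches zero training error but does worse on held-out points.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_interpolant(xs, ys):
    """Degree n-1 polynomial through all n points (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def truth(x):
    """Assumed noise-free relationship used to generate the toy data."""
    return 2 * x + 1


train_x = [0, 1, 2, 3, 4, 5]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]  # fixed, illustrative noise
train_y = [truth(x) + e for x, e in zip(train_x, noise)]
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [truth(x) for x in test_x]

line = fit_line(train_x, train_y)
poly = fit_interpolant(train_x, train_y)

print("degree 1: train", mse(line, train_x, train_y), "test", mse(line, test_x, test_y))
print("degree 5: train", mse(poly, train_x, train_y), "test", mse(poly, test_x, test_y))
```

The choice of solver (matrix inversion or gradient descent) would not change which line is fitted here, which is consistent with option A being the keyed answer.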
((MARKS)) 1
((OPTION_A)) 10/27
((OPTION_B)) 20/27
((OPTION_C)) 50/27
((OPTION_D)) 49/27
((OPTION_E))
((CORRECT_CHOICE)) D
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) 1 and 4
((OPTION_B)) 2 and 3
((OPTION_C)) 1 and 3
((OPTION_D)) 2 and 4
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((OPTION_A)) You will always have zero test error
((OPTION_B)) You cannot have zero test error
((OPTION_C)) None of the above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which one of the following statements is true regarding residuals in regression analysis?
((OPTION_A)) Mean of residuals is always zero
((OPTION_B)) Mean of residuals is always less than zero
((OPTION_C)) Mean of residuals is always greater than zero
((OPTION_D)) There is no such rule for residuals.
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
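A quick numeric check of the residual property, as a sketch on made-up data. The keyed answer holds for ordinary least squares when the model includes an intercept; the second fit below (a line forced through the origin) is included only to show that the property can fail without one. The function names are hypothetical.

```python
def ols_with_intercept(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def ols_through_origin(xs, ys):
    """Least-squares fit of y = b*x (no intercept); returns b."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.1, 6.9, 9.2, 10.8, 13.5]  # illustrative values only

a, b = ols_with_intercept(xs, ys)
residuals_intercept = [y - (a + b * x) for x, y in zip(xs, ys)]

b0 = ols_through_origin(xs, ys)
residuals_origin = [y - b0 * x for x, y in zip(xs, ys)]

mean_intercept = sum(residuals_intercept) / len(xs)
mean_origin = sum(residuals_origin) / len(xs)

print("mean residual, with intercept:", mean_intercept)
print("mean residual, through origin:", mean_origin)
```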
((MARKS)) 1
((QUESTION)) Which of the following is true about heteroskedasticity?
((OPTION_A)) Linear Regression with varying error terms
((OPTION_B)) Linear Regression with constant error terms
((OPTION_C)) Linear Regression with zero error terms
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Which of the following indicates a fairly strong relationship between X and Y?
((OPTION_A)) Correlation coefficient = 0.9
((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
((OPTION_C)) The t-statistic for the null hypothesis Beta coefficient = 0 is 30
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_A)) 1, 2 & 3
((OPTION_B)) 1 & 3
((OPTION_C)) All of above
((OPTION_D))
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1
((QUESTION)) To test the linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?
((OPTION_A)) Scatter plot
((OPTION_B)) Bar chart
((OPTION_C)) Histograms
((OPTION_D)) None of these
((OPTION_E))
((CORRECT_CHOICE)) A
((EXPLANATION))
((MARKS)) 1
((QUESTION)) Generally, which of the following method(s) is used for predicting a continuous dependent variable?
1. Linear Regression
2. Logistic Regression
((OPTION_A)) 1 & 2
((OPTION_B)) Only 1
((OPTION_C)) Only 2
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) B
((EXPLANATION))
((MARKS)) 1
((QUESTION)) A correlation between the age and health of a person is found to be -1.09. On the basis of this, you would tell the doctors that:
((OPTION_A)) The age is a good predictor of health
((OPTION_B)) The age is a poor predictor of health
((OPTION_C)) None of these
((OPTION_D)) All of the above
((OPTION_E))
((CORRECT_CHOICE)) C
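The keyed answer follows from the fact that a Pearson correlation coefficient always lies in the interval [-1, 1], so a reported value of -1.09 points to a calculation error rather than to a strong or weak predictor. A minimal sketch, using made-up age and health-score values and hypothetical helper names:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def is_valid_correlation(r):
    """A correlation coefficient must lie in [-1, 1]."""
    return -1.0 <= r <= 1.0


# Illustrative data: health score falls by exactly 1 point per year of age,
# which is the strongest negative linear relationship possible.
age = [25, 35, 45, 55, 65]
health_score = [90, 80, 70, 60, 50]

r = pearson_r(age, health_score)
print("r for a perfectly decreasing relationship:", r)
print("is -1.09 a valid correlation?", is_valid_correlation(-1.09))
```

Even for a perfectly linear decreasing relationship the coefficient only reaches -1 (up to floating-point rounding).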

((OPTION_A)) Vertical offset
((OPTION_B)) Perpendicular offset
((OPTION_C)) Both but depend on situation
((OPTION_D)) Both a & b
((OPTION_E))
((CORRECT_CHOICE)) A

((OPTION_A)) Only 1
((OPTION_B)) 1 & 3
((OPTION_C)) 1 & 4
((OPTION_D)) None of the above
((OPTION_E))
((CORRECT_CHOICE)) C
((EXPLANATION))
((MARKS)) 1

((OPTION_B)) 1 is False and 2 is True

((OPTION_C)) 1 is True and 2 is False

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((OPTION_A)) It is more likely for X1 to be excluded from the model

((OPTION_B)) It is more likely for X1 to be included in the model

((OPTION_C)) Can’t say

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?

((OPTION_A)) Ridge regression uses subset selection of features

((OPTION_B)) Lasso regression uses subset selection of features

((OPTION_C)) Both use subset selection of features

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
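
Why Lasso can drop features while Ridge only shrinks them can be sketched in a few lines of Python. This assumes an orthonormal design, where the Lasso solution for each coefficient is soft-thresholding of the least-squares estimate and the Ridge solution is a uniform rescaling. The coefficient values, the penalty `lam`, and the function names are hypothetical.

```python
def lasso_shrink(b, lam):
    """Soft-thresholding: Lasso solution for one coefficient (orthonormal design assumed)."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # small coefficients are set exactly to zero

def ridge_shrink(b, lam):
    """Ridge solution for one coefficient (orthonormal design assumed)."""
    return b / (1.0 + lam)  # shrinks toward zero but never reaches it

# Hypothetical least-squares coefficients, invented for illustration only.
ols_coefs = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

lasso_coefs = [lasso_shrink(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]
```

Here the two small coefficients become exactly zero under Lasso, which acts as subset selection, while every Ridge coefficient stays non-zero.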
((MARKS)) 1

((QUESTION)) Which of the following statement(s) can be true post adding a

((OPTION_A)) 1 and 2

((OPTION_B)) 1 and 3

((OPTION_C)) 2 and 4

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) 2 and 4

((OPTION_B)) 1 and 2

((OPTION_C)) 2, 3 and 4

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((OPTION_A)) 1 and 2

((OPTION_B)) 1 and 3

((OPTION_C)) 2 and 3

((OPTION_D)) 1, 2 and 3

((OPTION_E))

((CORRECT_CHOICE)) D

((OPTION_A)) 1 and 2

((OPTION_B)) 1 and 3

((OPTION_C)) 2 and 3

((OPTION_D)) 1, 2 and 3

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) How many coefficients do you need to estimate in a simple linear regression model (one independent variable)?

((OPTION_A)) 1

((OPTION_B)) 2

((OPTION_C)) Can’t say

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B
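
The two coefficients of a simple linear regression are the intercept and the slope. A minimal Python sketch of the closed-form least-squares fit follows; the data points are invented so that y = 1 + 2x exactly, and the function name `fit_simple_linear` is an assumption.

```python
def fit_simple_linear(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

coefs = fit_simple_linear(xs, ys)  # two estimated values: intercept and slope
```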

((QUESTION)) Which of the following statements is true about the sum of residuals of A and B?

((OPTION_A)) A has higher than B

((OPTION_B)) A has lower than B

((OPTION_C)) Both have same

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?

((OPTION_A)) Yes

((OPTION_B)) No

((OPTION_C)) Both a&b

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Correlated variables can have zero correlation coefficient. True or False?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A
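
Two variables can be fully dependent and still have a Pearson coefficient of zero when the dependence is not linear. The Python sketch below uses y = x² on a symmetric range as an illustrative case; the data and the helper name `pearson_r` are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# y is completely determined by x, but the relationship is quadratic, not linear.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)  # the positive and negative halves cancel out
```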

((OPTION_A)) Only 2

((OPTION_B)) Only 1

((OPTION_C)) Only 3

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) 3.02

((OPTION_B)) 0.75

((OPTION_C)) 1.01

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Suppose the distribution of salaries in a company X has median $35,000, and the 25th and 75th percentiles are $21,000 and $53,000 respectively. Would a person with salary $1 be considered an outlier?

((OPTION_A)) Yes

((OPTION_B)) No

((OPTION_C)) More information is required

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following options is true regarding “Regression” and “Correlation”? Note: y is the dependent variable and x is the independent variable.

((OPTION_A)) The relationship is symmetric between x and y in both.

((OPTION_B)) The relationship is not symmetric between x and y in both.

((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.

((OPTION_D)) The relationship is symmetric between x and y in case of correlation but in case of regression it is not symmetric.

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION)) The correlation of x with y equals the correlation of y with x, whereas regressing y on x generally gives a different line from regressing x on y.
((MARKS)) 1

((QUESTION)) True-False: Is Logistic regression a supervised machine learning algorithm?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is Logistic regression mainly used for Regression?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is it possible to design a logistic regression algorithm using a Neural Network Algorithm?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification problem?

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following methods do we use to best fit the data in Logistic Regression?

((OPTION_A)) Least Square Error

((OPTION_B)) Maximum Likelihood

((OPTION_C)) Jaccard distance

((OPTION_D)) Both A and B

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
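
Maximum likelihood fitting means choosing the parameters under which the observed labels are most probable. As a rough sketch, the Python below evaluates the Bernoulli log-likelihood of a one-feature logistic model for two candidate parameter settings. The data, the candidate weights, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Log-likelihood of binary labels ys under p(y=1|x) = sigmoid(w*x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical data: negative x values are labelled 0, positive ones 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

good_fit = log_likelihood(2.0, 0.0, xs, ys)    # slope agrees with the labels
poor_fit = log_likelihood(-2.0, 0.0, xs, ys)   # slope points the wrong way
```

A fitting routine would search for the `w` and `b` that make this quantity as large as possible; here the first candidate scores higher than the second.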
((MARKS)) 1

((OPTION_A)) We prefer a model with minimum AIC value

((OPTION_B)) We prefer a model with maximum AIC value

((OPTION_C)) Both but depend on the situation

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
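
The preference for the minimum AIC follows from its definition, AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximised likelihood. The Python sketch below compares two candidate models; their names and log-likelihood values are invented for illustration.

```python
def aic(num_params, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical candidates: (number of parameters, maximised log-likelihood).
candidates = {
    "model_small": aic(2, -120.0),
    "model_large": aic(5, -118.5),
}

# Lower AIC is better: the larger model's small gain in fit does not
# offset its extra parameters in this made-up example.
best = min(candidates, key=candidates.get)
```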
((MARKS)) 1

((QUESTION)) True-False: Standardisation of features is required before training a Logistic Regression.

((OPTION_A)) TRUE

((OPTION_B)) FALSE

((OPTION_C))

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Which of the following algorithms do we use for Variable Selection?

((OPTION_A)) LASSO

((OPTION_B)) Ridge

((OPTION_C)) Both

((OPTION_D)) All of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?

((OPTION_A)) Odds will be 0

((OPTION_B)) Odds will be 0.5

((OPTION_C)) Odds will be 1

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
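
Odds are the probability of the event divided by the probability of its complement, so a fair coin gives 0.5 / 0.5. A one-function Python sketch follows; the name `odds` is an assumption.

```python
def odds(p):
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1.0 - p)

fair_coin_odds = odds(0.5)   # heads and tails are equally likely
biased_odds = odds(0.75)     # three times as likely to occur as not
```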
((MARKS)) 1

((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could be the range of the logit function in the domain x=[0,1]?

((OPTION_A)) (-∞, ∞)

((OPTION_B)) (0, 1)

((OPTION_C)) (0, ∞)

((OPTION_D)) (-∞, 0)

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
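
The unbounded range can be checked numerically: as the probability approaches 0 the log-odds fall without limit, and as it approaches 1 they grow without limit. A minimal Python sketch, with probe values chosen arbitrarily:

```python
import math

def logit(p):
    """Log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Probabilities close to 0, exactly 0.5, and close to 1.
values = [logit(p) for p in (1e-9, 0.5, 1 - 1e-9)]
```

The three results are a large negative number, zero, and a large positive number.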
((MARKS)) 1

((QUESTION)) Which of the following options is true?

((OPTION_A)) Linear Regression error values have to be normally distributed, but in case of Logistic Regression it is not the case

((OPTION_B)) Logistic Regression error values have to be normally distributed, but in case of Linear Regression it is not the case

((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be normally distributed

((OPTION_D)) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

((OPTION_E))

((CORRECT_CHOICE)) A

((OPTION_A)) Logistic(x) = Logit(x)

((OPTION_B)) Logistic(x) = Logit_inv(x)

((OPTION_C)) Logistic(x) = Logit(x)

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
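
That the logistic function is the inverse of the logit can be verified with a round trip: applying logit and then logistic should return the original probability. A small Python sketch, with arbitrarily chosen probabilities:

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def logistic(z):
    """Logistic (sigmoid) function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

probs = [0.1, 0.5, 0.9]
round_trip = [logistic(logit(p)) for p in probs]  # should reproduce probs
```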
((MARKS)) 2

((OPTION_A)) Training accuracy increases

((OPTION_B)) Training accuracy increases or remains the same

((OPTION_C)) Testing accuracy decreases

((OPTION_D)) Testing accuracy increases or remains the same

((OPTION_E))

((CORRECT_CHOICE)) A&D

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Choose which of the following options is true regarding the One-Vs-All method in Logistic Regression.

((OPTION_A)) We need to fit n models in an n-class classification problem

((OPTION_B)) We need to fit n-1 models to classify into n classes

((OPTION_C)) We need to fit only 1 model to classify into n classes

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
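
One-vs-all trains a separate binary model for each class, treating that class as positive and all others as negative. The Python below is a rough sketch under stated assumptions: plain batch gradient descent, a toy 3-class data set invented for illustration, and hypothetical function names. It shows only that n classes yield n fitted models.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_binary(rows, targets, lr=0.1, steps=1000):
    """Fit one binary logistic model (weights, bias) by batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(steps):
        # Prediction error p - y for every training row.
        errors = [
            sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - y
            for row, y in zip(rows, targets)
        ]
        for j in range(len(weights)):
            weights[j] -= lr * sum(e * row[j] for e, row in zip(errors, rows)) / n
        bias -= lr * sum(errors) / n
    return weights, bias

def train_one_vs_all(rows, labels):
    """Train one 'this class vs. the rest' model per distinct label."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if label == cls else 0 for label in labels]
        models[cls] = train_binary(rows, targets)
    return models

# Hypothetical 3-class toy data, invented for illustration only.
rows = [(0.0, 0.0), (0.5, 0.5), (4.0, 0.0), (4.0, 0.5), (0.0, 4.0), (0.5, 4.0)]
labels = [0, 0, 1, 1, 2, 2]

models = train_one_vs_all(rows, labels)
num_models = len(models)  # one model per class
```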
((MARKS)) 1

((OPTION_B)) Decrease the learning rate and increase the number of iterations

((OPTION_C)) Increase the learning rate and increase the number of iterations

((OPTION_D)) Increase the learning rate and decrease the number of iterations

((OPTION_E))

((CORRECT_CHOICE)) D

((OPTION_A)) A

((OPTION_B)) B

((OPTION_C)) Both

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Logistic regression is used when you want to:

((OPTION_A)) Predict a dichotomous variable from continuous or dichotomous variables.

((OPTION_B)) Predict a continuous variable from dichotomous variables.

((OPTION_C)) Predict any categorical variable from several other categorical variables.

((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables.

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The odds ratio is:

((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.

((OPTION_B)) The probability of an event occurring.

((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.

((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.

((OPTION_E))

((CORRECT_CHOICE)) C
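
In a logistic model with log-odds b0 + b1*x, the ratio of the odds after a one-unit increase in x to the odds before it equals exp(b1), whatever the starting value of x. A short Python sketch follows; the coefficients `b0` and `b1` are hypothetical values chosen for illustration.

```python
import math

def odds_at(b0, b1, x):
    """Odds of the outcome when the log-odds are b0 + b1 * x."""
    return math.exp(b0 + b1 * x)

# Hypothetical fitted coefficients, invented for illustration only.
b0, b1 = -1.0, 0.7

# Odds after a unit change in the predictor, divided by the original odds.
odds_ratio = odds_at(b0, b1, 3.0) / odds_at(b0, b1, 2.0)
same_elsewhere = odds_at(b0, b1, 11.0) / odds_at(b0, b1, 10.0)
```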

((OPTION_A)) That there are a greater number of explained vs. unexplained observations.

((OPTION_B)) That the statistical model fits the data well.

((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.

((OPTION_D)) That the statistical model is a poor fit of the data.

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) Logistic regression assumes a:

((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.

((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.

((OPTION_C)) Linear relationship between continuous predictor variables.

((OPTION_D)) Linear relationship between observations.

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))
((MARKS)) 1

((QUESTION)) In binary logistic regression:

((OPTION_A)) The dependent variable is continuous.

((OPTION_B)) The dependent variable is divided into two equal subcategories.

((OPTION_C)) The dependent variable consists of two categories.

((OPTION_D)) There is no dependent variable.

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS)) 1

((QUESTION)) The correlation coefficient is used to determine:

((OPTION_A)) A specific value of the y-variable given a specific value of the x-variable

((OPTION_B)) A specific value of the x-variable given a specific value of the y-variable

((OPTION_C)) The strength of the relationship between the x and y variables

((OPTION_D)) None

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional
((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

11 Machine Data mining

17 During the last few years, many ______ algorithms

19 if there is only a discrete number of possible

24 scikit-learn also provides functions for creating make_classifica make_regressio

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution C

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B
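
The adjustment asked about in question 18 can be sketched numerically. This is a minimal illustration using the standard adjusted R^2 formula; the values chosen for `r2`, `n_samples` and `n_predictors` are made-up examples, not taken from the quiz.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical values: the same raw R^2 with a different number of
# independent variables in the model.
few_predictors = adjusted_r_squared(r2=0.75, n_samples=21, n_predictors=2)
many_predictors = adjusted_r_squared(r2=0.75, n_samples=21, n_predictors=10)
print(few_predictors, many_predictors)
```

With the raw R^2 held fixed, the adjusted value drops as the number of independent variables grows.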

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
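
The arithmetic for question 20 can be checked with a short sketch. It relies on the identity SST = SSR + SSE, so that R^2 = SSR / SST = 1 - SSE / SST; the function name `r_squared` is only an illustrative choice.

```python
def r_squared(sst, sse):
    """Multiple coefficient of determination: R^2 = SSR / SST = 1 - SSE / SST."""
    return 1.0 - sse / sst


r2 = r_squared(sst=200.0, sse=50.0)
print(r2)  # 0.75
```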

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B
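
Stratification as described in question 29 can be sketched as follows. This is a minimal illustration, not a reference implementation; the function name `stratified_split`, the `test_fraction` parameter and the toy labels are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices so each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class only
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced data: 8 "yes" instances and 4 "no" instances.
labels = ["yes"] * 8 + ["no"] * 4
train_idx, test_idx = stratified_split(labels)
print(train_idx, test_idx)
```

Both classes keep their 2:1 ratio in the training set and in the test set.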

Ans : Solution A
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
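
The error measures named in questions 28 and 35 can be compared side by side. The sketch below uses made-up `actual` and `predicted` values purely for illustration.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute (positive) difference between computed and desired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between predicted and actual output."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


# Hypothetical values; the differences are 1, 0, -2 and -1.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)       # (1 + 0 + 2 + 1) / 4
mse = mean_squared_error(actual, predicted)        # (1 + 0 + 4 + 1) / 4
rmse = root_mean_squared_error(actual, predicted)
print(mae, mse, rmse)
```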

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution D

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

Ans : Solution C

Ans : Solution B
UNIT –II

2. What is pca.components_ in Sklearn?

A. Set of all eigenvectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B
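
What maximum likelihood means for logistic regression can be sketched on a toy problem. The data `xs`, `ys` and the two candidate parameter settings below are made-up; the sketch only evaluates the log-likelihood and does not perform the optimization itself.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w0, w1, xs, ys):
    """Sum of log P(y | x) under the model P(y = 1 | x) = logistic(w0 + w1 * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(w0 + w1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Hypothetical data: negative x values belong to class 0, positive ones to class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

good_fit = log_likelihood(0.0, 1.0, xs, ys)   # slope agrees with the data
bad_fit = log_likelihood(0.0, -1.0, xs, ys)   # slope has the wrong sign
print(good_fit, bad_fit)
```

Fitting the model amounts to searching for the parameters with the highest log-likelihood, so the first setting is preferred over the second.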

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute-value (L1) penalty; as the penalty increases, some of the
variable coefficients may become exactly zero.
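
Why an absolute penalty can zero out a coefficient while a squared penalty cannot is easiest to see in a simplified setting. The sketch below assumes a single standardized feature (an orthonormal design), where lasso reduces to soft-thresholding and ridge to proportional shrinkage; the exact scaling of `penalty` depends on how the objective is written, and the numbers are illustrative only.

```python
def lasso_coefficient(ols_coef, penalty):
    """Soft-thresholding: shrink towards zero and clip at exactly zero."""
    magnitude = max(abs(ols_coef) - penalty, 0.0)
    return magnitude if ols_coef >= 0 else -magnitude


def ridge_coefficient(ols_coef, penalty):
    """Proportional shrinkage: smaller, but never exactly zero for a nonzero input."""
    return ols_coef / (1.0 + penalty)


# Hypothetical unpenalized coefficient and penalty strength.
lasso_value = lasso_coefficient(ols_coef=0.3, penalty=0.5)
ridge_value = ridge_coefficient(ols_coef=0.3, penalty=0.5)
print(lasso_value, ridge_value)
```

A coefficient of exactly zero removes the variable from the model, which is why lasso can be used for variable selection.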

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not so
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not so
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.

14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can fit a line perfectly through the given data, so the root mean square training error will be zero.
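
The data table for this question did not survive in this copy, so the sketch below uses hypothetical points that lie exactly on a straight line. It fits Y = A0 + A1X by ordinary least squares and computes the root mean square training error.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for Y = A0 + A1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical data (not the quiz's table): every point lies on Y = 2X.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

a0, a1 = fit_line(xs, ys)
rmse = math.sqrt(sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs))
print(a0, a1, rmse)
```

When the points are perfectly collinear, every residual is zero and so is the training error.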

Question Context 35-36:

Suppose you have been given the following scenarios for training and validation error for Linear
Regression.

Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; do not be misled by the word "regression" in its name.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification
problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem by using the One-vs-All
method.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute-value (L1) penalty; as the penalty increases, some of the
variable coefficients may become exactly zero.
Context: 48-49

Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w) is viewed as a function of x, whose shape we can change by
varying the parameters w.

48 What would be the range of p in such case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For any real value of x from −∞ to +∞, the logistic function gives an output in the interval (0, 1).
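
The range of the logistic function can be checked numerically. This is a small sketch; the sample inputs are arbitrary, and the two-branch form is just one common way to avoid floating-point overflow for large negative inputs.

```python
import math


def logistic(z):
    """g(z) = 1 / (1 + e^(-z)), written to stay stable for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


values = [logistic(z) for z in (-10.0, -1.0, 0.0, 1.0, 10.0)]
print(values)
```

Every output lies strictly between 0 and 1, and the outputs increase with the input.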

49 In above question what do you think which function would make p between (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
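
The odds and logit functions from questions 50 and 51 can be sketched directly. The probabilities used below are arbitrary sample points; note that the logit is undefined at exactly p = 0 and p = 1, so the sketch only approaches those endpoints.

```python
import math


def odds(p):
    """Ratio of the probability of success to the probability of failure."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))


fair_coin_odds = odds(0.5)
logit_values = [logit(p) for p in (0.001, 0.5, 0.999)]
print(fair_coin_odds, logit_values)
```

As p moves towards 0 the logit becomes arbitrarily negative, and as p moves towards 1 it becomes arbitrarily positive.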

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not so
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not so
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution:A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is a inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
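
That the logistic function is the inverse of the logit can be verified with a round trip. This is a minimal sketch with arbitrary sample probabilities.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


probabilities = (0.1, 0.5, 0.9)
# Applying logit and then logistic should return the original probability.
round_trip = [logistic(logit(p)) for p in probabilities]
print(round_trip)
```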

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fitted, each predicting
the probability of one category against all the other categories combined.
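
The One-Vs-All setup can be sketched by building the binary targets that each of the n models would be trained on. The labels and the helper name `one_vs_all_targets` are illustrative assumptions; the sketch stops before any model fitting.

```python
def one_vs_all_targets(labels):
    """Return one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}


# Hypothetical 3-class labels.
labels = ["cat", "dog", "bird", "dog", "cat"]
binary_tasks = one_vs_all_targets(labels)

for class_name, targets in binary_tasks.items():
    print(class_name, targets)
```

Three classes give three binary problems, hence three logistic regression models.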

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about the β0 and β1 values of the two logistic models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

The black curve (X1) has β0 = 0, β1 = 1, and the green curve (X4) has β0 = 0, β1 = −1.

Context 58-60

A) A
B) B
C) C
D)None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, the model is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the highest compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than the first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform the same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty and therefore a less complex decision boundary, which is
what figure A shows.

61. Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may
face with such a huge dataset is that Logistic regression will take a very long time to train.

What would you do if you want to train logistic regression on the same data in less time while getting
comparatively similar accuracy (it may not be the same)?

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability)
for a two-class classification problem.

Which of the following images shows the cost function for y = 1?

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the correct answer because the loss decreases as the log probability increases.

63. Suppose the following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
UNIT IV

1. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B

D) None of the above options

Ans Solution: C

5 What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6 The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7 What is/are true about kernel in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.
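
Both statements can be illustrated with a degree-2 polynomial kernel, chosen here only because its feature map is small enough to write out. The kernel value computed in the original 2-D space equals a dot product in a 3-D feature space; the points `x` and `z` are made-up.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Similarity computed directly in the original 2-D space: (x . z)^2."""
    return dot(x, z) ** 2


def poly2_feature_map(x):
    """Explicit map from 2-D to 3-D that corresponds to poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Hypothetical points.
x = (1.0, 2.0)
z = (3.0, 4.0)

kernel_value = poly2_kernel(x, z)
explicit_value = dot(poly2_feature_map(x), poly2_feature_map(z))
print(kernel_value, explicit_value)
```

The kernel returns the high-dimensional similarity without ever constructing the high-dimensional vectors.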

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, the rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high level of misclassification penalty, soft margin will not hold existence as there will be no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVM’s?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

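Because training time grows at least quadratically with the number of samples, large datasets are not
well suited to SVM’s. Datasets which have a clear classification boundary will function best with SVM’s.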
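
One way to get a feel for the quadratic growth is to count the entries of the full kernel matrix, which has one entry per pair of training points. The sketch assumes the whole matrix is materialized with 8-byte floats, which real solvers often avoid; the dataset sizes are arbitrary.

```python
def kernel_matrix_entries(n_samples):
    """Number of pairwise kernel values for n training points."""
    return n_samples * n_samples


def kernel_matrix_gigabytes(n_samples, bytes_per_entry=8):
    """Memory for the full matrix, assuming 8-byte floats (an assumption)."""
    return kernel_matrix_entries(n_samples) * bytes_per_entry / 1e9


small = kernel_matrix_gigabytes(1_000)
large = kernel_matrix_gigabytes(100_000)
print(small, large)
```

A dataset 100 times larger needs 10,000 times as many kernel values.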

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The effectiveness of an SVM depends on choosing the three settings listed above so that they
maximise efficiency and reduce error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, each training point influences a wide region, so the model is heavily constrained and
cannot really capture the shape of the data.

For a high gamma, each point’s influence is local, so the model follows the shape of the dataset closely
and may overfit.
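
To make the role of gamma concrete, here is a minimal sketch of the RBF kernel in its usual form, exp(-gamma * ||x - z||^2). The two points and the gamma values are made up for illustration; they do not come from the quiz.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Two hypothetical points that are 2 units apart.
p, q = (0.0, 0.0), (2.0, 0.0)

low_gamma_similarity = rbf_kernel(p, q, gamma=0.01)   # influence reaches far
high_gamma_similarity = rbf_kernel(p, q, gamma=10.0)  # influence is very local

print(low_gamma_similarity, high_gamma_similarity)
```

With the low gamma the two points still look similar to each other, while with the high gamma the similarity is practically zero, so each point only shapes the boundary in its immediate neighbourhood.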

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use a very large value of C (C → infinity)?

Note: For a small C, the model was also classifying all data points correctly.

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVMs are highly versatile models that can be used for a wide range of real-world problems, from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be most likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing the C parameter would be the right thing to do here: a larger C weakens the regularization,
which lets an underfitting model fit the training data more closely.

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Sometimes, feature normalization is not feasible in the case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29. Suppose your problem has changed and the data now has only 2 classes. How many times do you think
we need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, suppose you increase the complexity (the degree of the polynomial of this kernel). What do
you think will happen?

A) Increasing the complexity will overfit the data

B) Increasing the complexity will underfit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithm is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5 Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6. Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given a less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers

Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of tree should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random forest you can generate hundreds of trees (say T1, T2 …..Tn) and then aggregate the
results of these tree. Which of the following is true about individual(Tk) tree in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

13. Which of the following algorithms doesn’t use learning rate as one of its hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. Number of tree should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: The bagging is suitable for high variance low bias models?

A) TRUE
B) FALSE

Solution: A

The bagging is suitable for high variance low bias models or you can say for complex models.

17. To apply bagging to regression trees which of the following is/are true in such case?

1. We build N regression trees with N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
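
The two bagging steps named above (fit N learners on N bootstrap samples, then average them) can be sketched as follows. This is only an illustration: the base learner here is a stand-in that predicts the mean of its sample, not a real regression tree, and the target values are hypothetical.

```python
import random
import statistics

def bagged_prediction(targets, n_models, seed=0):
    """Average the predictions of n_models base learners, each fitted on a
    bootstrap sample. The stand-in learner just predicts its sample mean."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(targets) items with replacement.
        sample = [rng.choice(targets) for _ in targets]
        predictions.append(statistics.fmean(sample))
    return statistics.fmean(predictions), predictions

targets = [3.0, 5.0, 4.0, 10.0, 6.0, 2.0]  # hypothetical training targets
bagged, individual = bagged_prediction(targets, n_models=200)

print(bagged)
print(statistics.pstdev(individual))  # spread of the individual learners
```

The individual predictions vary from one bootstrap sample to the next, while their average is far more stable, which is the variance reduction bagging relies on.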

18. How to select best hyper parameters in tree based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always use validation results for model selection, because they approximate performance on unseen test data.

19. In which of the following scenario a gain ratio is preferred over Information Gain?

A) When a categorical variable has very large number of category

B) When a categorical variable has very small number of category
C) Number of categories is the not the reason
D) None of these

Solution: A

For high-cardinality variables, gain ratio is preferred over the Information Gain technique.
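
A small sketch of why this matters, using the standard definitions (gain ratio = information gain divided by the split information of the attribute). The toy data, including the `row_id` and `weather` attributes, is invented for illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())

def information_gain(attribute, labels):
    """Reduction in label entropy after splitting on the attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(attribute):
        subset = [y for a, y in zip(attribute, labels) if a == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(attribute, labels):
    """Information gain divided by the split information of the attribute."""
    split_info = entropy(attribute)
    return information_gain(attribute, labels) / split_info if split_info else 0.0

# Hypothetical data: a unique row ID versus a two-valued attribute.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = [1, 2, 3, 4, 5, 6, 7, 8]
weather = ["sun", "sun", "rain", "rain", "sun", "rain", "sun", "rain"]

print(information_gain(row_id, labels), information_gain(weather, labels))
print(gain_ratio(row_id, labels), gain_ratio(weather, labels))
```

Both attributes reach the same information gain, so that criterion cannot tell the useless unique ID from the genuinely predictive attribute. The gain ratio divides by the split information, which is large for the many-valued ID, and therefore ranks it lower.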

20. Suppose you have given the following scenario for training and validation error for Gradient
Boosting. Which of the following hyper parameter would you choose in such case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select 2 because the lower depth is the
better hyperparameter choice.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n³)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows a partitioning approach.
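
The three requirements listed in the options show up directly as the inputs of a minimal k-means sketch. This is a plain Lloyd-style loop written for illustration, with made-up points and starting centroids; the number of clusters is implied by how many initial centroids are passed in.

```python
def k_means(points, initial_centroids, distance, iterations=10):
    """Plain Lloyd-style k-means; k is implied by len(initial_centroids)."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters

def squared_euclidean(a, b):
    """Distance metric used for the assignment step."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)], squared_euclidean)

print(centroids)
```

Different initial centroids can lead to a different final partition, which is why the initial guess counts as an input of the method.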

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following function is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of number of iterations.

a) True
b) False

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution C

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C

SSR = SST − SSE = 200 − 50 = 150, so the multiple coefficient of determination is 150/200 = 0.75.
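
The arithmetic behind question 20 can be checked with a short sketch, using the standard relation SSR = SST − SSE and the definition of the multiple coefficient of determination as SSR divided by SST. The function name is chosen here for illustration.

```python
def coefficient_of_determination(sst, sse):
    """R^2 = SSR / SST, where SSR = SST - SSE."""
    ssr = sst - sse
    return ssr / sst

r_squared = coefficient_of_determination(sst=200, sse=50)
print(r_squared)  # 0.75
```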

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B
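
A minimal sketch of stratification as described in question 29: the split is done separately inside each class so that the class proportions carry over to both sets. The function name, the labels and the 25% test fraction are assumptions made for this example.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed=0):
    """Return (train_indices, test_indices) with class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        # Shuffle within the class, then cut off the test share of that class.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

labels = ["pos"] * 4 + ["neg"] * 12   # hypothetical imbalanced labels
train_idx, test_idx = stratified_split(labels, test_fraction=0.25)

print([labels[i] for i in test_idx])
```

Here one of the four "pos" examples and three of the twelve "neg" examples land in the test set, so both sets keep the original 1:3 ratio.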

Ans : Solution A
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
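
The error measures that appear in these questions differ only in how each difference is treated before averaging. A short sketch with hypothetical actual and predicted values:

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))

actual = [3.0, 5.0, 2.0, 7.0]      # hypothetical target values
predicted = [2.0, 5.0, 4.0, 4.0]   # hypothetical model outputs

print(mean_absolute_error(actual, predicted))
print(mean_squared_error(actual, predicted))
print(root_mean_squared_error(actual, predicted))
```

Squaring weights the single large miss (7 versus 4) more heavily than the absolute error does, which is why the squared measures react more strongly to outliers.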

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution B

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

Ans : Solution C

Ans : Solution B
UNIT –II

2. What is pca.components_ in scikit-learn?

A) Set of all eigenvectors for the projection space
B) Matrix of principal components
C) Result of the multiplication matrix
D) None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute penalty; as the penalty increases, some of the
coefficients of the variables may become zero.
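
Why the absolute penalty produces exact zeros can be seen in a simplified setting. The sketch below assumes an orthonormal design matrix, where the lasso solution reduces to soft-thresholding each unpenalized coefficient and ridge reduces to a uniform rescaling; the coefficient values and the penalty of 0.5 are hypothetical, and the exact scaling of the penalty depends on how the objective is written.

```python
def lasso_shrink(coefficient, penalty):
    """Soft-thresholding: lasso update for one coefficient under an
    orthonormal design. Small coefficients become exactly zero."""
    if coefficient > penalty:
        return coefficient - penalty
    if coefficient < -penalty:
        return coefficient + penalty
    return 0.0

def ridge_shrink(coefficient, penalty):
    """Ridge shrinkage in the same setting: scales toward zero, never to it."""
    return coefficient / (1.0 + penalty)

ols_coefficients = [3.0, -0.4, 0.2, -2.5]   # hypothetical unpenalized fit
lasso = [lasso_shrink(c, 0.5) for c in ols_coefficients]
ridge = [ridge_shrink(c, 0.5) for c in ols_coefficients]

print(lasso)
print(ridge)
```

The two small coefficients are removed entirely by the lasso step, which is what makes it usable for variable selection, while ridge keeps every variable with a smaller weight.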

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.
14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
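
For a model with a single input, Y = A0 + A1X, the normal equation reduces to a closed form that needs no iterations. A minimal sketch, with made-up data that lies exactly on a line:

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = a0 + a1 * x (the one-feature
    case of the normal equation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1

xs = [1.0, 2.0, 3.0, 4.0]   # hypothetical inputs
ys = [3.0, 5.0, 7.0, 9.0]   # exactly y = 1 + 2x
a0, a1 = fit_line(xs, ys)

print(a0, a1)
```

Gradient descent would approach the same coefficients step by step; the closed form gets them in one pass, at the price of a matrix inversion once there are many features.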

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.
Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don’t be misled by the word “regression” in its name.

42. True-False: Is it possible to apply a logistic regression algorithm to a 3-class classification
problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the one-vs-all
method for 3-class classification in logistic regression.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute penalty; as the penalty increases, some of the
coefficients of the variables may become zero.

Context: 48-49

Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1 x),
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w), viewed as a function of x, can be reshaped by changing the
parameters w.

48 What would be the range of p in such case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For any real value of x from −∞ to +∞, the logistic function gives an output in the interval (0, 1).

49 In above question what do you think which function would make p between (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
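
The odds and logit definitions used in questions 50 and 51 can be written out directly. The probabilities passed in below are arbitrary example values.

```python
import math

def odds(p):
    """Odds = P(success) / P(failure)."""
    return p / (1.0 - p)

def logit(p):
    """Log of the odds; defined for 0 < p < 1."""
    return math.log(odds(p))

print(odds(0.5))     # fair coin: 1.0
print(logit(0.5))    # 0.0
print(logit(0.001))  # strongly negative
print(logit(0.999))  # strongly positive
```

As p approaches 0 the logit heads toward −∞, and as p approaches 1 it heads toward +∞, which matches the range given in the solution.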

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution:A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is a inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
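
The relationship in question 53 can be checked numerically: applying the logit to the output of the logistic function gives back the original input. A small sketch with arbitrary test values:

```python
import math

def logistic(x):
    """Logistic (sigmoid) function, mapping any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

for x in (-3.0, 0.0, 2.5):
    p = logistic(x)
    # Applying logit to logistic(x) recovers x, up to floating-point error.
    print(x, p, logit(p))
```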

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fitted, where the
probability of each category is predicted over the rest of the categories combined.
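
The one-vs-all setup can be sketched as a relabelling step: each class gets its own binary target vector, and one binary model would then be fitted per vector. The function name and the labels are hypothetical, and the model fitting itself is left out.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = rest)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

# Hypothetical labels for a 4-class problem.
labels = ["a", "b", "c", "d", "a", "c"]
binary_problems = one_vs_all_targets(labels)

# One binary model would be fitted per entry, so n classes give n models.
print(len(binary_problems))
print(binary_problems["a"])
```

At prediction time the class whose model reports the highest probability is usually chosen.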

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about the β0 and β1 values of the two logistic models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

A) A
B) B
C) C
D) None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, it is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the highest compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

Since more regularization means a higher penalty, it yields a less complex decision boundary, which is
what the first figure (A) shows.

61. What would you do if you want to train logistic regression on the same data in less time while
getting comparable (though possibly not identical) accuracy?

Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
with such huge data is that Logistic Regression will take a very long time to train.

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Which of the following images shows the cost function for y = 1?

The following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for
a two-class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the true answer as loss function decreases as the log probability increases

63. Suppose, Following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms a linear decision surface, but the examples in the figure are not linearly
separable.
UNIT IV

1. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
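
The tradeoff described above is visible in the usual soft-margin objective for a linear SVM, 0.5 * ||w||^2 + C * (sum of hinge losses). The sketch below only evaluates that objective for a fixed, hypothetical boundary and data set; it does not train anything.

```python
def soft_margin_objective(w, b, points, labels, C):
    """Return 0.5 * ||w||^2 + C * sum of hinge losses for a linear SVM."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge_total += max(0.0, 1.0 - y * score)
    return margin_term + C * hinge_total

# Hypothetical 1-D data: the last point sits on the wrong side of the boundary.
points = [(-2.0,), (-1.0,), (1.0,), (2.0,), (-0.5,)]
labels = [-1, -1, 1, 1, 1]
w, b = (1.0,), 0.0

low_cost = soft_margin_objective(w, b, points, labels, C=0.1)
high_cost = soft_margin_objective(w, b, points, labels, C=100.0)

print(low_cost, high_cost)
```

With a low C the single misclassified point barely changes the objective, so a smooth, wide-margin boundary remains attractive. With a high C the same point dominates, pushing the optimizer to bend the boundary until that point is classified correctly.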

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVMs are highly versatile models that can be used for a wide range of real-world problems, from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B

D) None of the above options

Ans Solution: C

5 What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7. What is/are true about kernels in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.
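
Both statements can be checked on a small case. The sketch below uses a degree-2 polynomial kernel on 2-D inputs (a choice made for this example, not taken from the quiz) and shows that the kernel value equals an ordinary dot product after an explicit map into 3-D space. The names `poly_kernel` and `feature_map` are illustrative.

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel k(x, z) = (x . z)^2 for 2-D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit 3-D feature map whose dot product reproduces poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly_kernel(x, z)                     # computed in the original 2-D space
mapped_value = dot(feature_map(x), feature_map(z))   # computed in the mapped 3-D space
print(kernel_value, mapped_value)
```

The two numbers agree, so the kernel acts as a similarity score between x and z in the higher-dimensional space without ever building that space explicitly.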

Question Context: 8 – 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, the rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high misclassification penalty, the soft margin effectively ceases to exist, as there is no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVMs?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

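Because training time grows at least quadratically with the number of samples, SVMs are not well suited to
large datasets. They work best on datasets that have a clear classification boundary.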

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The effectiveness of an SVM depends on choosing the three settings listed above so that they
maximise efficiency and reduce error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
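
The effect of gamma can be seen directly in the kernel value. The sketch below assumes the common RBF form k(x, z) = exp(-gamma * ||x - z||^2); the sample points and the function name `rbf_kernel` are illustrative.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel, assuming the form exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

origin = (0.0, 0.0)
near, far = (0.5, 0.0), (3.0, 0.0)

for gamma in (0.01, 1.0, 100.0):
    # Similarity of a nearby and a distant point to the origin.
    print(gamma, rbf_kernel(origin, near, gamma), rbf_kernel(origin, far, gamma))
```

With a small gamma even the distant point still looks similar to the origin, so every training point influences a wide region. With a large gamma the similarity vanishes almost immediately, so each point affects only its close neighbourhood and the boundary can follow the training data very tightly.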

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use a very large value of C (C -> infinity)?

Note: For small C, the model was also classifying all data points correctly

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we reach 100% training accuracy very easily, we need to check whether we are overfitting the
data.

22. Which of the following are real-world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVMs are versatile models that can be applied to a wide range of real-world problems, from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary and, after training, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer to the previous question. What do you think is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing the C parameter is the right thing to do here: a larger C weakens the regularization, so the model
fits the training data more closely.

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that a new feature will dominate the others

2. Sometimes, feature normalization is not feasible for categorical variables
3. Feature normalization always helps when we use a Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the one-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in this case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29. Suppose your problem has changed: the data now has only 2 classes. How many times do we
need to train the SVM in this case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results
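
The counts in questions 27 to 29 follow from how one-vs-all relabels the data. The sketch below builds one binary label vector per class; the class names and the helper `one_vs_all_labels` are hypothetical, and the 10 seconds per model is the figure used in the explanation above.

```python
def one_vs_all_labels(labels):
    """Return one binary (+1/-1) label list per class: that class vs the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}

labels = ["a", "b", "c", "d", "a", "c"]        # 4 classes
problems = one_vs_all_labels(labels)
seconds_per_model = 10                          # per-model training time
total_seconds = len(problems) * seconds_per_model
print(len(problems), total_seconds)

# With only two classes the two binary problems mirror each other,
# so training a single model is enough.
two_class = one_vs_all_labels(["p", "q", "p"])
print(two_class["p"], two_class["q"])
```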

Question context: 30 –31

30. Now suppose you increase the complexity (the degree of the polynomial of this kernel). What do
you think will happen?

A) Increasing the complexity will overfit the data

B) Increasing the complexity will underfit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernels in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to overfitting
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithms is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6. Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given a less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers

Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7. Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random Forest you can generate hundreds of trees (say T1, T2, …, Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of
the features for building the individual trees.

13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D

Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: Bagging is suitable for high-variance, low-bias models.

A) TRUE
B) FALSE

Solution: A

Bagging is suitable for high-variance, low-bias models, or in other words for complex models.

17. When applying bagging to regression trees, which of the following is/are true?

1. We build N regression trees from N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
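
The first two statements can be sketched in a few lines: draw N bootstrap samples, fit one tree per sample, and average the predictions. To keep the example short the base learner here is a depth-1 regression tree (a stump), which is a simplification; in practice bagging is usually applied to deeper, higher-variance trees. The data and the names `fit_stump` and `bagged_predict` are assumptions for illustration.

```python
import random

def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold, one mean per side."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm

def bagged_predict(xs, ys, x_new, n_trees=25, seed=0):
    """Average the predictions of n_trees stumps, each fit on a bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        tree = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(tree(x_new))
    return sum(preds) / len(preds)

# Hypothetical step-like data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
low_pred = bagged_predict(xs, ys, 2)
high_pred = bagged_predict(xs, ys, 7)
print(low_pred, high_pred)
```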

18. How do you select the best hyperparameters in tree-based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always consider the validation results to compare with the test result.

19. In which of the following scenarios is gain ratio preferred over Information Gain?

A) When a categorical variable has a very large number of categories

B) When a categorical variable has a very small number of categories
C) Number of categories is not the reason
D) None of these

Solution: A

For high-cardinality variables, gain ratio is preferred over the Information Gain technique.

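The reason can be seen on a toy example. Information gain rewards an attribute with one distinct value per row (such as an ID) because every group is pure, while gain ratio divides by the attribute's own entropy (the split information) and penalises it. The data and the names `row_id` and `colour` below are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(attr, labels):
    n = len(labels)
    groups = {}
    for a, y in zip(attr, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(attr, labels):
    split_info = entropy(attr)  # entropy of the attribute's own value distribution
    return info_gain(attr, labels) / split_info if split_info > 0 else 0.0

labels = [1, 1, 1, 1, 0, 0, 0, 0]
row_id = list(range(8))                              # 8 categories, one per row
colour = ["x", "x", "x", "y", "y", "y", "y", "y"]    # 2 categories

print(info_gain(row_id, labels), info_gain(colour, labels))
print(gain_ratio(row_id, labels), gain_ratio(colour, labels))
```

Information gain ranks `row_id` first, whereas gain ratio ranks `colour` first, which is the behaviour the answer relies on.
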
20. Suppose you are given the following scenarios of training and validation error for Gradient
Boosting. Which of the following hyperparameter settings would you choose in this case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select scenario 2 because the lower depth
is the better hyperparameter choice.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n³)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

- DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following functions is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of a number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.

Ml unit 3 - UNIT 3 MCQ

machine learning (Savitribai Phule Pune University)

StuDocu is not sponsored or endorsed by any college or university

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

((MARKS)) 1
Question is of how many marks? (1 or 2 or 3, up to 10)

((QUESTION)) Which of the following steps/assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?
Enter content. Question can have images also.

((OPTION_A)) The polynomial degree
This is a mandatory option.

((OPTION_B)) Whether we learn the weights by matrix inversion or gradient descent
This is also a mandatory option.

((OPTION_C)) The use of a constant term
This is optional.

((OPTION_D))
This is optional.

((OPTION_E))
This is optional. If unused, keep empty so that the system will skip this option.

((CORRECT_CHOICE)) A
Either A or B or C or D or E.

((EXPLANATION))
This is also optional.

((MARKS)) 1

((QUESTION)) Suppose you have the following data with one real-valued input variable and one real-valued output variable. What is the leave-one-out cross-validation mean square error in case of linear regression (Y = bX + c)?

((OPTION_A)) 10/27

((OPTION_B)) 20/27

((OPTION_C)) 50/27

((OPTION_D)) 49/27

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))

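The data table for this question did not survive extraction, so the procedure can only be sketched on assumed values. The three points below are hypothetical; with them, leave-one-out cross-validation of the line Y = bX + c gives 49/27, which happens to match the listed answer, but the original data cannot be confirmed from the text. Exact fractions are used so the result can be compared with the options directly.

```python
from fractions import Fraction

def fit_line(points):
    """Ordinary least squares fit of y = b*x + c; returns (b, c)."""
    n = len(points)
    mean_x = sum(Fraction(x) for x, _ in points) / n
    mean_y = sum(Fraction(y) for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return b, mean_y - b * mean_x

def loocv_mse(points):
    """Hold out each point in turn, fit on the rest, average the squared errors."""
    total = Fraction(0)
    for i, (x, y) in enumerate(points):
        b, c = fit_line(points[:i] + points[i + 1:])
        total += (y - (b * x + c)) ** 2
    return total / len(points)

points = [(0, 2), (2, 2), (3, 1)]  # hypothetical data, not from the source
print(loocv_mse(points))
```
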
((MARKS)) 1

((QUESTION)) Which of the following is/are true about “Maximum Likelihood estimate (MLE)”?
1. MLE may not always exist
2. MLE always exists
3. If MLE exists, it (they) may not be unique
4. If MLE exists, it (they) must be unique

((OPTION_A)) 1 and 4

((OPTION_B)) 2 and 3

((OPTION_C)) 1 and 3

((OPTION_D)) 2 and 4

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))

((MARKS)) 1

((OPTION_A)) You will always have test error zero

((OPTION_B)) You cannot have test error zero

((OPTION_C)) None of the above

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Which one of the statements is true regarding residuals in regression analysis?

((OPTION_A)) Mean of residuals is always zero

((OPTION_B)) Mean of residuals is always less than zero

((OPTION_C)) Mean of residuals is always greater than zero

((OPTION_D)) There is no such rule for residuals.

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))

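The zero-mean property can be checked numerically. It holds for an ordinary least squares fit that includes an intercept term, which is the setting assumed in this sketch; the data values and the helper `fit_line` are illustrative.

```python
def fit_line(xs, ys):
    """Least squares fit of y = b*x + c with an intercept; returns (b, c)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical noisy observations
b, c = fit_line(xs, ys)
residuals = [y - (b * x + c) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)
print(mean_residual)  # zero up to floating-point rounding
```
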
((MARKS)) 1

((QUESTION)) Which one of the following is true about heteroskedasticity?

((OPTION_A)) Linear Regression with varying error terms

((OPTION_B)) Linear Regression with constant error terms

((OPTION_C)) Linear Regression with zero error terms

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Which of the following indicates a fairly strong relationship between X and Y?

((OPTION_A)) Correlation coefficient = 0.9

((OPTION_B)) The p-value for the null hypothesis Beta coefficient = 0 is 0.0001

((OPTION_C)) The t-statistic for the null hypothesis Beta coefficient = 0 is 30

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Which of the following assumptions do we make while deriving linear regression parameters?
1. The true relationship between dependent y and predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with a 0 mean and constant standard deviation.

((OPTION_A)) 1, 2 & 3

((OPTION_B)) 1 & 3

((OPTION_C)) All of above

((OPTION_D))

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))

((MARKS)) 1

((QUESTION)) To test the linear relationship of y (dependent) and x (independent) continuous variables, which of the following plots is best suited?

((OPTION_A)) Scatter plot

((OPTION_B)) Bar chart

((OPTION_C)) Histograms

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Generally, which of the following method(s) is used for predicting a continuous dependent variable?
1. Linear Regression
2. Logistic Regression

((OPTION_A)) 1 & 2

((OPTION_B)) Only 1

((OPTION_C)) Only 2

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))

((MARKS)) 1

((QUESTION)) A correlation between age and health of a person is found to be -1.09. On the basis of this you would tell the doctors that:

((OPTION_A)) The age is a good predictor of health

((OPTION_B)) The age is a poor predictor of health

((OPTION_C)) None of these

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))

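A value of -1.09 cannot be a correlation coefficient, because Pearson's r always lies between -1 and 1; that is why neither "good predictor" nor "poor predictor" is the right conclusion. The sketch below computes r for a perfectly decreasing relationship, which is the most negative case possible. The age and health values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

age = [20, 30, 40, 50, 60]
health = [90, 80, 70, 60, 50]   # hypothetical scores, perfectly decreasing with age
r = pearson_r(age, health)
print(r)  # the lower bound of -1, up to floating-point rounding
```
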
((MARKS)) 1

((QUESTION)) Which of the following offsets do we use in case of a least squares line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.

((OPTION_A)) Vertical offset

((OPTION_B)) Perpendicular offset

((OPTION_C)) Both but depend on situation

((OPTION_D)) Both a & b

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Suppose we have generated the data with the help of polynomial regression of degree 3 (degree 3 will perfectly fit this data). Now consider the points below and choose the option based on these points.
1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. Polynomial of degree 3 will have low bias and high variance
4. Polynomial of degree 3 will have low bias and low variance

((OPTION_A)) Only 1

((OPTION_B)) 1 & 3

((OPTION_C)) 1 & 4

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Suppose you are training a linear regression model. Now consider these points.
1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small
Which of the above statement(s) are correct?

((OPTION_A)) Both are False

((OPTION_B)) 1 is False and 2 is True

((OPTION_C)) 1 is True and 2 is False

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) C

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Suppose we fit “Lasso Regression” to a data set which has 100 features (X1, X2, …, X100). Now, we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with the same regularization parameter. Now, which of the following options will be correct?

((OPTION_A)) It is more likely for X1 to be excluded from the model

((OPTION_B)) It is more likely for X1 to be included in the model

((OPTION_C)) Can’t say

((OPTION_D)) None of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))

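The effect of rescaling can be illustrated in a simplified setting with a single feature and no intercept, where the Lasso solution has a closed form (soft-thresholding). The objective used here, (1/2n) * sum((y - b*x)^2) + lam * |b|, is one common convention and is an assumption of this sketch, as are the data and the name `lasso_coef_1d`.

```python
def lasso_coef_1d(xs, ys, lam):
    """Minimiser of (1/2n) * sum((y - b*x)^2) + lam * |b| for one feature."""
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n   # feature/target covariance term
    z = sum(x * x for x in xs) / n                 # feature scale term
    if rho > lam:
        return (rho - lam) / z
    if rho < -lam:
        return (rho + lam) / z
    return 0.0                                     # penalty wins: feature is dropped

xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.2, 0.3, 0.4]
lam = 1.0
original = lasso_coef_1d(xs, ys, lam)
rescaled = lasso_coef_1d([10 * x for x in xs], ys, lam)
print(original, rescaled)
```

Multiplying the feature by 10 means a ten times smaller coefficient gives the same fit, so the same penalty costs less and the feature is more likely to stay in the model.
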
((MARKS)) 1

((QUESTION)) Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?

((OPTION_A)) Ridge regression uses subset selection of features

((OPTION_B)) Lasso regression uses subset selection of features

((OPTION_C)) Both use subset selection of features

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) B

((EXPLANATION))

((MARKS)) 1

((QUESTION)) Which of the following statement(s) can be true after adding a variable in a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases

((OPTION_A)) 1 and 2

((OPTION_B)) 1 and 3

((OPTION_C)) 2 and 4

((OPTION_D)) None of these

((OPTION_E))

((CORRECT_CHOICE)) A

((EXPLANATION))

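Statements 1 and 2 can both occur because adjusted R-squared charges a penalty for each extra predictor. The sketch below uses the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1); the R-squared values and sample size are hypothetical numbers chosen to show both outcomes.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

n = 30
before = adjusted_r2(0.800, n, 3)    # model with 3 predictors
useful = adjusted_r2(0.850, n, 4)    # 4th predictor raises R^2 noticeably
useless = adjusted_r2(0.801, n, 4)   # 4th predictor barely raises R^2
print(before, useful, useless)
```

In both cases R-squared goes up, but adjusted R-squared rises only when the gain outweighs the penalty for the added variable.
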
((MARKS)) 1

((OPTION_A)) 2 and 4

((OPTION_B)) 1 and 2

((OPTION_C)) 2, 3 and 4

((OPTION_D)) All of the above

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))

((MARKS)) 1

((QUESTION)) We can also compute the coefficients of linear regression with the help of an analytical method called “Normal Equation”. Which of the following is/are true about “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate

((OPTION_A)) 1 and 2

((OPTION_B)) 1 & 3

((OPTION_C)) 2 & 3

((OPTION_D)) 1, 2 & 3

((OPTION_E))

((CORRECT_CHOICE)) D

((EXPLANATION))

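A minimal sketch of the normal equation for one feature plus an intercept shows statements 1 and 3: the coefficients come out of a single closed-form solve of (X^T X) beta = X^T y, with no learning rate and no loop over iterations. The data and the name `normal_equation_1d` are assumptions for illustration.

```python
def normal_equation_1d(xs, ys):
    """Solve (X^T X) beta = X^T y for X = [1, x]; returns (intercept, slope)."""
    n = len(xs)
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx               # determinant of the 2x2 matrix X^T X
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                 # generated from y = 2x + 1
intercept, slope = normal_equation_1d(xs, ys)
print(intercept, slope)
```

Statement 2 follows from the size of X^T X: with many features the matrix to be inverted grows with the feature count, and the cost of inverting it grows roughly cubically.
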
((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
. The expected value of Y is a linear function of the X(X1,X2….Xn) variables and regression line is defined
((QUESTION)) as:
Y = β0 + β1 X1 + β2 X2……+ βn Xn
ENTER Which of the following statement(s) are true?
1. If Xi changes by an amount ∆Xi, holding other variables constant, then the expected value of Y
CONTENT. QTN changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a posit -
CAN HAVE ive or negative number).
2. The value of βi is always the same, regardless of values of the other X’s.
IMAGES ALSO 3. The total effect of the X’s on the expected value of Y is the sum of their separate effects.

((OPTION_A)) . 1 and 2

THIS IS
MANDATORY
OPTION

((OPTION_B)) 1 and 3

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 2 and 3

This is optional

((OPTION_D)) 1,2 and 3

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) How many coefficients do you need to estimate in a simple linear
regression model (one independent variable)?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) 1
THIS IS
MANDATORY
OPTION

((OPTION_B)) 2
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) CAN’T SAY

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The graphs below show two fitted regression lines (A and B) on randomly generated data. Now, I want to find the
sum of residuals in both cases A and B.
Which of the following statements is true about the sum of residuals of A and B?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) A has higher than B

THIS IS
MANDATORY
OPTION

((OPTION_B)) A has lower than B

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both have same

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
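
The following sketch illustrates why the sum of residuals comes out the same for any least-squares line fitted with an intercept: it is zero up to rounding. The two data sets are hypothetical stand-ins for graphs A and B, which are not available here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residual_sum(xs, ys):
    """Sum of (observed - fitted) over all points."""
    intercept, slope = fit_line(xs, ys)
    return sum(y - (intercept + slope * x) for x, y in zip(xs, ys))

# Hypothetical data: one nearly linear set and one noisy set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
data_a = [2.1, 3.9, 6.2, 7.8, 10.1]
data_b = [5.0, 1.0, 4.0, 9.0, 2.0]

sum_a = residual_sum(xs, data_a)
sum_b = residual_sum(xs, data_b)
print(sum_a, sum_b)  # both are zero up to floating-point rounding
```

The fit quality differs a lot between the two sets, but that shows up in the sum of squared residuals, not in the plain sum.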


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) If two variables are correlated, is it necessary that they have a linear relationship?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) YES
THIS IS
MANDATORY
OPTION

((OPTION_B)) NO
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both A and B

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Correlated variables can have a zero correlation coefficient. True or False?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
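
As an illustration, two variables can be strongly related and still have a Pearson correlation coefficient of zero when the relationship is not linear. The sketch below uses hypothetical data with y = x^2 on values of x that are symmetric around zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# y is fully determined by x, but the dependence is quadratic, not linear.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]

r = pearson(xs, ys)
print(r)
```

The coefficient only measures linear association, so the positive and negative halves of the parabola cancel out.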


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y.
Now I want to add a few new features to the data. Select the option(s) which are correct in such a case.
Note: Consider the remaining parameters to be the same.
1. Training accuracy always decreases.
2. Training accuracy always increases or remains the same.
3. Testing accuracy always decreases.
4. Testing accuracy always increases or remains the same.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Only 2
THIS IS
MANDATORY
OPTION

((OPTION_B)) Only 1
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Only 3
This is optional

((OPTION_D)) All of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The graph below represents a regression line predicting Y from X. The values on the
graph show the residuals for each predicted value. Use this information to compute
the SSE.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) 3.02

THIS IS
MANDATORY
OPTION

((OPTION_B)) 0.75
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) 1.01

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Suppose the distribution of salaries in a company X has median $35,000,
and the 25th and 75th percentiles are $21,000 and $53,000 respectively.
Would a person with a salary of $1 be considered an outlier?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) YES

THIS IS
MANDATORY
OPTION

((OPTION_B)) NO
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) More information is required

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following options is true regarding “Regression” and
“Correlation”?
Note: y is the dependent variable and x is the independent variable.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) The relationship is symmetric between x and y in both.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The relationship is not symmetric between x and y in both.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The relationship is not symmetric between x and y in case of correlation but
in case of regression it is symmetric.
This is optional

((OPTION_D)) The relationship is symmetric between x and y in case of correlation but in
case of regression it is not symmetric.
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
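
A small sketch, on hypothetical data, of the symmetry difference between the two: the correlation coefficient is unchanged when x and y are swapped, while the regression slope of y on x differs from the slope of x on y.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def ols_slope(predictor, response):
    """Least-squares slope of `response` regressed on `predictor`."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_r = sum(response) / n
    spp = sum((p - mean_p) ** 2 for p in predictor)
    spr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    return spr / spp

# Hypothetical data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

r_xy = pearson(xs, ys)
r_yx = pearson(ys, xs)
slope_y_on_x = ols_slope(xs, ys)
slope_x_on_y = ols_slope(ys, xs)
print(r_xy, r_yx)                  # identical: correlation is symmetric
print(slope_y_on_x, slope_x_on_y)  # different: regression is not
```

The two slopes are linked only through their product, which equals the squared correlation coefficient.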


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Is Logistic regression a supervised machine learning algorithm?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE

THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Is Logistic regression mainly used for Regression?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Is it possible to design a logistic regression algorithm using
a Neural Network Algorithm?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Is it possible to apply a logistic regression algorithm on a
3-class Classification problem?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following methods do we use to best fit the data in Logistic
Regression?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Least Square Error

THIS IS
MANDATORY
OPTION

((OPTION_B)) Maximum Likelihood

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Jaccard distance

This is optional

((OPTION_D)) Both A and B

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) One of the very good methods to analyze the performance of Logistic
Regression is AIC, which is similar to R-Squared in Linear Regression.
Which of the following is true about AIC?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) We prefer a model with minimum AIC value

THIS IS
MANDATORY
OPTION

((OPTION_B)) We prefer a model with maximum AIC value

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both but depend on the situation

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
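
For reference, AIC is defined as 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The sketch below compares two hypothetical models; the names and log-likelihood values are made up for illustration only.

```python
def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "model_small": (-120.0, 3),
    "model_large": (-118.5, 8),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # the lowest AIC is preferred
print(scores, best)
```

The larger model fits slightly better, but the penalty of 2 per extra parameter outweighs the gain, so the smaller model has the lower AIC.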


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) True-False: Standardisation of features is required before training a
Logistic Regression model.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) TRUE
THIS IS
MANDATORY
OPTION

((OPTION_B)) FALSE
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following algorithms do we use for Variable Selection?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) LASSO

THIS IS
MANDATORY
OPTION

((OPTION_B)) Ridge

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both
This is optional

((OPTION_D)) All of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Suppose you have been given a fair coin and you want to find out the
odds of getting heads. Which of the following options is true for such a
case?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) odds will be 0

THIS IS
MANDATORY
OPTION

((OPTION_B)) odds will be 0.5

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) odds will be 1

This is optional

((OPTION_D)) None of the above

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
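
A short worked check of the odds for a fair coin: odds are the probability of the event divided by the probability of it not happening. The helper name is an assumption for illustration.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)

fair_coin_odds = odds(0.5)  # 0.5 / 0.5
print(fair_coin_odds)
```

Probability and odds are easy to mix up here: the probability of heads is 0.5, while the odds of heads are 1.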


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The logit function (given as l(x)) is the log of odds function. What could
be the range of the logit function in the domain x = [0,1]?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) (-∞, ∞)

THIS IS
MANDATORY
OPTION

((OPTION_B)) (0,1)
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) (0, ∞)
This is optional

((OPTION_D)) (-∞, 0)
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Which of the following options is true?

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Linear Regression error values have to be normally distributed, but in the case of
Logistic Regression this is not required
THIS IS
MANDATORY
OPTION

((OPTION_B)) Logistic Regression error values have to be normally distributed, but in the case of
Linear Regression this is not required
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Both Linear Regression and Logistic Regression error values have to be
normally distributed
This is optional

((OPTION_D)) Both Linear Regression and Logistic Regression error values have not to be
normally distributed
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is the logistic function of any number “x”
Logit(x): is the logit function of any number “x”
Logit_inv(x): is the inverse logit function of any number “x”
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Logistic(x) = Logit(x)

THIS IS
MANDATORY
OPTION

((OPTION_B)) Logistic(x) = Logit_inv(x)

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Logit_inv(x) = Logit(x)

This is optional

((OPTION_D)) None of these

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
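
The relationship between the two functions can be checked numerically. The sketch below assumes the standard definitions, logit(p) = ln(p / (1 - p)) and logistic(x) = 1 / (1 + e^(-x)), and shows that applying one after the other returns the starting value.

```python
import math

def logit(p):
    """Log of odds; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def logistic(x):
    """Logistic (sigmoid) function; maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Round trip: logistic undoes logit, so logistic is the inverse logit.
probabilities = [0.01, 0.25, 0.5, 0.75, 0.99]
round_trip = [logistic(logit(p)) for p in probabilities]
print(round_trip)

# logit is negative below 0.5, zero at 0.5 and positive above it,
# and it grows without bound as p approaches 0 or 1.
print(logit(0.01), logit(0.5), logit(0.99))
```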


((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Suppose you applied a Logistic Regression model on a given data set and
got a training accuracy X and testing accuracy Y. Now, you want to add
a few new features to the same data. Select the option(s) which is/are
correct in such a case.
Note: Consider the remaining parameters to be the same.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Training accuracy increases

THIS IS
MANDATORY
OPTION

((OPTION_B)) Training accuracy increases or remains the same

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Testing accuracy decreases

This is optional

((OPTION_D)) Testing accuracy increases or remains the same

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Choose which of the following options is true regarding the One-Vs-All
method in Logistic Regression.
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) We need to fit n models in n-class classification problem

THIS IS
MANDATORY
OPTION

((OPTION_B)) We need to fit n-1 models to classify into n classes

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) We need to fit only 1 model to classify into n classes

This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
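
A rough sketch of the One-Vs-All idea follows: one binary logistic regression is fitted per class, each separating that class from all the others, and prediction picks the class whose model is most confident. The gradient-descent fitter, the function names, the hyperparameters and the toy data are all assumptions made for this illustration and are not taken from any particular library.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_binary(points, labels, lr=0.5, steps=1000):
    """Fit (w1, w2, b) for a 2-feature binary logistic regression by gradient descent."""
    w1 = w2 = b = 0.0
    n = len(points)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        for (x1, x2), t in zip(points, labels):
            err = sigmoid(w1 * x1 + w2 * x2 + b) - t
            g1 += err * x1
            g2 += err * x2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

def fit_one_vs_all(points, ys):
    """Fit one binary model per class: that class (1) versus all others (0)."""
    return {c: fit_binary(points, [1 if y == c else 0 for y in ys])
            for c in sorted(set(ys))}

def predict(models, point):
    """Return the class whose binary model gives the highest probability."""
    x1, x2 = point
    return max(models, key=lambda c: sigmoid(
        models[c][0] * x1 + models[c][1] * x2 + models[c][2]))

# Hypothetical toy data: three well-separated clusters, one per class.
points = [(0.0, 2.0), (0.5, 2.5), (-0.5, 2.2),
          (-2.0, -1.0), (-2.5, -1.2), (-1.8, -1.5),
          (2.0, -1.0), (2.5, -1.3), (1.9, -1.6)]
ys = [0, 0, 0, 1, 1, 1, 2, 2, 2]

models = fit_one_vs_all(points, ys)
print(len(models))  # one fitted model per class
```

With three classes the dictionary holds three fitted models, which is the sense in which an n-class problem needs n models under this scheme.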


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Suppose you are using a Logistic Regression model on a huge dataset. One
of the problems you may face on such huge data is that Logistic Regression
will take a very long time to train.
What would you do if you want to train logistic regression on the same data so that
it takes less time while giving comparatively similar
accuracy (it may not be the same)?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Decrease the learning rate and decrease the number of iterations
THIS IS
MANDATORY
OPTION

((OPTION_B)) Decrease the learning rate and increase the number of iterations
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Increase the learning rate and increase the number of iterations
This is optional

((OPTION_D)) Increase the learning rate and decrease the number of iterations
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) D
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 2
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Following is the loss function in logistic regression (Y-axis: loss function, X-axis: log probability) for a two-
class classification problem.
Note: Y is the target class
Which of the following images shows the cost function for y = 1?
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) A
THIS IS
MANDATORY
OPTION

((OPTION_B)) B
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) BOTH
This is optional

((OPTION_D)) NONE OF THESE

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
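
The images for this question are not available here, so the sketch below only illustrates the shape of the two cost curves. It assumes the usual log loss, where the cost is -ln(p) when y = 1 and -ln(1 - p) when y = 0, with p the predicted probability of class 1.

```python
import math

def log_loss(p, y):
    """Per-example logistic regression cost for predicted probability p and label y."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return -math.log(p) if y == 1 else -math.log(1.0 - p)

probs = [0.1, 0.5, 0.9]
loss_y1 = [log_loss(p, 1) for p in probs]  # falls towards 0 as p approaches 1
loss_y0 = [log_loss(p, 0) for p in probs]  # rises as p approaches 1
print(loss_y1)
print(loss_y0)
```

For y = 1 the cost is large when the predicted probability is near 0 and shrinks to 0 as it approaches 1; the curve for y = 0 is its mirror image.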


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) Logistic regression is used when you want to:

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Predict a dichotomous variable from continuous or dichotomous variables.

THIS IS
MANDATORY
OPTION

((OPTION_B)) Predict a continuous variable from dichotomous variables.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Predict any categorical variable from several other categorical variables.

This is optional

((OPTION_D)) Predict a continuous variable from dichotomous or continuous variables

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) A
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) The odds ratio is:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) The ratio of the probability of an event not happening to the probability of the event happening.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The probability of an event occurring.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The ratio of the odds after a unit change in the predictor to the original odds.

This is optional

((OPTION_D)) The ratio of the probability of an event happening to the probability of the event not happening.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional
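
To make the odds ratio concrete, the sketch below uses a hypothetical one-predictor logistic model with made-up coefficients b0 and b1. It computes the odds before and after a unit increase in the predictor and compares their ratio with exp(b1).

```python
import math

def logistic(z):
    """Logistic function, mapping the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients and starting value of the predictor.
b0, b1 = -1.0, 0.7
x = 2.0

odds_before = odds(logistic(b0 + b1 * x))
odds_after = odds(logistic(b0 + b1 * (x + 1.0)))
odds_ratio = odds_after / odds_before
print(odds_ratio, math.exp(b1))  # the two values agree
```

The ratio does not depend on the starting value of x, which is why exp(b1) is reported as the odds ratio for that predictor.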


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Large values of the log-likelihood statistic indicate:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) That there are a greater number of explained vs. unexplained observations.

THIS IS
MANDATORY
OPTION

((OPTION_B)) That the statistical model fits the data well.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) That as the predictor variable increases, the likelihood of the outcome occurring decreases.

This is optional

((OPTION_D)) That the statistical model is a poor fit of the data.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) Logistic regression assumes a:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) Linear relationship between continuous predictor variables and the outcome variable.

THIS IS
MANDATORY
OPTION

((OPTION_B)) Linear relationship between continuous predictor variables and the logit of the outcome variable.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) Linear relationship between continuous predictor variables.

This is optional

((OPTION_D)) Linear relationship between observations.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) B
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)
((QUESTION)) In binary logistic regression:
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) The dependent variable is continuous.

THIS IS
MANDATORY
OPTION

((OPTION_B)) The dependent variable is divided into two equal subcategories.

THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The dependent variable consists of two categories.

This is optional

((OPTION_D)) There is no dependent variable.

This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional


((MARKS)) 1
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION)) The correlation coefficient is used to determine

ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A)) A specific value of the y-variable given a specific value of the x-variable
THIS IS
MANDATORY
OPTION

((OPTION_B)) A specific value of the x-variable given a specific value of the y-variable
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C)) The strength of the relationship between the x and y variables

This is optional

((OPTION_D)) None of these
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CHOICE)) C
Either A
or B or C or D or
E

((EXPLANATION))
This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

lOMoARcPSD|7874213

((MARKS))
QUESTION IS OF
HOW MANY
MARKS? (1 OR 2
OR 3 UPTO 10)

((QUESTION))
ENTER
CONTENT. QTN
CAN HAVE
IMAGES ALSO

((OPTION_A))
THIS IS
MANDATORY
OPTION

((OPTION_B))
THIS IS ALSO
MANDATORY
OPTION

((OPTION_C))
This is optional

((OPTION_D))
This is optional

((OPTION_E))
This is optional.
If optional keep
empty so that
system will skip
this option

((CORRECT_CH
OICE)) Either A
or B or C or D or
E

((EXPLANATION
)) This is also
optional


UNIT I
1. What is classification?
a) when the output variable is a category, such as “red” or “blue” or “disease” and “no
disease”.
b) when the output variable is a real value, such as “dollars” or “weight”.

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be learned.
a) Deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution C

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
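
As a quick arithmetic check for question 20: since SST = SSR + SSE, the multiple coefficient of determination can be computed as SSR / SST or, equivalently, as 1 - SSE / SST. A minimal sketch (the function name `r_squared` is only illustrative):

```python
def r_squared(sst: float, sse: float) -> float:
    """Multiple coefficient of determination: R^2 = SSR / SST = 1 - SSE / SST."""
    ssr = sst - sse  # regression sum of squares
    return ssr / sst


r2 = r_squared(sst=200, sse=50)
print(r2)  # 0.75
```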

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B
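
Stratification can be sketched with the standard library alone. The helper below is a hypothetical illustration, not a library API: it splits the indices of each class separately so that both the training and the test set keep roughly the same class proportions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with each class represented in both sets."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test example per class, otherwise the requested share.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced data: 4 "disease" and 12 "no disease" examples.
labels = ["disease"] * 4 + ["no disease"] * 12
train_idx, test_idx = stratified_split(labels)
```

With these assumed labels the test set receives one "disease" and three "no disease" examples, mirroring the 1:3 ratio of the full data.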

Ans : Solution A
31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A
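
The error measures named in questions 28 and 35 differ only in how each individual error is aggregated. A minimal sketch on made-up values (the data and function names are assumptions for illustration):

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute (positive) differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]   # errors: 1, 0, -2, 0

mae = mean_absolute_error(actual, predicted)       # 0.75
mse = mean_squared_error(actual, predicted)        # 1.25
rmse = root_mean_squared_error(actual, predicted)  # sqrt(1.25)
```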

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution D

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

Ans : Solution C

Ans : Solution B
UNIT –II

2. What is pca.components_ in Sklearn?

A. Set of all eigenvectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
Ans A

Ans D

7. PCA works better if there is?

1. A linear structure in the data
2. If the data lies on a curved surface and not on a flat surface
3. If variables are scaled in the same unit
A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1 ,2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using a scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute-value penalty; as the penalty is increased, some of the
variable coefficients may become exactly zero.
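
Why the absolute penalty zeroes coefficients can be seen in a simplified setting. The sketch below assumes an orthonormal design matrix, in which case the lasso estimate is the soft-thresholded least-squares coefficient, while ridge (with a correspondingly scaled penalty) only shrinks proportionally. The coefficient values and the penalty `lam` are made up for illustration.

```python
def lasso_shrink(beta_ols: float, lam: float) -> float:
    """Soft-thresholding: lasso coefficient under an orthonormal design (assumption)."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # small coefficients are set exactly to zero
    return magnitude if beta_ols > 0 else -magnitude


def ridge_shrink(beta_ols: float, lam: float) -> float:
    """Proportional shrinkage: ridge coefficient under the same assumption."""
    return beta_ols / (1.0 + lam)


ols = [3.0, -0.4, 0.2, -2.5]  # hypothetical least-squares coefficients
lasso = [lasso_shrink(b, lam=0.5) for b in ols]
ridge = [ridge_shrink(b, lam=0.5) for b in ols]
```

Here lasso drops the two small coefficients entirely, which is what makes it usable for variable selection; ridge keeps all four nonzero.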

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.
14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.
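
To make the claim in question 14 concrete, a "network" consisting of a single neuron with an identity activation, trained by gradient descent on the mean squared error, is a linear regression. The following is a minimal sketch under that assumption; the data, learning rate and epoch count are arbitrary illustrative choices.

```python
def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b with one identity-activation neuron via gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = train_linear_neuron(xs, ys)
print(round(w, 4), round(b, 4))
```

The learned weight and bias approach the slope 2 and intercept 1 used to generate the data.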

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit the line on the following data so mean error will be zero.
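
The data table for question 34 is not reproduced here, so the sketch below uses hypothetical points that lie exactly on a line. It fits Y = A0 + A1X with the closed-form least-squares estimates and shows that the training RMSE is then zero.

```python
import math


def fit_simple_regression(xs, ys):
    """Closed-form least-squares estimates (A0, A1) for Y = A0 + A1 * X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


def rmse(xs, ys, a0, a1):
    """Root mean squared error of the fitted line on the given points."""
    return math.sqrt(sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs))


xs = [1.0, 2.0, 3.0, 4.0]  # hypothetical inputs
ys = [3.0, 5.0, 7.0, 9.0]  # exactly Y = 1 + 2X
a0, a1 = fit_simple_regression(xs, ys)
training_rmse = rmse(xs, ys, a0, a1)
```

Note that RMSE can never be negative, so "less than 0" is impossible regardless of the data.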

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.
Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; do not be misled by the word "regression" in its name.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the One-Vs-All
method for 3-class classification in logistic regression.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute-value penalty; as the penalty is increased, some of the
variable coefficients may become exactly zero.
Context: 48-49

Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w) is viewed as a function of x whose shape we can change by varying
the parameters w.

48. What would be the range of p in such a case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For any real value of x from −∞ to +∞, the logistic function gives an output in the interval
(0, 1).
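
A small sketch of the logistic function g(z) = 1 / (1 + e^(-z)) illustrates this range. The sample inputs are arbitrary; note also that for very large |z| floating-point rounding can return exactly 0.0 or 1.0, and this simple form overflows for extremely negative z, so the open interval holds mathematically rather than for every float.

```python
import math


def logistic(z: float) -> float:
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(z, logistic(z))
```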

49. In the above question, which function would keep p between (0, 1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

The explanation is the same as for question number 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
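
The same behaviour can be checked numerically. This sketch evaluates the logit at a few arbitrary probabilities strictly between 0 and 1 (the endpoints themselves are excluded, since the log of the odds is undefined there):

```python
import math


def logit(p: float) -> float:
    """Log of the odds, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


values = [logit(p) for p in (0.001, 0.25, 0.5, 0.75, 0.999)]
print(values)
```

The values are negative below p = 0.5, zero at p = 0.5 and positive above it, growing without bound as p approaches 0 or 1.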

52. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression
this is not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression
this is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution: A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is an inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
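
The reason can be written out in a few lines. Starting from the definition of the logit as the log of the odds and solving for p (here y denotes the logit value, a symbol introduced only for this derivation):

```latex
\begin{aligned}
y &= \operatorname{logit}(p) = \ln\frac{p}{1-p} \\
e^{y} &= \frac{p}{1-p} \\
p &= \frac{e^{y}}{1+e^{y}} = \frac{1}{1+e^{-y}} = \operatorname{logistic}(y)
\end{aligned}
```

So the inverse of the logit is exactly the logistic function.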

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Assume the remaining parameters are the same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fit, each predicting the
probability of one category against the rest of the categories combined.
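
One-Vs-All can be sketched with plain Python. Everything below (the helper names, the toy 2-D points, the learning rate and epoch count) is an assumption chosen for illustration: one binary logistic regression is trained per class, and prediction picks the class whose model reports the highest probability.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_binary(points, targets, learning_rate=0.1, epochs=2000):
    """Batch gradient descent for one binary logistic regression; returns [bias, w1, w2]."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for (x1, x2), t in zip(points, targets):
            error = sigmoid(weights[0] + weights[1] * x1 + weights[2] * x2) - t
            grads[0] += error
            grads[1] += error * x1
            grads[2] += error * x2
        weights = [w - learning_rate * g / len(points) for w, g in zip(weights, grads)]
    return weights


def train_one_vs_all(points, labels):
    """Fit one 'this class vs. the rest' model per distinct class."""
    classes = sorted(set(labels))
    return {c: train_binary(points, [1.0 if l == c else 0.0 for l in labels])
            for c in classes}


def predict(models, point):
    """Return the class whose model assigns the highest probability."""
    x1, x2 = point
    scores = {c: sigmoid(w[0] + w[1] * x1 + w[2] * x2) for c, w in models.items()}
    return max(scores, key=scores.get)


# Hypothetical, well-separated 3-class data.
points = [(0, 0), (0, 1), (1, 0), (4, 0), (4, 1), (5, 0), (0, 4), (1, 4), (0, 5)]
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
models = train_one_vs_all(points, labels)
print(len(models), predict(models, (5, 0.5)))
```

For n = 3 classes the dictionary `models` holds three fitted models, matching option A.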

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about β0 and β1 values of two logistics models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

A) A
B) B
C) C
D)None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, that model is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the highest compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than the first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose the above decision boundaries were generated for different values of regularization.
Which of the above decision boundaries shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a larger penalty, which means a less complex decision boundary, as shown
in the first figure (A).

61. What would you do if you want to train logistic regression on the same data in less time while
getting comparatively similar (though not necessarily identical) accuracy?

Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic Regression will take a very long time to train.

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Which of the following images shows the cost function for y = 1?

Following is the loss function in logistic regression (Y-axis: loss function; X-axis: log probability)
for a two-class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the correct answer, as the loss function decreases as the log probability increases.

63. Suppose the following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only the X1 and X2 variables, where X1 and X2 are binary and can take only the values 0 and 1.

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression forms only a linear decision surface, but the examples in the figure are not
linearly separable.
UNIT IV

1. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
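
The trade-off controlled by the cost parameter can be sketched with the textbook soft-margin objective, 0.5 * ||w||^2 + C * (sum of hinge losses). This is an illustration under that standard formulation; the function name and the toy data are assumptions, not part of the quiz.

```python
def soft_margin_objective(w, b, data, C):
    """0.5 * ||w||^2 + C * sum of hinge losses over (x, y) pairs with y in {-1, +1}."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge_total += max(0.0, 1.0 - y * score)  # zero when the point is outside the margin
    return regularizer + C * hinge_total

# Two well-separated points and one point on the wrong side of the boundary.
data = [((2.0,), 1), ((-2.0,), -1), ((0.5,), -1)]
w, b = (1.0,), 0.0

low_cost = soft_margin_objective(w, b, data, C=1.0)    # 0.5 + 1 * 1.5
high_cost = soft_margin_objective(w, b, data, C=10.0)  # 0.5 + 10 * 1.5
print(low_cost, high_cost)
```

The same misclassified point contributes ten times more to the objective at C = 10 than at C = 1, which is why a higher cost pushes the optimizer to classify more training points correctly at the expense of a simpler boundary.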

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVMs are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent

C) Both A and B

D) None of the above options

Ans Solution: C

5. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7. What is/are true about kernel in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, the rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

With such a high misclassification penalty, the soft margin effectively ceases to exist, as there is
no room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n^2). According to this fact, what sizes of
datasets are not best suited for SVMs?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVMs.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The effectiveness of an SVM depends on choosing the three items mentioned above in such a way that
they maximise efficiency and reduce error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. SVMs are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
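
To make the role of gamma concrete, here is a small sketch of the RBF kernel, assuming the common definition k(x, z) = exp(-gamma * ||x - z||^2). The function name and sample points are illustrative only.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = (0.0, 0.0), (1.0, 1.0)  # squared distance between them is 2

low = rbf_kernel(x, z, gamma=0.1)    # exp(-0.2): the points still look similar
high = rbf_kernel(x, z, gamma=10.0)  # exp(-20): the points look almost unrelated
print(low, high)
```

Under this definition, a higher gamma makes similarity fall off faster with distance, so each training point influences only its close neighbourhood and the boundary can follow the data more tightly.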

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use a very large value of C (C -> infinity)?

Note: For small C, all data points were also classified correctly.

A) We can still classify data correctly for the given setting of hyperparameter C
B) We cannot classify data correctly for the given setting of hyperparameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVMs are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing the C parameter would be the right thing to do here, as it will ensure a regularized model.

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that a new feature will dominate the others

2. Sometimes, feature normalization is not feasible in the case of categorical variables
3. Feature normalization always helps when we use the Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29. Suppose your problem has changed and the data now has only 2 classes. How many times do you think
we need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, suppose you increase the complexity (or the degree of the polynomial of this kernel). What
do you think will happen?

A) Increasing the complexity will overfit the data

B) Increasing the complexity will underfit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. A kernel function maps low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to be overfit
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithm is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5. Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6. Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given a less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction
of the features for building the individual trees.

13. Which of the following algorithms doesn’t use learning rate as one of its hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: Bagging is suitable for high-variance, low-bias models?

A) TRUE
B) FALSE

Solution: A

Bagging is suitable for high-variance, low-bias models, in other words for complex models.

17. To apply bagging to regression trees, which of the following is/are true?

1. We build N regression trees with N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory

18. How do you select the best hyperparameters in tree-based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always consider the validation results to compare with the test result.

19. In which of the following scenarios is gain ratio preferred over Information Gain?

A) When a categorical variable has a very large number of categories

B) When a categorical variable has a very small number of categories
C) Number of categories is not the reason
D) None of these
D) None of these

Solution: A

For high-cardinality problems, gain ratio is preferred over the Information Gain technique.
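
The effect of cardinality can be illustrated with a small sketch, assuming the usual definition of gain ratio as information gain divided by split information (the entropy of the splitting feature itself). The function names and the toy features `row_id` and `group` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(feature, target):
    """Entropy of the target minus the weighted entropy after splitting on the feature."""
    groups = defaultdict(list)
    for f, t in zip(feature, target):
        groups[f].append(t)
    n = len(target)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - remainder

def gain_ratio(feature, target):
    """Information gain normalized by the split information of the feature."""
    split_info = entropy(feature)
    return information_gain(feature, target) / split_info if split_info > 0 else 0.0

target = [0, 0, 1, 1]
row_id = ["a", "b", "c", "d"]  # high cardinality: one category per row
group = ["x", "x", "y", "y"]   # low cardinality

print(information_gain(row_id, target), information_gain(group, target))
print(gain_ratio(row_id, target), gain_ratio(group, target))
```

Both features separate the target perfectly, so information gain cannot tell them apart, while gain ratio penalizes the many-valued `row_id` feature through its larger split information.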

20. Suppose you are given the following scenarios of training and validation error for Gradient
Boosting. Which of the following hyperparameter settings would you choose in such a case?

Scenario   Depth   Training Error   Validation Error
1          2       100              110
2          4       90               105
3          6       50               100
4          8       45               105
5          10      30               150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation accuracy, but we would select 2 because the lower depth is
the better hyperparameter.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n^3)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

- DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following function is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
lOMoARcPSD|7874213

MCQ-ML

Bsc (computer science) (Savitribai Phule Pune University)

StuDocu is not sponsored or endorsed by any college or university

Downloaded by Dipali Mehta (dipalivmehta016@gmail.com)

Machine Learning Questions & Solutions

Question Context

A feature F1 can take certain values: A, B, C, D, E, & F and represents the grade of
students from a college.

1) Which of the following statements is true in the following case?

A) Feature F1 is an example of nominal variable.

B) Feature F1 is an example of ordinal variable.
C) It doesn’t belong to any of the above category.
D) Both of these

Solution: (B)

Ordinal variables are variables which have some order in their categories. For
example, grade A should be considered a higher grade than grade B.

2) Which of the following is an example of a deterministic algorithm?

A) PCA

B) K-Means

C) None of the above

Solution: (A)

A deterministic algorithm is one whose output does not change across different
runs. PCA would give the same result if we ran it again, but k-means would not.

3) [True or False] The Pearson correlation between two variables is zero, but
their values can still be related to each other.

A) TRUE

B) FALSE

Solution: (A)

Y = X^2 (with X distributed symmetrically about zero). Note that they are not only associated, but
one is a function of the other, and the Pearson correlation between them is 0.
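
This case can be verified directly. The sketch below computes the Pearson correlation from its definition for X values chosen symmetrically around zero (an assumption needed for the result to be exactly zero); the helper name `pearson` is illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]  # symmetric about zero
ys = [x ** 2 for x in xs]      # Y is fully determined by X

r = pearson(xs, ys)
print(r)
```

Pearson correlation measures only linear association, so a perfectly deterministic but symmetric relationship can still score zero.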


4) Which of the following statement(s) is / are true for Gradient Descent (GD) and
Stochastic Gradient Descent (SGD)?

1. In GD and SGD, you update a set of parameters in an iterative manner to
minimize the error function.
2. In SGD, you have to run through all the samples in your training set for a
single update of a parameter in each iteration.
3. In GD, you either use the entire data or a subset of training data to update a
parameter in each iteration.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 2 and 3

F) 1,2 and 3

Solution: (A)

In SGD, each iteration uses a batch that generally contains a random sample of the data,
whereas in GD each iteration uses all of the training observations.
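
The difference between the two update rules can be sketched on a toy one-parameter model y = w * x with squared error. Everything here (function names, learning rate, step count, data) is an illustrative assumption rather than a reference implementation.

```python
import random

def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x over a batch of (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def gd(data, lr=0.05, steps=200):
    """Gradient descent: every update uses the full training set."""
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, data)
    return w

def sgd(data, lr=0.05, steps=200, seed=0):
    """Stochastic gradient descent: every update uses one randomly chosen sample."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, [rng.choice(data)])
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x with no noise
print(gd(data), sgd(data))
```

Both loops iteratively update the same parameter to reduce the error; they differ only in how much data feeds each update, which is the point of statement 1.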

5) Which of the following hyperparameter(s), when increased, may cause random
forest to overfit the data?

1. Number of Trees
2. Depth of Tree
3. Learning Rate

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 2 and 3

F) 1,2 and 3


Solution: (B)

Usually, increasing the depth of the trees will cause overfitting. Learning rate is not a
hyperparameter in random forest. Increasing the number of trees does not cause overfitting;
it generally reduces variance.

6) Imagine, you are working with “Analytics Vidhya” and you want to develop a
machine learning algorithm which predicts the number of views on the articles.

Your analysis is based on features like author name, number of articles written by
the same author on Analytics Vidhya in the past, and a few other features. Which of
the following evaluation metrics would you choose in that case?

1. Mean Square Error

2. Accuracy
3. F1 Score

A) Only 1

B) Only 2

C) Only 3

D) 1 and 3

E) 2 and 3

F) 1 and 2

Solution:(A)

The number of views of an article is a continuous target variable, so this is a
regression problem. Mean squared error is therefore the appropriate evaluation
metric.
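
For reference, mean squared error is simply the average of the squared differences between actual and predicted values. The view counts below are made-up numbers used only to show the arithmetic.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

views_true = [120, 300, 45]  # hypothetical actual view counts
views_pred = [100, 310, 50]  # hypothetical model predictions

mse = mean_squared_error(views_true, views_pred)  # (400 + 100 + 25) / 3
print(mse)
```

Accuracy and F1 score, by contrast, are defined on discrete class labels and do not apply to a continuous target like a view count.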

7) Given below are three images (1,2,3). Which of the following options is correct
for these images?

A) 1 is tanh, 2 is ReLU and 3 is SIGMOID activation functions.

B) 1 is SIGMOID, 2 is ReLU and 3 is tanh activation functions.

C) 1 is ReLU, 2 is tanh and 3 is SIGMOID activation functions.

D) 1 is tanh, 2 is SIGMOID and 3 is ReLU activation functions.

Solution: (D)

The range of the SIGMOID function is (0, 1).

The range of the tanh function is (-1, 1).

The range of the ReLU function is [0, infinity).

So Option D is the right answer.
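
These ranges can be spot-checked by evaluating each activation at a few inputs. This is a minimal sketch using the standard definitions; `sigmoid` and `relu` are written by hand here, and `math.tanh` comes from the Python standard library.

```python
import math

def sigmoid(x):
    """Sigmoid activation: output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """ReLU activation: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

for x in (-5.0, 0.0, 5.0):
    # tanh is the only one of the three that can produce a negative output.
    print(x, sigmoid(x), math.tanh(x), relu(x))
```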

8) Below are the 8 actual values of target variable in the train file.

[0,0,0,1,1,1,1,1]

What is the entropy of the target variable?

A) -(5/8 log(5/8) + 3/8 log(3/8))

B) 5/8 log(5/8) + 3/8 log(3/8)

C) 3/8 log(5/8) + 5/8 log(3/8)


D) 5/8 log(3/8) – 3/8 log(5/8)

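Solution: (A)

The formula for entropy is

$H = -\sum_i p_i \log p_i$

where $p_i$ is the proportion of class $i$. Here the two classes have proportions 5/8 and 3/8,
so the answer is A.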
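
As a quick check, the entropy of the given target can be computed from class proportions. The sketch assumes base-2 logarithms (the quiz does not state a base), and the function name is illustrative.

```python
import math
from collections import Counter

def entropy(values, base=2):
    """Shannon entropy of a list of class labels."""
    n = len(values)
    return -sum((c / n) * math.log(c / n, base) for c in Counter(values).values())

target = [0, 0, 0, 1, 1, 1, 1, 1]  # three 0s and five 1s
h = entropy(target)

# Same quantity written out as in option A.
option_a = -(5 / 8 * math.log2(5 / 8) + 3 / 8 * math.log2(3 / 8))
print(h, option_a)
```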

9) Let’s say, you are working with categorical feature(s) and you have not looked
at the distribution of the categorical variable in the test data.

You want to apply one hot encoding (OHE) on the categorical feature(s). What
challenges might you face if you have applied OHE on a categorical variable of the
train dataset?

A) Not all categories of the categorical variable are present in the test dataset.

B) Frequency distribution of categories is different in train as compared to the test
dataset.

C) Train and Test always have same distribution.

D) Both A and B

E) None of these

Solution: (D)

Both are true. OHE will fail to encode categories that are present in test but not in
train, which is one of the main challenges when applying OHE. The challenge given in
option B is also real: you need to be more careful when applying OHE if the frequency
distribution is not the same in train and test.
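
The unseen-category problem can be sketched with a hand-rolled encoder. The helper names and colour values are hypothetical, and mapping an unseen category to an all-zero vector is just one possible policy chosen for illustration.

```python
def fit_categories(train_values):
    """Learn the category vocabulary from the training data only."""
    return sorted(set(train_values))

def one_hot(value, categories):
    """Encode a value against the training categories; unseen values become all zeros."""
    return [1 if value == c else 0 for c in categories]

train = ["red", "green", "blue", "green"]
test = ["green", "purple"]  # "purple" never appears in train

categories = fit_categories(train)
encoded = [one_hot(v, categories) for v in test]
print(categories)
print(encoded)
```

The encoder has no column for "purple", so that test row carries no category information at all, which is the challenge described for categories that differ between train and test.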

10) The skip-gram model is one of the best models used in the Word2vec algorithm for
word embeddings. Which one of the following models depicts the skip-gram
model?


A) A

B) B

C) Both A and B

D) None of these

Solution: (B)

Both models (model1 and model2) are used in the Word2vec algorithm. Model1
represents a CBOW model, whereas Model2 represents the skip-gram model.

11) Let’s say you are using activation function X in the hidden layers of a neural
network. At a particular neuron, for a given input, you get the output “-0.0001”.
Which of the following activation functions could X represent?

A) ReLU

B) tanh

C) SIGMOID

D) None of these

Solution: (B)

The function is tanh, because its output range is (-1, 1), which includes negative values;
ReLU and sigmoid never output negative values.


12) [True or False] LogLoss evaluation metric can have negative values.

A) TRUE
B) FALSE

Solution: (B)

Log loss cannot have negative values.
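
A short sketch shows why: each term of binary log loss is the negative logarithm of a probability, and the logarithm of a number in (0, 1] is never positive. The function name and the clipping constant `eps` are assumptions made for this illustration.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Binary log loss; probabilities are clipped to avoid taking log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

good = log_loss([1, 0, 1], [0.9, 0.1, 0.8])  # confident and correct: small loss
bad = log_loss([1, 0, 1], [0.1, 0.9, 0.2])   # confident and wrong: large loss
print(good, bad)
```

Even near-perfect predictions only bring the loss down towards zero, never below it.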

13) Which of the following statements is/are true about “Type-1” and “Type-2”
errors?

1. Type1 is known as false positive and Type2 is known as false negative.

2. Type1 is known as false negative and Type2 is known as false positive.
3. Type1 error occurs when we reject a null hypothesis when it is actually true.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 1 and 3

F) 2 and 3

Solution: (E)

14) Which of the following is/are one of the important step(s) to pre-process the
text in NLP based projects?

1. Stemming
2. Stop word removal
3. Object Standardization

A) 1 and 2

B) 1 and 3


C) 2 and 3

D) 1,2 and 3

Solution: (D)

Stemming is a rudimentary rule-based process of stripping the suffixes (“ing”, “ly”, “es”,
“s” etc) from a word.

Stop words are words that are not relevant to the context of the data, for
example is/am/are.

Object Standardization is also a good way to pre-process the text.

15) Suppose you want to project high dimensional data into lower dimensions.
The two most famous dimensionality reduction algorithms used here are PCA and
t-SNE. Let’s say you have applied both algorithms respectively on data “X” and
you got the datasets “X_projected_PCA” , “X_projected_tSNE”.

Which of the following statements is true for “X_projected_PCA” &
“X_projected_tSNE” ?

A) X_projected_PCA will have interpretation in the nearest neighbour space.

B) X_projected_tSNE will have interpretation in the nearest neighbour space.

C) Both will have interpretation in the nearest neighbour space.

D) None of them will have interpretation in the nearest neighbour space.

Solution: (B)

The t-SNE algorithm considers nearest-neighbour points when reducing the dimensionality of the
data. So, after using t-SNE, the reduced dimensions can also be interpreted in
nearest-neighbour space. This is not the case for PCA.

Context: 16-17

Given below are three scatter plots for two features (Image 1, 2 & 3 from left to
right).


16) In the above images, which of the following is/are examples of multi-collinear
features?

A) Features in Image 1

B) Features in Image 2

C) Features in Image 3

D) Features in Image 1 & 2

E) Features in Image 2 & 3

F) Features in Image 3 & 1

Solution: (D)

17) In previous question, suppose you have identified multi-collinear features.

Which of the following action(s) would you perform next?

1. Remove both collinear variables.

2. Instead of removing both variables, we can remove only one variable.
3. Removing correlated variables might lead to loss of information. In order to
retain those variables, we can use penalized regression models like ridge or
lasso regression.

A) Only 1

B) Only 2


C) Only 3

D) Either 1 or 3

E) Either 2 or 3

Solution: (E)

You cannot remove both features, because doing so would lose all of the information.
You should either remove only one feature or use a regularization method such as
L1 or L2.

18) Adding a non-important feature to a linear regression model may result in.

1. Increase in R-square
2. Decrease in R-square

A) Only 1 is correct

B) Only 2 is correct

C) Either 1 or 2

D) None of these

Solution: (A)

After adding a feature to the feature space, whether it is important or unimportant,
R-squared never decreases: it increases or, at worst, stays the same.

19) Suppose, you are given three variables X, Y and Z. The Pearson correlation
coefficients for (X, Y), (Y, Z) and (X, Z) are C1, C2 & C3 respectively.

A) D1= C1, D2 < C2, D3 > C3

B) D1 = C1, D2 > C2, D3 > C3

C) D1 = C1, D2 > C2, D3 < C3

D) D1 = C1, D2 < C2, D3 < C3


E) D1 = C1, D2 = C2, D3 = C3

F) Cannot be determined

Solution: (E)

The correlation between features won’t change if you add or subtract a constant
value to or from the features.
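
This invariance is easy to confirm numerically. The sketch below shifts two made-up variables by constants (+2 and -2 are arbitrary choices for illustration) and compares the Pearson correlation before and after; the helper name is illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

before = pearson(xs, ys)
after = pearson([x + 2 for x in xs], [y - 2 for y in ys])  # shift both variables
print(before, after)
```

Shifting a variable moves its mean by the same amount, so every deviation from the mean, and therefore the correlation, is unchanged.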

20) Imagine you are solving a classification problem with highly imbalanced
classes. The majority class is observed 99% of times in the training data.

Your model has 99% accuracy after taking the predictions on test data. Which of
the following is true in such a case?

1. Accuracy metric is not a good idea for imbalanced class problems.

2. Accuracy metric is a good idea for imbalanced class problems.
3. Precision and recall metrics are good for imbalanced class problems.
4. Precision and recall metrics aren’t good for imbalanced class problems.

A) 1 and 3

B) 1 and 4

C) 2 and 3

D) 2 and 4

Solution: (A)

Refer to question 4 in this article.
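
A small sketch makes the point about accuracy on imbalanced data. It assumes a hypothetical classifier that always predicts the majority class on 100 made-up samples, of which only one belongs to the minority (positive) class.

```python
y_true = [0] * 99 + [1]  # 99% majority class, 1% minority class
y_pred = [0] * 100       # a model that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
recall = true_pos / (true_pos + false_neg)

print(accuracy, recall)
```

The model reaches 99% accuracy while finding none of the minority-class cases, which recall exposes immediately. Precision is undefined here because the model makes no positive predictions.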

21) In ensemble learning, you aggregate the predictions for weak learners, so that
an ensemble of these models will give a better prediction than prediction of
individual models.

Which of the following statements is / are true for weak learners used in ensemble
model?

1. They don’t usually overfit.

2. They have high bias, so they cannot solve complex learning problems
3. They usually overfit.

A) 1 and 2

B) 1 and 3

C) 2 and 3


D) Only 1

E) Only 2

F) None of the above

Solution: (A)

Weak learners are sure about a particular part of a problem. So, they usually don’t overfit,
which means that weak learners have low variance and high bias.

22) Which of the following options is/are true for K-fold cross-validation?

1. Increasing K increases the time required to cross-validate the result.
2. Higher values of K give higher confidence in the cross-validation
result than lower values of K.
3. If K=N, it is called leave-one-out cross-validation, where N is the
number of observations.

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1,2 and 3

Solution: (D)

A larger k value means less bias towards overestimating the true expected error (as the
training folds will be closer to the total dataset) and a higher running time (as you are
getting closer to the limit case: leave-one-out CV). We also need to consider the
variance of the accuracy across the k folds while selecting k.

Question Context 23-24

Cross-validation is an important step in machine learning for hyperparameter
tuning. Let’s say you are tuning the hyperparameter “max_depth” for GBM by
selecting it from 10 different depth values (all greater than 2) for a tree-based
model using 5-fold cross-validation.

Training the algorithm (with max_depth 2) on 4 folds takes 10 seconds, and
prediction on the remaining fold takes 2 seconds.

Note: Ignore hardware dependencies from the equation.

23) Which of the following options is true for the overall execution time of 5-fold
cross-validation with 10 different values of “max_depth”?

A) Less than 100 seconds

B) 100 – 300 seconds

C) 300 – 600 seconds

D) More than or equal to 600 seconds

E) None of the above

F) Can’t estimate

Solution: (D)

Each iteration for depth “2” in 5-fold cross-validation takes 10 seconds for training and
2 seconds for testing, so 5 folds take 12*5 = 60 seconds. Since we are searching over
10 depth values, the algorithm would take 60*10 = 600 seconds. But training and
testing a model with depth greater than 2 takes more time than with depth “2”, so the
overall time would be greater than 600 seconds.
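The arithmetic above can be written as a short sketch. The function name `cv_lower_bound` is hypothetical; the timings are the ones given in the question context.

```python
TRAIN_SECONDS = 10   # training on 4 folds at max_depth 2 (from the question)
PREDICT_SECONDS = 2  # predicting on the remaining fold (from the question)
FOLDS = 5
DEPTH_VALUES = 10

def cv_lower_bound(n_folds, n_settings, train_s, predict_s):
    """Lower bound on total time: every setting costs at least as much as depth 2."""
    return n_folds * (train_s + predict_s) * n_settings

lower_bound = cv_lower_bound(FOLDS, DEPTH_VALUES, TRAIN_SECONDS, PREDICT_SECONDS)
```

Because every candidate depth is greater than 2, the real total can only be larger than this bound.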

24) In the previous question, suppose you train the same algorithm to tune 2
hyperparameters, say “max_depth” and “learning_rate”.

You want to select the right value against “max_depth” (from given 10 depth
values) and learning rate (from given 5 different learning rates). In such cases,
which of the following will represent the overall time?

A) 1000-1500 seconds

B) 1500-3000 seconds

C) More than or equal to 3000 seconds

D) None of these

Solution: (D)

Same as question number 23.

25) Given below is a scenario for training error TE and Validation error VE for a
machine learning algorithm M1. You want to choose a hyperparameter (H) based
on TE and VE.

H    TE     VE
1    105    90
2    200    85
3    250    96
4    105    85
5    300    100

Which value of H will you choose based on the above table?

A) 1

B) 2

C) 3

D) 4

E) 5

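Solution: (D)

Looking at the table, option D seems the best: H = 4 has the lowest training error (105,
tied with H = 1) and the lowest validation error (85, tied with H = 2).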

26) What would you do in PCA to get the same projection as SVD?

A) Transform data to zero mean

B) Transform data to zero median

C) Not possible

D) None of these

Solution: (A)

When the data has a zero mean vector, PCA will have the same projections as SVD;
otherwise you have to centre the data first before taking the SVD.

Question Context 27-28

Assume there is a black box algorithm, which takes training data with multiple
observations (t1, t2, t3, ..., tn) and a new observation (q1). The black box
outputs the nearest neighbor of q1 (say ti) and its corresponding class label ci.

You can also think of this black box algorithm as being the same as 1-NN (1-nearest
neighbor).

27) It is possible to construct a k-NN classification algorithm based on this black
box alone.

Note: Where n (number of training observations) is very large compared to k.

A) TRUE

B) FALSE

Solution: (A)

In the first step, you pass an observation (q1) to the black box algorithm, which
returns the nearest observation and its class.

In the second step, you throw that nearest observation out of the training data and
input the observation (q1) again. The black box algorithm will again return the nearest
observation and its class.

You need to repeat this procedure k times.
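The procedure can be sketched as follows. The black box is simulated here by a hypothetical `one_nn_black_box` function on 1-D data, and the final majority vote is an assumption about how the k collected labels are combined.

```python
from collections import Counter

def one_nn_black_box(train, query):
    """Stand-in for the black box: returns the nearest (point, label) pair to query."""
    return min(train, key=lambda item: abs(item[0] - query))

def knn_from_black_box(train, query, k):
    """Build k-NN by calling the 1-NN black box k times, removing each hit."""
    remaining = list(train)
    neighbours = []
    for _ in range(k):
        nearest = one_nn_black_box(remaining, query)
        neighbours.append(nearest)
        remaining.remove(nearest)
    # Majority vote over the k collected labels.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], neighbours

# Hypothetical 1-D training data as (value, class label) pairs.
train = [(1.0, "a"), (2.0, "a"), (3.5, "b"), (6.0, "b"), (7.0, "b")]
label, neighbours = knn_from_black_box(train, query=2.4, k=3)
```

Each call only ever uses the black box's single answer, which is why n must be large compared to k: the training set shrinks by one observation per call.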

28) Instead of using the 1-NN black box, we want to use the j-NN (j>1) algorithm as
the black box. Which of the following options is correct for finding k-NN using j-NN?

1. J must be a proper factor of k

2. J > k
3. Not possible

A) 1

B) 2

C) 3

Solution: (A)

Same as question number 27.

29) Suppose you are given 7 Scatter plots 1-7 (left to right) and you want to
compare Pearson correlation coefficients between variables of each scatterplot.

Which of the following is in the right order?

1. 1<2<3<4
2. 1>2>3 > 4
3. 7<6<5<4
4. 7>6>5>4

A) 1 and 3

B) 2 and 3

C) 1 and 4

D) 2 and 4

Solution: (B)

From image 1 to 4, the correlation decreases (in absolute value). From image 4 to 7, the
absolute value of the correlation increases, but the values are negative (for example, 0,
-0.3, -0.7, -0.99).

30) You can evaluate the performance of a binary class classification problem
using different metrics such as accuracy, log-loss, F-Score. Let’s say, you are
using the log-loss function as evaluation metric.

Which of the following options is / are true for the interpretation of log-loss as an
evaluation metric?

1. If a classifier is confident about an incorrect classification, then log-loss will
penalise it heavily.
2. If, for a particular observation, the classifier assigns a very small probability to the
correct class, then the corresponding contribution to the log-loss will be very large.
3. The lower the log-loss, the better the model.

A) 1 and 3

B) 2 and 3

C) 1 and 2

D) 1,2 and 3

Solution: (D)

The options are self-explanatory.
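The three statements can be seen in a minimal binary log-loss sketch. The probabilities below are hypothetical, and the clipping constant `eps` is an assumption added to avoid taking log(0).

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary log-loss; p_pred[i] is the predicted probability of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical predictions for a single positive example.
confident_right = log_loss([1], [0.99])
unsure = log_loss([1], [0.6])
confident_wrong = log_loss([1], [0.01])
```

The loss grows without bound as the probability assigned to the correct class approaches zero, so a confident mistake costs far more than an unsure prediction.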

Context Question 31-32

Below are five samples given in the dataset.

Note: Visual distance between the points in the image represents the actual
distance.

31) Which of the following is leave-one-out cross-validation accuracy for 3-NN (3-
nearest neighbor)?

A) 0

B) 0.4

C) 0.8

D) 1

Solution: (C)

In leave-one-out cross-validation, we select (n-1) observations for training and 1
observation for validation. Consider each point in turn as the validation point and find
the 3 nearest points to it. If you repeat this procedure for all points, you will get
the correct classification for all the positive-class points in the above figure, but the
negative class will be misclassified. Hence you will get 80% accuracy.

32) Which of the following values of K will have the least leave-one-out
cross-validation accuracy?

A) 1NN

B) 3NN

C) 4NN

D) All have same leave one out error

Solution: (A)

Every point will be misclassified by 1-NN, which means that you will get 0% accuracy.

33) Suppose you are given the below data and you want to apply a logistic
regression model for classifying it in two given classes.

You are using logistic regression with L1 regularization.

Where C is the regularization parameter and w1 & w2 are the coefficients of x1 and x2.

Which of the following options is correct when you increase the value of C from zero to a
very large value?

A) First w2 becomes zero and then w1 becomes zero

B) First w1 becomes zero and then w2 becomes zero

C) Both become zero at the same time

D) Neither becomes zero, even for a very large value of C

Solution: (B)

34) Suppose we have a dataset which can be trained with 100% accuracy with help
of a decision tree of depth 6. Now consider the points below and choose the
option based on these points.

Note: All other hyper parameters are same and other factors are not affected.

1. Depth 4 will have high bias and low variance

2. Depth 4 will have low bias and low variance

A) Only 1

B) Only 2

C) Both 1 and 2

D) None of the above

Solution: (A)

If you fit a decision tree of depth 4 to such data, it is more likely to underfit the
data. In the case of underfitting, you will have high bias and low variance.

35) Which of the following options can be used to get global minima in k-Means
Algorithm?

1. Try to run algorithm for different centroid initialization

2. Adjust number of iterations
3. Find out the optimal number of clusters

A) 2 and 3

B) 1 and 3

C) 1 and 2

D) All of above

Solution: (D)

All of the options can be tuned to find the global minimum.

36) Imagine you are working on a project which is a binary classification problem.
You trained a model on the training dataset and got the confusion matrix below on
the validation dataset.

Based on the above confusion matrix, which of the option(s) below are correct?

1. Accuracy is ~0.91
2. Misclassification rate is ~ 0.91
3. False positive rate is ~0.95
4. True positive rate is ~0.95

A) 1 and 3

B) 2 and 4

C) 1 and 4

D) 2 and 3

Solution: (C)

The accuracy (proportion of correct classifications) is (50+100)/165, which is
approximately 0.91.

The true positive rate is the proportion of actual positives that you predict correctly,
so it is 100/105 ≈ 0.95. It is also known as “sensitivity” or “recall”.
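The remaining cells of the confusion matrix can be derived from the counts quoted in the solution (165 samples, 50 true negatives, 100 true positives, 105 actual positives). The sketch below uses only those four numbers; the variable names are illustrative.

```python
# Counts taken from the solution text.
total = 165
tn = 50
tp = 100
actual_positives = 105

fn = actual_positives - tp   # positives predicted as negative
fp = total - tn - tp - fn    # negatives predicted as positive

accuracy = (tp + tn) / total
misclassification_rate = 1 - accuracy
true_positive_rate = tp / (tp + fn)
false_positive_rate = fp / (fp + tn)
```

Only statements 1 and 4 survive this check: the misclassification rate and the false positive rate both come out far below the values claimed in statements 2 and 3.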

37) For which of the following hyperparameters is a higher value better for the
decision tree algorithm?

1. Number of samples used for split

2. Depth of tree
3. Samples for leaf

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3

E) Can’t say

Solution: (E)

For all three hyperparameters (1, 2 and 3), increasing the value does not necessarily
improve performance. For example, if we have a very high value for the depth of the
tree, the resulting tree may overfit the data and not generalize well. On the other
hand, if we have a very low value, the tree may underfit the data. So, we can’t
say for sure that “higher is better”.

Context 38-39

Imagine you have a 28 * 28 image and you run a 3 * 3 convolutional neural network
on it with an input depth of 3 and an output depth of 8.

Note: Stride is 1 and you are using same padding.

38) What is the dimension of the output feature map when you are using the given
parameters?

A) 28 width, 28 height and 8 depth

B) 13 width, 13 height and 8 depth

C) 28 width, 13 height and 8 depth

D) 13 width, 28 height and 8 depth

Solution: (A)

The formula for calculating the output size is

output size = (N – F + 2P)/S + 1

where N is the input size, F is the filter size, S is the stride and P is the padding
added on each side. With same padding and a stride of 1, P = 1 for a 3 * 3 filter, so the
output size is (28 – 3 + 2)/1 + 1 = 28.

Read this article to get a better understanding.
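A small sketch of the output-size calculation, using the general form with a padding term. The helper name `conv_output_size` is hypothetical, and the stride-2, no-padding call is an assumed setting shown only for comparison.

```python
def conv_output_size(n, f, stride=1, padding=0):
    """Output width/height of a convolution: floor((n - f + 2*padding) / stride) + 1."""
    return (n - f + 2 * padding) // stride + 1

# Same padding for a 3 * 3 filter with stride 1 means one pixel of padding per side.
same_padding = conv_output_size(28, 3, stride=1, padding=1)
# Without padding the feature map shrinks.
no_padding = conv_output_size(28, 3, stride=1, padding=0)
# Assumed setting for comparison: stride 2 and no padding.
strided = conv_output_size(28, 3, stride=2, padding=0)
```

The depth of the output feature map is not given by this formula; it equals the number of filters, which is 8 here.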

39) What are the dimensions of the output feature map when you are using the
following parameters?

A) 28 width, 28 height and 8 depth

B) 13 width, 13 height and 8 depth

C) 28 width, 13 height and 8 depth

D) 13 width, 28 height and 8 depth

Solution: (B)

Same as above.

40) Suppose we were plotting the visualization for different values of C (penalty
parameter) in the SVM algorithm. For some reason, we forgot to tag the C values
with the visualizations. In that case, which of the following options best explains the
C values for the images below (1, 2, 3 left to right, so the C values are C1 for image 1,
C2 for image 2 and C3 for image 3) in the case of an RBF kernel?

A) C1 = C2 = C3

B) C1 > C2 > C3

C) C1 < C2 < C3

D) None of these

Solution: (C)

C is the penalty parameter of the error term. It controls the trade-off between a smooth
decision boundary and classifying the training points correctly. For large values of C,
the optimization will choose a smaller-margin hyperplane.

MCQ-Clustering - Clustering QUIZ

Bsc (computer science) (Savitribai Phule Pune University)

Questions & Answers

Q1. Movie Recommendation systems are an example of:

1. Classification
2. Clustering
3. Reinforcement Learning
4. Regression

Options:

A. 2 Only

B. 1 and 2

C. 1 and 3

D. 2 and 3

E. 1, 2 and 3

F. 1, 2, 3 and 4

Solution: (E)

Generally, movie recommendation systems cluster the users in a finite number of similar
groups based on their previous activities and profile. Then, at a fundamental level,
people in the same cluster are made similar recommendations.

In some scenarios, this can also be approached as a classification problem: assigning
the most appropriate movie class to a user or to a specific group of users. A movie
recommendation system can also be viewed as a reinforcement learning problem, where it
learns from its previous recommendations and improves future recommendations.

Q2. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 3

E. 1, 2 and 4

F. 1, 2, 3 and 4

Solution: (E)

Sentiment analysis at the fundamental level is the task of classifying the sentiments
represented in an image, text or speech into a set of defined sentiment classes like
happy, sad, excited, positive, negative, etc. It can also be viewed as a regression
problem for assigning a sentiment score of say 1 to 10 for a corresponding image, text or
speech.

Another way of looking at sentiment analysis is to consider it from a reinforcement
learning perspective, where the algorithm constantly learns from the accuracy of past
sentiment analysis to improve future performance.

Q3. Can decision trees be used for performing clustering?

A. True

B. False

Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often
generates natural clusters and is not dependent on any objective function.

Q4. Which of the following is the most appropriate strategy for data cleaning
before performing clustering analysis, given a less than desirable number of data
points:

1. Capping and flooring of variables

2. Removal of outliers

Options:

A. 1 only

B. 2 only

C. 1 and 2

D. None of the above

Solution: (A)

Removal of outliers is not recommended if the data points are few in number. In this
scenario, capping and flooring of variables is the most appropriate strategy.
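Capping and flooring can be sketched as clamping each value into a fixed range, so that no data point is lost. The data and the limits below are hypothetical; in practice the limits are often taken from percentiles of the variable.

```python
def cap_and_floor(values, lower, upper):
    """Clamp every value into [lower, upper] instead of dropping outliers."""
    return [min(max(v, lower), upper) for v in values]

# Hypothetical data with one extreme value on each side.
data = [-40, 12, 15, 14, 13, 16, 250]
# Hypothetical limits chosen for illustration.
cleaned = cap_and_floor(data, lower=10, upper=20)
```

The number of observations is unchanged, which is the point when the dataset is already small.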

Q5. What is the minimum no. of variables/ features required to perform clustering?

A. 0

B. 1

C. 2

D. 3

Solution: (B)

At least a single variable is required to perform clustering analysis. Clustering analysis
with a single variable can be visualized with the help of a histogram.

Q6. For two runs of K-Means clustering, is it expected to get the same clustering
results?

A. Yes

B. No

Solution: (B)

The K-Means clustering algorithm converges to a local minimum, which might also be the
global minimum in some cases but not always. Therefore, it’s advised to run the K-Means
algorithm multiple times before drawing inferences about the clusters.

However, note that it’s possible to get the same clustering results from K-Means by
setting the same seed value for each run. This simply makes the algorithm choose the
same set of random numbers for each run.

Q7. Is it possible that the assignment of observations to clusters does not change
between successive iterations in K-Means?

A. Yes

B. No

C. Can’t say

D. None of these

Solution: (A)

When the K-Means algorithm has reached a local or global minimum, it will not alter the
assignment of data points to clusters for two successive iterations.

Q8. Which of the following can act as possible termination conditions in
K-Means?

1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between
iterations (except for cases with a bad local minimum).
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.

Options:

A. 1, 3 and 4

B. 1, 2 and 3

C. 1, 2 and 4

D. All of the above

Solution: (D)

All four conditions can be used as possible termination conditions in K-Means clustering:

1. This condition limits the runtime of the clustering algorithm, but in some cases the
quality of the clustering will be poor because of an insufficient number of
iterations.
2. Except for cases with a bad local minimum, this produces a good clustering, but
runtimes may be unacceptably long.
3. This also ensures that the algorithm has converged at a minimum.

4. Terminate when RSS falls below a threshold. This criterion ensures that the
clustering is of a desired quality after termination. In practice, it’s good
to combine it with a bound on the number of iterations to guarantee termination.
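The four termination conditions can be combined in one loop. The following is a minimal 1-D sketch, not a reference implementation: the function name `kmeans_1d`, the default limits and the sample data are all assumptions made for illustration.

```python
def kmeans_1d(points, centroids, max_iter=100, rss_threshold=0.0):
    """1-D k-means that stops on any of the four termination conditions."""
    assignment = None
    for iteration in range(1, max_iter + 1):  # condition 1: fixed iteration cap
        # Assign each point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: (p - centroids[c]) ** 2)
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, new_assignment) if a == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        rss = sum((p - new_centroids[a]) ** 2 for p, a in zip(points, new_assignment))
        stop = (
            new_assignment == assignment   # condition 2: assignments unchanged
            or new_centroids == centroids  # condition 3: centroids unchanged
            or rss < rss_threshold         # condition 4: RSS below threshold
        )
        assignment, centroids = new_assignment, new_centroids
        if stop:
            break
    return centroids, assignment, iteration

# Hypothetical 1-D data with two obvious groups and a poor initialization.
centroids, assignment, iterations = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [1.0, 2.0])
```

Keeping the iteration cap alongside the RSS threshold follows the advice in item 4: a threshold that is never reached would otherwise keep the loop running.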

Q9. Which of the following clustering algorithms suffers from the problem of
convergence at local optima?

1. K-Means clustering algorithm

2. Agglomerative clustering algorithm
3. Expectation-Maximization clustering algorithm
4. Diverse clustering algorithm

Options:

A. 1 only

B. 2 and 3

C. 2 and 4

D. 1 and 3

E. 1,2 and 4

F. All of the above

Solution: (D)

Out of the options given, only the K-Means clustering algorithm and the EM clustering
algorithm have the drawback of converging at local minima.

Q10. Which of the following algorithms is most sensitive to outliers?

A. K-means clustering algorithm

B. K-medians clustering algorithm

C. K-modes clustering algorithm

D. K-medoids clustering algorithm

Solution: (A)

Out of all the options, K-Means clustering algorithm is most sensitive to outliers as it
uses the mean of cluster data points to find the cluster center.

Q11. After performing K-Means Clustering analysis on a dataset, you observed the
following dendrogram. Which of the following conclusions can be drawn from the
dendrogram?

A. There were 28 data points in clustering analysis

B. The best no. of clusters for the analyzed data points is 4

C. The proximity function used is Average-link clustering

D. The above dendrogram interpretation is not possible for K-Means clustering analysis

Solution: (D)

A dendrogram is not possible for K-Means clustering analysis. However, one can create
a clustergram based on K-Means clustering analysis.

Q12. How can Clustering (Unsupervised Learning) be used to improve the
accuracy of a Linear Regression model (Supervised Learning):

1. Creating different models for different cluster groups.

2. Creating an input feature for cluster ids as an ordinal variable.

3. Creating an input feature for cluster centroids as a continuous variable.
4. Creating an input feature for cluster size as a continuous variable.

Options:

A. 1 only

B. 1 and 2

C. 1 and 4

D. 3 only

E. 2 and 4

F. All of the above

Solution: (F)

Creating an input feature for cluster ids as ordinal variable or creating an input feature
for cluster centroids as a continuous variable might not convey any relevant information
to the regression model for multidimensional data. But for clustering in a single
dimension, all of the given methods are expected to convey meaningful information to
the regression model. For example, to cluster people in two groups based on their hair
length, storing clustering ID as ordinal variable and cluster centroids as continuous
variables will convey meaningful information.

Q13. What could be the possible reason(s) for producing two different
dendrograms using agglomerative clustering algorithm for the same dataset?

A. Proximity function used

B. No. of data points used

C. No. of variables used

D. B and C only

E. All of the above

Solution: (E)

A change in any of the proximity function, the no. of data points or the no. of variables
will lead to different clustering results and hence different dendrograms.

Q14. In the figure below, if you draw a horizontal line at y=2 on the y-axis, what will
be the number of clusters formed?

A. 1

B. 2

C. 3

D. 4

Solution: (B)

Since the number of vertical lines intersecting the red horizontal line at y=2 in the
dendrogram is 2, two clusters will be formed.

Q15. What is the most appropriate no. of clusters for the data points represented
by the following dendrogram:

A. 2

B. 4

C. 6

D. 8

Solution: (B)

The no. of clusters that best depicts the different groups can be chosen by observing
the dendrogram. The best choice is the no. of vertical lines in the dendrogram cut by a
horizontal line that can traverse the maximum distance vertically without intersecting
a cluster.

In the above example, the best choice of no. of clusters will be 4 as the red horizontal
line in the dendrogram below covers maximum vertical distance AB.

Q16. In which of the following cases will K-Means clustering fail to give good
results?

1. Data points with outliers

2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes

Options:

A. 1 and 2

B. 2 and 3

C. 2 and 4

D. 1, 2 and 4

E. 1, 2, 3 and 4

Solution: (D)

K-Means clustering algorithm fails to give good results when the data contains outliers,
the density spread of data points across the data space is different and the data points
follow non-convex shapes.

Q17. Which of the following metrics do we have for finding dissimilarity between
two clusters in hierarchical clustering?

1. Single-link
2. Complete-link
3. Average-link

Options:

A. 1 and 2

B. 1 and 3

C. 2 and 3

D. 1, 2 and 3

Solution: (D)

All of the three methods i.e. single link, complete link and average link can be used for
finding dissimilarity between two clusters in hierarchical clustering.

Q18. Which of the following are true?

1. Clustering analysis is negatively affected by multicollinearity of features

2. Clustering analysis is negatively affected by heteroscedasticity

Options:

A. 1 only

B. 2 only

C. 1 and 2

D. None of them

Solution: (A)

Clustering analysis is not negatively affected by heteroscedasticity, but the results are
negatively impacted by multicollinearity of the features/variables used in clustering, as
a correlated feature/variable will carry more weight in the distance calculation than
desired.

Q19. Given six points with the following attributes:

Which of the following clustering representations and dendrogram depicts the use
of MIN or Single link proximity function in hierarchical clustering:

Solution: (A)

For the single link or MIN version of hierarchical clustering, the proximity of two clusters
is defined to be the minimum of the distance between any two points in the different
clusters. For instance, from the table, we see that the distance between points 3 and 6 is
0.11, and that is the height at which they are joined into one cluster in the dendrogram.
As another example, the distance between clusters {3, 6} and {2, 5} is given by dist({3,
6}, {2, 5}) = min(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5)) = min(0.1483, 0.2540, 0.2843,
0.3921) = 0.1483.
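The single-link calculation above can be sketched directly from the four distances quoted in the solution. The dictionary layout and the function name `single_link` are illustrative choices.

```python
# Pairwise distances quoted in the solution above, keyed by (point, point).
DIST = {
    (3, 2): 0.1483,
    (6, 2): 0.2540,
    (3, 5): 0.2843,
    (6, 5): 0.3921,
}

def single_link(cluster_a, cluster_b, dist):
    """MIN proximity: smallest distance between any cross-cluster pair."""
    return min(dist[(a, b)] for a in cluster_a for b in cluster_b)

d = single_link([3, 6], [2, 5], DIST)
```

Only the closest pair of points matters under this definition, which is why single link tends to chain clusters together.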

Q20. Given six points with the following attributes:

Which of the following clustering representations and dendrogram depicts the use
of MAX or Complete link proximity function in hierarchical clustering:

Solution: (B)

For the complete link or MAX version of hierarchical clustering, the proximity of two
clusters is defined to be the maximum of the distance between any two points in the
different clusters. Here, too, points 3 and 6 are merged first. However, {3, 6} is then
merged with {4}, instead of {2, 5}. This is because dist({3, 6}, {4}) = max(dist(3, 4),
dist(6, 4)) = max(0.1513, 0.2216) = 0.2216, which is smaller than dist({3, 6}, {2, 5}) =
max(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5)) = max(0.1483, 0.2540, 0.2843, 0.3921)
= 0.3921 and dist({3, 6}, {1}) = max(dist(3, 1), dist(6, 1)) = max(0.2218, 0.2347) =
0.2347.

Q21. Given six points with the following attributes:

Which of the following clustering representations and dendrogram depicts the use
of Group average proximity function in hierarchical clustering:

Solution: (C)

For the group average version of hierarchical clustering, the proximity of two clusters is
defined to be the average of the pairwise proximities between all pairs of points in the
different clusters. This is an intermediate approach between MIN and MAX. It can be
expressed by the following equation:

proximity(Ci, Cj) = (sum of proximity(x, y) over all x in Ci and y in Cj) / (mi ∗ mj)

where mi and mj are the sizes of clusters Ci and Cj.

Here are the distances between some of the clusters: dist({3, 6, 4}, {1}) = (0.2218 +
0.3688 + 0.2347)/(3 ∗ 1) = 0.2751. dist({2, 5}, {1}) = (0.2357 + 0.3421)/(2 ∗ 1) = 0.2889.
dist({3, 6, 4}, {2, 5}) = (0.1483 + 0.2843 + 0.2540 + 0.3921 + 0.2042 + 0.2932)/(3 ∗ 2) =
0.2627. Because dist({3, 6, 4}, {2, 5}) is smaller than dist({3, 6, 4}, {1}) and
dist({2, 5}, {1}), these two clusters are merged at the fourth stage.
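The group-average distances can be recomputed from the pairwise distances quoted in the solution. The function name `group_average` is illustrative, and the cluster sizes are passed explicitly so the divisor can be checked against the number of pairs.

```python
def group_average(pairwise_distances, size_a, size_b):
    """Average of all cross-cluster pairwise distances."""
    assert len(pairwise_distances) == size_a * size_b
    return sum(pairwise_distances) / (size_a * size_b)

# Distances quoted in the solution above.
d_364_1 = group_average([0.2218, 0.3688, 0.2347], 3, 1)
d_25_1 = group_average([0.2357, 0.3421], 2, 1)
d_364_25 = group_average([0.1483, 0.2843, 0.2540, 0.3921, 0.2042, 0.2932], 3, 2)

# The pair of clusters with the smallest average distance is merged next.
smallest = min(d_364_1, d_25_1, d_364_25)
```

The smallest of the three values is the distance between {3, 6, 4} and {2, 5}, which matches the merge order described in the solution.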

Q22. Given, six points with the following attributes:

Which of the following clustering representations and dendrogram depicts the use
of Ward’s method proximity function in hierarchical clustering:

Solution: (D)

Ward’s method is a centroid method. A centroid method calculates the proximity between
two clusters as the distance between the centroids of the clusters. For Ward’s
method, the proximity between two clusters is defined as the increase in the squared
error that results when the two clusters are merged. Applying Ward’s method to the
sample data set of six points gives a clustering that is somewhat different from
those produced by MIN, MAX, and group average.

Q23. What should be the best choice of no. of clusters based on the following
results:

A. 1

B. 2

C. 3

D. 4

Solution: (C)

The silhouette coefficient is a measure of how similar an object is to its own cluster
compared to other clusters. The number of clusters for which the silhouette coefficient
is highest is the best choice for the number of clusters.

Q24. Which of the following is/are a valid iterative strategy for treating missing
values before clustering analysis?

A. Imputation with mean

B. Nearest Neighbor assignment

C. Imputation with Expectation Maximization algorithm

D. All of the above

Solution: (C)

All of the mentioned techniques are valid for treating missing values before clustering
analysis but only imputation with EM algorithm is iterative in its functioning.

Q25. The K-Means algorithm has some limitations. One of them is that it makes hard
assignments of points to clusters (a point either belongs completely to a cluster or
does not belong to it at all).

Note: A soft assignment can be considered as the probability of being assigned to
each cluster (say K = 3 and, for some point xn, p1 = 0.7, p2 = 0.2, p3 = 0.1).

Which of the following algorithm(s) allows soft assignments?

1. Gaussian mixture models

2. Fuzzy K-means

Options:

A. 1 only

B. 2 only

C. 1 and 2

D. None of these

Solution: (C)

Both Gaussian mixture models and Fuzzy K-means allow soft assignments.

Q26. Assume you want to cluster 7 observations into 3 clusters using the K-Means
clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the
following observations:

C1: {(2,2), (4,4), (6,6)}

C2: {(0,4), (4,0)}

C3: {(5,5), (9,9)}

What will be the cluster centroids if you want to proceed to the second iteration?

A. C1: (4,4), C2: (2,2), C3: (7,7)

B. C1: (6,6), C2: (4,4), C3: (9,9)

C. C1: (2,2), C2: (0,0), C3: (5,5)

D. None of these

Solution: (A)

Finding centroid for data points in cluster C1 = ((2+4+6)/3, (2+4+6)/3) = (4, 4)

Finding centroid for data points in cluster C2 = ((0+4)/2, (4+0)/2) = (2, 2)

Finding centroid for data points in cluster C3 = ((5+9)/2, (5+9)/2) = (7, 7)

Hence, C1: (4,4), C2: (2,2), C3: (7,7)
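The centroid update can be sketched as a coordinate-wise mean over the observations listed in the question. The helper name `centroid` is illustrative.

```python
def centroid(points):
    """Mean of each coordinate over the points in a cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Cluster memberships after the first iteration, as given in the question.
clusters = {
    "C1": [(2, 2), (4, 4), (6, 6)],
    "C2": [(0, 4), (4, 0)],
    "C3": [(5, 5), (9, 9)],
}
centroids = {name: centroid(pts) for name, pts in clusters.items()}
```

These centroids are the starting point for the reassignment step of the second iteration.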

Q27. Assume you want to cluster 7 observations into 3 clusters using the K-Means
clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the
following observations:

C1: {(2,2), (4,4), (6,6)}

C2: {(0,4), (4,0)}

C3: {(5,5), (9,9)}

What will be the Manhattan distance of observation (9, 9) from cluster centroid
C1 in the second iteration?

A. 10

B. 5*sqrt(2)

C. 13*sqrt(2)

D. None of these

Solution: (A)

Manhattan distance between centroid C1, i.e. (4, 4), and (9, 9) = |9-4| + |9-4| = 10
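As a quick check, the Manhattan distance is the sum of absolute coordinate differences. The helper name `manhattan` is illustrative.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Centroid of C1 and the observation from the question.
distance = manhattan((4, 4), (9, 9))
```

The absolute values matter when a coordinate of the observation is smaller than the centroid's; here both differences are positive.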

Q28. Two variables, V1 and V2, are used for clustering. Which of the following are
true for K-Means clustering with k = 3?

1. If V1 and V2 have a correlation of 1, the cluster centroids will be in a straight
line
2. If V1 and V2 have a correlation of 0, the cluster centroids will be in a straight
line

Options:

A. 1 only

B. 2 only

C. 1 and 2

D. None of the above

Solution: (A)

If the correlation between the variables V1 and V2 is 1, then all the data points will be in
a straight line. Hence, all the three cluster centroids will form a straight line as well.

Q29. Feature scaling is an important step before applying the K-Means algorithm. What
is the reason behind this?

A. In distance calculation it will give the same weights for all features

B. You always get the same clusters whether or not you use feature scaling

C. In Manhattan distance it is an important step but in Euclidean it is not

D. None of these

Solution: (A)

Feature scaling ensures that all the features get the same weight in the clustering
analysis. Consider a scenario of clustering people based on their weight (in kg) with
range 55-110 and height (in feet) with range 5.6 to 6.4. In this case, the clusters
produced without scaling can be very misleading, as the range of weight is much larger
than that of height. Therefore, it’s necessary to bring them to the same scale so that
they have equal weightage in the clustering result.
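The effect can be sketched with min-max scaling, which is one common choice and an assumption here, since the text does not name a scaling method. The three people below are hypothetical, with values chosen inside the ranges mentioned in the solution.

```python
def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical people: weights in the 55-110 range, heights in the 5.6-6.4 range.
weights = [55.0, 70.0, 110.0]
heights = [5.6, 6.4, 6.0]

raw = list(zip(weights, heights))
scaled = list(zip(min_max_scale(weights), min_max_scale(heights)))

# Share of the squared distance between the first two people that comes from weight.
raw_weight_share = (raw[0][0] - raw[1][0]) ** 2 / squared_distance(raw[0], raw[1])
scaled_weight_share = (scaled[0][0] - scaled[1][0]) ** 2 / squared_distance(scaled[0], scaled[1])
```

Before scaling, almost all of the distance between the first two people comes from weight, even though they sit at opposite ends of the height range; after scaling, height contributes in proportion to its own range.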

Q30. Which of the following methods is used for finding the optimal number of clusters
in the K-Means algorithm?

A. Elbow method

B. Manhattan method

C. Euclidean method

D. All of the above

E. None of these

Solution: (A)

Out of the given options, only elbow method is used for finding the optimal number of
clusters. The elbow method looks at the percentage of variance explained as a function
of the number of clusters: One should choose a number of clusters so that adding
another cluster doesn’t give much better modeling of the data.

Q31. What is true about K-Means clustering?

1. K-means is extremely sensitive to cluster center initializations
2. Bad initialization can lead to poor convergence speed
3. Bad initialization can lead to bad overall clustering

Options:

A. 1 and 3

B. 1 and 2

C. 2 and 3

D. 1, 2 and 3

Solution: (D)

All three of the given statements are true. K-means is extremely sensitive to cluster
center initialization. Also, bad initialization can lead to poor convergence speed as well
as bad overall clustering.

Q32. Which of the following can be applied to get good results for K-means
algorithm corresponding to global minima?

1. Try to run algorithm for different centroid initialization

2. Adjust number of iterations
3. Find out the optimal number of clusters

Options:

A. 2 and 3

B. 1 and 3

C. 1 and 2

D. All of above

Solution: (D)

All of these are standard practices that are used in order to obtain good clustering
results.

Q33. What should be the best choice for number of clusters based on the
following results:

A. 5

B. 6

C. 14

D. Greater than 14

Solution: (B)

Based on the above results, the best choice of number of clusters using elbow method is
6.

Q34. What should be the best choice for number of clusters based on the
following results:

A. 2

B. 4

C. 6

D. 8

Solution: (C)

Generally, a higher average silhouette coefficient indicates better clustering quality. In
this plot, the average silhouette coefficient is highest at k = 2, which on its own would
suggest 2 clusters. However, the SSE of this clustering solution (k = 2) is too large. At
k = 6, the SSE is much lower, and the average silhouette coefficient is also very high,
second only to its value at k = 2. Thus, the best choice is k = 6.

Q35. Which of the following sequences is correct for a K-Means algorithm using the
Forgy method of initialization?

1. Specify the number of clusters
2. Assign cluster centroids randomly
3. Assign each data point to the nearest cluster centroid
4. Re-assign each point to nearest cluster centroids
5. Re-compute cluster centroids

Options:

A. 1, 2, 3, 5, 4

B. 1, 3, 2, 4, 5

C. 2, 1, 3, 4, 5

D. None of these

Solution: (A)

The methods used for initialization in K means are Forgy and Random Partition. The
Forgy method randomly chooses k observations from the data set and uses these as the
initial means. The Random Partition method first randomly assigns a cluster to each
observation and then proceeds to the update step, thus computing the initial mean to be
the centroid of the cluster’s randomly assigned points.
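
The Forgy variant described above can be sketched in plain Python as follows. This is an illustrative sketch, not a reference implementation: the function name `kmeans_forgy`, the toy points, and the fixed seed are assumptions made for the example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def kmeans_forgy(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Forgy initialization: k distinct observations become the first centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels


# Two well-separated toy groups (made-up data).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans_forgy(points, k=2)
print(centroids, labels)
```

The loop alternates assignment and re-computation until the assignments stop changing, which matches the order 1, 2, 3, 5, 4 given in the solution.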

Q36. If you are using multinomial mixture models with the expectation-maximization
algorithm for clustering a set of data points into two clusters, which of the
assumptions are important?

A. All the data points follow two Gaussian distributions

B. All the data points follow n Gaussian distributions (n > 2)

C. All the data points follow two multinomial distributions

D. All the data points follow n multinomial distributions (n > 2)

Solution: (C)

In the EM algorithm for clustering, it is essential to choose the same number of clusters
as the number of distributions the data points are expected to be generated from, and the
distributions must all be of the same type.

Q37. Which of the following is/are not true about the centroid-based K-Means
clustering algorithm and the distribution-based expectation-maximization clustering
algorithm?

1. Both start with random initializations
2. Both are iterative algorithms
3. Both have strong assumptions that the data points must fulfill
4. Both are sensitive to outliers
5. Expectation maximization algorithm is a special case of K-Means
6. Both require prior knowledge of the no. of desired clusters
7. The results produced by both are non-reproducible.

Options:

A. 1 only

B. 5 only

C. 1 and 3

D. 6 and 7

E. 4, 6 and 7

F. None of the above

Solution: (B)

All of the above statements are true except the 5th: K-Means is instead a special case
of the EM algorithm in which only the centroids of the cluster distributions are calculated at
each iteration.

Q38. Which of the following is/are not true about the DBSCAN clustering algorithm?

1. For data points to be in a cluster, they must be within a distance threshold of a
core point
2. It has strong assumptions for the distribution of data points in dataspace
3. It has substantially high time complexity of order O(n^3)
4. It does not require prior knowledge of the no. of desired clusters
5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

E. 1 and 5

F. 1, 3 and 5

Solution: (D)

• DBSCAN can form a cluster of any arbitrary shape and does not have strong
assumptions for the distribution of data points in the dataspace.
• DBSCAN has a low time complexity, of order O(n log n) when a spatial index is used.

Q39. Which of the following are the high and low bounds for the existence of the
F-Score?

A. [0,1]

B. (0,1)

C. [-1,1]

D. None of the above

Solution: (A)

The lowest and highest possible values of the F score are 0 and 1, with 1 representing that
every data point is assigned to the correct cluster and 0 representing that the precision
and/or recall of the clustering analysis is 0. In clustering analysis, a high value of the F
score is desired.

Q40. Following are the results observed for clustering 6000 data points into 3
clusters: A, B and C:

What is the F1-Score with respect to cluster B?

A. 3

B. 4

C. 5

D. 6

Solution: (D)

Here,

True Positive, TP = 1200

True Negative, TN = 600 + 1600 = 2200

False Positive, FP = 1000 + 200 = 1200

False Negative, FN = 400 + 400 = 800

Therefore,

Precision = TP / (TP + FP) = 0.5

Recall = TP / (TP + FN) = 0.6

Hence,

F1 = 2 * (Precision * Recall) / (Precision + Recall) = 0.6 / 1.1 ≈ 0.545, i.e. about 0.5
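
The arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name `f1_score` is an assumption for the example, and the counts are the TP, FP and FN values listed in the solution.

```python
def f1_score(tp, fp, fn):
    """Return (precision, recall, F1) from raw confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts for cluster B as given in the solution above.
precision, recall, f1 = f1_score(tp=1200, fp=1200, fn=800)
print(precision, recall, round(f1, 3))
```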

MCQ-KNN - KNN QuiZ

Bsc (computer science) (Savitribai Phule Pune University)

Skill test Questions and Answers

1) [True or False] The k-NN algorithm does more computation at test time than at
train time.

A) TRUE
B) FALSE

Solution: A

The training phase of the algorithm consists only of storing the feature vectors and class
labels of the training samples.

In the testing phase, a test point is classified by assigning the label which is most
frequent among the k training samples nearest to that query point, hence the higher
computation.
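
A minimal sketch of this "lazy" behaviour in Python is shown below. The class name `KNNClassifier` and the toy data are assumptions for illustration; `fit` only stores the samples, while `predict_one` does all the distance work.

```python
import math
from collections import Counter


class KNNClassifier:
    """Toy k-NN classifier using Euclidean distance and majority voting."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        # "Training" is nothing more than storing feature vectors and labels.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict_one(self, query):
        # Test time: compute a distance to every stored sample, keep the k nearest.
        nearest = sorted(
            zip(self.X, self.y), key=lambda pair: math.dist(pair[0], query)
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Made-up training data with two classes.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["-", "-", "-", "+", "+", "+"]
model = KNNClassifier(k=3).fit(X, y)
print(model.predict_one((0.5, 0.5)), model.predict_one((5.5, 5.5)))
```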

2) In the image below, which would be the best value for k assuming that the
algorithm you are using is k-Nearest Neighbor.

A) 3
B) 10
C) 20
D) 50

Solution: B

Validation error is lowest when the value of k is 10, so it is best to use this value of k.

3) Which of the following distance metrics cannot be used in k-NN?

A) Manhattan
B) Minkowski
C) Tanimoto
D) Jaccard
E) Mahalanobis
F) All can be used

Solution: F

All of these distance metrics can be used with k-NN.

4) Which of the following options is true about the k-NN algorithm?

A) It can be used for classification
B) It can be used for regression
C) It can be used in both classification and regression

Solution: C

We can also use k-NN for regression problems. In this case the prediction can be based
on the mean or the median of the k-most similar instances.
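
As a rough illustration of k-NN regression, the sketch below averages the targets of the k nearest training rows. The function name `knn_regress` and the data are hypothetical; swapping `statistics.mean` for `statistics.median` gives the median-based variant, which is less affected by an outlying neighbour.

```python
import math
import statistics


def knn_regress(train, query, k, aggregate=statistics.mean):
    """Predict a numeric target from the k nearest (features, target) rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return aggregate([target for _, target in nearest])


# Made-up one-feature data; the third target is deliberately an outlier.
train = [((1.0,), 10.0), ((2.0,), 12.0), ((3.0,), 50.0), ((10.0,), 100.0)]

mean_prediction = knn_regress(train, (2.0,), k=3)
median_prediction = knn_regress(train, (2.0,), k=3, aggregate=statistics.median)
print(mean_prediction, median_prediction)
```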

5) Which of the following statement is true about k-NN algorithm?

1. k-NN performs much better if all of the data have the same scale
2. k-NN works well with a small number of input variables (p), but struggles when
the number of inputs is very large
3. k-NN makes no assumptions about the functional form of the problem being
solved

A) 1 and 2
B) 1 and 3
C) Only 1
D) All of the above

Solution: D

The above-mentioned statements are all properties of the k-NN algorithm.

6) Which of the following machine learning algorithms can be used for imputing
missing values of both categorical and continuous variables?

A) K-NN
B) Linear Regression
C) Logistic Regression

Solution: A

The k-NN algorithm can be used for imputing missing values of both categorical and
continuous variables.

7) Which of the following is true about Manhattan distance?

A) It can be used for continuous variables

B) It can be used for categorical variables
C) It can be used for categorical as well as continuous
D) None of these

Solution: A

Manhattan Distance is designed for calculating the distance between real valued
features.

8) Which of the following distance measures do we use in case of categorical
variables in k-NN?

1. Hamming Distance
2. Euclidean Distance
3. Manhattan Distance

A) 1
B) 2
C) 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3

Solution: A

Both Euclidean and Manhattan distances are used in case of continuous variables,
whereas hamming distance is used in case of categorical variable.
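
For categorical features, the Hamming distance simply counts the positions where two records disagree. A small sketch, with a hypothetical helper name and made-up records:

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


# Made-up categorical records: (colour, size, shape).
first = ("red", "small", "round")
second = ("red", "large", "round")
print(hamming_distance(first, second))
```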

9) Which of the following will be the Euclidean distance between the two data points
A(1,3) and B(2,3)?

A) 1
B) 2
C) 4
D) 8

Solution: A

sqrt( (1-2)^2 + (3-3)^2) = sqrt(1^2 + 0^2) = 1

10) Which of the following will be the Manhattan distance between the two data points
A(1,3) and B(2,3)?

A) 1
B) 2
C) 4
D) 8

Solution: A

|1-2| + |3-3| = 1 + 0 = 1 (Manhattan distance is a sum of absolute differences; no square root is taken)
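
Both distances for A(1,3) and B(2,3) can be verified with a short Python sketch. The function names are assumptions for the example; the second pair of points, (4, 4) and (9, 9), shows a case where the two metrics differ.

```python
import math


def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan_distance(p, q):
    """Sum of absolute coordinate differences (no square root)."""
    return sum(abs(a - b) for a, b in zip(p, q))


A, B = (1, 3), (2, 3)
print(euclidean_distance(A, B), manhattan_distance(A, B))

# The metrics coincide above only because A and B differ in one coordinate.
print(euclidean_distance((4, 4), (9, 9)), manhattan_distance((4, 4), (9, 9)))
```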

Context: 11-12

Suppose you are given the following data, where x and y are the 2 input variables and
Class is the dependent variable.

Below is a scatter plot which shows the above data in 2D space.

11) Suppose you want to predict the class of a new data point x=1 and y=1 using
Euclidean distance in 3-NN. To which class does this data point belong?

A) + Class
B) – Class

C) Can’t say

D) None of these

Solution: A

All three nearest points are of + class, so this point will be classified as + class.

12) In the previous question, suppose you now want to use 7-NN instead of 3-NN. To which
class will x=1 and y=1 belong?

A) + Class
B) – Class

C) Can’t say

Solution: B

Now this point will be classified as – class because there are 4 – class points and
3 + class points in the nearest circle.

Context 13-14:

Suppose you are given the following 2-class data where “+” represents the positive class
and “–” represents the negative class.

13) Which of the following values of k in k-NN would minimize the leave-one-out
cross-validation error?

A) 3
B) 5
C) Both have same
D) None of these

Solution: B

5-NN will have the least leave-one-out cross-validation error.

14) Which of the following would be the leave-one-out cross-validation accuracy for
k=5?

A) 2/14
B) 4/14
C) 6/14
D) 8/14
E) None of the above

Solution: E

In 5-NN we will have 10/14 leave one out cross validation accuracy.

15) Which of the following will be true about k in k-NN in terms of bias?

A) When you increase k, the bias increases
B) When you decrease k, the bias increases
C) Can’t say
D) None of these

Solution: A

A large k means a simple model, and a simple model is always considered high bias.

16) Which of the following will be true about k in k-NN in terms of variance?

A) When you increase k, the variance increases
B) When you decrease k, the variance increases
C) Can’t say
D) None of these

Solution: B

A simple model (large k) is considered a low-variance model, so decreasing k increases the variance.

17) The following two distances (Euclidean distance and Manhattan distance), which we
generally use in the k-NN algorithm, are given to you. These distances are
between two points A(x1,y1) and B(x2,y2).

Your task is to tag both distances by looking at the following two graphs. Which of
the following options is true about the graphs below?

A) Left is Manhattan Distance and right is Euclidean Distance
B) Left is Euclidean Distance and right is Manhattan Distance
C) Neither left nor right is a Manhattan Distance
D) Neither left nor right is a Euclidean Distance

Solution: B

Left is the graphical depiction of how Euclidean distance works, whereas the right one is of
Manhattan distance.

18) When you find noise in data which of the following option would you consider
in k-NN?

A) I will increase the value of k

B) I will decrease the value of k
C) Noise can not be dependent on value of k
D) None of these

Solution: A

To be more confident in the classifications you make, you can try increasing the value of k.

19) In k-NN it is very likely to overfit due to the curse of dimensionality. Which of
the following option would you consider to handle such problem?

1. Dimensionality Reduction
2. Feature selection

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

In such a case you can use either a dimensionality reduction algorithm or a
feature selection algorithm.

20) Two statements are given below. Which of the following is true of the two
statements?

1. k-NN is a memory-based approach: the classifier immediately adapts as we
collect new training data.
2. The computational complexity for classifying new samples grows linearly with the
number of samples in the training dataset in the worst-case scenario.

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both are true and self explanatory

21) Suppose you are given the following images (1 left, 2 middle and 3 right).
Your task is to find out the value of k in k-NN in each image, where k1 is for the 1st,
k2 is for the 2nd and k3 is for the 3rd figure.

A) k1 > k2> k3
B) k1<k2
C) k1 = k2 = k3
D) None of these
Solution: D

Value of k is highest in k3, whereas in k1 it is lowest

22) Which of the following values of k in the following graph would give the least
leave-one-out cross-validation accuracy?

A) 1

B) 2
C) 3
D) 5
Solution: B

If you keep the value of k as 2, it gives the lowest cross validation accuracy. You can try
this out yourself.

23) A company has built a kNN classifier that gets 100% accuracy on training
data. When they deployed this model on the client side, the model was found to be
not at all accurate. Which of the following might have gone wrong?

Note: Model has successfully deployed and no technical issues are found at client
side except the model performance

A) It is probably an overfitted model
B) It is probably an underfitted model
C) Can’t say
D) None of these

Solution: A

An overfitted model seems to perform well on training data, but it is not
generalized enough to give the same results on new data.

24) You have given the following 2 statements, find which of these option is/are
true in case of k-NN?

1. In case of very large value of k, we may include points from other classes into the
neighborhood.
2. In case of too small value of k the algorithm is very sensitive to noise

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the options are true and are self explanatory.

25) Which of the following statements is true for k-NN classifiers?

A) The classification accuracy is better with larger values of k

B) The decision boundary is smoother with smaller values of k
C) The decision boundary is linear
D) k-NN does not require an explicit training step

Solution: D

Option A: This is not always true. You have to ensure that the value of k is not too high or
not too low.

Option B: This statement is not true. The decision boundary can be a bit jagged

Option C: Same as option B

Option D: This statement is true

26) True-False: It is possible to construct a 2-NN classifier by using the 1-NN
classifier?

A) TRUE
B) FALSE

Solution: A

You can implement a 2-NN classifier by ensembling 1-NN classifiers

27) In k-NN what will happen when you increase/decrease the value of k?

A) The boundary becomes smoother with increasing value of K

B) The boundary becomes smoother with decreasing value of K
C) Smoothness of boundary doesn’t depend on value of K
D) None of these

Solution: A

The decision boundary would become smoother by increasing the value of K

28) Following are two statements about the k-NN algorithm. Which of the
statement(s) is/are true?

1. We can choose optimal value of k with the help of cross validation
2. Euclidean distance treats each feature as equally important

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the statements are true

Context 29-30:

Suppose, you have trained a k-NN model and now you want to get the prediction on test
data. Before getting the prediction suppose you want to calculate the time taken by k-NN
for predicting the class for test data.
Note: Calculating the distance between 2 observations will take D time.

29) What would be the time taken by 1-NN if there are N(Very large) observations
in test data?

A) N*D
B) N*D*2
C) (N*D)/2
D) None of these

Solution: A

1-NN computes one distance (D time) per observation, so N observations take N*D time; option A is correct.

30) What would be the relation between the time taken by 1-NN,2-NN,3-NN.

A) 1-NN >2-NN >3-NN

B) 1-NN < 2-NN < 3-NN
C) 1-NN ~ 2-NN ~ 3-NN
D) None of these

Solution: C

The training time for any value of k in kNN algorithm is the same.

1. A project team performed a feature selection
procedure on the full data set and reduced their large
feature set to a smaller set. Then they split the data
into test and training portions. They built their model on
training data using several different model settings, and
report the best test error they achieved. Which of the
following is TRUE about the given experimental setup?
a) Best setup
b) Problematic setup
c) Invalid setup
d) Cannot be decided
Answer: (b) Problematic setup
(a) Using the full data for feature selection will leak information from
the test examples into the model. The feature selection should be
done exclusively using training and validation data, not on test data.
(b) The best parameter setting should not be chosen based on the
test error; this has the danger of overfitting to the test data. They
should have used validation data and used the test data only in the
final evaluation step.

2. If we increase the k value in k-nearest neighbor, the
model will _____ the bias and ______ the variance.
a) Decrease, Decrease
b) Increase, Decrease
c) Decrease, Increase
d) Increase, Increase
Answer: (b) Increase, Decrease
When K increases to a large value, the model becomes simplest. All
test data points will belong to the same class: the majority class. This
is underfitting, that is, high bias and low variance.

Bias-Variance tradeoff
The bias is an error from erroneous assumptions in the learning
algorithm. High bias can cause an algorithm to miss the relevant
relations between features and target outputs. In other words, a model
with high bias pays very little attention to the training data and
oversimplifies the model.
The variance is an error from sensitivity to small fluctuations in the
training set. High variance can cause an algorithm to model the
random noise in the training data, rather than the intended outputs.
In other words, a model with high variance pays a lot of attention to
training data and does not generalize to data which it hasn’t
seen before.

3. For a large k value the k-nearest neighbor model
becomes _____ and ______ .
a) Complex model, Overfit
b) Complex model, Underfit
c) Simple model, Underfit
d) Simple model, Overfit
Answer: (c) Simple model, Underfit
When K increases to inf, the model is simplest. All test data points will
belong to the same class: the majority class. This is underfitting, that is,
high bias and low variance.
kNN classification is an averaging operation. To come to a decision,
the labels of the K nearest neighbour samples are averaged. The
standard deviation (or the variance) of the output of averaging
decreases as the number of samples increases. In the case K == N (you
select K as large as the size of the dataset), variance becomes zero.
Underfitting means the model does not fit, in other words, does not
predict, the (training) data very well.
Overfitting means that the model predicts the (training) data too well. It
is too good to be true. If a new data point comes in, the prediction
may be wrong.

4. When we have a real-valued input attribute during
decision-tree learning, what would be the impact of a multi-way
split with one branch for each of the distinct values
of the attribute?
a) It is too computationally expensive.
b) It would probably result in a decision tree that scores
badly on the training set and a test set.
c) It would probably result in a decision tree that scores
well on the training set but badly on a test set.
d) It would probably result in a decision tree that scores
well on a test set but badly on a training set.
Answer: (c) It would probably result in a decision tree that scores well
on the training set but badly on a test set
It is usual to make only binary splits because multiway splits break
the data into small subsets too quickly. This causes a bias towards
splitting predictors with many classes since they are more likely to
produce relatively pure child nodes, which results in overfitting.

5. The VC dimension of a Perceptron is _____ the VC
dimension of a simple linear SVM.
a) Larger than
b) Smaller than
c) Same as
d) Not at all related
Answer: (c) Same as
Both Perceptron and linear SVM are linear discriminators (i.e. a line in
2D space or a plane in 3D space), so they should have the same VC
dimension.
VC dimension
The Vapnik–Chervonenkis (VC) dimension is a measure of the capacity
(complexity, expressive power, richness, or flexibility) of a space of
functions that can be learned by a statistical binary classification
algorithm. It is defined as the cardinality of the largest set of points
that the algorithm can shatter. [Wikipedia]

Ans: Solution A

Ans: Solution B

3. What is supervised learning?

Ans: Solution B

4. What is Unsupervised learning?

Ans: Solution A

5. What is Semi-Supervised learning?

Ans: Solution C

7. Sentiment Analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:

A. 1 Only

B. 1 and 2

C. 1 and 3

D. 1, 2 and 4

Ans : Solution D

8. The process of forming general concept definitions from examples of concepts to be
learned.
a) deduction
b) abduction
c) induction
d) conjunction

Ans : Solution C

9. Computers are best at learning

a) facts.
b) concepts.
c) procedures.
d) principles.
Ans : Solution A

10. Data used to build a data mining model.

a) validation data
b) training data
c) test data
d) hidden data

Ans : Solution B

11. Supervised learning and unsupervised clustering both require at least one
a) hidden attribute.
b) output attribute.
c) input attribute.
d) categorical attribute.

Ans : Solution A

Ans : Solution B

Ans : Solution C

14. A term used to describe the case when the independent variables in a multiple regression model
are correlated is
a) Regression
b) correlation
c) multicollinearity
d) none of the above

Ans : Solution C

16. A multiple regression model has

a) only one independent variable
b) more than one dependent variable
c) more than one independent variable
d) none of the above

Ans : Solution C

Ans : Solution C

18. The adjusted multiple coefficient of determination accounts for

a) the number of dependent variables in the model
b) the number of independent variables in the model
c) unusually large predictors
d) none of the above

Ans : Solution B

19. The multiple coefficient of determination is computed by

a) dividing SSR by SST
b) dividing SST by SSR
c) dividing SST by SSE
d) none of the above

Ans : Solution A

20. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of
determination is
a) 0.25
b) 4.00
c) 0.75
d) none of the above

Ans : Solution C
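
As a worked check of the question above, the multiple coefficient of determination is R^2 = SSR / SST, and with SST = SSR + SSE (which holds for a least-squares fit with an intercept) this equals 1 - SSE / SST. The sketch below, with a hypothetical function name, evaluates it for SST = 200 and SSE = 50.

```python
def r_squared(sst, sse):
    """Coefficient of determination from total and error sums of squares.

    Assumes SST = SSR + SSE, as for a least-squares fit with an intercept.
    """
    ssr = sst - sse
    return ssr / sst


value = r_squared(sst=200, sse=50)
print(value)  # corresponds to option c) 0.75
```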

21. A nearest neighbor approach is best used

Ans : Solution B

22. Another name for an output attribute.

a) predictive variable
b) independent variable
c) estimated variable
d) dependent variable

Ans : Solution D

23. Classification problems are distinguished from estimation problems in that

Ans : Solution C

24. Which statement is true about prediction problems?

Ans : Solution D

25. Which statement about outliers is true?

Ans : Solution A

27. Which of the following is a common use of unsupervised clustering?

Ans : Solution A

28. The average positive difference between computed and desired outcome values.
a) root mean squared error
b) mean squared error
c) mean absolute error
d) mean positive error

Ans : Solution C

29. Selecting data so as to assure that each class is properly represented in both the training and
test set.
a) cross validation
b) stratification
c) verification
d) bootstrapping

Ans : Solution B

Ans : Solution A

31. Data used to optimize the parameter settings of a supervised learner model.
a) Training
b) Test
c) Verification
d) Validation

Ans : Solution D

32. Bootstrapping allows us to

Ans : Solution A

Ans : Solution B

Ans : Solution C

35. The average squared difference between classifier predicted output and actual output.
a) mean squared error
b) root mean squared error
c) mean absolute error
d) mean relative error

Ans : Solution A

36. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
a) Linear
b) Quadratic
c) reciprocal
d) inverse

Ans : Solution A

37. Regression trees are often used to model _______ data.

a) Linear
b) Nonlinear
c) Categorical
d) Symmetrical

Ans : Solution B

38. The leaf nodes of a model tree are

a) averages of numeric output attribute values.
b) nonlinear regression equations.
c) linear regression equations.
d) sums of numeric output attribute values.

Ans : Solution C

39. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.
a) linear, numeric
b) linear, binary
c) nonlinear, numeric
d) nonlinear, binary

Ans : Solution D

40. This technique associates a conditional probability value with each data instance.
a) linear regression
b) logistic regression
c) simple regression
d) multiple linear regression

Ans : Solution B

42. With Bayes classifier, missing data items are

a) treated as equal compares.
b) treated as unequal compares.
c) replaced with a default value.
d) ignored.

Ans : Solution D

43. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
a) agglomerative clustering
b) expectation maximization
c) conceptual clustering
d) K-Means clustering

Ans : Solution C

Ans : Solution C

Ans : Solution B

UNIT – II

2. What is pca.components_ in Sklearn?

A. Set of all eigenvectors for the projection space
B. Matrix of principal components
C. Result of the multiplication matrix
D. None of the above options
Ans A

Ans D

7. PCA works better if:

1. There is a linear structure in the data
2. The data lies on a curved surface and not on a flat surface
3. The variables are scaled in the same unit

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1, 2 and 3
Ans Solution: (C)

9. Which of the following option(s) is / are true?

11. [ True or False ] PCA can be used for projecting and visualizing data in lower dimensions.
A. TRUE
B. FALSE

Solution: (A)
Sometimes it is very useful to plot the data in lower dimensions. We can take the first 2 principal
components and then visualize the data using scatter plot.

13. Which of the following is an example of a deterministic algorithm?

1. Which of the following methods do we use to best fit the data in Logistic Regression?
A) Least Square Error
B) Maximum Likelihood
C) Jaccard distance
D) Both A and B
Ans Solution: B

4. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these
Ans Solution: A
In the case of lasso we apply an absolute penalty; as the penalty is increased, some of the
coefficients of the variables may become exactly zero.
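
The zeroing effect can be illustrated under a simplifying assumption that is not part of the quiz: with an orthonormal design, minimising (1/2)(b - b_ols)^2 + penalty * |b| gives soft-thresholding of each least-squares coefficient, while the ridge objective (1/2)(b - b_ols)^2 + (penalty / 2) * b^2 only rescales it. The function names and coefficient values below are made up for the sketch.

```python
def lasso_shrink(coef, penalty):
    """Soft-thresholding: the lasso solution for one coefficient
    under the orthonormal-design assumption stated above."""
    if coef > penalty:
        return coef - penalty
    if coef < -penalty:
        return coef + penalty
    return 0.0  # small coefficients are set exactly to zero


def ridge_shrink(coef, penalty):
    """Ridge solution under the same assumption: shrinks but never zeroes."""
    return coef / (1.0 + penalty)


ols_coefficients = [3.0, -0.4, 0.1, -2.0]  # made-up least-squares estimates
lasso = [lasso_shrink(c, 0.5) for c in ols_coefficients]
ridge = [ridge_shrink(c, 0.5) for c in ols_coefficients]
print(lasso)
print(ridge)
```

Variables whose lasso coefficient is exactly zero drop out of the model, which is why lasso can be used for variable selection and ridge cannot.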

5. Which of the following statement is true about outliers in Linear regression?

7. Which of the following is true about Residuals?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Ans Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) Since there is a relationship, our model is not good

10. Which of the following option is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic
Regression this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear
Regression this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Ans Solution: A

12. True-False: Linear Regression is a supervised machine learning algorithm.

13. True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that have continuous values.

14. True-False: It is possible to design a Linear regression algorithm using a neural network?

A) TRUE
B) FALSE

Solution: (A)

True. A Neural network can be used as a universal approximator, so it can definitely implement
a linear regression algorithm.

18. Which of the following is true about Residuals ?

A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, Normal Equation can also be used to find coefficients.
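
For a single input variable the normal equation reduces to a closed form, which the sketch below evaluates directly, with no gradient descent. The function name and the data points are assumptions for illustration; the points lie exactly on Y = 1 + 2X, so the training RMSE comes out as zero.

```python
import math


def fit_simple_linear(xs, ys):
    """Closed-form least squares for Y = a0 + a1 * X (one-feature normal equation)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    return a0, a1


def rmse(xs, ys, a0, a1):
    """Root mean square error of the fitted line on the given data."""
    return math.sqrt(sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs))


xs = [1.0, 2.0, 3.0, 4.0]  # made-up inputs
ys = [3.0, 5.0, 7.0, 9.0]  # outputs lying exactly on a line
a0, a1 = fit_simple_linear(xs, ys)
print(a0, a1, rmse(xs, ys, a0, a1))
```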

Question Context 24-26:

A) Since there is a relationship, our model is not good

Question Context 29-31:

31. In terms of bias and variance. Which of the following is true when you fit degree 2
polynomial?

A) Bias will be high, variance will be high

Question Context 32-33:

33. What do you expect will happen with bias and variance as you increase the size of training
data?

A) Bias increases and Variance increases

Question Context 34:

Consider the following data where one input(X) and one output(Y) is given.

34. What would be the root mean square training error for this data if you run a Linear
Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can fit a line perfectly through this data, so the root mean square error will be zero.

Question Context 35-36:

Suppose you have been given the following scenario for training and validation error for Linear
Regression.
Scenario   Learning Rate   Number of iterations   Training Error   Validation Error

1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150

Question Context 37-38:

A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I won’t use any regularization methods because regularization is used in case of overfitting.

39. True-False: Is Logistic regression a supervised machine learning algorithm?

40. True-False: Is Logistic regression mainly used for Regression?

A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; don’t be misled by the word "regression" in its name.

42. True-False: Is it possible to apply a logistic regression algorithm on a 3-class Classification
problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem. We can use the One-vs-All method
for 3-class classification in logistic regression.

46. [True-False] Standardisation of features is required before training a Logistic Regression.

A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing features is
to help convergence of the technique used for optimization.

47. Which of the following algorithms do we use for Variable Selection?

A) LASSO
B) Ridge
C) Both
D) None of these

Solution: A
In the case of lasso we apply an absolute-value penalty; as the penalty increases, some of the
coefficients of the variables may become zero.
Context: 48-49

Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.

In the above equation, P(y = 1 | x; w), viewed as a function of x, takes different shapes as we change the
parameters w.

48 What would be the range of p in such case?

A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)

Solution: C

For values of x in the range of real number from −∞ to +∞ Logistic function will give the output
between (0,1)
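
The range claim above can be checked numerically. The following is a minimal sketch, assuming the standard logistic function g(z) = 1 / (1 + exp(-z)); the weights w0 = 0.5 and w1 = 2.0 are arbitrary illustration values, not taken from the question.

```python
import math

def logistic(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, w0, w1):
    """P(y = 1 | x, w) = g(w0 + w1 * x) for a single input x."""
    return logistic(w0 + w1 * x)

# Hypothetical weights chosen only for illustration.
probs = [predict_proba(x, w0=0.5, w1=2.0) for x in (-10, -1, 0, 1, 10)]
```

Every value in probs lies strictly between 0 and 1. In floating-point arithmetic the result can round to exactly 0.0 or 1.0 for extreme inputs, but mathematically the bounds are never reached.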

49 In the above question, which function do you think would make p lie in (0,1)?

A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them

Solution: A

Explanation is the same as for question 48.

50. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following options is true for such a case?

A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these

Solution: C

Odds are defined as the ratio of the probability of success to the probability of failure. For a fair
coin the probability of success is 1/2 and the probability of failure is 1/2, so the odds are 1.

51. The logit function(given as l(x)) is the log of odds function. What could be the range of logit
function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)

Solution: A

For our purposes, the odds function has the advantage of transforming the probability function, which
has values from 0 to 1, into an equivalent function with values between 0 and ∞. When we take the
natural log of the odds function, we get a range of values from -∞ to ∞.
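
The odds and log-odds described in the two explanations above can be sketched in a few lines. The helper names odds and logit are chosen here for illustration.

```python
import math

def odds(p):
    """Odds = probability of success / probability of failure, for 0 <= p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Logit = natural log of the odds, defined for 0 < p < 1."""
    return math.log(odds(p))

fair_coin_odds = odds(0.5)        # success and failure are equally likely
low, mid, high = logit(0.01), logit(0.5), logit(0.99)
```

For the fair coin the odds are 1 and the logit is 0. As p approaches 0 the logit becomes arbitrarily negative, and as p approaches 1 it becomes arbitrarily positive, which is why its range is the whole real line.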

52. Which of the following options is true?

A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is
not the case
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is
not the case
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed

Solution: A

53. Which of the following is true regarding the logistic function for any value “x”?

Note:
Logistic(x): is a logistic function of any number “x”

Logit(x): is a logit function of any number “x”

Logit_inv(x): is an inverse logit function of any number “x”

A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these

Solution: B
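
A quick numerical check of answer B, assuming the usual definitions logistic(x) = 1 / (1 + exp(-x)) and logit(p) = log(p / (1 - p)). The sketch applies one function after the other and measures how far the result is from the starting value.

```python
import math

def logistic(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)

def logit(p):
    """Log of the odds, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# logistic undoes logit, so the round trip should return the original p.
round_trip_errors = [abs(logistic(logit(p)) - p) for p in (0.1, 0.25, 0.5, 0.9)]
```

The errors are zero up to floating-point rounding, which is consistent with the logistic function being the inverse of the logit.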

54. How will the bias change on using high(infinite) regularisation?

Solution: A

Model will become very simple so bias will be very high.

Note: Consider remaining parameters are same.

A) Training accuracy increases

B) Training accuracy increases or remains the same
C) Testing accuracy decreases
D) Testing accuracy increases or remains the same

Solution: A and D

56. Choose which of the following options is true regarding One-Vs-All method in Logistic Regression.

A) We need to fit n models in n-class classification problem

B) We need to fit n-1 models to classify into n classes
C) We need to fit only 1 model to classify into n classes
D) None of these
Solution: A

If there are n classes, then n separate logistic regression models have to be fit, where the probability of each
category is predicted against the rest of the categories combined.
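
The relabeling step behind One-Vs-All can be sketched as follows. The function name and the toy labels are hypothetical; the sketch only builds the n binary target vectors, one per class, that the n separate models would be trained on.

```python
def one_vs_all_targets(labels):
    """Return one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

# Toy 3-class labels, made up for illustration.
labels = ["cat", "dog", "bird", "dog", "cat"]
binary_problems = one_vs_all_targets(labels)
```

With three distinct classes the dictionary holds three binary problems, so three models are fit. At prediction time the class whose model gives the highest probability is usually chosen.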

57. Below are two different logistic models with different values for β0 and β1.

Which of the following statement(s) is true about the β0 and β1 values of the two logistic models (Green, Black)?

Note: consider Y = β0 + β1*X. Here, β0 is intercept and β1 is coefficient.

A) β1 for Green is greater than Black

B) β1 for Green is lower than Black
C) β1 for both models is same
D) Can’t Say

Solution: B

β0 and β1: β0 = 0, β1 = 1 is in X1 color(black) and β0 = 0, β1 = −1 is in X4 color (green)

Context 58-60

A) A
B) B
C) C
D) None of these

Solution: C

Since the decision boundary in figure 3 is not smooth, it is overfitting the data.

59. What do you conclude after seeing this visualization?

1. The training error in the first plot is the maximum compared to the second and third plots.

2. The best model for this regression problem is the last (third) plot because it has minimum
training error (zero).

3. The second model is more robust than the first and third because it will perform best on unseen
data.

4. The third model is overfitting more compared to the first and second.

5. All will perform the same because we have not seen the testing data.

A) 1 and 3
B) 1 and 3
C) 1, 3 and 4
D) 5

Solution: C

60. Suppose, above decision boundaries were generated for the different value of regularization.
Which of the above decision boundary shows the maximum regularization?

A) A
B) B
C) C
D) All have equal regularization

Solution: A

More regularization means a higher penalty and therefore a less complex decision boundary, as shown in
figure A.

61. What would you do if you want to train logistic regression on the same data so that it takes less time and
gives comparatively similar accuracy (may not be the same)?

Suppose you are using a Logistic Regression model on a huge dataset. One of the problems you may face
on such huge data is that Logistic regression will take a very long time to train.

A) Decrease the learning rate and decrease the number of iteration

Solution: D

62. Which of the following images shows the cost function for y = 1?

Following is the loss function in logistic regression(Y-axis loss function and x axis log probability) for
two class classification problem.

Note: Y is the target class

A) A
B) B
C) Both
D) None of these

Solution: A

A is the true answer as loss function decreases as the log probability increases
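
The shape of the cost for y = 1 can be sketched with the usual binary cross-entropy (log loss). This is an assumption about which loss the missing figure shows; p stands for the predicted probability of class 1.

```python
import math

def log_loss(y, p):
    """Binary cross-entropy for one example with target y in {0, 1} and 0 < p < 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# For y = 1 the loss reduces to -log(p), so it falls as p rises.
losses_y1 = [log_loss(1, p) for p in (0.1, 0.5, 0.9)]
```

The three losses are strictly decreasing, which matches the explanation that the loss falls as the predicted probability of the true class grows.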

63. Suppose, Following graph is a cost function for logistic regression.

Now, how many local minima are present in the graph?

A) 1
B) 2
C) 3
D) 4

Solution: C
There are three local minima present in the graph

64. Can a Logistic Regression classifier do a perfect classification on the below data?

Note: You can use only X1 and X2 variables where X1 and X2 can take only two binary values(0,1).

A) TRUE
B) FALSE
C) Can’t say
D) None of these

Solution: B

No, logistic regression only forms linear decision surface, but the examples in the figure are not linearly
separable.
UNIT IV

1. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

2. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

Ans Solution: C

The cost parameter decides how much an SVM should be allowed to “bend” with the data. For a low
cost, you aim for a smooth decision surface and for a higher cost, you aim to classify more points
correctly. It is also simply referred to as the cost of misclassification.
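
The tradeoff controlled by the cost parameter can be made concrete with the standard soft-margin objective, 0.5 * ||w||^2 + C * (sum of hinge losses). The sketch below is an illustration under that assumption; the weights and the three toy points are made up.

```python
def soft_margin_objective(w, b, X, y, C):
    """Soft-margin SVM objective: 0.5 * ||w||^2 + C * sum(max(0, 1 - y_i * f(x_i)))."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hinge_total += max(0.0, 1.0 - yi * score)
    return margin_term + C * hinge_total

# Toy data: the third point sits on the wrong side of the boundary x1 = 0.
X = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, -1, -1]
w, b = (1.0, 0.0), 0.0

low_cost = soft_margin_objective(w, b, X, y, C=0.1)
high_cost = soft_margin_objective(w, b, X, y, C=100.0)
```

The same misclassified point adds little to the objective when C is small and dominates it when C is large, so a large C pushes the optimizer toward classifying every training point correctly.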

3. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Ans Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

4. Which of the following is true about Naive Bayes ?

A) Assumes that all the features in a dataset are equally important

B) Assumes that all the features in a dataset are independent
C) Both A and B
D) None of the above options

Ans Solution: C

5 What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Ans Solution: B

Generalisation error in statistics is generally the out-of-sample error which is the measure of how
accurately a model can predict values for previously unseen data.

6 The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Ans Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

7 What is/are true about kernel in SVM?

1. Kernel functions map low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both the given statements are correct.
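
Both statements can be illustrated with a degree-2 polynomial kernel on 2-D inputs. This is a sketch with an arbitrarily chosen kernel and toy vectors: the kernel value computed in the original space equals a dot product in a 3-D feature space, and that dot product acts as a similarity score.

```python
import math

def poly2_kernel(x, z):
    """Degree-2 polynomial kernel (x . z)^2, computed in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit feature map into 3-D space that corresponds to poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly2_kernel(x, z)
explicit_value = dot(phi(x), phi(z))
```

The two values agree, so the kernel gives the higher-dimensional dot product without ever building phi(x) explicitly.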

Question Context:8– 9

A) Yes
B) No

Solution: A

These three examples are positioned such that removing any one of them introduces slack in the
constraints. So the decision boundary would completely change.

9. [True or False] If you remove the non-red circled points from the data, the decision boundary will
change?

A) True
B) False

Solution: B

On the other hand, rest of the points in the data won’t affect the decision boundary much.

10. What do you mean by generalization error in terms of the SVM?

A) How far the hyperplane is from the support vectors

B) How accurately the SVM can predict outcomes for unseen data
C) The threshold amount of error in an SVM

Solution: B

A) The optimal hyperplane if exists, will be the one that completely separates the data
B) The soft-margin classifier will separate the data
C) None of the above

Solution: A

At such a high misclassification penalty, the soft margin effectively ceases to exist, as there is no
room for error.

12. What do you mean by a hard margin?

A) The SVM allows very low error in classification

B) The SVM allows high amount of error in classification
C) None of the above

Solution: A

A hard margin means that an SVM is very rigid in classification and tries to work extremely well in the
training set, causing overfitting.

13. The minimum time complexity for training an SVM is O(n²). According to this fact, what sizes of
datasets are not best suited for SVM’s?

A) Large datasets
B) Small datasets
C) Medium sized datasets
D) Size does not matter

Solution: A

Datasets which have a clear classification boundary will function best with SVM’s.

14. The effectiveness of an SVM depends upon:

A) Selection of Kernel
B) Kernel Parameters
C) Soft Margin Parameter C
D) All of the above

Solution: D

The SVM effectiveness depends upon how you choose the basic 3 requirements mentioned above in
such a way that it maximises your efficiency, reduces error and overfitting.

15. Support vectors are the data points that lie closest to the decision surface.
A) TRUE
B) FALSE

Solution: A

They are the points closest to the hyperplane and the hardest ones to classify. They also have a direct
bearing on the location of the decision surface.

16. The SVM’s are less effective when:

A) The data is linearly separable

B) The data is clean and ready to use
C) The data is noisy and contains overlapping points

Solution: C

When the data has noise and overlapping points, there is a problem in drawing a clear hyperplane
without misclassifying.

17. Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

Solution: B

The gamma parameter in SVM tuning signifies the influence of points either near or far away from the
hyperplane.

For a low gamma, the model will be too constrained and include all points of the training dataset,
without really capturing the shape.

For a higher gamma, the model will capture the shape of the dataset well.
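
The effect of gamma can be sketched with the RBF kernel written as exp(-gamma * ||x - z||^2). That parameterization is an assumption here (it is the one used by common SVM libraries), and the points and gamma values are made up.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * squared Euclidean distance between x and z)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

origin = (0.0, 0.0)
near, far = (0.1, 0.0), (2.0, 0.0)

# Similarity of the origin to a nearby and a distant point under two gamma values.
low_gamma = [rbf_kernel(origin, p, gamma=0.1) for p in (near, far)]
high_gamma = [rbf_kernel(origin, p, gamma=10.0) for p in (near, far)]
```

With a low gamma the distant point still has noticeable similarity, so each training point influences a wide region. With a high gamma the similarity to the distant point is practically zero, so only close neighbours matter and the boundary can follow the data much more tightly.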

18. The cost parameter in the SVM means:

A) The number of cross-validations to be made

B) The kernel to be used
C) The tradeoff between misclassification and simplicity of the model
D) None of the above

What would happen when you use very large value of C(C->infinity)?

Note: For small C, all data points were also classified correctly.

A) We can still classify data correctly for given setting of hyper parameter C
B) We can not classify data correctly for given setting of hyper parameter C
C) Can’t Say
D) None of these

Solution: A

For large values of C, the penalty for misclassifying points is very high, so the decision boundary will
perfectly separate the data if possible.

20. What would happen when you use very small C (C~0)?

A) Misclassification would happen

B) Data will be correctly classified
C) Can’t say
D) None of these

Solution: A

The classifier can maximize the margin between most of the points, while misclassifying a few points,
because the penalty is so low.

21. If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on
validation set, what should I look out for?

A) Underfitting
B) Nothing, the model is perfect
C) Overfitting

Solution: C

If we’re achieving 100% training accuracy very easily, we need to check to verify if we’re overfitting our
data.
22. Which of the following are real world applications of the SVM?

A) Text and Hypertext Categorization

B) Image Classification
C) Clustering of News Articles
D) All of the above

Solution: D

SVM’s are highly versatile models that can be used for practically all real world problems ranging from
regression to clustering and handwriting recognition.

Question Context: 23 – 25

Suppose you have trained an SVM with a linear decision boundary. After training, you correctly infer
that your SVM model is underfitting.

23. Which of the following options would you be more likely to consider when iterating on the SVM next time?

A) You want to increase your data points

B) You want to decrease your data points
C) You will try to calculate more variables
D) You will try to reduce the features

Solution: C

The best option here would be to create more features for the model.

24. Suppose you gave the correct answer in previous question. What do you think that is actually
happening?

1. We are lowering the bias

2. We are lowering the variance
3. We are increasing the bias
4. We are increasing the variance

A) 1 and 2
B) 2 and 3
C) 1 and 4
D) 2 and 4

Solution: C

A) We will increase the parameter C

B) We will decrease the parameter C
C) Changing C has no effect
D) None of these

Solution: A

Increasing C parameter would be the right thing to do here, as it will ensure regularized model

26. We usually use feature normalization before using the Gaussian kernel in SVM. What is true about
feature normalization?

1. We do feature normalization so that new feature will dominate other

2. Sometimes, feature normalization is not feasible in case of categorical variables
3. Feature normalization always helps when we use Gaussian kernel in SVM

A) 1
B) 1 and 2
C) 1 and 3
D) 2 and 3

Solution: B

Statements one and two are correct.

Question Context: 27-29

Suppose you are dealing with a 4-class classification problem and you want to train an SVM model on the
data using the One-vs-all method. Now answer the questions below.

27. How many times do we need to train our SVM model in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: D

A) 20
B) 40
C) 60
D) 80

Solution: B

It would take 10×4 = 40 seconds

29 Suppose your problem has changed now and the data has only 2 classes. How
many times do you think we need to train the SVM in such a case?

A) 1
B) 2
C) 3
D) 4

Solution: A

Training the SVM only one time would give you appropriate results

Question context: 30 –31

30. Now, think that you increase the complexity (or degree of polynomial of this kernel). What would
you think will happen?

A) Increasing the complexity will over fit the data

B) Increasing the complexity will under fit the data
C) Nothing will happen since your model was already 100% accurate
D) None of these

Solution: A

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

32. What is/are true about kernel in SVM?

1. Kernel functions map low-dimensional data to a high-dimensional space

2. It’s a similarity function

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: C

Both the given statements are correct.

UNIT V

1. Which of the following is a widely used and effective machine learning algorithm based on the
idea of bagging?

a) Decision Tree
b) Regression
c) Classification
d) Random Forest

Ans D

2. Which of the following is a disadvantage of decision trees?

a) Factor analysis
b) Decision trees are robust to outliers
c) Decision trees are prone to overfitting
d) None of the above

Ans C

3. Can decision trees be used for performing clustering?

a. True
b. False

Ans Solution: (A)

Decision trees can also be used to form clusters in the data, but clustering often generates natural
clusters and is not dependent on any objective function.

4. Which of the following algorithms is most sensitive to outliers?

a. K-means clustering algorithm

b. K-medians clustering algorithm
c. K-modes clustering algorithm
d. K-medoids clustering algorithm

Ans Solution: (A)

5 Sentiment Analysis is an example of:

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

Options:

a. 1 Only
b. 1 and 2
c. 1 and 3
d. 1, 2 and 4

Ans D

6 Which of the following is the most appropriate strategy for data cleaning before performing
clustering analysis, given less than desirable number of data points:

1. Capping and flooring of variables

2. Removal of outliers
Options:
a. 1 only
b. 2 only
c. 1 and 2
d. None of the above

Ans A

7 Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other

2. Bagging is the method for improving the performance by aggregating the results of weak
learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: C

Both options are true. In bagging, the individual trees are independent of each other because they
consider different subsets of features and samples.

8. Which of the following is/are true about boosting trees?

1. In boosting trees, individual weak learners are independent of each other

2. It is the method for improving the performance by aggregating the results of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: B

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Ans Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of the features for
building the individual trees.

10. Suppose you are using a bagging based algorithm say a RandomForest in model building.
Which of the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using Random Forest

A) 1
B) 2
C) 1 and 2
D) None of these

Ans Solution: A

11. Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?

1. Both methods can be used for classification task

2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks

3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks

4. Both methods can be used for regression task

A) 1
B) 2
C) 3
D) 4
E) 1 and 4

Solution: E

Both algorithms are designed for classification as well as regression tasks.

12. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the
results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?

1. Individual tree is built on a subset of the features

2. Individual tree is built on all the features

3. Individual tree is built on a subset of observations

4. Individual tree is built on full set of observations

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: A

Random forest is based on the bagging concept, which considers a fraction of the samples and a fraction of the features for
building the individual trees.

13. Which of the following algorithms don’t use learning rate as one of their hyperparameters?

1. Gradient Boosting

2. Extra Trees

3. AdaBoost

4. Random Forest

A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4

Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.

14. Which of the following algorithms is not an example of an ensemble learning algorithm?

A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees

Solution: E

A decision tree doesn’t aggregate the results of multiple trees, so it is not an ensemble algorithm.

15. Suppose you are using a bagging based algorithm say a RandomForest in model building. Which of
the following can be true?

1. Number of trees should be as large as possible

2. You will have interpretability after using RandomForest

A) 1
B) 2
C) 1 and 2
D) None of these

Solution: A

16. True-False: The bagging is suitable for high variance low bias models?

A) TRUE
B) FALSE

Solution: A

The bagging is suitable for high variance low bias models or you can say for complex models.

17. To apply bagging to regression trees which of the following is/are true in such case?

1. We build N regression trees with N bootstrap samples

2. We take the average of the N regression trees

3. Each tree has a high variance with low bias

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Solution: D

All of the options are correct and self-explanatory
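
The bagging steps listed above can be sketched as follows. To keep the example short, a depth-1 regression stump stands in for a full regression tree (an assumption; real bagging normally uses deeper, higher-variance trees). The toy data and all function names are hypothetical.

```python
import random

def fit_stump(points):
    """Fit a depth-1 regression tree: one threshold on x, mean of y on each side."""
    xs = sorted(set(x for x, _ in points))
    overall = sum(y for _, y in points) / len(points)
    best = (float("inf"), None, overall, overall)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if t is None or x <= t else rm

def bagged_trees(points, n_trees, seed=0):
    """Step 1: draw N bootstrap samples (with replacement) and fit one tree per sample."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]
        trees.append(fit_stump(sample))
    return trees

def bagged_predict(trees, x):
    """Step 2: average the predictions of the N trees."""
    return sum(tree(x) for tree in trees) / len(trees)

# Toy step-shaped data: y jumps from 0 to 10 at x = 5.
data = [(x, 0.0 if x < 5 else 10.0) for x in range(10)]
trees = bagged_trees(data, n_trees=25)
prediction_low = bagged_predict(trees, 1)
prediction_high = bagged_predict(trees, 8)
```

Each tree sees a different bootstrap sample, so the trees differ slightly, and averaging them smooths out the variation of any single tree.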

18. How to select best hyper parameters in tree based models?

A) Measure performance over training data

B) Measure performance over validation data
C) Both of these
D) None of these

Solution: B

We always consider the validation results to compare with the test result.

19. In which of the following scenario a gain ratio is preferred over Information Gain?

A) When a categorical variable has very large number of category

B) When a categorical variable has very small number of category
C) Number of categories is the not the reason
D) None of these

Solution: A

For high-cardinality problems, gain ratio is preferred over the Information Gain technique.
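
The reason can be seen on a toy example. The sketch below uses the usual definitions (information gain = entropy reduction, gain ratio = information gain divided by the entropy of the split itself) and made-up data in which row_id is unique per row while weather has only two categories.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(attribute, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the attribute."""
    n = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(attribute, labels):
    """Information gain normalized by the split information of the attribute."""
    split_info = entropy(attribute)
    return information_gain(attribute, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy data.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
row_id = list(range(8))  # a different category for every row
weather = ["sun", "sun", "rain", "rain", "sun", "rain", "sun", "sun"]
```

Information gain ranks row_id highest because every one-row group is trivially pure, even though such a split does not generalize. Gain ratio divides by the large split information of row_id and therefore prefers weather.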

20. Suppose you have given the following scenario for training and validation error for Gradient
Boosting. Which of the following hyper parameter would you choose in such case?

Scenario Depth Training Error Validation Error

1 2 100 110

2 4 90 105

3 6 50 100

4 8 45 105
5 10 30 150

A) 1
B) 2
C) 3
D) 4

Solution: B

Scenarios 2 and 4 have the same validation error, but we would select 2 because a lower depth is the better
hyperparameter.

21. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a core point

2. It has strong assumptions for the distribution of data points in dataspace

3. It has substantially high time complexity of order O(n³)

4. It does not require prior knowledge of the no. of desired clusters

5. It is robust to outliers

Options:

A. 1 only

B. 2 only

C. 4 only

D. 2 and 3

Solution: D

- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the
distribution of data points in the data space.

- DBSCAN has a low time complexity of order O(n log n) only.

22. Point out the correct statement.

23. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned

Answer: d
Explanation: K-means clustering follows partitioning approach.
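
The three requirements in this question show up directly in a minimal k-means sketch. The version below is a 1-D illustration with hypothetical names: k is the number of clusters, the seed controls the random initial centroids, and absolute difference is the distance metric.

```python
import random

def kmeans_1d(points, k, seed, n_iter=20):
    """Plain k-means on 1-D data; returns the final centroids in sorted order."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial guess for the cluster centroids
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (distance metric: |p - c|).
            nearest = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its points; keep it if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return sorted(centroids)

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centroids = kmeans_1d(points, k=2, seed=0)
```

On this well-separated toy data the centroids settle near 1.0 and 9.0. On harder data a different seed can give a different initial guess and a different final result, which is why k-means is usually run several times.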

24. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned

Answer: c
Explanation: k-nearest neighbour has nothing to do with k-means.

25. Which of the following functions is used for k-means clustering?

a) k-means
b) k-mean
c) heat map
d) none of the mentioned

Answer: a
Explanation: K-means requires a number of clusters.

26. K-means is not deterministic and it also consists of a number of iterations.

a) True
b) False

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
27.
Page 1 of 7

UNIVERSITY OF OSLO
Faculty of Mathematics and Natural Sciences

Exam in INF3490/4490 — Biologically Inspired Computing

Day of exam: December 3rd, 2014
Exam hours: 14:30 – 18:30
This examination paper consists of 7 pages.
Appendices: 1
Permitted materials: None

Make sure that your copy of this examination paper is complete before answering.

The exam text consists of problems 1-30 (multiple choice questions) to be answered on
the form that is enclosed in the appendix and problems 31-33 which are answered on
the usual sheets. Problems 1-30 have a total weight of 60%, while problems 31-33 have a
weight of 40%.

About problem 1-30:

Each problem consists of a topic in the left column and a number of statements each indicated
by a capital letter. Problems are answered by marking true statements with a clear cross (X)
in the corresponding row and column in the attached form, and leaving false statements
unmarked. Each problem has a variable number of true statements, but there is always at
least one true and false statement for each problem. 0.5 points are given for each marked true
statement and for each false statement left unmarked, resulting in a score ranging from 0 to
60.

You can use the right column of the text as a draft. The form in the appendix is the one to be
handed in (remember to include your candidate number).

Problem 1
Hill climbing
A Is a population-based optimization algorithm
B Results depend on the starting points
C Can only be done when a solution has a finite number of neighbors
D Has less randomness than greedy search

Problem 2
Strategy parameters
A Adapt mutation using a fixed strategy schedule
B Improve the chances of finding a better solution in the short term
C Improve the chances of finding the global optimum
D Adapt mutation by adjusting the normal distribution spread

Problem 3
Evolution strategies have
A Random parents selection
B Uniform mutation
C Recombination by partially mapped crossover
D Fitness proportional survivor selection

Problem 4
The crossover operators used in binary representations can also be used in
A Integer representations
B Real-valued representations
C Permutation representations
D Tree representations

Problem 5
Permutation representation works with
A Swap mutation
B Creep mutation
C Scramble mutation
D Insert mutation

Problem 6
Adding an offset to all fitness values affects selection pressure in
A Fitness proportional selection
B Ranking selection
C Tournament selection
D μ + λ selection

Problem 7
One can improve results on multi-modal problems by
A Ensuring that the initial population is well distributed
B Reducing the population size
C Reducing the fitness of individuals that are close to others
D Increasing the selection pressure

Problem 8
Pareto dominance
A Is hard to combine with tournament selection
B Can be used to sort points according to multiple objectives
C Reduces the objective functions to a scalar value
D A solution dominates another if it is as good in every way and better in at least one

Problem 9
Running multiple times is necessary to measure the performance of
A An exhaustive search
B An evolution strategy
C Training a multi-layer perceptron
D Training a self-organizing map

Problem 10
Machine learning
A Should be distinguished from self-learning
B Is applicable to classification problems
C A number of different biology-inspired methods could be used for machine learning
D Is learning automatically from examples

Problem 11
Machine learning
A Can be applied to analyze new data
B Is an alternative to artificial intelligence
C Can be used at design time and/or at run time
D Is always learning from scratch and not adaptation of a previously learned system

Problem 12
Machine learning algorithms
A Supervised learning is good for clustering problems
B Reinforcement learning is about learning behavior based on reward
C Unsupervised learning does not require target values
D Selecting among the above learning methods is independent of the problem to be solved

Problem 13
Swarm intelligence algorithms
A Are inspired by interaction in nature between living beings in motion
B Are focused on centralized control
C Simple local rules are often applicable
D It is difficult to predict the global behavior of the system

Problem 14
Particle Swarm Optimization (PSO)
A Is a population based algorithm
B Particles are selected for survival based on their fitness
C Velocity and position of each solution are updated
D Updates are also based on neighbor particles

Problem 15
Cartesian Genetic Programming (CGP)
A Has less restrictions than Genetic Programming
B Can be used for evolving digital circuits
C The level-back parameter indicates the number of previous columns a node can connect to
D Crossover is always used

Problem 16
Classification
A Concerns finding decision boundaries that can be used to separate out different classes
B Evolvable hardware is not applicable for classification
C Non-linear decision boundaries can solve more complex problems than linear boundaries (straight lines)
D A test set is more relevant for testing generalization than the training set

Problem 17
Biological neural networks
A The outputs from a neuron are pulses of fixed strength (height) and duration
B The output from the neuron is called a synapse
C Synapses can be inhibitory or excitatory
D Learning takes place in the dendrites

Problem 18
Which function does the following multi-layer perceptron realize:
A NAND
B NOR
C AND
D XOR

Problem 19
Multilayer perceptron network
A Usually, the weights are initially set to small random values
B A hard limiting activation function is often used
C The weights can only be updated after all the training vectors have been presented
D Multiple layers of neurons allow for less complex decision boundaries than a single layer

Problem 20
Support Vector Machines (SVMs)
A Support vectors are used for computing hyperplanes
B Is a method for minimizing the margin to hyperplanes
C Nonlinear problems are handled with mapping inputs to lower-dimensional space
D Kernel functions are used for transforming data

Problem 21
Which separation line would A
SVM most likely choose? B
C
D

Problem 22
Soft margins in A Reduce misclassifications during training
SVMs B Allow some of the training data to be misclassified by
introducing slack variables
C Reduce the problem of training data overfitting
D Are not useful if any training data is mislabeled

Problem 23
Ensemble learning
A A combination of classifiers are applied for classification
B Classifiers should be trained to be slightly different
C In bagging, each training sample (data point) is used only once for each iteration
D Minority voting is used if there is disagreement

Problem 24
Principal component analysis (PCA)
A Finds the directions with the most variation in the data
B Is useful for visualizing data
C Dimensions are increased when applying PCA
D Eigenvalues and eigenvectors are computed from the covariance matrix

Problem 25
Unsupervised learning
A Categorizes training vectors by identifying similarities between them
B Can use the same error functions as supervised learning
C Collaborative learning methods are often applied between classes
D The data applied is unlabeled

Problem 26
k-means
A Automatically finds the number of clusters
B Each cluster center is moved to the mean of data points assigned to it for each iteration
C A too small number of clusters may lead to overfitting
D The algorithm has converged when the change in cluster assignment is less than a threshold

Problem 27
Self-Organizing Feature Map
A Includes both a competition and collaboration part
B Two or more weight layers are often used
C Training data that are similar excite neurons that are near to each other
D Represents a clustering technique

Problem 28
Self-Organizing Feature Map learning
A Increased network size leads to increased generalization
B Weights of the winner neuron (and its neighborhood) are updated
C The number of weights being modified for each training vector is increased throughout learning
D A neighborhood function is used to compute the distance to the winner neuron

Problem 29
Reinforcement learning
A Works best with smaller state spaces
B Keeps a log of all individual actions taken by the agent
C Requires the agent to know the rewards for every action
D Models learning behavior in animals

Problem 30
Reinforcement learning discount factor
A Is specified in the interval −1,0
B Is used to account for uncertainties about future rewards
C Develops exponentially with time
D Adjusts the balance between shortsightedness and farsightedness

Problem 31 (6%)
In a few sentences, sketch how you could modify a hill climbing algorithm in order to
improve the chances of finding the global optimum.

Problem 32 (10%)
If you were to design an evolutionary algorithm to optimize the following problems, what
kind of genetic representation (genotype) would you choose, and why? (Maximum two
sentences for each)
a) Finding the best route for delivering a set of packages to different addresses
b) Optimizing the parameters of a physical structure like an antenna with a given shape
c) Design of a digital circuit

Problem 33 (24%)
SiO, the student welfare organization, would like to have a system for sorting utensils after
washing. You are going to help them design a camera-based classifier system for sorting
knives, forks, spoons and teaspoons into separate bins. You have a machine vision library
available that lets you identify where there is a utensil in the camera images, and it extracts a
large number of features for each identified object that we can use as inputs.
(a) (4%) What class of learning algorithm would be best to use in this case, supervised,
unsupervised or reinforcement learning? Justify your answer.
(b) (4%) We would like to make a system for distinguishing the utensils using a multi-layer
perceptron network. How many output neurons should the network have, and what would
each of them represent?
(c) (8%) Sketch the steps in the forward and backward phase of the multi-layer perceptron
algorithm (backpropagation). Use words and not equations.
(d) (4%) What are the different approaches to how often weights are updated during training?
(e) (4%) How would you find out when to stop the training?
Appendix

INF3490/INF4490 Answers problems 1 – 30 for candidate no: __________

Problem A B C D
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

INF3490/INF4490 Answers problems 1 – 30

Problem A B C D
1 Ο
2 Ο Ο
3 Ο
4 Ο Ο
5 Ο Ο Ο
6 Ο
7 Ο Ο
8 Ο Ο
9 Ο Ο Ο
10 Ο Ο Ο
11 Ο Ο
12 Ο Ο
13 Ο Ο Ο
14 Ο Ο Ο
15 Ο Ο
16 Ο Ο Ο
17 Ο Ο
18 Ο
19 Ο
20 Ο Ο
21 Ο
22 Ο Ο
23 Ο Ο
24 Ο Ο Ο
25 Ο Ο
26 Ο Ο
27 Ο Ο Ο
28 Ο Ο
29 Ο Ο
30 Ο Ο Ο
Student name

MA2823: Foundations of Machine Learning

Final Exam – Solutions
December 16, 2016
Instructor: Chloé-Agathe Azencott
Multiple choice questions
1. (1 point) Taking a bootstrap sample of n data points in p dimensions means:
Sampling p features with replacement.
Sampling p features without replacement.
Sampling n samples with replacement.
Sampling k < n samples without replacement.

Solution: Sampling n samples with replacement.
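
As a minimal illustration of the answer above, the sketch below draws a bootstrap sample using only the Python standard library. The helper name `bootstrap_sample` and the toy data are assumptions made for this example, not part of the exam.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) rows from data uniformly at random, with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]


# Hypothetical toy data set: n = 5 points in p = 2 dimensions.
rng = random.Random(0)
data = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
sample = bootstrap_sample(data, rng)

# The sample has n rows; each row keeps all p features, and rows may repeat.
print(sample)
```

Note that the features are left untouched: only the rows (samples) are resampled.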

2. (2 points) Which of the following statements are true?

Training a k-nearest-neighbors classifier takes more computational time than applying it.
The more training examples, the more accurate the prediction of a k-nearest-neighbors classifier.
k-nearest-neighbors cannot be used for regression.
A k-nearest-neighbors classifier is sensitive to outliers.

Solution: False. True. False. True.

3. (4 points) Check all the binary classifiers that are able to correctly separate the training data
(circles vs. triangles) given in Figure 1.
Logistic regression
SVM with linear kernel
SVM with RBF kernel
Decision tree
3-nearest-neighbor classifier (with Euclidean distance).

Figure 1: Training data for Question 3 (scatter plot of circles vs. triangles; both axes span 0.0 to 1.0).

Solution:

• Logistic regression and linear SVM: linear decision functions, hence no.

• SVM with RBF kernel: yes.

• 3-NN: the 3 nearest neighbors of any point in our training set are 1 of the same
class and 2 of the opposite class, hence 3-NN will be systematically wrong.

• DT: yes, you can partition the space with lines orthogonal to the axes in such a way
that every sample ends up in a different region.

Short questions
4. (1 point) In a Bayesian learning framework, what is a posterior?

Solution: The updated probability p(θ|D) of a model, after having seen the data.

5. (1 point) Give an example of a loss function for classification problems.

Solution: cross-entropy; hinge loss; number of errors; etc.

6. (1 point) Give an example of an unsupervised learning algorithm.

Solution: Dimensionality reduction; PCA; clustering; k-means; etc.

7. (1 point) Pearson’s correlation between two variables $x$ and $z \in \mathbb{R}^p$ is given by

$$\rho(x, z) = \frac{\sum_{j=1}^{p} (x_j - \bar{x})(z_j - \bar{z})}{\sqrt{\sum_{j=1}^{p} (x_j - \bar{x})^2} \, \sqrt{\sum_{j=1}^{p} (z_j - \bar{z})^2}},$$

where $\bar{x} = \frac{1}{p} \sum_{j=1}^{p} x_j$. If the data is centered, why is this also referred to as the cosine
similarity?

Solution: If the data is centered, $\bar{x} = \bar{z} = 0$ and

$$\rho(x, z) = \frac{\langle x, z \rangle}{\|x\| \, \|z\|} = \cos \theta,$$

where θ is the angle between x and z.
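
The identity in this solution can be checked numerically. The sketch below is a hedged illustration in plain Python: the function names and the two toy vectors are assumptions chosen for the example.

```python
import math


def pearson(x, z):
    """Pearson correlation computed directly from its definition."""
    p = len(x)
    mx, mz = sum(x) / p, sum(z) / p
    num = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)) * math.sqrt(
        sum((b - mz) ** 2 for b in z)
    )
    return num / den


def cosine(x, z):
    """Cosine of the angle between x and z: <x, z> / (||x|| ||z||)."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in z)))


def center(v):
    """Subtract the mean so that the entries of v sum to zero."""
    m = sum(v) / len(v)
    return [a - m for a in v]


# Hypothetical toy vectors.
x = [1.0, 2.0, 4.0, 7.0]
z = [2.0, 1.0, 5.0, 6.0]
xc, zc = center(x), center(z)

# Pearson correlation of the raw vectors equals the cosine of the centered ones.
print(pearson(x, z), cosine(xc, zc))
```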

8. A decision tree partitions the data space $\mathcal{X}$ in m regions $R_1, R_2, \dots, R_m$. The function f that
associates a label to a data point $x \in \mathcal{X}$ can be written as

$$f(x) = \sum_{k=1}^{m} c_k I_{x \in R_k},$$

where $I_{x \in R_k}$ is an indicator function, i.e.

$$I_{x \in R_k} = \mathbb{1}_{x \in R_k} = \begin{cases} 1 & \text{if } x \in R_k \\ 0 & \text{otherwise.} \end{cases}$$

Given a training set $D = \{x^i, y^i\}_{i=1,\dots,n}$ where $x^i \in \mathcal{X}$ for $i = 1, \dots, n$, and assuming
we have an algorithm that allows us to define $R_k$ for $k = 1, \dots, m$, how does one define
$c_k$ ($k = 1, \dots, m$) for:
(a) (1 point) a classification problem (y i ∈ {0, 1})?

Solution: ck is the majority class of training points in Rk

(b) (1 point) a regression problem (y i ∈ R)?

Solution: ck is the average label of training points in Rk .

9. (2 points) A data scientist runs a principal component analysis on their data and tells you
that the percentage of variance explained by the first 3 components is 80 %. How is this
percentage of variance explained computed?

Solution: The overall variance is computed as the sum of the variances of all variables
(i.e. the sum of the diagonal terms of the covariance matrix). The variance explained
(or accounted for) by one PC is the variance of this PC (i.e. the diagonal term on the
corresponding entry of the covariance matrix of the data projected onto its PCs). The
variance explained by the first 3 components is the sum of the first three values on the
diagonal of the covariance matrix of the data projected onto its PCs.
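
The computation described in this solution can be sketched with NumPy as follows. The synthetic data, the random seed, and all variable names are assumptions made for illustration; the explained-variance ratio is divided by the total variance to express it as a fraction.

```python
import numpy as np

# Hypothetical data: 200 samples, 5 features with different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])
Xc = X - X.mean(axis=0)

# Overall variance: sum of the diagonal terms of the covariance matrix.
cov = np.cov(Xc, rowvar=False)
total_variance = np.trace(cov)

# Principal components: eigenvectors of the covariance matrix,
# sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Variance of each PC: diagonal of the covariance of the projected data.
projected = Xc @ eigvecs
pc_variances = projected.var(axis=0, ddof=1)

# Fraction of variance explained by the first 3 components.
explained_first3 = pc_variances[:3].sum() / total_variance
print(explained_first3)
```

The variances of the projected data coincide with the eigenvalues of the covariance matrix, which is why libraries usually report the ratio from the eigenvalues directly.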

10. Assume you are given data $\{(x^1, y^1), \dots, (x^n, y^n)\}$ where $x^i \in \mathcal{X}$ and $y^i \in \mathbb{R}$. You are
planning to train an SVM. You define a kernel k and obtain, on your training data, the kernel
matrix K presented in Figure 2, where $K_{ij} = k(x^i, x^j)$.
(a) (1 point) What is the issue here?

Figure 2: Kernel matrix on the training data for Question 10

Solution: Diagonal dominance: the kernel is equivalent to the identity matrix and
the SVM won’t learn.

(b) (1 point) How can you address it?

Solution: Normalize the kernel matrix by $K_{ij} \leftarrow \frac{K_{ij}}{\sqrt{K_{ii} K_{jj}}}$, or manipulate a coefficient
of your kernel to obtain non-zero off-diagonal terms.
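
The normalization step in this solution is easy to write out. The following plain-Python sketch uses a hypothetical 3 × 3 kernel matrix with a dominant diagonal; the function name `normalize_kernel` is an assumption for this example.

```python
import math


def normalize_kernel(K):
    """Return the matrix with entries K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(K)
    return [
        [K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical kernel matrix whose diagonal entries dwarf the others.
K = [
    [4.0, 1.0, 0.5],
    [1.0, 9.0, 1.5],
    [0.5, 1.5, 16.0],
]
K_norm = normalize_kernel(K)

# After normalization every diagonal entry equals 1, so all points
# have unit norm in feature space.
for row in K_norm:
    print(row)
```

Normalization only rescales the entries; if the off-diagonal terms are still negligible afterwards, adjusting a kernel parameter (the second option in the solution) is the remaining lever.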

11. (2 points) Assume we are given data $\{(x^1, y^1), \dots, (x^n, y^n)\}$ where $x^i \in \mathbb{R}^p$ and $y^i \in \mathbb{R}$,
and a parameter λ > 0. We denote by X the n × p matrix of row vectors $x^1, \dots, x^n$ and
$y = (y^1, \dots, y^n)$. We are also given a graph structure on the features, where vertices are
features and edges connect related features. We denote by E the set of edges of this graph.
The graph-Laplacian-regularized linear regression estimator is defined as:

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 + \lambda \sum_{(u,v) \in E} (\beta_u - \beta_v)^2.$$

What does the regularizer $\sum_{(u,v) \in E} (\beta_u - \beta_v)^2$ enforce?

Solution: That connected features get similar weights.

12. Consider a data set described using 1 000 features in total. The labels have been generated
using the first 50 features. Another 50 features are exact copies of these features. The 900
remaining features are uninformative. Assume we have 100 000 training data points.
(a) (2 points) How many features will a filtering approach select?

Solution: 100 (50 informative + 50 copies).


(b) (2 points) How many features will a wrapper approach select?

Solution: 50 (only informative features).

Problems
13. Perceptron. Consider the following Boolean function:

x1   x2   y = ¬x1 ∨ x2
0    0    1
0    1    1
1    0    0
1    1    1

(a) (2 points) Can this function be represented by a perceptron? Explain your answer.

Solution: Yes, because the function is linearly separable.

[Plot of the four input points in the (x1, x2) plane, both axes from 0.0 to 1.0: (0, 0), (0, 1) and (1, 1) are marked +, and (1, 0) is marked −.]

(b) (4 points) If yes, draw a perceptron that represents it. Otherwise, build a multilayer
neural network that will.

Solution: A perceptron has the following architecture:

w0 = 1, w1 = −1, w2 = 2
Its output is given by: 1 if w0 + w1 x1 + w2 x2 > 0 and 0 otherwise.
This is one of many possible solutions. w0 , w1 , w2 must give the equation of a line
that separates (1, 0) from (0, 0), (0, 1) and (1, 1).
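
The weights given in this solution can be checked against the truth table of Question 13. The sketch below is a small verification in Python; the function and variable names are assumptions for the example.

```python
def perceptron(x1, x2, w0=1, w1=-1, w2=2):
    """Output 1 if w0 + w1*x1 + w2*x2 > 0, and 0 otherwise."""
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0


# Truth table of y = (not x1) or x2, as given in the question.
truth_table = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 1}

outputs = {inputs: perceptron(*inputs) for inputs in truth_table}
print(outputs)
```

For the input (1, 0) the weighted sum is exactly 0, so the strict inequality matters: the output is 0 as required.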

14. Multi-class classification. Assume p random variables $X_1, \dots, X_p$, conditionally independent
given Y. Y is a discrete random variable that can take one of K values $y_1, \dots, y_K$,
corresponding to K classes. Each $X_j$ is Boolean.
We suppose that each $X_j$ follows a Bernoulli distribution:

$$P(X_j = u \mid Y = y_k) = \theta_{jk}^{u} (1 - \theta_{jk})^{(1-u)}, \quad u \in \{0, 1\}.$$

We observe n data points $x^1, \dots, x^n$ and their labels $y^1, \dots, y^n$.
In what follows, you can use the indicator

$$I_{ik} = \mathbb{1}_{y^i = y_k} = \begin{cases} 1 & \text{if } y^i = y_k \\ 0 & \text{otherwise.} \end{cases}$$

We will call $n_k$ the number of training points in class k, and $n_{jk}$ the count of training points
in class k for which $x_j = 1$:

$$n_{jk} = \sum_{i=1}^{n} \mathbb{1}_{y^i = y_k} \, x_j^i.$$
(a) (2 points) What is the likelihood of the parameter θjk ?

Solution: The likelihood of the parameters is given by

$$L(\theta_{jk}) = \prod_{i=1}^{n} p(X_j = x_j^i \mid \theta_{jk})^{I_{ik}} = \prod_{i=1}^{n} \left( \theta_{jk}^{x_j^i} (1 - \theta_{jk})^{(1 - x_j^i)} \right)^{I_{ik}}.$$

(b) (3 points) What is the maximum likelihood estimator of θjk ?

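Solution: The log-likelihood is:

$$l(\theta_{jk}) = \sum_{i=1}^{n} I_{ik} \left( x_j^i \log \theta_{jk} + (1 - x_j^i) \log (1 - \theta_{jk}) \right).$$

Taking the derivative with respect to $\theta_{jk}$ and setting it to 0:

$$\frac{\partial l(\theta_{jk})}{\partial \theta_{jk}} = 0,$$

we obtain

$$\frac{1}{\theta_{jk}} \sum_{i=1}^{n} I_{ik} x_j^i - \frac{1}{1 - \theta_{jk}} \sum_{i=1}^{n} I_{ik} (1 - x_j^i) = 0.$$

Since $\sum_{i} I_{ik} x_j^i = n_{jk}$ and $\sum_{i} I_{ik} = n_k$, this reads $n_{jk} (1 - \theta_{jk}) = \theta_{jk} (n_k - n_{jk})$.
Finally,

$$\hat{\theta}_{jk} = \frac{n_{jk}}{n_k}.$$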

For a data point $x = (x_1, \dots, x_p)$, we can write the Naive Bayes decision rule as:

$$f(x) = \arg\max_{k=1,\dots,K} \left( \frac{P(Y = y_k) P(x \mid Y = y_k)}{\sum_{l=1}^{K} P(Y = y_l) P(x \mid Y = y_l)} \right).$$

(c) (2 points) When making predictions, we use the rule

$$f(x) = \arg\max_{k=1,\dots,K} \left( P(Y = y_k) \prod_{j=1}^{p} P(x_j \mid Y = y_k) \right).$$

Why?

Solution: Because (i) the independence assumption lets us write

$$P(x \mid Y = y_k) = \prod_{j=1}^{p} P(x_j \mid Y = y_k),$$

and (ii) the denominator does not depend on k.

(d) (1 point) Given a data point x, how can you calculate P(X = x) given the parameters
estimated by Naive Bayes?

Solution: $P(X = x) = \sum_{k} P(X = x \mid Y = y_k) P(Y = y_k)$.
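
The estimator, decision rule, and marginal from Question 14 can be put together in a short sketch. This is an illustrative plain-Python version under the question's assumptions (binary features, no smoothing); the function names and the toy data set are hypothetical.

```python
from collections import Counter


def fit_bernoulli_nb(X, y):
    """Maximum likelihood estimates: prior[k] = n_k / n, theta[k][j] = n_jk / n_k."""
    n, p = len(y), len(X[0])
    n_k = Counter(y)
    prior = {k: n_k[k] / n for k in n_k}
    theta = {
        k: [sum(x[j] for x, label in zip(X, y) if label == k) / n_k[k] for j in range(p)]
        for k in n_k
    }
    return prior, theta


def joint(x, k, prior, theta):
    """P(Y = k) * prod_j P(x_j | Y = k) under the independence assumption."""
    prob = prior[k]
    for xj, t in zip(x, theta[k]):
        prob *= t if xj == 1 else (1.0 - t)
    return prob


def predict(x, prior, theta):
    """Naive Bayes rule: the class maximizing the joint (denominator dropped)."""
    return max(prior, key=lambda k: joint(x, k, prior, theta))


def evidence(x, prior, theta):
    """P(X = x) = sum_k P(X = x | Y = k) P(Y = k)."""
    return sum(joint(x, k, prior, theta) for k in prior)


# Hypothetical training set: 6 points, p = 2 binary features, 2 classes.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 1], [0, 1]]
y = ["a", "a", "b", "b", "a", "b"]
prior, theta = fit_bernoulli_nb(X, y)

print(theta)
print(predict([1, 1], prior, theta), evidence([1, 1], prior, theta))
```

In practice a smoothing term is usually added to the counts so that no estimate is exactly 0 or 1; that refinement is outside the scope of the question.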

15. Virtual high-throughput screening. Figure 3 presents the performance of several algorithms
applied to the problem of classifying molecules in two classes: those that inhibit
Human Respiratory Syncytial Virus (HRSV), and those that do not. HRSV is the most frequent
cause of respiratory tract infections in small children, with a worldwide estimated
prevalence of about 34 million cases per year among children under 5 years of age.
(a) (1 point) Which method gives the best performance?

Solution: Random forests (top line).

(b) (2 points) The goal of this study is to develop an algorithm that can be used to suggest,
among a large collection of several million molecules, those that should be experimentally
tested for activity against HRSV. Compounds that are active against HRSV are
good leads from which to develop new medical treatments against infections caused by
this virus. In this context, is it preferable to have a high sensitivity or a high specificity?
Which part of the ROC curve is the most interesting?

Solution: We want a low false positive rate (so as to ensure there are mostly promising
compounds among those that will be selected for further development; therapeutic
development is costly), i.e. high specificity. We’re interested in the left part
of the curve: what sensitivity can we get for a fixed specificity?

Figure 3: ROC curves for several algorithms classifying molecules according to their action on
HRSV, computed on a test set. Sensitivity = True Positive Rate. Specificity = 1 - False Positive
Rate. VS-RF : Random Forest. SVM : Support Vector Machine. GP : Gaussian Process. LDA :
Linear Discriminant Analysis. kNN : k-Nearest Neighbors. Source: M. Hao, Y. Li, Y. Wang, and S.
Zhang, Int. J. Mol. Sci. 2011, 12(2), 1259-1280.

(c) (1 point) In this study, the authors have represented the molecules based on 777 descriptors.
Those descriptors include the number of oxygen atoms, the molecular weight,
the number of rotatable bonds, or the estimated solubility of the molecule. They have
fewer samples (216) than descriptors. What is the danger here?

Solution: Overfitting.

16. Kernel ridge regression. Assume we are given data $\{(x^1, y^1), \dots, (x^n, y^n)\}$ where $x^i \in \mathbb{R}^p$
is centered and $y^i \in \mathbb{R}$, and a parameter λ > 0. We denote by X the n × p matrix of row
vectors $x^1, \dots, x^n$ and $y = (y^1, \dots, y^n)$. The ridge regression estimator is defined as:

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2.$$

One way to write the solution to this problem is:

$$\hat{\beta} = X^\top (X X^\top + \lambda I)^{-1} y.$$
(a) (1 point) Does this solution always exist? Justify your answer.

Solution: Yes: $X X^\top$ is positive semi-definite, so $(X X^\top + \lambda I)$ is positive definite and can always be inverted if λ > 0.

(b) (2 points) Write down the value of the prediction for a data point $x' \in \mathbb{R}^p$, as a function
of X, y and λ.

Solution:
$$\hat{y} = \hat{\beta}^\top x' = y^\top (X X^\top + \lambda I)^{-1} X x'.$$
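
As a numerical sanity check of this expression, the NumPy sketch below compares it with the more familiar form $(X^\top X + \lambda I)^{-1} X^\top y$ of the ridge estimator, which is a standard result not stated in the exam. The random data, the seed, and the variable names are assumptions for the example.

```python
import numpy as np

# Hypothetical centered data: n = 8 samples, p = 3 features.
rng = np.random.default_rng(0)
n, p, lam = 8, 3, 0.5
X = rng.normal(size=(n, p))
X -= X.mean(axis=0)
y = rng.normal(size=n)
x_new = rng.normal(size=p)

# Usual form of the ridge estimator: (X^T X + lam I)^{-1} X^T y.
beta_primal = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Form used in the exam: X^T (X X^T + lam I)^{-1} y.
beta_dual = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

# Prediction written as y^T (X X^T + lam I)^{-1} X x'.
y_hat = y @ np.linalg.solve(X @ X.T + lam * np.eye(n), X @ x_new)

print(beta_primal, beta_dual, y_hat)
```

The second form only involves inner products between data points (the entries of $X X^\top$ and $X x'$), which is what makes the kernel version in part (c) possible.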

(c) (2 points) Let us now replace all data points with their image in a Hilbert space $\mathcal{H}$: x
is replaced by φ(x), where $\phi : \mathbb{R}^p \to \mathcal{H}$. Let us define K as the n × n matrix with
entries $K_{ij} = \langle \phi(x^i), \phi(x^j) \rangle_{\mathcal{H}}$, and κ as the n-dimensional vector with entries
$\kappa_i = \langle \phi(x^i), \phi(x') \rangle_{\mathcal{H}}$.
We are now solving the following optimization problem:

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \|y - \Phi\beta\|_2^2 + \lambda \|\beta\|_2^2,$$

where Φ is the n × p matrix of row vectors $\phi(x^1), \dots, \phi(x^n)$.

Write down the value of the prediction for a data point $x' \in \mathbb{R}^p$, as a function of K, κ,
y and λ, without using φ.

Solution:
$$\hat{y} = \hat{\beta}^\top \phi(x') = y^\top (K + \lambda I)^{-1} \kappa.$$
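
To illustrate that the kernel expression needs no explicit feature map, the sketch below compares it with ridge regression run on an explicit map. It assumes the quadratic kernel $(\langle x, x' \rangle + 1)^2$ in two dimensions, for which an explicit 6-dimensional map is known; the data and all names are hypothetical.

```python
import numpy as np


def quad_kernel(A, B):
    """Quadratic kernel (<a, b> + 1)^2 for all pairs of rows of A and B."""
    return (A @ B.T + 1.0) ** 2


def phi(A):
    """Explicit feature map of the quadratic kernel for 2-dimensional inputs."""
    x1, x2 = A[:, 0], A[:, 1]
    s = np.sqrt(2.0)
    return np.column_stack([np.ones(len(A)), s * x1, s * x2, x1**2, x2**2, s * x1 * x2])


# Hypothetical data: 10 training points in 2 dimensions and one query point.
rng = np.random.default_rng(1)
X = rng.normal(size=(10, 2))
y = rng.normal(size=10)
x_new = rng.normal(size=(1, 2))
lam = 0.3

# Kernel form: y^T (K + lam I)^{-1} kappa, using only kernel evaluations.
K = quad_kernel(X, X)
kappa = quad_kernel(X, x_new)[:, 0]
y_hat_kernel = y @ np.linalg.solve(K + lam * np.eye(len(X)), kappa)

# Explicit form: ridge regression on phi(x), then beta^T phi(x').
Phi = phi(X)
beta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
y_hat_explicit = phi(x_new)[0] @ beta

print(y_hat_kernel, y_hat_explicit)
```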

(d) (2 points) Could the kernel trick be applied in a similar fashion to the $\ell_1$-regularized
linear regression (Lasso)?

Solution: No, because unlike $\|w\|_2$, $\|w\|_1$ cannot be expressed as a dot product.

17. Quadratic SVM. We are given the 2-dimensional training data D shown in Figure 4 for a
binary classification problem (circles vs. triangles). Assume we are using an SVM with a
quadratic kernel. Let C be the cost parameter of the SVM.
Assuming $D = \{x^i, y^i\}_{i=1,\dots,n}$ with $x \in \mathbb{R}^2$ and $y \in \{-1, +1\}$, recall that the SVM is solving
the following optimization problem:

$$\arg\min_{w \in \mathbb{R}^p, \, b \in \mathbb{R}} \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{such that}$$
$$y^i \left( \langle w, \phi(x^i) \rangle + b \right) \geq 1 - \xi_i \quad \text{for all } i = 1, \dots, n,$$
$$\xi_i \geq 0 \quad \text{for all } i = 1, \dots, n,$$

where φ is such that $\langle \phi(x), \phi(x') \rangle = (\langle x, x' \rangle + 1)^2$.

Figure 4: Training data for Question 17. Two panels: (a) Very large C. (b) Very small C.

(a) (2 points) On Figure 4 (a), draw the decision boundary for a very large value of C. Justify
your answer here.

Solution: The soft-margin formulation of the SVM can be rewritten as

$$\arg\min_{f} \frac{1}{\text{margin}(f)} + C \times \text{error}(f).$$

Large C means the classifier makes few errors. Quadratic SVM means the decision
boundary is an ellipsoid.

(b) (2 points) On Figure 4 (b), draw the decision boundary for a very small value of C. Justify
your answer here.

Solution: The soft-margin formulation of the SVM can be rewritten as

$$\arg\min_{f} \frac{1}{\text{margin}(f)} + C \times \text{error}(f).$$

Small C means the classifier has a large margin. Quadratic SVM means the decision
boundary is an ellipsoid.

Solution: Small C. The two triangles near the circles are most likely noise/outliers.

18. K-means clustering.

(a) (4 points) Consider the unlabeled two-dimensional data represented in Fig. 5. Using
the two points marked as squares as initial centroids, draw (on that same figure) the
clusters obtained after one iteration of the k-means algorithm (k = 2).

Solution: [Plot of the data of Fig. 5, both axes from 0 to 7, with the points split into two groups labeled Cluster 1 and Cluster 2.]
(b) (2 points) Does your solution change after another iteration of the k-means algorithm?

Solution: No.

Figure 5: Data for Question 18 (scatter plot; both axes span 0 to 7).
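
The two steps of a k-means iteration used in Question 18 can be sketched as follows. The points and initial centroids below are hypothetical and are not the data of Figure 5, which is not recoverable from the text; the function names are also assumptions.

```python
def assign(points, centroids):
    """Assignment step: index of the closest centroid for each point."""
    labels = []
    for px, py in points:
        dists = [(px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            mean_x = sum(p[0] for p in members) / len(members)
            mean_y = sum(p[1] for p in members) / len(members)
            new_centroids.append((mean_x, mean_y))
        else:
            new_centroids.append(old)  # keep an empty cluster's centroid in place
    return new_centroids


# Hypothetical data with two well-separated groups (k = 2).
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6)]
centroids = [(1, 1), (5, 5)]

labels_first = assign(points, centroids)
centroids = update(points, labels_first, centroids)
labels_second = assign(points, centroids)

# If the assignments do not change, the algorithm has converged.
print(labels_first, labels_second)
```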

Bonus questions
19. (1 point) In scikit-learn, what is the difference between the methods predict and
predict_proba for classifiers?

Solution: predict returns a class prediction, while predict_proba returns the probabilities
to belong to each of the classes.

20. (1 point) Which feature(s) can you use to represent months in such a way that December is
equally distant from January and November using the Euclidean distance?

Solution: Map to a circle and use cosine and sine of the angle, i.e. use 2 features $\cos(\frac{\pi k}{6})$
and $\sin(\frac{\pi k}{6})$.
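
A quick check of this encoding in Python, assuming months are numbered k = 1, ..., 12 (the function names are hypothetical):

```python
import math


def encode_month(k):
    """Map month k (1 = January, ..., 12 = December) to a point on the unit circle."""
    angle = math.pi * k / 6
    return (math.cos(angle), math.sin(angle))


def euclidean(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


january, june, november, december = (encode_month(k) for k in (1, 6, 11, 12))

# December is equally distant from January and November,
# and closer to both than to June.
print(euclidean(december, january), euclidean(december, november))
print(euclidean(december, june))
```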
Family name: Vision and Machine-Learning
Given name: 1/28/2011

Multiple-Choice Questionnaire
Group B

No documents authorized.
There can be several right answers to a question.
Marking-scheme: 2 points if all right answers are selected, 1 point in case of a
right but incomplete answer, 0 point if a wrong answer is selected.

Question N.1:
Large scale visual search. The number of visual words (VW) and their
structure are parameters, if we want to perform large scale search for particular
objects/buildings/scenes taken from different viewpoints. Select the statements
which are correct.
Possible answers:
a. A small number of VW (between 1000 and 4000) gives excellent performance.
b. A very large number of VW (between 200k and 1M) gives excellent performance.
c. An average number of VW (around 20k) combined with a refinement based
on a short binary signature gives excellent performance.
d. A hierarchically structured visual vocabulary improves the performance in
terms of search accuracy.

Question N.2:
Image features. Given scale invariant interest points and SIFT descriptors
which are normalized in the direction of the dominant gradient orientation. Select
the properties which are correct.
Possible answers:
a. These descriptors allow to match images taken at different distances.
b. These descriptors are invariant to image rotation and translation.
c. These descriptors are invariant to affine transformations.
d. The detected regions indicate the local characteristic scale of the image.

Question N.3:
Image features. The Harris detector extracts interest points for a given image.
Select the properties which are correct.
Possible answers:
a. The detector is based on the auto-correlation matrix.
b. The detector selects the characteristic scale.
c. The detector finds discriminant points.
d. The detector is invariant to rotation.

Question N.4:
Bag-of-features models for category-level classification. Image classification
is one task in category recognition. Select the statements which are true
in the context of image classification.
Possible answers:
a. The PASCAL dataset is a standard to compare the performance of different
algorithms for image classification.
b. Image classification allows to localize objects in the image.
c. When training an image classifier, we use positive and negative training images.
d. The number of visual words used in the context of image classification is in
general very high (between 100k and 1M visual words).

Question N.5:
Bag-of-features models for category-level classification. The spatial
pyramid kernel can be used for image classification. Select the statements which
are correct.
Possible answers:
a. The spatial pyramid kernel captures coarsely the global spatial layout of the
image.
b. The spatial pyramid kernel works well for classifying scenes.
c. The spatial pyramid kernel is well adapted for classifying images with objects
in arbitrary positions.
d. The spatial pyramid kernel is invariant to image rotation.

Question N.6:
Camera geometry and image alignment. Two images, $I$ and $I'$, are captured
by two cameras with internal calibration matrices $K$ and $K'$. The two
cameras have the same camera center and are related by a pure rotation given
by matrix $R$ (the mosaicking scenario). What is the form of homography $H$
relating the two images in terms of $K$, $K'$ and $R$?
Hint: Start from the perspective projection equation, which has the form $x = \frac{1}{z} K [R \; t] X$,
where x (3-vector) and X (4-vector) are the image and scene points in homogeneous
coordinates, respectively. Assume the two cameras have the same camera center at
the origin, $t_1 = t_2 = 0$, and use the fact that the scene point X can be written as
$X = \begin{pmatrix} \tilde{X} \\ 1 \end{pmatrix}$, where $\tilde{X}$ (3-vector) is in non-homogeneous coordinates.

Possible answers:
a. $H = K' R K$
b. $H = R K R^{-1}$
c. $H = R K' R^{-1}$
d. $H = K' R K^{-1}$

Question N.7:
Camera geometry and image alignment. What is the minimal number of
point-to-point correspondences to compute (i) homography and (ii) 2D affine
geometric transformation?
Possible answers:
a. 1 correspondence for affine transformation, 2 correspondences for homography.
b. 2 correspondences for affine transformation, 3 correspondences for homography.
c. 3 correspondences for affine transformation, 4 correspondences for homography.
d. 4 correspondences for affine transformation, 5 correspondences for homography.

Question N.8:
Large scale visual search. N SIFT descriptors are indexed using a randomized
KD-tree discussed in the lecture. What is the complexity (in terms of N) of
finding an approximate nearest neighbor to a query SIFT descriptor?
Possible answers:
a. $N^2$

b. N

c. log N

d. log log N

Question N.9:
Unsupervised learning. The k-means algorithm is a :
Possible answers:
a. supervised learning algorithm.
b. unsupervised learning algorithm.
c. semi-supervised learning algorithm.
d. weakly supervised learning algorithm.

Question N.10:
Unsupervised learning. Let k(·, ·) be a positive definite kernel defining a similarity
measure. The spectral clustering algorithm relies upon the singular value
decomposition (SVD) of:
Possible answers:
a. $K = [k(x_i, x_j)]_{1 \leq i,j \leq n}$

b. $\tilde{K} = \Pi K \Pi$, with $\Pi = I_{n,n} - \frac{1}{n} 1_n 1_n^T$ and $K$ defined above

c. $L = D - K$, with $D = \mathrm{diag}(\deg(x_1), \dots, \deg(x_n))$ and $\deg(x_i) = \sum_{j=1}^{n} k(x_i, x_j)$ for $i = 1, \dots, n$

d. $L = \tilde{K} - K$, with $\tilde{K}$ and $K$ defined above

Question N.11:
Supervised learning. The support vector machine uses the loss function
Possible answers:
a. $\ell(y, f) = \max(0, 1 - yf)$

b. $\ell(y, f) = \log(1 + \exp(-yf))$

c. $\ell(y, f) = (y - f)^2$

d. $\ell(y, f) = |y - f|$

Question N.12:
Supervised learning. For competitive performance (and competitive generalization
error), the C parameter of the support vector machine should be:
Possible answers:
a. kept fixed to C = 1 regardless of the training data at hand
b. optimized on the test set, used eventually for evaluating the true performance
of the learning algorithm
c. optimized on the training set
d. optimized through a cross-validation loop on the training set

Question N.13:
Category-level localization. A linear SVM classifier used in combination
with the sliding-window object detector
Possible answers:
a. is fast because of the cascade structure
b. is fast because it can be expressed in the form of a dot-product $f(x) = w^\top x + b$.

c. is slower compared to the nonlinear SVM.

d. usually has lower accuracy compared to the nonlinear SVM.

Question N.14:
Category-level localization. Pictorial structure models are often used to
model objects in terms of parts and relations between parts. The graph of a
pictorial structure model
Possible answers:
a. has nodes corresponding to object parts and edges corresponding to part
relations.
b. has nodes corresponding to part relations and edges corresponding to object
parts.
c. has associated energy function which can always be optimized in polynomial
time.
d. typically has a tree or a star structure for efficiency reasons.

Question N.15:
Motion and human actions. Optical flow estimation is problematic
Possible answers:
a. in homogeneous image areas.
b. in textured image areas.
c. at image edges.
d. at the boundaries of moving objects.

Question N.16:
Motion and human actions. Movie scripts can be used as a source of readily
available supervision. They can provide
Possible answers:
a. spatial supervision for objects in the video.
b. noisy temporal supervision.
c. reliable temporal supervision.
d. complete description of a video.

Multiple-Choice Questionnaire
Group B

              a b c d
Question n.1
Question n.2
Question n.3
Question n.4
Question n.5
Question n.6
Question n.7
Question n.8
Question n.9
Question n.10
Question n.11
Question n.12
Question n.13
Question n.14
Question n.15
Question n.16
Multiple-Choice Questionnaire
Group B

a b c d
Question X X
n.1
Question X X X
n.2
Question X X X
n.3
Question X X
n.4
Question X X
n.5
Question X
n.6
Question X
n.7
Question X
n.8
Question X
n.9
Question X
n.10
Question X
n.11
Question X
n.12
Question X X
n.13
Question X X
n.14
Question X X X
n.15
Question X
n.16
1. The process of forming general concept definitions from examples of concepts to be learned.

A. deduction
B. abduction
C. induction
D. conjunction

2. Computers are best at learning

A. facts.
B. concepts.
C. procedures.
D. principles.

3. Data used to build a data mining model.

A. validation data
B. training data
C. test data
D. hidden data

4. Supervised learning and unsupervised clustering both require at least one

A. hidden attribute.
B. output attribute.
C. input attribute.
D. categorical attribute.

5. Supervised learning differs from unsupervised clustering in that supervised learning requires

A. at least one input attribute.

B. input attributes to be categorical.
C. at least one output attribute.
D. output attributes to be categorical.

6. A regression model in which more than one independent variable is used to predict the dependent
variable is called

A. a simple linear regression model

B. a multiple regression model
C. an independent model
D. none of the above
7. A term used to describe the case when the independent variables in a multiple regression model are
correlated is

A. regression
B. correlation
C. multicollinearity
D. none of the above

8. A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2
constant), y will

A. increase by 3 units
B. decrease by 3 units
C. increase by 4 units
D. decrease by 4 units

9. A multiple regression model has

A. only one independent variable

B. more than one dependent variable
C. more than one independent variable
D. none of the above

10. A measure of goodness of fit for the estimated regression equation is the

A. multiple coefficient of determination

B. mean square due to error
C. mean square due to regression
D. none of the above

11. The adjusted multiple coefficient of determination accounts for

A. the number of dependent variables in the model

B. the number of independent variables in the model
C. unusually large predictors
D. none of the above

12. The multiple coefficient of determination is computed by

A. dividing SSR by SST

B. dividing SST by SSR
C. dividing SST by SSE
D. none of the above

13. For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of determination is

A. 0.25
B. 4.00
C. 0.75
D. none of the above
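
For reference, the arithmetic behind Questions 12 and 13 can be written out in a few lines of Python. The sketch relies on the standard decomposition SST = SSR + SSE, which the quiz does not state explicitly; the variable names are assumptions.

```python
# Values given in Question 13.
sst = 200.0  # total sum of squares
sse = 50.0   # sum of squares due to error

# Standard decomposition: SST = SSR + SSE, so SSR = SST - SSE.
ssr = sst - sse

# Multiple coefficient of determination: SSR divided by SST.
r_squared = ssr / sst
print(r_squared)
```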
14. A nearest neighbor approach is best used

A. with large-sized datasets.

B. when irrelevant attributes have been removed from the data.
C. when a generalized model of the data is desirable.
D. when an explanation of what has been found is of primary importance.

15. Determine which is the best approach for each problem.

A. supervised learning
B. unsupervised clustering
C. data query

1. What is the average weekly salary of all female employees under forty years of age? (C)
2. Develop a profile for credit card customers likely to carry an average monthly balance of more
than $1000.00. (A)
3. Determine the characteristics of a successful used car salesperson. (A)
4. What attribute similarities group customers holding one or several insurance policies? (A)
5. Do meaningful attribute relationships exist in a database containing information about credit
card customers? (B)
6. Do single men play more golf than married men? (C)
7. Determine whether a credit card transaction is valid or fraudulent (A)

16. Another name for an output attribute.

A. predictive variable
B. independent variable
C. estimated variable
D. dependent variable

17. Classification problems are distinguished from estimation problems in that

A. classification problems require the output attribute to be numeric.

B. classification problems require the output attribute to be categorical.
C. classification problems do not allow an output attribute.
D. classification problems are designed to predict future outcome.

18. Which statement is true about prediction problems?

A. The output attribute must be categorical.

B. The output attribute must be numeric.
C. The resultant model is designed to determine future outcomes.
D. The resultant model is designed to classify current behavior.
19. Which statement about outliers is true?

A. Outliers should be identified and removed from a dataset.

B. Outliers should be part of the training dataset but should not be present in the test data.
C. Outliers should be part of the test dataset but should not be present in the training data.
D. The nature of the problem determines how outliers are used.
E. More than one of a,b,c or d is true.

20. Which statement is true about neural network and linear regression models?

A. Both models require input attributes to be numeric.

B. Both models require numeric attributes to range between 0 and 1.
C. The output of both models is a categorical attribute value.
D. Both techniques build models whose output is determined by a linear sum of weighted input
attribute values.
E. More than one of a,b,c or d is true.

21. Which of the following is a common use of unsupervised clustering?

A. detect outliers
B. determine a best set of input attributes for supervised learning
C. evaluate the likely performance of a supervised learner model
D. determine if meaningful relationships can be found in a dataset
E. All of a,b,c, and d are common uses of unsupervised clustering.

22. The average positive difference between computed and desired outcome values.

A. root mean squared error

B. mean squared error
C. mean absolute error
D. mean positive error

23. Selecting data so as to assure that each class is properly represented in both the training and
test set.

A. cross validation
B. stratification
C. verification
D. bootstrapping

24. The standard error is defined as the square root of this computation.

A. The sample variance divided by the total number of sample instances.

B. The population variance divided by the total number of sample instances.
C. The sample variance divided by the sample mean.
D. The population variance divided by the sample mean.
25. Data used to optimize the parameter settings of a supervised learner model.

A. training
B. test
C. verification
D. validation

26. Bootstrapping allows us to

A. choose the same training instance several times.

B. choose the same test set instance several times.
C. build models with alternative subsets of the training data several times.
D. test a model with alternative subsets of the test data several times.

27. The correlation between the number of years an employee has worked for a company and
the salary of the employee is 0.75. What can be said about employee salary and years
worked?

A. There is no relationship between salary and years worked.

B. Individuals that have worked for the company the longest have higher salaries.
C. Individuals that have worked for the company the longest have lower salaries.
D. The majority of employees have been with the company a long time.
E. The majority of employees have been with the company a short period of time.

28. The correlation coefficient for two real-valued attributes is –0.85. What does this value tell
you?

A. The attributes are not linearly related.

B. As the value of one attribute increases the value of the second attribute also increases.
C. As the value of one attribute decreases the value of the second attribute increases.
D. The attributes show a curvilinear relationship.
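For reference, a correlation coefficient like the ones quoted in the two questions above can be computed directly from paired values. The following is a minimal sketch; the function name `pearson_r` and the sample data are made up for illustration and are not part of the quiz.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: y falls as x rises, so r comes out negative.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.0, 8.5, 6.0, 4.5, 2.0]
r = pearson_r(x, y)
print(round(r, 3))
```

A value near +1 or −1 indicates a strong linear relationship; the sign gives its direction.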

29. The average squared difference between classifier predicted output and actual output.

A. mean squared error

B. root mean squared error
C. mean absolute error
D. mean relative error
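The error measures named in questions 22 and 29 differ only in how each difference is aggregated. A small sketch, assuming plain Python lists of predicted and actual values (the function names and numbers are illustrative only):

```python
import math

def mean_absolute_error(pred, actual):
    # Average of the absolute (positive) differences.
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def mean_squared_error(pred, actual):
    # Average of the squared differences.
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)

def root_mean_squared_error(pred, actual):
    # Square root of the mean squared error, in the units of the output.
    return math.sqrt(mean_squared_error(pred, actual))

# Hypothetical predictions and targets.
predicted = [2.0, 4.0, 6.0]
actual = [1.0, 4.0, 9.0]
mae = mean_absolute_error(predicted, actual)
mse = mean_squared_error(predicted, actual)
rmse = root_mean_squared_error(predicted, actual)
```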

30. Simple regression assumes a __________ relationship between the input attribute and
output attribute.

A. linear
B. quadratic
C. reciprocal
D. inverse
31. Regression trees are often used to model _______ data.

A. linear
B. nonlinear
C. categorical
D. symmetrical

32. The leaf nodes of a model tree are

A. averages of numeric output attribute values.

B. nonlinear regression equations.
C. linear regression equations.
D. sums of numeric output attribute values.

33. Logistic regression is a ________ regression technique that is used to model data having a
_____outcome.

A. linear, numeric
B. linear, binary
C. nonlinear, numeric
D. nonlinear, binary

34. This technique associates a conditional probability value with each data instance.

A. linear regression
B. logistic regression
C. simple regression
D. multiple linear regression

35. This supervised learning technique can process both numeric and categorical input attributes.

A. linear regression
B. Bayes classifier
C. logistic regression
D. backpropagation learning

36. With Bayes classifier, missing data items are

A. treated as equal compares.

B. treated as unequal compares.
C. replaced with a default value.
D. ignored.
37. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.

A. agglomerative clustering
B. expectation maximization
C. conceptual clustering
D. K-Means clustering

38. This clustering algorithm initially assumes that each data instance represents a single cluster.

A. agglomerative clustering
B. conceptual clustering
C. K-Means clustering
D. expectation maximization

39. This unsupervised clustering algorithm terminates when mean values computed for the
current iteration of the algorithm are identical to the computed mean values for the previous
iteration.

A. agglomerative clustering
B. conceptual clustering
C. K-Means clustering
D. expectation maximization

40. Machine learning techniques differ from statistical techniques in that machine learning
methods

A. typically assume an underlying distribution for the data.

B. are better able to deal with missing and noisy data.
C. are not able to explain their behavior.
D. have trouble with large-sized datasets.
CS 189 Introduction to Machine Learning
Spring 2013 Final
• You have 3 hours for the exam.

• The exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet.

• Please use non-programmable calculators only.

• Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a
brief explanation. All short answer sections can be successfully answered in a few sentences AT MOST.

• For true/false questions, fill in the True/False bubble.

• For multiple-choice questions, fill in the bubbles for ALL CORRECT CHOICES (in some cases, there may be
more than one). For a question with p points and k choices, every false positive will incur a penalty of p/(k − 1)
points.

• For short answer questions, unnecessarily long explanations and extraneous data will be penalized.
Please try to be terse and precise and do the side calculations on the scratch papers provided.

• Please draw a bounding box around your answer in the Short Answers section. A missed answer without
a bounding box will not be regraded.

First name

Last name

SID

For staff use only:

Q1. True/False /23
Q2. Multiple Choice Questions /36
Q3. Short Answers /26
Total /85

Q1. [23 pts] True/False
(a) [1 pt] Solving a non linear separation problem with a hard margin Kernelized SVM (Gaussian RBF Kernel)
might lead to overfitting.
True False

(b) [1 pt] In SVMs, the sum of the Lagrange multipliers corresponding to the positive examples is equal to the sum
of the Lagrange multipliers corresponding to the negative examples.
True False

(d) [1 pt] $V(X) = E[X]^2 - E[X^2]$

True False

(e) [1 pt] In the discriminative approach to solving classification problems, we model the conditional probability
of the labels given the observations.
True False

(f) [1 pt] In a two class classification problem, a point on the Bayes optimal decision boundary $x^*$ always satisfies
$P(y = 1 \mid x^*) = P(y = 0 \mid x^*)$.
True False

(g) [1 pt] Any linear combination of the components of a multivariate Gaussian is a univariate Gaussian.
True False

(h) [1 pt] For any two random variables $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, $X + Y \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.
True False

(i) [1 pt] Stanford and Berkeley students are trying to solve the same logistic regression problem for a dataset.
The Stanford group claims that their initialization point will lead to a much better optimum than Berkeley’s
initialization point. Stanford is correct.
True False
(j) [1 pt] In logistic regression, we model the odds ratio $\frac{p}{1-p}$ as a linear function.
True False

(k) [1 pt] Random forests can be used to classify infinite dimensional data.
True False

(l) [1 pt] In boosting we start with a Gaussian weight distribution over the training samples.
True False

(m) [1 pt] In Adaboost, the error of each hypothesis is calculated by the ratio of misclassified examples to the total
number of examples.
True False

(n) [1 pt] When k = 1 and N → ∞, the kNN classification rate is bounded above by twice the Bayes error rate.
True False

(o) [1 pt] A single layer neural network with a sigmoid activation for binary classification with the cross entropy
loss is exactly equivalent to logistic regression.
True False

(p) [1 pt] The loss function for LeNet5 (the convolutional neural network by LeCun et al.) is convex.
True False

(q) [1 pt] Convolution is a linear operation, i.e., $(\alpha f_1 + \beta f_2) * g = \alpha f_1 * g + \beta f_2 * g$.
True False

(r) [1 pt] The k-means algorithm does coordinate descent on a non-convex objective function.
True False

(s) [1 pt] A 1-NN classifier has higher variance than a 3-NN classifier.
True False

(t) [1 pt] The single link agglomerative clustering algorithm groups two clusters on the basis of the maximum
distance between points in the two clusters.
True False

(u) [1 pt] The largest eigenvector of the covariance matrix is the direction of minimum variance in the data.
True False

(v) [1 pt] The eigenvectors of $AA^T$ and $A^T A$ are the same.

True False

(w) [1 pt] The non-zero eigenvalues of $AA^T$ and $A^T A$ are the same.
True False

Q2. [36 pts] Multiple Choice Questions
(a) [4 pts] In linear regression, we model $P(y \mid x) \sim N(w^T x + w_0, \sigma^2)$. The irreducible error in this model is ______.

$\sigma^2$
$E[(y - E[y \mid x]) \mid x]$
$E[(y - E[y \mid x])^2 \mid x]$
$E[y \mid x]$

(b) [4 pts] Let S1 and S2 be the set of support vectors and w1 and w2 be the learnt weight vectors for a linearly
separable problem using hard and soft margin linear SVMs respectively. Which of the following are correct?

$S_1 \subset S_2$
$S_1$ may not be a subset of $S_2$
$w_1 = w_2$
$w_1$ may not be equal to $w_2$

(c) [4 pts] Ordinary least-squares regression is equivalent to assuming that each data point is generated according
to a linear function of the input plus zero-mean, constant-variance Gaussian noise. In many systems, however,
the noise variance is itself a positive linear function of the input (which is assumed to be non-negative, i.e.,
x ≥ 0). Which of the following families of probability models correctly describes this situation in the univariate
case?
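$P(y \mid x) = \frac{1}{\sigma\sqrt{2\pi x}} \exp\left(-\frac{(y - (w_0 + w_1 x))^2}{2x\sigma^2}\right)$
$P(y \mid x) = \frac{1}{\sigma\sqrt{2\pi x}} \exp\left(-\frac{(y - (w_0 + (w_1 + \sigma^2)x))^2}{2\sigma^2}\right)$
$P(y \mid x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y - (w_0 + w_1 x))^2}{2\sigma^2}\right)$
$P(y \mid x) = \frac{1}{\sigma x\sqrt{2\pi}} \exp\left(-\frac{(y - (w_0 + w_1 x))^2}{2x^2\sigma^2}\right)$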

(d) [3 pts] The left singular vectors of a matrix A can be found in ______.

Eigenvectors of $AA^T$
Eigenvectors of $A^2$
Eigenvectors of $A^T A$
Eigenvalues of $AA^T$

(e) [3 pts] Averaging the output of multiple decision trees helps ______.

Increase bias
Increase variance
Decrease bias
Decrease variance

(f ) [4 pts] Let A be a symmetric matrix and S be the matrix containing its eigenvectors as column vectors, and D
a diagonal matrix containing the corresponding eigenvalues on the diagonal. Which of the following are true:

$AS = SD$
$SA = DS$
$AS = DS$
$AS = DS^T$

(g) [4 pts] Consider the following dataset: A = (0, 2), B = (0, 1) and C = (1, 0). The k-means algorithm is
initialized with centers at A and B. Upon convergence, the two centers will be at

A and C
C and the midpoint of AB
A and the midpoint of BC
A and B
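The run described in (g) is small enough to trace mechanically. Below is a minimal sketch of the standard k-means (Lloyd) iteration in plain Python; the function name `kmeans` and the tie-breaking rule (first closest center wins) are assumptions of this sketch, not part of the exam.

```python
def kmeans(points, centers, max_iters=100):
    """Lloyd's algorithm; stops when the centers no longer change."""
    for _ in range(max_iters):
        # Assignment step: each point joins its closest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

A, B, C = (0.0, 2.0), (0.0, 1.0), (1.0, 0.0)
final_centers = kmeans([A, B, C], [A, B])
print(final_centers)
```

Printing `final_centers` lets the converged centers be compared against the four choices.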

(h) [3 pts] Which of the following loss functions are convex?

Misclassification loss
Hinge loss
Logistic loss
Exponential loss ($e^{-y f(x)}$)

(i) [3 pts] Consider T1 , a decision stump (tree of depth 2) and T2 , a decision tree that is grown till a maximum
depth of 4. Which of the following is/are correct?

$\text{Bias}(T_1) < \text{Bias}(T_2)$
$\text{Variance}(T_1) < \text{Variance}(T_2)$
$\text{Bias}(T_1) > \text{Bias}(T_2)$
$\text{Variance}(T_1) > \text{Variance}(T_2)$

(j) [4 pts] Consider the problem of building decision trees with k-ary splits (split one node into k nodes) and
you are deciding k for each node by calculating the entropy impurity for different values of k and optimizing
simultaneously over the splitting threshold(s) and k. Which of the following is/are true?

The algorithm will always choose k = 2
There will be k − 1 thresholds for a k-ary split
The algorithm will prefer high values of k
This model is strictly more powerful than a binary decision tree.

Q3. [26 pts] Short Answers
(a) [5 pts] Given that $(x_1, x_2)$ are jointly normally distributed with $\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}$ and $\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{bmatrix}$ ($\sigma_{21} = \sigma_{12}$), give
an expression for the mean of the conditional distribution $p(x_1 \mid x_2 = a)$.

This can be solved by writing $p(x_1 \mid x_2 = a) = \frac{p(x_1, x_2 = a)}{p(x_2 = a)}$. $x_2$, being a component of a multivariate Gaussian, is
a univariate Gaussian with $x_2 \sim N(\mu_2, \sigma_2^2)$. Write out the Gaussian densities and simplify (complete squares)
to see the following:

$$x_1 \mid x_2 = a \sim N(\bar\mu, \bar\sigma^2), \qquad \bar\mu = \mu_1 + \frac{\sigma_{12}}{\sigma_2^2}(a - \mu_2)$$

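(b) [4 pts] The logistic function is given by $\sigma(x) = \frac{1}{1 + e^{-x}}$. Show that $\sigma'(x) = \sigma(x)(1 - \sigma(x))$.

$$\sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = \frac{1}{1 + e^{-x}} \left(1 - \frac{1}{1 + e^{-x}}\right) = \sigma(x)(1 - \sigma(x))$$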
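The identity in (b) can also be sanity-checked numerically. The sketch below compares the closed form against a central finite difference; the step size `h` and the test points are arbitrary choices made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Closed form from the derivation: sigma(x) * (1 - sigma(x)).
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x, h=1e-5):
    # Numerical derivative, accurate to O(h^2).
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    gap = abs(sigmoid_grad(x) - central_difference(sigmoid, x))
    print(x, gap)
```

The printed gaps should be tiny compared with the derivative itself, which peaks at 0.25 at x = 0.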

(c) Let X have a uniform distribution

$$p(x; \theta) = \begin{cases} \frac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$

Suppose that n samples $x_1, \dots, x_n$ are drawn independently according to $p(x; \theta)$.
(i) [5 pts] The maximum likelihood estimate of $\theta$ is $x_{(n)} = \max(x_1, x_2, \dots, x_n)$. Show that this estimate of $\theta$
is biased.

Biased estimator: $\hat\theta$ (the sample estimate) is a biased estimator of $\theta$ (the population distribution parameter)
if $E[\hat\theta] \ne \theta$.

Here $\hat\theta = x_{(n)}$, and $E[x_{(n)}] = \frac{n}{n+1}\theta \ne \theta$. The steps for finding $E[x_{(n)}]$ are given in the solutions of Homework
2, problem 5(c).

(ii) [2 pts] Give an expression for an unbiased estimator of θ.

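$$\hat\theta_{\text{unbiased}} = \frac{n+1}{n}\, x_{(n)}$$

$$E[\hat\theta_{\text{unbiased}}] = E\left[\frac{n+1}{n}\, x_{(n)}\right] = \frac{n+1}{n}\, E[x_{(n)}] = \frac{n+1}{n} \times \frac{n}{n+1}\,\theta = \theta$$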

(d) [5 pts] Consider the problem of fitting the following function to a dataset of 100 points $\{(x_i, y_i)\}$, $i = 1, \dots, 100$:

$$y = \alpha \cos(x) + \beta \sin(x) + \gamma$$

This problem can be solved using the least squares method with a solution of the form:

$$\begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} = (X^T X)^{-1} X^T Y$$

What are X and Y?

$$X = \begin{bmatrix} \cos(x_1) & \sin(x_1) & 1 \\ \cos(x_2) & \sin(x_2) & 1 \\ \vdots & \vdots & \vdots \\ \cos(x_{100}) & \sin(x_{100}) & 1 \end{bmatrix}, \qquad Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{100} \end{bmatrix}$$
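The design matrix in (d) can be tried out on synthetic data. The sketch below builds X row by row and solves the normal equations with a small hand-written Gaussian elimination so that it needs only the standard library; the helper names and the generating coefficients (2, −1, 0.5) are assumptions for illustration.

```python
import math

def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def fit_trig(xs, ys):
    """Least-squares fit of y = alpha*cos(x) + beta*sin(x) + gamma."""
    X = [[math.cos(x), math.sin(x), 1.0] for x in xs]  # one row per data point
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)  # solves (X^T X) w = X^T Y

# Synthetic, noise-free data generated from known coefficients.
xs = [0.1 * i for i in range(100)]
ys = [2.0 * math.cos(x) - 1.0 * math.sin(x) + 0.5 for x in xs]
alpha, beta, gamma = fit_trig(xs, ys)
print(alpha, beta, gamma)
```

Because the data are noise-free, the fit should recover the generating coefficients up to rounding error.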

(e) [5 pts] Consider the problem of binary classification using the Naive Bayes classifier. You are given two-dimensional
features $(X_1, X_2)$ and the categorical class conditional distributions in the tables below. The entries in
the tables correspond to $P(X_1 = x_1 \mid C_i)$ and $P(X_2 = x_2 \mid C_i)$ respectively. The two classes are equally likely.

X1      C1     C2
−1      0.2    0.3
 0      0.4    0.6
 1      0.4    0.1

X2      C1     C2
−1      0.4    0.1
 0      0.5    0.3
 1      0.1    0.6

Given a data point (−1, 1), calculate the following posterior probabilities:

$P(C_1 \mid X_1 = -1, X_2 = 1)$: using Bayes’ Rule and the conditional independence assumption of Naive Bayes,

$$\frac{P(X_1 = -1, X_2 = 1 \mid C_1) P(C_1)}{P(X_1 = -1, X_2 = 1)} = \frac{P(X_1 = -1 \mid C_1) P(X_2 = 1 \mid C_1) P(C_1)}{P(X_1 = -1 \mid C_1) P(X_2 = 1 \mid C_1) P(C_1) + P(X_1 = -1 \mid C_2) P(X_2 = 1 \mid C_2) P(C_2)} = \frac{0.2 \times 0.1 \times 0.5}{0.2 \times 0.1 \times 0.5 + 0.3 \times 0.6 \times 0.5} = 0.1$$

$P(C_2 \mid X_1 = -1, X_2 = 1) = 1 - P(C_1 \mid X_1 = -1, X_2 = 1) = 0.9$
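The posterior in (e) can be reproduced in a few lines. The dictionaries below transcribe the two tables; the variable and function names are illustrative only.

```python
# Class-conditional tables P(X1 = v | C) and P(X2 = v | C) from the problem.
p_x1 = {"C1": {-1: 0.2, 0: 0.4, 1: 0.4}, "C2": {-1: 0.3, 0: 0.6, 1: 0.1}}
p_x2 = {"C1": {-1: 0.4, 0: 0.5, 1: 0.1}, "C2": {-1: 0.1, 0: 0.3, 1: 0.6}}
prior = {"C1": 0.5, "C2": 0.5}  # the two classes are equally likely

def posterior(x1, x2):
    """Naive Bayes posterior over classes for the point (x1, x2)."""
    # Conditional independence: P(x1, x2 | C) = P(x1 | C) * P(x2 | C).
    joint = {c: p_x1[c][x1] * p_x2[c][x2] * prior[c] for c in prior}
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}

post = posterior(-1, 1)
print(post)
```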

CS 189 Introduction to Machine Learning
Spring 2016 Final
• Please do not open the exam before you are instructed to do so.

• The exam is closed book, closed notes except your two-page cheat sheet.

• Electronic devices are forbidden on your person, including cell phones, iPods, headphones, and laptops.
Turn your cell phone off and leave all electronics at the front of the room, or risk getting a zero on
the exam.

• You have 3 hours.

• Please write your initials at the top right of each page (e.g., write “JS” if you are Jonathan Shewchuk). Finish
this by the end of your 3 hours.

• Mark your answers on front of each page, not the back. We will not scan the backs of each page, but you may
use them as scratch paper. Do not attach any extra sheets.

• The total number of points is 150. There are 30 multiple choice questions worth 3 points each, and 6 written
questions worth a total of 60 points.

• For multiple-choice questions, fill in the boxes for ALL correct choices: there may be more than one correct
choice, but there is always at least one correct choice. NO partial credit on multiple-choice questions: the
set of all correct answers must be checked.

First name

Last name

SID

First and last name of student to your left

First and last name of student to your right

Q1. [90 pts] Multiple Choice
Check the boxes for ALL CORRECT CHOICES. Every question should have at least one box checked. NO PARTIAL
CREDIT: the set of all correct answers (only) must be checked.

(1) [3 pts] What strategies can help reduce overfitting in decision trees?

Pruning
Enforce a minimum number of samples in leaf nodes
Make sure each leaf node is one pure class
Enforce a maximum depth for the tree

(2) [3 pts] Which of the following are true of convolutional neural networks (CNNs) for image analysis?

Filters in earlier layers tend to include edge detectors
They have more parameters than fully-connected networks with the same number of layers and the same numbers of neurons in each layer
Pooling layers reduce the spatial resolution of the image
A CNN can be trained for unsupervised learning tasks, whereas an ordinary neural net cannot

(3) [3 pts] Neural networks

optimize a convex cost function
always output values between 0 and 1
can be used for regression as well as classification
can be used in an ensemble

(4) [3 pts] Which of the following are true about generative models?

They model the joint distribution P(class = C AND sample = x)
The perceptron is a generative model
They can be used for classification
Linear discriminant analysis is a generative model

(5) [3 pts] Lasso can be interpreted as least-squares linear regression where

weights are regularized with the $\ell_1$ norm
the weights have a Gaussian prior
weights are regularized with the $\ell_2$ norm
the solution algorithm is simpler

(6) [3 pts] Which of the following methods can achieve zero training error on any linearly separable dataset?

Decision tree
15-nearest neighbors
Hard-margin SVM
Perceptron

(7) [3 pts] The kernel trick

can be applied to every classification algorithm
is commonly used for dimensionality reduction
changes ridge regression so we solve a d × d linear system instead of an n × n system, given n sample points with d features
exploits the fact that in many learning algorithms, the weights can be written as a linear combination of input points

(8) [3 pts] Suppose we train a hard-margin linear SVM on n > 100 data points in $\mathbb{R}^2$, yielding a hyperplane with
exactly 2 support vectors. If we add one more data point and retrain the classifier, what is the maximum
possible number of support vectors for the new hyperplane (assuming the n + 1 points are linearly separable)?

2
n
3
n + 1

(9) [3 pts] In latent semantic indexing, we compute a low-rank approximation to a term-document matrix. Which
of the following motivate the low-rank reconstruction?

Finding documents that are related to each other, e.g. of a similar genre
The low-rank approximation provides a lossless method for compressing an input matrix
In many applications, some principal components encode noise rather than meaningful structure
Low-rank approximation enables discovery of nonlinear relations

(10) [3 pts] Which of the following are true about subset selection?

Subset selection can substantially decrease the bias of support vector machines
Subset selection can reduce overfitting
Ridge regression frequently eliminates some of the features
Finding the true best subset takes exponential time

(11) [3 pts] In neural networks, nonlinear activation functions such as sigmoid, tanh, and ReLU

speed up the gradient calculation in backpropagation, as compared to linear units
help to learn nonlinear decision boundaries
are applied only to the output units
always output values between 0 and 1

(12) [3 pts] Suppose we are given data comprising points of several different classes. Each class has a different
probability distribution from which the sample points are drawn. We do not have the class labels. We use
k-means clustering to try to guess the classes. Which of the following circumstances would undermine its
effectiveness?

Some of the classes are not normally distributed
The variance of each distribution is small in all directions
Each class has the same mean
You choose k = n, the number of sample points

(13) [3 pts] Which of the following are true of spectral graph partitioning methods?

They find the cut with minimum weight
They minimize a quadratic function subject to one constraint: the partition must be balanced
They use one or more eigenvectors of the Laplacian matrix
The Normalized Cut was invented at Stanford

(14) [3 pts] Which of the following can help to reduce overfitting in an SVM classifier?

Use of slack variables
High-degree polynomial features
Normalizing the data
Setting a very low learning rate

(15) [3 pts] Which value of k in the k-nearest neighbors algorithm generates the solid decision boundary depicted
here? There are only 2 classes. (Ignore the dashed line, which is the Bayes decision boundary.)

k = 1
k = 2
k = 10
k = 100

(16) [3 pts] Consider one layer of weights (edges) in a convolutional neural network (CNN) for grayscale images,
connecting one layer of units to the next layer of units. Which type of layer has the fewest parameters to be
learned during training? (Select one.)

A convolutional layer with 10 3 × 3 filters
A convolutional layer with 8 5 × 5 filters
A max-pooling layer that reduces a 10 × 10 image to 5 × 5
A fully-connected layer from 20 hidden units to 4 output units

(17) [3 pts] In the kernelized perceptron algorithm with learning rate $\epsilon = 1$, the coefficient $a_i$ corresponding to a
training example $x_i$ represents the weight for $K(x_i, x)$. Suppose we have a two-class classification problem with
$y_i \in \{1, -1\}$. If $y_i = 1$, which of the following can be true for $a_i$?

$a_i = -1$
$a_i = 1$
$a_i = 0$
$a_i = 5$

(18) [3 pts] Suppose you want to split a graph G into two subgraphs. Let L be G’s Laplacian matrix. Which of the
following could help you find a good split?

The eigenvector corresponding to the second-largest eigenvalue of L
The left singular vector corresponding to the second-largest singular value of L
The eigenvector corresponding to the second-smallest eigenvalue of L
The left singular vector corresponding to the second-smallest singular value of L

(19) [3 pts] Which of the following are properties that a kernel matrix always has?

Invertible
All the entries are positive
At least one negative eigenvalue
Symmetric

(20) [3 pts] How does the bias-variance decomposition of a ridge regression estimator compare with that of ordinary
least squares regression? (Select one.)

Ridge has larger bias, larger variance
Ridge has smaller bias, larger variance
Ridge has larger bias, smaller variance
Ridge has smaller bias, smaller variance

(21) [3 pts] Both PCA and Lasso can be used for feature selection. Which of the following statements are true?

Lasso selects a subset (not necessarily a strict subset) of the original features
PCA and Lasso both allow you to specify how many features are chosen
PCA produces features that are linear combinations of the original features
PCA and Lasso are the same if you use the kernel trick

(22) [3 pts] Which of the following are true about forward subset selection?

$O(2^d)$ models must be trained during the algorithm, where d is the number of features
It finds the subset of features that give the lowest test error
It greedily adds the feature that most improves cross-validation accuracy
Forward selection is faster than backward selection if few features are relevant to prediction

(23) [3 pts] You’ve just finished training a random forest for spam classification, and it is getting abnormally bad
performance on your validation set, but good performance on your training set. Your implementation has no
bugs. What could be causing the problem?

Your decision trees are too deep
You have too few trees in your ensemble
You are randomly sampling too many features when you choose a split
Your bagging implementation is randomly sampling sample points without replacement
   
(24) [3 pts] Consider training a decision tree given a design matrix $X = \begin{bmatrix} 6 & 3 \\ 2 & 7 \\ 9 & 6 \\ 4 & 2 \end{bmatrix}$ and labels $y = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}$. Let $f_1$ denote
feature 1, corresponding to the first column of X, and let $f_2$ denote feature 2, corresponding to the second
column. Which of the following splits at the root node gives the highest information gain? (Select one.)

$f_1 > 2$
$f_2 > 3$
$f_1 > 4$
$f_2 > 6$
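The four candidate splits in (24) can be scored mechanically. The following sketch computes entropy-based information gain in plain Python; the helper names are illustrative, and features are indexed from 0 in the code (so index 0 is feature 1).

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        h -= p * math.log2(p)
    return h

def information_gain(rows, labels, feature, threshold):
    """Parent entropy minus the weighted entropy of the split feature > threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - weighted

X = [(6, 3), (2, 7), (9, 6), (4, 2)]
y = [1, 0, 1, 0]
gains = {
    "f1 > 2": information_gain(X, y, 0, 2),
    "f2 > 3": information_gain(X, y, 1, 3),
    "f1 > 4": information_gain(X, y, 0, 4),
    "f2 > 6": information_gain(X, y, 1, 6),
}
for name, gain in gains.items():
    print(name, round(gain, 3))
```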

(25) [3 pts] In terms of the bias-variance decomposition, a 1-nearest neighbor classifier has ______ than a
3-nearest neighbor classifier.

higher variance
higher bias
lower variance
lower bias

(26) [3 pts] Which of the following are true about bagging?

In bagging, we choose random subsamples of the input points with replacement
The main purpose of bagging is to decrease the bias of learning algorithms.
Bagging is ineffective with logistic regression, because all of the learners learn exactly the same decision boundary
If we use decision trees that have one sample point per leaf, bagging never gives lower training error than one ordinary decision tree

(27) [3 pts] An advantage of searching for an approximate nearest neighbor, rather than the exact nearest neighbor,
is that

it sometimes makes exhaustive search much faster
the nearest neighbor classifier is sometimes much more accurate
it sometimes makes searching in a k-d tree much faster
you find all the points within a distance of $(1 + \epsilon)r$ from the query point, where r is the distance from the query point to its nearest neighbor

(28) [3 pts] In the derivation of the spectral graph partitioning algorithm, we relax a combinatorial optimization
problem to a continuous optimization problem. This relaxation has the following effects.

The combinatorial problem requires an exact bisection of the graph, but the continuous algorithm can produce (after rounding) partitions that aren’t perfectly balanced
The combinatorial problem requires finding eigenvectors, whereas the continuous problem requires only matrix multiplication
The combinatorial problem cannot be modified to accommodate vertices that have different masses, whereas the continuous problem can
The combinatorial problem is NP-hard, but the continuous problem can be solved in polynomial time

(29) [3 pts] The firing rate of a neuron

determines how strongly the dendrites of the neuron stimulate axons of neighboring neurons
is more analogous to the output of a unit in a neural net than the output voltage of the neuron
only changes very slowly, taking a period of several seconds to make large adjustments
can sometimes exceed 30,000 action potentials per second

(30) [3 pts] In algorithms that use the kernel trick, the Gaussian kernel

gives a regression function or predictor function that is a linear combination of Gaussians centered at the sample points
is equivalent to lifting the d-dimensional sample points to points in a space whose dimension is exponential in d
is less prone to oscillating than polynomials, assuming the variance of the Gaussians is large
has good properties in theory but is rarely used in practice

(31) 3 bonus points! The following Berkeley professors were cited in this semester’s lectures (possibly self-cited)
for specific research contributions they made to machine learning.

David Culler
Michael Jordan
Jitendra Malik
Leo Breiman
Anca Dragan
Jonathan Shewchuk

Q2. [8 pts] Feature Selection
A newly employed former CS 189/289A student trains the latest Deep Learning classifier and obtains state-of-the-art
accuracy. However, the classifier uses too many features! The boss is overwhelmed and asks for a model with fewer
features.

Let’s try to identify the most important features. Start with a simple dataset in $\mathbb{R}^2$.

(1) [4 pts] Describe the training error of a Bayes optimal classifier that can see only the first feature of the data.
Describe the training error of a Bayes optimal classifier that can see only the second feature.

The first feature yields a training error of 50% (like random guessing). The second feature offers a training error of
zero.

(2) [4 pts] Based on this toy example, the student decides to fit a classifier on each feature individually, then
rank the features by their classifier’s accuracy, take the best k features, and train a new classifier on those k
features. We call this approach variable ranking. Unfortunately, the classifier trained on the best k features
obtains horrible accuracy, unless k is very close to d, the original number of features!
Construct a toy dataset in $\mathbb{R}^2$ for which variable ranking fails. In other words, a dataset where a variable is
useless by itself, but potentially useful alongside others. Use + for data points in Class 1, and O for data points
in Class 2.

An XOR Dataset is unpredictable with either feature. (This extends to n-dimensions, with the n-bit parity string.)
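The XOR answer can be checked by brute force. The sketch below uses the four corner points with XOR labels (an illustrative encoding with 0/1 coordinates) and tries every possible rule that looks at a single feature.

```python
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [a ^ b for a, b in points]  # class 1 when exactly one coordinate is 1

def best_single_feature_accuracy(feature):
    """Best training accuracy of any rule that looks at one feature only."""
    best = 0.0
    # rule[v] is the label predicted when the feature takes the value v.
    for rule in ((0, 0), (0, 1), (1, 0), (1, 1)):
        correct = sum(rule[p[feature]] == y for p, y in zip(points, labels))
        best = max(best, correct / len(points))
    return best

def both_features_accuracy():
    """Accuracy of the XOR rule, which uses both features together."""
    return sum((a ^ b) == y for (a, b), y in zip(points, labels)) / len(points)

print(best_single_feature_accuracy(0), best_single_feature_accuracy(1))
print(both_features_accuracy())
```

Each feature alone does no better than chance, so variable ranking would discard both, even though together they determine the class exactly.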

Q3. [10 pts] Gradient Descent for k-means Clustering
Recall the loss function for k-means clustering with k clusters, sample points $x_1, \dots, x_n$, and centers $\mu_1, \dots, \mu_k$:

$$L = \sum_{j=1}^{k} \sum_{x_i \in S_j} \|x_i - \mu_j\|^2,$$

where $S_j$ refers to the set of data points that are closer to $\mu_j$ than to any other cluster mean.

(1) [4 pts] Instead of updating $\mu_j$ by computing the mean, let’s minimize L with batch gradient descent while
holding the sets $S_j$ fixed. Derive the update formula for $\mu_1$ with learning rate (step size) $\epsilon$.

$$\frac{\partial L}{\partial \mu_1} = \frac{\partial}{\partial \mu_1} \sum_{x_i \in S_1} (x_i - \mu_1)^\top (x_i - \mu_1) = \sum_{x_i \in S_1} 2(\mu_1 - x_i).$$

Therefore the update formula is

$$\mu_1 \leftarrow \mu_1 + \epsilon \sum_{x_i \in S_1} (x_i - \mu_1).$$

(Note: writing $2\epsilon$ instead of $\epsilon$ is fine.)

(2) [2 pts] Derive the update formula for $\mu_1$ with stochastic gradient descent on a single sample point $x_i$. Use
learning rate $\epsilon$.
$\mu_1 \leftarrow \mu_1 + \epsilon (x_i - \mu_1)$ if $x_i \in S_1$, otherwise no change.

(3) [4 pts] In this part, we will connect the batch gradient descent update equation with the standard k-means
algorithm. Recall that in the update step of the standard algorithm, we assign each cluster center to be the
mean (centroid) of the data points closest to that center. It turns out that a particular choice of the learning
rate $\epsilon$ (which may be different for each cluster) makes the two algorithms (batch gradient descent and the
standard k-means algorithm) have identical update steps. Let’s focus on the update for the first cluster, with
center $\mu_1$. Calculate the value of $\epsilon$ so that both algorithms perform the same update for $\mu_1$. (If you do it right,
the answer should be very simple.)
In the standard algorithm, we assign $\mu_1 \leftarrow \sum_{x_i \in S_1} \frac{1}{|S_1|} x_i$.

Comparing to the answer in (1), we set $\sum_{x_i \in S_1} \frac{1}{|S_1|} x_i = \mu_1 + \epsilon \sum_{x_i \in S_1} (x_i - \mu_1)$ and solve for $\epsilon$.

$$\sum_{x_i \in S_1} \frac{1}{|S_1|} x_i - \sum_{x_i \in S_1} \frac{1}{|S_1|} \mu_1 = \epsilon \sum_{x_i \in S_1} (x_i - \mu_1)$$

$$\sum_{x_i \in S_1} \frac{1}{|S_1|} (x_i - \mu_1) = \epsilon \sum_{x_i \in S_1} (x_i - \mu_1).$$

Thus $\epsilon = \frac{1}{|S_1|}$.

(Note: answers that differ by a constant factor are fine if consistent with answer for (1).)
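The equivalence in (3) is easy to confirm numerically: one batch gradient step with step size 1/|S1| lands exactly on the centroid, whatever the starting center. A small sketch with a made-up cluster and starting point:

```python
def batch_gradient_step(mu, cluster, step):
    """One update mu <- mu + step * sum_i (x_i - mu), with the cluster held fixed."""
    dim = len(mu)
    total = [sum(x[d] - mu[d] for x in cluster) for d in range(dim)]
    return tuple(mu[d] + step * total[d] for d in range(dim))

def centroid(cluster):
    """Standard k-means update: the mean of the cluster's points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

S1 = [(1.0, 2.0), (3.0, 0.0), (5.0, 4.0)]  # hypothetical cluster members
mu1 = (10.0, -7.0)                          # arbitrary starting center
updated = batch_gradient_step(mu1, S1, step=1.0 / len(S1))
print(updated, centroid(S1))
```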

Q4. [10 pts] Kernels
(1) [2 pts] What is the primary motivation for using the kernel trick in machine learning algorithms?
If we want to map sample points to a very high-dimensional feature space, the kernel trick can save us from
having to compute those features explicitly, thereby saving a lot of time.
(Alternative solution: the kernel trick enables the use of infinite-dimensional feature spaces.)

(2) [4 pts] Prove that for every design matrix $X \in \mathbb{R}^{n \times d}$, the corresponding kernel matrix is positive semidefinite.
For every vector $z \in \mathbb{R}^n$,
$$z^\top K z = z^\top X X^\top z = |X^\top z|^2,$$
which is clearly nonnegative.
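The identity used in this proof can be spot-checked on random data. The sketch below assumes the linear kernel K = XXᵀ from the proof and uses an arbitrary 5 × 2 design matrix; it illustrates the identity and is not a substitute for the argument above.

```python
import random

def kernel_matrix(X):
    """Linear kernel matrix K = X X^T for a design matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def quadratic_form(K, z):
    """z^T K z."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))

def squared_norm_Xt_z(X, z):
    """|X^T z|^2."""
    d = len(X[0])
    v = [sum(z[i] * X[i][k] for i in range(len(X))) for k in range(d)]
    return sum(c * c for c in v)

rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(5)]  # n = 5, d = 2
K = kernel_matrix(X)
max_gap = 0.0
min_form = float("inf")
for _ in range(100):
    z = [rng.gauss(0, 1) for _ in range(5)]
    q = quadratic_form(K, z)
    max_gap = max(max_gap, abs(q - squared_norm_Xt_z(X, z)))
    min_form = min(min_form, q)
print(max_gap, min_form)
```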

(3) [2 pts] Suppose that a regression algorithm contains the following line of code.

$$w \leftarrow w + X^\top M X X^\top u$$

Here, $X \in \mathbb{R}^{n \times d}$ is the design matrix, $w \in \mathbb{R}^d$ is the weight vector, $M \in \mathbb{R}^{n \times n}$ is a matrix unrelated to X,
and $u \in \mathbb{R}^n$ is a vector unrelated to X. We want to derive a dual version of the algorithm in which we express
the weights w as a linear combination of samples $X_i$ (rows of X) and a dual weight vector a contains the
coefficients of that linear combination. Rewrite the line of code in its dual form so that it updates a correctly
(and so that w does not appear).

$$a \leftarrow a + M X X^\top u$$

(4) [2 pts] Can this line of code for updating a be kernelized? If so, show how. If not, explain why.
Yes:
$$a \leftarrow a + M K u$$

Q5. [12 pts] Let’s PCA
You are given a design matrix $X = \begin{bmatrix} 6 & -4 \\ -3 & 5 \\ -2 & 6 \\ 7 & -3 \end{bmatrix}$. Let’s use PCA to reduce the dimension from 2 to 1.

(1) [6 pts] Compute the covariance matrix for the sample points. (Warning: Observe that X is not centered.)
Then compute the unit eigenvectors, and the corresponding eigenvalues, of the covariance matrix. Hint: If
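you graph the points, you can probably guess the eigenvectors (then verify that they really are eigenvectors).

After centering $X$ (the sample mean is $(2, 1)$), the covariance matrix is $X^\top X = \begin{bmatrix} 82 & -80 \\ -80 & 82 \end{bmatrix}$.
Its unit eigenvectors are $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$ with eigenvalue 2 and $\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}$ with eigenvalue 162. (Note: either eigenvector
can be replaced with its negation.)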

(2) [3 pts] Suppose we use PCA to project the sample points onto a one-dimensional space. What one-dimensional
subspace are we projecting onto? For each of the four sample points in X (not the centered version of X!),
write the coordinate (in principal coordinate space, not in $\mathbb{R}^2$) that the point is projected to.

We are projecting onto the subspace spanned by $\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}$. (Equivalently, onto the space spanned by $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$. Equivalently, onto the line $x + y = 0$.) The projections are $(6, -4) \to \frac{10}{\sqrt{2}}$, $(-3, 5) \to -\frac{8}{\sqrt{2}}$, $(-2, 6) \to -\frac{8}{\sqrt{2}}$, $(7, -3) \to \frac{10}{\sqrt{2}}$.
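
The numbers in parts (1) and (2) can be reproduced with a few lines of NumPy. This is a sketch only; the sign convention for the eigenvector (first component positive) is an assumption made to match the solution text.

```python
import numpy as np

X = np.array([[6.0, -4.0], [-3.0, 5.0], [-2.0, 6.0], [7.0, -3.0]])

# Center the data, then form the (unnormalized) covariance matrix.
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc

# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, eigvecs = np.linalg.eigh(C)
v = eigvecs[:, -1]            # principal direction (largest eigenvalue)
if v[0] < 0:                  # fix the sign so that v = [1, -1] / sqrt(2)
    v = -v

# Principal coordinates of the original (uncentered) sample points.
coords = X @ v
print(C)
print(eigvals)
print(coords * np.sqrt(2))
```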

(3) [3 pts] Given a design matrix $X$ that is taller than it is wide, prove that every right singular vector of $X$ with
singular value $\sigma$ is an eigenvector of the covariance matrix with eigenvalue $\sigma^2$.

If $v$ is a right singular vector of $X$, then there is a singular value decomposition $X = U D V^\top$ such that $v$ is a column
of $V$. Here each of $U$ and $V$ has orthonormal columns, $V$ is square, and $D$ is square and diagonal. The covariance
matrix is $X^\top X = V D U^\top U D V^\top = V D^2 V^\top$. This is an eigendecomposition of $X^\top X$, so each singular vector in $V$
with singular value $\sigma$ is an eigenvector of $X^\top X$ with eigenvalue $\sigma^2$.
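
The statement can also be sanity-checked numerically. The sketch below uses a random tall matrix (an arbitrary assumption) and verifies that each right singular vector $v$ satisfies $X^\top X v = \sigma^2 v$.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))          # taller than wide

# Thin SVD: rows of Vt are the right singular vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
G = X.T @ X

# Residual of the eigenvector equation for every singular pair.
residuals = [np.linalg.norm(G @ Vt[k] - s[k] ** 2 * Vt[k]) for k in range(len(s))]
print(max(residuals))
```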

Q6. [10 pts] Trees
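[Figure: two depictions of the same k-d tree on 17 numbered sample points. Left: the plane partitioned into rectangular boxes, with the query point marked by a bold X. Right: the corresponding tree.]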

(1) [5 pts] Above, we have two depictions of the same k-d tree, which we have built to solve nearest neighbor
queries. Each node of the tree at right represents a rectangular box at left, and also stores one of the sample
points that lie inside that box. (The root node represents the whole plane R2 .) If a treenode stores sample point
i, then the line passing through point i (in the diagram at left) determines which boxes the child treenodes
represent.
Simulate running an exact 1-nearest neighbor query, where the bold X is the query point. Recall that the query
algorithm visits the treenodes in a smart order, and keeps track of the nearest point it has seen so far.
• Write down the numbers of all the sample points that serve as the “nearest point seen so far” sometime
while the query algorithm is running, in the order they are encountered.
• Circle all the subtrees in the k-d tree at upper right that are never visited during this query. (This is why
k-d tree search is usually faster than exhaustive search.)

Nearest point seen so far: first 5, then 12, then 10.

The unvisited subtrees are rooted at 2, 13, 7, and 17.

(2) [5 pts] We are building a decision tree for a 2-class classification problem. We have n training points, each having
d real-valued features. At each node of the tree, we try every possible univariate split (i.e. for each feature, we
try every possible splitting value for that feature) and choose the split that maximizes the information gain.
Explain why it is possible to build the tree in O(ndh) time, where h is the depth of the tree’s deepest node.
Your explanation should include an analysis of the time to choose one node’s split. Assume that we can radix
sort real numbers in linear time.

Consider choosing the split at a node whose box contains $n'$ sample points. For each of the $d$ features, we can radix sort
the sample points by that feature in $O(n')$ time, or $O(n'd)$ time for all $d$ features. Then, for each feature, we can compute the entropy for the first split (separating the first sample
in the sorted list from the others) in $O(n')$ time, then walk through the list and update the entropy for each
successive split in $O(1)$ time, summing to a total of $O(n')$ time for each of the $d$ features. So it takes $O(n'd)$ time
overall to choose a split.

Each sample point participates in at most h treenodes, so each sample point contributes at most dh to the running
time, for a total running time of at most O(ndh).
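
The sweep described above can be sketched as follows. This is an illustrative implementation, not the exam's: the function names are hypothetical, labels are assumed to be 0/1, and `np.argsort` (a comparison sort) stands in for the linear-time radix sort the question assumes.

```python
import math
import numpy as np

def entropy(pos, total):
    """Binary entropy (in bits) of a set with `pos` positives out of `total`."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def best_split_one_feature(values, labels):
    """Return (information gain, threshold) of the best split on one feature."""
    order = np.argsort(values, kind="stable")   # stand-in for radix sort
    v, y = np.asarray(values)[order], np.asarray(labels)[order]
    n, total_pos = len(y), int(y.sum())
    parent = entropy(total_pos, n)
    best_gain, best_thr = 0.0, None
    left_pos = 0
    for i in range(n - 1):
        left_pos += int(y[i])        # O(1) update of the left-side class count
        if v[i] == v[i + 1]:
            continue                 # cannot split between equal values
        left_n, right_n = i + 1, n - (i + 1)
        child = (left_n * entropy(left_pos, left_n)
                 + right_n * entropy(total_pos - left_pos, right_n)) / n
        gain = parent - child
        if gain > best_gain:
            best_gain, best_thr = gain, (v[i] + v[i + 1]) / 2
    return best_gain, best_thr

def best_split(X, y):
    """Try all d features; returns (gain, threshold, feature index)."""
    results = [best_split_one_feature(X[:, j], y) + (j,) for j in range(X.shape[1])]
    return max(results, key=lambda r: r[0])

X_demo = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]])
y_demo = np.array([0, 0, 1, 1])
print(best_split(X_demo, y_demo))
```

Each candidate split costs constant time because only the running class count changes from one position to the next; the per-node cost is then dominated by the $d$ sorts and $d$ linear sweeps.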

Q7. [10 pts] Self-Driving Cars and Backpropagation
You want to train a neural network to drive a car. Your training data consists of grayscale 64 × 64 pixel images. The
training labels include the human driver’s steering wheel angle in degrees and the human driver’s speed in miles per
hour. Your neural network consists of an input layer with 64 × 64 = 4,096 units, a hidden layer with 2,048 units,
and an output layer with 2 units (one for steering angle, one for speed). You use the ReLU activation function for
the hidden units and no activation function for the outputs (or inputs).

(1) [2 pts] Calculate the number of parameters (weights) in this network. You can leave your answer as an
expression. Be sure to account for the bias terms.

4097 × 2048 + 2049 × 2

(2) [3 pts] You train your network with the cost function $J = \frac{1}{2}|y - z|^2$. Use the following notation.
• x is a training image (input) vector with a 1 component appended to the end, y is a training label (input)
vector, and z is the output vector. All vectors are column vectors.
• $r(\gamma) = \max\{0, \gamma\}$ is the ReLU activation function, $r'(\gamma)$ is its derivative (1 if $\gamma > 0$, 0 otherwise), and
r(v) is r(·) applied component-wise to a vector.
• g is the vector of hidden unit values before the ReLU activation functions are applied, and h = r(g) is
the vector of hidden unit values after they are applied (but we append a 1 component to the end of h).
• V is the weight matrix mapping the input layer to the hidden layer; g = V x.
• W is the weight matrix mapping the hidden layer to the output layer; z = W h.
Derive ∂J/∂Wij .

$$\frac{\partial J}{\partial W_{ij}} = (z - y)^\top \frac{\partial z}{\partial W_{ij}} = (z_i - y_i) h_j$$

(3) [1 pt] Write ∂J/∂W as an outer product of two vectors. ∂J/∂W is a matrix with the same dimensions as W ;
it’s just like a gradient, except that W and ∂J/∂W are matrices rather than vectors.

$$\frac{\partial J}{\partial W} = (z - y) h^\top$$

(4) [4 pts] Derive ∂J/∂Vij .

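$$\frac{\partial J}{\partial V_{ij}} = (z - y)^\top \frac{\partial z}{\partial V_{ij}}$$
$$= (z - y)^\top W \frac{\partial h}{\partial V_{ij}}$$
$$= (z - y)^\top W [0, \ldots, r'(g_i)\, x_j, \ldots, 0]^\top$$
$$= ((z - y)^\top W)_i \, r'(g_i)\, x_j.$$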
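
The two gradient formulas can be verified against finite differences on a tiny network. This is a sketch under assumed toy sizes (4 inputs, 3 hidden units, 2 outputs); the helper names are hypothetical, and the bias column of $W$ is dropped when propagating back to $V$ because the appended 1 in $h$ does not depend on $V$.

```python
import numpy as np

def forward(V, W, x):
    g = V @ x                                  # hidden pre-activations
    h = np.append(np.maximum(g, 0.0), 1.0)     # ReLU, then append the bias unit
    z = W @ h                                  # linear outputs
    return g, h, z

def cost(V, W, x, y):
    _, _, z = forward(V, W, x)
    return 0.5 * np.sum((y - z) ** 2)

def gradients(V, W, x, y):
    g, h, z = forward(V, W, x)
    delta = z - y
    dW = np.outer(delta, h)                    # dJ/dW = (z - y) h^T
    back = W[:, :-1].T @ delta                 # ((z - y)^T W)_i for the real hidden units
    dV = np.outer(back * (g > 0), x)           # times r'(g_i) x_j
    return dV, dW

def numeric_grad(f, A, step=1e-6):
    """Central finite differences of the scalar f() with respect to the array A."""
    G = np.zeros_like(A)
    for idx in np.ndindex(A.shape):
        old = A[idx]
        A[idx] = old + step
        up = f()
        A[idx] = old - step
        down = f()
        A[idx] = old
        G[idx] = (up - down) / (2 * step)
    return G

rng = np.random.default_rng(0)
x = np.append(rng.normal(size=4), 1.0)         # input with a 1 appended
y = rng.normal(size=2)
V = rng.normal(size=(3, 5))
W = rng.normal(size=(2, 4))

dV, dW = gradients(V, W, x, y)
dV_num = numeric_grad(lambda: cost(V, W, x, y), V)
dW_num = numeric_grad(lambda: cost(V, W, x, y), W)
print(np.allclose(dV, dV_num, atol=1e-5), np.allclose(dW, dW_num, atol=1e-5))

# Parameter count for the full-size network in part (1).
n_params = 4097 * 2048 + 2049 * 2
```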

10-601 Machine Learning, Midterm Exam

Instructors: Tom Mitchell, Ziv Bar-Joseph

Monday 22nd October, 2012

There are 5 questions, for a total of 100 points.

This exam has 16 pages, make sure you have all pages before you begin.
This exam is open book, open notes, but no computers or other electronic devices.

Good luck!

Name:

Andrew ID:

Question Points Score

Short Answers 20
Comparison of ML algorithms 20
Regression 20
Bayes Net 20
Overfitting and PAC Learning 20
Total: 100

10-601 Machine Learning Midterm Exam October 18, 2012

Question 1. Short Answers

True False Questions.
(a) [1 point] We can get multiple local optimum solutions if we solve a linear regression problem by
minimizing the sum of squared errors using gradient descent.
True False

Solution:
False

(b) [1 point] When a decision tree is grown to full depth, it is more likely to fit the noise in the data.
True False

Solution:
True

(d) [1 point] When the feature space is larger, overfitting is more likely.
True False

Solution:
True

(e) [1 point] We can use gradient descent to learn a Gaussian Mixture Model.
True False

Solution:
True

Short Questions.
(f) [3 points] Can you represent the following boolean function with a single logistic threshold unit
(i.e., a single unit from a neural network)? If yes, show the weights. If not, explain why not in 1-2
sentences.

A B f(A,B)
1 1 0
0 0 0
1 0 1
0 1 0


Solution:
Yes, you can represent this function with a single logistic threshold unit, since it is linearly
separable. Here is one example.

$$F(A, B) = \mathbf{1}\{A - B - 0.5 > 0\} \qquad (1)$$
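
The proposed weights can be checked against the truth table directly. In this sketch the function name and default arguments are illustrative; the weights (1, −1) and bias −0.5 are the ones from the solution.

```python
def threshold_unit(a, b, w_a=1.0, w_b=-1.0, bias=-0.5):
    """Single threshold unit: outputs 1 when w_a*a + w_b*b + bias > 0."""
    return 1 if w_a * a + w_b * b + bias > 0 else 0

# Truth table from the question: (A, B) -> f(A, B)
truth_table = {(1, 1): 0, (0, 0): 0, (1, 0): 1, (0, 1): 0}
outputs = {ab: threshold_unit(*ab) for ab in truth_table}
print(outputs == truth_table)
```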


(g) [3 points] Suppose we clustered a set of N data points using two different clustering algorithms:
k-means and Gaussian mixtures. In both cases we obtained 5 clusters and in both cases the centers
of the clusters are exactly the same. Can 3 points that are assigned to different clusters in the k-
means solution be assigned to the same cluster in the Gaussian mixture solution? If no, explain. If
so, sketch an example or explain in 1-2 sentences.

Solution:
Yes, k-means assigns each data point to a unique cluster based on its distance to the cluster
center. Gaussian mixture clustering gives soft (probabilistic) assignment to each data point.
Therefore, even if cluster centers are identical in both methods, if Gaussian mixture components
have large variances (components are spread around their center), points on the edges
between clusters may be given different assignments in the Gaussian mixture solution.

Circle the correct answer(s).

(h) [3 points] As the number of training examples goes to infinity, your model trained on that data
will have:
A. Lower variance B. Higher variance C. Same variance

Solution:
Lower variance

(i) [3 points] As the number of training examples goes to infinity, your model trained on that data
will have:
A. Lower bias B. Higher bias C. Same bias

Solution:
Same bias

(j) [3 points] Suppose you are given an EM algorithm that finds maximum likelihood estimates for a
model with latent variables. You are asked to modify the algorithm so that it finds MAP estimates
instead. Which step or steps do you need to modify:
A. Expectation B. Maximization C. No modification necessary D. Both

Solution:
Maximization


Question 2. Comparison of ML algorithms

Assume we have a set of data from patients who have visited UPMC hospital during the year 2011. A
set of features (e.g., temperature, height) have been also extracted for each patient. Our goal is to decide
whether a new visiting patient has any of diabetes, heart disease, or Alzheimer (a patient can have one
or more of these diseases).

(a) [3 points] We have decided to use a neural network to solve this problem. We have two choices:
either to train a separate neural network for each of the diseases or to train a single neural network
with one output neuron for each disease, but with a shared hidden layer. Which method do you
prefer? Justify your answer.

Solution:
1- Neural network with a shared hidden layer can capture dependencies between diseases.
It can be shown that in some cases, when there is a dependency between the output nodes,
having a shared node in the hidden layer can improve the accuracy.
2- If there is no dependency between diseases (output neurons), then we would prefer to have
a separate neural network for each disease.

(b) [3 points] Some patient features are expensive to collect (e.g., brain scans) whereas others are not
(e.g., temperature). Therefore, we have decided to first ask our classification algorithm to predict
whether a patient has a disease, and if the classifier is 80% confident that the patient has a disease,
then we will do additional examinations to collect additional patient features. In this case, which
classification methods do you recommend: neural networks, decision tree, or naive Bayes? Justify
your answer in one or two sentences.

Solution:
We expect students to explain how each of these learning techniques can be used to output
a confidence value (any of these techniques can be modified to provide a confidence value).
In addition, Naive Bayes is preferable to the other methods since we can still use it for classification
when the values of some of the features are unknown.
We gave partial credit to those who mentioned neural networks because of their non-linear decision
boundary, or decision trees since they give us an interpretable answer.

(c) Assume that we use a logistic regression learning algorithm to train a classifier for each disease.
The classifier is trained to obtain MAP estimates for the logistic regression weights $W$. Our MAP
estimator optimizes the objective
$$W \leftarrow \arg\max_W \ln\Big[P(W) \prod_l P(Y^l | X^l, W)\Big]$$
where $l$ refers to the $l$th training example. We adopt a Gaussian prior with zero mean for the
weights $W = \langle w_1 \ldots w_n \rangle$, making the above objective equivalent to:
$$W \leftarrow \arg\max_W \; -C \sum_i w_i^2 + \sum_l \ln P(Y^l | X^l, W)$$

Note C here is a constant, and we re-run our learning algorithm with different values of C. Please
answer each of these true/false questions, and explain/justify your answer in no more than 2
sentences.
i. [2 points] The average log-probability of the training data can never increase as we increase C.
True False


Solution:
True. As we increase C, we give more weight to constraining the predictor. This makes
our predictor less flexible in fitting the training data (over-constraining the predictor makes it
unable to fit the training data).

ii. [2 points] If we start with C = 0, the average log-probability of test data will likely decrease as
we increase C.
True False

Solution:
False. As we increase the value of C (starting from C = 0), we keep our predictor from
overfitting the training data, and thus we expect the accuracy of our predictor on the
test data to increase.

iii. [2 points] If we start with a very large value of C, the average log-probability of test data can
never decrease as we increase C.
True False

Solution:
False. Similar to the previous parts, if we over-constrain the predictor (by choosing a very large
value of C), then it won’t be able to fit the training data, which makes it perform worse
on the test data.


(d) Decision boundary

(a) (b)

Figure 1: Labeled training set.

i. [2 points] Figure 1(a) illustrates a subset of our training data when we have only two features:
X1 and X2 . Draw the decision boundary for the logistic regression that we explained in part
(c).

Solution:
The decision boundary for logistic regression is linear. One candidate solution which classifies
all the data correctly is shown in Figure 1. We will accept other possible solutions
since the decision boundary depends on the value of C (it is possible for the trained classifier
to misclassify a few of the training data if we choose a large value of C).

ii. [3 points] Now assume that we add a new data point as it is shown in Figure 1(b). How does
it change the decision boundary that you drew in Figure 1(a)? Answer this by drawing both
the old and the new boundary.

Solution:
We expect the decision boundary to move a little toward the new data point.

(e) [3 points] Assume that we record information of all the patients who visit UPMC every day. How-
ever, for many of these patients we don’t know if they have any of the diseases, can we still improve
the accuracy of our classifier using these data? If yes, explain how, and if no, justify your answer.

Solution:
Yes, by using EM. In the class, we showed how EM can improve the accuracy of our classifier
using both labeled and unlabeled data. For more details, please look at
http://www.cs.cmu.edu/~tom/10601_fall2012/slides/GrMod3_10_9_2012.pdf, page 6.


Question 3. Regression
Consider real-valued variables $X$ and $Y$. The $Y$ variable is generated, conditional on $X$, from the following process:

$$\epsilon \sim N(0, \sigma^2)$$
$$Y = aX + \epsilon$$

where every $\epsilon$ is an independent variable, called a noise term, which is drawn from a Gaussian distribution
with mean 0 and standard deviation $\sigma$. This is a one-feature linear regression model, where $a$
is the only weight parameter. The conditional probability of $Y$ has distribution $p(Y|X, a) \sim N(aX, \sigma^2)$,
so it can be written as
$$p(Y|X, a) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2\sigma^2}(Y - aX)^2\right)$$
The following questions are all about this model.

MLE estimation
(a) [3 points] Assume we have a training dataset of n pairs (Xi , Yi ) for i = 1..n, and σ is known.
Which ones of the following equations correctly represent the maximum likelihood problem for
estimating a? Say yes or no to each one. More than one of them should have the answer “yes.”
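[Solution: no] $\arg\max_a \sum_i \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2\sigma^2}(Y_i - aX_i)^2\right)$

[Solution: yes] $\arg\max_a \prod_i \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2\sigma^2}(Y_i - aX_i)^2\right)$

[Solution: no] $\arg\max_a \sum_i \exp\left(-\frac{1}{2\sigma^2}(Y_i - aX_i)^2\right)$

[Solution: yes] $\arg\max_a \prod_i \exp\left(-\frac{1}{2\sigma^2}(Y_i - aX_i)^2\right)$

[Solution: no] $\arg\max_a \frac{1}{2}\sum_i (Y_i - aX_i)^2$

[Solution: yes] $\arg\min_a \frac{1}{2}\sum_i (Y_i - aX_i)^2$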

(b) [7 points] Derive the maximum likelihood estimate of the parameter a in terms of the training
example Xi ’s and Yi ’s. We recommend you start with the simplest form of the problem you found
above.

Solution:


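Use $F(a) = \frac{1}{2}\sum_i (Y_i - aX_i)^2$ and minimize $F$. Then

$$0 = \frac{\partial}{\partial a}\left[\frac{1}{2}\sum_i (Y_i - aX_i)^2\right] \qquad (2)$$
$$= \sum_i (Y_i - aX_i)(-X_i) \qquad (3)$$
$$= \sum_i aX_i^2 - X_i Y_i \qquad (4)$$
$$a = \frac{\sum_i X_i Y_i}{\sum_i X_i^2} \qquad (5)$$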

Partial credit: 1 point for writing a correct objective, 1 point for taking the derivative, 1 point
for getting the chain rule correct, 1 point for a reasonable attempt at solving for a. 6 points for
correct up to a sign error.
Many people got $\sum y_i / \sum x_i$ as the answer, by erroneously cancelling $x_i$ on top and bottom.
4 points for this answer when it is clear this cancelling caused the problem. If they explicitly
derived $\sum x_i y_i / \sum x_i^2$ along the way, 6 points. If it is completely unclear where $\sum y_i / \sum x_i$
came from, sometimes worth only 3 points (based on the partial credit rules above).
Some people wrote a gradient descent rule. We intended to ask for a closed-form maximum
likelihood estimate, not an algorithm to get it. (Yes, it is true that lectures never said there
exists a closed-form solution for linear regression MLE. But there is. In fact, there is a closed-
form solution even for multiple features, via linear algebra.) But we gave 4 points for getting
the rule correct; 3 points for correct with a sign error.
For gradient descent/ascent signs are tricky. If you are using the log-likelihood, thus maximization,
you want gradient ascent, and thus add the gradient. If instead you’re doing the
minimization problem, and using gradient descent, you need to subtract the gradient. Either way,
it comes out to $a \leftarrow a + \eta \sum_i (y_i - a x_i) x_i$. Interpretation: $\sum_i (y_i - a x_i) x_i$ is the correlation of
data against the residual. In the case of positive x,y, if the data still correlates with the residual,
that means predictions are too low, so you want to increase a.
Here is a lovely book chapter by Tufte (1974) on one-feature linear regression:
http://www.edwardtufte.com/tufte/dapp/chapter3.html
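
To make the two routes to the MLE concrete, the sketch below computes the closed-form estimate $\sum_i X_i Y_i / \sum_i X_i^2$ on synthetic data and checks that the gradient rule $a \leftarrow a + \eta \sum_i (y_i - a x_i) x_i$ converges to the same value. The data-generating parameters, the seed, and the step size are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, a_true, sigma = 200, 1.5, 1.0          # hypothetical settings
X = rng.normal(size=n)
Y = a_true * X + rng.normal(scale=sigma, size=n)

# Closed-form maximum likelihood estimate.
a_mle = np.sum(X * Y) / np.sum(X ** 2)

# Gradient rule; this step size halves the remaining error on every iteration.
a = 0.0
eta = 0.5 / np.sum(X ** 2)
for _ in range(60):
    a = a + eta * np.sum((Y - a * X) * X)

print(a_mle, a)
```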

MAP estimation
Let’s put a prior on $a$. Assume $a \sim N(0, \lambda^2)$, so

$$p(a|\lambda) = \frac{1}{\sqrt{2\pi}\lambda} \exp\left(-\frac{1}{2\lambda^2} a^2\right)$$
The posterior probability of $a$ is

$$p(a \mid Y_1, \ldots Y_n, X_1, \ldots X_n, \lambda) = \frac{p(Y_1, \ldots Y_n | X_1, \ldots X_n, a)\, p(a|\lambda)}{\int_{a'} p(Y_1, \ldots Y_n | X_1, \ldots X_n, a')\, p(a'|\lambda)\, da'}$$
We can ignore the denominator when doing MAP estimation.

(c) [3 points] Under the following conditions, how do the prior and conditional likelihood curves
change? Do $a_{MLE}$ and $a_{MAP}$ become closer together, or further apart?

For each condition, state whether the prior probability $p(a|\lambda)$ and the conditional likelihood
$p(Y_1 \ldots Y_n | X_1 \ldots X_n, a)$ become wider, narrower, or stay the same, and whether $|a_{MLE} - a_{MAP}|$ increases or decreases.

Condition                        Prior                  Conditional likelihood   |a_MLE − a_MAP|
As λ → ∞                         [Solution: wider]      [Solution: same]         [Solution: decrease]
As λ → 0                         [Solution: narrower]   [Solution: same]         [Solution: increase]
More data: as n → ∞ (fixed λ)    [Solution: same]       [Solution: narrower]     [Solution: decrease]

(d) [7 points] Assume σ = 1, and a fixed prior parameter λ. Solve for the MAP estimate of a,
arg max [ln p(Y1 ..Yn | X1 ..Xn , a) + ln p(a|λ)]
a

Your solution should be in terms of Xi ’s, Yi ’s, and λ.

Solution:

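$$\frac{\partial}{\partial a}\left[\log p(Y|X, a) + \log p(a|\lambda)\right] = \frac{\partial \ell}{\partial a} + \frac{\partial \log p(a|\lambda)}{\partial a} \qquad (6)$$
To stay sane, let’s look at it as maximization, not minimization. (It’s easy to get signs wrong by
trying to use the squared error minimization form from before.) Since σ = 1, the log-likelihood
and its derivative are

$$\ell(a) = \log\left[\prod_i \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2\sigma^2}(Y_i - aX_i)^2\right)\right] \qquad (7)$$
$$\ell(a) = -\log Z - \frac{1}{2}\sum_i (Y_i - aX_i)^2 \qquad (8)$$
$$\frac{\partial \ell}{\partial a} = -\sum_i (Y_i - aX_i)(-X_i) \qquad (9)$$
$$= \sum_i (Y_i - aX_i) X_i \qquad (10)$$
$$= \sum_i X_i Y_i - aX_i^2 \qquad (11)$$

Next get the partial derivative for the log-prior.

$$\frac{\partial \log p(a)}{\partial a} = \frac{\partial}{\partial a}\left[-\log(\sqrt{2\pi}\lambda) - \frac{1}{2\lambda^2} a^2\right] \qquad (12)$$
$$= -\frac{a}{\lambda^2} \qquad (13)$$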


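The full partial is the sum of that and the log-likelihood which we did before.

$$0 = \frac{\partial \ell}{\partial a} + \frac{\partial \log p(a)}{\partial a} \qquad (14)$$
$$0 = \left(\sum_i X_i Y_i - aX_i^2\right) - \frac{a}{\lambda^2} \qquad (15)$$
$$a = \frac{\sum_i X_i Y_i}{\left(\sum_i X_i^2\right) + 1/\lambda^2} \qquad (16)$$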

Partial credit: 1 point for writing out the log posterior, and/or doing some derivative. 1 point
for getting the derivative correct.
For full solution: deduct a point for a sign error. (There are many potential places for flipping
signs). Deduct a point for having $n/\lambda^2$: this results from wrapping a sum around the log-prior.
(Only the log-likelihood has a $\sum_i$ around it since it’s the probability of drawing each data point.
The parameter $a$ is drawn only once.)
Some people didn’t set σ = 1 and kept σ to the end. We simply gave credit if substituting
σ = 1 gave the right answer; a few people may have derived the wrong answer but we didn’t
carefully check all these cases.
People who did gradient descent rules were graded similarly as before: 4 points if correct,
deduct one for sign error.
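
The MAP result in (16) and the behaviour asked about in part (c) can be illustrated numerically. This sketch assumes σ = 1 as in the question; the synthetic data and function names are hypothetical. It checks that the derivative of the log posterior vanishes at the closed-form estimate and that the gap to the MLE shrinks as λ grows.

```python
import numpy as np

def a_mle(X, Y):
    return np.sum(X * Y) / np.sum(X ** 2)

def a_map(X, Y, lam):
    # Closed form from equation (16), with sigma = 1.
    return np.sum(X * Y) / (np.sum(X ** 2) + 1.0 / lam ** 2)

def log_posterior_grad(a, X, Y, lam):
    # Sum of equations (11) and (13).
    return np.sum((Y - a * X) * X) - a / lam ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=50)
Y = 2.0 * X + rng.normal(size=50)       # hypothetical data with a = 2

lams = (0.1, 1.0, 10.0)
gaps = [abs(a_mle(X, Y) - a_map(X, Y, lam)) for lam in lams]
grads = [log_posterior_grad(a_map(X, Y, lam), X, Y, lam) for lam in lams]
print(gaps)
```

A wide prior (large λ) adds almost nothing to the denominator, so the MAP estimate approaches the MLE; a narrow prior pulls the estimate toward 0.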


Question 4. Bayes Net

Consider a Bayesian network B with boolean variables.

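[Figure: Bayesian network B with nodes X11, X12, X13 in the top row, X21, X22 in the middle row, and X31, X32, X33 in the bottom row.]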

(a) [2 points] From the rule we covered in lecture, is there any variable(s) conditionally independent
of X33 given X11 and X12 ? If so, list all.

Solution:
X21

(b) [2 points] From the rule we covered in lecture, is there any variable(s) conditionally independent
of X33 given X22 ? If so, list all.

Solution:
Everything but X22 , X33 .

(c) [3 points] Write the joint probability P (X11 , X12 , X13 , X21 , X22 , X31 , X32 , X33 ) factored according
to the Bayes net. How many parameters are necessary to define the conditional probability distri-
butions for this Bayesian network?

(d) [2 points] Write an expression for P (X13 = 0, X22 = 1, X33 = 0) in terms of the conditional proba-
bility distributions given in your answer to part (c). Show your work.

Solution:
P (X13 = 0)P (X22 = 1|X13 = 0)P (X33 = 0|X22 = 1)


(e) [3 points] From your answer to (d), can you say X13 and X33 are independent? Why?

Solution:
No. Conditional independence doesn’t imply marginal independence.

(f) [3 points] Can you say the same thing when X22 = 1? In other words, can you say X13 and X33
are independent given X22 = 1? Why?

Solution:
Yes. X22 is the only parent of X33 and X13 is a nondescendant of X33 , so by the rule in the
lecture we can say they are independent given X22 = 1

(g) [2 points] Replace X21 and X22 by a single new variable X2 whose value is a pair of boolean
values, defined as: $X_2 = \langle X_{21}, X_{22} \rangle$. Draw the new Bayes net $B'$ after the change.

Solution:

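[Figure: Bayesian network B′ with nodes X11, X12, X13 in the top row, the merged node X2 = (X21, X22) in the middle row, and X31, X32, X33 in the bottom row.]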


(h) [3 points] Do all the conditional independences in $B$ hold in the new network $B'$? If not, write one
that is true in $B$ but not in $B'$. Consider only the variables present in both $B$ and $B'$.

Solution:
No. For instance, X32 is not conditionally independent of X33 given X22 anymore.
* Note: We noticed the problem description was a bit ambiguous, so we also accepted yes as a
correct answer


Question 5. Overfitting and PAC Learning

(a) Consider the training set accuracy and test set accuracy curves plotted above, during decision tree
learning, as the number of nodes in the decision tree grows. This decision tree is being used to
learn a function f : X → Y , where training and test set examples are drawn independently at
random from an underlying distribution P (X), after which the trainer provides a noise-free label
Y . Note error = 1 - accuracy. Please answer each of these true/false questions, and explain/justify
your answer in 1 or 2 sentences.
i. [2 points] T or F: Training error at each point on this curve provides an unbiased estimate of
true error.

Solution:
False. Training error is an optimistically biased estimate of true error, because the hypoth-
esis was chosen based on its fit to the training data.

ii. [1 point] T or F: Test error at each point on this curve provides an unbiased estimate of true
error.

Solution:
True. The expected value of test error (taken over different draws of random test sets) is
equal to true error.

iii. [1 point] T or F: Training accuracy minus test accuracy provides an unbiased estimate of the
degree of overfitting.

Solution:
True. We defined overfitting as test error minus training error, which is equal to training
accuracy minus test accuracy.

iv. [1 point] T or F: Each time we draw a different test set from P (X) the test accuracy curve may
vary from what we see here.

Solution:
True. Of course each random draw from P (X) may vary from another draw.

v. [1 point] T or F: The variance in test accuracy will increase as we increase the number of test
examples.


Solution:
False. The variance in test accuracy will decrease as we increase the size of the test set.

(b) Short answers.

i. [2 points] Given the above plot of training and test accuracy, which size decision tree would
you choose to use to classify future examples? Give a one-sentence justification.

Solution:
The tree with 10 nodes. This has the highest test accuracy of any of the trees, and hence
the highest expected true accuracy.

ii. [2 points] What is the amount of overfitting in the tree you selected?

Solution:
overfitting = training accuracy minus test accuracy = 0.77 - 0.74 = 0.03

Let us consider the above plot of training and test error from the perspective of agnostic PAC
bounds. Consider the agnostic PAC bound we discussed in class:

$$m \ge \frac{1}{2\epsilon^2}\left(\ln|H| + \ln(1/\delta)\right)$$
where $\epsilon$ is defined to be the difference between $error_{true}(h)$ and $error_{train}(h)$ for any hypothesis $h$
output by the learner.

iii. [2 points] State in one carefully worded sentence what the above PAC bound guarantees about
the two curves in our decision tree plot above.

Solution:
If we train on m examples drawn at random from P(X), then with probability (1 − δ) the
overfitting (difference between training and true accuracy) for each hypothesis in the plot
will be less than or equal to $\epsilon$. Note that the true accuracy is the expected value of the test
accuracy, taken over different randomly drawn test sets.

iv. [2 points] Assume we used 200 training examples to produce the above decision tree plot.
If we wish to reduce the overfitting to half of what we observe there, how many training
examples would you suggest we use? Justify your answer in terms of the agnostic PAC bound,
in no more than two sentences.

Solution:
The bound shows that m grows as $\frac{1}{2\epsilon^2}$. Therefore if we wish to halve $\epsilon$, it will suffice to
increase m by a factor of 4. We should use 200 × 4 = 800 training examples.
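
The factor of 4 follows from the $1/\epsilon^2$ dependence and does not depend on $|H|$ or $\delta$. A small sketch, with a hypothetical hypothesis-space size and confidence level chosen only for illustration:

```python
import math

def pac_sample_size(h_size, delta, eps):
    """Right-hand side of the agnostic PAC bound: (ln|H| + ln(1/delta)) / (2 eps^2)."""
    return (math.log(h_size) + math.log(1.0 / delta)) / (2.0 * eps ** 2)

# |H| = 1e6 and delta = 0.05 are assumed values; only eps changes between the two calls.
m_full = pac_sample_size(1e6, 0.05, eps=0.10)
m_half = pac_sample_size(1e6, 0.05, eps=0.05)
print(m_half / m_full)
```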

v. [2 points] Give a one sentence explanation of why you are not certain that your recommended
number of training examples will reduce overfitting by exactly one half.

Solution:
There are several reasons, including the following. 1. Our PAC theory result gives a
bound, not an equality, so 800 examples might decrease overfitting by more than half. 2.
The ”observed” overfitting is actually the test set accuracy, which is only an estimate of
true accuracy, so it may vary from true accuracy and our ”observed” overfitting will vary
accordingly.


(c) You decide to estimate the probability θ that a particular coin will turn up heads, by flipping it
10 times. You notice that if you repeat this experiment, each time obtaining a new set of 10 coin flips,
you get different resulting estimates. You repeat the experiment N = 20 times, obtaining estimates
$\hat\theta^1, \hat\theta^2 \ldots \hat\theta^{20}$. You calculate the variance in these estimates as
$$var = \frac{1}{N}\sum_{i=1}^{N} (\hat\theta^i - \theta_{mean})^2$$

where $\theta_{mean}$ is the mean of your estimates $\hat\theta^1, \hat\theta^2 \ldots \hat\theta^{20}$.

i. [4 points] Which do you expect to produce a smaller value for var: a Maximum likelihood
estimator (MLE), or a Maximum a posteriori (MAP) estimator that uses a Beta prior? Assume
both estimators are given the same data. Justify your answer in one sentence.

Solution:
We should expect the MAP estimate to produce a smaller value for var, because using the
Beta prior is equivalent to adding in a fixed set of ”hallucinated” training examples that
will not vary from experiment to experiment.
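
A short simulation illustrates the effect. The true θ = 0.6, the seed, and the Beta(3, 3) prior are assumptions made for this sketch; with that prior the MAP estimate is (heads + 2) / (flips + 4), i.e. four hallucinated flips that are the same in every experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_flips, n_experiments, theta = 10, 20, 0.6    # theta is a hypothetical true value
alpha, beta = 3.0, 3.0                          # assumed Beta prior parameters

heads = rng.binomial(n=n_flips, p=theta, size=n_experiments)

theta_mle = heads / n_flips
# Mode of the Beta(heads + alpha, tails + beta) posterior.
theta_map = (heads + alpha - 1) / (n_flips + alpha + beta - 2)

var_mle = np.var(theta_mle)    # np.var uses the 1/N formula from the question
var_map = np.var(theta_map)
print(var_mle, var_map)
```

Because the MAP estimate is the MLE numerator plus a constant, divided by a larger constant, its variance here is exactly $(10/14)^2$ times that of the MLE.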

Sample questions for “Fundamentals of Machine Learning 2018”
Teacher: Mohammad Emtiyaz Khan

A few important notes:

• In the final exam, no electronic devices are allowed except a calculator. Make
sure that your calculator is only a calculator and cannot be used for any other
purpose.

• No documents allowed apart from one A4 sheet of your own notes.

• You are not allowed to talk to others.

• For derivations, clearly explain your derivation step by step. In the final
exam you will be marked for steps as well as for the end result.

• For multiple-choice questions, you also need to provide explanations. You
will be marked for your answer as well as for your explanations.

• We will denote the output data vector by $y$, which is a vector that contains
all $y_n$, and the feature matrix by $X$, which is a matrix containing features $x_n^T$
as rows. Also, $\tilde{x}_n = [1, x_n^T]^T$.

• N denotes the number of data points and D denotes the dimensionality.

1 Multiple-Choice/Numerical Questions
1. Choose the options that are correct regarding machine learning (ML) and
artificial intelligence (AI),

(A) ML is an alternate way of programming intelligent machines.

(B) ML and AI have very different goals.
(C) ML is a set of techniques that turns a dataset into a software.
(D) AI is a software that can emulate the human mind.

Answer: (A), (C), (D)

2. Which of the following sentences is FALSE regarding regression?

(A) It relates inputs to outputs.

(B) It is used for prediction.
(C) It may be used for interpretation.
(D) It discovers causal relationships.

Answer: (D)

3. What is the rank of the following matrix?

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \qquad (1)$$

Answer: 1

4. What is the dimensionality of the null space of the following matrix?

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \qquad (2)$$

Answer: 2

5. What is the dimensionality of the null space of the following matrix?

$$A = \begin{bmatrix} 3 & 2 & -9 \\ -6 & -4 & 18 \\ 12 & 8 & -36 \end{bmatrix} \qquad (3)$$

Answer: 2
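
The answers to questions 3 to 5 follow from the rank-nullity theorem (nullity = number of columns − rank). A minimal NumPy check, where `rank_and_nullity` is a hypothetical helper name:

```python
import numpy as np

def rank_and_nullity(A):
    """Return (rank, dimension of the null space) of a matrix."""
    rank = int(np.linalg.matrix_rank(A))
    return rank, A.shape[1] - rank

ones = np.ones((3, 3))
B = np.array([[3.0, 2.0, -9.0], [-6.0, -4.0, 18.0], [12.0, 8.0, -36.0]])

# Every row of each matrix is a multiple of the first row, so both have rank 1.
print(rank_and_nullity(ones), rank_and_nullity(B))
```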

6. For the one-parameter model, mean-square error (MSE) is defined as follows:
$\frac{1}{2N}\sum_{n=1}^{N} (y_n - \beta_0)^2$. We have a half term in the front because,

(A) scaling MSE by half makes gradient descent converge faster.

(B) presence of half makes it easy to do grid search.
(C) it does not matter whether half is there or not.
(D) none of the above

Answer: C

7. Grid search is,

(A) Linear in D.
(B) Polynomial in D.
(C) Exponential in D.
(D) Linear in N .

Answer: C,D

8. The advantage of Grid search is (are),

(A) It can be applied to non-differentiable functions.

(B) It can be applied to non-continuous functions.
(C) It is easy to implement.
(D) It runs reasonably fast for multiple linear regression.

Answer: A,B,C.

9. Gradient of a continuous and differentiable function

(A) is zero at a minimum

(B) is non-zero at a maximum
(C) is zero at a saddle point
(D) decreases as you get closer to the minimum

Answer: A,C,D

10. Consider a linear-regression model with N = 3 and D = 1 with input-output
pairs as follows: $y_1 = 2, x_1 = 1, y_2 = 3, x_2 = 1, y_3 = 3, x_3 = 2$. What
is the gradient of mean-square error (MSE) with respect to $\beta_1$ when $\beta_0 = 0$
and $\beta_1 = 1$? Give your answer correct to two decimal digits.
Answer: -1.66 (deviation 0.01)
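
A quick numerical check of question 10, assuming the MSE uses the 1/(2N) scaling from question 6 and the model y ≈ β0 + β1 x. The function name is illustrative. The printed answer of about -1.66 matches this formula when y1 = 2; with y1 = 22 as printed, the same formula gives about -8.33, so the value of y1 is treated as an assumption here and both cases are shown.

```python
def mse_gradient_beta1(xs, ys, beta0, beta1):
    """Gradient of (1/(2N)) * sum (y - beta0 - beta1*x)^2 with respect to beta1."""
    n = len(xs)
    return -sum((y - beta0 - beta1 * x) * x for x, y in zip(xs, ys)) / n


xs = [1, 1, 2]
g_assumed = mse_gradient_beta1(xs, [2, 3, 3], 0, 1)    # y1 = 2 (assumption)
g_printed = mse_gradient_beta1(xs, [22, 3, 3], 0, 1)   # y1 = 22 (as printed)

print(round(g_assumed, 2))  # -1.67
print(round(g_printed, 2))  # -8.33
```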

11. Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?

(A) O(D)
(B) O(N)
(C) O(ND)
(D) O(ND^2)

Answer: (A)

12. Let us say that we are fitting a one-parameter model to the data, i.e. $y_n \approx \beta_0$.
The average of $y_1, y_2, \ldots, y_N$ is 1. We start gradient descent at $\beta_0^{(0)} = 0$ and
set the step-size to 0.5. What is the value of $\beta_0$ after 3 iterations, i.e., the
value of $\beta_0^{(3)}$?
Answer: 0.875 (deviation 0.01)

13. Let us say that we are fitting a one-parameter model to the data, i.e. $y_n \approx \beta_0$.
The average of $y_1, y_2, \ldots, y_N$ is 1. We start gradient descent at $\beta_0^{(0)} = 10$ and
set the step-size to 0.5. What is the value of $\beta_0$ after 3 iterations, i.e., the
value of $\beta_0^{(3)}$?
Answer: 2.125 (deviation 0.01)
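
Questions 12 and 13 can be reproduced with a few lines of code. This is a minimal sketch, assuming the cost is the 1/(2N)-scaled MSE from question 6, whose gradient with respect to β0 is β0 minus the mean of the data. The function name is illustrative.

```python
def gradient_descent_mean_model(y_mean, beta0, step_size, iterations):
    """Run gradient descent on (1/(2N)) * sum (y_n - beta0)^2.

    The gradient with respect to beta0 is beta0 - mean(y), so only the
    mean of the data is needed.
    """
    history = [beta0]
    for _ in range(iterations):
        gradient = beta0 - y_mean
        beta0 = beta0 - step_size * gradient
        history.append(beta0)
    return history


print(gradient_descent_mean_model(1.0, 0.0, 0.5, 3))   # [0.0, 0.5, 0.75, 0.875]
print(gradient_descent_mean_model(1.0, 10.0, 0.5, 3))  # [10.0, 5.5, 3.25, 2.125]
```

With a step-size of 0.5 each update halves the distance to the mean, which is why both runs approach 1 at the same rate from different starting points.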

14. Computational complexity of Gradient descent is,

(A) linear in D
(B) linear in N
(C) polynomial in D
(D) dependent on the number of iterations

Answer: C

15. Generalization error measures how well an algorithm performs on unseen data.
The test error obtained using cross-validation is an estimate of the generalization
error. Is this estimate unbiased?
Answer: (No)

16. K-fold cross-validation is

(A) linear in K
(B) quadratic in K
(C) cubic in K
(D) exponential in K

Answer: A
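
The linear cost in K comes from training one model per fold. The sketch below only builds the index splits, under the assumption of contiguous folds without shuffling; `k_fold_indices` is a hypothetical helper, not a library function.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    # One model would be fitted on `train` and evaluated on `test` here.
    print(len(train), test)
```

The loop body runs exactly K times, so the total work is K times the cost of one fit and one evaluation.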

17. You observe the following while fitting a linear regression to the data: As
you increase the amount of training data, the test error decreases and the
training error increases. The train error is quite low (almost what you expect
it to be), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the
most probable option.

(A) High variance

(B) High model bias
(C) High estimation bias
(D) None of the above

Answer: A

18. Adding more basis functions in a linear model... (pick the most probable
option)

(A) Decreases model bias

(B) Decreases estimation bias
(C) Decreases variance

(D) Doesn’t affect bias and variance

Answer: A

2 Multiple-output regression

Suppose we have N regression training pairs, but instead of one output for each
input vector $x_n \in \mathbb{R}^D$, we now have 2 outputs $y_n = [y_{n1}, y_{n2}]$, where $y_{n1}$ and
$y_{n2}$ are real numbers. For each output, we wish to fit a separate linear model:

$y_{n1} \approx f_1(x_n) = \beta_{10} + \beta_{11} x_{n1} + \beta_{12} x_{n2} + \ldots + \beta_{1D} x_{nD} = \boldsymbol{\beta}_1^T \tilde{x}_n$  (4)
$y_{n2} \approx f_2(x_n) = \beta_{20} + \beta_{21} x_{n1} + \beta_{22} x_{n2} + \ldots + \beta_{2D} x_{nD} = \boldsymbol{\beta}_2^T \tilde{x}_n$  (5)

where $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$ are vectors of $\beta_{1d}$ and $\beta_{2d}$ respectively, for d = 0, 1, 2, ..., D, and
$\tilde{x}_n^T = [1 \; x_n^T]$.

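Our goal is to estimate $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$, for which we choose to minimize the following
cost function:

$L(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2) := \sum_{n=1}^{N} \left[ \frac{1}{2} \left( y_{n1} - \boldsymbol{\beta}_1^T \tilde{x}_n \right)^2 + \frac{1}{2} \left( y_{n2} - \boldsymbol{\beta}_2^T \tilde{x}_n \right)^2 \right] + \lambda_1 \sum_{d=0}^{D} \beta_{1d}^2 + \lambda_2 \sum_{d=0}^{D} \beta_{2d}^2.$  (6)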

(A) Derive the gradient of L with respect to β 1 and β 2 .

(B) Suppose N = 20 and D = 15. Do we need to regularize? Explain your answer.

(C) Suppose we increase the number of data points from N = 20 to N = 200.
Should we decrease the value of λ1 and λ2? Explain your answer.

(D) What is the computational complexity with respect to N and D?

Answer:

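(A) $\frac{\partial L}{\partial \boldsymbol{\beta}_1} = -\sum_{n=1}^{N} \left( y_{n1} - \boldsymbol{\beta}_1^T \tilde{x}_n \right) \tilde{x}_n + 2 \lambda_1 \boldsymbol{\beta}_1$, and the same for $\boldsymbol{\beta}_2$ with $y_{n2}$ and $\lambda_2$.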

(B) The number of parameters is equal to 30 and the number of data points is
equal to 40. It is good to regularize, but just a mild regularization will do
since the number of parameters is still less than number of data points.

(D) Same as gradient descent (please put an exact number here for the final
exam).
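
The gradient in part (A) can be checked numerically. The sketch below assumes the cost in (6) has a factor of one half on the data terms only, so differentiating the penalty $\lambda_1 \sum_d \beta_{1d}^2$ contributes $2 \lambda_1 \boldsymbol{\beta}_1$. It compares that analytic gradient with central finite differences on random data; all names, sizes and the values of λ are illustrative.

```python
import numpy as np


def cost(beta1, beta2, X_tilde, Y, lam1, lam2):
    """Cost (6): half squared errors for both outputs plus squared-norm penalties."""
    r1 = Y[:, 0] - X_tilde @ beta1
    r2 = Y[:, 1] - X_tilde @ beta2
    return 0.5 * r1 @ r1 + 0.5 * r2 @ r2 + lam1 * beta1 @ beta1 + lam2 * beta2 @ beta2


def grad_beta1(beta1, X_tilde, y1, lam1):
    """Analytic gradient of (6) with respect to beta1 (assumes no 1/2 on the penalty)."""
    return -X_tilde.T @ (y1 - X_tilde @ beta1) + 2.0 * lam1 * beta1


rng = np.random.default_rng(0)
N, D = 20, 15
X_tilde = np.hstack([np.ones((N, 1)), rng.normal(size=(N, D))])  # prepend the constant 1
Y = rng.normal(size=(N, 2))
beta1, beta2 = rng.normal(size=D + 1), rng.normal(size=D + 1)
lam1, lam2 = 0.1, 0.3

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
numeric = np.array([
    (cost(beta1 + eps * e, beta2, X_tilde, Y, lam1, lam2)
     - cost(beta1 - eps * e, beta2, X_tilde, Y, lam1, lam2)) / (2 * eps)
    for e in np.eye(D + 1)
])
max_abs_diff = np.max(np.abs(numeric - grad_beta1(beta1, X_tilde, Y[:, 0], lam1)))
print(max_abs_diff)
```

In this form one gradient evaluation is dominated by two matrix-vector products with the N x (D + 1) matrix, which is O(ND) work per output.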

3 Eigenvalues

Given a real-valued matrix X, show that all the non-zero eigenvalues of $X X^T$ and
$X^T X$ are the same.

Answer: To prove this, you can use the SVD $X = U S V^T$. Then $X X^T =
U S^2 U^T$ and $X^T X = V S^2 V^T$. The non-zero eigenvalues are the same, although the
number of eigenvalues may differ.
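
A numerical illustration of this result, not a proof: the sketch below draws an arbitrary 3 x 5 real matrix (the shape and seed are assumptions chosen for the example) and compares the eigenvalues of the two products with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))  # arbitrary real 3 x 5 matrix

eig_small = np.linalg.eigvalsh(X @ X.T)  # 3 eigenvalues
eig_large = np.linalg.eigvalsh(X.T @ X)  # 5 eigenvalues, 2 of them numerically zero

# eigvalsh returns eigenvalues in ascending order, so the non-zero ones come last.
same_nonzero = np.allclose(eig_small, eig_large[-3:])
extra_are_zero = np.allclose(eig_large[:2], 0.0)
print(same_nonzero, extra_are_zero)
```

The larger product has two additional eigenvalues, and both are zero up to rounding error.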

4 Artificial Neural Networks

Consider the following artificial neural network with the nonlinear transformation
$z_{nm} = \sigma(a_{nm})$ (see figure below). Here, n is the data index and m is the index of
hidden units. There are two binary outputs $y_{n1}$ and $y_{n2}$ taking values in {0, 1}.

Figure 1: Artificial neural network

Suppose you have N = 200 data points but M = 200 hidden units for each layer.
What problem(s) are you likely to encounter when training such a network? How
would you solve the problem(s)?

Answer: Overfitting. There are multiple ways to tackle this problem as discussed
in the lecture.

1. Which of the following would be more appropriate to be replaced with question mark in the following figure?

a) Data Analysis
b) Data Science
c) Descriptive Analytics
d) None of the mentioned

Answer: b
Explanation: Data Science is a multidisciplinary field that involves extracting knowledge from large volumes of data that are
structured or unstructured.

2. Point out the correct statement.

a) Raw data is original source of data
b) Preprocessed data is original source of data
c) Raw data is the data obtained after processing steps
d) None of the mentioned

Answer: a
Explanation: Accounting programs are prototypical examples of data processing applications.

3. Which of the following is performed by Data Scientist?

a) Define the question
b) Create reproducible code
c) Challenge results
d) All of the mentioned

Answer: d
Explanation: A data scientist is a job title for an employee or business intelligence (BI) consultant who excels at analyzing data,
particularly large amounts of data.

4. Which of the following is the most important language for Data Science?
a) Java
b) Ruby
c) R
d) None of the mentioned

Answer: c
Explanation: R is free software for statistical computing and analysis.

5. Point out the wrong statement.

a) Merging concerns combining datasets on the same observations to produce a result with more variables
b) Data visualization is the organization of information according to preset specifications
c) Subsetting can be used to select and exclude variables and observations
d) All of the mentioned

Answer: b
Explanation: Data formatting is the organization of information according to preset specifications.

6. Which of the following approaches should be used to ask a data analysis question?
a) Find only one solution for particular problem
b) Find out the question which is to be answered
c) Find out answer from dataset without asking question
d) None of the mentioned

Answer: b
Explanation: Data analysis has multiple facets and approaches.

7. Which of the following is one of the key data science skills?

a) Statistics
b) Machine Learning
c) Data Visualization
d) All of the mentioned

Answer: d
Explanation: Data visualization is the presentation of data in a pictorial or graphical format.

8. Which of the following is a key characteristic of a hacker?

a) Afraid to say they don’t know the answer
b) Willing to find answers on their own
c) Not Willing to find answers on their own
d) All of the mentioned

Answer: b
Explanation: Hacker is an expert at programming and solving problems with a computer.

9. Which of the following is characteristic of Processed Data?

a) Data is not ready for analysis
b) All steps should be noted
c) Hard to use for data analysis
d) None of the mentioned

Answer: b
Explanation: Processing includes merging, summarizing and subsetting data.

10. Raw data should be processed only one time.

a) True
b) False

Answer: b
Explanation: Raw data may only need to be processed once.

Sanfoundry Global Education & Learning Series – Data Science.

This set of Data Science Multiple Choice Questions & Answers (MCQs) focuses on “ToolBox Overview”.
1. Which of the following principles is incorrectly represented in the figure below?

a) Show Comparisons
b) Integrate Evidence
c) Describe Evidence
d) None of the mentioned

Answer: d
Explanation: Principles of Analytical graphs are sequentially shown in the stepwise manner.

2. Point out the correct statement.

a) Least square is an estimation tool
b) Least square problems fall into three categories
c) Compound least square is one of the categories of least square
d) None of the mentioned

Answer: a
Explanation: The Method of Least Squares is a procedure to determine the best fit line to data.

3. How many principles of analytical graphs exist?

a) 3
b) 4
c) 6
d) None of the mentioned

Answer: c
Explanation: Six Principles of Analytical Graphs are useful for data analysis.

4. Which of the following is not a step in data analysis?

a) Obtain the data
b) Clean the data
c) EDA
d) None of the mentioned

Answer: d
Explanation: EDA stands for Exploratory Data Analysis.

5. Point out the wrong statement.

a) Simple linear regression is equipped to handle more than one predictor
b) Compound linear regression is not equipped to handle more than one predictor
c) Linear regression consists of finding the best-fitting straight line through the points
d) All of the mentioned

Answer: a
Explanation: Simple linear regression is not equipped to handle more than one predictor.
6. Which of the following techniques comes under practical machine learning?
a) Bagging
b) Boosting
c) Forecasting
d) None of the mentioned

Answer: b
Explanation: Boosting is an approach to machine learning based on the idea of creating a highly accurate predictor.

7. The data products shown in the figure below are built using which programming language?

a) S
b) Python
c) R
d) Java

Answer: c
Explanation: Products mentioned in the figure are web application frameworks written in R.

8. Which of the following techniques is also referred to as Bagging?

a) Bootstrap aggregating
b) Bootstrap subsetting
c) Bootstrap predicting
d) All of the mentioned

Answer: a
Explanation: Bagging is used in statistical classification and regression.

9. Which of the following is characteristic of Raw Data?

a) Data is ready for analysis
b) Original version of data
c) Easy to use for data analysis
d) None of the mentioned

Answer: b
Explanation: Raw data is data that has not been processed for use.

10. Standard normal RVs are always labelled as Z.

a) True
b) False

Answer: b
Explanation: Standard normal RVs are often labelled as Z.

1. Which of the following CLI commands can also be used to rename files?
a) rm
b) mv
c) rm -r
d) none of the mentioned
Answer: b
Explanation: mv stands for move.

2. Point out the correct statement.

a) CLI can help you to organize messages
b) CLI can help you to organize files and folders
c) Navigation of directory is possible using CLI
d) None of the mentioned

Answer: b
Explanation: CLI stands for Command Line Interface.

3. Which of the following commands allows you to change directory to one level above your parent directory?
a) cd
b) cd .
c) cd ..
d) none of the mentioned

Answer: c
Explanation: cd stands for change directory.

4. Which of the following is not a CLI command?

a) delete
b) rm
c) clear
d) none of the mentioned

Answer: a
Explanation: rm can be used to remove files and directories.

5. Point out the wrong statement.

a) Command is the CLI command which does a specific task
b) There is one and only flag for every command in CLI
c) Flags are the options given to command for activating particular behaviour
d) All of the mentioned

Answer: b
Explanation: Depending on the command, there can be zero or more flags and arguments.

6. Which of the following systems record changes to a file over time?

a) Record Control
b) Version Control
c) Forecast Control
d) None of the mentioned

Answer: b
Explanation: Version control is also known as revision control.

7. Which of the following is a revision control system?

a) Git
b) NumPy
c) Slidify
d) None of the mentioned

Answer: a
Explanation: Git is a free and open source distributed version control system designed to handle everything from small to very large
projects with speed and efficiency.
8. Which of the following command-line environments is used for interacting with Git?
a) GitHub
b) Git Bash
c) Git Boot
d) All of the mentioned

Answer: b
Explanation: Git for Windows provides a BASH emulation used to run Git from the command line.

9. Which of the following web hosting services uses the Git version control system?
a) GitHub
b) Open Hash
c) Git Bash
d) None of the mentioned

Answer: a
Explanation: GitHub is a Web-based Git repository hosting service, which offers all of the distributed revision control and source
code management (SCM) functionality of Git.

10. cp command can be used to copy the content of directories.

a) True
b) False

Answer: a
Explanation: -r flag should be used for copying the content.

1. Which of the following adds all new files to local repository?

a) git add .
b) git add -u
c) git add -A
d) none of the mentioned

Answer: a
Explanation: You should do this before committing.

2. Point out the correct statement.

a) You don’t need GitHub to use Git
b) CLI can help you to organize files and folders
c) Navigation of directory is possible using CLI
d) None of the mentioned

Answer: b
Explanation: CLI stands for Command Line Interface.

3. Which of the following commands updates tracking for files that are modified?
a) git add .
b) git add -u
c) git add -A
d) none of the mentioned

Answer: b
Explanation: The git add command adds a change in the working directory to the staging area.

4. Which of the following commands is used to give a message description?

a) git commit -m
b) git commit -d
c) git commit -message
d) none of the mentioned
Answer: a
Explanation: This only updates your local repository.

5. Point out the wrong statement.

a) You need GitHub to use Git
b) GitHub allows you to share repositories with others
c) GitHub allows you to access others repositories
d) All of the mentioned

Answer: a
Explanation: GitHub can store a remote copy of your repository.

6. Which of the following commands allows you to update the repository?

a) push
b) pop
c) update
d) none of the mentioned

Answer: a
Explanation: The git push command sends your local commits to the remote repository.

7. Which of the following is the correct way of creating GitHub repository in to well labelled commits?
a) Fork another user’s repository
b) Pop another user’s repository
c) Zip another user’s repository
d) None of the mentioned

Answer: a
Explanation: A fork is a copy of a repository.

8. Which of the following commands is used to squash the commits?

a) rebase
b) squash
c) boot
d) all of the mentioned

Answer: a
Explanation: In Git, there are two main ways to integrate changes from one branch into another: the merge and the rebase.

9. Which of the following statements would create a branch named ‘sanfoundry’?

a) git checkout -b sanfoundry
b) git checkout -c sanfoundry
c) git check -b sanfoundry
d) none of the mentioned

Answer: a
Explanation: A branch in Git is simply a lightweight movable pointer to one of these commits.

10. branch command is used to determine which branch you are currently in.
a) True
b) False

Answer: a
Explanation: Running git branch with no arguments lists the local branches and marks the current one with an asterisk.
1. Which of the following principle characteristic is odd man out in the below figure?

a) Principle 1
b) Principle 2
c) Principle 3
d) Principle 4

Answer: c
Explanation: Multivariate Data is the only characteristic related to Principle 3.

2. Point out the correct statement.

a) Descriptive analysis is first kind of data analysis performed
b) Descriptions can be generalized without statistical modelling
c) Description and Interpretation are same in descriptive analysis
d) None of the mentioned

Answer: a
Explanation: Descriptive analysis describes a set of data.

3. Which of the following allows you to find relationships you didn’t know about?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned

Answer: b
Explanation: In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often
with visual methods.

4. Which of the following commands helps us to give a message description?

a) git commit -m
b) git commit -d
c) git commit -message
d) none of the mentioned

Answer: a
Explanation: This only updates your local repository.

5. Point out the wrong statement.

a) Exploratory analyses are usually the final way
b) Exploratory models are useful for discovering new connection
c) Exploratory analysis alone should not be used for predicting
d) All of the mentioned

Answer: a
Explanation: Exploratory analyses are usually not the final way.

6. Which of the following uses data on some objects to predict values for other objects?
a) Inferential
b) Exploratory
c) Predictive
d) None of the mentioned

Answer: c
Explanation: A prediction is a forecast, but not only about the weather.

7. Which of the following is the common goal of statistical modelling?

a) Inference
b) Summarizing
c) Subsetting
d) None of the mentioned

Answer: a
Explanation: Inference is the act or process of deriving logical conclusions from premises known or assumed to be true.

8. Which of the following models is usually a gold standard for data analysis?
a) Inferential
b) Descriptive
c) Causal
d) All of the mentioned

Answer: c
Explanation: A causal model is an abstract model that describes the causal mechanisms of a system.

9. Which of the following analyses should come in place of the question mark in the figure below?

a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned

Answer: a
Explanation: Inferential statistics is concerned with making predictions or inferences about a population from observations and
analyses of a sample.

10. Causal analysis is commonly applied to census data.

a) True
b) False

Answer: b
Explanation: Descriptive analysis is commonly applied to census data.

1. Which of the following types of data science question is missing in the figure?

a) Correlative
b) Exploratory
c) Relative
d) None of the mentioned

Answer: b
Explanation: Exploratory analysis is used to find relationships you didn’t know about.

2. Point out the correct statement.

a) Descriptive analysis can be more useful for defining future studies
b) Correlation does imply causation
c) Inference is commonly the goal of statistical model
d) None of the mentioned

Answer: c
Explanation: Inference depends heavily on the sampling scheme.

3. Which of the following uses a relatively small amount of data to estimate something about a bigger population?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned

Answer: a
Explanation: Inferential statistics is concerned with making predictions or inferences about a population from observations and
analyses of a sample.

4. Which of the following analyses helps to find the effect of a variable change?
a) Inferential
b) Exploratory
c) Causal
d) None of the mentioned

Answer: c
Explanation: Causal Analysis provides the real reason why things happen and hence allows focused change activity.

5. Point out the correct statement.

a) Exploratory analyses are not usually the final way
b) Inferential models are useful for discovering new connection
c) Inference involves estimating uncertainty
d) All of the mentioned

Answer: c
Explanation: Statistical inference is the process of deducing properties of an underlying distribution by analysis of data.

6. Which of the following relationships are usually identified as average effects?

a) Descriptive
b) Causal
c) Predictive
d) None of the mentioned

Answer: b
Explanation: A correlation is a measure or degree of relationship between two variables.
7. Which of the following is more applicable to the below figure?

a) Descriptive
b) Causal
c) Predictive
d) None of the mentioned

Answer: a
Explanation: Google trends helps to describe the set of data.

8. Which of the following analyses is usually modeled by a deterministic set of equations?

a) Predictive
b) Causal
c) Mechanistic
d) All of the mentioned

Answer: c
Explanation: Equations are based on physical/engineering science.

9. Which of the following analyses is incredibly hard to infer?

a) Inferential
b) Exploratory
c) Causal
d) Mechanistic

Answer: d
Explanation: Mechanistic analyses are hard to infer except for simple simulations.

10. Accurate prediction depends heavily on measuring the right variables.

a) True
b) False

Answer: a
Explanation: Prediction is very hard, especially for future references.

1. Which of the following terms is appropriate to the figure below?

a) Large Data
b) Big Data
c) Dark Data
d) None of the mentioned

Answer: b
Explanation: Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.
2. Point out the correct statement.
a) Machine learning focuses on prediction, based on known properties learned from the training data
b) Data Cleaning focuses on prediction, based on known properties learned from the training data
c) Representing data in a form which both mere mortals can understand and get valuable insights is as much a science as much as it is art
d) None of the mentioned

Answer: d
Explanation: Visualization is becoming a very important aspect.

3. Which of the following characteristics of big data is relatively more relevant to data science?
a) Velocity
b) Variety
c) Volume
d) None of the mentioned

Answer: b
Explanation: Big data enables organizations to store, manage, and manipulate vast amounts of disparate data at the right speed and at
the right time.

4. Which of the following analytical capabilities are provided by an information management company?
a) Stream Computing
b) Content Management
c) Information Integration
d) All of the mentioned

Answer: d
Explanation: With stream computing, store less, analyze more and make better decisions faster.

5. Point out the wrong statement.

a) The big volume indeed represents Big Data
b) The data growth and social media explosion have changed how we look at the data
c) Big Data is just about lots of data
d) All of the mentioned

Answer: c
Explanation: Big Data is actually a concept providing an opportunity to find new insight into your existing data as well as guidelines to
capture and analyze your future data.

6. Which of the following steps is performed by a data scientist after acquiring the data?
a) Data Cleansing
b) Data Integration
c) Data Replication
d) All of the mentioned

Answer: a
Explanation: Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or removing) corrupt or
inaccurate records from a record set, table, or database.

7. 3V’s are not sufficient to describe big data.

a) True
b) False

Answer: a
Explanation: IBM data scientists break big data into four dimensions: volume, variety, velocity and veracity.

8. Which of the following focuses on the discovery of (previously) unknown properties on the data?
a) Data mining
b) Big Data
c) Data wrangling
d) Machine Learning

Answer: a
Explanation: Data munging or data wrangling is loosely the process of manually converting or mapping data from one “raw” form
into another format that allows for more convenient consumption of the data with the help of semi-automated tools.

9. Which of the following languages should replace the question mark in the figure below?

a) Java
b) PHP
c) COBOL
d) None of the mentioned

Answer: a
Explanation: Java is used for processing data in Big data Analytics.

10. Beyond volume, variety and velocity, there is the issue of big data veracity.
a) True
b) False

Answer: a
Explanation: Data veracity concerns uncertain or imprecise data.

1. Which of the following design terms is perfectly applicable to the figure below?

a) Correlation
b) Confounding
c) Causation
d) None of the mentioned

Answer: b
Explanation: Confounding can be dealt with either at the study design stage, or at the analysis stage.

2. Point out the correct statement.

a) If equations are known but the parameters are not, they may be inferred with data analysis
b) If equations are not known but the parameters are, they may be inferred with data analysis
c) If equations and parameter are not, they may be inferred with data analysis
d) None of the mentioned

Answer: a
Explanation: Usually the random component of data is measurement error.
3. Which of the following is the top most important thing in data science?
a) answer
b) question
c) data
d) none of the mentioned

Answer: b
Explanation: The second most important is the data.

4. Which of the following approaches should be used if you can’t fix the variable?
a) randomize it
b) non stratify it
c) generalize it
d) none of the mentioned

Answer: a
Explanation: If you can’t fix the variable, stratify it.

5. Point out the wrong statement.

a) Randomized studies are not used to identify causation
b) Complicated approaches exist for inferring causation
c) Causal relationships may not apply to every individual
d) All of the mentioned

Answer: a
Explanation: Randomized studies are usually used to identify causation.

6. Which of the following is a good way of performing experiments in data science?

a) Measure variability
b) Generalize to the problem
c) Have Replication
d) All of the mentioned

Answer: d
Explanation: Experiments on causal relationships investigate the effect of one or more variables on one or more outcome variables.

7. Which of the following is commonly referred to as ‘data fishing’?

a) Data bagging
b) Data booting
c) Data merging
d) None of the mentioned

Answer: d
Explanation: Data dredging is sometimes referred to as “data fishing”.

8. Which of the following data mining techniques is used to uncover patterns in data?
a) Data bagging
b) Data booting
c) Data merging
d) Data Dredging

Answer: d
Explanation: Data dredging, also called data snooping, refers to the practice of misusing data mining techniques to show
misleading scientific ‘research’.
9. Which of the following figures correctly shows the approximate order of difficulty?

c)
d) All of the mentioned

Answer: a
Explanation: Predictive analysis is the practice of extracting information from existing data sets.

10. If X predicts Y, it does mean X causes Y.

a) True
b) False

Answer: b
Explanation: If X predicts Y, it does not mean X causes Y.
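
The point of question 10, together with the confounding idea from question 1 of this set, can be illustrated with a small simulation. This is a sketch under invented assumptions: a hidden variable z drives both x and y, and neither x nor y influences the other. The helper `pearson` and all parameters are illustrative.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]   # hidden confounder
x = [zi + random.gauss(0, 1) for zi in z]    # driven by z, not by y
y = [zi + random.gauss(0, 1) for zi in z]    # driven by z, not by x

r_xy = pearson(x, y)
# Subtracting the confounder leaves only the independent noise terms.
r_resid = pearson([xi - zi for xi, zi in zip(x, z)],
                  [yi - zi for yi, zi in zip(y, z)])
print(round(r_xy, 2), round(r_resid, 2))
```

Here x is a useful predictor of y (the correlation is close to 0.5 in this setup), yet the association disappears once z is accounted for.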

1. Which of the following operations are supported on Time Frames?

a) idxmax
b) ixmax
c) ixmin
d) none of the mentioned

Answer: a
Explanation: Operands can also appear in a reversed order.

2. Point out the correct statement.

a) Timedeltas are differences in times, expressed in different units
b) You can construct a Timedelta scalar through various arguments
c) DateOffsets cannot be used in construction
d) All of the mentioned

Answer: a
Explanation: Timedeltas can be both positive and negative.

3. Numeric reduction operation for timedelta64[ns] will return _________ objects.

a) Timeseries
b) Timeplus
c) Timedelta
d) None of the mentioned

Answer: c
Explanation: NaT are skipped during evaluation.

4. Which of the following scalars can be converted to other ‘frequencies’ by astyping to a specific timedelta type?
a) Timedelta Series
b) TimedeltaIndex
c) Timedelta
d) All of the mentioned

Answer: d
Explanation: These operations yield Series and propagate NaT -> nan.
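
The behaviour described in questions 3 and 4 can be seen in a short pandas session. This is a minimal sketch with made-up values; it shows reductions returning `Timedelta` objects with `NaT` skipped, and uses division by another `Timedelta` as one way of converting to a different unit.

```python
import pandas as pd

# None is converted to NaT (the missing value for time types).
s = pd.Series(pd.to_timedelta(["1 days", "2 days", None]))

total = s.sum()     # NaT is skipped during the reduction
average = s.mean()
print(type(average).__name__, total, average)

# Dividing by another Timedelta gives a float count of that unit; NaT becomes NaN.
in_hours = s / pd.Timedelta(hours=1)
print(in_hours.tolist())
```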

5. Point out the wrong statement.

a) min, max, idxmin, idxmax operations are supported on Series
b) You cannot pass a timedelta to get a particular value
c) Division by the numpy scalar is true division
d) None of the mentioned

Answer: b
Explanation: Dividing or multiplying a timedelta64[ns] Series by an integer or integer Series yields another timedelta64[ns] dtypes
Series.
6. Which of the following is used to generate an index with time delta?
a) TimeIndex
b) TimedeltaIndex
c) LeadIndex
d) None of the mentioned

Answer: b
Explanation: Using TimedeltaIndex you can pass string-like, Timedelta, timedelta, or np.timedelta64 objects.

7. Combining a TimedeltaIndex with a DatetimeIndex allows certain combination operations that are NaT preserving.
a) True
b) False

Answer: a
Explanation: You can also convert indices to yield another index.

8. Using _________ on categorical data will produce similar output to a Series or DataFrame of type string.
a) .desc()
b) .describe()
c) .rank()
d) none of the mentioned

Answer: b
Explanation: Categorical data has a categories property and an ordered property.

9. Which of the following methods can be used to rename categorical data?

a) Categorical.rename_categories()
b) Categorical.rename()
c) Categorical.mv_categories()
d) None of the mentioned

Answer: a
Explanation: Renaming categories is done by assigning new values to the Series.cat.categories property.

10. All values of categorical data are either in categories or np.nan.

a) True
b) False
View Answer

Answer: a
Explanation: Categoricals are pandas data type.
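
Questions 8 to 10 can be tried out directly. The sketch below uses a small invented Series and the `.cat` accessor, which is how a categorical Series exposes `rename_categories`; the category labels are arbitrary.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", None], dtype="category")

# describe() reports count, unique, top and freq, as it would for strings.
summary = s.describe()
print(summary)

# Rename the categories; values are relabelled, missing values stay missing.
renamed = s.cat.rename_categories({"a": "alpha", "b": "beta"})
print(list(renamed.cat.categories))
print(renamed.isna().tolist())
```

Every value is either one of the categories or missing, which is what question 10 states.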

1. The plot method on Series and DataFrame is just a simple wrapper around ____________
a) gplt.plot()
b) plt.plot()
c) plt.plotgraph()
d) none of the mentioned

Answer: b
Explanation: If the index consists of dates, it calls gcf().autofmt_xdate() to try to format the x-axis nicely.

2. Point out the correct combination with regards to kind keyword for graph plotting.
a) ‘hist’ for histogram
b) ‘box’ for boxplot
c) ‘area’ for area plots
d) all of the mentioned
Answer: d
Explanation: The kind keyword argument of plot() accepts a handful of values for plots other than the default Line plot.

3. Which of the following values is provided by the kind keyword for a barplot?

a) bar
b) kde
c) hexbin
d) none of the mentioned

Answer: a
Explanation: bar can also be used for barplot.

4. You can create a scatter plot matrix using the __________ method in pandas.tools.plotting.
a) sca_matrix
b) scatter_matrix
c) DataFrame.plot
d) all of the mentioned

Answer: b
Explanation: You can create density plots using the Series/DataFrame.plot.

5. Point out the wrong combination with regards to kind keyword for graph plotting.
a) ‘scatter’ for scatter plots
b) ‘kde’ for hexagonal bin plots
c) ‘pie’ for pie plots
d) none of the mentioned

Answer: b
Explanation: kde is used for density plots.

6. Which of the following plots are used to check if a data set or time series is random?
a) Lag
b) Random
c) Lead
d) None of the mentioned

Answer: a
Explanation: Random data should not exhibit any structure in the lag plot.

7. Plots may also be adorned with error bars or tables.

a) True
b) False

Answer: a
Explanation: There are several plotting functions in pandas.tools.plotting.

8. Which of the following plots are often used for checking randomness in time series?
a) Autocausation
b) Autorank
c) Autocorrelation
d) None of the mentioned

Answer: c
Explanation: If the time series is random, such autocorrelations should be near zero for any and all time-lag separations.

9. __________ plots are used to visually assess the uncertainty of a statistic.

a) Lag
b) RadViz
c) Bootstrap
d) None of the mentioned

Answer: c
Explanation: Resulting plots and histograms are what constitutes the bootstrap plot.

10. Andrews curves allow one to plot multivariate data.

a) True
b) False
View Answer

Answer: a
Explanation: Curves belonging to samples of the same class will usually be closer together and form larger structures.

1. Which of the following is used to compute the percent change over a given number of periods?
a) pct_change
b) percent_change
c) per_change
d) none of the mentioned
View Answer

Answer: a
Explanation: Series, DataFrame, and Panel all have a method pct_change.
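
A short sketch of pct_change on a Series; the price values are made up for illustration.

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0, 118.8])

# Change relative to the previous element (periods=1 is the default).
one_step = prices.pct_change()

# Change relative to the element two positions earlier.
two_step = prices.pct_change(periods=2)

print(one_step)
print(two_step)
```

The first element of one_step (and the first two of two_step) is NaN because there is no earlier value to compare against.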

2. Point out the correct statement.

a) Pandas represents timestamps in microsecond resolution
b) Pandas is 100% thread safe
c) For Series and DataFrame objects, var normalizes by N-1 to produce unbiased estimates
d) All of the mentioned
View Answer

Answer: c
Explanation: Pandas represents timestamps in nanosecond resolution.

3. Which of the following object has a method cov to compute covariance between series?
a) Series
b) DataFrame
c) Panel
d) None of the mentioned
View Answer

Answer: a
Explanation: DataFrame has a method cov to compute pairwise covariances among the series in the DataFrame, also excluding
NA/null values.

4. Which of the following specifies the required minimum number of observations for each column pair in order to have a valid result?
a) min_periods
b) max_periods
c) minimum_periods
d) all of the mentioned
View Answer

Answer: a
Explanation: DataFrame.cov also supports an optional min_periods.
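
As an illustration of min_periods, the sketch below uses a small made-up frame in which one column pair shares only two complete observations.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, np.nan, 8.0, 10.0],
    "c": [1.0, np.nan, np.nan, np.nan, 2.0],
})

# Default: every column pair with enough complete rows gets a value.
cov_all = df.cov()

# Require at least three complete observations per column pair;
# pairs involving "c" have only two, so they become NaN.
cov_min3 = df.cov(min_periods=3)

print(cov_all)
print(cov_min3)
```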

5. Point out the wrong statement.

a) lxml is very fast
b) lxml requires Cython to install correctly
c) lxml does not make any guarantees about the results of its parse
d) none of the mentioned
View Answer
Answer: c
Explanation: There are some versioning issues surrounding the libraries that are used to parse HTML tables in the top-level pandas io
function read_html.

6. Which of the following is implemented on DataFrame to compute the correlation between like-labeled Series contained in different
DataFrame objects?
a) corrwith
b) corwith
c) corwit
d) none of the mentioned
View Answer

Answer: a
Explanation: corrwith computes the correlation between like-labeled Series contained in different DataFrame objects.

7. rolling_count function gives the number of non-null observations.

a) True
b) False
View Answer

Answer: a
Explanation: rolling_count returns the number of non-null observations in each window; in current pandas the equivalent is rolling(window).count().

8. Which of the following method produces a data ranking with ties being assigned the mean of the ranks for the group?
a) rank
b) dense_rank
c) partition_rank
d) none of the mentioned
View Answer

Answer: a
Explanation: rank is also a DataFrame method.
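
A brief sketch of how ties are ranked; the values are made up, and the "dense" method is shown only for contrast.

```python
import pandas as pd

scores = pd.Series([10, 20, 20, 30])

# Default method="average": tied values share the mean of their ranks.
average_rank = scores.rank()

# method="dense": tied values share a rank and no ranks are skipped.
dense_rank = scores.rank(method="dense")

print(average_rank.tolist())
print(dense_rank.tolist())
```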

9. Which of the following can potentially change the dtype of a series?

a) reindex_like
b) index_like
c) itime_like
d) none of the mentioned
View Answer

Answer: a
Explanation: reindex_like silently inserts NaNs and the dtype changes accordingly.
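
The dtype change can be seen in a small sketch; the labels and the template Series are illustrative assumptions.

```python
import pandas as pd

ints = pd.Series([1, 2, 3], index=["a", "b", "c"])
template = pd.Series([0.0, 0.0, 0.0, 0.0], index=["a", "b", "c", "d"])

# Label "d" has no value in ints, so NaN is inserted and the
# integer Series is upcast to a floating-point dtype.
aligned = ints.reindex_like(template)

print(ints.dtype, "->", aligned.dtype)
print(aligned)
```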

10. cov and corr support the optional min_periods keyword.

a) True
b) False
View Answer

Answer: a
Explanation: Non-numeric columns will be automatically excluded from the correlation calculation.

1. Which of the following can be data in a pandas Series?

a) a python dict
b) an ndarray
c) a scalar value
d) all of the mentioned
View Answer

Answer: d
Explanation: The passed index is a list of axis labels.

2. Point out the correct statement.

a) If data is a list, if index is passed the values in data corresponding to the labels in the index will be pulled out
b) NaN is the standard missing data marker used in pandas
c) Series acts very similarly to an array
d) None of the mentioned
View Answer

Answer: b
Explanation: If data is a dict, if index is passed the values in data corresponding to the labels in the index will be pulled out.

3. The result of an operation between unaligned Series will have the ________ of the indexes involved.
a) intersection
b) union
c) total
d) all of the mentioned
View Answer

Answer: b
Explanation: If a label is not found in one Series or the other, the result will be marked as missing NaN.
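
A minimal sketch of this alignment behavior, using two made-up Series whose indexes only partly overlap.

```python
import pandas as pd

left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])

# The result carries the union of both indexes; labels present
# in only one operand ("a" and "d") come out as NaN.
total = left + right

print(total)
```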

4. Which of the following input can be accepted by DataFrame?

a) Structured ndarray
b) Series
c) DataFrame
d) All of the mentioned
View Answer

Answer: d
Explanation: DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

5. Point out the wrong statement.

a) A DataFrame is like a fixed-size dict in that you can get and set values by index label
b) Series can be passed into most NumPy methods expecting an ndarray
c) A key difference between Series and ndarray is that operations between Series automatically align the data based on label
d) None of the mentioned
View Answer

Answer: a
Explanation: A Series is like a fixed-size dict in that you can get and set values by index label.

6. Which of the following takes a dict of dicts or a dict of array-like sequences and returns a DataFrame?
a) DataFrame.from_items
b) DataFrame.from_records
c) DataFrame.from_dict
d) All of the mentioned
View Answer

Answer: c
Explanation: DataFrame.from_dict operates like the DataFrame constructor except for the orient parameter which is ‘columns’ by
default.
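
The effect of the orient parameter can be sketched as follows; the dict contents are made up for illustration.

```python
import pandas as pd

data = {"x": [1, 2, 3], "y": [4, 5, 6]}

# orient="columns" (the default): dict keys become column labels.
by_columns = pd.DataFrame.from_dict(data)

# orient="index": dict keys become row labels instead.
by_rows = pd.DataFrame.from_dict(data, orient="index")

print(by_columns)
print(by_rows)
```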

7. Series is a one-dimensional labeled array capable of holding any data type.

a) True
b) False
View Answer

Answer: a
Explanation: The axis labels are collectively referred to as the index.

8. Which of the following works analogously to the form of the dict constructor?
a) DataFrame.from_items
b) DataFrame.from_records
c) DataFrame.from_dict
d) All of the mentioned
View Answer
Answer: a
Explanation: DataFrame.from_items takes a sequence of (key, value) pairs, like that form of the dict constructor; it has since been removed from pandas in favor of DataFrame.from_dict.

9. Which of the following operation works with the same syntax as the analogous dict operations?
a) Getting columns
b) Setting columns
c) Deleting columns
d) All of the mentioned
View Answer

Answer: d
Explanation: You can treat a DataFrame semantically like a dict of like-indexed Series objects.
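
A short sketch of the dict-like column operations; the column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3], "two": [4, 5, 6]})

# Getting a column works like a dict lookup and returns a Series.
first = df["one"]

# Setting a column works like dict assignment.
df["three"] = df["one"] * df["two"]

# Deleting works with del, and pop removes a column and returns it.
del df["two"]
popped = df.pop("three")

print(list(df.columns))
print(popped.tolist())
```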

10. If data is an ndarray, index must be the same length as data.

a) True
b) False
View Answer

Answer: a
Explanation: If no index is passed, one will be created having values [0, …, len(data) – 1].
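
Both rules can be seen in a small sketch; the values and labels are made up.

```python
import numpy as np
import pandas as pd

values = np.array([1.5, 2.5, 3.5])

# No index passed: labels 0 .. len(data) - 1 are created.
default = pd.Series(values)

# An explicit index must match the length of the data.
labeled = pd.Series(values, index=["a", "b", "c"])

# A shorter index is rejected with a ValueError.
try:
    pd.Series(values, index=["a", "b"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(list(default.index), list(labeled.index), mismatch_raised)
```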

1. All pandas data structures are ___ mutable but not always _______mutable.
a) size, value
b) semantic, size
c) value, size
d) none of the mentioned
View Answer

Answer: c
Explanation: The length of a Series cannot be changed.

2. Point out the correct statement.

a) Pandas consists of a set of labeled array data structures
b) Pandas consists of an integrated group by engine for aggregating and transforming data sets
c) Pandas consists of moving window statistics
d) All of the mentioned
View Answer

Answer: d
Explanation: pandas provides labeled array data structures, an integrated group by engine, and moving window statistics.

3. Which of the following statement will import pandas?

a) import pandas as pd
b) import panda as py
c) import pandaspy as pd
d) all of the mentioned
View Answer

Answer: a
Explanation: The module is named pandas and is conventionally imported under the alias pd.

4. Which of the following objects do you get after reading a CSV file?
a) DataFrame
b) Character Vector
c) Panel
d) All of the mentioned
View Answer

Answer: a
Explanation: The read_csv function returns a DataFrame.

5. Point out the wrong statement.

a) Series is 1D labeled homogeneously-typed array
b) DataFrame is general 2D labeled, size-mutable tabular structure with potentially heterogeneously-typed columns
c) Panel is generally 2D labeled, also size-mutable array
d) None of the mentioned
View Answer

Answer: c
Explanation: Panel is generally 3D labeled.

6. Which of the following library is similar to Pandas?

a) NumPy
b) RPy
c) OutPy
d) None of the mentioned
View Answer

Answer: a
Explanation: NumPy is the fundamental package for scientific computing with Python.

7. Panel is a container for Series, and DataFrame is a container for DataFrame objects.
a) True
b) False
View Answer

Answer: b
Explanation: DataFrame is a container for Series, and Panel is a container for DataFrame objects.

8. Which of the following is a prominent Python “statistics and econometrics library”?

a) Bokeh
b) Seaborn
c) Statsmodels
d) None of the mentioned
View Answer

Answer: c
Explanation: Statsmodels is the prominent Python statistics and econometrics library; Bokeh and Seaborn are visualization libraries.

9. Which of the following is a foundational exploratory visualization package for the R language in pandas ecosystem?
a) yhat
b) Seaborn
c) Vincent
d) None of the mentioned
View Answer

Answer: a
Explanation: It has great support for pandas data objects.

10. Pandas consists of static and moving window linear and panel regression.
a) True
b) False
View Answer

Answer: a
Explanation: Time series and cross-sectional data are special cases of panel data.

1. Quandl API for Python wraps the ________ REST API to return Pandas DataFrames with time series indexes.
a) Quandl
b) PyDatastream
c) PyData
d) None of the mentioned
View Answer

Answer: a
Explanation: PyDatastream is a Python interface to the Thomson Dataworks Enterprise (DWE/Datastream) SOAP API to return
indexed pandas dataFrames or panels with financial data.

2. Point out the correct statement.

a) Statsmodels provides powerful statistics, econometrics, analysis and modeling functionality that is out of pandas’ scope
b) Vintage leverages pandas objects as the underlying data container for computation
c) Bokeh is a Python interactive visualization library for small datasets
d) All of the mentioned
View Answer

Answer: a
Explanation: Bokeh’s goal is to provide elegant, concise construction of novel graphics in the style of D3.

3. Which of the following library is used to retrieve and acquire statistical data and metadata disseminated in SDMX 2.1?
a) pandaSDMX
b) fredapi
c) geopandas
d) all of the mentioned
View Answer

Answer: a
Explanation: Geopandas extends pandas data objects to include geographic information which supports geometric operations.

4. Which of the following provides a standard API for doing computations with MongoDB?
a) Blaze
b) Geopandas
c) FRED
d) All of the mentioned
View Answer

Answer: a
Explanation: If your work entails maps and geographical coordinates, and you love pandas, you should take a close look at
Geopandas.

5. Point out the wrong statement.

a) qgrid is an interactive grid for sorting and filtering DataFrames
b) Pandas DataFrames implement _repr_html_ methods which are utilized by IPython Notebook
c) Spyder is a cross-platform Qt-based open-source R IDE
d) None of the mentioned
View Answer

Answer: c
Explanation: Spyder is a cross-platform Qt-based open-source Python IDE.

6. Which of the following makes use of pandas and returns data in a series or dataFrame?
a) pandaSDMX
b) fredapi
c) OutPy
d) none of the mentioned
View Answer

Answer: b
Explanation: The fredapi module requires a FRED API key that you can obtain for free on the FRED website.

7. Spyder can introspect and display Pandas DataFrames.

a) True
b) False
View Answer

Answer: a
Explanation: Spyder can introspect and display pandas DataFrames, showing both column-wise min/max and global min/max coloring.

8. Which of the following is used for machine learning in python?

a) scikit-learn
b) seaborn-learn
c) stats-learn
d) none of the mentioned
View Answer
Answer: a
Explanation: scikit-learn is built on NumPy, SciPy, and matplotlib.

9. The ________ project builds on top of pandas and matplotlib to provide easy plotting of data.
a) yhat
b) Seaborn
c) Vincent
d) None of the mentioned
View Answer

Answer: b
Explanation: Seaborn has great support for pandas data objects.

10. xray (since renamed xarray) brings the labeled data power of pandas to the physical sciences.
a) True
b) False
View Answer

Answer: a
Explanation: It aims to provide a pandas-like and pandas-compatible toolkit for analytics on multi-dimensional arrays.

1. Which of the following is the base layer for all of the sparse indexed data structures?
a) SArray
b) SparseArray
c) PyArray
d) None of the mentioned
View Answer

Answer: b
Explanation: SparseArray is a 1-dimensional ndarray-like object storing only values distinct from the fill_value.
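
The quiz refers to the older sparse classes (SparseSeries, SparseDataFrame, SparsePanel). As a hedged sketch of the same idea in current pandas, the example below assumes the pd.arrays.SparseArray class; the data values are made up.

```python
import numpy as np
import pandas as pd

dense = np.array([0.0, 0.0, 1.5, 0.0, 2.5, 0.0])

# Only values that differ from fill_value are physically stored.
sparse = pd.arrays.SparseArray(dense, fill_value=0.0)

stored = sparse.sp_values        # the values actually kept in memory
round_trip = np.asarray(sparse)  # convert back to a dense ndarray

print(stored)
print(round_trip)
```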

2. Point out the correct statement.

a) All of the standard pandas data structures have a to_sparse method
b) Any sparse object can be converted back to the standard dense form by calling to_dense
c) The sparse objects exist for memory efficiency reasons
d) All of the mentioned
View Answer

Answer: d
Explanation: The to_sparse method takes a kind argument and a fill_value.

3. Which of the following is not an indexed object?

a) SparseSeries
b) SparseDataFrame
c) SparsePanel
d) None of the mentioned
View Answer

Answer: d
Explanation: SparseArray can be converted back to a regular ndarray by calling to_dense.

4. Which of the following list-like data structure is used for managing a dynamic collection of SparseArrays?
a) SparseList
b) GeoList
c) SparseSeries
d) All of the mentioned
View Answer

Answer: a
Explanation: To create one, simply call the SparseList constructor with a fill_value.

5. Point out the wrong statement.

a) The SparseList append method can accept scalar values or any 2-dimensional sequence
b) Two kinds of SparseIndex are implemented
c) The integer format keeps an array of all of the locations where the data are not equal to the fill value
d) None of the mentioned
View Answer

Answer: a
Explanation: The SparseList append method can accept scalar values or any 1-dimensional sequence.

6. Which of the following method is used for transforming a SparseSeries indexed by a MultiIndex to a scipy.sparse.coo_matrix?
a) SparseSeries.to_coo()
b) Series.to_coo()
c) SparseSeries.to_cooser()
d) None of the mentioned
View Answer

Answer: a
Explanation: Experimental api to transform between sparse pandas and scipy.sparse structures.

7. The integer format tracks only the locations and sizes of blocks of data.
a) True
b) False
View Answer

Answer: b
Explanation: The block format tracks only the locations and sizes of blocks of data.

8. Which of the following is used for testing for membership in the list of column names?
a) in
b) out
c) elseif
d) none of the mentioned
View Answer

Answer: a
Explanation: For DataFrames, likewise, in applies to the column axis.
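
A small sketch of what the in operator actually tests; the labels and values are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2], index=["a", "b"])
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# For a Series, `in` tests the index labels, not the values.
label_in_series = "a" in s   # "a" is an index label
value_in_series = 1 in s     # 1 is a value, not a label

# For a DataFrame, `in` tests the column names.
column_in_frame = "x" in df  # "x" is a column name
row_in_frame = 0 in df       # 0 is a row label, not a column

print(label_in_series, value_in_series, column_in_frame, row_in_frame)
```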

9. Which of the following indexing capabilities is used as a concise means of selecting data from a pandas object?
a) In
b) ix
c) ipy
d) none of the mentioned
View Answer

Answer: b
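Explanation: ix was a concise indexer that accepted both labels and positions; it was deprecated in pandas 0.20 and later removed in favor of loc and iloc.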

10. Pandas follows the NumPy convention of raising an error when you try to convert something to a bool.
a) True
b) False
View Answer

Answer: a
Explanation: This happens in an if or when using the boolean operations, and, or, or not.
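
A minimal sketch of the error and of the explicit reductions that avoid it; the Series contents are made up.

```python
import pandas as pd

flags = pd.Series([True, False])

# Using a multi-element Series directly in a boolean context is ambiguous.
try:
    if flags:
        pass
    raised = False
except ValueError:
    raised = True

# State the intent explicitly instead.
any_true = flags.any()
all_true = flags.all()

print(raised, any_true, all_true)
```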

1. Which of the following block information is odd man out?

a) Subsetting
b) Raw data
c) Ready for analysis
d) None of the mentioned
View Answer

Answer: b
Explanation: Characteristics mentioned in the diagram are traits of processed data.

2. Point out the correct statement.

a) Data has only qualitative value
b) Data has only quantitative value
c) Data has both qualitative and quantitative value
d) None of the mentioned
View Answer

Answer: c
Explanation: Data are values of qualitative or quantitative variables belonging to a set of items.

3. Data that summarize all observations in a category are called __________ data.
a) frequency
b) summarized
c) raw
d) none of the mentioned
View Answer

Answer: b
Explanation: The summary could be the sum of the observations, the number of occurrences, their mean value, and so on.

4. Which of the following is an example of raw data?

a) original swath files generated from a sonar system
b) initial time-series file of temperature values
c) a real-time GPS-encoded navigation file
d) all of the mentioned
View Answer

Answer: d
Explanation: Raw data refers to data that have not been changed since acquisition.

5. Point out the correct statement.

a) Primary data is original source of data
b) Secondary data is original source of data
c) Questions are obtained after data processing steps
d) None of the Mentioned
View Answer

Answer: a
Explanation: Primary data is also referred to as raw data.

6. Which of the following data is put into a formula to produce commonly accepted results?
a) Raw
b) Processed
c) Synchronized
d) All of the Mentioned
View Answer

Answer: b
Explanation: Raw data came from direct measurements.

7. Processing data includes subsetting, formatting and merging only.

a) True
b) False
View Answer

Answer: b
Explanation: There are many other techniques applied to raw data.

8. Which of the following is another name for raw data?

a) destination data
b) eggy data
c) secondary
d) machine learning
View Answer

Answer: b
Explanation: Although raw data has the potential to become “information,” extraction, organization, and sometimes analysis and
formatting for presentation are required for that to occur.

9. Which type of data is generated by POS terminal in a busy supermarket each day?
a) Source
b) Processed
c) Synchronized
d) All of the mentioned
View Answer

Answer: a
Explanation: Raw data is sometimes referred to as source data.

10. Following figure represents correct sequence of steps in performing data analysis.

a) True
b) False
View Answer

Answer: a
Explanation: Data analysis is not a goal in itself; the goal is to enable the business to make better decisions.

1. Which of the following is an example of raw data?
a) complicated JSON from facebook API
b) complicated JSON from Twitter API
c) unformatted excel file
d) all of the mentioned
View Answer

Answer: d
Explanation: All of these are raw data; tidy data is obtained only after running a processing script.

2. Point out the correct statement.

a) Nearly 80% of data analysis is spent on wrangling data
b) Nearly 20% of data analysis is spent on data dredging
c) Nearly 80% of data analysis is spent on cleaning and preparing data
d) None of the mentioned
View Answer

Answer: c
Explanation: Data cleansing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set,
table, or database.

3. Which of the following is a trait of tidy data?

a) each variable in one column
b) each observation in different row
c) one table for each kind of variable
d) none of the mentioned
View Answer

Answer: b
Explanation: In tidy data, each observation forms its own row and each variable forms its own column.

4. Which of the following package is used for tidy data?

a) tidyr
b) souryr
c) NumPy
d) all of the mentioned
View Answer

Answer: a
Explanation: tidyr is used for tidy data with spread and gather functions.

5. Point out the wrong statement.

a) Tidy datasets are all alike but every messy dataset is messy in its own way
b) Most statistical datasets are data frames made up of rows and columns
c) Tidy datasets provide a standardized way to link the structure of a dataset with its semantics
d) None of the mentioned
View Answer

Answer: d
Explanation: The tidy data standard has been designed to simplify the development of data analysis tools that work well together.

6. Which of the following process involves structuring datasets to facilitate analysis?

a) Data tidying
b) Data mining
c) Data booting
d) All of the mentioned
View Answer

Answer: a
Explanation: The principles of tidy data provide a standard way to organize data values within a dataset.

7. Strange binary file generated from machines is an example of tidy data.

a) True
b) False
View Answer
Answer: b
Explanation: Data sets stored in spreadsheets, such as Microsoft’s Excel, are binary, not raw ASCII data files.

8. Which of the following is the most common problem with messy data?
a) Column headers are values
b) Variables are stored in both rows and columns
c) A single observational unit is stored in multiple tables
d) All of the mentioned
View Answer

Answer: d
Explanation: Real datasets can, and often do, violate the three precepts of tidy data in almost every way imaginable.

9. tidyr is a reframing of _______ designed to accompany the tidy data framework.

a) reshape5
b) dplyr
c) reshape2
d) all of the mentioned
View Answer

Answer: c
Explanation: tidyr does less reframing than reshape2.

10. Raw data in the real-world is tidy and properly formatted.

a) True
b) False
View Answer

Answer: b
Explanation: Real-world raw data is usually messy and must be cleaned and formatted before analysis.

1. Which of the following function is used for loading flat files?

a) read.data
b) read.sheet
c) read.table
d) none of the mentioned
View Answer

Answer: c
Explanation: This reads data in to the RAM.

2. Point out the correct statement.

a) XLConnect package has more options for manipulating access files
b) XLConnect vignette package can also be used for manipulating excel files
c) write.xlsx writes out an excel file with similar arguments
d) None of the mentioned
View Answer

Answer: c
Explanation: write.xlsx writes out an excel file with similar arguments.

3. Which of the following is an important parameter of read.table function?

a) file
b) header
c) sep
d) all of the mentioned
View Answer

Answer: d
Explanation: More parameters are required for loading the data.

4. Which of the following will set the character that represents a missing value?
a) na.quote
b) na.strings
c) nrows
d) all of the mentioned
View Answer

Answer: b
Explanation: na.strings takes a character vector.

5. Point out the wrong statement.

a) data.table inherits from data.frame
b) data.table is written in Java
c) data.table is faster at subsetting and updating data
d) none of the mentioned
View Answer

Answer: b
Explanation: data.table is written in C.

6. Which of the following package is used for reading excel data?

a) xlsx
b) xlsc
c) read.sheet
d) all of the mentioned
View Answer

Answer: a
Explanation: The read.xlsx and read.xlsx2 functions are part of the xlsx package.

7. Which of the following can be used to view all the tables in memory?
a) tables
b) alltable
c) table
d) none of the mentioned
View Answer

Answer: a
Explanation: The tables function from the data.table package lists all data.tables currently in memory.

8. Which of the following functions programmatically extracts parts of an XML file?

a) xmlSApply
b) XmlApply
c) XmlSApplyData
d) All of the mentioned
View Answer

Answer: a
Explanation: xmlApply and xmlSApply are simple wrappers for the lapply and sapply functions.

9. Which of the following package is used for reading JSON data?

a) jsonlite
b) json
c) jsondata
d) all of the mentioned
View Answer

Answer: a
Explanation: The jsonlite package is a JSON generator optimized for the web.

10. Extracting XML is the basis for most web scraping.

a) True
b) False
View Answer
Answer: a
Explanation: XML is particularly used in web applications.

1. Which of the following package is used to connect MySQL RDBMS with R?

a) RMySQL vignette
b) MySQL vignette
c) RSQL vignette
d) None of the mentioned
View Answer

Answer: a
Explanation: This package contains meta information and index.

2. Point out the correct statement.

a) HDF5 is a hierarchical format
b) HDF5 does not support range of different data types
c) HDF5 is used for storing small datasets
d) None of the mentioned
View Answer

Answer: a
Explanation: HDF5 is used for storing large datasets.

3. Which of the following is used to extract data from HTML code of websites?
a) Webscraping
b) Webdredging
c) Webcleaning
d) All of the mentioned
View Answer

Answer: a
Explanation: Webscraping is a great way to get data.

4. Which of the following function is used to read data off the webpages?
a) read.web
b) readLines
c) read.Line
d) all of the mentioned
View Answer

Answer: b
Explanation: The readLines function will extract the web page data.

5. Point out the wrong statement.

a) hdf5 can be used for reading/writing from disc in Python
b) rhdf5 is an interface for hdf5 format
c) maximum size of an HDF5 dataset is fixed when it is created
d) all of the mentioned
View Answer

Answer: a
Explanation: rhdf5 is an R interface, so here hdf5 is used for reading/writing from disc in R.

6. Which of the following package is used for reading HTML and XML data?
a) httr
b) http
c) httx
d) all of the mentioned
View Answer

Answer: a
Explanation: httr contains tools for working with URLs and HTTP.

7. httr package does not work well with facebook and twitter API.
a) True
b) False
View Answer

Answer: b
Explanation: Most modern APIs use something like oauth.

8. Which of the following request can be issued from httr package?

a) GET
b) PUT
c) DELETE
d) All of the mentioned
View Answer

Answer: d
Explanation: Authentication is necessary for issuing a request.

9. Which of the following functions loads data from SPSS?

a) read.spss(SPSS)
b) read.oct(SPSS)
c) read.xpot(SPSS)
d) all of the mentioned
View Answer

Answer: a
Explanation: read.spss, provided by the foreign package, reads an SPSS data file into R.

10. Which of the following package is used for reading GIS data?
a) rgdal
b) rgeos
c) raster
d) all of the mentioned
View Answer

Answer: d
Explanation: A geographic information system is a system designed to capture, store, manipulate, analyze, manage, and present all
types of spatial or geographical data.

1. Which of the following function gives information about top level data?
a) head
b) tail
c) summary
d) none of the mentioned
View Answer

Answer: a
Explanation: The function head is very useful for working with lists, tables, data frames and even functions.

2. Point out the correct statement.

a) head function work on string
b) tail function work on string
c) head function work on string but tail function do not
d) none of the mentioned
View Answer

Answer: d
Explanation: Both head and tail function do not work on strings.

3. Which of the following function is used for quantiles of quantitative values?

a) quantile
b) quantity
c) quantiles
d) all of the mentioned
View Answer
Answer: a
Explanation: In probability and statistics, the quantile function specifies, for a given probability in the probability distribution of a
random variable, the value at which the probability of the random variable will be less than or equal to that probability.

4. Which of the following function is used for determining missing values?

a) any
b) all
c) is
d) all of the mentioned
View Answer

Answer: d
Explanation: In R, missing values are represented by the symbol NA.

5. Point out the wrong statement.

a) Common variables are used to create missingness vector
b) Common variables are used to cutting up quantitative variables
c) Common variables are not used to apply transforms
d) All of the mentioned
View Answer

Answer: c
Explanation: Common variables are not used to apply transforms.

6. Which of the following transforms can be performed with data value?

a) log2
b) cos
c) log10
d) all of the mentioned
View Answer

Answer: d
Explanation: Many common transforms can be applied to the data with R.

7. Each observation forms a column in tidy data.

a) True
b) False
View Answer

Answer: b
Explanation: Each variable forms a column in tidy data.

8. Which of the following function is used for casting data frames?

a) dcast
b) ucast
c) rcast
d) all of the mentioned
View Answer

Answer: a
Explanation: Use acast or dcast depending on whether you want vector/matrix/array output or data frame output.

9. Which of the following join is by default used in plyr package?

a) left
b) right
c) full
d) all of the mentioned
View Answer

Answer: a
Explanation: join in the plyr package is faster than merge but less full-featured; it defaults to a left join.

10. mutate function is used for casting as multi dimensional arrays.
a) True
b) False
View Answer

Answer: b
Explanation: mutate is used for adding new variables.

1. Which of the following function is good for the automatic splitting of names?
a) split
b) strsplit
c) autsplit
d) none of the mentioned
View Answer

Answer: b
Explanation: strsplit split a character string or vector of character strings using a regular expression or a literal string.

2. Point out the correct statement.

a) gsub is used for fixing character vectors
b) sub is used for finding values like grep
c) grep is used for fixing character vectors
d) none of the mentioned
View Answer

Answer: a
Explanation: sub and gsub are used for fixing character vectors.

3. Which of the following function is used for fixing character vectors?

a) tolower
b) toUPPER
c) toLOWER
d) all of the mentioned
View Answer

Answer: a
Explanation: It translates character to lowercase.

4. Which of the following metacharacter is used to refer to any character?

a) %
b) @
c) .
d) All of the mentioned
View Answer

Answer: c
Explanation: In a regular expression, the dot metacharacter matches any single character.

5. Point out the wrong statement.

a) Variables with character values should be made less descriptive
b) Variables with character values should usually be made into factor variable
c) Common variables are used to apply transforms
d) All of the mentioned
View Answer

Answer: a
Explanation: Variables with character values should be made more descriptive.

6. Which of the following is used for specifying character class with metacharacter?
a) []
b) {}
c) /+
d) All of the mentioned
View Answer
Answer: a
Explanation: You can list a set of characters to accept at a given point in the match.

7. Regular expressions can be thought of as a combination of literals and metacharacters.

a) True
b) False
View Answer

Answer: a
Explanation: Regular expressions have rich set of metacharacters.

8. Which of the following signs are used to indicate repetition?

a) #
b) *
c) –
d) All of the mentioned
View Answer

Answer: b
Explanation: * and + are metacharacters for repetition of data.

9. Which of the following function is used for searching text strings by means of regular expression?
a) grepd
b) grepl
c) gepexpr
d) all of the mentioned
View Answer

Answer: b
Explanation: grep, grepl, regexpr, gregexpr and regexec search for matches to argument pattern within each element of a character
vector.

10. merge function is used for merging data frames.

a) True
b) False
View Answer

Answer: a
Explanation: To merge two data frames horizontally, use the merge function.

1. Which of the following graphic device information is odd man out in the below figure?

a) quartz
b) windows
c) unix
d) x11
View Answer

Answer: c
Explanation: unix keyword does not exist with regards to graphics device.

2. Point out the correct statement.

a) On Mac, the screen device is launched with quartz
b) On Windows, the screen device is launched with wind
c) On Unix, the screen device is launched with x12
d) All of the mentioned
View Answer

Answer: a
Explanation: On Windows, the screen device is launched with the windows function.

3. Which of the following is an example of graphics device?

a) PDF
b) SVG
c) JPEG
d) All of the mentioned
View Answer

Answer: d
Explanation: When the plot() function is invoked, R sends the data corresponding to the plot over, and the graphics device generates
the plot.

4. Which of the following file format is graphic device only for windows?
a) pdf
b) svg
c) win.metafile
d) all of the mentioned
View Answer

Answer: c
Explanation: Exporting graphics to a Windows MetaFile can be achieved via the win.metafile.

5. Point out the wrong statement.

a) For quick visualizations and exploratory analysis, usually you want to use the screen device
b) Functions like xyplot in lattice will not default to sending a plot to the screen device
c) Not all graphics devices are available on all platforms
d) None of the mentioned
View Answer

Answer: b
Explanation: Functions like plot in base, xyplot in lattice, and qplot in ggplot2 default to sending a plot to the screen device.

6. Which of the following systems most often does not have a postscript viewer?
a) Windows
b) Linux
c) Mac
d) All of the mentioned
View Answer

Answer: a
Explanation: postscript is older format but it resizes well.

7. There are mainly three types of file devices.

a) True
b) False
View Answer

Answer: b
Explanation: There are two basic types of file devices: vector and bitmap.

8. Which of the following is a bitmap file type?

a) tiff
b) svg
c) pdf
d) none of the mentioned
View Answer
Answer: a
Explanation: TIFF is a computer file format for storing raster graphics images.

9. Which of the following function displays currently active graphics device?

a) dev.present
b) dev.cur
c) pre.cur
d) all of the mentioned
View Answer

Answer: b
Explanation: You can change the active graphics device with dev.set.

10. The most familiar place for a plot to be “sent” is screen device.
a) True
b) False
View Answer

Answer: a
Explanation: On Linux, the screen device is launched with x11 function.

1. Which of the following function has parameters shown in the below figure?

a) par
b) bar
c) base
d) all of the mentioned
View Answer

Answer: a
Explanation: R makes it easy to combine multiple plots into one overall graph, using either the par( ) or layout( ) function.

2. Point out the correct statement.

a) Vector formats are good for line drawings and plots with solid colors using a modest number of points
b) Vector formats are good for plots with a large number of points, natural scenes or web based plots
c) The default graphics device is always the screen device
d) All of the mentioned
View Answer

Answer: a
Explanation: Bitmap formats are good for plots with a large number of points, natural scenes or web based plots.

3. Which of the following will copy the plot from one device to another?
a) dev.copy
b) dev.copypdf
c) dev.device
d) all of the mentioned
View Answer

Answer: a
Explanation: Copying a plot to another device can be useful because some plots require a lot of code and it can be a pain to type all
that in again for a different device.

4. Which of the following is used to change the active graphics device?
a) dev.set
b) dev.int
c) dev.win
d) all of the mentioned
View Answer

Answer: a
Explanation: You can change the active graphics device with dev.set(<integer>) where <integer> is the number associated with the
graphics device you want to switch to.

5. Point out the wrong statement.

a) File devices are useful for creating plots that can be included in other documents or sent to other people
b) Plots must be created on a graphics device
c) For file devices, there are vector and bitmap formats
d) None of the mentioned
View Answer

Answer: d
Explanation: For file devices, there are vector and bitmap formats.

6. Which of the following is the second goal of PCA?

a) data compression
b) statistical analysis
c) data dredging
d) all of the mentioned
View Answer

Answer: a
Explanation: The principal components are equal to the right singular vectors if you first scale the variables.

7. dev.copy2pdf specifically copies a plot to a PDF file.

a) True
b) False
View Answer

Answer: a
Explanation: Copying a plot is not an exact operation, so the result may not be identical to the original.

8. Which of the following is a vector file device?

a) png
b) svg
c) bmp
d) none of the mentioned
View Answer

Answer: b
Explanation: svg stands for scalable vector graphics.

9. Which of the following is an alternative technique to principal component analysis?

a) Factor analysis
b) Independent components analysis
c) Latent semantic analysis
d) All of the mentioned
View Answer

Answer: d
Explanation: PC’s may mix real patterns.

10. Every open graphics device is assigned an integer greater than 2.

a) True
b) False
View Answer

Answer: b
Explanation: Every open graphics device is assigned an integer greater than or equal to 2.

1. Which of the following block information is odd man out in the below figure?

a) Scatterplots
b) 5 number summary
c) 2D Graph
d) None of the mentioned
View Answer

Answer: b
Explanation: 5 number summary is one dimensional graph.

2. Which type of graph is shown in the following figure?

a) Scatterplot
b) Barplot
c) Overlaying
d) None of the mentioned
View Answer

Answer: b
Explanation: A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle.

3. Which of the following annotation function is used to add or modify text?

a) word
b) graph
c) lines
d) all of the mentioned
View Answer

Answer: d
Explanation: points and axis are other well known annotation function.

4. Which of the following packages is used to implement the lattice plotting system?

a) grDevices
b) grid
c) graphics
d) all of the mentioned
View Answer

Answer: b
Explanation: The lattice plotting system is built on top of the grid package.

5. Point out the wrong statement.

a) Plots are created with multiple functions only
b) Plots are created with both single and multiple function calls
c) Annotation in plot is not especially intuitive
d) None of the mentioned
View Answer

Answer: a
Explanation: Plots can also be created with a single function call.

6. Which of the following parameter defines line type such as dashed and dotted?
a) lty
b) pch
c) lwd
d) all of the mentioned
View Answer

Answer: a
Explanation: lwd is used for line width.

7. The core plotting engine is encapsulated in graphics package.

a) True
b) False
View Answer

Answer: a
Explanation: graphics package contain plotting functions.

8. Which of the following argument specifies margin size with regards to par function?
a) las
b) bg
c) mar
d) all of the mentioned
View Answer

Answer: c
Explanation: par function is used to specify global parameters.

9. How many stages commonly occur in the creation of a plot?

a) 2
b) 5
c) 8
d) All of the mentioned
View Answer

Answer: a
Explanation: In the base plotting system, plotting commonly occurs in two stages: creating the plot and then annotating it.

10. Base graphics are used most commonly for creating 2D graphics.
a) True
b) False
View Answer

Answer: a
Explanation: Base graphics is a very powerful system for creating 2D graphics.

1. Which of the following clustering type has characteristic shown in the below figure?

a) Partitional
b) Hierarchical
c) Naive bayes
d) None of the mentioned
View Answer

Answer: b
Explanation: Hierarchical clustering groups data over a variety of scales by creating a cluster tree or dendrogram.

2. Point out the correct statement.

Answer: d
Explanation: Some elements may be close to one another according to one distance and farther away according to another.

3. Which of the following is finally produced by Hierarchical Clustering?

a) final estimate of cluster centroids
b) tree showing how close things are to each other
c) assignment of each point to clusters
d) all of the mentioned
View Answer

Answer: b
Explanation: Hierarchical clustering is an agglomerative approach.

4. Which of the following is required by K-means clustering?

a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned
View Answer

Answer: d
Explanation: K-means clustering follows partitioning approach.
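
The three requirements listed in this question (a distance metric, the number of clusters, and an initial guess for the centroids) can be seen in a minimal Python sketch of the usual assign-then-update iteration. The function name kmeans, the toy points, and the starting centroids are hypothetical and chosen only for illustration; Euclidean distance is assumed as the metric.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate assignment and update steps using Euclidean distance."""
    centroids = list(centroids)  # the number of clusters is len(centroids)
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(assignment)
print(centroids)
```

A different initial guess can lead to a different final partition, which is why the starting centroids count as an input to the method.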

5. Point out the wrong statement.

a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is same as k-means
d) none of the mentioned
View Answer

Answer: c
Explanation: k-nearest neighbor has nothing to do with k-means.

6. Which of the following combination is incorrect?

a) Continuous – euclidean distance
b) Continuous – correlation similarity
c) Binary – manhattan distance
d) None of the mentioned
View Answer

Answer: d
Explanation: You should choose a distance/similarity that makes sense for your problem.

7. Hierarchical clustering should be primarily used for exploration.

a) True
b) False
View Answer

Answer: a
Explanation: Hierarchical clustering is deterministic.

8. Which of the following function is used for k-means clustering?

a) k-means
b) k-mean
c) heatmap
d) none of the mentioned
View Answer

Answer: a
Explanation: K-means requires a number of clusters.

9. Which of the following clustering requires merging approach?

a) Partitional
b) Hierarchical
c) Naive Bayes
d) None of the mentioned
View Answer

Answer: b
Explanation: Hierarchical clustering requires a defined distance as well.
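
The merging approach can be sketched in a few lines of Python. This is an illustrative single-linkage version on one-dimensional data, assuming absolute difference as the distance; the function name single_linkage and the sample values are hypothetical.

```python
def single_linkage(points):
    """Repeatedly merge the two closest clusters and record each merge."""
    clusters = [[p] for p in points]  # start with every point on its own
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

merges = single_linkage([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

The recorded merge distances are the heights at which branches would join in a dendrogram.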

10. K-means is not deterministic and it also consists of number of iterations.

a) True
b) False
View Answer

Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.

1. Which of the following graphs has properties in the below figure?

a) Exploratory
b) Inferential
c) Causal
d) None of the mentioned
View Answer

Answer: a
Explanation: Making plots of the data reveals various interesting features.

2. Which of the following dimension type graph is shown in the below figure?

a) one-dimensional
b) two-dimensional
c) three-dimensional
d) none of the mentioned
View Answer

Answer: b
Explanation: A two-dimensional graph is a set of points in two-dimensional space.

3. Which of the following gave rise to need of graphs in data analysis?

a) Data visualization
b) Communicating results
c) Decision making
d) All of the mentioned
View Answer

Answer: d
Explanation: A picture can tell better story than data.

4. Which of the following is characteristic of exploratory graph?

a) Made slowly
b) Axes are not cleaned up
c) Color is used for personal information
d) All of the mentioned
View Answer

Answer: c
Explanation: A large number of exploratory graphs are made.

5. Point out the correct statement.

a) coplots are one dimensional data graph
b) Exploratory graphs are made quickly
c) Exploratory graphs are made relatively less in number
d) All of the mentioned
View Answer

Answer: b
Explanation: Exploratory graphs are made quickly and in large numbers. coplot is used for two dimensional representation, so statement a is incorrect.

6. Which of the following graph can be used for simple summarization of data?
a) Scatterplot
b) Overlaying
c) Barplot
d) All of the mentioned
View Answer

Answer: c
Explanation: A bar chart or bar graph is a chart that presents Grouped data with rectangular bars with lengths proportional to the
values that they represent.

7. Color and shape are used to add dimensions to graph data.

a) True
b) False
View Answer
Answer: a
Explanation: Graphs are commonly used by print and electronic media.

8. Which of the following information is not given by five-number summary?

a) Mean
b) Median
c) Mode
d) All of the mentioned
View Answer

Answer: c
Explanation: The mode is the value that appears most often in a set of data.
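
A five-number summary consists of the minimum, lower quartile, median, upper quartile, and maximum. The Python sketch below computes one for a small made-up data set. Quartile conventions differ between tools, so the "inclusive" method used here is an assumption, and the function name five_number_summary is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])

summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)  # (1, 3.0, 5.0, 7.0, 9)
```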

9. Which of the following is also referred to as overlayed 1D plot?

a) lattice
b) barplot
c) gplot
d) all of the mentioned
View Answer

Answer: a
Explanation: lattice is an add-on package that implements Trellis graphics.

10. Spinning plots can be used for two dimensional data.

a) True
b) False
View Answer

Answer: a
Explanation: There are many ways to create a 3D spinning plot as well.

1. Which of the following problem is solved by reproducibility?

a) Scalability
b) Data availability
c) Improved data analysis
d) None of the mentioned
View Answer

Answer: b
Explanation: More transparency is achieved with reproducibility.

2. Point out the correct statement with respect to replication.

a) Focuses on the validity of the data analysis
b) Focuses on the validity of the scientific claim
c) Arguably a minimum standard for any scientific study
d) All of the mentioned
View Answer

Answer: b
Explanation: Replication focuses on the validity of the scientific claim, whereas reproducibility focuses on the validity of the data analysis.

3. Which of the following is effective way of checking validity of data analysis?

a) Re-run the analysis
b) Review the code
c) Check the sensitivity
d) All of the mentioned
View Answer

Answer: d
Explanation: Reproducibility addresses the most “downstream” aspect of the research process.

4. Which of the following is similar to a pre-specified clinical trial protocol?

a) Caching-based Data Analysis
b) Evidence-based Data Analysis
c) Markdown-based Data Analysis
d) All of the mentioned
View Answer

Answer: b
Explanation: Evidence-based data analysis aims to create a deterministic statistical machine.

5. Point out the wrong statement with respect to reproducibility.

a) Focuses on the validity of the data analysis
b) The ultimate standard for strengthening scientific evidence
c) Important when replication is impossible
d) None of the mentioned
View Answer

Answer: b
Explanation: Replication is particularly important in studies that can impact broad policy or regulatory decisions.

6. Which of the following can be used for data analysis model?

a) CRAN
b) CPAN
c) CTAN
d) All of the mentioned
View Answer

Answer: d
Explanation: Different problems require different approaches and expertise.

7. Reproducibility determines correctness of data analysis.

a) True
b) False
View Answer

Answer: b
Explanation: Reproducibility makes it possible to check a data analysis, but it does not by itself guarantee that the analysis is correct.

8. Which of the following step is not required in data analysis?

a) Synthesize results
b) Create reproducible code
c) Interpret results
d) None of the mentioned
View Answer

Answer: d
Explanation: The data set may depend on your goal.

9. Which of the following gives reviewers an important tool without dramatically increasing the burden?
a) Quality research
b) Replication research
c) Reproducible research
d) None of the mentioned
View Answer

Answer: c
Explanation: Reproducible research is important, but does not necessarily solve the critical question of whether a data analysis is
trustworthy.

10. Result analysis are relatively easy to replicate or reproduce.

a) True
b) False
View Answer

Answer: b
Explanation: Complicated analyses should not be trusted.

1. Which of the following is suitable for knitr?
a) Reports
b) Data preprocessing documents
c) Technical manuals
d) All of the mentioned
View Answer

Answer: a
Explanation: knitr has short technical documents.

2. Point out the correct combination related to output statements.

a) results: “asis”
b) echo: true
c) echo=false
d) none of the mentioned
View Answer

Answer: a
Explanation: Global option relating to echo have values TRUE and FALSE.

3. Which of the following is required for not echoing the code?

a) echo=TRUE
b) print=TRUE
c) echo=FALSE
d) all of the mentioned
View Answer

Answer: c
Explanation: Code has to be written to set the global options.

4. Which of the following global options are available for figures in knitr?
a) fig.height
b) fig.size
c) fig.breadth
d) all of the mentioned
View Answer

Answer: a
Explanation: fig.height has numeric value.

5. Which of the following global option has value “hide”?

a) results
b) fig.width
c) echo
d) none of the mentioned
View Answer

Answer: a
Explanation: Workflow R Markdown is a format for writing reproducible, dynamic reports with R.

6. Which of the following is the correct order of conversion?

a) .md->.Rmd->.html
b) .Rmd->.md->.html
c) .Rmd->.md->.xml
d) all of the mentioned
View Answer

Answer: b
Explanation: knitr converts the .Rmd document into a .md (markdown) document, which is then converted into HTML by default.

7. knitr is good for complex time-consuming computations.

a) True
b) False
View Answer
Answer: b
Explanation: knitr is poor for complex time-consuming computations.

8. Which of the following statement is used for importing knitr library?

a) library(knitr)
b) import knitr
c) lib(knitr)
d) none of the mentioned
View Answer

Answer: a
Explanation: knitr is not good for documents that require precise formatting.

9. The document produced by knitr has which of the following extensions?
a) .md
b) .rmd
c) .html
d) none of the mentioned
View Answer

Answer: a
Explanation: knitr produces markdown document.

10. Code chunks begin with ```{r} and end with ```.
a) True
b) False
View Answer

Answer: a
Explanation: Code chunks can have names.

1. What is the role of processing code in the research pipeline?

a) Transforms the analytical results into figures and tables
b) Transforms the analytic data into measured data
c) Transforms the measured data into analytic data
d) All of the mentioned
View Answer

Answer: c
Explanation: Data science workflow is a non-linear, iterative process.

2. Which of the following is a goal of literate statistical programming?

a) Combine explanatory text and data analysis code in a single document
b) Ensure that data analysis documents are always exported in JPEG format
c) Require those data analysis summaries are always written in R
d) None of the mentioned
View Answer

Answer: a
Explanation: Literate Statistical Practice is a programming methodology.

3. What does it mean to weave a literate statistical program?

a) Convert a program from S to python
b) Convert the program into a human readable document
c) Convert a program to decompress it
d) All of the mentioned
View Answer

Answer: b
Explanation: Literate Statistical Programming can be done with knitr.

4. Which of the following is required to implement a literate programming system?

a) A programming language like Perl
b) A programming language like Java
c) A programming language like R
d) All of the mentioned
View Answer

Answer: c
Explanation: R is a language and environment for statistical computing and graphics.

5. What is one way in which the knitr system differs from Sweave?
a) knitr allows for the use of markdown instead of LaTeX
b) knitr is written in python instead of R
c) knitr lacks features like caching of code chunks
d) none of the mentioned
View Answer

Answer: a
Explanation: knitr is an engine for dynamic report generation with R.

6. Which of the following is useful way to put text, code, data, output all in one document?
a) Literate statistical programming
b) Object oriented programming
c) Descriptive programming
d) All of the mentioned
View Answer

Answer: a
Explanation: Object-oriented programming is a programming language model organized around objects rather than “actions” and data
rather than logic.

7. Some chunks have to be re-computed every time you re-knit the file.
a) True
b) False
View Answer

Answer: b
Explanation: All chunks have to be re-computed every time you re-knit the file.

8. Which of the following tool can be used for integrating text and code in one document?
a) knitr
b) ggplot2
c) NumPy
d) None of the mentioned
View Answer

Answer: a
Explanation: knitr is a way to write LaTeX, HTML, and Markdown with R code interlaced.

9. Which of the following should be set on chunk by chunk basis to store results of computation?
a) cache=TRUE
b) cache=FALSE
c) caching=TRUE
d) none of the mentioned
View Answer

Answer: a
Explanation: After the first run, the results are loaded from cache.

10. Dependencies are checked explicitly in caching caveats.

a) True
b) False
View Answer

Answer: b
Explanation: Dependencies are not checked explicitly in caching caveats.

1. The original idea of Literate Statistical Practice comes from _______________
a) Don Knuth
b) Don Cutting
c) Douglas Cutting
d) All of the mentioned
View Answer

Answer: a
Explanation: Literate programs are tangled to produce machine readable documents.

2. Point out the correct statement.

a) An article is stream of code and text
b) Analysis code is divided in to code chunks only
c) Literate programs are tangled to produce human readable documents
d) None of the mentioned
View Answer

Answer: a
Explanation: Analysis code is divided in to code chunks and text.

3. Which of the following is required for literate programming?

a) documentation language
b) mapper language
c) reducer language
d) all of the mentioned
View Answer

Answer: a
Explanation: Programming language is also required for literate programming.

4. Which of the following is required to implement a literate programming system?

a) A programming language like Perl
b) A programming language like Java
c) A programming language like R
d) All of the mentioned
View Answer

Answer: c
Explanation: R is a language and environment for statistical computing and graphics.

5. Which of the following way is required to make work reproducible?

a) keep track of things
b) Save output
c) Save data in proprietary formats
d) None of the mentioned
View Answer

Answer: a
Explanation: Save data in NON proprietary formats to make work reproducible.

6. Which of the following disadvantage does literate programming have?

a) Slow processing of documents
b) Code is not automatic
c) No logical order
d) All of the mentioned
View Answer

Answer: a
Explanation: Code and text is in one place.

7. knitr supports only one documentation language.

a) True
b) False
View Answer
Answer: b
Explanation: knitr supports various documentation languages.

8. Which of the following tool documentation language is supported by knitr?

a) RMarkdown
b) LaTeX
c) HTML
d) None of the mentioned
View Answer

Answer: a
Explanation: knitr is available on CRAN.

9. Which of the following package by Yihui is built in to RStudio environment?

a) rpy2
b) knitr
c) ggplot2
d) none of the mentioned
View Answer

Answer: b
Explanation: It can be exported to pdf and html.

10. Literate program code is live-automatic “regression test” when building a document.
a) True
b) False
View Answer

Answer: a
Explanation: Data and results are automatically updated to reflect external changes.

1. Which of the following is the probability calculus of beliefs, given that beliefs follow certain rules?
a) Bayesian probability
b) Frequency probability
c) Frequency inference
d) Bayesian inference
View Answer

Answer: a
Explanation: Data scientists tend to fall within shades of gray of these and various other schools of inference.

2. Point out the correct statement.

a) Bayesian inference is the use of Bayesian probability representation of beliefs to perform inference
b) NULL is the standard missing data marker used in S
c) Frequency inference is the use of Bayesian probability representation of beliefs to perform inference
d) None of the mentioned
View Answer

Answer: a
Explanation: Frequency probability is the long run proportion of times an event occurs in independent, identically distributed
repetitions.

3. Which of the following can be considered as random variable?

a) The outcome from the roll of a die
b) The outcome of flip of a coin
c) The outcome of exam
d) All of the mentioned
View Answer

Answer: d
Explanation: The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible
values.

4. Which of the following random variables takes on only a countable number of possibilities?
a) Discrete
b) Non Discrete
c) Continuous
d) All of the mentioned
View Answer

Answer: a
Explanation: Continuous random variable can take any value on some subset of the real line.

5. Point out the wrong statement.

a) A random variable is a numerical outcome of an experiment
b) There are three types of random variable
c) Continuous random variable can take any value on the real line
d) None of the mentioned
View Answer

Answer: b
Explanation: There are two types of random variable-continuous and discrete.

6. Which of the following is also referred to as random variable?

a) stochast
b) aleatory
c) eliette
d) all of the mentioned
View Answer

Answer: b
Explanation: Random variable is also known as stochastic variable.

7. Bayesian inference uses frequency interpretations of probabilities to control error rates.

a) True
b) False
View Answer

Answer: b
Explanation: Frequency inference uses frequency interpretations of probabilities to control error rates.

8. Which of the following condition should be satisfied by function for pmf?

a) The sum of all of the possible values is 1
b) The sum of all of the possible values is 0
c) The sum of all of the possible values is infinite
d) All of the mentioned
View Answer

Answer: a
Explanation: A probability mass function evaluated at a value corresponds to the probability that a random variable takes that value.
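
The condition in option a can be checked directly for a simple case. The Python sketch below uses a fair six-sided die as an assumed example; the helper name is_valid_pmf is hypothetical. Exact fractions are used so that the sum is compared to 1 without rounding error.

```python
from fractions import Fraction

# pmf of a fair die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_pmf(pmf):
    """A pmf must be non-negative everywhere and sum to exactly 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

# The expected value weights each outcome by its probability.
expected_value = sum(face * p for face, p in die_pmf.items())

print(is_valid_pmf(die_pmf))  # True
print(expected_value)         # 7/2
```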

9. Which of the following function is associated with a continuous random variable?

a) pdf
b) pmv
c) pmf
d) all of the mentioned
View Answer

Answer: a
Explanation: pdf stands for probability density function.

10. Statistical inference is the process of drawing formal conclusions from data.
a) True
b) False
View Answer
Answer: a
Explanation: Statistical inference requires navigating the set of assumptions and tools.

1. The expected value or _______ of a random variable is the center of its distribution.
a) mode
b) median
c) mean
d) bayesian inference
View Answer

Answer: c
Explanation: A probability model connects the data to the population using assumptions.

2. Point out the correct statement.

a) Some cumulative distribution function F is non-decreasing and right-continuous
b) Every cumulative distribution function F is decreasing and right-continuous
c) Every cumulative distribution function F is increasing and left-continuous
d) None of the mentioned
View Answer

Answer: d
Explanation: Every cumulative distribution function F is non-decreasing and right-continuous.

3. Which of the following of a random variable is a measure of spread?

a) variance
b) standard deviation
c) empirical mean
d) all of the mentioned
View Answer

Answer: a
Explanation: Densities with a higher variance are more spread out than densities with a lower variance.

4. The square root of the variance is called the ________ deviation.

a) empirical
b) mean
c) continuous
d) standard
View Answer

Answer: d
Explanation: Standard Deviation (SD) is the measure of spread of the numbers in a set of data from its mean value.

5. Point out the wrong statement.

a) A percentile is simply a quantile expressed as a percent
b) There are two types of random variable
c) R cannot approximate quantiles for you for common distributions
d) None of the mentioned
View Answer

Answer: c
Explanation: R can approximate quantiles for you for common distributions.

6. Which of the following inequality is useful for interpreting variances?

a) Chebyshev
b) Stautaory
c) Testory
d) All of the mentioned
View Answer

Answer: a
Explanation: Chebyshev’s inequality is also spelled as Tchebysheff’s inequality.

7. For continuous random variables, the CDF is the derivative of the PDF.
a) True
b) False
View Answer

Answer: b
Explanation: For continuous random variables, the PDF is the derivative of the CDF.
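
This relationship can be checked numerically. The Python sketch below assumes an exponential distribution with rate 1 as the example and compares a central-difference derivative of its CDF with its PDF; the function names are hypothetical.

```python
import math

def cdf(x, rate=1.0):
    """CDF of an exponential distribution: 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def pdf(x, rate=1.0):
    """PDF of the same distribution: rate * exp(-rate * x) for x >= 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def numerical_derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Largest disagreement between d/dx CDF and the PDF at a few test points.
max_gap = max(abs(numerical_derivative(cdf, x) - pdf(x)) for x in (0.5, 1.0, 2.0))
print(max_gap)
```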

8. Chebyshev’s inequality states that the probability of a “Six Sigma” event is less than ___________
a) 10%
b) 20%
c) 30%
d) 3%
View Answer

Answer: d
Explanation: If a bell curve is assumed, the probability of a “six sigma” event is on the order of one ten millionth of a percent.
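
Both figures in this question can be reproduced with a short calculation. Chebyshev's inequality bounds the probability of landing at least k standard deviations from the mean by 1/k^2 for any distribution with finite variance, while the normal tail is far smaller. The Python sketch below uses hypothetical function names.

```python
import math

def chebyshev_bound(k):
    """Upper bound on P(|X - mu| >= k * sigma), valid for any distribution."""
    return 1.0 / k**2

def normal_two_sided_tail(k):
    """P(|Z| > k) for a standard normal Z, computed via the complementary error function."""
    return math.erfc(k / math.sqrt(2))

bound = chebyshev_bound(6)          # 1/36, a little under 3%
normal = normal_two_sided_tail(6)   # on the order of 1e-9

print(bound)
print(normal)
```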

9. Which of the following random variables are the default model for random samples?
a) iid
b) id
c) pmd
d) all of the mentioned
View Answer

Answer: a
Explanation: Random variables are said to be iid if they are independent and identically distributed.

10. Cumulative distribution functions are used to specify the distribution of multivariate random variables.
a) True
b) False
View Answer

Answer: a
Explanation: In the case of a continuous distribution, it gives the area under the probability density function from minus infinity to x.

1. Which of the following goal is incorrectly represented in the below figure?

a) Relationship between variables
b) Distribution of variables
c) Inference about relationships
d) Causal
View Answer

Answer: d
Explanation: Causal is not directly related to goal of statistical modelling.

2. Point out the correct statement.

a) The exponent of a normally distributed random variables follows what is called the log-normal distribution
b) Sums of normally distributed random variables are again normally distributed even if the variables are dependent
c) The square of a standard normal random variable follows what is called chi-squared distribution
d) All of the mentioned
View Answer
Answer: d
Explanation: Many random variables, properly normalized, limit to a normal distribution.

3. Which of the following is incorrect with respect to use of Poisson distribution?

a) Modeling event/time data
b) Modeling bounded count data
c) Modeling contingency tables
d) All of the mentioned
View Answer

Answer: b
Explanation: Poisson distribution is used for modeling unbounded count data.

4. __________ random variables are used to model rates.

a) Empirical
b) Binomial
c) Poisson
d) All of the mentioned
View Answer

Answer: c
Explanation: Poisson distribution is used to model counts.

5. Point out the wrong statement.

a) The normal distribution is asymmetric and peaked about its mode
b) A constant times a normally distributed random variable is also normally distributed
c) Sample means of normally distributed random variables are again normally distributed
d) None of the mentioned
View Answer

Answer: a
Explanation: The normal distribution is symmetric and peaked about its mean.

6. Which of the following form the basis for frequency interpretation of probabilities?
a) Asymptotics
b) Symptotics
c) Asymmetry
d) All of the mentioned
View Answer

Answer: a
Explanation: Asymptotics is the term for the behavior of statistics as the sample size tends to infinity.

7. Bernoulli random variables take (only) the values 1 and 0.

a) True
b) False
View Answer

Answer: a
Explanation: The Bernoulli distribution arises as the result of a binary outcome.

8. The _________ basically states that the sample mean is consistent.

a) LAN
b) LLN
c) LWN
d) None of the mentioned
View Answer

Answer: b
Explanation: LLN stands for law of large numbers.
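
The law of large numbers can be illustrated by simulation. The Python sketch below assumes rolls of a fair die, whose population mean is 3.5, and compares the sample mean for a small and a large sample. The function name and the fixed seed are choices made only for this illustration.

```python
import random

def mean_of_die_rolls(n, seed=0):
    """Average of n simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(n):
        total += rng.randint(1, 6)
    return total / n

small = mean_of_die_rolls(10)
large = mean_of_die_rolls(100_000)

print("n = 10:     ", small)
print("n = 100000: ", large)
```

With the larger sample the average should sit very close to 3.5, which is what consistency of the sample mean describes.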

9. Which of the following theorem states that the distribution of averages of iid variables, properly normalized, becomes that of a standard
normal as the sample size increases?
a) Central Limit Theorem
b) Central Mean Theorem
c) Centroid Limit Theorem
d) All of the mentioned
View Answer

Answer: a
Explanation: The Central Limit Theorem (CLT) is one of the most important theorems in statistics.
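
A small simulation shows the normalization the question refers to. The Python sketch below assumes iid Uniform(0, 1) draws (mean 0.5, variance 1/12), standardizes each sample mean, and checks how often it falls within 1.96 of zero, which should be close to 95% if the standard normal approximation holds. Function names, sample sizes, and the seed are illustrative assumptions.

```python
import math
import random

def standardized_means(n, replicates, seed=0):
    """Return (xbar - mu) / (sigma / sqrt(n)) for repeated uniform samples."""
    rng = random.Random(seed)
    mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and sd of Uniform(0, 1)
    results = []
    for _ in range(replicates):
        xbar = sum(rng.random() for _ in range(n)) / n
        results.append((xbar - mu) / (sigma / math.sqrt(n)))
    return results

z = standardized_means(n=30, replicates=10_000)
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)
print(coverage)
```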

10. The binomial random variables are obtained as the sum of iid Gaussian trials.
a) True
b) False
View Answer

Answer: b
Explanation: The binomial random variables are obtained as the sum of iid Bernoulli trials.
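
The statement in the explanation can be verified exactly for a small case. The Python sketch below builds the distribution of a sum of n iid Bernoulli(p) trials by adding one trial at a time and compares it with the closed-form binomial pmf. The function names and the values n = 10, p = 0.3 are assumptions for illustration.

```python
import math

def bernoulli_sum_pmf(n, p):
    """pmf of the sum of n iid Bernoulli(p) trials, built one trial at a time."""
    pmf = [1.0]  # with zero trials the sum is 0 with probability 1
    for _ in range(n):
        nxt = [0.0] * (len(pmf) + 1)
        for k, prob in enumerate(pmf):
            nxt[k] += prob * (1 - p)   # the new trial is a failure
            nxt[k + 1] += prob * p     # the new trial is a success
        pmf = nxt
    return pmf

def binomial_pmf(n, p):
    """Closed-form binomial pmf: C(n, k) * p^k * (1 - p)^(n - k)."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

n, p = 10, 0.3
by_convolution = bernoulli_sum_pmf(n, p)
by_formula = binomial_pmf(n, p)
print(max(abs(a - b) for a, b in zip(by_convolution, by_formula)))
```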

1. The _________ of the Chi-squared distribution is twice the degrees of freedom.

a) variance
b) standard deviation
c) mode
d) none of the mentioned
View Answer

Answer: a
Explanation: The mean of the Chi-squared is its degrees of freedom.
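
Both moments can be checked by simulation, using the fact that a chi-squared variable with df degrees of freedom is a sum of df squared independent standard normals. The Python sketch below assumes df = 4, so the mean should be near 4 and the variance near 8. The function name and seed are illustrative.

```python
import random
import statistics

def chi_squared_draws(df, replicates, seed=0):
    """Simulate chi-squared draws as sums of df squared standard normals."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(replicates)
    ]

draws = chi_squared_draws(df=4, replicates=20_000)
sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

print("mean:", sample_mean, "(theory: 4)")
print("variance:", sample_var, "(theory: 8)")
```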

2. Point out the correct statement.

a) Asymptotics are incredibly useful for simple statistical inference and approximations
b) Asymptotics often lead to nice understanding of procedures
c) An estimator is consistent if it converges to what you want to estimate
d) All of the mentioned
View Answer

Answer: d
Explanation: Consistency is neither necessary nor sufficient for one estimator to be better than another.

3. Gosset’s distribution was invented by which of the following scientists?

a) William Gosset
b) William Gosling
c) Gosling Gosset
d) All of the mentioned
View Answer

Answer: a
Explanation: Gosset’s distribution is indexed by a degrees of freedom.

4. The _________ of a collection of data is the joint density evaluated as a function of the parameters with the data fixed.
a) probability
b) likelihood
c) poisson distribution
d) all of the mentioned
View Answer

Answer: b
Explanation: Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter.
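
For a concrete case, the Python sketch below treats 7 successes in 10 Bernoulli trials as fixed data and evaluates the likelihood as a function of the success probability p. The data values, the grid search, and the function name are assumptions for illustration; a real analysis would normally maximize the likelihood analytically or with an optimizer.

```python
def bernoulli_likelihood(p, successes, trials):
    """Joint density of the data viewed as a function of the parameter p."""
    return p**successes * (1 - p) ** (trials - successes)

# Evaluate the likelihood on a grid of candidate parameter values.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda p: bernoulli_likelihood(p, successes=7, trials=10))

# A likelihood ratio measures the relative evidence for one value over another.
ratio = bernoulli_likelihood(0.7, 7, 10) / bernoulli_likelihood(0.5, 7, 10)

print(best_p)  # 0.7, the observed proportion of successes
print(ratio)
```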

5. Point out the wrong statement.

a) Asymptotics generally give assurances about finite sample performance
b) The sample variance and the sample standard deviation are consistent as well
c) The sample mean and the sample variance are unbiased as well
d) None of the mentioned
View Answer
Answer: a
Explanation: The kinds of asymptotics that do are orders of magnitude more difficult to work with.

6. Which of the following is a property of likelihood?

a) Ratios of likelihood values measure the relative evidence of one value of the unknown parameter to another
b) Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is
contained in the likelihood
c) The Resultant likelihood is multiplication of individual likelihood
d) All of the mentioned
View Answer

Answer: d
Explanation: Likelihood is the hypothetical probability that an event that has already occurred would yield a specific outcome.

7. CLT is mostly useful as an approximation.

a) True
b) False
View Answer

Answer: a
Explanation: The CLT applies in an endless variety of settings.

8. The beta distribution is the default prior for parameters between ____________
a) 0 and 10
b) 1 and 2
c) 0 and 1
d) None of the mentioned
View Answer

Answer: c
Explanation: Bayesian statistics posits a prior on the parameter of interest.

9. Which of the following mean is a mixture of the MLE and the prior mean?
a) interior
b) exterior
c) posterior
d) all of the mentioned

Answer: c
Explanation: MLE stands for maximum likelihood.
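
As a hedged illustration, assume a Beta prior on a proportion and binomial data (all numbers below are hypothetical). Under that assumption the posterior mean can be written as a weighted mixture of the MLE and the prior mean:

```python
# Posterior mean under a Beta prior and binomial data (values are illustrative).
alpha, beta = 2.0, 2.0      # hypothetical Beta(2, 2) prior on a proportion in (0, 1)
successes, trials = 7, 10   # hypothetical observed data

prior_mean = alpha / (alpha + beta)
mle = successes / trials

# Conjugate update: the posterior is Beta(alpha + successes, beta + failures).
posterior_mean = (alpha + successes) / (alpha + beta + trials)

# The same number written as a weighted mixture of the MLE and the prior mean.
weight = trials / (alpha + beta + trials)
mixture = weight * mle + (1 - weight) * prior_mean

print(posterior_mean, mixture)
```

The weight on the MLE grows with the number of trials, so the data dominate the prior as the sample gets larger.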

10. Usually replacing the standard error by its estimated value does change the CLT.
a) True
b) False

Answer: b
Explanation: Usually replacing the standard error by its estimated value doesn’t change the CLT.

1. Which of the following testing is concerned with making decisions using data?
a) Probability
b) Hypothesis
c) Causal
d) None of the mentioned

Answer: b
Explanation: The null hypothesis is assumed true and statistical evidence is required to reject it in favor of a research or alternative
hypothesis.

2. Point out the correct statement.

a) Power of a one sided test is lower than the power of the associated two sided test
b) Power of a two sided test is greater than the power of the associated one sided test
c) Hypothesis testing is less commonly used
d) None of the mentioned

Answer: d
Explanation: Power of a one sided test is greater than the power of the associated two sided test.

3. Which of the following value is the most common measure of “statistical significance”?
a) P
b) A
c) L
d) All of the mentioned

Answer: a
Explanation: The P-value is the probability, under the null hypothesis, of obtaining evidence as extreme as or more extreme than
what was actually observed.

4. What is the purpose of multiple testing in statistical inference?

a) Minimize errors
b) Minimize false positives
c) Minimize false negatives
d) All of the mentioned

Answer: d
Explanation: A false positive is an error in some evaluation process in which a condition tested for is mistakenly found to have been
detected.

5. Point out the wrong statement with respect to FDR.

a) FDR is difficult to calculate
b) FDR is relatively less conservative
c) FDR allows for more false positives
d) None of the mentioned

Answer: a
Explanation: FDR stands for false discovery rate.

6. Which of the following is the oldest multiple testing correction?

a) Bonferroni correction
b) Bernoulli correction
c) Likelihood correction
d) All of the mentioned

Answer: a
Explanation: Bonferroni correction is easy to calculate.
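
A minimal sketch of why the correction is easy to calculate, assuming four hypothetical p-values and a family-wise error rate of 0.05: each p-value is compared with alpha divided by the number of tests.

```python
# Bonferroni correction: compare each p-value with alpha / m.
p_values = [0.001, 0.012, 0.030, 0.200]  # hypothetical p-values from m = 4 tests
alpha = 0.05

m = len(p_values)
threshold = alpha / m  # 0.0125 for these numbers

rejected = [p <= threshold for p in p_values]

# Equivalent view: multiply each p-value by m (capped at 1) and compare with alpha.
adjusted = [min(1.0, p * m) for p in p_values]

print(rejected)
print(adjusted)
```

Both views lead to the same decisions; the second one reports adjusted p-values that can be read against the original alpha.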

7. The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size.
a) True
b) False

Answer: a
Explanation: If the sample sizes are the same the pooled variance estimate is the average of the group variances.
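
The weighting can be sketched as follows, assuming the usual two-group pooled variance with weights n - 1. The helper name and the data are hypothetical.

```python
import statistics


def pooled_variance(x, y):
    """Weighted average of two sample variances, with weights n - 1."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    return ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)


group_a = [4.1, 5.0, 5.9, 6.2, 4.8, 5.5]  # hypothetical group, n = 6
group_b = [7.0, 9.5, 8.1]                 # hypothetical group, n = 3
print(pooled_variance(group_a, group_b))  # closer to the variance of the larger group

# With equal sample sizes the pooled estimate is the plain average of the variances.
equal_a, equal_b = [1.0, 2.0, 3.0], [2.0, 4.0, 9.0]
print(pooled_variance(equal_a, equal_b))
```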

8. Which of the following tool is used for constructing confidence intervals and calculating standard errors for difficult statistics?
a) baggyer
b) bootstrap
c) jackknife
d) none of the mentioned
Answer: b
Explanation: The bootstrap procedure follows from the so called bootstrap principle.
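
A minimal sketch of the bootstrap principle, assuming a small hypothetical sample and the median as the "difficult" statistic: the observed data are resampled with replacement, and the spread of the recomputed statistic gives a standard error and a percentile interval.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

sample = [2.3, 3.1, 3.8, 4.0, 4.4, 5.2, 5.9, 6.3, 7.7, 9.0]  # hypothetical data
n_boot = 2000

boot_medians = []
for _ in range(n_boot):
    # Resample the observed data with replacement, same size as the original.
    resample = random.choices(sample, k=len(sample))
    boot_medians.append(statistics.median(resample))

# Bootstrap standard error: spread of the statistic across resamples.
boot_se = statistics.stdev(boot_medians)

# Percentile confidence interval from the sorted bootstrap statistics.
boot_medians.sort()
lower = boot_medians[int(0.025 * n_boot)]
upper = boot_medians[int(0.975 * n_boot) - 1]

print(boot_se, (lower, upper))
```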

9. Which of the following tool is used for estimating standard errors and the bias of estimators?
a) knitr
b) jackknife
c) ggplot2
d) all of the mentioned

Answer: b
Explanation: jackknife involves resampling data.
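
The resampling involved can be sketched as leave-one-out estimation. The helper below is hypothetical and uses the standard jackknife formulas for bias and standard error; the data are illustrative.

```python
import math
import statistics


def jackknife(data, estimator):
    """Return the jackknife estimates of bias and standard error."""
    n = len(data)
    full_estimate = estimator(data)
    # Recompute the estimator with each observation left out in turn.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    bias = (n - 1) * (loo_mean - full_estimate)
    se = math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out))
    return bias, se


sample = [2.3, 3.1, 3.8, 4.0, 4.4, 5.2, 5.9, 6.3]  # hypothetical data
bias, se = jackknife(sample, statistics.mean)
print(bias, se)
```

For the sample mean the jackknife bias is zero (up to rounding) and the jackknife standard error matches the usual s / sqrt(n), which makes the mean a convenient sanity check.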

10. Power is the probability of rejecting the null hypothesis when it is true.
a) True
b) False

Answer: b
Explanation: Power is the probability of rejecting the null hypothesis when it is false.
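
A hedged sketch of a power calculation, assuming a z test with known standard deviation. The function names and the numbers are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()


def power_one_sided(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of the upper-tailed z test of H0: mu = mu0 when the true mean is mu_alt."""
    shift = (mu_alt - mu0) * n ** 0.5 / sigma
    critical = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(critical - shift)


def power_two_sided(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of the two-sided z test of H0: mu = mu0 when the true mean is mu_alt."""
    shift = (mu_alt - mu0) * n ** 0.5 / sigma
    critical = std_normal.inv_cdf(1 - alpha / 2)
    return (1 - std_normal.cdf(critical - shift)) + std_normal.cdf(-critical - shift)


# Hypothetical setting: H0 mean 30, true mean 32, sigma 4, n = 16.
print(power_one_sided(30, 32, 4, 16))
print(power_two_sided(30, 32, 4, 16))
```

When the true mean equals the null value, the rejection probability is just alpha. For an alternative in the tested direction, the one-sided test has the higher power.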

1. Which of the following function can be replaced with the question mark in the below figure?

a) boxplot
b) lplot
c) levelplot
d) all of the mentioned

Answer: c
Explanation: levelplot is used for plotting “image”-style data.

2. Point out the correct statement.

a) The mean is a measure of central tendency of the data
b) Empirical mean is related to “centering” the random variables
c) The empirical standard deviation is a measure of spread
d) All of the mentioned

Answer: d
Explanation: The process of centering and scaling the data is called “normalizing” the data.

3. Which of the following implies no relationship with respect to correlation?

a) Cor(X, Y) = 1
b) Cor(X, Y) = 0
c) Cor(X, Y) = 2
d) All of the mentioned

Answer: b
Explanation: Correlation is a statistical technique that can show whether and how strongly pairs of variables are related.

4. Normalized data are centered at ___ and have units equal to standard deviations of the original data.
a) 0
b) 5
c) 1
d) 10

Answer: a
Explanation: In statistics and applications of statistics, normalization can have a range of meanings.

5. Point out the wrong statement.

a) Regression through the origin yields an equivalent slope if you center the data first
b) Normalizing variables results in the slope being the correlation
c) Least squares is not an estimation tool
d) None of the mentioned

Answer: c
Explanation: Least squares is an estimation tool.
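
Statement b) can be checked numerically. The sketch below uses hypothetical data and helper names: after both variables are normalized (centered at 0, unit standard deviation), the least squares slope equals the correlation.

```python
import math
import statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical predictor values
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]  # hypothetical outcome values


def center(values):
    """Subtract the mean so the values are centered at 0."""
    mean = statistics.mean(values)
    return [v - mean for v in values]


def normalize(values):
    """Center at 0 and scale to unit standard deviation."""
    sd = statistics.stdev(values)
    return [v / sd for v in center(values)]


def slope_through_origin(u, v):
    """Least squares slope of v on u with no intercept term."""
    return sum(a * b for a, b in zip(u, v)) / sum(a * a for a in u)


xc, yc = center(x), center(y)
correlation = sum(a * b for a, b in zip(xc, yc)) / math.sqrt(
    sum(a * a for a in xc) * sum(b * b for b in yc)
)

slope_normalized = slope_through_origin(normalize(x), normalize(y))
print(correlation, slope_normalized)
```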

6. Which of the following is correct with respect to residuals?

a) Positive residuals are above the line, negative residuals are below
b) Positive residuals are below the line, negative residuals are above
c) Positive residuals and negative residuals are below the line
d) All of the mentioned

Answer: a
Explanation: Residuals can be thought of as the outcome with the linear association of the predictor removed.

7. Minimizing the likelihood is the same as maximizing -2 log likelihood.

a) True
b) False

Answer: a
Explanation: Maximizing the likelihood is the same as minimizing -2 log likelihood.

8. Which of the following refers to the circumstance in which the variability of a variable is unequal across the range of values of a
second variable that predicts it?
a) Heterogeneity
b) Heteroskedasticity
c) Heteroelasticty
d) None of the mentioned

Answer: b
Explanation: Heteroskedasticity has serious consequences for the OLS estimator.

9. Which of the following outcome is odd man out in the below figure?

a) R Squared
b) Kappa
c) RMSE
d) All of the mentioned
Answer: b
Explanation: Kappa is a measure for categorical outcomes.

10. Residuals are useful for investigating best model fit.

a) True
b) False

Answer: b
Explanation: Residuals are useful for investigating poor model fit.

1. Which of the following is the correct formula for total variation?

a) Total Variation = Residual Variation – Regression Variation
b) Total Variation = Residual Variation + Regression Variation
c) Total Variation = Residual Variation * Regression Variation
d) All of the mentioned

Answer: b
Explanation: The complementary part of the total variation is called unexplained or residual.
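
The decomposition can be verified on a small example. This is a sketch with hypothetical data, fitting a simple least squares line by hand:

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical predictor values
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]  # hypothetical outcome values

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares line with intercept.
slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
    (a - x_bar) ** 2 for a in x
)
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * a for a in x]

total_variation = sum((b - y_bar) ** 2 for b in y)
residual_variation = sum((b - f) ** 2 for b, f in zip(y, fitted))
regression_variation = sum((f - y_bar) ** 2 for f in fitted)

# Share of the total variation explained by the regression.
r_squared = regression_variation / total_variation

print(total_variation, residual_variation + regression_variation, r_squared)
```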

2. Point out the correct statement.

a) A standard error is needed to create a prediction interval
b) The prediction interval must incorporate the variability in the data around the line
c) Investors use the residual variance to measure the accuracy of their predictions on the value of an asset
d) All of the mentioned

Answer: d
Explanation: In statistics, explained variation measures the proportion to which a mathematical model accounts for the variation of a
given data set.

3. Which of the following things can be accomplished with linear model?

a) Flexibly fit complicated functions
b) Uncover complex multivariate relationships
c) Build accurate prediction models
d) All of the mentioned

Answer: d
Explanation: Linear models are the single most important applied statistical and machine learning technique.

4. Which of the following statement is incorrect with respect to outliers?

a) Outliers can have varying degrees of influence
b) Outliers can be the result of spurious or real processes
c) Outliers cannot conform to the regression relationship
d) None of the mentioned

Answer: c
Explanation: Outliers can conform to the regression relationship.

5. Point out the wrong statement.

a) The fraction of variance unexplained is an established concept in the context of linear regression
b) “Explained variance” is routinely used in principal component analysis
c) The general linear model extends simple linear regression (SLR) by adding terms linearly into the model
d) None of the mentioned

Answer: d
Explanation: Linearity refers to a mathematical relationship or function that can be graphically represented as a straight line.

6. Which of the following can be useful for diagnosing data entry errors?
a) hat values
b) dffit
c) resid
d) all of the mentioned

Answer: a
Explanation: resid returns the ordinary residuals.

7. Multivariate regression estimates are exactly those having removed the linear relationship of the other variables from both the regressor
and response.
a) True
b) False

Answer: a
Explanation: Multivariate Data Analysis refers to any statistical technique used to analyze data that arises from more than one
variable.

8. Residual ______ plots investigate normality of the errors.

a) RR
b) PP
c) QQ
d) None of the mentioned

Answer: c
Explanation: Patterns in your residual plots generally indicate some poor aspect of model fit.

9. Which of the following show residuals divided by their standard deviations?

a) rstudent
b) cooks.distance
c) rstandard
d) all of the mentioned

Answer: c
Explanation: rstandard stands for standardized residuals.

10. The least squares estimate for the coefficient of a multivariate regression model is exactly regression through the origin with the linear
relationships.
a) True
b) False

Answer: b
Explanation: Multivariate regression adjusts a coefficient for the linear impact of the other variables.

1. How many components are present in generalized linear models?

a) 2
b) 4
c) 6
d) None of the mentioned

Answer: d
Explanation: Generalized linear models involve three components.

2. Point out the wrong statement.

a) Additive response models don’t make much sense if the response is discrete, or strictly positive
b) Transformations are often easy to interpret in linear model
c) Regression models are used to predict one variable from one or more other variables
d) All of the mentioned
Answer: b
Explanation: Transformations are often hard to interpret in linear model.

3. Which of the following component is involved in generalized linear models?

a) An exponential family model for the response
b) A systematic component via a linear predictor
c) A link function that connects the means of the response to the linear predictor
d) All of the mentioned

Answer: d
Explanation: GLM is a flexible generalization of ordinary linear regression that allows for response variables that have error
distribution models other than a normal distribution.

4. Collection of exchangeable binary outcomes for the same covariate data are called _______ outcomes.
a) random
b) direct
c) binomial
d) none of the mentioned

Answer: c
Explanation: The multivariate regression model for binary outcomes gives odds ratios, not risk ratios.

5. Point out the wrong statement.

a) Asymptotics are used for inference usually
b) Adding squared terms makes it continuously differentiable at the knot points
c) Adding squared terms makes it twice continuously differentiable at the knot points
d) None of the mentioned

Answer: c
Explanation: Adding cubic terms makes it twice continuously differentiable at the knot points.

6. Which of the following is example use of Poisson distribution?

a) Analyzing contingency table data
b) Modeling web traffic hits
c) Incidence rates
d) All of the mentioned

Answer: d
Explanation: The Poisson distribution is a useful model for counts and rates.

7. Principal components or factor analytic models on covariates are often useful for reducing complex covariate spaces.
a) True
b) False

Answer: a
Explanation: The space of models explodes quickly as you add interactions and polynomial terms.

8. How many outcomes are possible with bernoulli trial?

a) 2
b) 3
c) 4
d) None of the mentioned

Answer: a
Explanation: A Bernoulli trial is a random experiment with exactly two possible outcomes.

9. Which of the following analysis is a statistical process for estimating the relationships among variables?
a) Causal
b) Regression
c) Multivariate
d) All of the mentioned

Answer: b
Explanation: Regression models provide the scientist with a powerful tool, allowing predictions about past, present, or future events
to be made with information about past or present events.

10. Linear models are the most useful applied statistical technique.
a) True
b) False

Answer: b
Explanation: Linear models do have limitations.

1. Which of the following can be used to generate balanced cross–validation groupings from a set of data?
a) createFolds
b) createSample
c) createResample
d) none of the mentioned

Answer: a
Explanation: createResample can be used to make simple bootstrap samples.

2. Point out the wrong statement.

a) Simple random sampling of time series is probably the best way to resample time series data.
b) Three parameters are used for time series splitting
c) Horizon parameter is the number of consecutive values in test set sample
d) All of the mentioned

Answer: a
Explanation: Simple random sampling of time series is probably not the best way to resample time series data.

3. Which of the following function can be used to maximize the minimum dissimilarities?
a) sumDiss
b) minDiss
c) avgDiss
d) all of the mentioned

Answer: b
Explanation: sumDiss can be used to maximize the total dissimilarities.

4. Which of the following function can create the indices for time series type of splitting?
a) newTimeSlices
b) createTimeSlices
c) binTimeSlices
d) none of the mentioned

Answer: b
Explanation: Rolling forecasting origin techniques are associated with time series type of splitting.
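
A minimal Python sketch of rolling-forecasting-origin splitting. It is not caret's implementation; the function `time_slices` and its argument names are hypothetical and only loosely mirror the idea of an initial window, a horizon, and a fixed or growing training window.

```python
def time_slices(n, initial_window, horizon, fixed_window=True):
    """Return (train_indices, test_indices) pairs that respect time order."""
    slices = []
    for end in range(initial_window, n - horizon + 1):
        # A fixed window slides forward; otherwise the training set keeps growing.
        start = end - initial_window if fixed_window else 0
        train = list(range(start, end))
        test = list(range(end, end + horizon))
        slices.append((train, test))
    return slices


# Hypothetical series of 10 time points, 5 training points, 2-step horizon.
for train, test in time_slices(10, initial_window=5, horizon=2):
    print(train, test)
```

Every test set lies strictly after its training set, which is the property that simple random sampling would destroy.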

5. Point out the correct statement.

a) Asymptotics are used for inference usually
b) Caret includes several functions to pre-process the predictor data
c) The function dummyVars can be used to generate a complete set of dummy variables from one or more factors
d) All of the mentioned
Answer: d
Explanation: The function dummyVars takes a formula and a data set and outputs an object that can be used to create the dummy
variables using the predict method.

6. Which of the following can be used to create sub–samples using a maximum dissimilarity approach?
a) minDissim
b) maxDissim
c) inmaxDissim
d) all of the mentioned

Answer: b
Explanation: Splitting is based on the predictors.

7. caret does not use the proxy package.

a) True
b) False

Answer: b
Explanation: caret uses the proxy package.

8. Which of the following function can be used to create balanced splits of the data?
a) newDataPartition
b) createDataPartition
c) renameDataPartition
d) none of the mentioned

Answer: b
Explanation: If the y argument to this function is a factor, the random sampling occurs within each class and should preserve the
overall class distribution of the data.

9. Which of the following package tools are present in caret?

a) pre-processing
b) feature selection
c) model tuning
d) all of the mentioned

Answer: d
Explanation: There are many different modeling functions in R.

10. caret stands for classification and regression training.

a) True
b) False

Answer: a
Explanation: The caret package is a set of functions that attempt to streamline the process for creating predictive models.

1. Which of the following function is a wrapper for different lattice plots to visualize the data?
a) levelplot
b) featurePlot
c) plotsample
d) none of the mentioned

Answer: b
Explanation: featurePlot is used for data visualization in caret.

2. Point out the wrong statement.

a) In every situation, the data generating mechanism can create predictors that only have a single unique value
b) Predictors might have only a handful of unique values that occur with very low frequencies
c) The function findLinearCombos uses the QR decomposition of a matrix to enumerate sets of linear combinations
d) All of the mentioned

Answer: a
Explanation: In some situations, the data generating mechanism can create predictors that only have a single unique value.

3. Which of the following function can be used to identify near zero-variance variables?
a) zeroVar
b) nearVar
c) nearZeroVar
d) all of the mentioned

Answer: c
Explanation: The saveMetrics argument can be used to show the details and usually defaults to FALSE.

4. Which of the following function can be used to flag predictors for removal?
a) searchCorrelation
b) findCausation
c) findCorrelation
d) none of the mentioned

Answer: c
Explanation: Some models thrive on correlated predictors.

5. Point out the correct statement.

a) findLinearColumns will also return a vector of column positions can be removed to eliminate the linear dependencies
b) findLinearCombos will return a list that enumerates dependencies
c) the function findLinearRows can be used to generate a complete set of row variables from one factor
d) none of the mentioned

Answer: b
Explanation: For each linear combination, it will incrementally remove columns from the matrix and test to see if the dependencies
have been resolved.

6. Which of the following can be used to impute data sets based only on information in the training set?
a) postProcess
b) preProcess
c) process
d) all of the mentioned

Answer: b
Explanation: This can be done with K-nearest neighbors.

7. The function preProcess estimates the required parameters for each operation.
a) True
b) False

Answer: a
Explanation: predict.preProcess is used to apply them to specific data sets.

8. Which of the following can also be used to find new variables that are linear combinations of the original set with independent
components?
a) ICA
b) SCA
c) PCA
d) None of the mentioned
Answer: a
Explanation: ICA stands for independent component analysis.

9. Which of the following function is used to generate the class distances?

a) preprocess.classDist
b) predict.classDist
c) predict.classDistance
d) all of the mentioned

Answer: b
Explanation: By default, the distances are logged.

10. The preProcess class can be used for many operations on predictors.
a) True
b) False

Answer: a
Explanation: Operations include centering and scaling.

1. varImp is a wrapper around the evimp function in the _______ package.

a) numpy
b) earth
c) plot
d) none of the mentioned

Answer: b
Explanation: The earth package is an implementation of Jerome Friedman’s Multivariate Adaptive Regression Splines.

2. Point out the wrong statement.

a) The trapezoidal rule is used to compute the area under the ROC curve
b) For regression, the relationship between each predictor and the outcome is evaluated
c) An argument, para, is used to pick the model fitting technique
d) All of the mentioned

Answer: c
Explanation: An argument, nonpara, is used to pick the model fitting technique.

3. Which of the following curve analysis is conducted on each predictor for classification?
a) NOC
b) ROC
c) COC
d) All of the mentioned

Answer: b
Explanation: For two class problems, a series of cutoffs is applied to the predictor data to predict the class.

4. Which of the following function tracks the changes in model statistics?

a) varImp
b) varImpTrack
c) findTrack
d) none of the mentioned

Answer: a
Explanation: GCV change value can also be tracked.

5. Point out the correct statement.

a) The difference between the class centroids and the overall centroid is used to measure the variable influence
b) The Bagged Trees output contains variable usage statistics
c) Boosted Trees uses different approach as a single tree
d) None of the mentioned

Answer: a
Explanation: The larger the difference between the class centroid and the overall center of the data, the larger the separation between
the classes.

6. Which of the following models includes a backwards elimination feature selection routine?
a) MCV
b) MARS
c) MCRS
d) All of the mentioned

Answer: b
Explanation: MARS stands for Multivariate Adaptive Regression Splines.

7. The advantage of using a model-based approach is that it is more closely tied to the model performance.
a) True
b) False

Answer: a
Explanation: Model-based approach is able to incorporate the correlation structure between the predictors into the importance
calculation.

8. Which of the following model sums the importance over each boosting iteration?
a) Boosted trees
b) Bagged trees
c) Partial least squares
d) None of the mentioned

Answer: a
Explanation: gbm package can be used here.

9. Which of the following argument is used to set importance values?

a) scale
b) set
c) value
d) all of the mentioned

Answer: a
Explanation: All measures of importance are scaled to have a maximum value of 100.

10. For most classification models, each predictor will have a separate variable importance for each class.
a) True
b) False

Answer: a
Explanation: The exceptions are classification trees, bagged trees and boosted trees.

1. Which of the following is the valid component of the predictor?

a) data
b) question
c) algorithm
d) all of the mentioned
Answer: d
Explanation: A prediction is a statement about the future.

2. Point out the wrong statement.

a) In Sample Error is also called generalization error
b) Out of Sample Error is the error rate you get on the new dataset
c) In Sample Error is also called resubstitution error
d) All of the mentioned

Answer: a
Explanation: Out of Sample Error is also called generalization error.

3. Which of the following is correct order of working?

a) questions->input data ->algorithms
b) questions->evaluation ->algorithms
c) evaluation->input data ->algorithms
d) all of the mentioned

Answer: a
Explanation: Evaluation is done last.

4. Which of the following shows correct relative order of importance?

a) question->features->data->algorithms
b) question->data->features->algorithms
c) algorithms->data->features->question
d) none of the mentioned

Answer: b
Explanation: Garbage in should be equal to garbage out.

5. Point out the correct statement.

a) In Sample Error is the error rate you get on the same dataset used to model a predictor
b) Data have two parts-signal and noise
c) The goal of predictor is to find signal
d) All of the mentioned

Answer: d
Explanation: Perfect in sample prediction can be built.

6. Which of the following is characteristic of best machine learning method?

a) Fast
b) Accuracy
c) Scalable
d) All of the mentioned

Answer: d
Explanation: There is always a trade-off in prediction accuracy.

7. True positive means correctly rejected.

a) True
b) False

Answer: b
Explanation: True positive means correctly identified.

8. Which of the following trade-off occurs during prediction?

a) Speed vs Accuracy
b) Simplicity vs Accuracy
c) Scalability vs Accuracy
d) All of the mentioned

Answer: d
Explanation: Interpretability also matters during prediction.

9. Which of the following expression is true?

a) In sample error < out sample error
b) In sample error > out sample error
c) In sample error = out sample error
d) All of the mentioned

Answer: a
Explanation: Out of sample error is given more importance.

10. Backtesting is a key component of effective trading-system development.

a) True
b) False

Answer: a
Explanation: Backtesting is the process of applying a trading strategy or analytical method to historical data to see how accurately the
strategy or method would have predicted actual results.

1. Which of the following is correct use of cross validation?

a) Selecting variables to include in a model
b) Comparing predictors
c) Selecting parameters in prediction function
d) All of the mentioned

Answer: d
Explanation: Cross-validation is also used to pick type of prediction function to be used.

2. Point out the wrong combination.

a) True negative=correctly rejected
b) False negative=incorrectly rejected
c) False positive=correctly identified
d) All of the mentioned

Answer: c
Explanation: False positive means incorrectly identified.

3. Which of the following is a common error measure?

a) Sensitivity
b) Median absolute deviation
c) Specificity
d) All of the mentioned

Answer: d
Explanation: Sensitivity and specificity are statistical measures of the performance of a binary classification test, also known in
statistics as classification function.
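
As a small illustration with hypothetical labels (1 = condition present, 0 = absent), sensitivity and specificity follow directly from the four confusion-matrix counts:

```python
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # hypothetical true labels
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # hypothetical classifier output

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # correctly identified
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # incorrectly rejected
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # correctly rejected
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # incorrectly identified

sensitivity = tp / (tp + fn)  # share of actual positives that were found
specificity = tn / (tn + fp)  # share of actual negatives that were rejected
accuracy = (tp + tn) / len(pairs)

print(tp, fn, tn, fp)
print(sensitivity, specificity, accuracy)
```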

4. Which of the following is not a machine learning algorithm?

a) SVG
b) SVM
c) Random forest
d) None of the mentioned
Answer: a
Explanation: SVM stands for support vector machine; SVG (Scalable Vector Graphics) is an image format, not a learning algorithm.

5. Point out the wrong statement.

a) ROC curve stands for receiver operating characteristic
b) For time series, data must be in chunks
c) Random sampling must be done with replacement
d) None of the mentioned

Answer: c
Explanation: Random sampling with replacement is the bootstrap.

6. Which of the following is a categorical outcome?

a) RMSE
b) RSquared
c) Accuracy
d) All of the mentioned

Answer: c
Explanation: RMSE stands for Root Mean Squared Error.

7. For k cross-validation, larger k value implies more bias.

a) True
b) False

Answer: b
Explanation: For k cross-validation, larger k value implies less bias.
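
A minimal sketch of k-fold splitting, using a hypothetical helper rather than any particular library. With a larger k each model is trained on a larger share of the data, which is the source of the lower bias; the training sets also overlap more and the test folds get smaller.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Training and test sizes for a hypothetical data set of 20 samples.
for k in (2, 5, 10):
    train, test = next(k_fold_splits(20, k))
    print(k, len(train), len(test))
```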

8. Which of the following method is used for trainControl resampling?

a) repeatedcv
b) svm
c) bag32
d) none of the mentioned

Answer: a
Explanation: repeatedcv stands for repeated cross-validation.

9. Which of the following can be used to create the most common graph types?
a) qplot
b) quickplot
c) plot
d) all of the mentioned

Answer: a
Explanation: qplot() is short for a quick plot.

10. For k cross-validation, smaller k value implies less variance.

a) True
b) False

Answer: a
Explanation: Larger k value implies more variance.

1. Predicting with trees evaluates _____________ within each group of data.

a) equality
b) homogeneity
c) heterogeneity
d) all of the mentioned

Answer: b
Explanation: Predicting with trees is easy to interpret.

2. Point out the wrong statement.

a) Training and testing data must be processed in different way
b) Test transformation would mostly be imperfect
c) The first goal is statistical and second is data compression in PCA
d) All of the mentioned

Answer: a
Explanation: Training and testing data must be processed in same way.

3. Which of the following method options is provided by train function for bagging?
a) bagEarth
b) treebag
c) bagFDA
d) all of the mentioned

Answer: d
Explanation: Bagging can be done using bag function as well.

4. Which of the following is correct with respect to random forest?

a) Random forest are difficult to interpret but often very accurate
b) Random forest are easy to interpret but often very accurate
c) Random forest are difficult to interpret but very less accurate
d) None of the mentioned

Answer: a
Explanation: Random forest is top performing algorithm in prediction.

5. Point out the correct statement.

a) Prediction with regression is easy to implement
b) Prediction with regression is easy to interpret
c) Prediction with regression performs well when linear model is correct
d) All of the mentioned

Answer: d
Explanation: Prediction with regression gives poor performance in non linear settings.

6. Which of the following library is used for boosting generalized additive models?
a) gamBoost
b) gbm
c) ada
d) all of the mentioned

Answer: a
Explanation: Boosting can be used with any subset of classifiers.

7. The principal components are equal to left singular values if you first scale the variables.
a) True
b) False

Answer: b
Explanation: The principal components are equal to the right singular values if you first scale the variables.

8. Which of the following is statistical boosting based on additive logistic regression?
a) gamBoost
b) gbm
c) ada
d) mboost

Answer: c
Explanation: mboost is used for model based boosting.

9. Which of the following is one of the largest boost subclass in boosting?

a) variance boosting
b) gradient boosting
c) mean boosting
d) all of the mentioned

Answer: b
Explanation: R has multiple boosting libraries.

10. PCA is most useful for non linear type models.

a) True
b) False

Answer: b
Explanation: PCA is most useful for linear type models.

1. Which of the following is correct about regularized regression?

a) Can help with bias trade-off
b) Cannot help with model selection
c) Cannot help with variance trade-off
d) All of the mentioned

Answer: a
Explanation: Regularized regression does not perform as well as random forest.

2. Point out the wrong statement.

a) Model based approach may be computationally convenient
b) Model based approach use Bayes theorem
c) Model based approach are reasonably inaccurate on real problems
d) All of the mentioned

Answer: c
Explanation: Model based approach are reasonably accurate on real problems.

3. Which of the following methods are present in caret for regularized regression?
a) ridge
b) lasso
c) relaxo
d) all of the mentioned

Answer: d
Explanation: In caret one can tune over the number of predictors to retain instead of defined values for the penalty.

4. Which of the following method can be used to combine different classifiers?

a) Model stacking
b) Model combining
c) Model structuring
d) None of the mentioned
Answer: a
Explanation: Model ensembling is also used for combining different classifiers.

5. Point out the correct statement.

a) Combining classifiers improves interpretability
b) Combining classifiers reduces accuracy
c) Combining classifiers improves accuracy
d) All of the mentioned

Answer: c
Explanation: You can combine classifier by averaging.

6. Which of the following function provides unsupervised prediction?

a) cl_forecast
b) cl_nowcast
c) cl_precast
d) none of the mentioned

Answer: d
Explanation: The cl_predict function in the clue package provides unsupervised prediction.

7. Model based prediction considers relatively easy version for covariance matrix.
a) True
b) False

Answer: b
Explanation: Model based prediction considers more complicated versions of the covariance matrix.

8. Which of the following is used to assist the quantitative trader in the development?
a) quantmod
b) quantile
c) quantity
d) mboost

Answer: a
Explanation: Quandl package is similar to quantmod.

9. Which of the following function can be used for forecasting?

a) predict
b) forecast
c) ets
d) all of the mentioned

Answer: b
Explanation: Forecasting is the process of making predictions of the future based on past and present data and analysis of trends.

10. Predictive analytics is same as forecasting.

a) True
b) False

Answer: b
Explanation: Predictive analytics goes beyond forecasting.

1. Which of the following project is used for calling R products from web?
a) OpenCPU
b) OpenDisk
c) OpenMem
d) All of the mentioned

Answer: a
Explanation: OpenCPU provides an HTTP API for calling R functions from the web.

2. Point out the wrong statement.

a) Shiny is a platform for creating interactive programs embedded into a web page
b) Shiny is invented by R folks
c) Time required to create data products using shiny is more
d) All of the mentioned

Answer: c
Explanation: Time to create data products is less using shiny.

3. Which of the following statement will install shiny?

a) install.packages("shiny")
b) install.library("shiny")
c) install.lib("shiny")
d) all of the mentioned

Answer: a
Explanation: Shiny applications are automatically “live” in the same way that spreadsheets are live.

4. Which of the following can be done by shiny?

a) Tabbed main panels
b) Editable data tables
c) Dynamic UI
d) All of the mentioned

Answer: d
Explanation: shiny allows users to upload files.

5. Point out the correct statement.

a) A shiny project is a directory containing at least three parts
b) A shiny project is a file containing at least three parts
c) A shiny project is a directory containing only one part
d) none of the mentioned

Answer: d
Explanation: A shiny project is a directory containing at least two parts.

6. Which of the following functions can interrupt execution and can be called continuously?
a) browser()
b) browse()
c) search()
d) all of the mentioned

Answer: a
Explanation: Debugging shiny apps can be difficult.

7. runApp() will run the shiny app and open a browser window.
a) True
b) False

Answer: a
Explanation: runApp() starts the shiny app and, by default in an interactive session, opens it in a browser window.

8. Which of the following functions creates a single checkbox widget?
a) checkboxInput
b) dateInput
c) singleboxInput
d) all of the mentioned

Answer: a
Explanation: Shiny comes with a family of pre-built widgets, each created with a transparently named R function.

9. How many components are involved in shiny?

a) 3
b) 4
c) 5
d) none of the mentioned

Answer: d
Explanation: Shiny apps have two components: a user-interface script and a server script.

10. All of the styled elements are handled through server.R.

a) True
b) False

Answer: b
Explanation: All of the styled elements are handled through ui.R.

1. Which of the following frameworks is compatible with slidify?

a) io2015
b) io2012
c) d3
d) all of the mentioned

Answer: b
Explanation: D3 is a JavaScript library for visualizing data with HTML, SVG, and CSS.

2. Point out the wrong statement.

a) Slidify is created by Ramnath Vaidyanathan
b) Slidify is non-customizable
c) Slidify presentations are just HTML files
d) All of the mentioned

Answer: b
Explanation: Slidify is customizable and extendable.

3. Which of the following statements will load slidify?

a) library(slidify)
b) install.library(slidify)
c) install.load(slidify)
d) all of the mentioned

Answer: a
Explanation: Devtools should be installed in advance.

4. Which of the following will be used to compose the content of the presentation?
a) ui.RMD
b) index.RMD
c) server.RMD
d) all of the mentioned

Answer: b
Explanation: index.RMD is an R markdown document.

5. Point out the correct statement.

a) Slidify allows embedded code chunks
b) Slidify presentation cannot be shared easily
c) Slidify is difficult to use
d) None of the mentioned

Answer: a
Explanation: Slidify allows mathematical formulas as well.

6. Which of the following statements generates an HTML slide deck from index.Rmd?
a) slidify("index.Rmd")
b) lib.slidify("index.Rmd")
c) slidifylib("index.Rmd")
d) all of the mentioned

Answer: a
Explanation: It is a static file, which means that you can open it in your browser locally and it should display fine.

7. The first part of index.Rmd is XML code.

a) True
b) False

Answer: b
Explanation: The first part of index.Rmd is YAML code.

8. Which of the following statements will install slidify from GitHub?

a) install_github('slidify', 'ramnathv')
b) install_github('slidify', 'r')
c) install('slidify', 'ramnathv')
d) all of the mentioned

Answer: a
Explanation: Slidify is not on CRAN.

9. Which of the following elements can be added to slidify?

a) Quiz
b) RCharts
c) Shiny apps
d) All of the mentioned

Answer: d
Explanation: Many interactive elements can be added to slidify.

10. MathJax is a cross-browser JavaScript library that displays mathematical notation in web browsers.
a) True
b) False

Answer: a
Explanation: MathJax uses MathML.

1. Which of the following is the R interface to Google Charts?

a) googleVis
b) googleChart
c) googleDataVis
d) all of the mentioned

Answer: a
Explanation: googleVis allows users to create interactive charts based on data frames.

2. Point out the wrong statement.

a) The plot command does open a graphics device in the modern way
b) Motion Chart is only displayed when hosted on a web server
c) gvisMotionChart is used to create motion chart
d) All of the mentioned

Answer: a
Explanation: The plot command does not open a graphics device in the traditional way.

3. Which of the following creates a Google Gadget based on a Google Visualization Object?
a) createGadget
b) createGoogleGadget
c) newGoogleGadget
d) all of the mentioned

Answer: b
Explanation: createGoogleGadget returns a Google Gadget XML string.

4. Which of the following reads a data.frame and creates text output referring to the Google Visualization API?
a) gvisAnnotatedLine
b) gvisTimeLine
c) gvisAnnotatedTimeLine
d) none of the mentioned

Answer: c
Explanation: An annotated time line is an interactive time series line chart with optional annotations.

5. Point out the correct statement.

a) gvisAnnotationChart returns list of class “gvis” and “list”
b) The gvisAreaChart function reads a data.frame and creates text output referring to the Google Visualization API
c) gvisAreaChart returns list of class “gvis” and “list”
d) All of the mentioned

Answer: d
Explanation: This can be included into a web page, or as a stand-alone page.

6. Which of the following is used for creating interactive tables?

a) gvisGeoChart
b) gvisTable
c) gvisLineChart
d) all of the mentioned

Answer: b
Explanation: gvisLineChart is used for creating line charts.

7. gvisAnnotatedTimeLine returns list of class “gvis” and “list”.

a) True
b) False

Answer: a
Explanation: The chart is rendered within the browser using Flash.

8. The actual chart of gvisBarChart is rendered by the web browser using _________ or VML.
a) JPEG
b) SVG
c) PDF
d) All of the mentioned

Answer: b
Explanation: gvisBarChart reads a data frame.

9. Which of the following is used for creating tree maps?

a) gvisGeoChart
b) gvisTable
c) gvisTreeMap
d) all of the mentioned

Answer: c
Explanation: gvisGeoChart is used for interactive maps.

10. gvisAnnotationChart charts are interactive time series line charts that support annotations.
a) True
b) False

Answer: a
Explanation: Unlike gvisAnnotatedTimeLine, which uses Flash, annotation charts are SVG/VML and should be preferred whenever possible.

1. Which of the following is contained in the NumPy library?

a) n-dimensional array object
b) tools for integrating C/C++ and Fortran code
c) fourier transform
d) all of the mentioned

Answer: d
Explanation: NumPy is the fundamental package for scientific computing with Python.

2. Point out the wrong statement.

a) ipython is an enhanced interactive Python shell
b) matplotlib will enable you to plot graphics
c) rPy provides a lot of scientific routines that work on top of NumPy
d) all of the mentioned

Answer: c
Explanation: SciPy provides a lot of scientific routines that work on top of NumPy.

3. The ________ function returns its argument with a modified shape, whereas the ________ method modifies the array itself.
a) reshape, resize
b) resize, reshape
c) reshape2, resize
d) all of the mentioned

Answer: a
Explanation: If a dimension is given as -1 in a reshaping operation, the other dimensions are automatically calculated.
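A minimal sketch of the difference, assuming NumPy is installed and imported as `np` (the array contents are arbitrary):

```python
import numpy as np

a = np.arange(12)

# reshape returns an array with the requested shape; `a` keeps its own shape.
b = a.reshape(3, 4)

# A dimension given as -1 is calculated from the total number of elements.
c = a.reshape(2, -1)

# The resize method changes the array itself and returns None.
# refcheck=False skips the reference check, which is harmless here because
# the total number of elements does not change.
d = np.arange(12)
d.resize((2, 6), refcheck=False)

print(a.shape, b.shape, c.shape, d.shape)
```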

4. To create sequences of numbers, NumPy provides a function __________ analogous to range that returns arrays instead of lists.
a) arange
b) aspace
c) aline
d) all of the mentioned

Answer: a
Explanation: When arange is used with floating point arguments, it is generally not possible to predict the number of elements obtained.
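When the number of elements matters, `np.linspace` is a common alternative because it takes the element count instead of a step. A small sketch, assuming NumPy is installed; the ranges are arbitrary:

```python
import numpy as np

# With a float step, the element count depends on floating point rounding.
stepped = np.arange(0.0, 1.0, 0.1)

# linspace receives the number of elements explicitly and includes both ends.
spaced = np.linspace(0.0, 1.0, 5)

print(len(stepped), spaced)
```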

5. Point out the correct statement.

a) NumPy's main object is the homogeneous multidimensional array
b) In NumPy, dimensions are called axes
c) NumPy's array class is called ndarray
d) All of the mentioned

Answer: d
Explanation: The number of axes is called rank.

6. Which of the following functions stacks 1D arrays as columns into a 2D array?

a) row_stack
b) column_stack
c) com_stack
d) all of the mentioned

Answer: b
Explanation: column_stack is equivalent to hstack only for 2D arrays.
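A short sketch of how `column_stack` treats 1D and 2D inputs compared with `hstack`, assuming NumPy is installed; the array values are arbitrary:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

cols = np.column_stack((x, y))  # shape (3, 2): each 1D array becomes a column
flat = np.hstack((x, y))        # shape (6,): 1D arrays are joined end to end

# For 2D inputs the two functions give the same result.
m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])
same_for_2d = np.array_equal(np.column_stack((m, n)), np.hstack((m, n)))

print(cols.shape, flat.shape, same_for_2d)
```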

7. ndarray is also known as the alias array.

a) True
b) False

Answer: a
Explanation: numpy.array is not the same as the Standard Python Library class array.array.

8. Which of the following methods creates a new array object that looks at the same data?
a) view
b) copy
c) paste
d) all of the mentioned

Answer: a
Explanation: The copy method makes a complete copy of the array and its data.
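The contrast between the two methods can be sketched as follows, assuming NumPy is installed (values are arbitrary):

```python
import numpy as np

a = np.arange(6)
v = a.view()  # new array object that looks at the same data as `a`
c = a.copy()  # new array object with its own copy of the data

v[0] = 99     # the change is visible through `a`
c[1] = -1     # the change is not visible through `a`

print(a, v.base is a, c.base is None)
```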

9. Which of the following functions can be used to combine different vectors so as to obtain the result for each n-uplet?
a) iid_
b) ix_
c) ixd_
d) all of the mentioned

Answer: b
Explanation: The ix_ function combines different vectors so that the result can be obtained for each n-uplet.
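A brief sketch of `np.ix_`, assuming NumPy is installed. It computes a + b * c for every triplet taken from three vectors; the vectors themselves are arbitrary:

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

# ix_ reshapes each vector so that broadcasting covers every combination.
ax, bx, cx = np.ix_(a, b, c)
result = ax + bx * cx

print(result.shape)
print(result[3, 2, 4], a[3] + b[2] * c[4])
```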

10. ndarray.dataitemSize is the buffer containing the actual elements of the array.
a) True
b) False

Answer: b
Explanation: ndarray.data is the buffer containing the actual elements of the array.

1. Which of the following sets the size of the buffer used in ufuncs?
a) bufsize(size)
b) setsize(size)
c) setbufsize(size)
d) all of the mentioned

Answer: c
Explanation: Adjusting the size of the buffer may therefore alter the speed at which ufunc calculations of various sorts are completed.

2. Point out the wrong statement.

a) A universal function is a function that operates on ndarrays in an element-by-element fashion
b) In NumPy, universal functions are instances of the numpy.ufunction class
c) Many of the built-in functions are implemented in compiled C code
d) All of the mentioned

Answer: b
Explanation: Universal functions are instances of the numpy.ufunc class. ufunc instances can also be produced using the frompyfunc factory function.

3. Which of the following attributes should be used to check the supported combinations of input and output types?
a) .types
b) .type
c) .class
d) all of the mentioned

Answer: a
Explanation: Universal functions in NumPy are flexible enough to have mixed type signatures.

4. Which of the following returns an array of ones with the same shape and type as a given array?
a) all_like
b) ones_like
c) one_alike
d) all of the mentioned

Answer: b
Explanation: The optional output arguments of the function can be used to help you save memory for large calculations.

5. Point out the wrong statement.

a) Each universal function takes array inputs and produces array outputs
b) Broadcasting is used throughout NumPy to decide how to handle disparately shaped arrays
c) The output of the ufunc is necessarily an ndarray, if all input arguments are ndarrays
d) All of the mentioned

Answer: c
Explanation: The output of the ufunc is not necessarily an ndarray, if all input arguments are not ndarrays.

6. Which of the following sets the floating-point error callback function or log object?
a) seterr
b) seterrcall
c) setterstack
d) all of the mentioned

Answer: b
Explanation: seterr sets how floating-point errors are handled.

7. Some ufuncs can take output arguments.

a) True
b) False

Answer: b
Explanation: All ufuncs can take output arguments. If necessary, output will be cast to the data-type of the provided output array.

8. ___________ decomposes the elements of x into mantissa and twos exponent.
a) trunc
b) fmod
c) frexp
d) ldexp

Answer: c
Explanation: frexp decomposes the elements of x into mantissa and twos exponent, whereas the fmod function returns the element-wise remainder of division.
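A quick sketch of these ufuncs, assuming NumPy is installed; the input values are arbitrary:

```python
import numpy as np

x = np.array([8.0, 0.75, 10.0])

# frexp splits each element so that x == mantissa * 2**exponent.
mantissa, exponent = np.frexp(x)

# ldexp is the inverse operation and rebuilds the original values.
rebuilt = np.ldexp(mantissa, exponent)

# fmod returns the element-wise remainder, with the sign of the dividend.
remainder = np.fmod(np.array([7.0, -7.0]), 3.0)

print(mantissa, exponent, rebuilt, remainder)
```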

9. Which of the following functions takes only a single value as input?

a) iscomplex
b) minimum
c) fmin
d) all of the mentioned

Answer: a
Explanation: The iscomplex function returns a bool array that is true where the input element is complex.

10. The array object returned by __array_prepare__ is passed to the ufunc for computation.
a) True
b) False

Answer: a
Explanation: If the class has an __array_wrap__ method, the returned ndarray result will be passed to that method just before passing control back to the caller.

1. When talking to a speech recognition program, the program divides each second of your speech into 100 separate __________
a) Codes
b) Phonemes
c) Samples
d) Words

Answer: c
Explanation: None.

2. Which term is used for describing the judgmental or commonsense part of problem solving?
a) Heuristic
b) Critical
c) Value based
d) Analytical

Answer: a
Explanation: None.

3. Which stage of the manufacturing process has been described as “the mapping of function onto form”?
a) Design
b) Distribution
c) Project management
d) Field service

Answer: a
Explanation: None.

4. Which kind of planning consists of successive representations of different levels of a plan?

a) hierarchical planning
b) non-hierarchical planning
c) project planning
d) all of the mentioned

Answer: a
Explanation: None.

5. What was originally called the “imitation game” by its creator?

a) The Turing Test
b) LISP
c) The Logic Theorist
d) Cybernetics

Answer: a
Explanation: None.

6. Decision support programs are designed to help managers make __________

a) budget projections
b) visual presentations
c) business decisions
d) vacation schedules

Answer: c
Explanation: None.

7. PROLOG is an AI programming language that solves problems with a form of symbolic logic known as predicate calculus. It was developed in 1972 at the University of Marseilles by a team of specialists. Can you name the person who headed this team?
a) Alain Colmerauer
b) Niklaus Wirth
c) Seymour Papert
d) John McCarthy

Answer: a
Explanation: None.

8. Programming a robot by physically moving it through the trajectory you want it to follow is called __________
a) contact sensing control
b) continuous-path control
c) robot vision control
d) pick-and-place control

Answer: b
Explanation: None.

9. To invoke the LISP system, you must enter __________

a) AI
b) LISP
c) CL (Common Lisp)
d) Both LISP and CL

Answer: b
Explanation: None.

10. In LISP, what is the function (list-length <list>)?

a) returns a new list that is equal to <list> by copying the top-level element of <list>
b) returns the length of <list>
c) returns t if <list> is empty
d) all of the mentioned

Answer: b
Explanation: None.

11. ART (Automatic Reasoning Tool) is designed to be used on __________
a) LISP machines
b) Personal computers
c) Microcomputers
d) All of the mentioned

Answer: a
Explanation: None.

12. Which particular generation of computers is associated with artificial intelligence?

a) Second
b) Fourth
c) Fifth
d) Third

Answer: c
Explanation: None.

13. Shaping teaching techniques to fit the learning patterns of individual students is the goal of __________
a) decision support
b) automatic programming
c) intelligent computer-assisted instruction
d) expert systems

Answer: c
Explanation: None.

14. Which of the following functions returns t if the object is a symbol in LISP?
a) (* <object>)
b) (symbolp <object>)
c) (nonnumeric <object>)
d) (constantp <object>)

Answer: b
Explanation: None.

15. The symbols used in describing the syntax of a programming language are __________
a) ()
b) {}
c) “”
d) <>

Answer: d
Explanation: None.

1. Ambiguity may be caused by ______________

a) syntactic ambiguity
b) multiple word meanings
c) unclear antecedents
d) all of the mentioned

Answer: d
Explanation: None.

2. Which company offers the LISP machine considered “the most powerful symbolic processor available”?
a) LMI
b) Symbolics
c) Xerox
d) Texas Instruments

Answer: b
Explanation: None.

3. Which of the following is considered a pivotal event in the history of Artificial Intelligence?
a) 1949, Donald O. Hebb, The Organization of Behavior
b) 1950, Computing Machinery and Intelligence
c) 1956, Dartmouth University Conference Organized by John McCarthy
d) 1961, Computer and Computer Sense

Answer: c
Explanation: None.

4. What are the two subfields of Natural language processing?

a) symbolic and numeric
b) time and motion
c) algorithmic and heuristic
d) understanding and generation

Answer: d
Explanation: None.

5. High-resolution, bit-mapped displays are useful for displaying _____________

a) clearer characters
b) graphics
c) more characters
d) all of the mentioned

Answer: c
Explanation: None.

6. A bidirectional feedback loop links computer modeling with _____________

a) artificial science
b) heuristic processing
c) human intelligence
d) cognitive science

Answer: c
Explanation: None.

7. Which of the following have people traditionally done better than computers?
a) recognizing relative importance
b) finding similarities
c) resolving ambiguity
d) all of the mentioned

Answer: c
Explanation: None.

8. In LISP, the function that evaluates both <variable> and <object> is _____________

a) set
b) setq
c) add
d) eva

Answer: a
Explanation: None.

9. Which type of actuator generates a good deal of power but tends to be messy?
a) electric
b) hydraulic
c) pneumatic
d) both hydraulic & pneumatic

Answer: b
Explanation: None.

10. Research scientists all over the world are taking steps towards building computers with circuits patterned after the complex interconnections existing among the human brain's nerve cells. What name is given to this type of computer?
a) Intelligent computers
b) Supercomputers
c) Neural network computers
d) Smart computers

Answer: c
Explanation: None.

11. The integrated circuit was invented by Jack Kilby of _____________

a) MIT
b) Texas Instruments
c) Xerox
d) All of the mentioned

Answer: b
Explanation: None.

12. People overcome natural language problems by _____________

a) grouping attributes into frames
b) understanding ideas in context
c) identifying with familiar situations
d) both understanding ideas in context & identifying with familiar situations

Answer: d
Explanation: None.

13. The Cedar, BBN Butterfly, Cosmic Cube and Hypercube machine can be characterized as _____________
a) SISD
b) MIMD
c) SIMD
d) MISD

Answer: b
Explanation: None.

14. A series of AI systems, developed by Pat Langley to explore the role of heuristics in scientific discovery is ________
a) RAMD
b) BACON
c) MIT
d) DU

Answer: b
Explanation: None.

1. Nils Nilsson headed a team at SRI that created a mobile robot named _____________
a) Robotics
b) Dedalus
c) Shakey
d) Vax

Answer: c
Explanation: None.

2. An Artificial Intelligence technique that allows computers to understand associations and relationships between objects and events is called _____________
a) heuristic processing
b) cognitive science
c) relative symbolism
d) pattern matching

Answer: c
Explanation: None.

3. The new organization established to implement the Fifth Generation Project is called _____________
a) ICOT (Institute for New Generation Computer Technology)
b) MITI (Ministry of International Trade and Industry)
c) MCC (Microelectronics and Computer Technology Corporation)
d) SCP (Strategic Computing Program)

Answer: a
Explanation: None.

4. What is the field that investigates the mechanics of human intelligence?

a) history
b) cognitive science
c) psychology
d) sociology

Answer: b
Explanation: None.

5. What is the name of the computer program that simulates the thought processes of human beings?
a) Human logic
b) Expert reason
c) Expert system
d) Personal information

Answer: c
Explanation: None.

6. What is the name of the computer program that contains the distilled knowledge of an expert?
a) Database management system
b) Management information System
c) Expert system
d) Artificial intelligence

Answer: c
Explanation: None.

7. Claude Shannon described the operation of electronic switching circuits with a system of mathematical logic called _____________
a) LISP
b) XLISP
c) Neural networking
d) Boolean algebra

Answer: d
Explanation: None.

8. A computer program that contains expertise in a particular domain is called?

a) intelligent planner
b) automatic processor
c) expert system
d) operational symbolizer

Answer: c
Explanation: None.

9. What is the term used for describing the judgmental or commonsense part of problem solving?
a) Heuristic
b) Critical
c) Value based
d) Analytical

Answer: a
Explanation: None.

10. What was originally called the “imitation game” by its creator?
a) The Turing Test
b) LISP
c) The Logic Theorist
d) Cybernetics

Answer: a
Explanation: None.

11. Decision support programs are designed to help managers make _____________
a) budget projections
b) visual presentations
c) business decisions
d) vacation schedules

Answer: c
Explanation: None.

12. Programming a robot by physically moving it through the trajectory you want it to follow is called _____________
a) contact sensing control
b) continuous-path control
c) robot vision control
d) pick-and-place control

Answer: b
Explanation: None.

1. What is the primary interactive method of communication used by humans?

a) reading
b) writing
c) speaking
d) all of the mentioned

Answer: c
Explanation: None.

2. Elementary linguistic units that are smaller than words are?

a) allophones
b) phonemes
c) syllables
d) all of the mentioned

Answer: d
Explanation: None.

3. In LISP, the atom that stands for “true” is _____________

a) t
b) ml
c) y
d) time

Answer: a
Explanation: None.

4. A mouse device may be _____________

a) electro-chemical
b) mechanical
c) optical
d) both mechanical and optical

Answer: d
Explanation: None.

5. An expert system differs from a database program in that only an expert system _____________
a) contains declarative knowledge
b) contains procedural knowledge
c) features the retrieval of stored information
d) expects users to draw their own conclusions

Answer: b
Explanation: None.

6. Arthur Samuel is linked inextricably with a program that played _____________

a) checkers
b) chess
c) cricket
d) football

Answer: a
Explanation: None.

7. Natural language understanding is used in _____________

a) natural language interfaces
b) natural language front ends
c) text understanding systems
d) all of the mentioned

Answer: d
Explanation: None.

8. Which of the following are examples of software development tools?
a) debuggers
b) editors
c) assemblers, compilers and interpreters
d) all of the mentioned

Answer: d
Explanation: None.

9. Which is the first AI programming language?

a) BASIC
b) FORTRAN
c) IPL (Information Processing Language)
d) LISP

Answer: d
Explanation: None.

10. The Personal Consultant is based on?

a) EMYCIN
b) OPS5+
c) XCON
d) All of the mentioned

Answer: d
Explanation: None.

1. What is Machine learning?

a) The autonomous acquisition of knowledge through the use of computer programs
b) The autonomous acquisition of knowledge through the use of manual programs
c) The selective acquisition of knowledge through the use of computer programs
d) The selective acquisition of knowledge through the use of manual programs

Answer: a
Explanation: Machine learning is the autonomous acquisition of knowledge through the use of computer programs.

2. Which of the following is not a factor that affects the performance of a learner system?
a) Representation scheme used
b) Training scenario
c) Type of feedback
d) Good data structures

Answer: d
Explanation: The factors that affect the performance of a learner system do not include good data structures.

3. Which of the following is not a learning method?

a) Memorization
b) Analogy
c) Deduction
d) Introduction

Answer: d
Explanation: The different learning methods do not include introduction.

4. In language understanding, which of the following is not a level of knowledge?

a) Phonological
b) Syntactic
c) Empirical
d) Logical

Answer: c
Explanation: In language understanding, the levels of knowledge do not include empirical knowledge.

5. Which of the following is not one of the categories of a model of language?

a) Language units
b) Role structure of units
c) System constraints
d) Structural units

Answer: d
Explanation: The categories of a model of language do not include structural units.

6. What is a top-down parser?

a) Begins by hypothesizing a sentence (the symbol S) and successively predicting lower level constituents until individual preterminal symbols are written
b) Begins by hypothesizing a sentence (the symbol S) and successively predicting upper level constituents until individual preterminal symbols are written
c) Begins by hypothesizing lower level constituents and successively predicting a sentence (the symbol S)
d) Begins by hypothesizing upper level constituents and successively predicting a sentence (the symbol S)

Answer: a
Explanation: A top-down parser begins by hypothesizing a sentence (the symbol S) and successively predicting lower level constituents until individual preterminal symbols are written.

7. Which of the following is not a Horn clause?

a) p
b) ¬p ∨ q
c) p → q
d) p → ¬q

Answer: d
Explanation: p → ¬q is equivalent to ¬p ∨ ¬q, which has no positive literal, so it is not a definite (Horn) clause.
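The check can be sketched in a few lines. The representation below (a clause as a list of literal strings, with a leading "~" marking negation) and the function names are illustrative assumptions. Note that textbooks differ: under the "at most one positive literal" definition, ¬p ∨ ¬q still counts as a Horn clause (a goal clause), while the "exactly one positive literal" reading that this question appears to use excludes it.

```python
def count_positive(clause):
    """Number of non-negated literals in a disjunction of literals."""
    return sum(1 for literal in clause if not literal.startswith("~"))

def is_horn(clause):
    """At most one positive literal (includes goal clauses)."""
    return count_positive(clause) <= 1

def is_definite(clause):
    """Exactly one positive literal (the stricter reading)."""
    return count_positive(clause) == 1

clauses = {
    "p": ["p"],
    "p -> q": ["~p", "q"],     # same as ~p v q
    "p -> ~q": ["~p", "~q"],   # same as ~p v ~q
    "p v q": ["p", "q"],
}

for name, clause in clauses.items():
    print(name, is_horn(clause), is_definite(clause))
```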

8. The action ‘STACK(A, B)’ of a robot arm specifies to _______________

a) Place block B on Block A
b) Place blocks A, B on the table in that order
c) Place blocks B, A on the table in that order
d) Place block A on block B

Answer: d
Explanation: The action ‘STACK(A,B)’ of a robot arm specifies to place block A on block B.

1. How many terms are required for building a Bayes model?

a) 1
b) 2
c) 3
d) 4

Answer: c
Explanation: The three required terms are one conditional probability and two unconditional probabilities.
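As a small worked sketch, Bayes' rule combines exactly those three terms: P(H | E) = P(E | H) * P(H) / P(E). The function name and the probability values below are invented for illustration.

```python
def bayes_posterior(p_e_given_h, p_h, p_e):
    """P(H | E) from one conditional and two unconditional probabilities."""
    if p_e <= 0:
        raise ValueError("P(E) must be positive")
    return p_e_given_h * p_h / p_e

# Hypothetical numbers: P(E | H) = 0.8, P(H) = 0.1, P(E) = 0.2
posterior = bayes_posterior(0.8, 0.1, 0.2)
print(posterior)
```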

2. What is needed to make probabilistic systems feasible in the world?

a) Reliability
b) Crucial robustness
c) Feasibility
d) None of the mentioned

Answer: b
Explanation: Model-based knowledge provides the crucial robustness needed to make probabilistic systems feasible in the real world.

3. Where can Bayes' rule be used?

a) Solving queries
b) Increasing complexity
c) Decreasing complexity
d) Answering probabilistic query

Answer: d
Explanation: Bayes' rule can be used to answer probabilistic queries conditioned on one piece of evidence.

4. What does a Bayesian network provide?

a) Complete description of the domain
b) Partial description of the domain
c) Complete description of the problem
d) None of the mentioned

Answer: a
Explanation: A Bayesian network provides a complete description of the domain.

5. How can the entries in the full joint probability distribution be calculated?
a) Using variables
b) Using information
c) Both Using variables & information
d) None of the mentioned

Answer: b
Explanation: Every entry in the full joint probability distribution can be calculated from the information in the network.

6. How can a Bayesian network be used to answer any query?

a) Full distribution
b) Joint distribution
c) Partial distribution
d) All of the mentioned

Answer: b
Explanation: If a Bayesian network is a representation of the joint distribution, then it can answer any query by summing all the relevant joint entries.
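The idea of summing joint entries can be sketched on a toy network A → B, A → C. The structure, the variable names and every probability below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from itertools import product

# Hypothetical network: A is the parent of both B and C.
p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: 0.9, False: 0.2}  # P(B=True | A)
p_c_given_a = {True: 0.5, False: 0.1}  # P(C=True | A)

def joint(a, b, c):
    """One entry of the full joint, built from the network's local tables."""
    pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
    pc = p_c_given_a[a] if c else 1 - p_c_given_a[a]
    return p_a[a] * pb * pc

def query(target, evidence):
    """P(target | evidence), obtained by summing the matching joint entries."""
    numerator = denominator = 0.0
    for values in product([True, False], repeat=3):
        world = dict(zip(("a", "b", "c"), values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(**world)
        denominator += p
        if all(world[k] == v for k, v in target.items()):
            numerator += p
    return numerator / denominator

print(query({"a": True}, {"c": True}))
```

Enumerating every entry grows exponentially with the number of variables, so this only illustrates the principle on a tiny network.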

7. How can the compactness of a Bayesian network be described?

a) Locally structured
b) Fully structured
c) Partial structure
d) All of the mentioned

Answer: a
Explanation: The compactness of a Bayesian network is an example of a very general property of locally structured systems.

8. With which kind of growth in complexity is local structure associated?

a) Hybrid
b) Dependent
c) Linear
d) None of the mentioned

Answer: c
Explanation: Local structure is usually associated with linear rather than exponential growth in complexity.

9. Which condition is used to influence a variable directly by all the others?
a) Partially connected
b) Fully connected
c) Local connected
d) None of the mentioned

Answer: b
Explanation: None.

10. What is the relationship between a node and its predecessors when creating a Bayesian network?
a) Functionally dependent
b) Dependent
c) Conditionally independent
d) Both Conditionally dependent & Dependent

Answer: c
Explanation: The semantics used to derive a method for constructing Bayesian networks lead to the consequence that a node is conditionally independent of its other predecessors, given its parents.

1. A _________ is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
a) Decision tree
b) Graphs
c) Trees
d) Neural Networks

Answer: a
Explanation: Refer to the definition of a decision tree.

2. Decision Tree is a display of an algorithm.

a) True
b) False

Answer: a
Explanation: None.

3. What is a Decision Tree?

a) Flow-Chart
b) Structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test and each leaf node represents a class label
c) Flow-Chart & Structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test and each leaf node represents a class label
d) None of the mentioned

Answer: c
Explanation: Refer to the definition of a decision tree.
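That structure can be sketched with nested dictionaries: internal nodes hold the attribute to test, branches are keyed by the test outcome, and leaves are class labels. The attributes, outcomes and labels below are invented for illustration.

```python
# Hypothetical tree: internal nodes are dicts, leaves are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {True: "stay in", False: "play"},
        },
        "rainy": "stay in",
        "overcast": "play",
    },
}

def classify(node, example):
    """Follow the branch matching each test outcome until a leaf is reached."""
    while isinstance(node, dict):
        outcome = example[node["attribute"]]
        node = node["branches"][outcome]
    return node

print(classify(tree, {"outlook": "sunny", "windy": False}))
```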

4. Decision Trees can be used for Classification Tasks.

a) True
b) False

Answer: a
Explanation: None.

5. Which of the following are Decision Tree nodes?

a) Decision Nodes
b) End Nodes
c) Chance Nodes
d) All of the mentioned

Answer: d
Explanation: None.

6. Decision Nodes are represented by ____________

a) Disks
b) Squares
c) Circles
d) Triangles

Answer: b
Explanation: None.

7. Chance Nodes are represented by __________

a) Disks
b) Squares
c) Circles
d) Triangles

Answer: c
Explanation: None.

8. End Nodes are represented by __________

a) Disks
b) Squares
c) Circles
d) Triangles

Answer: d
Explanation: None.

9. Which of the following are advantages of Decision Trees?

a) Possible Scenarios can be added
b) Use a white box model, if a given result is provided by a model
c) Worst, best and expected values can be determined for different scenarios
d) All of the mentioned

Answer: d
Explanation: None.

1. What is true about Machine Learning?

A. Machine Learning (ML) is a field of computer science

B. ML is a type of artificial intelligence that extracts patterns out of raw data by using an algorithm or method.
C. The main focus of ML is to allow computer systems to learn from experience without being explicitly programmed or requiring human intervention.
D. All of the above


Ans : D

Explanation: All statements are true about Machine Learning.

2. ML is a field of AI consisting of learning algorithms that?

A. Improve their performance

B. At executing some task
C. Over time with experience
D. All of the above


Ans : D

Explanation: ML is a field of AI consisting of learning algorithms that improve their performance (P) at executing some task (T) over time with experience (E).

3. p → ¬q is not a?

A. hack clause
B. horn clause
C. structural clause
D. system clause

Ans : B

Explanation: p → ¬q is not a horn clause.

4. The action _______ of a robot arm specifies to place block A on block B.

A. STACK(A,B)
B. LIST(A,B)
C. QUEUE(A,B)
D. ARRAY(A,B)

Ans : A

Explanation: The action 'STACK(A,B)' of a robot arm specifies placing block A on block B.

5. A __________ begins by hypothesizing a sentence (the symbol S) and successively predicting lower level constituents until individual
preterminal symbols are written.

A. bottom-up parser
B. top parser
C. top-down parser
D. bottom parser

Ans : C

Explanation: A top-down parser begins by hypothesizing a sentence (the symbol S) and successively predicting lower level
constituents until individual preterminal symbols are written.

6. A model of language consists of categories that do not include ________.

A. System Unit
B. structural units.
C. data units
D. empirical units

Ans : B

Explanation: A model of language consists of categories that do not include structural units.

7. Different learning methods do not include?

A. Introduction
B. Analogy
C. Deduction
D. Memorization

Ans : A

Explanation: Different learning methods do not include introduction.

8. Training the model with all the data in one single batch is known as?

A. Batch learning
B. Offline learning
C. Both A and B
D. None of the above

Ans : C

Explanation: In some end-to-end Machine Learning systems, the model must be trained in one go using the whole
available training data. This kind of learning method or algorithm is called batch or offline learning.

9. Which of the following are ML methods?

A. based on human supervision

B. supervised Learning
C. semi-reinforcement Learning
D. All of the above

Ans : A

Explanation: The following are various ML methods based on some broad categories : Based on human supervision, Unsupervised
Learning, Semi-supervised Learning and Reinforcement Learning

10. In Model based learning methods, an iterative process takes place on the ML models that are built based on various model parameters,
called ?

A. mini-batches
B. optimized parameters
C. hyperparameters
D. superparameters

Ans : C

Explanation: In Model based learning methods, an iterative process takes place on the ML models that are built based on various
model parameters, called hyperparameters.
ELL784/EEL709: Introduction to Machine Learning
Minor Test I, Form: A (please write this Form ID on the cover page of your answer script)
Maximum marks: 20

Section 1. Multiple choice questions

Each question may have any number of correct answers, including zero. List all choices you believe to be
correct (1 mark for each correct answer, -0.5 for each incorrect answer). No justification is required.

1. Consider a binary classification problem. Suppose I have trained a model on a linearly separable
training set, and now I get a new labeled data point which is correctly classified by the model, and far
away from the decision boundary. If I now add this new point to my earlier training set and re-train,
in which cases is the learnt decision boundary likely to change?
(a) When my model is a perceptron.
(b) When my model is logistic regression.
(c) When my model is an SVM.
(d) When my model is Gaussian discriminant analysis.
2. When doing least-squares regression with regularisation (assuming that the optimisation can be done
exactly), increasing the value of the regularisation parameter λ
(a) will never decrease the training error.
(b) will never increase the training error.
(c) will never decrease the testing error.
(d) will never increase the testing error.
(e) may either increase or decrease the training error.
(f) may either increase or decrease the testing error.
3. Which of the following points would Bayesians and frequentists disagree on?
(a) The use of a non-Gaussian noise model in probabilistic regression.
(b) The use of probabilistic modelling for regression.
(c) The use of prior distributions on the parameters in a probabilistic model.
(d) The use of class priors in Gaussian Discriminant Analysis.
(e) The idea of assuming a probability distribution over models.
4. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are
relative to the ideal model.)
(a) Models which overfit have a high bias.
(b) Models which overfit have a low bias.
(c) Models which underfit have a high variance.
(d) Models which underfit have a low variance.
5. Which of the following are characteristics of data sampled from a Gaussian distribution?
(a) The sample mean systematically underestimates the true mean.
(b) The sample variance systematically overestimates the true variance.
(c) Both the sample mean and variance are unbiased estimators of the true values.

6. Suppose your model is overfitting. Which of the following is NOT a valid way to try and reduce the
overfitting?
(a) Increase the amount of training data.
(b) Improve the optimisation algorithm being used for error minimisation.
(c) Decrease the model complexity.
(d) Reduce the noise in the training data.
7. You are reviewing papers for the World’s Fanciest Machine Learning Conference, and you see
submissions with the following claims. Which ones would you consider accepting?
(a) My method achieves a training error lower than all previous methods!
(b) My method achieves a test error lower than all previous methods! (Footnote: When
regularisation parameter λ is chosen so as to minimise test error.)
(c) My method achieves a test error lower than all previous methods! (Footnote: When
regularisation parameter λ is chosen so as to minimise cross-validation error.)
(d) My method achieves a cross-validation error lower than all previous methods! (Footnote:
When regularisation parameter λ is chosen so as to minimise cross-validation error.)

Section 2. Numerical questions

Please show all steps in your working fully and clearly, except where indicated otherwise.
8. The Indian Railways have been trialling 2 different machine learning methods which attempt to predict
whether a train will arrive at its final destination on time or not, using a number of input features
corresponding to weather conditions, train priorities, ongoing repair works etc. (for this purpose, ‘on
time’ is defined as no more than 10 minutes after its scheduled time). The methods have been tested
on a common set of 500 train runs, and the results are as follows:

                              Actually on time   Actually late
Method 1 predicted on time          131               155
Method 1 predicted late              19               195
Method 2 predicted on time           82                72
Method 2 predicted late              68               278

Suppose we set up a simple probabilistic model for this as follows: θ is the prior probability of a train
being late; p is the probability of a late prediction from Method 1 if the train is on time (also called
the False Positive Rate (FPR)); and q is the probability of a late prediction from Method 1 if the train
is in fact late (also called the True Positive Rate (TPR)).
(a) Write down the joint likelihood of the data for Method 1, as a function of the three model parameters
θ, p, and q. Obtain maximum likelihood estimates for each of these parameters. [4]
(b) Suppose the loss matrix for this prediction task is defined as follows:

                     Actually on time   Actually late
Predicted on time           0                 1
Predicted late              K                 0

Using the parameter estimates computed above, obtain the expected loss for Method 1 as a function
of K. [2]
(c) Obtain the expected loss for Method 2 as well (you can compute its FPR and TPR directly, without
doing the maximum likelihood derivations again); which is the preferable method? What is the critical
value of K at which this preference changes? [4]

Answer Key for Exam A
Section 1. Multiple choice questions

Section 2. Numerical questions

                              Actually on time   Actually late
Method 1 predicted on time          131               155
Method 1 predicted late              19               195
Method 2 predicted on time           82                72
Method 2 predicted late              68               278

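$L(\theta, p, q) = \theta^{350} (1 - \theta)^{150} \, p^{19} (1 - p)^{131} \, q^{195} (1 - q)^{155}$

(Compute partial derivatives with respect to all parameters and set to 0 to get ML estimates):

$\hat{\theta}_{ML} = \frac{350}{500}, \quad \hat{p}_{ML} = \frac{19}{150}, \quad \hat{q}_{ML} = \frac{195}{350}$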

(b) Suppose the loss matrix for this prediction task is defined as follows:

                     Actually on time   Actually late
Predicted on time           0                 1
Predicted late              K                 0

Using the parameter estimates computed above, obtain the expected loss for Method 1 as a function
of K. [2]

$E[L] = \theta(1 - q) + K(1 - \theta)p$
$= \frac{350}{500} \times \frac{155}{350} + K \times \frac{150}{500} \times \frac{19}{150}$
$= \frac{155}{500} + K \, \frac{19}{500}$
$= \frac{155 + 19K}{500}$

(c) Obtain the expected loss for Method 2 as well (you can compute its FPR and TPR directly, without
doing the maximum likelihood derivations again); which is the preferable method? What is the critical
value of K at which this preference changes? [4]
For Method 2:

$\hat{p}_{ML} = \frac{68}{150}, \quad \hat{q}_{ML} = \frac{278}{350}$

$E[L] = \frac{350}{500} \times \frac{72}{350} + K \times \frac{150}{500} \times \frac{68}{150}$
$= \frac{72}{500} + K \, \frac{68}{500}$
$= \frac{72 + 68K}{500}$

The preferable method is the one with the lower expected loss, which depends on the value of K. Let
the critical value be $K_C$; then we have

$155 + 19K_C = 72 + 68K_C$
$83 = 49K_C$
$K_C = \frac{83}{49}$

If $K > K_C$, then Method 1 is preferable; if $K < K_C$, then Method 2 is preferable.
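
The arithmetic in parts (b) and (c) can be re-checked with a short script. This is only a sketch: the dictionary layout and the helper name `loss_coefficients` are illustrative assumptions, not part of the exam; the counts are copied from the table above.

```python
from fractions import Fraction

# Counts from the table, keyed by the true outcome:
# (predicted on time, predicted late).
COUNTS = {
    "Method 1": {"on_time": (131, 19), "late": (155, 195)},
    "Method 2": {"on_time": (82, 68), "late": (72, 278)},
}


def loss_coefficients(counts):
    """Return (c0, c1) such that the expected loss equals c0 + c1 * K."""
    ok_pred_ok, ok_pred_late = counts["on_time"]
    late_pred_ok, late_pred_late = counts["late"]
    total = ok_pred_ok + ok_pred_late + late_pred_ok + late_pred_late
    theta = Fraction(late_pred_ok + late_pred_late, total)        # P(late)
    p = Fraction(ok_pred_late, ok_pred_ok + ok_pred_late)          # FPR
    q = Fraction(late_pred_late, late_pred_ok + late_pred_late)    # TPR
    # Loss 1 for a missed late train, loss K for a false "late" prediction.
    return theta * (1 - q), (1 - theta) * p


c0_1, c1_1 = loss_coefficients(COUNTS["Method 1"])
c0_2, c1_2 = loss_coefficients(COUNTS["Method 2"])

# Solve c0_1 + c1_1 * K = c0_2 + c1_2 * K for K.
critical_k = (c0_1 - c0_2) / (c1_2 - c1_1)
print(critical_k)  # 83/49
```

Using exact fractions rather than floats keeps the result directly comparable with the hand derivation.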

Clustering VS Classification

MCQ
1. What is the relation between the distance between clusters and the corresponding class
discriminability?
a. proportional
b. inversely-proportional
c. no-relation

Ans: (a)

2. To measure the density at a point, consider

a. sphere of any size
b. sphere of unit volume
c. hyper-cube of unit volume
d. both (b) and (c)

Ans: (d)

3. Agglomerative clustering falls under which type of clustering method?

a. partition
b. hierarchical
c. none of the above

Ans: (b)

4. Indicate which is/are a method of clustering

a. linkage method
b. split and merge
c. both a and b
d. neither a nor b

Ans: (c)

5. K-means and K-medoids are examples of which type of clustering method?

a. Hierarchical
b. partition
c. probabilistic
d. None of the above.

Ans: (b)
6. Unsupervised classification can be termed as
a. distance measurement
b. dimensionality reduction
c. clustering
d. none of the above

Ans: (c)

7. Indicate which one is a method of density estimation

a. Histogram based
b. Branch and bound procedure
c. Neighborhood distance
d. all of the above

Ans: (c)
Linear Algebra

MCQ
1. Which of the properties are true for matrix multiplication
a. Distributive
b. Commutative
c. both a and b
d. neither a nor b

Ans: (a)

2. Which of the operations can be valid with two matrices of different sizes?
a. addition
b. subtraction
c. multiplication
d. Division

Ans: (c)

3. Which of the following statements are true?

a. trace(A)=trace(A’)
b. det(A)=det(A’)
c. both a and b
d. neither a nor b

Ans: (c)

4. Which property ensures that inverse of a matrix exists?

a. determinant is non-zero
b. determinant is zero
c. matrix is square
d. trace of matrix is positive value.

Ans: (a)

5. Identify the correct order of general to specific matrix?

a. square->identity->symmetric->diagonal
b. symmetric->diagonal->square->Identity
c. square->diagonal->Identity->symmetric
d. square->symmetric->diagonal->identity

Ans: (d)

6. Which of the statements are true?

a. If A is a symmetric matrix, inv(A) is also symmetric
b. det(inv(A)) = 1/det(A)
c. If A and B are invertible matrices, AB is an invertible matrix too.
d. all of the above

Ans: (d)

7. Which of the following options hold true?

a. inv(inv(A)) = A
b. inv(kA)=inv(A)/k
c. inv(A’) = inv(A)’
d. all of the above

Ans: (d)
Eigenvalues and Eigenvectors

MCQ
1. The Eigenvalues of the matrix $\begin{bmatrix} 2 & 7 \\ -1 & -6 \end{bmatrix}$ are
a. 3 and 0
b. -2 and 7
c. -5 and 1
d. 3 and -5

Ans: (c)
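
As a quick check of this answer, the eigenvalues of a 2 x 2 matrix follow from the characteristic equation λ² − trace·λ + det = 0. The helper below is a minimal sketch (the function name is an assumption) that handles only the real-eigenvalue case.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]], smallest first."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex for this matrix")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2


low, high = eigenvalues_2x2(2, 7, -1, -6)
print(low, high)  # -5.0 1.0
```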

2. The Eigenvalues of $\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ are
a. -1, 1 and 2
b. 1, 1 and -2
c. -1, -1 and 2
d. 1, 1 and 2

Ans: (c)

3. The Eigenvectors of $\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ are
a. (1 1 1), (1 0 1) and (1 1 0)
b. (1 1 -1), (1 0 -1) and (1 1 0)
c. (-1 1 -1), (1 0 1) and (1 1 0)
d. (1 1 1), (-1 0 1) and (-1 1 0)

Ans: (d)
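
The answers to questions 2 and 3 can be verified together by testing A v = λ v directly. In the sketch below, the pairing of each vector from option (d) with an eigenvalue from question 2 is worked out here and is not stated in the quiz; the helper names are illustrative.

```python
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def is_eigenpair(matrix, vector, lam):
    """True if matrix @ vector equals lam * vector exactly."""
    return matvec(matrix, vector) == [lam * x for x in vector]


# (eigenvector, eigenvalue) pairs: option (d) of question 3 with the
# eigenvalues -1, -1 and 2 from question 2.
pairs = [((1, 1, 1), 2), ((-1, 0, 1), -1), ((-1, 1, 0), -1)]
checks = [is_eigenpair(A, v, lam) for v, lam in pairs]
print(checks)  # [True, True, True]
```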

4. Indicate which of the statements are true?

a. A and A*A have same Eigenvectors
b. If m is an Eigenvalue of A, then m^2 is an Eigenvalue of A*A.
c. both a and b
d. neither a nor b

Ans: (c)

5. Indicate which of the statements are true?

a. If m is an Eigenvalue of A, then m is an Eigenvalue of A’
b. If m is an Eigenvalue of A, then 1/m is the Eigenvalue of inv(A)
c. both a and b
d. neither a nor b

Ans: (c)

6. Indicate which of the statements are true?

a. A singular matrix must have a zero Eigenvalue
b. A singular matrix must have a negative Eigenvalue
c. A singular matrix must have a complex Eigenvalue
d. All of the above

Ans: (a)
Vector Spaces

MCQ
1. Which of these is a vector space?
a. $\{(x\ y\ z\ w)' \in R^4 \mid x + y - z + w = 0\}$
b. $\{(x\ y\ z)' \in R^3 \mid x + y + z = 0\}$
c. $\{(x\ y\ z)' \in R^3 \mid x^2 + y^2 + z^2 = 1\}$
d. $\left\{ \begin{bmatrix} a & 1 \\ b & c \end{bmatrix} \mid a, b, c \in R \right\}$
Ans: (a)

2. Under which of the following operations {(x, y)|x, y ∈ R} is a vector space?

a. $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$ and $r \cdot (x, y) = (rx, y)$
b. $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$ and $r \cdot (x, y) = (rx, 0)$
c. both a and b
d. neither a nor b

Ans: (d)

3. Which of the following statements are true?

a. $r \cdot \vec{v} = \vec{0}$, if and only if $r = 0$
b. $r_1 \cdot \vec{v} = r_2 \cdot \vec{v}$, if and only if $r_1 = r_2$
c. set of all matrices under usual operations is not a vector space
d. all of the above

Ans: (d)

4. What is the dimension of the subspace $H = \left\{ \begin{bmatrix} a - 3b + 6c \\ 5a + 4d \\ b - 2c - d \\ 5d \end{bmatrix} : a, b, c, d \in R \right\}$?
a. 1
b. 2
c. 3
d. 4

Ans: (c)

5. What is the rank of the matrix $\begin{bmatrix} 2 & -1 & 1 & -6 & 8 \\ 1 & -2 & -4 & 3 & -2 \\ -7 & 8 & 10 & 3 & -10 \\ 4 & -5 & -7 & 0 & 4 \end{bmatrix}$?
a. 2
b. 3
c. 4
d. 5

Ans: (a)
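
Questions 4 and 5 both reduce to a rank computation. The sketch below uses plain Gaussian elimination over exact fractions; `matrix_rank` is a hypothetical helper written for this check, and the rows of `H_GENERATORS` are the coefficient vectors of a, b, c and d read off from the definition of H.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot row with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


# Question 5: the 4 x 5 matrix.
M = [[2, -1, 1, -6, 8],
     [1, -2, -4, 3, -2],
     [-7, 8, 10, 3, -10],
     [4, -5, -7, 0, 4]]

# Question 4: H is spanned by the coefficient vectors of a, b, c, d.
H_GENERATORS = [[1, 5, 0, 0],
                [-3, 0, 1, 0],
                [6, 0, -2, 0],
                [0, 4, -1, 5]]

print(matrix_rank(M), matrix_rank(H_GENERATORS))  # 2 3
```

The dimension of H is 3 rather than 4 because the generator for c is −2 times the generator for b.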

6. If v1, v2, v3, v4 are in $R^4$ and v3 is not a linear combination of v1, v2, v4, then {v1, v2, v3, v4}
must be linearly independent.
a. True
b. False

Ans: (b). For example, if v4 = v1 + v2, then 1v1 + 1 v2 + 0 v3 - 1 v4 = 0.

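7. The vectors $x_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $x_2 = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$, $x_3 = \begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix}$ are: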
a. Linearly dependent
b. Linearly independent

Ans: (a). Because 2x1 + x2 -x3 = 0.
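
The stated relation can be confirmed numerically, and a zero determinant gives an independent check of dependence. A minimal sketch follows; `det3` is an illustrative helper using cofactor expansion along the first row.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


x1, x2, x3 = (1, 1, 1), (1, -1, 2), (3, 1, 4)

# The combination given in the answer: 2*x1 + x2 - x3.
combo = tuple(2 * p + q - r for p, q, r in zip(x1, x2, x3))
det = det3([x1, x2, x3])
print(combo, det)  # (0, 0, 0) 0
```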

8. The vectors $x_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $x_2 = \begin{bmatrix} -5 \\ 3 \end{bmatrix}$ are:
a. Linearly dependent
b. Linearly independent

Ans: (b).
Rank and SVD

MCQ
1. The number of non-zero rows in an echelon form is called?

a. reduced echelon form

b. rank of a matrix
c. conjugate of the matrix
d. cofactor of the matrix

Ans: (b)

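2. Let A and B be arbitrary m x n matrices. Then which one of the following statements is true?
a. $\mathrm{rank}(A + B) \le \mathrm{rank}(A) + \mathrm{rank}(B)$
b. $\mathrm{rank}(A + B) < \mathrm{rank}(A) + \mathrm{rank}(B)$
c. $\mathrm{rank}(A + B) \ge \mathrm{rank}(A) + \mathrm{rank}(B)$
d. $\mathrm{rank}(A + B) > \mathrm{rank}(A) + \mathrm{rank}(B)$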
Ans: (a)

3. The rank of the matrix $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ is

a. 0
b. 2
c. 1
d. 3

Ans: (a)

4. The rank of $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$ is

a. 3
b. 2
c. 1
d. 0

Ans: (c)
5. Consider the following two statements:
I. The maximum number of linearly independent column vectors of a matrix A is called the rank of A.
II. If A is an n x n square matrix, it will be nonsingular if rank A = n.
With reference to the above statements, which of the following applies?

a. Both the statements are false

b. Both the statements are true
c. I is true but II is false.
d. I is false but II is true

Ans: (b)

6. The rank of a 3 x 3 matrix C (= AB), found by multiplying a non-zero column matrix A of size
3 x 1 and a non-zero row matrix B of size 1 x 3, is

a. 0
b. 1
c. 2
d. 3

Ans: (b)

7. Find the singular values of the matrix $B = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

a. 2 and 4
b. 3 and 4
c. 2 and 3
d. 3 and 1

Ans: (d)
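
The singular values of B are the square roots of the eigenvalues of BᵀB. The sketch below applies that definition to a 2 x 2 matrix using only the standard library; the function name is an assumption made for illustration.

```python
import math


def singular_values_2x2(m):
    """Singular values of a 2 x 2 matrix, largest first."""
    (a, b), (c, d) = m
    # Entries of the symmetric Gram matrix G = M^T M.
    g11 = a * a + c * c
    g12 = a * b + c * d
    g22 = b * b + d * d
    trace = g11 + g22
    det = g11 * g22 - g12 * g12
    root = math.sqrt(trace * trace - 4 * det)
    largest = (trace + root) / 2
    smallest = max((trace - root) / 2, 0.0)  # guard against rounding below 0
    return math.sqrt(largest), math.sqrt(smallest)


s1, s2 = singular_values_2x2([[1, 2], [2, 1]])
print(s1, s2)  # 3.0 1.0
```

Here BᵀB = [[5, 4], [4, 5]] has eigenvalues 9 and 1, which gives the singular values 3 and 1.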

8. The “Gram-Schmidt” process involves factorizing a matrix as a multiplication of two matrices

a. One is Orthogonal and the other one is upper-triangular

b. Both are symmetric
c. One is symmetric and the other one is anti-symmetric
d. One is diagonal and the other one is symmetric
Ans: (a)

9. SVD is defined as $A = U \Sigma V^T$, where $U$ consists of Eigenvectors of

a. $AA^T$
b. $A^T A$
c. $AA^{-1}$
d. $A*A$

Ans: (a)

10. SVD is defined as $A = U \Sigma V^T$, where $\Sigma$ is:

a. diagonal matrix having singular values

b. diagonal matrix having arbitrary values
c. identity matrix
d. non diagonal matrix

Ans: (a)
Normal Distribution and Decision Boundary I

MCQ
1. Three components of Bayes decision rule are class prior, likelihood and …
a. Evidence
b. Instance
c. Confidence
d. Salience

Ans: (a)

2. Gaussian function is also called … function

a. Bell
b. Signum
c. Fixed Point
d. Quintic

Ans: (a)

3. The span of the Gaussian curve is determined by the …. of the distribution

a. Mean
b. Mode
c. Median
d. Variance

Ans: (d)

4. When the value of the data is equal to the mean of the distribution to which it belongs, the
Gaussian function attains its … value
a. Minimum
b. Maximum
c. Zero
d. None of the above

Ans: (b)

5. The full width of the Gaussian function at half the maximum is

a. 2.35σ
b. 1.5σ
c. 0.5σ
d. 0.355σ

Ans: (a)
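
The factor 2.35 follows from solving for the two points where the density drops to half its peak. A short derivation, assuming the usual univariate Gaussian density with mean μ and standard deviation σ:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}

f(x) = \tfrac{1}{2} f(\mu)
\;\Longrightarrow\; (x-\mu)^2 = 2\sigma^2 \ln 2
\;\Longrightarrow\; x = \mu \pm \sigma\sqrt{2\ln 2}

\mathrm{FWHM} = 2\sigma\sqrt{2\ln 2} \approx 2.355\,\sigma
```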
6. Property of correlation coefficient is
a. $-1 \le \rho_{xy} \le 1$
b. $-0.5 \le \rho_{xy} \le 1$
c. $-1 \le \rho_{xy} \le 1.5$
d. $-0.5 \le \rho_{xy} \le 0.5$

Ans: (a)

7. The correlation coefficient can be viewed as the … of the angle between two vectors in $R^D$

a. Sin
b. Cos
c. Tan
d. Sec

Ans: (b)

8. For n-dimensional data, the number of correlation coefficients is equal to

a. $^nC_2$
b. n-1
c. $n^2$
d. log(n)

Ans: (a)

9. Iso-contour lines of smaller radius depict a …. value of the density function

a. Higher
b. Lower
c. Equal
d. None of the above

Ans: (a)
Normal Distribution and Decision Boundary II

MCQ
1. If the covariance matrix is strictly diagonal with equal variances, then the iso-contour lines (data
scatter) of the data resemble
a. Concentric circle
b. Ellipse
c. Oriented Ellipse
d. None of the above

Ans: (a)

2. Nature of the decision boundary is determined by

a. Decision Rule
b. Decision boundary
c. Discriminant function
d. None of the above

Ans: (c)

3. In Supervised learning, class labels of the training samples are

a. Known
b. Unknown
c. Doesn’t matter
d. Partially known

Ans: (a)

4. If learning is online, then it is called

a. Supervised
b. Unsupervised
c. Semi-supervised
d. None of the above

Ans: (b)

5. In supervised learning, the process of learning is

a. Online
b. Offline
c. Partially online and offline
d. Doesn’t matter

Ans: (b)
6. For spiral data the decision boundary will be
a. Linear
b. Non-linear
c. Does not exist

Ans: (b)

7. In a 2-class problem, if the discriminant function satisfies $g_1(x) = g_2(x)$, then the data point
lies
a. On the DB
b. Class 1’s side
c. Class 2’s side
d. None of the above

Ans: (a)
Bayes Theorem

MCQ
1. $P(\vec{X}) P(w_i \mid \vec{X}) =$
a. $P(1 - \vec{X}) P(w_i \mid \vec{X})$
b. $P(\vec{X}) P(1 - w_i \mid \vec{X})$
c. $P(\vec{X} \mid w_i) P(w_i)$
d. $P(\vec{X} - w_i) P(w_i \mid \vec{X})$

Ans: (c)

2. In Bayes Theorem, unconditional probability is called as

a. Evidence
b. Likelihood
c. Prior
d. Posterior

Ans: (a)

3. In Bayes Theorem, Class conditional probability is called as

a. Evidence
b. Likelihood
c. Prior
d. Posterior

Ans: (b)

4. When the covariance term in the Mahalanobis distance becomes the identity matrix, the distance is similar
to
a. Euclidean distance
b. Manhattan distance
c. City block distance
d. Geodesic distance

Ans: (a)

5. The decision boundary for an N-dimensional (N>3) data will be a

a. Point
b. Line
c. Plane
d. Hyperplane

Ans: (d)
6. Bayes error is the ….. bound of probability of classification error.
a. Lower
b. Upper

Ans: (a)

7. Bayes decision rule is the theoretically …….. classifier that minimizes the probability of classification
error.
a. Best
b. Worst
c. Average

Ans: (a)
Linear Discriminant Function and Perceptron Learning

MCQ
1. A perceptron is:
a. a single McCulloch-Pitts neuron
b. an autoassociative neural network
c. a double layer autoassociative neural network
d. All the above
Ans: (a)
2. Perceptron is used as a classifier for
a. Linearly separable data
b. Non-linearly separable data
c. Linearly non-separable data
d. Any data
Ans: (a)
3. A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with the
constant of proportionality being equal to 2. The inputs are 4, 10, 5 and 20 respectively.
The output will be:
a. 238
b. 76
c. 119
d. 178
Ans: (a)
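
The output in question 3 is the weighted sum of the inputs scaled by the constant of proportionality. A minimal sketch, in which the function name and the `gain` parameter name are illustrative:

```python
def linear_neuron(weights, inputs, gain):
    """Output of a neuron with a linear transfer function f(a) = gain * a."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return gain * activation


output = linear_neuron([1, 2, 3, 4], [4, 10, 5, 20], gain=2)
print(output)  # 238
```

The weighted sum is 4 + 20 + 15 + 80 = 119, and doubling it gives 238.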

4. Consider a perceptron for which the training sample $u \in R^2$ and the activation function is

$f(a) = \begin{cases} 1 & \text{for } a > 0 \\ 0 & \text{for } a = 0 \\ -1 & \text{for } a < 0 \end{cases}$

[Figure: perceptron with inputs $u_1$, $u_2$ and output $f(a)$]

Let the desired output (y) be 1 when elements of class A = {(1,2),(2,4),(3,3),(4,4)} are
applied as input and let it be -1 for the class B = {(0,0),(2,3),(3,0),(4,2)}. Let the initial
connection weights be $w_0(0) = +1$, $w_1(0) = -2$, $w_2(0) = +1$ and the learning rate be η = 0.5.
This perceptron is to be trained by the perceptron convergence procedure, for which the
weight update formula is $w(t + 1) = w(t) + \eta (y - f(a)) u$, where $f(a)$ is the actual
output.

A. If u = (4,4) is applied as input, then w(1)=?

a. $[2,2,5]^T$
b. $[2,1,5]^T$
c. $[2,1,1]^T$
d. $[2,0,5]^T$

Ans: (a)

B. If (4,2) is then applied, what will be w(2)

a. $[1,-2,3]^T$
b. $[-1,-2,3]^T$
c. $[1,-2,-3]^T$
d. $[1,2,3]^T$

Ans: (a)
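
Both updates in question 4 can be reproduced by applying the weight update formula twice. This sketch assumes the bias weight w0 is paired with a constant input of 1, which the question implies but does not state; the helper names are illustrative.

```python
def f(a):
    """Three-level activation from the question: 1, 0 or -1."""
    if a > 0:
        return 1
    if a < 0:
        return -1
    return 0


def perceptron_step(w, u, y, eta):
    """One update w(t+1) = w(t) + eta * (y - f(a)) * u, with bias input 1."""
    x = (1,) + tuple(u)  # assumed: constant input 1 for the bias weight w0
    a = sum(wi * xi for wi, xi in zip(w, x))
    error = y - f(a)
    return [wi + eta * error * xi for wi, xi in zip(w, x)]


w0 = [1, -2, 1]
w1 = perceptron_step(w0, (4, 4), y=1, eta=0.5)    # (4,4) is in class A
w2 = perceptron_step(w1, (4, 2), y=-1, eta=0.5)   # (4,2) is in class B
print(w1, w2)  # [2.0, 2.0, 5.0] [1.0, -2.0, 3.0]
```

In the first step the activation is 1 − 8 + 4 = −3, so the output is −1 and the error is 2; in the second it is 2 + 8 + 10 = 20, so the output is 1 and the error is −2.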

5. Perceptron training rule converges, if data is

a. Linearly separable
b. Non-linearly separable
c. Linearly non-separable data
d. Any data
Ans: (a)

6. Is XOR problem solvable using a single perceptron

a. Yes
b. No
c. Can’t say
Ans: (b)

7. Consider a perceptron for which the training sample $u \in R^2$ and the actual output $x \in \{0,1\}$. Let
the desired output be 0 when elements of class A={(2,4),(3,2),(3,4)} are applied as input
and let it be 1 for the class B={(1,0),(1,2),(2,1)}. Let the learning rate η be 0.5 and the initial
connection weights be $w_0 = 0$, $w_1 = 1$, $w_2 = 1$. Answer the following questions:
A. Shall the perceptron convergence procedure terminate if the input patterns from class
A and B are repeatedly applied by choosing a very small learning rate?
a. Yes
b. No
c. Can’t say
Ans: (a). Since Classes are linearly separable.
B. Now add sample (5,2) to class B, what is your answer now, i.e. will it converge or
not?
a. Yes
b. No
c. Can’t say
Ans: (b). After adding the above sample, the classes are no longer linearly separable.
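
One way to see why separability is lost: the class A point (3,2) lies on the line segment between the class B points (1,2) and (5,2), and no line can put a point strictly on one side while both endpoints of a segment through it sit on the other. The sketch below checks that geometric fact; `on_segment` is an illustrative helper, not part of the quiz.

```python
def on_segment(p, a, b):
    """True if 2-D point p lies on the closed segment from a to b."""
    # Cross product is zero exactly when p, a and b are collinear.
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False
    # Projection of p - a onto b - a must fall within the segment length.
    dot = (p[0] - a[0]) * (b[0] - a[0]) + (p[1] - a[1]) * (b[1] - a[1])
    length_sq = (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2
    return 0 <= dot <= length_sq


blocked = on_segment((3, 2), (1, 2), (5, 2))
print(blocked)  # True
```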
Linear and Non-Linear Decision Boundaries

MCQ
1. Decision Boundary in case of same covariance matrix, with identical diagonal elements is :
a. Linear
b. Non-Linear
c. None of the above

Ans: (a)

2. Decision Boundary in case of diagonal covariance matrix, with identical diagonal elements is
given by $W^T (X - X_0) = 0$, where $W$ is given by:
a. $(\mu_k - \mu_l) / \sigma^2$
b. $(\mu_k + \mu_l) / \sigma^2$
c. $(\mu_k^2 + \mu_l^2) / \sigma^2$
d. $(\mu_k + \mu_l) / \sigma$
Ans: (a)

3. Decision Boundary in case of arbitrary covariance matrix but identical for all class is :
a. Linear
b. Non-Linear
c. None of the above

Ans: (a)

4. Decision Boundary in case of arbitrary covariance matrix but identical for all class is given by
$W^T (X - X_0) = 0$, where $W$ is given by:

a. $(\mu_k - \mu_l) / \sigma^2$
b. $\Sigma^{-1} (\mu_k - \mu_l)$
c. $(\mu_k^2 + \mu_l^2) / \sigma^2$
d. $\Sigma^{-1} (\mu_k^2 - \mu_l^2)$

Ans: (b)

5. Decision Boundary in case of arbitrary covariance matrix and also unequal is :

a. Linear
b. Non-Linear
c. None of the above

Ans: (b)
6. Discriminant function in case of arbitrary covariance matrix and all parameters are class
dependent is given by $X^T W_i X + w_i^T X + w_{i0} = 0$, where $W_i$ is given by:

a. $-\frac{1}{2} \Sigma_i^{-1}$
b. $\Sigma_i^{-1} \mu_i$
c. $-\frac{1}{2} \Sigma_i^{-1} \mu_i$
d. $-\frac{1}{4} \Sigma_i^{-1}$
Ans: (a)
PCA

MCQ
1. The tool used to obtain a PCA is
a. LU Decomposition
b. QR Decomposition
c. SVD
d. Cholesky Decomposition

Ans: (c)

2. PCA is used for

a. Dimensionality Enhancement
b. Dimensionality Reduction
c. Both
d. None

Ans: (b)

3. The scatter matrix of the transformed feature vector is given by

a. $\sum_{k=1}^{N} (x_k - \mu)(x_k - \mu)^T$
b. $\sum_{k=1}^{N} (x_k - \mu)^T (x_k - \mu)$
c. $\sum_{k=1}^{N} (\mu - x_k)(\mu - x_k)^T$
d. $\sum_{k=1}^{N} (\mu - x_k)^T (\mu - x_k)$

Ans: (a)

4. PCA is used for

a. Supervised Classification
b. Unsupervised Classification
c. Semi-supervised Classification
d. Cannot be used for classification

Ans: (b)
5. The vectors which correspond to the vanishing singular values of a matrix that span the null
space of the matrix are:
a. Right singular vectors
b. Left singular vectors
c. All the singular vectors
d. None

Ans: (a)

6. If $S$ is the scatter of the data in the original domain, then the scatter of the transformed feature
vectors is given by
a. $S^T$
b. $S$
c. $W S W^T$
d. $W^T S W$

Ans: (d)

7. The largest Eigen vector gives the direction of the

a. Maximum scatter of the data
b. Minimum scatter of the data
c. No such information can be interpreted
d. Second largest Eigen vector which is in the same direction.

Ans: (a)

8. The following linear transform does not have a fixed set of basis vectors:
a. DCT
b. DFT
c. DWT
d. PCA

Ans: (d)

9. The Within Class scatter matrix is given by:

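a. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_k - \mu_i)(x_k - \mu_i)^T$
b. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_k - \mu_i)^T (x_k - \mu_i)$
c. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_i - \mu_k)(x_i - \mu_k)^T$
d. $\sum_{i=1}^{c} \sum_{k=1}^{N} (x_i - \mu_k)^T (x_i - \mu_k)$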

Ans: (a)
10. The Between Class scatter matrix is given by:
a. $\sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T$
b. $\sum_{i=1}^{c} N_i (\mu_i - \mu)^T (\mu_i - \mu)$
c. $\sum_{i=1}^{c} N_i (\mu - \mu_i)(\mu - \mu_i)^T$
d. $\sum_{i=1}^{c} N_i (\mu - \mu_i)^T (\mu - \mu_i)$

Ans: (a)

11. Which of the following is unsupervised technique?

a. PCA
b. LDA
c. Bayes
d. None of the above

Ans: (a)
Linear Discriminant Analysis

MCQ
1. Linear Discriminant Analysis is
a. Unsupervised Learning
b. Supervised Learning
c. Semi-supervised Learning
d. None of the above

Ans: (b)

2. The following property of a within-class scatter matrix is a must for LDA:

a. Singular
b. Non-singular
c. Does not matter
d. Problem-specific

Ans: (b)

3. In Supervised learning, class labels of the training samples are

a. Known
b. Unknown
c. Doesn’t matter
d. Partially known

Ans: (a)

4. The upper bound of the number of non-zero Eigenvalues of $S_w^{-1} S_B$ (C = No. of Classes)
a. C - 1
b. C + 1
c. C
d. None of the above

Ans: (a)

5. If $S_w$ is singular and N<D, its rank is at most (N is total number of samples, D dimension of data, C
is number of classes)
a. N + C
b. N
c. C
d. N - C

Ans: (d)
6. If $S_w$ is singular and N<D the alternative solution is to use (N is total number of samples, D
dimension of data)
a. EM
b. PCA
c. ML
d. Any one of the above

Ans: (b)
GMM

MCQ
1. A method to estimate the parameters of a distribution is
a. Maximum Likelihood
b. Linear Programming
c. Dynamic Programming
d. Convex Optimization

Ans: (a)

2. Gaussian mixtures are also known as

a. Gaussian multiplication
b. Non-linear super-position of Gaussians
c. Linear super-position of Gaussians
d. None of the above

Ans: (c)

3. The mixture coefficients of the GMM add up to

a. 1
b. 0
c. Any value greater than 0
d. Any value less than 0

Ans: (a)

4. The mixture coefficients are

a. Strictly positive
b. Positive
c. Strictly negative
d. Negative

Ans: (b)

5. The mixture coefficients can take a value

a. Greater than zero
b. Greater than 1
c. Less than zero
d. Between zero and 1

Ans: (d)

6. For Gaussian mixture models, parameters are estimated using a closed form solution by
a. Expectation Minimization
b. Expectation Maximization
c. Maximum Likelihood
d. None of the above

Ans: (b)

7. Latent Variable in GMM is also known as:

a. Prior Probability
b. Posterior Probability
c. Responsibility
d. None of the above

Ans: (b,c)

8. A GMM with K Gaussian mixture has K covariance matrices, with dimension:

a. Arbitrary
b. K X K
c. D X D (Dimension of data)
d. N X N (No of samples in the dataset)

Ans: (c)
CS5691: Pattern recognition and machine learning
Quiz - 2
Course Instructor : Prashanth L. A.
Date : Feb-22, 2019 Duration : 40 minutes
Name of the student :
Roll No :
INSTRUCTIONS: For MCQ questions, you do not have to justify the answer. For the rest, provide
proper justification for the answers. Please use rough sheets for any calculations if necessary. Please
DO NOT submit the rough sheets. DO NOT use pencil for writing the answers.

I. Multiple Choice Questions

Note: 1 mark for the correct answer. Only one answer is correct. Please write the choice code a, b, c or d in the
answer box provided.

(1) Let $\{X_1, \ldots, X_n\}$ be i.i.d. samples from $N(\mu, \sigma^2)$, with $\sigma > 0$. Let
$\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. Then, which of the following statements is true?
(a) $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2 = \sum_{i=1}^{n} (X_i - \mu)^2$.
(b) $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2 \le \sum_{i=1}^{n} (X_i - \mu)^2$.
(c) $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2 > \sum_{i=1}^{n} (X_i - \mu)^2$.
(d) An inequality/equality relating $\sum_{i=1}^{n} (X_i - \hat{\mu}_n)^2$ and $\sum_{i=1}^{n} (X_i - \mu)^2$ does not
always hold.

Answer:

(2) Consider a Bayesian estimation problem, with data $\{X_1, \ldots, X_n\}$ i.i.d. from $N(\theta, 1)$,
and a $N(0, 1)$ prior. Letting $S_n = \sum_{i=1}^{n} X_i$, the posterior mean is

(a) $\frac{S_n}{n}$    (b) $\frac{S_n}{n+1}$
(c) $\frac{n S_n}{n+1}$    (d) $\frac{S_n + 1}{n+2}$
Answer:

(3) Let $X \sim \mathrm{Unif}[0, \theta]$. Then, the maximum likelihood estimate of $\theta$, given i.i.d. samples
$\{X_1, \ldots, X_n\}$ is
(a) $\sum_{i=1}^{n} \frac{S_n}{n}$.
(b) $\min_{i=1,\ldots,n} X_i$.
(c) $\max_{i=1,\ldots,n} X_i$.
(d) $\frac{1}{2} (\max_{i=1,\ldots,n} X_i - \min_{i=1,\ldots,n} X_i)$.

Answer:

(4) Suppose that we are trying to fit a linear and a 10th-degree polynomial to data coming
from a cubic function, corrupted by standard Gaussian noise. Let $M_1$ and $M_2$ denote
the models corresponding to the linear and the 10th-degree polynomial. Then,
(a) Bias($M_1$) ≤ Bias($M_2$), Variance($M_1$) ≤ Variance($M_2$).
(b) Bias($M_1$) ≤ Bias($M_2$), Variance($M_1$) ≥ Variance($M_2$).
(c) Bias($M_1$) ≥ Bias($M_2$), Variance($M_1$) ≤ Variance($M_2$).
(d) Bias($M_1$) ≥ Bias($M_2$), Variance($M_1$) ≥ Variance($M_2$).
Answer:

(5) Consider a regression problem, with scalar input X ∈ R, and target Y ∈ R. Suppose
(X, Y) is bivariate normal with non-zero means, positive variances, and non-zero
correlation. Then, the optimal predictor, for the square loss, as a function of X is
(a) Quadratic. (b) Constant.
(c) Linear. (d) None of the above.

Answer:

II. A problem that requires a detailed solution

(1) Consider a distribution over (X, Y ) given by the following assumptions:
Y ∈ {−1, +1}, X ∈ {0, 1}3 .
P (Y = +1) = a, P (Y = −1) = 1 − a,
X|Y = −1 ∼ Bern(θ1 ) × Bern(θ2 ) × Bern(θ3 ),
X|Y = +1 ∼ Bern(τ1 ) × Bern(τ2 ) × Bern(τ3 ).
We have 10 training points from the above distribution, given by the table below.
X1  X2  X3   Y
 1   0   0  +1
 0   1   1  −1
 0   1   0  +1
 1   1   0  +1
 1   1   1  −1
 1   0   0  +1
 1   0   1  +1
 0   0   1  −1
 0   1   1  +1
 0   0   0  −1
i. Give the ML estimates for a, θ1, θ2, θ3, τ1, τ2, τ3. (3 marks)
ii. For all the 8 points in the instance space $\{0, 1\}^3$, give the estimate of the posterior
probability P(Y = +1 | X), and give the prediction that minimises the
mis-classification rate (or the Bayes classifier for the zero-one loss), in the form
of a table with 8 rows. (2 marks)
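
The quiz itself gives no solution for this problem. The sketch below shows one way the requested quantities could be computed from the training table, under two assumptions that are not in the source: each ML estimate is a simple class-conditional frequency, and a posterior of exactly 1/2 is resolved in favour of +1. All helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Training table from the problem statement: (X1, X2, X3, Y).
DATA = [
    (1, 0, 0, +1), (0, 1, 1, -1), (0, 1, 0, +1), (1, 1, 0, +1), (1, 1, 1, -1),
    (1, 0, 0, +1), (1, 0, 1, +1), (0, 0, 1, -1), (0, 1, 1, +1), (0, 0, 0, -1),
]


def bernoulli_mle(rows):
    """Per-feature fraction of ones: the ML estimate of each Bernoulli parameter."""
    return [Fraction(sum(r[j] for r in rows), len(rows)) for j in range(3)]


pos = [r for r in DATA if r[3] == +1]
neg = [r for r in DATA if r[3] == -1]
a = Fraction(len(pos), len(DATA))   # estimate of P(Y = +1)
tau = bernoulli_mle(pos)            # parameters of X | Y = +1
theta = bernoulli_mle(neg)          # parameters of X | Y = -1


def class_likelihood(x, params):
    """Product of independent Bernoulli probabilities for the bits of x."""
    out = Fraction(1)
    for xj, pj in zip(x, params):
        out *= pj if xj == 1 else 1 - pj
    return out


def posterior_pos(x):
    """Estimated P(Y = +1 | X = x) by Bayes' rule."""
    num = a * class_likelihood(x, tau)
    den = num + (1 - a) * class_likelihood(x, theta)
    return num / den


for x in product((0, 1), repeat=3):
    post = posterior_pos(x)
    label = +1 if post >= Fraction(1, 2) else -1  # assumed tie-break: +1
    print(x, post, label)
```

Exact fractions are used so that each row of the printed table can be compared with a hand calculation.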
Ordination, Principal component analysis
Quiz
The goal of quizzes is to help you learn.
Compare your answers to the list of correct answers at the end of the quiz.

Ordination – generalities
1. The primary objective of an ordination of multivariate data is to display the objects in a
diagram where similar objects are together and objects with different characteristics are far apart.
– True, False.
2. Ecologists use multivariate ordination methods such as PCA because the data they want to
display are multivariate. – True, False.
3. An ordination method is a statistical test. – True, False.
Principal component analysis (PCA) – computation
4. Principal component analysis (PCA) can be used with variables of any mathematical types:
quantitative, qualitative, or a mixture of these types. – True, False.
5. Principal component analysis (PCA) requires quantitative multivariate data. – True, False.
6. The sum of the PCA eigenvalues is equal to the sum of the variances of the variables. – True,
False.
7. Variances and covariances can be computed for variables of any mathematical types:
quantitative, qualitative, or a mixture of these types. – True, False.
Variable transformation
8. For variables with physical dimensions (e.g. kg), their variances also have physical
dimensions. – True, False.
9. The variables subjected to PCA must all have the same physical dimensions. – True, False.
10. When the variables have different physical dimensions, they must be made dimensionless by
standardization or ranging before PCA. – True, False.
11. Tables of environmental variables that have different physical dimensions must be
standardized before PCA. – True, False.
12. PCA ordination diagrams are easier to interpret when the distributions of the variables are
symmetrical. – True, False.
13. For community composition data, the Hellinger and chord transformations are appropriate
before PCA. – True, False.

PCA biplots
14. PCA biplots are graphs in which objects and variables (descriptors) are represented together.
– True, False.
15. In PCA, distance biplots (scaling 1) correctly represent the positions of the objects with
respect to one another, projected in 2 dimensions. – True, False.
16. In PCA, correlation biplots (scaling 2) correctly represent the angular relationships among the
variables, projected in 2 dimensions. – True, False.
17. Groups of similar sites can be identified on distance biplots (scaling 1). – True, False.
18. Intercorrelated groups of species can be identified on correlation biplots (scaling 2). – True,
False.
Equilibrium circle of descriptors
19. An equilibrium circle of descriptors can be drawn on PCA distance biplots (scaling 1). –
True, False.
20. An equilibrium circle of descriptors can be drawn on PCA correlation biplots (scaling 2). –
True, False.
Meaningful components, algorithms
21. The most meaningful and interpretable principal components are those that have the largest
eigenvalues. – True, False.
22. The broken-stick model is often used as a null model against which one can assess the
eigenvalues, in order to determine the most important eigenvalues and how many PCA axes one
should examine and plot. – True, False.
23. Eigen decomposition, singular value decomposition (SVD) and iterative search of
eigenvalues and eigenvectors are three different ways of computing PCA. They produce the same
PCA results. – True, False.
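As a rough illustration of the broken-stick model mentioned in question 22, the sketch below computes the expected proportions of a unit stick broken at random into p pieces, b_k = (1/p) * sum_{i=k..p} 1/i, and counts the leading axes whose share of variance exceeds that expectation. The stopping rule shown is one common variant, and the function names and the eigenvalues in the usage note are hypothetical.

```python
from fractions import Fraction

def broken_stick(p):
    """Expected proportions of a unit stick broken at random into p pieces, largest first."""
    return [sum(Fraction(1, i) for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def axes_to_keep(eigenvalues):
    """Count leading axes whose share of total variance exceeds the broken-stick expectation."""
    total = sum(eigenvalues)
    expected = broken_stick(len(eigenvalues))
    kept = 0
    for lam, b in zip(sorted(eigenvalues, reverse=True), expected):
        if lam / total <= b:
            break           # stop at the first axis that does not beat the null model
        kept += 1
    return kept
```

For example, with the made-up eigenvalues [5.0, 1.0, 0.5], only the first axis explains more variance than its broken-stick expectation, so axes_to_keep returns 1.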

Correct answers to the questions about PCA –

1. True
2. True
3. False

4. False
5. True
6. True
7. False

8. True
9. True
10. True
11. True
12. True
13. True

14. True
15. True
16. True
17. True
18. True

19. True
20. False

21. True
22. True
23. True
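Answers 6 and 23 can be checked numerically on a tiny example. The sketch below uses two quantitative variables with made-up values, computes the eigenvalues of their covariance matrix in closed form, and compares the largest one with the result of a simple iterative search (power iteration). It is an illustration under those assumptions, not a full PCA implementation.

```python
import math

def covariance_matrix(x, y):
    """Sample (n - 1) variances and covariance of two quantitative variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return sxx, syy, sxy

def eigenvalues_2x2(sxx, syy, sxy):
    """Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    mean = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    return mean + radius, mean - radius

def power_iteration(sxx, syy, sxy, steps=200):
    """Iterative search for the largest eigenvalue (Rayleigh quotient of the final iterate)."""
    v = (1.0, 0.5)  # arbitrary start vector
    for _ in range(steps):
        w = (sxx * v[0] + sxy * v[1], sxy * v[0] + syy * v[1])
        norm = math.hypot(*w)
        v = (w[0] / norm, w[1] / norm)
    w = (sxx * v[0] + sxy * v[1], sxy * v[0] + syy * v[1])
    return v[0] * w[0] + v[1] * w[1]

# Hypothetical data, chosen only for illustration.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
sxx, syy, sxy = covariance_matrix(x, y)
lam1, lam2 = eigenvalues_2x2(sxx, syy, sxy)

if __name__ == "__main__":
    print("sum of variances  :", sxx + syy)
    print("sum of eigenvalues:", lam1 + lam2)
    print("largest eigenvalue, closed form vs iterative:", lam1, power_iteration(sxx, syy, sxy))
```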
Sample questions for “Fundamentals of Machine Learning 2018”
Teacher: Mohammad Emtiyaz Khan

Important information:

• In the final exam, no electronic devices are allowed except a calculator. Make
sure that your calculator is only a calculator and cannot be used for any other
purpose.

• No documents allowed apart from one A4 sheet of your own notes.

• You are not allowed to talk to others.

• For derivations, clearly explain your derivation step by step. In the final
exam you will be marked for steps as well as for the end result.

• For multiple-choice questions, you also need to provide explanations. You
will be marked for your answer as well as for your explanations.

• We will denote the output data vector by $\mathbf{y}$, which is a vector that contains
all $y_n$, and the feature matrix by $\mathbf{X}$, which is a matrix containing the features
$\mathbf{x}_n^T$ as rows. Also, $\tilde{\mathbf{x}}_n = [1, \mathbf{x}_n^T]^T$.

• N denotes the number of data points and D denotes the dimensionality.

1 Multiple-Choice/Numerical Questions
1. Choose the options that are correct regarding machine learning (ML) and
artificial intelligence (AI),

(A) ML is an alternate way of programming intelligent machines.

(B) ML and AI have very different goals.
(C) ML is a set of techniques that turns a dataset into a software.
(D) AI is a software that can emulate the human mind.

Answer: (A), (C), (D)

2. Which of the following sentences is FALSE regarding regression?

(A) It relates inputs to outputs.

(B) It is used for prediction.
(C) It may be used for interpretation.
(D) It discovers causal relationships.

Answer: (D)

3. What is the rank of the following matrix?

 
$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{1}$$

Answer: 1

4. What is the dimensionality of the null space of the following matrix?

 
$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{2}$$

Answer: 2

5. What is the dimensionality of the null space of the following matrix?

 
$$A = \begin{bmatrix} 3 & 2 & -9 \\ -6 & -4 & 18 \\ 12 & 8 & -36 \end{bmatrix} \tag{3}$$

Answer: 2
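The answers to questions 3 to 5 can be verified with a small exact-arithmetic routine. The sketch below computes the rank by Gaussian elimination and obtains the null-space dimension from the rank-nullity theorem (number of columns minus rank). The function and variable names are illustrative only.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

def nullity(matrix):
    """Dimension of the null space, by the rank-nullity theorem."""
    return len(matrix[0]) - rank(matrix)

ONES = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]            # matrix in equations (1) and (2)
A3 = [[3, 2, -9], [-6, -4, 18], [12, 8, -36]]       # matrix in equation (3)

if __name__ == "__main__":
    print(rank(ONES), nullity(ONES), nullity(A3))
```

Every row of both matrices is a multiple of the first row, so each has rank 1 and a two-dimensional null space.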

6. For the one-parameter model, mean-square error (MSE) is defined as follows:
$\frac{1}{2N} \sum_{n=1}^{N} (y_n - \beta_0)^2$. We have a half term in the front because,

(A) scaling MSE by half makes gradient descent converge faster.

(B) presence of half makes it easy to do grid search.
(C) it does not matter whether half is there or not.
(D) none of the above

Answer: C

7. Grid search is,

(A) Linear in D.
(B) Polynomial in D.
(C) Exponential in D.
(D) Linear in N .

Answer: C,D

8. The advantage of Grid search is (are),

(A) It can be applied to non-differentiable functions.

(B) It can be applied to non-continuous functions.
(C) It is easy to implement.
(D) It runs reasonably fast for multiple linear regression.

Answer: A,B,C.
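To make the cost argument in questions 7 and 8 concrete, here is a minimal grid-search sketch. It only ever evaluates the cost function, so it needs neither gradients nor continuity, and it visits every point of the Cartesian grid, so G candidate values per parameter give G^D evaluations, each of which costs O(N) for the MSE. The data and grid below are hypothetical.

```python
from itertools import product

def mse(y, beta0):
    """One-parameter MSE with the 1/(2N) scaling used in question 6."""
    return sum((yn - beta0) ** 2 for yn in y) / (2 * len(y))

def grid_search(cost, grids):
    """Evaluate cost at every point of the Cartesian grid; return (best point, evaluations)."""
    best, best_cost, evaluations = None, float("inf"), 0
    for point in product(*grids):
        evaluations += 1
        c = cost(point)
        if c < best_cost:
            best, best_cost = point, c
    return best, evaluations

y = [1.0, 2.0, 3.0, 6.0]                   # hypothetical outputs
grid = [i / 10 for i in range(0, 61)]      # 61 candidate values in [0, 6]
best, evals = grid_search(lambda p: mse(y, p[0]), [grid])

if __name__ == "__main__":
    print("best beta0:", best[0], "after", evals, "evaluations")
    print("the same grid over D = 3 parameters would need", len(grid) ** 3, "evaluations")
```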

9. Gradient of a continuous and differentiable function

(A) is zero at a minimum

(B) is non-zero at a maximum
(C) is zero at a saddle point
(D) decreases as you get closer to the minimum

Answer: A,C,D

10. Consider a linear-regression model with N = 3 and D = 1 with input-output

11. Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient?

(A) O(D)
(B) O(N )
(C) O(N D)
(D) O(N D2 )

Answer: (A)
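The distinction in question 11 is between computing the gradient and applying it. The sketch below shows both for a linear model, assuming the MSE with the 1/(2N) scaling from question 6: the gradient loops over N rows and D columns, while the update touches each of the D parameters once. The step size alpha and the function names are illustrative.

```python
def mse_gradient(X, y, beta):
    """Gradient of (1/2N) * sum_n (y_n - x_n . beta)^2; visits N rows and D columns, O(ND)."""
    n, d = len(X), len(beta)
    g = [0.0] * d
    for xn, yn in zip(X, y):
        err = yn - sum(xj * bj for xj, bj in zip(xn, beta))
        for j in range(d):
            g[j] -= err * xn[j] / n
    return g

def gradient_step(beta, g, alpha):
    """One update beta <- beta - alpha * g; touches each of the D entries once, O(D)."""
    return [b - alpha * gd for b, gd in zip(beta, g)]

if __name__ == "__main__":
    X = [[1.0, 0.0], [1.0, 2.0]]   # hypothetical feature rows with a leading 1
    y = [1.0, 3.0]
    beta = [0.0, 0.0]
    g = mse_gradient(X, y, beta)
    print(g, gradient_step(beta, g, alpha=0.5))
```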

14. Computational complexity of Gradient descent is,

(A) linear in D
(B) linear in N
(C) polynomial in D
(D) dependent on the number of iterations

Answer: C

16. K-fold cross-validation is

(A) linear in K
(B) quadratic in K
(C) cubic in K
(D) exponential in K

Answer: A
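A minimal sketch of K-fold cross-validation shows where the linear dependence on K comes from: the model is fitted once per fold. For simplicity the model here is the mean-only (one-parameter) model, and the fold layout (contiguous, no shuffling) is an assumption made for illustration. It requires 2 <= K <= N so that every training set is non-empty.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of nearly equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(y, k):
    """K-fold CV for the mean-only model; returns (average validation MSE, number of fits)."""
    folds = k_fold_indices(len(y), k)
    fits, total = 0, 0.0
    for held_out in folds:
        held = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held]
        beta0 = sum(train) / len(train)      # one model fit per fold
        fits += 1
        total += sum((y[i] - beta0) ** 2 for i in held_out) / len(held_out)
    return total / k, fits

if __name__ == "__main__":
    print(cross_validate([1.0, 2.0, 3.0, 4.0], k=2))
    print(cross_validate([1.0, 2.0, 3.0, 4.0], k=4))
```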

(A) High variance

(B) High model bias
(C) High estimation bias
(D) None of the above

Answer: A

18. Adding more basis functions in a linear model... (pick the most probable
option)

(A) Decreases model bias

(B) Decreases estimation bias
(C) Decreases variance

(D) Doesn’t affect bias and variance

Answer: A

2 Multiple-output regression

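$$y_{n1} \approx f_1(\mathbf{x}_n) = \beta_{10} + \beta_{11} x_{n1} + \beta_{12} x_{n2} + \ldots + \beta_{1D} x_{nD} = \boldsymbol{\beta}_1^T \tilde{\mathbf{x}}_n \tag{4}$$
$$y_{n2} \approx f_2(\mathbf{x}_n) = \beta_{20} + \beta_{21} x_{n1} + \beta_{22} x_{n2} + \ldots + \beta_{2D} x_{nD} = \boldsymbol{\beta}_2^T \tilde{\mathbf{x}}_n \tag{5}$$

where $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$ are vectors of $\beta_{1d}$ and $\beta_{2d}$ respectively, for $d = 0, 1, 2, \ldots, D$, and
$\tilde{\mathbf{x}}_n^T = [1 \;\; \mathbf{x}_n^T]$.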

(A) Derive the gradient of L with respect to β 1 and β 2 .

(B) Suppose N = 20 and D = 15. Do we need to regularize? Explain your
answer.

(C) Suppose we increase the number of data points from N = 20 to N = 200.
Should we decrease the value of $\lambda_1$ and $\lambda_2$? Explain your answer.

(D) What is the computation complexity with respect to N and D?

Answer:

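(A) $\frac{\partial L}{\partial \boldsymbol{\beta}_1} := -\sum_{n=1}^{N} \left[ y_{n1} - \boldsymbol{\beta}_1^T \tilde{\mathbf{x}}_n \right] \tilde{\mathbf{x}}_n + \lambda_1 \boldsymbol{\beta}_1$, same for $\boldsymbol{\beta}_2$.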

(D) Same as gradient descent (please put an exact number here for the final
exam).
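The gradient in answer (A) can be sanity-checked numerically. The loss L is not reproduced in this part of the sheet, so the sketch below assumes, for each output, a loss of the form 0.5 * sum of squared residuals + 0.5 * lambda * ||beta||^2 (including the intercept in the penalty, as the lambda_1 beta_1 term suggests). Under that assumption the analytic gradient should agree with central finite differences. The data, the value of lambda, and all function names are hypothetical.

```python
def loss_one_output(beta, X_tilde, y, lam):
    """Assumed per-output loss: 0.5 * sum of squared residuals + 0.5 * lam * ||beta||^2."""
    residuals = [yn - sum(b * x for b, x in zip(beta, xn)) for xn, yn in zip(X_tilde, y)]
    return 0.5 * sum(r * r for r in residuals) + 0.5 * lam * sum(b * b for b in beta)

def grad_one_output(beta, X_tilde, y, lam):
    """Analytic gradient: -sum_n (y_n - beta^T x_n) x_n + lam * beta."""
    g = [lam * b for b in beta]
    for xn, yn in zip(X_tilde, y):
        r = yn - sum(b * x for b, x in zip(beta, xn))
        for j, x in enumerate(xn):
            g[j] -= r * x
    return g

def numeric_grad(beta, X_tilde, y, lam, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    g = []
    for j in range(len(beta)):
        up = beta[:j] + [beta[j] + eps] + beta[j + 1:]
        down = beta[:j] + [beta[j] - eps] + beta[j + 1:]
        diff = loss_one_output(up, X_tilde, y, lam) - loss_one_output(down, X_tilde, y, lam)
        g.append(diff / (2 * eps))
    return g

# Hypothetical data: N = 3 points, D = 1 feature, leading 1 for the intercept.
X_tilde = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y1 = [1.0, 2.0, 2.0]
beta1 = [0.5, -0.25]
analytic = grad_one_output(beta1, X_tilde, y1, lam=0.1)
numeric = numeric_grad(beta1, X_tilde, y1, lam=0.1)

if __name__ == "__main__":
    print(analytic)
    print(numeric)
```

Because the assumed loss is a sum of one term per output, the same two functions apply unchanged to the second output with its own targets and regularization weight.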

3 Eigenvalues

Given a real-valued matrix $\mathbf{X}$, show that all the non-zero eigenvalues of $\mathbf{X}\mathbf{X}^T$ and
$\mathbf{X}^T\mathbf{X}$ are the same.

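Answer: To prove this, you can use the SVD $\mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{V}^T$. Then $\mathbf{X}\mathbf{X}^T = \mathbf{U}\mathbf{S}^2\mathbf{U}^T$
and $\mathbf{X}^T\mathbf{X} = \mathbf{V}\mathbf{S}^2\mathbf{V}^T$ (where $\mathbf{S}^2$ stands for $\mathbf{S}\mathbf{S}^T$ and $\mathbf{S}^T\mathbf{S}$ respectively). The non-zero
eigenvalues are the same, although the number of eigenvalues may differ.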
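A small numerical check can accompany the proof. The sketch below takes a hypothetical 2 x 3 matrix, computes the two eigenvalues of the 2 x 2 product from its trace and determinant, and then verifies that each of them is a root of the characteristic polynomial of the 3 x 3 product, whose remaining eigenvalue is zero. The helper names are illustrative, and the check is not a substitute for the SVD argument.

```python
import math

def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_3x3(M, lam):
    """det(M - lam * I); zero exactly when lam is an eigenvalue of M."""
    shifted = [[M[r][c] - (lam if r == c else 0) for c in range(3)] for r in range(3)]
    return det3(shifted)

X = [[1, 2, 0], [0, 1, 1]]           # hypothetical 2 x 3 example
XXt = matmul(X, transpose(X))        # 2 x 2
XtX = matmul(transpose(X), X)        # 3 x 3

# Eigenvalues of the 2 x 2 matrix from its trace and determinant.
tr = XXt[0][0] + XXt[1][1]
det = XXt[0][0] * XXt[1][1] - XXt[0][1] * XXt[1][0]
disc = math.sqrt(tr * tr - 4 * det)
eig_small = [(tr + disc) / 2, (tr - disc) / 2]

if __name__ == "__main__":
    print("eigenvalues of X X^T:", eig_small)
    print("char. polynomial of X^T X at those values:",
          [char_poly_3x3(XtX, lam) for lam in eig_small])
    print("det(X^T X) =", det3(XtX), "so the extra eigenvalue is 0")
```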

4 Artificial Neural Networks

Figure 1: Artificial neural network

Suppose you have N = 200 data points but M = 200 hidden units for each layer.
What problem(s) are you likely to encounter when training such a network? How
would you solve the problem(s)?

Answer: Overfitting. There are multiple ways to tackle this problem as discussed
in the lecture.
