SEMESTER 1 EXAMINATIONS 2021/2022

MODULE: EE514 - Data Analysis and Machine Learning

PROGRAMME(S):
MECE MEng Electronic & Computer Engineering
MCTY MSc Electronic and Computer Technology
MSAR MSc in Astrophysics and Relativity
GCIOT Grad Cert in the Internet of Things
CAPT PhD-track
GCECE Grad Cert. in Electronic & Computer Eng
EEPT PhD-track
MQTY Qualifier Prog MSc Electronic & Computer
MEQ Masters Engineering Qualifier Course
MEPT PhD-track
MEPD PhD
MECEW MEng In Electronic & Comp Eng. (Wuhan)
CSPT PhD-track
CAPD PhD
YEAR OF STUDY: 1,2,3,C
EXAMINERS:
Dr. Kevin McGuinness 087 6596732
Prof. Roberto Verdone (External)
TIME ALLOWED: 3 hours
INSTRUCTIONS: Answer 4 questions. All questions carry equal marks.

PLEASE DO NOT TURN OVER THIS PAGE UNTIL INSTRUCTED TO DO SO

The use of programmable or text storing calculators is expressly forbidden.
Please note that where a candidate answers more than the required number of questions,
the examiner will mark all questions attempted and then select the highest scoring ones.

There are no additional requirements for this paper.

EE514–Data Analysis and Machine Learning Page 1 of 6

Semester 1 EXAMINATIONS 2021/2022
QUESTION 1 [TOTAL MARKS: 25]
Data summarization

Q 1(a) [5 Marks]
Explain, with the aid of a diagram, what is meant by a mode of a distribution. Explain
the distinction between the mode, the mean, and the median. Give an example of a
distribution where the mean, median, and mode are coincident.
Q 1(b) [5 Marks]
How might you go about estimating the modes of the distribution of a quantitative
(continuous) random variable from a sample? What considerations are important
and what difficulties might be encountered? How could this approach be extended to
estimate the modes of the distribution of a random variable in $\mathbb{R}^D$ from a sample, and what
difficulties might be encountered?
Q 1(c) [5 Marks]
Describe what is meant by the interquartile range (IQR) and explain how it could be
calculated from a sample. Explain a common method for detecting outliers based on the
interquartile range. Should outliers always be discarded before subsequent analysis?
Explain your reasoning.
Q 1(d) [6 Marks]
Suppose that you have a sample $\{(x_i, y_i) \in \mathbb{R}^2\}_{i=1}^N$ of two random variables X and Y
and you suspect that there is an exponential relationship between X and Y of the form

$$Y = \exp(aX + b) + \epsilon,$$

with $\epsilon \sim \mathcal{N}(0, \sigma^2)$. Explain how you might fit the parameters a and b and explain how
you could formally measure the strength of the relationship between X and Y.
Q 1(e) [4 Marks]
Describe a statistic that can be used to measure how heavy the tails of a distribution
are and explain how you could estimate this statistic from a sample.

[End Question 1]

QUESTION 2 [TOTAL MARKS: 25]
Supervised Learning

Q 2(a) [6 Marks]
What is the purpose of validation data in supervised machine learning and how does
it differ from test data? Outline the benefits and drawbacks of using k-fold cross
validation as opposed to a dedicated validation set. Describe a situation where a
dedicated validation set may be preferable to k-fold cross validation.
Q 2(b) [8 Marks]
Explain the training and prediction procedure for a k-nearest neighbour classifier. In
the case of k = 1, what is the training error of such a model? In the case of k = N,
where N is the number of training examples, what will the prediction of the model be?
In general, what happens to classifier bias and variance as k is increased?
Q 2(c) [8 Marks]
How many free parameters are there in a 10-class quadratic discriminant analysis
model if the dimension of the input is 100? Describe TWO assumptions you could
make to reduce the number of free parameters and calculate the number of parameters
in the simplified models.
Q 2(d) [3 Marks]
Explain what is meant by overfitting. How could you tell if a predictive model has
overfit?

[End Question 2]

QUESTION 3 [TOTAL MARKS: 25]
Linear models

Q 3(a) [6 Marks]
Show that putting a zero mean isotropic Gaussian prior $w \sim \mathcal{N}(0, \sigma I)$ on the weights of
a linear model results in a regularization term proportional to the squared 2-norm of the
weights $\|w\|_2^2$ in the loss function.
Q 3(b) [5 Marks]
Linear regression models can be fit in closed form by solving the normal equations:

$$X^\top X w = X^\top y,$$

where $w \in \mathbb{R}^D$ are the unknown weights, $y \in \mathbb{R}^N$ contains the targets $y_i$, and $X \in \mathbb{R}^{N \times D}$
is a data matrix with the training example inputs $x_i$ on the rows. Describe a situation
when you might prefer to fit a linear regression model using an iterative approach like
stochastic gradient descent, rather than fitting it by solving the normal equations, and
explain your reasoning.
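
As a small illustration of the notation above, the following sketch solves the normal equations with NumPy on synthetic data. The data, the random seed, and the function name `fit_normal_equations` are assumptions made for illustration; they are not part of the exam paper, and the sketch assumes that $X^\top X$ is invertible.

```python
import numpy as np

def fit_normal_equations(X, y):
    """Solve X^T X w = X^T y for w (assumes X^T X is invertible)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Synthetic, noise-free data (illustrative assumption): N = 50, D = 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

# With noise-free targets the recovered weights match w_true up to rounding.
w_hat = fit_normal_equations(X, y)
```

The sketch uses `np.linalg.solve` rather than forming an explicit inverse, which is the usual choice for solving a linear system numerically.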
Q 3(c) [14 Marks]
Given a training set $T = \{(x_i, y_i)\}_{i=1}^N$ with $x_i \in \mathbb{R}^{2000}$ and $y_i \in \{0, 1\}$, the loss function
for the L1 regularized logistic regression model can be written as:

$$L(w) = \frac{1}{2N} \sum_{i=1}^{N} l(y_i, \sigma(w^\top x_i + b))^2 + \lambda \|w\|_1,$$

where λ is the regularization hyperparameter, l(y, ŷ) is a per-sample loss function, and
σ(x) is the sigmoid function.

i. Which symbol(s) in the formula above represent the parameters of the model and
how many parameters does the model have?

ii. Explain in your own words why the sigmoid function is used here.

iii. Describe what you expect to happen to the model parameters w as the value of λ
increases.

iv. Describe how you could go about selecting a good value for the λ hyperparameter
in practice.

v. Name and write down the formula for an appropriate per-sample loss function l(y, ŷ)
to use here.

[End Question 3]

QUESTION 4 [TOTAL MARKS: 25]
Unsupervised learning

Q 4(a) [5 Marks]
Write down and explain the k-means objective function (assuming fixed k) and describe
an algorithm for approximate minimization. Is it possible, in general, to exactly minimize
the k-means objective in polynomial time? Explain why or why not.
Q 4(b) [5 Marks]
How many parameters are there in a 6-component Gaussian mixture model in 5
dimensions? Describe an assumption that could be made to reduce the number of
parameters.
Q 4(c) [15 Marks]
Given a sample matrix $X \in \mathbb{R}^{N \times D}$, the principal component analysis algorithm seeks
a transform Q such that the empirical covariance matrix of the transformed variables
Z = XQ is diagonal.

1. What is meant by the empirical covariance matrix here? How would you compute
the empirical covariance matrix for X?

2. Describe how to compute the transformation Q from the empirical covariance
matrix of X.

3. Show that the covariance of the transformed variables Z = XQ is a diagonal
matrix Λ.

4. Explain the meaning of the values found along the main diagonal of Λ.
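
The property stated above, that the empirical covariance of Z = XQ is diagonal, can be checked numerically. The sketch below uses one common construction, an eigendecomposition of the empirical covariance matrix, on synthetic data. The data, the seed, and all variable names are assumptions made for illustration, not part of the exam paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated sample (illustrative assumption): N = 500, D = 3.
A = rng.normal(size=(3, 3))
X = rng.normal(size=(500, 3)) @ A

# Empirical covariance of the columns of X (np.cov centres the data).
S = np.cov(X, rowvar=False)

# Eigendecomposition of the symmetric matrix S: S = Q diag(eigvals) Q^T.
eigvals, Q = np.linalg.eigh(S)

# Transformed variables and their empirical covariance.
Z = X @ Q
S_z = np.cov(Z, rowvar=False)

# Off-diagonal entries of S_z should vanish up to rounding error.
off_diag = S_z - np.diag(np.diag(S_z))
```

Here `np.linalg.eigh` is used because the covariance matrix is symmetric; it returns the eigenvalues in ascending order.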

[End Question 4]

QUESTION 5 [TOTAL MARKS: 25]
Neural networks and deep learning

Q 5(a) [5 Marks]
What is a convolution layer? Name and describe the hyperparameters that need to be
chosen when using a convolution layer.
Q 5(b) [7 Marks]
Explain with the aid of a diagram how you could recognize that a neural network is
overfitting the training data while training the network. Describe three ways you could
mitigate overfitting.
Q 5(c) [10 Marks]
Consider the neural network depicted in the figure below:

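[Figure: a neural network with inputs $x_1, \dots, x_5$, a first hidden layer $h^{(1)}_1, \dots, h^{(1)}_4$, a second hidden layer $h^{(2)}_1, \dots, h^{(2)}_4$, and a single output ŷ.]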

1. What is the dimension of the input of the model?
2. Give a formula for computing $h^{(1)}_i$ from the inputs, assuming the activation function
is a rectified linear unit. Introduce notation as needed for weights and biases.
3. Assuming each non-input node has associated weights and biases (not shown),
calculate the number of parameters of the model.
4. Assuming this was a binary classification model, what would be a suitable activation
function for ŷ?
5. Assuming this was a binary classification model, what would be an appropriate
loss function to use to fit this model?

Q 5(d) [3 Marks]
Describe how you could choose the initial starting weights when optimizing a multilayer
perceptron. Explain why it is important not to initialize all the weights to zero.

[End Question 5]

[END OF EXAM]
