Machine Learning Interview Questions
Online Learning
Confusion Matrix
Precision vs Recall
When you plot a learning curve (training error and cross-validation error as a function of the number of
training examples), if both the training error and the cross-validation error are high, is it high bias or high
variance? High Bias
If you train a model and it performs well on the training set but fails to generalize to the validation
set, is it high bias or high variance? High Variance
Does adding new features fix high bias or high variance? High Bias
In the case of a neural network, does adding more layers reduce bias or variance? Bias
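As a quick illustration of the learning-curve diagnosis above, here is a minimal sketch using scikit-learn's learning_curve (the estimator and synthetic dataset below are placeholders, not part of the original questions):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Placeholder data and model for illustration only
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=42)

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=5, scoring="neg_mean_squared_error")

train_err = -train_scores.mean(axis=1)   # convert back to positive MSE
val_err = -val_scores.mean(axis=1)

plt.plot(sizes, train_err, label="training error")
plt.plot(sizes, val_err, label="cross-validation error")
plt.xlabel("number of training examples")
plt.ylabel("MSE")
plt.legend()
plt.show()
# Both curves high and close together -> high bias.
# Low training error with a large gap to validation error -> high variance.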
PageRank algorithm and its applications (Word Sense Disambiguation: determining which specific
meaning of a word is conveyed in a given sentence wherever it appears; automatic text
summarization (TextRank))
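A minimal sketch of the core PageRank computation, power iteration on a toy link graph (the graph and damping factor below are illustrative assumptions; TextRank applies the same iteration to a graph of sentences or words):

import numpy as np

# links[i] = list of pages that page i links to (toy example)
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = len(links)
d = 0.85  # damping factor

# Build a column-stochastic transition matrix
M = np.zeros((n, n))
for i, outs in links.items():
    for j in outs:
        M[j, i] = 1.0 / len(outs)

rank = np.full(n, 1.0 / n)
for _ in range(100):  # iterate until (approximately) converged
    rank = (1 - d) / n + d * M @ rank

print(rank)  # higher score = more "important" node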
How would you build a model to distinguish between Apple the company and apple the fruit?
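One possible sketch, assuming a small labeled corpus and classifying on the surrounding context words with TF-IDF + logistic regression (the tiny dataset below is made up for illustration):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "apple released a new iphone at its keynote",
    "apple stock rose after strong quarterly earnings",
    "she baked an apple pie with cinnamon",
    "an apple a day keeps the doctor away",
]
labels = ["company", "company", "fruit", "fruit"]

# Context words around "apple" carry the disambiguating signal
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["apple announced new macbooks"]))  # likely ['company']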
Min-max scaling squashes all values into a fixed range such as [0, 1], so a single extreme value
compresses the rest of the data. When the data contains outliers that are important and whose
impact we don't want to lose, we go with Z-score normalization instead.
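A small sketch of the difference on data with an outlier (the toy array is illustrative):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100 is an outlier

min_max = (x - x.min()) / (x.max() - x.min())   # squashed into [0, 1]
z_score = (x - x.mean()) / x.std()              # mean 0, std 1

print(min_max)  # non-outlier points crowd near 0; their spread is lost
print(z_score)  # relative spread preserved; the outlier stays visible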
If your data contains more than 30% missing values, what would you do?
-> Treat (impute) them or drop them from the analysis
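A minimal pandas sketch of that decision (the DataFrame, the 30% threshold, and the median imputation are illustrative assumptions):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan, np.nan],   # 60% missing -> drop
    "b": [1.0, 2.0, np.nan, 4.0, 5.0],         # 20% missing -> impute
})

missing_frac = df.isna().mean()                                # fraction missing per column
df = df.drop(columns=missing_frac[missing_frac > 0.30].index)  # drop heavily missing columns
df = df.fillna(df.median())                                    # treat (impute) the rest
print(df)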
The bias value allows the activation function to be shifted to the left or right, to better fit the data.
Changes to the weights alter the steepness of the sigmoid curve, while the bias offsets it,
shifting the entire curve so it fits better. Note also that the bias only shifts the output values; it
doesn't interact with the actual input data.
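A tiny sketch of this effect, assuming a standard logistic sigmoid (the weight/bias values are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# sigmoid(w*x + b) crosses 0.5 where w*x + b = 0, i.e. at x = -b/w:
# changing the bias b moves this crossing point (shifts the curve),
# while the weight w controls how steeply it rises around it.
for w, b in [(2.0, 0.0), (2.0, 3.0), (5.0, 0.0)]:
    x_mid = -b / w
    print(f"w={w}, b={b}: crosses 0.5 at x={x_mid}, sigmoid={sigmoid(w * x_mid + b):.1f}")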
New Questions
Underfitting vs. Overfitting
4. Using regularization
8. What is a validation set used for?
These two facts have a great consequence: Gradient Descent is guaranteed to approach arbitrarily
close to the global minimum (if you wait long enough and if the learning rate is not too high).
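A minimal Batch Gradient Descent sketch on linear regression's convex MSE cost (the synthetic data and hyperparameters are illustrative):

import numpy as np

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1)
y = 4 + 3 * X + np.random.randn(m, 1)
X_b = np.c_[np.ones((m, 1)), X]   # add bias (x0 = 1) column

eta = 0.1          # learning rate: too high and it diverges instead
theta = np.random.randn(2, 1)
for _ in range(1000):
    gradients = 2 / m * X_b.T @ (X_b @ theta - y)  # gradient over the full batch
    theta -= eta * gradients

print(theta)  # approaches the global minimum, close to [[4], [3]]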
11. Does Stochastic Gradient Descent always return optimal parameter values? If not, why not?
Due to its stochastic (i.e., random) nature, this algorithm is much less regular than Batch Gradient
Descent: instead of gently decreasing until it reaches the minimum, the cost function will bounce
up and down, decreasing only on average.
Over time it will end up very close to the minimum, but once it gets there it will continue to bounce
around, never settling down. So once the algorithm stops, the final parameter values are good, but
not optimal.
Randomness is good to escape from local optima, but bad because it means that the algorithm can
never settle at the minimum. One solution to this dilemma is to gradually reduce the learning rate.
The steps start out large (which helps make quick progress and escape local minima), then get
smaller and smaller, allowing the algorithm to settle at the global minimum. This process is called
simulated annealing, because it resembles the process of annealing in metallurgy where molten
metal is slowly cooled down.
The function that determines the learning rate at each iteration is called the learning schedule. If
the learning rate is reduced too quickly, you may get stuck in a local minimum, or even end up
frozen halfway to the minimum. If the learning rate is reduced too slowly, you may jump around the
minimum for a long time and end up with a suboptimal solution if you halt training too early.
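A minimal SGD sketch with such a learning schedule, so steps start large and shrink over time (synthetic data; the schedule hyperparameters t0 and t1 are illustrative assumptions):

import numpy as np

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1)
y = 4 + 3 * X + np.random.randn(m, 1)
X_b = np.c_[np.ones((m, 1)), X]

t0, t1 = 5, 50
def learning_schedule(t):
    return t0 / (t + t1)   # learning rate decays as training progresses

theta = np.random.randn(2, 1)
for epoch in range(50):
    for i in range(m):
        idx = np.random.randint(m)              # pick one random instance
        xi, yi = X_b[idx:idx + 1], y[idx:idx + 1]
        gradients = 2 * xi.T @ (xi @ theta - yi)
        eta = learning_schedule(epoch * m + i)
        theta -= eta * gradients

print(theta)  # good parameter values, close to but not exactly the optimum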
They are:
1. Linear Algebra
2. Singular Value Decomposition
3. Introductory level Pattern Recognition
4. Principal Component Analysis
5. Linear Discriminant Analysis
6. Fourier Transform
7. Wavelets
8. Probability, Bayes rule, Maximum Likelihood, MAP
9. Mixtures and Expectation-Maximization Algorithm
10. Introductory level Statistical Learning
11. Support Vector Machines
12. Genetic Algorithms
13. Hidden Markov Models
14. Bayesian Networks
15. Kalman filtering
6. How many channels are there in a grayscale image and an RGB image? A grayscale image has 1 channel; an RGB image has 3.
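A quick NumPy sketch of the answer in terms of array shapes:

import numpy as np

gray = np.zeros((224, 224))       # grayscale: H x W, a single channel
rgb = np.zeros((224, 224, 3))     # RGB: H x W x 3 channels
print(gray.ndim, rgb.shape[-1])   # 2 (one implicit channel), 3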
LINEAR ALGEBRA