Machine learning_question bank
Machine learning (ML) is a subset of artificial intelligence (AI) that uses algorithms
and statistical models to enable computers to improve their performance on a task through
experience (data) without being explicitly programmed for that task. It does this by
uncovering hidden patterns within datasets. Essentially, it allows machines to learn from
data and make predictions or decisions based on that learning.
Supervised learning
All supervised learning algorithms need labeled data. Labeled data is data that is grouped into samples that
are tagged with one or more labels. In other words, applying supervised learning requires you to tell your
model what the correct output is for each training example.
Examples:
Unsupervised learning
In unsupervised learning, a machine is fed a large amount of unlabeled data and is left to
discover patterns or structure in that data on its own, without being told what to look for.
Examples:
Reinforcement learning
Reinforcement learning is a machine learning model similar to supervised learning, but the algorithm isn’t
trained using sample data. This model learns as it goes by using trial and error. A sequence of successful
outcomes is reinforced to develop the best recommendation for a given problem. The foundation of
reinforcement learning is rewarding the “right” behavior and punishing the “wrong” behavior.
Examples:
1. Problem Definition:
● Clearly define the problem you're trying to solve. Understand the business or scientific
objectives, the task at hand (e.g., classification, regression, clustering), and the output you
aim to produce.
2. Data Collection:
● Gather the relevant data needed for the problem. This can be in the form of structured data
(e.g., databases) or unstructured data (e.g., text, images). Ensure the data is representative
of the problem you are trying to model.
3. Data Preprocessing:
● Clean and prepare the data: handle missing values, remove duplicates, encode categorical
variables, and scale or normalize features so the data is ready for modeling.
4. Choosing a Model:
● Select an appropriate machine learning algorithm based on the problem type (e.g.,
supervised, unsupervised, or reinforcement learning). Consider factors such as accuracy,
interpretability, computational efficiency, and the amount of available data.
5. Model Training:
● Split the dataset into training and test sets (often with an additional validation set).
● Train the chosen model on the training dataset. Fine-tune hyperparameters using methods
such as cross-validation to improve the model’s performance.
6. Model Evaluation:
● Evaluate the model on the test set using appropriate evaluation metrics, depending on the
task. For classification tasks, metrics like accuracy, precision, recall, F1-score, and AUC
(Area Under the ROC Curve) are common. For regression, you might use metrics like
Mean Squared Error (MSE) or R² score.
7. Model Deployment:
● Once satisfied with the model's performance, deploy the model in the real-world
environment where it can be used to make predictions on unseen data.
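The workflow above can be sketched end to end. A minimal illustration in Python; the toy data, the class-mean "threshold model", and the even/odd split are invented for simplicity and are not part of the original notes:

```python
# Labeled toy data: feature x, label 1 when x >= 5 (a hypothetical rule).
data = [(x, 1 if x >= 5 else 0) for x in range(10)]

# Train/test split (every other point held out for testing).
train, test = data[::2], data[1::2]

# "Training": place a decision threshold midway between the class means.
mean_0 = sum(x for x, y in train if y == 0) / sum(1 for _, y in train if y == 0)
mean_1 = sum(x for x, y in train if y == 1) / sum(1 for _, y in train if y == 1)
threshold = (mean_0 + mean_1) / 2

def predict(x):
    return 1 if x > threshold else 0

# Evaluation: accuracy on held-out data the model never saw during training.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(accuracy)  # 1.0 on this cleanly separable toy set
```

The same split-train-evaluate loop applies regardless of how sophisticated the model is.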
4. Explain the steps required for selecting the right machine learning algorithm.
5. Explain the procedure to design a machine learning system.
6. What is machine learning? Explain how supervised learning is different from
unsupervised learning.
Supervised learning involves training a model on labeled data, where each input has a
corresponding correct output. The model learns to map inputs to outputs, making it easier to
evaluate performance through known labels (e.g., classification, regression).
Unsupervised learning, on the other hand, deals with unlabeled data. The model identifies
patterns or structures in the data without explicit guidance, making it harder to evaluate and
interpret results since there are no predefined labels (e.g., clustering, dimensionality reduction).
In summary, supervised learning is easier to evaluate due to labeled data, while unsupervised
learning is more complex due to the lack of labels.
You might be wondering, what does it mean to "reward" a machine? Good question!
Rewarding a machine means that you give your agent positive reinforcement for performing
the "right" thing and negative reinforcement for performing the "wrong" things.
Each time the comparison is positive, the machine receives positive numerical feedback, or a
reward.
Each time the comparison is negative, the machine receives negative numerical feedback, or a
penalty.
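This reward/penalty loop can be sketched with a toy two-action agent. The pay-off probabilities, exploration rate, and update rule below are illustrative assumptions, not from the original notes:

```python
import random

random.seed(42)

# Two actions; action 1 pays off more often (hidden from the agent).
pay_prob = [0.2, 0.8]
values = [0.0, 0.0]   # the agent's running estimate of each action's worth
counts = [0, 0]

for step in range(2000):
    # Explore 10% of the time, otherwise exploit the best-looking action.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = 0 if values[0] >= values[1] else 1
    # Positive numerical feedback (+1 reward) or negative feedback (-1 penalty).
    feedback = 1 if random.random() < pay_prob[action] else -1
    counts[action] += 1
    # Incremental average: nudge the estimate toward the feedback received.
    values[action] += (feedback - values[action]) / counts[action]

print(values[1] > values[0])  # the agent learns that action 1 is better
```

Repeated rewards for the "right" action raise its estimated value, so the agent chooses it more often.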
1. Data:
○ Raw input that the model uses to learn patterns. Includes features (input variables)
and labels (output variables).
2. Model:
○ The algorithm or mathematical structure that learns from the data to make
predictions or decisions.
3. Training:
○ The process of feeding data to the model and adjusting its parameters to minimize
errors or maximize performance.
4. Loss Function:
○ A measure of how well the model's predictions match the actual data. Used to guide
the training process.
5. Optimization:
○ Techniques used to minimize the loss function and improve the model’s accuracy.
Examples include Gradient Descent.
6. Evaluation:
○ Assessing the model’s performance using metrics (e.g., accuracy, precision) to
ensure it generalizes well to new, unseen data.
7. Hyperparameters:
○ Parameters set before the training process (e.g., learning rate, number of layers) that
influence model performance and training.
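A tiny gradient-descent loop ties these components together. The data, the one-parameter model, and the learning rate below are made up for illustration:

```python
# 1. Data: inputs x with labels y = 2*x (the pattern to learn).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 2.0, 4.0, 6.0, 8.0]

# 2. Model: a single parameter w, predicting y_hat = w * x.
w = 0.0

# 7. Hyperparameter: the learning rate, fixed before training starts.
lr = 0.01

# 3. Training: repeatedly adjust w to reduce the loss.
for epoch in range(500):
    # 4. Loss function: mean squared error; grad is its derivative w.r.t. w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # 5. Optimization: step w against the gradient (gradient descent).
    w -= lr * grad

# 6. Evaluation: w should be close to the true slope 2.
print(round(w, 3))
```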
1. Linear Regression:
○ Approximates a function by fitting a linear relationship between input variables and
output. Suitable for regression tasks with continuous data.
2. Polynomial Regression:
○ Extends linear regression by fitting a polynomial curve to the data. Captures
non-linear relationships.
3. Decision Trees:
○ Uses a tree-like model of decisions to approximate functions by splitting data into
subsets based on feature values. Handles both classification and regression.
4. Neural Networks:
○ Consists of interconnected layers of nodes (neurons) that learn complex patterns
and relationships. Useful for both classification and regression.
11. Differentiate classification and regression.
Explain the machine learning design procedure.
1. Data Collection
● Description: Gather relevant data from various sources, such as databases, online
repositories, or through web scraping. The quality and quantity of the data collected will
directly influence how well the model performs.
2. Data Preparation
● Description: Clean and preprocess the collected data to make it suitable for analysis. This
involves handling missing values, removing duplicates, normalizing or scaling features,
encoding categorical variables, and splitting the dataset into training and testing subsets.
● Objective: Ensure the data is clean, structured, and ready for model training.
3. Choosing Model
● Description: Select an appropriate machine learning algorithm based on the problem type
(classification, regression, clustering, etc.), data characteristics, and project goals. Consider
factors such as accuracy, interpretability, and computational cost.
● Objective: Choose the algorithm that best matches the problem and data characteristics to
achieve the best possible performance.
4. Training Model
● Description: Use the training dataset to train the chosen machine learning model. This
involves feeding the data into the model and adjusting its parameters based on the learning
algorithm.
● Objective: Develop a model that accurately captures the underlying patterns in the data.
5. Evaluating Model
● Description: Assess the model's performance using the testing dataset and various
evaluation metrics (e.g., accuracy, precision, recall, mean squared error). This step helps
detect problems such as overfitting or underfitting.
● Objective: Verify the model's effectiveness and ensure it performs well on data it has not
encountered during training.
6. Predictions
● Description: Use the trained and evaluated model to make predictions on new or unseen
data. This is the final application of the model in real-world scenarios or for
decision-making purposes.
● Objective: Apply the model to generate actionable insights or predictions based on new
input data.
Unit 2
Data Preprocessing is the process of transforming raw data into a clean and structured
format that can be effectively used in machine learning models. The raw data collected
from various sources is often incomplete, inconsistent, and may contain noise, so
preprocessing helps to improve the quality and relevance of the data. Properly
preprocessed data enhances the performance of machine learning models.
6. What is Dimensionality in a Data Set? Explain the high-dimensionality problem in Machine
Learning.
In simple terms, high-dimensionality refers to having too many features (or columns) in your
dataset. If a dataset has a lot of features, it is considered "high-dimensional." While having more
data might seem like a good thing, it can actually cause several issues when training machine
learning models. This is often called the curse of dimensionality.
Data Sparsity:
● Imagine a balloon inflating; it takes up more space as it grows. Similarly, when you add
more features (dimensions) to your dataset, the space between data points increases.
Overfitting:
● With too many features, a model can end up memorizing the training data rather than
learning general patterns. This is called overfitting.
Increased Complexity:
● Each new feature adds complexity to the model. This means the training process takes
more time, uses more computing resources, and becomes harder to manage.
7. Why is data sampling important? Explain any one technique.
Data sampling is crucial in machine learning because it allows us to work with smaller,
more manageable datasets while still maintaining the overall distribution and
characteristics of the full dataset.
It is useful because it reduces training time and computational cost while preserving the
statistical properties of the data.
One of the simplest and most commonly used techniques is Random Sampling. In random
sampling, a subset of data is chosen completely at random from the full dataset. This means that
every data point has an equal chance of being selected.
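A minimal sketch of random sampling using Python's standard library; the population size and 30/70 class mix below are invented for illustration:

```python
import random

random.seed(1)

# Full "population": 1000 records with a 30/70 class mix.
population = [("A" if i < 300 else "B") for i in range(1000)]

# Random sampling: every record has an equal chance of being selected.
sample = random.sample(population, 100)

# The sample's class mix should roughly mirror the population's 30% of "A".
frac_a = sample.count("A") / len(sample)
print(frac_a)
```

Because selection is uniform, the sample tends to preserve the population's distribution, which is exactly why the technique works.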
9. List the methods of Dimensionality Reduction. Explain any one in detail.
Principal Component Analysis (PCA) is a method used to reduce the number of features or
dimensions in your data while keeping the most important information.
1. Start with Your Data: Imagine you have a table with lots of columns (features) and rows
(data points).
2. Find the Main Patterns: PCA looks at how different features in your data relate to each
other and finds the main patterns or directions of variation.
3. Create New Dimensions: It creates new dimensions (called principal components) that
capture the most important patterns in the data.
4. Reduce Dimensions: You can then keep only the most important dimensions and discard
the less important ones, making your data simpler and easier to work with.
Key Steps:
1. Standardize Data: Normalize features to have zero mean and unit variance.
2. Compute Covariance Matrix: Determine relationships between features.
3. Extract Eigenvalues and Eigenvectors: Identify the directions (principal components)
with the highest variance.
4. Select Top Components: Choose the top principal components based on eigenvalues.
5. Transform Data: Project the original data onto the selected principal components.
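The key steps above can be sketched for 2-D data, where the eigen decomposition of the 2x2 covariance matrix has a closed form. The small dataset below is purely illustrative:

```python
import math
import statistics

# Toy 2-D data that varies mostly along the x = y direction.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2),
        (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1),
        (1.5, 1.6), (1.1, 0.9)]

# Step 1: center each feature (zero mean).
mx = statistics.mean(x for x, _ in data)
my = statistics.mean(y for _, y in data)
centered = [(x - mx, y - my) for x, y in data]

# Step 2: covariance matrix [[a, b], [b, c]] of the two features.
n = len(data) - 1
a = sum(x * x for x, _ in centered) / n
b = sum(x * y for x, y in centered) / n
c = sum(y * y for _, y in centered) / n

# Step 3: eigenvalues/eigenvectors of a 2x2 symmetric matrix, closed form.
half_trace = (a + c) / 2
root = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lam1 = half_trace + root           # largest eigenvalue = most variance
v = (b, lam1 - a)                  # its (unnormalized) eigenvector
norm = math.hypot(*v)
v = (v[0] / norm, v[1] / norm)

# Steps 4-5: keep the top component and project the data onto it.
projected = [x * v[0] + y * v[1] for x, y in centered]
print(round(lam1, 3))
```

Each projected value is a single number replacing two features, yet it captures most of the variance (the top eigenvalue dominates here).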
9. Explain PCA.
Principal Component Analysis (PCA) is a method used to simplify complex data by reducing the
number of features while keeping the most important information.
1. Normalize Data: Adjust the data so that each feature has an average of zero and the same
scale. This helps ensure that no single feature dominates because of its scale.
2. Find Patterns: PCA looks for patterns in the data to understand how different features
relate to each other.
3. Create New Features: It then creates new features (called principal components) that are
combinations of the original features. These new features capture the most important
patterns in the data.
4. Reduce Dimensions: You can use only the most important new features (principal
components) to represent the data, which simplifies it without losing much of the original
information.
Benefits:
● Simplifies Data: Makes data easier to understand and analyze by reducing the number of
features.
● Improves Performance: Helps in speeding up algorithms and improving their accuracy by
removing irrelevant or redundant features.
Linear Discriminant Analysis (LDA) is a technique used to simplify data by reducing its
dimensions while preserving the class separability. It’s often used for classification problems where
you want to distinguish between different categories.
1. Understand the Groups: LDA works with labeled data where each sample belongs to a
specific class or category.
2. Find the Best Separation: It looks for the best way to separate the classes by finding a
new set of features (discriminants) that maximize the difference between the classes.
3. Project the Data: It transforms the data into a lower-dimensional space where the classes
are more distinct and separated from each other.
4. Simplify and Classify: With the reduced dimensions, it’s easier to classify new data points
because the classes are more distinct and easier to separate.
Benefits:
● Preserves the information that separates classes, which helps classification accuracy.
● Reduces dimensionality, making models faster and less prone to overfitting.
Feature engineering is the process of transforming raw input data so the model can
understand it better. It's like making the data more useful. For example, you might break
down a big number (like house size) into smaller categories (small, medium, large) to
make predictions easier.
In simple terms:
1. Filter Methods:
○ These check each feature individually, based on basic statistics, without
using the model.
○ Example: Removing features with little variation (like a feature that is
mostly the same for all data).
2. Wrapper Methods:
○ These test different combinations of features by running the model
multiple times to find the best subset.
○ Example: Forward selection adds features one by one, checking if adding
each feature improves performance.
3. Embedded Methods:
○ These select features automatically during model training.
○ Example: Lasso regression, where unimportant features get a coefficient
of zero, effectively removing them.
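A filter method is the easiest to sketch. Below, a hypothetical variance threshold drops a near-constant feature without ever training a model; the dataset and threshold are invented for illustration:

```python
import statistics

# Toy dataset: three features per sample; feature "b" barely varies.
samples = [
    {"a": 5.0, "b": 1.0, "c": 10.0},
    {"a": 3.0, "b": 1.0, "c": 40.0},
    {"a": 8.0, "b": 1.0, "c": 25.0},
    {"a": 1.0, "b": 1.1, "c": 30.0},
]

# Filter method: score each feature on its own (here, by variance)
# and drop those below a threshold; no model is needed.
threshold = 0.1
kept = []
for name in samples[0]:
    values = [s[name] for s in samples]
    if statistics.pvariance(values) > threshold:
        kept.append(name)

print(kept)  # "b" is nearly constant, so it is filtered out
```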
1. Principal Component Analysis (PCA)
● What it is: A technique that reduces the number of features by finding the most
important ones that capture the most variance in the data.
● How it works: It transforms the original features into new ones (principal
components) that are uncorrelated and ordered by importance.
2. Linear Discriminant Analysis (LDA)
● What it is: A method that reduces dimensions while preserving the information
that helps to distinguish between different classes in the data.
● How it works: It looks for the feature combinations that best separate the
classes in the dataset.
3. Independent Component Analysis (ICA)
● What it is: A technique used to separate a mixed signal into its independent
components.
● How it works: It assumes that the observed data is a combination of
independent sources and aims to recover those sources.
4. Deep Learning (Feature Learning)
● What it is: Deep learning models automatically learn features from raw data.
● How it works: Layers of the neural network extract different levels of features,
from simple to complex.
5. Bag of Words (BoW) and TF-IDF
● What it is: Techniques for converting text into numerical features for machine
learning.
● How it works:
○ BoW: Counts how often each word appears in a document.
○ TF-IDF: Weighs the word counts based on how common or rare the
words are across multiple documents.
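Both techniques can be sketched in a few lines. The documents and the exact weighting below (raw term frequency, natural log) are illustrative choices; real libraries use slightly different variants:

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Bag of Words: raw counts of each word per document.
bows = [Counter(doc.split()) for doc in docs]
print(bows[0]["the"])  # 2

# TF-IDF: term frequency weighted by inverse document frequency,
# so words common to many documents (like "the") score low.
def tf_idf(word, doc_index):
    tf = bows[doc_index][word] / sum(bows[doc_index].values())
    df = sum(1 for bow in bows if word in bow)     # documents containing the word
    idf = math.log(len(docs) / df)
    return tf * idf

print(tf_idf("cat", 0) > tf_idf("the", 0))  # True: "cat" is more distinctive
```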
Conclusion
Summary:
PCA and LDA are commonly used for dimensionality reduction and feature
transformation.
TF-IDF handles text data, and Autoencoders use neural networks for feature extraction in
complex data.
1. Reduces the Number of Features: Techniques like PCA combine multiple features into
fewer, more useful ones, reducing the complexity.
2. Prevents Overfitting: By reducing unnecessary features, the model focuses on important
patterns and avoids learning noise.
3. Speeds Up the Model: With fewer features, the model runs faster and uses less computing
power.
4. Makes Data Easy to Understand: Feature extraction can reduce high-dimensional data to
2 or 3 dimensions, making it easier to visualize.
5. Combines Similar Features: It merges related features into one, reducing redundancy and
making the model more stable.
6. Filters Out Irrelevant Information: By creating new, meaningful features, it removes
unhelpful data, improving the model's accuracy.
1. Curse of Dimensionality:
○ As the number of dimensions increases, the data becomes sparse, making it difficult
to find patterns or meaningful clusters.
2. Overfitting:
○ High-dimensional data can lead to models capturing noise rather than true patterns,
reducing generalization.
3. Increased Computation:
○ More dimensions require more computational resources for processing and training
models.
4. Model Interpretability:
○ High-dimensional data makes it harder to understand and interpret the relationships
between features and the target variable.
By reducing the number of features, we address the issues of overfitting, computational cost, and
model interpretability, leading to better performance and more meaningful insights.
Definition: Dimensionality reduction is the process of reducing the number of features or variables
in a dataset while retaining as much of the important information as possible. It simplifies the
dataset by transforming high-dimensional data into a lower-dimensional space.
Principal Component Analysis (PCA) is a method used to simplify complex data by reducing the
number of features while keeping the most important information.
1. Normalize Data: Adjust the data so that each feature has an average of zero and the same
scale. This helps ensure that no single feature dominates because of its scale.
2. Find Patterns: PCA looks for patterns in the data to understand how different features
relate to each other.
3. Create New Features: It then creates new features (called principal components) that are
combinations of the original features. These new features capture the most important
patterns in the data.
4. Reduce Dimensions: You can use only the most important new features (principal
components) to represent the data, which simplifies it without losing much of the original
information.
Benefits:
● Simplifies Data: Makes data easier to understand and analyze by reducing the number of
features.
● Improves Performance: Helps in speeding up algorithms and improving their accuracy by
removing irrelevant or redundant features.
Unit 5
1. Define :
a. Random variables
b. Probability
c. Conditional Probability
d. Discrete distributions
e. Continuous distributions
f. Sampling
g. Testing
h. Hypothesis
Random Variables:
→ A random variable is a number that can change based on the outcome of a random event.
Probability:
Probability is a number between 0 and 1 that measures how likely an event is to occur.
Conditional Probability:
Conditional probability is the chance of an event happening, given that another event has already
happened.
Discrete Distributions:
Discrete distributions describe the probabilities of outcomes for variables that can take specific,
separate values (like whole numbers).
Continuous Distributions:
Continuous distributions describe the probabilities of outcomes for variables that can take any
value within a range (like height or weight).
Sampling:
Sampling is the process of selecting a small group from a larger population to study and draw
conclusions about the whole group.
Testing:
Testing is the process of checking a hypothesis using sample data to see if it is likely true or false.
Hypothesis:
A hypothesis is a testable statement about a population, usually consisting of a null hypothesis (no
effect) and an alternative hypothesis (some effect).
2. What are the concepts of probability? What is their importance in ML?
Probability is a mathematical concept that quantifies the likelihood of an event occurring. It's a
fundamental tool used in various fields, including statistics and machine learning.
3. Explain distribution and its methods in detail.
Distribution refers to how data points or values are spread or arranged across a dataset.
Understanding data distribution helps in selecting appropriate models, detecting anomalies, and
choosing suitable statistical techniques.
1. Data Transformation:
○ Change the shape of data using methods like logarithms or square roots to make it
more normal (less skewed).
2. Standardization:
○ Adjust the data so it has a mean of 0 and a standard deviation of 1, which is helpful
for normally distributed data.
3. Normalization:
○ Scale all data points to fit within a specific range (like 0 to 1), useful for data that's
evenly spread out (uniform distribution).
4. Handling Outliers:
○ Deal with extreme values that might distort the model by removing them or limiting
them to a certain value (capping).
5. Bootstrapping:
○ Resample data to create many smaller datasets, helpful when you're unsure about
the original data's distribution.
Understanding how to handle distributions helps in preparing your data correctly, leading to better
machine learning results.
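Standardization and normalization (methods 2 and 3 above) can be sketched as follows; the values are made up for illustration:

```python
import statistics

values = [10.0, 20.0, 30.0, 40.0, 50.0]

# Standardization: shift and scale to mean 0, standard deviation 1.
mu = statistics.mean(values)
sigma = statistics.pstdev(values)
standardized = [(v - mu) / sigma for v in values]

# Normalization: rescale to the [0, 1] range.
lo, hi = min(values), max(values)
normalized = [(v - lo) / (hi - lo) for v in values]

print(round(statistics.mean(standardized), 10))  # 0.0
print(normalized)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```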
Monte Carlo Approximation is a method used to estimate numerical results using random
sampling. It’s commonly used for problems that are difficult or impossible to solve exactly due to
their complexity.
Key Concepts:
1. Random Sampling:
○ Random values (samples) are generated to estimate a desired result. The larger the
sample size, the more accurate the estimate becomes.
2. Applications:
○ Used in areas like finance (risk analysis), physics (particle simulations), machine
learning (optimization), and statistics (probability estimation).
3. Procedure:
○ Step 1: Define the problem, often an integral or probability that needs to be
approximated.
○ Step 2: Generate random samples from a defined distribution.
○ Step 3: Compute a function (e.g., average, sum) over these random samples.
○ Step 4: Use the results to approximate the desired quantity (e.g., an integral value).
Example:
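A classic illustration, following the four steps above (an assumed example, not from the original notes): estimating pi by random sampling.

```python
import random

random.seed(0)

# Step 1-2: throw random darts at the unit square and count how many
# land inside the quarter circle of radius 1.
n = 100_000
inside = sum(1 for _ in range(n)
             if random.random() ** 2 + random.random() ** 2 <= 1.0)

# Steps 3-4: (pi/4) of the darts land inside, so pi ~ 4 * inside / n.
pi_estimate = 4 * inside / n
print(pi_estimate)  # close to 3.14159; accuracy grows with n
```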
Benefits:
● Works for problems with no closed-form solution.
● Accuracy improves steadily as the number of samples grows.
Limitation:
● Requires a large number of samples for high accuracy, which can be computationally
expensive.
Monte Carlo Approximation is powerful for estimating complex problems using randomness and
statistical principles.
Unit 6
1. Explain Bayes’ theorem in detail.
→
2. Describe the importance of Bayesian methods in ML.
→
Structure:
● Nodes: Each node represents a random variable. These could be anything
from observable data points to latent variables.
● Edges: Directed edges (arrows) connect pairs of nodes, representing
conditional dependencies. If there's an arrow from node A to node B, A is a
parent of B.
● Conditional Probabilities: Each node has an associated conditional
probability distribution that quantifies the effect of the parents on the node.
How It Works:
Construct the Network: Define the structure by identifying all relevant
variables and their relationships.
Specify Probabilities: Assign conditional probability tables (CPT) to each
node, showing the probability of each state given its parents.
Inference: Use the network to perform probabilistic inference. This involves
computing the likelihood of certain outcomes given evidence.
Applications:
Medical Diagnosis: Determining the probability of diseases given
symptoms (for example, inferring flu from fever and cough).
Predictive Analytics: Forecasting future events based on historical data.
Natural Language Processing: Understanding context and meaning in text.
Unit 7
1. Define:
a. Supervised Learning
b. Classification
c. Regression
d. Learning
Supervised Learning
Supervised Learning is a type of machine learning where the model is trained on a
labeled dataset, meaning the data has both input and output values.
Classification
Classification is a type of supervised learning where the output variable is
categorical. The aim is to predict the category or class to which a new observation
belongs. For example, classifying emails as 'spam' or 'not spam'.
Regression
Regression is another type of supervised learning where the output variable is a
continuous value. The goal is to predict a numeric value based on input features.
For instance, predicting house prices based on location, size, etc.
Learning
The process of gaining knowledge from data by adjusting a model based on examples.
2. Explain Supervised Learning in detail.
→ Supervised Learning (Detailed but Precise):
Supervised learning is a type of machine learning where the model is trained on a labeled dataset,
meaning each training example has an input-output pair. The model learns to map inputs (features)
to the correct output (labels) by identifying patterns in the data.
Key Concepts:
1. Labeled Data:
○ In supervised learning, the dataset contains inputs X (features) and
corresponding outputs Y (labels). The goal is to predict the output for new,
unseen inputs based on the patterns learned from the training data.
2. Training Process:
○ The algorithm learns by minimizing the difference between the predicted outputs
and the actual outputs using a loss function. Over time, the model adjusts its internal
parameters (weights) to improve its accuracy.
3. Types of Supervised Learning:
○ Classification: When the output is a discrete label (e.g., cat vs. dog).
○ Regression: When the output is continuous (e.g., predicting house prices).
4. Steps Involved:
○ Data Collection: Gather labeled data relevant to the problem.
○ Data Preprocessing: Clean, transform, and split the data into training and testing
sets.
○ Model Training: Train the model on the training set using the labeled data.
○ Model Evaluation: Test the model on the testing set to check its accuracy.
○ Model Improvement: Fine-tune the model by adjusting hyperparameters or using
better algorithms if necessary.
5. Applications:
○ Email spam detection, image recognition, medical diagnosis, and stock price
prediction.
Supervised learning is effective for tasks where labeled data is available, and the goal is to make
accurate predictions based on past examples.
5. List Classification algorithms. Explain Decision Tree as classification method.
List of Classification Algorithms
Classification algorithms are used to assign data into different categories (labels or
classes). Here are some popular classification algorithms:
Logistic Regression
Decision Trees
Random Forest
Support Vector Machines (SVM)
K-Nearest Neighbors (KNN)
Naive Bayes
Artificial Neural Networks (ANN)
Gradient Boosting (e.g., XGBoost, LightGBM)
AdaBoost
Multilayer Perceptron (MLP)
Decision Tree as a Classification Method (For 4 Marks)
A Decision Tree is a supervised learning algorithm used for classification. It works by
splitting data into branches based on feature values to make decisions.
Structure: It has a tree-like structure with a root node, internal decision nodes (feature
tests), branches (test outcomes), and leaf nodes (class labels).
The algorithm selects the best feature to split the data using measures like Gini Impurity or
Entropy.
Data is recursively split until it reaches a leaf node, which provides the classification.
Advantages: easy to understand and visualize, handles both numerical and categorical data,
and requires little data preparation.
Alternative answer:
Classification Algorithms:
1. Logistic Regression
2. Decision Tree
3. Support Vector Machine (SVM)
4. K-Nearest Neighbors (KNN)
5. Naive Bayes
6. Random Forest
7. Gradient Boosting (e.g., XGBoost, LightGBM)
8. Neural Networks
A Decision Tree is a supervised machine learning algorithm used for both classification and
regression tasks. It works by splitting the data into subsets based on the most significant feature
values. This process continues recursively, creating branches and forming a tree-like structure
where each node represents a feature, each branch represents a decision, and each leaf node
represents the outcome (class label).
Process:
1. Select the Best Feature: The decision tree algorithm evaluates all features and selects the
one that best splits the data into different classes. The most common criteria are Gini Index
and Information Gain (based on Entropy).
2. Splitting: The chosen feature divides the dataset into subsets, making decisions at each
branch based on the feature's value. This is done recursively at each level of the tree.
3. Stopping Criteria: The tree continues to split until one of the stopping criteria is met:
○ Maximum depth is reached.
○ No more features left to split.
○ All the data in a node belong to a single class (pure node).
4. Prediction: Once the tree is built, new data can be classified by following the branches
based on feature values until a leaf node (class label) is reached.
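The feature-selection step can be sketched as a one-level tree (a "stump"). The tiny weather dataset below is invented for illustration; the code picks the split with the lowest weighted Gini impurity, as described above:

```python
# Tiny labeled dataset: (weather, windy) -> play? (the last item is the label)
rows = [
    ("sunny", False, "yes"), ("sunny", True, "no"),
    ("rainy", True, "no"),   ("rainy", False, "yes"),
    ("sunny", False, "yes"), ("rainy", True, "no"),
]

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_score(feature_index):
    # Weighted Gini impurity of the subsets produced by splitting
    # on this feature; lower is better (purer subsets).
    values = {r[feature_index] for r in rows}
    total = 0.0
    for v in values:
        subset = [r[-1] for r in rows if r[feature_index] == v]
        total += len(subset) / len(rows) * gini(subset)
    return total

# Select the best feature: the lowest weighted impurity after the split.
best = min([0, 1], key=split_score)
print(best)  # feature 1 ("windy") separates the labels perfectly here
```

A full decision tree applies this selection recursively to each subset until the stopping criteria are met.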
Example:
Consider a decision tree to classify if an email is spam or not. Features could include the
presence of certain keywords, the sender's address, and the number of links in the message.
At each node, the decision tree splits based on the most important feature, e.g., if certain keywords
are present, leading to branches such as "Spam" or "Not Spam."
Advantages:
● Easy to interpret and visualize.
● Handles both numerical and categorical data with little preprocessing.
Disadvantages:
● Prone to overfitting: If the tree grows too large, it may capture noise.
● Instability: A small change in data can lead to a completely different tree.
Linear Regression
Polynomial Regression
Lasso Regression
Decision Tree Regression
Random Forest Regression
Support Vector Regression (SVR)
Linear Regression as a Regression Model
Linear Regression is one of the simplest and most widely used regression algorithms. It
models the relationship between two variables by fitting a straight line through the data
points.
1. Objective
The goal of linear regression is to predict a continuous target variable (Y) based on the
input features (X). It assumes a linear relationship between the input and the output.
2. Equation
The linear regression model is based on the equation of a straight line: Y = b0 + b1·X + e,
where b0 is the intercept, b1 is the slope (coefficient), and e is the error term.
3. How It Works
The model tries to find the best-fitting line by minimizing the error (difference) between
the predicted values and the actual values.
The most common method for this is Ordinary Least Squares (OLS), which minimizes the
sum of squared errors.
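For a single input feature, OLS has a closed form: slope = covariance(X, Y) / variance(X), with the intercept chosen so the line passes through the means. A sketch with made-up data:

```python
import statistics

# Hours studied (X) vs. exam score (Y); illustrative made-up data.
xs = [1, 2, 3, 4, 5]
ys = [52, 55, 61, 64, 68]

# OLS closed form: slope = cov(X, Y) / var(X); intercept fits the means.
mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def predict(x):
    return intercept + slope * x

print(round(slope, 2), round(intercept, 2))
```

This line minimizes the sum of squared errors over the training points, which is exactly what OLS means.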
7. Differentiate Linear Regression and Logistic Regression.
→
8. Why Support Vector Machines (SVM) Classifiers have improved classification over
Linear ones? Discuss Hyperplane in SVM.
→ Why SVM Classifiers Have Improved Classification Over Linear Ones:
SVMs improve on plain linear classifiers because they choose the separating boundary that
maximizes the margin between classes, which generalizes better to unseen data, and they can
use kernel functions to handle data that is not linearly separable.
Hyperplane in SVM:
A hyperplane is the decision boundary that separates different classes in an SVM model. In a
two-dimensional space, it’s a line; in higher dimensions, it becomes a plane or a hyperplane.
● Goal of SVM: To find the optimal hyperplane that not only separates the classes but also
maximizes the margin, which is the distance between the hyperplane and the closest data
points from both classes (called support vectors).
● Support Vectors: These are the critical points closest to the hyperplane that influence its
position and orientation.
SVM strives for the hyperplane with the largest margin, which helps improve classification
performance, especially for complex datasets
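The hyperplane and margin can be illustrated directly. The weights below are chosen by hand for a toy dataset (a real SVM solver would learn them); the point is the decision rule sign(w·x + b) and the margin |w·x + b| / ||w||:

```python
import math

# Two linearly separable classes in 2-D.
class_pos = [(2, 3), (3, 3), (3, 4)]
class_neg = [(0, 0), (1, 0), (0, 1)]

# A separating hyperplane w . x + b = 0 (hand-picked here for illustration).
w = (1.0, 1.0)
b = -3.5

def decision(p):
    return w[0] * p[0] + w[1] * p[1] + b

# Every positive point scores positive, every negative point negative.
print(all(decision(p) > 0 for p in class_pos))   # True
print(all(decision(p) < 0 for p in class_neg))   # True

# Margin: distance from the hyperplane to the closest point (a support
# vector), computed as |w . x + b| / ||w||.
norm = math.hypot(*w)
margin = min(abs(decision(p)) / norm for p in class_pos + class_neg)
print(round(margin, 3))
```

An SVM would search over all (w, b) that separate the classes and keep the pair with the largest such margin.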
9. Are True Positive and True Negative enough for accurate classification? If only
False Negative is reduced, does it lead to skewed classification? Give reasons for
your answers.
Unit 8
1. Define:
a. Unsupervised Learning
b. Clustering
c. Association
d. Confusion Matrix
2. Supervised vs. Unsupervised Learning
3. Explain Applications of unsupervised Machine Learning
4. What is Clustering. Explain K-mean clustering algorithm.
5. Write a note on
6. Write a note on KNN.
7. Define Association rules. Explain application of Association Rule.
8. Describe the K-nearest Neighbor learning algorithm for a continuous-valued target function.
9. Discuss the major drawbacks of K-nearest Neighbor learning Algorithm and how it can be
corrected
10. Define the following terms with respect to K-Nearest Neighbor Learning:
i) Regression ii) Residual iii) Kernel Function.
11. Define the following terms
a. Sample error
b. True error
c. Random Variable
d. Expected value
e. Variance
f. Standard Deviation
12. Explain Binomial Distribution with an example.
13. Explain Normal or Gaussian distribution with an example.
14. Explain the Central Limit Theorem with an example.
15. Write the Procedure for estimating the difference in error between two learning methods.
Approximate confidence intervals for this estimate
16. Describe K-nearest neighbour algorithm. Why is it known as instance-based Learning?
Unit 9
1. Define:
a. Neural Network
b. Neurons
c. Activation Function
d. Backpropagation
e. Deep Learning
2. Explain the types of activation functions in detail.
3. Explain the various types of neural networks.
4. Explain ANN in detail.
5. List the architectures of neural networks. Explain each in detail.
6. Explain Backpropagation in ANN.
7. Write a note on RNN
8. Write a short note on feed forward neural network.
9. Write a note on CNN
10. Explain Deep Learning
11. What is difference between Machine Learning and Deep Learning.
12. Explain the concept of a Perceptron with a neat diagram.
13. Discuss the Perceptron training rule.
14. Under what conditions does the perceptron rule fail, making it necessary to apply
the delta rule?
15. What do you mean by Gradient Descent?
16. Derive the Gradient Descent Rule.
17. What are the conditions in which Gradient Descent is applied.
18. What are the difficulties in applying Gradient Descent.
19. Differentiate between Gradient Descent and Stochastic Gradient Descent
20. Define Delta Rule.
21. Derive the Backpropagation rule considering the training rule for Output Unit
weights and Training Rule for Hidden Unit weights
22. Write the algorithm for Back propagation.
23. Explain how to learn Multilayer Networks using Gradient Descent Algorithm.
24. What is Squashing Function?
25. What is Cost function in Back Propagation? Discuss Back propagation algorithm.
26. What is a Neural Network (NN)? With an example, discuss most suitable NN application.