
Unit 1

1. Define Machine Learning. Briefly explain the types of learning.

Definition of Machine Learning:

Machine learning (ML) is a subset of artificial intelligence (AI) that uses algorithms and
statistical models to enable computers to improve their performance on a task through
experience (data), by uncovering hidden patterns within datasets, without being explicitly
programmed for that task. Essentially, it allows machines to learn from data and make predictions
or decisions based on that learning.

Types of Machine Learning:

Supervised learning
All supervised learning algorithms need labeled data. Labeled data is data that is grouped into samples that
are tagged with one or more labels. In other words, applying supervised learning requires you to tell your
model the correct output for each training example.
Examples:

● Classification (e.g., spam detection)


● Regression (e.g., predicting house prices)
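For illustration, here is a minimal supervised-learning sketch in Python, assuming scikit-learn and its built-in Iris dataset (both chosen only as examples): the model is shown the inputs together with the correct labels during training.

```python
# Minimal supervised classification sketch (illustrative dataset and model).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                # labeled data: inputs X, labels y
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=200)         # learn a mapping from inputs to labels
model.fit(X_train, y_train)                      # training on labeled examples
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```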

Unsupervised learning
In unsupervised learning, the machine is given a large amount of unlabeled data and is left to discover
structure or patterns in that data on its own, without being told what the correct answers are.
Examples:

● Clustering (e.g., customer segmentation)


● Association (e.g., market basket analysis)
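For illustration, a minimal unsupervised (clustering) sketch, assuming scikit-learn and synthetic data; the choice of k = 3 clusters is an assumption made only for this example. Note that no labels are given to the algorithm.

```python
# Minimal clustering sketch: the algorithm groups unlabeled points on its own.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# three synthetic "customer" groups, but no labels are attached
data = np.vstack([rng.normal(loc, 0.5, size=(50, 2)) for loc in (0, 5, 10)])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(data)
print(kmeans.labels_[:10])          # cluster assignments discovered from the data
print(kmeans.cluster_centers_)      # learned group centers
```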

Reinforcement learning
Reinforcement learning is a machine learning model similar to supervised learning, but the algorithm isn’t
trained using sample data. This model learns as it goes by using trial and error. A sequence of successful
outcomes is reinforced to develop the best recommendation for a given problem. The foundation of
reinforcement learning is rewarding the “right” behavior and punishing the “wrong” behavior.
Examples:

● Game AI (e.g., playing chess)


● Autonomous driving
2. What are the steps in designing a machine learning problem? OR Machine learning
procedure.
ans 1. Problem Definition:

● Clearly define the problem you're trying to solve. Understand the business or scientific
objectives, the task at hand (e.g., classification, regression, clustering), and the output you
aim to produce.

2. Data Collection:

● Gather the relevant data needed for the problem. This can be in the form of structured data
(e.g., databases) or unstructured data (e.g., text, images). Ensure the data is representative
of the problem you are trying to model.

3. Data Preprocessing:

● Cleaning: Handle missing or inconsistent data by filling, deleting, or imputing missing values.
● Normalization/Scaling: Scale numerical features so that no single feature dominates the
learning process.
● Encoding Categorical Data: Convert categorical variables into a numerical form using
techniques like one-hot encoding.
● Handling Imbalance: Address class imbalance in the dataset (if required) by
oversampling, undersampling, or using specialized techniques.
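A minimal preprocessing sketch, assuming pandas and scikit-learn and a tiny made-up table, showing cleaning (imputation), scaling, and one-hot encoding:

```python
# Minimal preprocessing sketch on an invented toy table.
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({"age": [25, None, 40],
                   "salary": [30000, 52000, 61000],
                   "city": ["Pune", "Mumbai", "Pune"]})

df[["age"]] = SimpleImputer(strategy="mean").fit_transform(df[["age"]])        # cleaning
df[["age", "salary"]] = StandardScaler().fit_transform(df[["age", "salary"]])  # scaling
df = pd.get_dummies(df, columns=["city"])                                      # encoding
print(df)
```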

4. Choosing a Model:

● Select an appropriate machine learning algorithm based on the problem type (e.g.,
supervised, unsupervised, or reinforcement learning). Consider factors such as accuracy,
interpretability, computational efficiency, and the amount of available data.

5. Model Training:

● Split the dataset into training and test sets (often with an additional validation set).
● Train the chosen model on the training dataset. Fine-tune hyperparameters using methods
such as cross-validation to improve the model’s performance.
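A minimal sketch of this step, assuming scikit-learn and its built-in breast-cancer dataset: the data is split into training and test sets, and one hyperparameter (k of a k-NN classifier, chosen here only as an example) is tuned with 5-fold cross-validation.

```python
# Minimal train/test split plus cross-validation sketch (illustrative choices).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# try a few hyperparameter values with 5-fold cross-validation on the training set
for k in (3, 5, 7):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X_train, y_train, cv=5)
    print(f"k={k}: mean CV accuracy = {scores.mean():.3f}")
```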

6. Model Evaluation:

● Evaluate the model on the test set using appropriate evaluation metrics, depending on the
task. For classification tasks, metrics like accuracy, precision, recall, F1-score, and AUC
(Area Under the ROC Curve) are common. For regression, you might use metrics like
Mean Squared Error (MSE) or R² score.

7. Model Deployment:

● Once satisfied with the model's performance, deploy the model in the real-world
environment where it can be used to make predictions on unseen data.

8. Monitoring and Maintenance:


Continuously monitor the model's performance post-deployment. If model performance
degrades due to changing data distributions or other factors, retrain or update the model with
new data.

3. Define issues in machine Learning.

Issues in Machine Learning:

1. Insufficient Data: Limited or poor-quality data can lead to underfitting or inaccurate models.
2. Data Quality: Noisy, incomplete, or inconsistent data can hinder learning and reduce
model performance.
3. Overfitting: A model may fit the training data too closely, including its noise, so that it
fails to generalize to new data.
4. Underfitting: A model that is too simple may fail to capture the underlying patterns in the data.
5. Imbalanced Data: Unequal class distributions can bias models toward the majority class.
6. Bias & Fairness: Models can inherit biases from data, leading to unfair outcomes.
7. Scalability: Some algorithms struggle with large datasets, requiring efficient techniques to
scale up.

4. Explain the steps required for selecting the right machine learning algorithm.
5. Explain procedure to design machine learning procedure.
6. What is machine learning? Explain how supervised learning is different from
unsupervised learning.

Supervised learning involves training a model on labeled data, where each input has a
corresponding correct output. The model learns to map inputs to outputs, making it easier to
evaluate performance through known labels (e.g., classification, regression).

Unsupervised learning, on the other hand, deals with unlabeled data. The model identifies
patterns or structures in the data without explicit guidance, making it harder to evaluate and
interpret results since there are no predefined labels (e.g., clustering, dimensionality reduction).

In summary, supervised learning is easier to evaluate due to labeled data, while unsupervised
learning is more complex due to the lack of labels.

7. Short notes on Machine learning Applications.


ans
Healthcare: Diagnosing diseases, personalized medicine, drug discovery.
Finance: Fraud detection, algorithmic trading, credit scoring.
eCommerce: Recommendation systems, inventory management, customer segmentation.
NLP: Chatbots, sentiment analysis, text summarization.
Autonomous Vehicles: Self-driving cars, collision avoidance.
Image Recognition: Facial recognition, object detection, medical imaging.
Robotics: Industrial automation, human-robot interaction.
Cybersecurity: Threat detection, spam filtering, malware detection.
Entertainment: Content recommendation, gaming AI.
Manufacturing: Predictive maintenance, quality control
8. Differentiate supervised and unsupervised Machine Learning algorithm.
9. Write short note on Reinforcement learning.
Reinforcement learning
Reinforcement learning is a machine learning model similar to supervised learning, but the
algorithm isn’t trained using sample data. This model learns as it goes by using trial and error.
A sequence of successful outcomes is reinforced to develop the best recommendation for a
given problem. The foundation of reinforcement learning is rewarding the “right” behavior
and punishing the “wrong” behavior.

You might be wondering, what does it mean to "reward" a machine? Good question!
Rewarding a machine means that you give your agent positive reinforcement for performing
the "right" thing and negative reinforcement for performing the "wrong" things.
Each time the comparison is positive, the machine receives positive numerical feedback, or a
reward.
Each time the comparison is negative, the machine receives negative numerical feedback, or a
penalty.

10. Explain Key elements of Machine Learning. Explain various function approximation methods.

Key Elements of Machine Learning

1. Data:
○ Raw input that the model uses to learn patterns. Includes features (input variables)
and labels (output variables).
2. Model:
○ The algorithm or mathematical structure that learns from the data to make
predictions or decisions.
3. Training:
○ The process of feeding data to the model and adjusting its parameters to minimize
errors or maximize performance.
4. Loss Function:
○ A measure of how well the model's predictions match the actual data. Used to guide
the training process.
5. Optimization:
○ Techniques used to minimize the loss function and improve the model’s accuracy.
Examples include Gradient Descent.
6. Evaluation:
○ Assessing the model’s performance using metrics (e.g., accuracy, precision) to
ensure it generalizes well to new, unseen data.
7. Hyperparameters:
○ Parameters set before the training process (e.g., learning rate, number of layers) that
influence model performance and training.

Function Approximation Methods

1. Linear Regression:
○ Approximates a function by fitting a linear relationship between input variables and
output. Suitable for regression tasks with continuous data.
2. Polynomial Regression:
○ Extends linear regression by fitting a polynomial curve to the data. Captures
non-linear relationships.
3. Decision Trees:
○ Uses a tree-like model of decisions to approximate functions by splitting data into
subsets based on feature values. Handles both classification and regression.
4. Neural Networks:
○ Consists of interconnected layers of nodes (neurons) that learn complex patterns
and relationships. Useful for both classification and regression.
11. Differentiate classification and regression.

12. Differentiate between Training data and Testing Data.

13. Explain any two important machine learning library in python.


Scikit-learn
Overview: Scikit-learn is one of the most popular and widely-used machine learning libraries in
Python. It provides simple and efficient tools for data mining, data analysis, and machine learning,
TensorFlow
Overview: TensorFlow, developed by Google, is a powerful open-source library for numerical
computation and large-scale machine learning. It is widely used for building deep learning models
and neural networks.
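A minimal usage sketch of both libraries on the same toy, randomly generated data (the model sizes and settings below are arbitrary choices made only for illustration):

```python
# Minimal scikit-learn vs. TensorFlow (Keras) usage sketch on synthetic data.
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(200, 4).astype("float32")
y = (X[:, 0] + X[:, 1] > 1).astype(int)

# scikit-learn: classical ML with a simple fit/predict interface
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print("sklearn training accuracy:", clf.score(X, y))

# TensorFlow (Keras): a small neural network for the same task
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
print("Keras training accuracy:", model.evaluate(X, y, verbose=0)[1])
```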
14. Define Following
a. Regression : Predicts continuous numerical values based on input features,
modeling the relationship between variables.
b. Learning: The process by which a machine learning model improves its
performance over time by adjusting to data.
c. Machine Learning:A field of AI that enables systems to learn from data and make
predictions or decisions without explicit programming.
d. Classification: Predicts discrete categories or class labels (e.g., spam vs. not spam)
based on input features.
e. Clustering: Groups similar data points together based on their features, without using
predefined labels.
f. Training Data:The dataset used to train a model, containing input-output pairs to
learn from.
g. Test Data:A separate dataset used to evaluate the model’s performance on unseen
data.
h. Function Approximation: Finding a function that closely matches a set of data
points, often used in regression tasks.
i. Overfitting:When a model learns too well from the training data, including its
noise, leading to poor performance on new data.

15. Explain the flow diagram of the machine learning procedure.

1. Data Collection

● Description: Gather relevant data from various sources, such as databases, online repositories, or through web scraping. The quality and quantity of the data collected will significantly influence the model's performance.
● Objective: Obtain a comprehensive dataset that represents the problem domain accurately.

2. Data Preparation

● Description: Clean and preprocess the collected data to make it suitable for analysis. This step includes handling missing values, removing duplicates, normalizing or scaling features, encoding categorical variables, and splitting the dataset into training and testing subsets.
● Objective: Ensure the data is clean, structured, and ready for model training.

3. Choosing Learning Algorithm

● Description: Select an appropriate machine learning algorithm based on the problem type (classification, regression, clustering, etc.), data characteristics, and project goals. Consider different algorithms and evaluate their suitability for the task.
● Objective: Choose the algorithm that best matches the problem and data characteristics to achieve optimal performance.

4. Training Model

● Description: Use the training dataset to train the chosen machine learning model. This involves feeding the data into the model and adjusting its parameters based on the learning algorithm to minimize errors and improve predictions.
● Objective: Develop a model that accurately captures the underlying patterns in the data.

5. Evaluating Model

● Description: Assess the model's performance using the testing dataset and various evaluation metrics (e.g., accuracy, precision, recall, mean squared error). This step helps determine how well the model generalizes to new, unseen data.
● Objective: Verify the model's effectiveness and ensure it performs well on data it has not encountered during training.

6. Predictions

● Description: Use the trained and evaluated model to make predictions on new or unseen data. This is the final application of the model in real-world scenarios or for decision-making purposes.
● Objective: Apply the model to generate actionable insights or predictions based on new input data.

Unit 2

1. Write a note on Machine Learning activities.


ans
2. Explain different Types of data in Machine Learning with example.
3. What is data quality? Explain the importance of data quality and also
explain its remediation.
5. Define data preprocessing and techniques used for data preprocessing.
ans

Data Preprocessing is the process of transforming raw data into a clean and structured
format that can be effectively used in machine learning models. The raw data collected
from various sources is often incomplete, inconsistent, and may contain noise, so
preprocessing helps to improve the quality and relevance of the data. Properly
preprocessed data enhances the performance of machine learning models
6. What is dimensionality in a dataset? Explain the high-dimensionality problem in Machine
Learning.

High-Dimensionality Problem in Machine Learning (Simplified)

In simple terms, high-dimensionality refers to having too many features (or columns) in your
dataset. If a dataset has a lot of features, it is considered "high-dimensional." While having more
data might seem like a good thing, it can actually cause several issues when training machine
learning models. This is often called the curse of dimensionality.

Why is high-dimensional data problematic?

Too Much Space, Not Enough Data:

● Imagine a balloon inflating—it takes up more space as it grows. Similarly, when you add
more features (dimensions) to your dataset, the space between data points increases.

Overfitting:

● With too many features, a model can end up memorizing the training data rather than
learning general patterns. This is called overfitting.

More Features = More Complexity:

● Each new feature adds complexity to the model. This means the training process takes
more time, uses more computing resources, and becomes harder to manage.
7. Why is data sampling important? Explain any one technique.
Data sampling is crucial in machine learning because it allows us to work with smaller,
more manageable datasets while still maintaining the overall distribution and
characteristics of the full dataset.

Technique: Random Sampling

One of the simplest and most commonly used techniques is Random Sampling. In random
sampling, a subset of data is chosen completely at random from the full dataset. This means that
every data point has an equal chance of being selected.
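A minimal random-sampling sketch, assuming pandas and a made-up DataFrame:

```python
# Minimal random-sampling sketch on an invented dataset.
import pandas as pd

df = pd.DataFrame({"value": range(1000)})        # made-up dataset of 1000 rows
sample = df.sample(frac=0.1, random_state=42)    # keep 10%; each row equally likely
print(len(sample))                               # 100 rows
```

Setting random_state makes the sample reproducible across runs.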

8. What is Dimensionality reduction. What are the benefits of Dimensionality reduction?


Dimensionality reduction is the process of reducing the number of input features (or
dimensions) in a dataset while retaining as much important information as possible. In
machine learning, datasets often have many features, which can lead to problems like
overfitting, increased computational cost, and difficulty in interpreting the data.

Benefits of Dimensionality Reduction

1. Improved Model Performance:


○ Benefit: Reduces overfitting by eliminating irrelevant or redundant features, leading
to better generalization and performance.
○ Example: Simplified models often have fewer parameters to tune, which can lead to
more stable and robust predictions.
2. Enhanced Visualization:
○ Benefit: Allows for visualization of high-dimensional data in 2D or 3D, making it
easier to understand and interpret.
○ Example: Techniques like PCA can project data into two or three dimensions for
visualization.
3. Reduced Computational Cost:
○ Benefit: Lowers the computational resources and time required for training models
and processing data.
○ Example: Fewer features mean faster training times and reduced memory usage.
4. Increased Interpretability:
○ Benefit: Simplifies the model and data, making it easier to understand the
relationships between features and outcomes.
○ Example: Reduced feature space can highlight the most significant variables
affecting the model.
5. Mitigation of Curse of Dimensionality:
○ Benefit: Addresses issues related to high-dimensional spaces where data becomes
sparse and distance metrics lose meaning.
○ Example: Techniques like t-SNE can help in maintaining the local structure of data
in lower dimensions.

9.List out the methods of Dimensionality reduction. Explain any one in details.

Principal Component Analysis (PCA) is a method used to reduce the number of features or
dimensions in your data while keeping the most important information.

How PCA Works:

1. Start with Your Data: Imagine you have a table with lots of columns (features) and rows
(data points).
2. Find the Main Patterns: PCA looks at how different features in your data relate to each
other and finds the main patterns or directions of variation.
3. Create New Dimensions: It creates new dimensions (called principal components) that
capture the most important patterns in the data.
4. Reduce Dimensions: You can then keep only the most important dimensions and discard
the less important ones, making your data simpler and easier to work with.

Key Steps:

1. Standardize Data: Normalize features to have zero mean and unit variance.
2. Compute Covariance Matrix: Determine relationships between features.
3. Extract Eigenvalues and Eigenvectors: Identify the directions (principal components)
with the highest variance.
4. Select Top Components: Choose the top principal components based on eigenvalues.
5. Transform Data: Project the original data onto the selected principal components.
Benefits:

● Simplifies data while retaining most of its variation.


● Reduces computational complexity and improves model performance.
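A minimal PCA sketch, assuming scikit-learn and the Iris dataset; reducing to 2 components is an arbitrary choice made for illustration:

```python
# Minimal PCA sketch: standardize, then project 4 features onto 2 components.
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)     # step 1: standardize the data
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)          # remaining steps handled internally
print(X_reduced.shape)                           # (150, 2)
print(pca.explained_variance_ratio_)             # variance captured by each component
```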

9.Explain PCA.

Principal Component Analysis (PCA) is a method used to simplify complex data by reducing the
number of features while keeping the most important information.

Here’s how it works in simple steps:

1. Normalize Data: Adjust the data so that each feature has an average of zero and the same
scale. This helps ensure that no single feature dominates because of its scale.
2. Find Patterns: PCA looks for patterns in the data to understand how different features
relate to each other.
3. Create New Features: It then creates new features (called principal components) that are
combinations of the original features. These new features capture the most important
patterns in the data.
4. Reduce Dimensions: You can use only the most important new features (principal
components) to represent the data, which simplifies it without losing much of the original
information.

Benefits:

● Simplifies Data: Makes data easier to understand and analyze by reducing the number of
features.
● Improves Performance: Helps in speeding up algorithms and improving their accuracy by
removing irrelevant or redundant features.

10. Explain LDA.

Linear Discriminant Analysis (LDA) is a technique used to simplify data by reducing its
dimensions while preserving the class separability. It’s often used for classification problems where
you want to distinguish between different categories.

Here’s a simple way to understand LDA:

1. Understand the Groups: LDA works with labeled data where each sample belongs to a
specific class or category.
2. Find the Best Separation: It looks for the best way to separate the classes by finding a
new set of features (discriminants) that maximize the difference between the classes.
3. Project the Data: It transforms the data into a lower-dimensional space where the classes
are more distinct and separated from each other.
4. Simplify and Classify: With the reduced dimensions, it’s easier to classify new data points
because the classes are more distinct and easier to separate.

Benefits:

● Improves Classification: Makes it easier to distinguish between different classes.


● Reduces Complexity: Simplifies the data by reducing the number of features while
retaining class information.
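A minimal LDA sketch, assuming scikit-learn and the Iris dataset; note that, unlike PCA, the class labels are passed to fit:

```python
# Minimal LDA sketch: project labeled data onto axes that separate the classes.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)      # uses the class labels y, unlike PCA
print(X_lda.shape)                   # (150, 2)
```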

11. Differentiate PCA and LDA.

12. What is the difference between Dimensionality reduction and Feature subset selection?


Unit 3
1. What is modeling in machine learning? What are the types of Models in Machine
Learning?
2. What is the difference between a Predictive and a Descriptive Model?
3. Write a note on Predictive Model.
4. Explain the training of Predictive Model.
5. Explain the process of Supervised Learning Model.
6. Explain Supervised Machine Learning in details.
7. Explain Linear Regression as Supervised Machine Learning.
8. What is cost function?
9.
10. List the methods for Model evaluation. Explain each. How can we improve the
performance of a model?
Unit 4

1. What is feature and feature engineering.


→ A feature is just a piece of information or data you use in a model. For example, if
you're trying to predict how much a house costs, a feature could be the number of rooms.

Feature engineering is the process of improving this information so the model can
understand it better. It's like making the data more useful. For example, you might break
down a big number (like house size) into smaller categories (small, medium, large) to
make predictions easier.

In simple terms:

● Feature = data used for prediction.


● Feature engineering = making the data more useful for better predictions.
2. Explain the need of feature engineering in ML.
● Improves model performance: Well-engineered features can make the model more
accurate. For example, combining or transforming features (like calculating the ratio of two
features) can give the model better insights.
● Makes data understandable for algorithms: Some algorithms work better with specific
types of data (e.g., numerical vs. categorical). Feature engineering helps convert raw data
into the right format.
● Simplifies complex relationships: It helps simplify relationships between variables that
might be too complex or noisy in their raw form. For example, converting a date into "day
of the week" or "month" can uncover patterns.
● Reduces model complexity: By selecting only the most important features, you can reduce
noise in the data and avoid overfitting (when a model learns too much from irrelevant data).
● Handles missing data: It can help deal with missing or inconsistent data, such as filling
gaps or creating new features to capture important patterns.
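A minimal feature-engineering sketch, assuming pandas and a made-up housing table, illustrating a ratio feature, binning, and a date-derived feature as described above:

```python
# Minimal feature-engineering sketch on an invented housing table.
import pandas as pd

df = pd.DataFrame({
    "price": [250000, 400000, 320000],
    "size_sqft": [900, 2400, 1500],
    "rooms": [2, 4, 3],
    "listed_on": pd.to_datetime(["2024-01-05", "2024-03-17", "2024-06-02"]),
})

df["price_per_sqft"] = df["price"] / df["size_sqft"]                    # combined/ratio feature
df["size_category"] = pd.cut(df["size_sqft"], bins=[0, 1000, 2000, 5000],
                             labels=["small", "medium", "large"])       # binning a raw number
df["listed_weekday"] = df["listed_on"].dt.day_name()                    # date -> day of week
print(df)
```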

3. Explain the process of feature subset selection in details.


Steps in Feature Subset Selection:
● Understand the Data: Know what each feature represents.
● Apply Selection Methods: Use techniques like filtering, forward selection, or embedded
methods to choose the best features.
● Evaluate the Model: Test the model's performance with the selected features to make sure
it's working better.
● Refine: Keep adjusting and testing until you find the best feature set.

4. Explain the methods of feature subset selection in details.

Main Techniques for Feature Subset Selection:

1. Filter Methods:
○ These check each feature individually, based on basic statistics, without
using the model.
○ Example: Removing features with little variation (like a feature that is
mostly the same for all data).
2. Wrapper Methods:
○ These test different combinations of features by running the model
multiple times to find the best subset.
○ Example: Forward selection adds features one by one, checking if adding
each feature improves performance.
3. Embedded Methods:
○ These select features automatically during model training.
○ Example: Lasso regression, where unimportant features get a coefficient
of zero, effectively removing them.

Filter: Independent of the model, fast but less accurate.

Wrapper: Model-dependent, provides better accuracy but is computationally expensive.

Embedded: Selects features during training, balancing performance and efficiency.
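A minimal sketch of a filter method and an embedded (Lasso-style L1) method, assuming scikit-learn and its breast-cancer dataset; the choices of k = 10 and C = 0.1 are arbitrary, and the L1-penalized logistic regression stands in for Lasso on a classification task:

```python
# Minimal feature subset selection sketch: filter method vs. embedded L1 penalty.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Filter method: score every feature independently, keep the 10 best
X_filtered = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)
print("Filter method kept:", X_filtered.shape[1], "features")

# Embedded method: an L1 (Lasso-style) penalty drives some coefficients to exactly zero
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
l1_model.fit(StandardScaler().fit_transform(X), y)
print("L1 penalty kept:", int(np.sum(l1_model.coef_ != 0)), "features")
```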

5. Differentiate feature extraction and feature reduction



6. Explain the methods of feature extractions in details.

Here’s a simplified explanation of methods for feature extraction:

1. Principal Component Analysis (PCA)

● What it is: A technique that reduces the number of features by finding the most
important ones that capture the most variance in the data.
● How it works: It transforms the original features into new ones (principal
components) that are uncorrelated and ordered by importance.

2. Linear Discriminant Analysis (LDA)

● What it is: A method that reduces dimensions while preserving the information
that helps to distinguish between different classes in the data.
● How it works: It looks for the feature combinations that best separate the
classes in the dataset.
3. Independent Component Analysis (ICA)

● What it is: A technique used to separate a mixed signal into its independent
components.
● How it works: It assumes that the observed data is a combination of
independent sources and aims to recover those sources.

5. Feature Learning with Neural Networks

● What it is: Deep learning models automatically learn features from raw data.
● How it works: Layers of the neural network extract different levels of features,
from simple to complex.

6. Autoencoders

● What it is: A type of neural network used to learn efficient representations of data.
● How it works: It compresses the data into a smaller size (encoding) and then
reconstructs it back (decoding).

7. Bag of Words (BoW) and TF-IDF

● What it is: Techniques for converting text into numerical features for machine
learning.
● How it works:
○ BoW: Counts how often each word appears in a document.
○ TF-IDF: Weighs the word counts based on how common or rare the
words are across multiple documents.

Conclusion

Summary:

PCA and LDA are commonly used for dimensionality reduction and feature
transformation.

ICA is focused on statistical independence.

TF-IDF handles text data, and Autoencoders use neural networks for feature extraction in
complex data.
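A minimal text feature-extraction sketch for BoW and TF-IDF, assuming scikit-learn; the three example sentences are made up:

```python
# Minimal Bag-of-Words and TF-IDF sketch on invented sentences.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["free prize inside", "meeting at noon", "claim your free prize now"]

bow = CountVectorizer().fit(docs)
print(bow.get_feature_names_out())        # the vocabulary
print(bow.transform(docs).toarray())      # raw word counts per document

tfidf = TfidfVectorizer().fit(docs)
print(tfidf.transform(docs).toarray())    # counts re-weighted by word rarity
```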

7. List Issues in high-dimensional data. How we can solve it by feature extractions.

Issues in High-Dimensional Data:


1. Curse of Dimensionality: Too many features make it hard for the model to find patterns,
leading to poor results.
2. Overfitting: With too many features, the model may learn unnecessary details, making it
less accurate on new data.
3. Slow Computation: More features make the model slower and require more computing
power.
4. Hard to Visualize: It's difficult to visualize and understand data when there are too many
features.
5. Redundant Information: Some features may provide the same information, which can
confuse the model.
6. Irrelevant Features: Many features may not actually help in making predictions, adding
noise to the model.

How Feature Extraction Solves These Issues:

1. Reduces the Number of Features: Techniques like PCA combine multiple features into
fewer, more useful ones, reducing the complexity.
2. Prevents Overfitting: By reducing unnecessary features, the model focuses on important
patterns and avoids learning noise.
3. Speeds Up the Model: With fewer features, the model runs faster and uses less computing
power.
4. Makes Data Easy to Understand: Feature extraction can reduce high-dimensional data to
2 or 3 dimensions, making it easier to visualize.
5. Combines Similar Features: It merges related features into one, reducing redundancy and
making the model more stable.
6. Filters Out Irrelevant Information: By creating new, meaningful features, it removes
unhelpful data, improving the model's accuracy.

8. List Issues in high-dimensional data. How we can solve it by feature reduction.


→ Issues in High-Dimensional Data:

1. Curse of Dimensionality:
○ As the number of dimensions increases, the data becomes sparse, making it difficult
to find patterns or meaningful clusters.
2. Overfitting:
○ High-dimensional data can lead to models capturing noise rather than true patterns,
reducing generalization.
3. Increased Computation:
○ More dimensions require more computational resources for processing and training
models.
4. Model Interpretability:
○ High-dimensional data makes it harder to understand and interpret the relationships
between features and the target variable.

How to Solve it by Feature Reduction:

1. Principal Component Analysis (PCA):


○ Reduces the dimensionality by projecting data into fewer components while
preserving variance.
2. Feature Selection:
○ Selects the most important features using techniques like Chi-square, Mutual
Information, or Correlation.
3. Lasso Regression (L1 Regularization):
○ Shrinks less important feature coefficients to zero, effectively selecting a subset of
features.
4. Embedded Methods:
○ Algorithms like decision trees inherently perform feature selection during the
model-building process.

By reducing the number of features, we address the issues of overfitting, computational cost, and
model interpretability, leading to better performance and more meaningful insights.

9. What is dimensionality reduction. Explain PCA in details.


→ Dimensionality Reduction

Definition: Dimensionality reduction is the process of reducing the number of features or variables
in a dataset while retaining as much of the important information as possible. It simplifies the
dataset by transforming high-dimensional data into a lower-dimensional space.

Principal Component Analysis (PCA) is a method used to simplify complex data by reducing the
number of features while keeping the most important information.

Here’s how it works in simple steps:

1. Normalize Data: Adjust the data so that each feature has an average of zero and the same
scale. This helps ensure that no single feature dominates because of its scale.
2. Find Patterns: PCA looks for patterns in the data to understand how different features
relate to each other.
3. Create New Features: It then creates new features (called principal components) that are
combinations of the original features. These new features capture the most important
patterns in the data.
4. Reduce Dimensions: You can use only the most important new features (principal
components) to represent the data, which simplifies it without losing much of the original
information.

Benefits:

● Simplifies Data: Makes data easier to understand and analyze by reducing the number of
features.
● Improves Performance: Helps in speeding up algorithms and improving their accuracy by
removing irrelevant or redundant features.
Unit 5
1. Define :
a. Random variables
b. Probability
c. Conditional Probability
d. Discrete distributions
e. Continuous distributions
f. Sampling
g. Testing
h. Hypothesis

Random Variables:

→ A random variable is a number that can change based on the outcome of a random event.

Probability:

Probability is the chance of something happening, expressed as a number between 0 (impossible)
and 1 (certain).

Conditional Probability:

Conditional probability is the chance of an event happening, given that another event has already
happened.

Discrete Distributions:

Discrete distributions describe the probabilities of outcomes for variables that can take specific,
separate values (like whole numbers).

Continuous Distributions:

Continuous distributions describe the probabilities of outcomes for variables that can take any
value within a range (like height or weight).

Sampling:

Sampling is the process of selecting a small group from a larger population to study and draw
conclusions about the whole group.

Testing:

Testing is the process of checking a hypothesis using sample data to see if it is likely true or false.

Hypothesis:

A hypothesis is a testable statement about a population, usually consisting of a null hypothesis (no
effect) and an alternative hypothesis (some effect).
2. What is Concepts of probability. What is the importance of it in ML.

Probability is a mathematical concept that quantifies the likelihood of an event occurring. It's a
fundamental tool used in various fields, including statistics and machine learning.

Key Concepts in Probability:

● Event: A specific outcome or set of outcomes of an experiment.


● Sample Space: The set of all possible outcomes of an experiment.
● Probability: A numerical value between 0 and 1 that measures the likelihood of an event
occurring.
● Random Variable: A variable that can take on different values based on the outcome of a
random experiment.
● Probability Distribution: A function that describes the probability of each possible value
of a random variable.

Importance of Probability in Machine Learning:

a. Modeling Uncertainty: Machine learning models often deal with uncertain or
noisy data. Probability theory provides a framework for quantifying and handling
uncertainty.
b. Decision Making: Many machine learning algorithms involve making decisions
based on probabilistic models. For example, in classification problems, the model
might predict the probability of an instance belonging to each class, and the
decision is made based on these probabilities.
c. Generative Models: Generative models like Bayesian networks and Hidden
Markov Models use probability distributions to generate new data or samples.
d. Bayesian Inference: Bayesian inference is a statistical method that uses
probability theory to update beliefs about the world based on new evidence. It's a
powerful tool in machine learning for tasks like parameter estimation and model
selection.
e. Reinforcement Learning: In reinforcement learning, agents make decisions based
on the probability of different actions leading to desired outcomes.
f. Statistical Learning: Many machine learning algorithms are based on statistical
principles, and probability theory is essential for understanding these algorithms.

3. Explain distribution and its methods in details.

→ Distribution in Machine Learning:

Distribution refers to how data points or values are spread or arranged across a dataset.

Understanding data distribution helps in selecting appropriate models, detecting anomalies, and
choosing suitable statistical techniques.

Handling Distributions in Easy Way:

1. Data Transformation:
○ Change the shape of data using methods like logarithms or square roots to make it
more normal (less skewed).
2. Standardization:
○ Adjust the data so it has a mean of 0 and a standard deviation of 1, which is helpful
for normally distributed data.
3. Normalization:
○ Scale all data points to fit within a specific range (like 0 to 1), useful for data that's
evenly spread out (uniform distribution).
4. Handling Outliers:
○ Deal with extreme values that might distort the model by removing them or limiting
them to a certain value (capping).
5. Bootstrapping:
○ Resample data to create many smaller datasets, helpful when you're unsure about
the original data's distribution.

Understanding how to handle distributions helps in preparing your data correctly, leading to better
machine learning results.
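A minimal sketch of some of these techniques on synthetic skewed data, assuming NumPy; the exponential distribution and sample sizes are arbitrary choices:

```python
# Minimal sketch: log transform, standardization, normalization, and bootstrapping.
import numpy as np

rng = np.random.default_rng(0)
skewed = rng.exponential(scale=2.0, size=1000)                    # right-skewed data

log_data = np.log1p(skewed)                                       # transformation: reduce skew
standardized = (skewed - skewed.mean()) / skewed.std()            # mean 0, std 1
normalized = (skewed - skewed.min()) / (skewed.max() - skewed.min())  # scaled to [0, 1]

# bootstrapping: estimate the mean's variability by resampling with replacement
boot_means = [rng.choice(skewed, size=len(skewed), replace=True).mean()
              for _ in range(1000)]
print(np.percentile(boot_means, [2.5, 97.5]))                     # approx. 95% interval
```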

4. What is the difference between Discrete distributions and Continuous distributions?


5. Explain Monte Carlo Approximation.


→ Monte Carlo Approximation:

Monte Carlo Approximation is a method used to estimate numerical results using random
sampling. It’s commonly used for problems that are difficult or impossible to solve exactly due to
their complexity.

Key Concepts:

1. Random Sampling:
○ Random values (samples) are generated to estimate a desired result. The larger the
sample size, the more accurate the estimate becomes.
2. Applications:
○ Used in areas like finance (risk analysis), physics (particle simulations), machine
learning (optimization), and statistics (probability estimation).
3. Procedure:
○ Step 1: Define the problem, often an integral or probability that needs to be
approximated.
○ Step 2: Generate random samples from a defined distribution.
○ Step 3: Compute a function (e.g., average, sum) over these random samples.
○ Step 4: Use the results to approximate the desired quantity (e.g., an integral value).

Example:

To estimate the value of π using Monte Carlo:

● Generate random points inside a square.


● Count how many fall inside a quarter circle inscribed in that square.
● Approximate π as 4 times the ratio of points inside the circle to total points.
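A minimal sketch of this π estimation in Python, using only the standard library; one million samples is an arbitrary choice:

```python
# Minimal Monte Carlo sketch of the pi example described above.
import random

n = 1_000_000
inside = 0
for _ in range(n):
    x, y = random.random(), random.random()   # random point in the unit square
    if x * x + y * y <= 1.0:                  # does it fall inside the quarter circle?
        inside += 1

print("pi is approximately", 4 * inside / n)  # ratio * 4 approximates pi
```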

Benefits:

● Applicable to complex, multi-dimensional problems.


● Provides a probabilistic estimate even when exact solutions are hard to obtain.

Limitation:

● Requires a large number of samples for high accuracy, which can be computationally
expensive.

Monte Carlo Approximation is powerful for estimating complex problems using randomness and
statistical principles.

Unit 6
1. Explain Bayes’ theorem in details.


2. Describe the importance of Bayesian methods in ML.

3. Write a note on Bayesian theorem.


4. Explain Bayesian Belief Network.
A Bayesian Belief Network (BBN), or Bayesian Network, is a graphical model that
represents the probabilistic relationships among a set of variables. Here's a simple
breakdown:

Structure:
● Nodes: Each node represents a random variable. These could be anything
from observable data points to latent variables.
● Edges: Directed edges (arrows) connect pairs of nodes, representing
conditional dependencies. If there's an arrow from node A to node B, A is a
parent of B.
● Conditional Probabilities: Each node has an associated conditional
probability distribution that quantifies the effect of the parents on the node.

How It Works:
Construct the Network: Define the structure by identifying all relevant
variables and their relationships.
Specify Probabilities: Assign conditional probability tables (CPT) to each
node, showing the probability of each state given its parents.
Inference: Use the network to perform probabilistic inference. This involves
computing the likelihood of certain outcomes given evidence.

Applications:
Medical Diagnosis: Determining the probability of diseases given
symptoms, e.g., estimating the probability of flu given fever and cough.
Predictive Analytics: Forecasting future events based on historical data.
Natural Language Processing: Understanding context and meaning in text.
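A minimal Bayes'-theorem sketch for the medical-diagnosis application, in plain Python. All the probabilities below are made-up values used only to show the calculation of P(flu | fever):

```python
# Minimal Bayes' theorem sketch with assumed (made-up) probabilities.
p_flu = 0.05                 # prior P(flu)                (assumed value)
p_fever_given_flu = 0.90     # likelihood P(fever | flu)   (assumed value)
p_fever_given_no_flu = 0.10  # P(fever | no flu)           (assumed value)

p_fever = p_fever_given_flu * p_flu + p_fever_given_no_flu * (1 - p_flu)  # total probability
p_flu_given_fever = p_fever_given_flu * p_flu / p_fever                   # Bayes' theorem

print(f"P(flu | fever) = {p_flu_given_fever:.3f}")   # about 0.321
```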

5. Explain Confusion Matrix with respect to detection of “Spam e-mails”.


6. With a suitable example, explain Face Recognition using Machine Learning

Unit 7
1. Define:
a. Supervised Learning
b. Classification
c. Regression
d. Learning

Supervised Learning
Supervised Learning is a type of machine learning where the model is trained on a
labeled dataset, meaning the data has both input and output values.

Classification
Classification is a type of supervised learning where the output variable is
categorical. The aim is to predict the category or class to which a new observation
belongs. For example, classifying emails as 'spam' or 'not spam'.

Regression
Regression is another type of supervised learning where the output variable is a
continuous value. The goal is to predict a numeric value based on input features.
For instance, predicting house prices based on location, size, etc.

Learning
The process of gaining knowledge from data.
2. Explain Supervised Learning in details.
→ Supervised Learning (Detailed but Precise):

Supervised learning is a type of machine learning where the model is trained on a labeled dataset,
meaning each training example has an input-output pair. The model learns to map inputs (features)
to the correct output (labels) by identifying patterns in the data.

Key Concepts:
1. Labeled Data:
○ In supervised learning, the dataset contains inputs X (features) and
corresponding outputs Y (labels). The goal is to predict the output for new,
unseen inputs based on the patterns learned from the training data.
2. Training Process:
○ The algorithm learns by minimizing the difference between the predicted outputs
and the actual outputs using a loss function. Over time, the model adjusts its internal
parameters (weights) to improve its accuracy.
3. Types of Supervised Learning:
○ Classification: When the output is a discrete label (e.g., cat vs. dog).
○ Regression: When the output is continuous (e.g., predicting house prices).
4. Steps Involved:
○ Data Collection: Gather labeled data relevant to the problem.
○ Data Preprocessing: Clean, transform, and split the data into training and testing
sets.
○ Model Training: Train the model on the training set using the labeled data.
○ Model Evaluation: Test the model on the testing set to check its accuracy.
○ Model Improvement: Fine-tune the model by adjusting hyperparameters or using
better algorithms if necessary.
5. Applications:
○ Email spam detection, image recognition, medical diagnosis, and stock price
prediction.

Supervised learning is effective for tasks where labeled data is available, and the goal is to make
accurate predictions based on past examples.

3. What are the Classification Model in Supervised Machine Learning.


4. Explain the process of Supervised Machine Learning.

5. List Classification algorithms. Explain Decision Tree as classification method.
List of Classification Algorithms
Classification algorithms are used to assign data into different categories (labels or
classes). Here are some popular classification algorithms:

Logistic Regression
Decision Trees
Random Forest
Support Vector Machines (SVM)
K-Nearest Neighbors (KNN)
Naive Bayes
Artificial Neural Networks (ANN)
Gradient Boosting (e.g., XGBoost, LightGBM)
AdaBoost
Multilayer Perceptron (MLP)
Decision Tree as a Classification Method (For 4 Marks)
A Decision Tree is a supervised learning algorithm used for classification. It works by
splitting data into branches based on feature values to make decisions.
Structure: It has a tree-like structure with:

Nodes: Representing decisions based on a feature.


Branches: Representing possible outcomes of the decisions.
Leaves: Representing the final class label (e.g., "Yes" or "No").
Working:

The algorithm selects the best feature to split the data using measures like Gini Impurity or
Entropy.
Data is recursively split until it reaches a leaf node, which provides the classification.
Advantages:

Easy to visualize and interpret.


Can handle both numerical and categorical data.
Disadvantages:

Prone to overfitting if the tree becomes too complex.


Unstable: Small data changes can result in a completely different tree.


Alternative, more detailed answer:

Classification Algorithms:

1. Logistic Regression
2. Decision Tree
3. Support Vector Machine (SVM)
4. K-Nearest Neighbors (KNN)
5. Naive Bayes
6. Random Forest
7. Gradient Boosting (e.g., XGBoost, LightGBM)
8. Neural Networks

Decision Tree as a Classification Method:

A Decision Tree is a supervised machine learning algorithm used for both classification and
regression tasks. It works by splitting the data into subsets based on the most significant feature
values. This process continues recursively, creating branches and forming a tree-like structure
where each node represents a feature, each branch represents a decision, and each leaf node
represents the outcome (class label).

Key Concepts:

1. Nodes: Represent a feature or attribute.


2. Branches: Represent the outcome of a decision based on the node (feature).
3. Leaf Nodes: Represent the final class label (classification result).
4. Splitting: The process of dividing the data based on a feature that gives the highest
information gain or lowest impurity (e.g., Gini Index, Entropy).
5. Root Node: The top-most node, which represents the most important feature.

Process:

1. Select the Best Feature: The decision tree algorithm evaluates all features and selects the
one that best splits the data into different classes. The most common criteria are Gini Index
and Information Gain (based on Entropy).
2. Splitting: The chosen feature divides the dataset into subsets, making decisions at each
branch based on the feature's value. This is done recursively at each level of the tree.
3. Stopping Criteria: The tree continues to split until one of the stopping criteria is met:
○ Maximum depth is reached.
○ No more features left to split.
○ All the data in a node belong to a single class (pure node).
4. Prediction: Once the tree is built, new data can be classified by following the branches
based on feature values until a leaf node (class label) is reached.

Example:

Consider a decision tree to classify if an email is spam or not. Features could include:

● Presence of certain keywords.


● Length of the email.
● Number of links.

At each node, the decision tree splits based on the most important feature, e.g., if certain keywords
are present, leading to branches such as "Spam" or "Not Spam."

Advantages:

● Easy to interpret: Visual and straightforward.


● Handles both numerical and categorical data.
● No need for feature scaling.

Disadvantages:

● Prone to overfitting: If the tree grows too large, it may capture noise.
● Instability: A small change in data can lead to a completely different tree.
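A minimal decision-tree sketch, assuming scikit-learn; the tiny spam-style dataset below is invented to mirror the example above:

```python
# Minimal decision-tree classification sketch on an invented spam-style dataset.
from sklearn.tree import DecisionTreeClassifier, export_text

# features: [has_keyword, email_length, num_links]
X = [[1, 50, 5], [0, 200, 0], [1, 30, 8], [0, 150, 1], [1, 60, 4], [0, 300, 0]]
y = ["Spam", "Not Spam", "Spam", "Not Spam", "Spam", "Not Spam"]

tree = DecisionTreeClassifier(criterion="gini", max_depth=3).fit(X, y)
print(export_text(tree, feature_names=["has_keyword", "length", "links"]))  # the learned splits
print(tree.predict([[1, 40, 6]]))   # classify a new email
```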

6. List Regression Algorithms. Explain Linear Regression as Regression Model.


→ List of Regression Algorithms
Regression algorithms are used to predict continuous values (numbers). Here are some
common regression algorithms:

Linear Regression
Polynomial Regression
Lasso Regression
Decision Tree Regression
Random Forest Regression
Support Vector Regression (SVR)
Linear Regression as a Regression Model
Linear Regression is one of the simplest and most widely used regression algorithms. It
models the relationship between two variables by fitting a straight line through the data
points.

1. Objective
The goal of linear regression is to predict a continuous target variable (Y) based on the
input features (X). It assumes a linear relationship between the input and the output.
2. Equation
The linear regression model is based on the equation of a straight line:
Y = b0 + b1X
where Y is the predicted output, X is the input feature, b0 is the intercept, and b1 is the slope
(coefficient) learned from the data.

3. How It Works
The model tries to find the best-fitting line by minimizing the error (difference) between
the predicted values and the actual values.
The most common method for this is Ordinary Least Squares (OLS), which minimizes the
sum of squared errors.
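A minimal linear-regression sketch, assuming scikit-learn and synthetic house-size/price data; the true intercept and slope are fixed here only so the fitted values can be checked:

```python
# Minimal linear regression sketch: fit a straight line by ordinary least squares.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.uniform(500, 3000, size=(100, 1))                    # house size in sqft
y = 50_000 + 120 * X.ravel() + rng.normal(0, 20_000, 100)    # price with noise

model = LinearRegression().fit(X, y)                 # OLS under the hood
print("intercept b0:", model.intercept_)             # close to 50,000
print("slope b1:", model.coef_[0])                   # close to 120
print("predicted price for 1500 sqft:", model.predict([[1500]])[0])
```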
7. Differentiate Linear Regression and Logistic Regression.

8. Why Support Vector Machines (SVM) Classifiers have improved classification over
Linear ones? Discuss Hyperplane in SVM.
→ Why SVM Classifiers Have Improved Classification Over Linear
Ones:

1. Better Handling of Non-linear Data:


○ Linear classifiers work well when the data is linearly separable (can be divided by a
straight line), but many real-world problems involve non-linear data. SVM uses
kernel tricks to transform data into higher dimensions, making it easier to classify
non-linear data.
2. Maximizing Margin:
○ SVM focuses on maximizing the margin between different classes, which improves
generalization and makes the model more robust. Linear classifiers do not
specifically focus on maximizing the margin.
3. Flexibility with Kernels:
○ SVM allows the use of various kernels (e.g., polynomial, radial basis function) to
adapt to different types of data, unlike linear classifiers that stick to a linear
approach.

Hyperplane in SVM:

A hyperplane is the decision boundary that separates different classes in an SVM model. In a
two-dimensional space, it’s a line; in higher dimensions, it becomes a plane or a hyperplane.

● Goal of SVM: To find the optimal hyperplane that not only separates the classes but also
maximizes the margin, which is the distance between the hyperplane and the closest data
points from both classes (called support vectors).
● Support Vectors: These are the critical points closest to the hyperplane that influence its
position and orientation.

SVM strives for the hyperplane with the largest margin, which helps improve classification
performance, especially for complex datasets
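A minimal sketch contrasting a linear classifier with an RBF-kernel SVM, assuming scikit-learn and its make_circles helper, which generates data that is not linearly separable:

```python
# Minimal sketch: linear classifier vs. RBF-kernel SVM on non-linear data.
from sklearn.datasets import make_circles
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = make_circles(n_samples=300, factor=0.4, noise=0.05, random_state=0)

linear = LogisticRegression().fit(X, y)
svm_rbf = SVC(kernel="rbf", C=1.0).fit(X, y)         # kernel trick + max-margin hyperplane

print("linear model accuracy:", linear.score(X, y))  # roughly 0.5 (chance level)
print("RBF SVM accuracy:", svm_rbf.score(X, y))      # close to 1.0
```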

9. Are True Positive and True Negative enough for accurate classification? If only
False Negative is reduced, does it lead to skewed classification? Give reasons for
your answers.
Unit 8
1. Define:
a. Unsupervised Learning
b. Clustering
c. Association
d. Confusion Matrix
2. Supervised vs. Unsupervised Learning
3. Explain Applications of unsupervised Machine Learning
4. What is Clustering. Explain K-mean clustering algorithm.
5. Write a note on
6. Write a note on KNN.
7. Define Association rules. Explain application of Association Rule.
8. Describe the K-nearest Neighbor learning algorithm for a continuous-valued target function.
9. Discuss the major drawbacks of K-nearest Neighbor learning Algorithm and how it can be
corrected
10. Define the following terms with respect to K - Nearest Neighbor Learning:
i) Regression ii) Residual iii) Kernel Function.
11. Define the following terms
a. Sample error
b. True error
c. Random Variable
d. Expected value
e. Variance
f. standard Deviation
12. Explain Binomial Distribution with an example.
13. Explain Normal or Gaussian distribution with an example.
14. Explain the Central Limit Theorem with an example.
15. Write the Procedure for estimating the difference in error between two learning methods.
Approximate confidence intervals for this estimate
16. Describe K-nearest neighbour algorithm. Why is it known as instance-based Learning?

Unit 9
1. Define:
a. Neural Network
b. Neurons
c. Activation Function
d. Backpropagation
e. Deep Learning
2. Explain Types of Activation functions in details.
3. Explain various type of neural network.
4. Explain ANN in details
5. List the Architecture of Neural Network. Explain each in details.
6. Explain Backpropagation in ANN.
7. Write a note on RNN
8. Write a short note on feed forward neural network.
9. Write a note on CNN
10. Explain Deep Learning
11. What is the difference between Machine Learning and Deep Learning?
12. Explain the concept of a Perceptron with a neat diagram.
13. Discuss the Perceptron training rule.
14. Under what conditions the perceptron rule fails and it becomes necessary to apply
the delta rule
15. What do you mean by Gradient Descent?
16. Derive the Gradient Descent Rule.
17. What are the conditions in which Gradient Descent is applied.
18. What are the difficulties in applying Gradient Descent.
19. Differentiate between Gradient Descent and Stochastic Gradient Descent
20. Define Delta Rule.
21. Derive the Backpropagation rule considering the training rule for Output Unit
weights and Training Rule for Hidden Unit weights
22. Write the algorithm for Back propagation.
23. Explain how to learn Multilayer Networks using Gradient Descent Algorithm.
24. What is Squashing Function?
25. What is Cost function in Back Propagation? Discuss Back propagation algorithm.
26. What is a Neural Network (NN)? With an example, discuss most suitable NN application.
