Noida Institute of Engineering and Technology, Greater Noida

Artificial Intelligence & Machine Learning

Unit: 3
Supervised Learning

Dr. Raju
Assistant Professor & HoD, Department of CSE (AIML)

Course Details
B.Tech 3rd Sem. (Sec A), Online & Offline

Faculty Introduction

• Name: Dr. Raju
• Qualification: Ph.D.
• Experience: More than 9 years
• Subjects Taught: Neural Networks, DBMS, Object-Oriented Programming, Computer Graphics, COA, Digital Image Processing, Computer Applications

Course Outcomes (CO) and Bloom's Knowledge Level (KL)

After completion of this course, students will be able to:

CO1: Choose and apply the most suitable search algorithm for a given problem to find the goal state. (K3)
CO2: Comprehend and apply feature engineering and data visualization concepts. (K3)
CO3: Critically analyze the strengths and weaknesses of various regression and classification algorithms. (K5)
CO4: Develop approaches that incorporate appropriate clustering algorithms to solve a specific data clustering problem. (K3)
CO5: Analyze efficiency using ensemble learning techniques, probabilistic learning, and reinforcement learning algorithms. (K4)

Syllabus

Unit-I: Introduction to AI and Problem-Solving Methods
Introduction to AI and intelligent agents, different approaches to AI. Problem solving by searching techniques: Uninformed search (BFS, DFS, iterative deepening, bidirectional search); Informed search (heuristic search, Greedy Best-First Search, A* search); Local search algorithms (Hill Climbing and Simulated Annealing); Adversarial search and game playing (minimax, alpha-beta pruning); constraint satisfaction problems.

Unit-II: Machine Learning & Feature Engineering
Introduction to machine learning, types of machine learning. Feature engineering: features and their types, handling missing data, dealing with categorical features. Working with features: feature scaling, feature selection. Feature extraction: Principal Component Analysis (PCA) algorithm.

Unit-III: Supervised Learning
Regression and classification: types of regression (univariate, multivariate, polynomial), Mean Squared Error, R-squared, logistic regression. Regularization: bias and variance, overfitting and underfitting, L1 and L2 regularization, regularized linear regression. Decision trees (ID3, C4.5, CART), confusion matrix, k-fold cross-validation, K-Nearest Neighbour, Support Vector Machine.

Unit-IV: Unsupervised Machine Learning
Introduction to clustering. Types of clustering: K-means, K-mode, K-medoid; hierarchical clustering, single linkage, multiple linkage, AGNES and DIANA algorithms; Gaussian mixture models, density-based clustering, DBSCAN.

Unit-V: Ensemble & Reinforcement Learning
Probabilistic learning: Bayesian learning, Naive Bayes classifier, Bayesian belief networks. Ensemble learning: Random Forest, Gradient Boosting, XGBoost. Reinforcement learning: introduction to reinforcement learning, models of reinforcement learning (Markov decision process), Q-learning.
Course Contents / Syllabus

UNIT-III Supervised Learning 8 Hours


Regression & Classification: Types of regression (Univariate, Multivariate,
Polynomial), Mean Square Error, R square error, Logistic Regression,
Regularization: Bias and Variance, Overfitting and Underfitting, L1 and L2
Regularization, Regularized Linear Regression, Decision Trees (ID3, C4.5, CART),
Confusion matrix, k-folds cross-validation, K Nearest Neighbour, Support vector
machine.

Regression
• Regression is a supervised learning technique that finds the correlation between variables and enables us to predict a continuous output variable based on one or more predictor variables.
• Regression refers to a type of predictive modeling technique used to estimate the relationships among variables.
• It involves predicting a continuous outcome variable based on one or more predictor variables (features).
• It is a statistical method for modeling the relationship between a dependent (target) variable and one or more independent (predictor) variables.
• It predicts continuous/real values such as temperature, age, salary, price, etc.
• It is mainly used for prediction, forecasting, time-series modeling, and determining cause-and-effect relationships between variables.
Regression
• "Regression shows a line or curve that passes through all the datapoints on
target-predictor graph in such a way that the vertical distance between the
datapoints and the regression line is minimum."

Terminologies Related to the Regression Analysis
• Dependent Variable: The main factor in regression analysis that we want to predict or understand is called the dependent variable. It is also called the target variable.
• Independent Variable: The factors that affect the dependent variable, or that are used to predict its values, are called independent variables, also called predictors.
• Outliers: An outlier is an observation with a very low or very high value in comparison to the other observed values. Outliers can distort the fitted model, so they should be handled carefully.

Terminologies Related to the Regression Analysis
• Multicollinearity: If the independent variables are highly correlated with each other, the condition is called multicollinearity. It should not be present in the dataset because it creates problems when ranking the most influential variables.
• Underfitting and Overfitting: If our algorithm works well on the training dataset but not on the test dataset, the problem is called overfitting. If our algorithm does not perform well even on the training dataset, the problem is called underfitting.

Types of Regression

Type of Regression: Univariate
• Univariate regression refers to a statistical technique that analyzes the
relationship between a single independent variable (predictor) and a single
dependent variable (outcome). The goal is to model how changes in the
independent variable affect the dependent variable.
• Simple Linear Regression
Y = a + bX + ε
Use Case: Predicting outcomes like sales based on advertising spend
• Polynomial Regression
Y = a + b₁X + b₂X² + b₃X³ + ... + bₙXⁿ + ε
Use Case: Modeling relationships where the effect of the independent
variable changes at different levels, such as growth patterns that are
quadratic.
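
The two univariate forms above can be fitted directly with scikit-learn; the sketch below uses made-up data purely for illustration, fitting a simple linear model and a degree-2 polynomial model on the same single predictor.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Illustrative data: one predictor X, one target Y (values are invented)
X = np.array([[1], [2], [3], [4], [5], [6]], dtype=float)
Y = np.array([2.1, 4.3, 9.2, 16.1, 25.3, 36.2])

# Simple linear regression: Y = a + b*X
lin = LinearRegression().fit(X, Y)
print("linear: intercept a =", lin.intercept_, "slope b =", lin.coef_)

# Polynomial regression: Y = a + b1*X + b2*X^2 (degree-2 expansion of the same X)
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
quad = LinearRegression().fit(X_poly, Y)
print("quadratic: intercept a =", quad.intercept_, "coefficients b1, b2 =", quad.coef_)
```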
Type of Regression: Univariate
• Logarithmic Regression
Y = a + b·log(X) + ε
Use Case: Analyzing phenomena like the relationship between income and
consumption, where increases in income lead to smaller increases in
consumption.
• Exponential Regression
Y = a·e^(bX)
Use Case: Modeling population growth or radioactive decay.

Type of Regression: Multivariate Regression
• Multivariate regression involves the analysis of multiple independent
variables to predict a single dependent variable.
• Multiple Linear Regression
Y = a + b₁X₁ + b₂X₂ + ... + bₙXₙ + ε
Use Case: Predicting a person’s weight based on height, age, and
exercise frequency.
• Ridge Regression
Minimize: Σᵢ (yᵢ − ŷᵢ)² + λ · Σⱼ bⱼ², where λ is the regularization parameter.


Use Case: Useful when there are many predictors, and you want to reduce model
complexity.
Type of Regression: Multivariate Regression
• Lasso Regression
Minimize: Σᵢ (yᵢ − ŷᵢ)² + λ · Σⱼ |bⱼ|
Use Case: Effective for variable selection in models with a large number of
predictors.
• Elastic Net Regression
Minimize: Σᵢ (yᵢ − ŷᵢ)² + λ₁ · Σⱼ |bⱼ| + λ₂ · Σⱼ bⱼ²
• Use Case: Useful when there are many correlated variables and you want
a more robust model.

Mean Square Error (MSE)
• Mean Squared Error (MSE) is a common metric used to evaluate the performance of regression models. It measures the average squared difference between the actual (observed) values and the predicted values generated by a model. The formula for MSE is:

  MSE = (1/n) · Σᵢ (yᵢ − ŷᵢ)²

  where yᵢ is the actual value, ŷᵢ is the predicted value, and n is the number of observations.

Mean Square Error (MSE)

Observation | Actual (y) | Predicted (ŷ) | y − ŷ | (y − ŷ)²
1 | 3 | 2.5 | 0.5 | 0.25
2 | −0.5 | 0 | −0.5 | 0.25
3 | 2 | 2 | 0 | 0
4 | 7 | 8.5 | −1.5 | 2.25

MSE = (0.25 + 0.25 + 0 + 2.25) / 4 = 0.6875

• The Mean Squared Error (MSE) for this dataset is 0.6875. This value gives us an
indication of how well the predicted values match the actual values, with a lower MSE
representing better model performance.
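
As a cross-check, the MSE in the table above can be reproduced in a few lines, either from the formula directly or with scikit-learn's mean_squared_error:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_actual = np.array([3.0, -0.5, 2.0, 7.0])   # actual values from the table
y_pred   = np.array([2.5,  0.0, 2.0, 8.5])   # predicted values from the table

# Direct formula: mean of the squared differences
mse_manual = np.mean((y_actual - y_pred) ** 2)

# Library version
mse_sklearn = mean_squared_error(y_actual, y_pred)

print(mse_manual, mse_sklearn)   # both print 0.6875
```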

Importance of MSE
• Performance Evaluation:
• MSE provides a quantitative measure of how well a regression model predicts
outcomes. A lower MSE indicates a better fit to the data, meaning the model's
predictions are closer to the actual values.
• Sensitivity to Outliers:
• Since MSE squares the errors, it gives greater weight to larger errors. This
sensitivity makes it effective for detecting models that may not perform well on
extreme values, although it can also make the metric overly influenced by outliers.
• Optimization Objective:
• Many machine learning algorithms, particularly those based on gradient descent
(e.g., linear regression, neural networks), use MSE as the loss function to minimize
during training. By minimizing MSE, models learn to make better predictions.

Importance of MSE
• Comparative Analysis:
• MSE allows for easy comparison between different models or algorithms. By
evaluating multiple models using MSE, practitioners can select the one with the best
performance based on this metric.
• Interpretable Metric:
• Although MSE itself is in squared units of the target variable, it is straightforward to
interpret. When paired with the square root (resulting in Root Mean Squared Error,
RMSE), it can be expressed in the same units as the target variable, enhancing
interpretability.

R-squared
• R-squared, also known as the coefficient of determination, is a statistical measure that
indicates how well the independent variables in a regression model explain the
variability of the dependent variable.
• It provides an indication of the goodness of fit of the model:

  R² = 1 − (SS_res / SS_tot)

  where SS_res = Σᵢ (yᵢ − ŷᵢ)² is the residual sum of squares and SS_tot = Σᵢ (yᵢ − ȳ)² is the total sum of squares.

Interpretation of R-squared
• Range: R-squared values range from 0 to 1.
• 0: Indicates that the model does not explain any variability in the
dependent variable (the mean of the dependent variable is the best
predictor).
• 1: Indicates that the model explains all the variability in the dependent
variable (perfect prediction).
• Value Interpretation:
• An R² value of 0.70, for example, suggests that 70% of the variability in the dependent variable can be explained by the independent variables in the model, while the remaining 30% is attributed to other factors or random noise.
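
A short sketch (with arbitrary example values) of how R-squared is computed in practice, both from the SS_res/SS_tot definition above and with scikit-learn's r2_score:

```python
import numpy as np
from sklearn.metrics import r2_score

y_actual = np.array([3.0, -0.5, 2.0, 7.0])
y_pred   = np.array([2.5,  0.0, 2.0, 8.5])

ss_res = np.sum((y_actual - y_pred) ** 2)            # residual sum of squares
ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)   # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print(r2_manual, r2_score(y_actual, y_pred))         # identical values
```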

Linear Regression (Beyond The Syllabus)
• Linear regression is a statistical regression method which is used for predictive analysis.
• It is one of the simplest algorithms; it works on regression and shows the relationship between continuous variables.
• It is used for solving regression problems in machine learning.
• Linear regression shows the linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis), hence the name linear regression.
• If there is only one input variable (x), it is called simple linear regression; if there is more than one input variable, it is called multiple linear regression.
• For example, a linear regression model can predict the salary of an employee on the basis of years of experience.

Linear Regression:
• Some popular applications of linear regression are:
• Analyzing trends and sales estimates
• Salary forecasting
• Real estate prediction
• Arriving at ETAs in traffic.

Colab Link for Linear Regression:

[Link]
DDNR4vngO2EkZcfFaAM4jItw#scrollTo=0X7hGyLc11EZ

Logistic Regression:
• Logistic regression is a supervised learning algorithm used to solve classification problems.
• In classification problems, the dependent variable is in a binary or discrete format, such as 0 or 1.
• The logistic regression algorithm works with categorical variables such as 0 or 1, Yes or No, True or False, Spam or Not Spam, etc.
• It is a predictive analysis algorithm that works on the concept of probability.
• Logistic regression is a type of regression, but it differs from linear regression in how it is used.
• Logistic regression uses the sigmoid (logistic) function to map the model output to a probability between 0 and 1.
• The sigmoid function used to model the data in logistic regression can be represented as:

  f(x) = 1 / (1 + e^(−x))
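
As an illustrative sketch (the hours-studied vs. pass/fail data below is invented), the sigmoid and a binary logistic-regression fit with scikit-learn might look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    """Logistic function: maps any real value into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: hours studied -> pass (1) / fail (0)
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Predicted probability of passing for a student who studied 2.2 hours
print(model.predict_proba([[2.2]])[0, 1])
print(sigmoid(0.0))   # 0.5 -- the midpoint of the sigmoid
```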

Logistic Regression:

Types of Logistic Regression:

• Binary (0/1, pass/fail)
• Multinomial (cats, dogs, lions)
• Ordinal (low, medium, high)

Binary logistic regression
• Binary logistic regression predicts the relationship between the
independent and binary dependent variables.
• Some examples of the output of this regression type may be,
success/failure, 0/1, or true/false.
• Examples:
• Deciding on whether or not to offer a loan to a bank customer:
Outcome = yes or no.
• Evaluating the risk of cancer: Outcome = high or low.
• Predicting a team’s win in a football match: Outcome = yes or no.

Multinomial logistic regression
• In multinomial logistic regression, the categorical dependent variable has three or more discrete, unordered outcomes. This implies that this regression type has more than two possible outcomes.
• Examples:
• Let’s say you want to predict the most popular transportation type for
2040. Here, transport type equates to the dependent variable, and the
possible outcomes can be electric cars, electric trains, electric buses, and
electric bikes.
• Predicting whether a student will join a college, vocational/trade school,
or corporate industry.
• Estimating the type of food consumed by pets, the outcome may be wet
food, dry food, or junk food.
Ordinal logistic regression

• Ordinal logistic regression applies when the dependent variable is in an ordered state (i.e., ordinal).
• The dependent variable (y) specifies an order with two or
more categories or levels.
• Examples: Dependent variables represent,
• Formal shirt size: Outcomes = XS/S/M/L/XL
• Survey answers: Outcomes = Agree/Disagree/Unsure
• Scores on a math test: Outcomes = Poor/Average/Good

Key advantages of logistic regression

Colab Link for Logistic Regression:

• Case Study: Predicting Diabetes Using Logistic Regression


• [Link]
I_T6uSz2CphiMcLKDYo#scrollTo=IHo9w80zCKoR

Regularization

• Regularization is a technique used in machine learning and statistics to prevent overfitting, which occurs when a model learns the noise in the training data instead of the underlying patterns.
• Purpose:
• To improve model generalization and performance on unseen data by reducing
overfitting.
• Common Types:
• L1 Regularization (Lasso): Adds a penalty equal to the absolute value of the
coefficients. It can produce sparse models by driving some coefficients to zero,
effectively selecting features.
• L2 Regularization (Ridge): Adds a penalty equal to the square of the coefficients. It
tends to shrink coefficients evenly, preventing any one feature from having too
much influence.
• Elastic Net: Combines both L1 and L2 penalties, allowing for feature selection and
coefficient shrinkage.
Regularization

• Techniques:
• Dropout: Randomly sets a fraction of neurons to zero during training in
neural networks, which helps prevent co-adaptation.
• Early Stopping: Involves monitoring the model's performance on a
validation set and stopping training when performance begins to
degrade.
• Benefits:
• Helps to avoid overfitting.
• Encourages simpler models, which are often more interpretable.
• Can improve prediction accuracy on new data.

Regularization: Bias and Variance

• An error is a measure of how accurately an algorithm can make predictions for a previously unknown dataset.
• Reducible errors: These errors can be reduced to improve model accuracy. They can be further classified into bias and variance.
• Irreducible errors: These errors will always be present in the model, regardless of the algorithm used, because they arise from noise inherent in the data.

Regularization: Bias and Variance

• Bias: The difference between the model's average predictions and the actual (expected) values is known as bias error, or error due to bias.
• Low Bias: A low-bias model makes fewer assumptions about the form of the target function.
• High Bias: A high-bias model makes more assumptions and becomes unable to capture the important features of the dataset. A high-bias model also cannot perform well on new data.
• Some examples of machine learning algorithms with low bias are Decision Trees, k-Nearest Neighbours, and Support Vector Machines.
• Algorithms with high bias include Linear Regression, Linear Discriminant Analysis, and Logistic Regression.

Regularization: Bias and Variance

• Ways to reduce High Bias:


• Increase the input features as the model is underfitted.
• Decrease the regularization term.
• Use more complex models, such as including some polynomial
features.

Regularization: Bias and Variance

• Variance
• Variance specifies how much the prediction of the target function would change if a different training dataset were used.
• It tells how much a random variable differs from its expected value.
• Ideally, a model should not vary too much from one training dataset to another, which means the algorithm should be good at understanding the hidden mapping between input and output variables.
• Variance errors are either low variance or high variance.
• Low variance means there is only a small variation in the prediction of the target function when the training dataset changes.
• High variance shows a large variation in the prediction of the target function with changes in the training dataset.

Regularization: Bias and Variance

• A model with high variance has the following problems:
• A high-variance model leads to overfitting.
• It increases model complexity.
• Ways to Reduce High Variance:
• Reduce the number of input features or parameters, as the model is overfitted.
• Do not use an overly complex model.
• Increase the training data.
• Increase the regularization term.

Regularization: Bias and Variance
• Different Combinations of Bias-Variance
• Low-Bias, Low-Variance:
• The combination of low bias and low variance shows an ideal machine learning model.
However, it is not possible practically.
• Low-Bias, High-Variance:
• With low bias and high variance, model predictions are inconsistent but accurate on average. This case occurs when the model learns with a large number of parameters, which leads to overfitting.
• High-Bias, Low-Variance:
• With high bias and low variance, predictions are consistent but inaccurate on average. This case occurs when a model does not learn well from the training dataset or uses too few parameters. It leads to underfitting.
• High-Bias, High-Variance:
• With high bias and high variance, predictions are inconsistent and also inaccurate on average.
Regularization

Regularization: Overfitting

• Overfitting
• Overfitting is an undesirable machine learning behavior that occurs when the
machine learning model gives accurate predictions for training data but not
for new data.
• High variance and low bias.
• Reasons for Overfitting
• The training data size is too small and does not contain enough data samples
to accurately represent all possible input data values.
• The training data contains large amounts of irrelevant information, called
noisy data.
• The model trains for too long on a single sample set of data.
• The model complexity is high, so it learns the noise within the training data.
Regularization: Overfitting

• Symptom of Overfitting:
• Low training error but high validation error.
• The model fits the training data very well but fails to generalize to new data.

Regularization: Underfitting

• Underfitting
• It occurs when a model is too simple to capture the complexities of the data.
• It represents the inability of the model to learn the training data effectively, resulting in poor performance on both the training and testing data.
• In simple terms, an underfit model's predictions are inaccurate, especially when applied to new, unseen examples.
• It mainly happens when we use a very simple model with overly simplified assumptions.
• To address the underfitting problem, we need to use more complex models, with enhanced feature representation and less regularization.
• An underfitting model has high bias and low variance.

Regularization: Underfitting

• Reasons for Underfitting


• The model is too simple, so it may not be capable of representing the complexities in the data.
• The input features used to train the model are not adequate representations of the underlying factors influencing the target variable.
• The size of the training dataset is not large enough.
• Excessive regularization is used to prevent overfitting, which constrains the model from capturing the data well.
• Features are not scaled.

Regularization: Underfitting

• Techniques to Reduce Underfitting


• Increase model complexity.
• Increase the number of features, performing feature engineering.
• Remove noise from the data.
• Increase the number of epochs or increase the duration of training
to get better results.
• Symptoms:
• High training error and high validation error.
• The model does not fit the training data well, leading to poor
predictions even on known data.
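
To make the overfitting/underfitting symptoms above concrete, here is an illustrative sketch on synthetic data (not from the slides): polynomials of increasing degree are fitted to noisy samples, and training vs. validation MSE are compared.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=60)  # noisy sine curve

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 4, 15):   # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    val_err = mean_squared_error(y_val, model.predict(X_val))
    # Underfit: both errors high. Overfit: low training error, higher validation error.
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  validation MSE={val_err:.3f}")
```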
Regularization: Underfitting vs. Overfitting

Regularization: L1 (Lasso)

• LASSO (Least Absolute Shrinkage and Selection Operator) is a regression analysis method that incorporates L1 regularization.
• It's particularly useful for models that can benefit from feature selection and addressing multicollinearity.

Key Components of L1 (Lasso)

• L1 Regularization: LASSO adds a penalty equal to the absolute value of the coefficients, multiplied by a tuning parameter (λ), to the loss function. The objective function becomes:

  Minimize: Σᵢ (yᵢ − ŷᵢ)² + λ · Σⱼ |βⱼ|

• Feature Selection: LASSO tends to shrink some coefficients to exactly zero when the tuning parameter λ is sufficiently large. This means it effectively selects a simpler model by excluding some features, which can improve interpretability.

Key Components of L1 (Lasso)

• Bias-Variance Tradeoff: By introducing the penalty, LASSO can help reduce overfitting, making the model more generalizable to unseen data. However, this can introduce some bias.

• Tuning Parameter (λ): The choice of λ is crucial. A smaller λ leads to a model closer to ordinary least squares (OLS), while a larger λ increases the amount of shrinkage and feature selection. Techniques like cross-validation are commonly used to find an optimal λ.

• Applications: LASSO is widely used in scenarios with many predictors, especially when some of them might be irrelevant or redundant. It's popular in fields like genetics, finance, and machine learning.
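
A brief sketch of the feature-selection behaviour described above, using scikit-learn's Lasso on synthetic "house-price" data with hypothetical feature names: as alpha (the λ of the slides) grows, more coefficients are driven exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(42)
n = 200
# Synthetic house features: only size and bedrooms truly drive the price here
size      = rng.normal(1500, 300, n)
bedrooms  = rng.randint(1, 6, n)
age       = rng.uniform(0, 50, n)
noise_col = rng.normal(0, 1, n)          # an irrelevant feature

X = np.column_stack([size, bedrooms, age, noise_col])
y = 200 * size + 10000 * bedrooms + rng.normal(0, 20000, n)   # "price"

X_scaled = StandardScaler().fit_transform(X)   # LASSO is sensitive to feature scale

for alpha in (10, 1000, 50000):
    coefs = Lasso(alpha=alpha, max_iter=100000).fit(X_scaled, y).coef_
    print(f"alpha={alpha:>6}: coefficients = {np.round(coefs, 1)}")
# Larger alpha -> more coefficients shrink to exactly 0 (irrelevant features dropped)
```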
Example of L1 (Lasso)

• Example Scenario

• Imagine you have a dataset with information about houses, and you
want to predict the price based on various features:

• Features
• Size (square feet)
• Number of bedrooms
• Age of the house
• Number of bathrooms
• Proximity to downtown
• Garage size
Example of L1 (Lasso)

• Dataset

• [Link]
1jeqwzMN5hpfasIFu6J9vub0BSBfonQEm#scrollTo=Dw5LvNSEy0nY
Regularization: L1 (Lasso)

• [Link]
1pD_uCPkmYla71GerEoN56izmX_ubaol0#scrollTo=3LMS-djesEHk

Regularization: L2 (Ridge)
• L2 regularization is a machine learning technique that avoids overfitting by introducing a penalty term into the model's loss function based on the squares of the model's parameters.
• The goal of L2 regularization is to keep the model's parameters small and prevent them from growing too large.
• To achieve L2 regularization, a term proportional to the squares of the model's parameters is added to the loss function.
• This term acts as a constraint on the parameters' size, preventing them from growing out of control.
• The size of the penalty term is controlled by a hyperparameter called lambda (λ), which sets the regularization's intensity: the larger the lambda, the stronger the regularization and the smaller the parameters.
Regularization: L2 (Ridge)

• A regression model that uses the L2 regularization technique is called Ridge regression.
• Ridge regression adds the "squared magnitude" of the coefficients as a penalty term to the loss function (L):

  Minimize: L + λ · Σⱼ βⱼ²
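
A minimal Ridge sketch on synthetic data (assumed values, not from the slides): increasing alpha (λ) shrinks all coefficients toward zero but, unlike LASSO, typically does not set any of them exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Ridge, LinearRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
true_coefs = np.array([5.0, -3.0, 2.0, 0.0, 0.0])
y = X @ true_coefs + rng.normal(scale=0.5, size=100)

print("OLS             :", np.round(LinearRegression().fit(X, y).coef_, 3))
for alpha in (1, 10, 100):
    print(f"Ridge (alpha={alpha:>3}):", np.round(Ridge(alpha=alpha).fit(X, y).coef_, 3))
# Coefficients get smaller as alpha grows, but none become exactly zero
```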

Regularization: L1(Lasso) and L2 (Ridge)

• Link for Code:


• [Link]
1YkOjpnMJdYWvn0oqjJUCKebbqredDlFH#scrollTo=ityALR23ZF-R

• Link for Dataset:


• [Link]

Benefits of Regularization

• Reduces Overfitting: Regularization helps prevent models from learning noise and irrelevant details in the training data.
• Improves Generalization: By discouraging complex models,
regularization ensures better performance on unseen data.
• Enhances Stability: Regularization stabilizes model training by
penalizing large weights.
• Enables Feature Selection: L1 regularization can zero out some
coefficients, effectively selecting more relevant features.
• Manages Multicollinearity: Reduces the problem of high correlations
among features, particularly useful in linear models.

Benefits of Regularization

• Encourages Simplicity: Promotes simpler models that are easier to interpret and less likely to overfit.
• Controls Model Complexity: Provides a mechanism to balance the
complexity of the model with its performance on the training and
test data.
• Facilitates Robustness: Makes models less sensitive to individual
peculiarities in the training set.
• Improves Convergence: Helps optimization algorithms converge
more quickly and reliably by smoothing the error landscape.
• Adjustable Complexity: The strength of regularization can be tuned
to fit the data's specific needs and desired model complexity.

Confusion Matrix
• A confusion matrix is a tool used in machine learning to evaluate the performance of
a classification model.
• It provides a visual representation of the actual versus predicted classifications,
helping to understand the types of errors made by the model.

Confusion Matrix

• True Positive (TP): The number of correct predictions that an instance is positive.
• True Negative (TN): The number of correct predictions that an
instance is negative.
• False Positive (FP): The number of incorrect predictions where an
instance is predicted as positive, but it is actually negative (Type I
error).
• False Negative (FN): The number of incorrect predictions where an
instance is predicted as negative, but it is actually positive (Type II
error).
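
A small sketch of obtaining these four counts with scikit-learn's confusion_matrix; the label vectors below are invented for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# For binary labels ordered [0, 1], confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")
```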

Confusion Matrix

                     Predicted Positive   Predicted Negative
Actual Positive            TP                   FN
Actual Negative            FP                   TN
Key Metrics Derived from the Confusion Matrix

• Accuracy = (TP + TN) / (TP + TN + FP + FN)
• Precision = TP / (TP + FP)
• Recall (Sensitivity) = TP / (TP + FN)
• F1-Score = 2 · (Precision · Recall) / (Precision + Recall)
Case Study: Medical Diagnosis for Diabetes

• Context: A healthcare provider develops a machine learning model to predict whether patients have diabetes based on various health metrics. After training and validating the model, they evaluate its performance using a confusion matrix.
• Data Overview
• The model was tested on a dataset of 1,000 patients, with the following results:
• Actual Diabetic Patients: 150
• Actual Non-Diabetic Patients: 850
• Confusion Matrix:

                              Predicted Diabetic   Predicted Non-Diabetic
  Actual Diabetic (150)            120 (TP)               30 (FN)
  Actual Non-Diabetic (850)         50 (FP)              800 (TN)
Case Study: Medical Diagnosis for Diabetes

• Interpretation of the Confusion Matrix


• True Positives (TP): 120 patients were correctly identified as diabetic.
• False Negatives (FN): 30 patients were diabetic but were incorrectly
classified as non-diabetic.
• False Positives (FP): 50 patients were classified as diabetic, but they were
actually non-diabetic.
• True Negatives (TN): 800 patients were correctly identified as non-diabetic

Case Study: Medical Diagnosis for Diabetes
• Key Metrics
• Accuracy = 92%
• Precision = 70.6%
• Recall = 80%
• F1-Score = 75%
• Insights and Analysis
• High Accuracy: The model has a high accuracy of 92%, which might seem promising, but
accuracy alone does not give a complete picture, especially in medical diagnosis where the
costs of false negatives can be significant.
• Precision vs. Recall: With a precision of 70.6%, the model does reasonably well when it predicts diabetes. However, a recall of 80% means that 20% of diabetic patients (30 out of 150) are being missed, which can be critical in healthcare settings.
• Focus on Recall: Given that missing a diabetic patient can lead to serious health
complications, healthcare providers might prioritize improving recall, even if it results in a
lower precision (increased false positives).
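
The key metrics quoted above follow directly from the counts TP = 120, TN = 800, FP = 50, FN = 30; a quick sketch to reproduce them:

```python
tp, tn, fp, fn = 120, 800, 50, 30

accuracy  = (tp + tn) / (tp + tn + fp + fn)                  # 0.92
precision = tp / (tp + fp)                                   # ~0.706
recall    = tp / (tp + fn)                                    # 0.80
f1        = 2 * precision * recall / (precision + recall)     # ~0.75

print(f"Accuracy={accuracy:.1%}  Precision={precision:.1%}  "
      f"Recall={recall:.1%}  F1={f1:.1%}")
```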
K-fold cross-validation
• Data Partitioning: K-fold cross-validation divides the dataset into
k equal-sized subsets (or "folds"), allowing the model to be trained
and validated multiple times. Each fold serves as the validation set
once while the remaining folds are used for training.
• Performance Estimation: The model is trained and evaluated k
times, and the performance metrics (like accuracy, precision, recall,
etc.) are averaged across all folds. This provides a more reliable
estimate of the model's performance on unseen data.
• Mitigation of Overfitting: By using multiple train-test splits, k-
fold cross-validation helps reduce overfitting, ensuring that the
model generalizes well to new data rather than just memorizing the
training set.
K-fold cross-validation
• Flexibility in Choice of k: The value of k can be adjusted based on the size of
the dataset and the computational resources available. Common choices are k
= 5 or k = 10, but smaller or larger values can be used depending on specific
needs.
• Stratification Option: In cases of imbalanced datasets, stratified k-fold cross-
validation can be employed to ensure that each fold has a similar distribution
of classes, which helps maintain the representativeness of each fold.

How K-Fold Cross-Validation Works
• Data Splitting:
• The entire dataset is randomly divided into k subsets or "folds" of
approximately equal size.
• Common values for k are 5 or 10, but it can be adjusted based on the size
of the dataset.
• Model Training and Validation:
• The model is trained and validated k times.
• In each iteration:
• One fold is used as the validation set, while the remaining k-1 folds are used for
training.
• This process ensures that every instance in the dataset is used for both training and
validation at least once.

How K-Fold Cross-Validation Works
• Performance Measurement:
• After completing all k iterations, the performance metrics (such as
accuracy, precision, recall, F1 score, etc.) are averaged across all
iterations.
• This average provides a more comprehensive view of the model's
performance.
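
A short sketch of 5-fold and stratified 5-fold cross-validation with scikit-learn; the dataset and model here (breast-cancer data, logistic regression with scaling) are chosen only for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Plain k-fold: 5 folds, each used once as the validation set
kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=kf, scoring="accuracy")
print("k-fold accuracies:", scores.round(3), "mean:", scores.mean().round(3))

# Stratified k-fold keeps the class ratio similar in every fold (useful for imbalance)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=skf, scoring="accuracy")
print("stratified accuracies:", scores.round(3), "mean:", scores.mean().round(3))
```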

Advantages of K-Fold Cross-Validation
• Better Generalization: By using multiple train-test splits, k-fold
cross-validation reduces the risk of overfitting and provides a better
estimate of how the model will perform on unseen data.
• Efficient Use of Data: Since each instance in the dataset is used
for both training and validation, it maximizes the utilization of
available data, which is particularly important for smaller datasets.
• Variance Reduction: Averaging the results over multiple folds
helps to smooth out the variability that can occur with a single
train-test split.

Disadvantages of K-Fold Cross-Validation

• Computationally Intensive: K-fold cross-validation can be computationally expensive, especially for large datasets or complex models, as the model must be trained k times.
• Choice of K: Selecting an appropriate value for k is crucial. A very
small k may lead to high variance in the performance estimate,
while a very large k can be computationally expensive.

Regularization

• Regularization: Bias and Variance, Overfitting and Underfitting, L1 and L2 Regularization, Regularized Linear Regression, Decision Trees (ID3, C4.5, CART), Confusion matrix, k-fold cross-validation, K Nearest Neighbour, Support Vector Machine.
