ML File - 1

The document provides a comprehensive introduction to various supervised and unsupervised learning techniques using the Scikit-learn library in Python. It covers exercises on K-Nearest Neighbors, K-Means Clustering, Linear Regression, Logistic Regression, and Decision Trees, detailing their theoretical foundations, code implementations, and practical applications. Each exercise aims to enhance understanding of machine learning concepts and their real-world applications.

Uploaded by

shubhamgoelgaming

Exercise 1: Introduction to Scikit-learn for Supervised Learning

Objective: To gain hands-on experience with implementing supervised learning algorithms using the Scikit-learn library in Python.

Solution:

Theory:

Supervised Learning is a type of machine learning where the algorithm is trained on labeled
data, i.e., input features (X) and corresponding output labels (y). The goal is to learn a mapping
function that can predict the output for new, unseen data.

Scikit-learn (sklearn) is a Python library that provides simple and efficient tools for data
mining and machine learning. It includes several classification, regression, and clustering
algorithms.

In this exercise, we use the K-Nearest Neighbors (KNN) classifier on the Iris dataset, a
popular dataset containing measurements of three types of iris flowers.
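
The KNN classifier on the Iris dataset described above can be sketched as follows. This is a minimal illustration, not the original lab solution: the 70/30 split, the `random_state` value, and `n_neighbors=5` (the scikit-learn default) are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris measurements (X) and species labels (y)
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing (split ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# n_neighbors=5 is the scikit-learn default, stated here for clarity
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Predict species for the unseen test samples and measure accuracy
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", round(accuracy, 3))
```

Each test sample is labeled by a majority vote among its five nearest training samples, so no explicit training of parameters takes place; `fit` only stores the training data.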
Key Features of Scikit-learn:

• Supervised learning: Classification and regression algorithms like:

o Decision Tree

o Random Forest

o K-Nearest Neighbors (KNN)

o Support Vector Machine (SVM)

o Logistic Regression
• Unsupervised learning: Clustering and dimensionality reduction algorithms like:

o K-Means

o PCA (Principal Component Analysis)

• Model selection: Tools for cross-validation, hyperparameter tuning (GridSearchCV,

RandomizedSearchCV)

• Preprocessing: Functions for:

o Data scaling (e.g., StandardScaler)

o Handling missing values

o Encoding categorical variables

• Datasets: Includes built-in sample datasets like:
o Iris

o Digits

o Boston Housing (deprecated in scikit-learn 1.0 and removed in 1.2)

o Wine, etc.
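
Several of the features listed above can be combined in a few lines. The sketch below is an illustration under assumptions, not part of the original exercise: it uses the built-in Wine dataset, scales the features with StandardScaler, and tunes a KNN classifier with GridSearchCV. The step names `"scale"` and `"knn"` and the candidate values for `n_neighbors` are arbitrary choices.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Built-in dataset: 13 chemical measurements, 3 wine classes
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Preprocessing and model chained together, so scaling is fitted
# only on the training folds during cross-validation
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Model selection: 5-fold cross-validation over candidate k values
param_grid = {"knn__n_neighbors": [1, 3, 5, 7, 9]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["knn__n_neighbors"]
test_accuracy = search.score(X_test, y_test)
print("Best n_neighbors:", best_k)
print("Test accuracy:", round(test_accuracy, 3))
```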

Why Use Scikit-learn?

• Easy to learn and use

• Well-documented

• Integrates well with other libraries like NumPy, Pandas, and Matplotlib

• Widely used in education, research, and industry

---------------------------------------------Page End-----------------------------------------------------

Exercise 2: Exploring Unsupervised Learning with K-Means Clustering

Objective: To explore the concepts of unsupervised learning by applying the K-Means clustering algorithm using the Scikit-learn library. This exercise helps in understanding how to group similar data points when no labels are provided.

Solution:

Theory:

What is Unsupervised Learning?

Unsupervised learning is a type of machine learning where the model is not given any labeled
output data. The goal is to discover hidden patterns or groupings in the data.
What is K-Means Clustering?

K-Means is a popular unsupervised learning algorithm used for clustering data into K groups
(clusters) based on feature similarity. It works by:
1. Choosing the number of clusters (K).

2. Randomly selecting initial cluster centroids.

3. Assigning data points to the nearest centroid.

4. Updating centroids based on the mean of points in each cluster.

5. Repeating steps 3 and 4 until convergence.
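
The five steps above can be sketched directly in NumPy. This is a simplified illustration, not scikit-learn's implementation: the function name `kmeans`, the toy data, and the choice to keep an empty cluster's old centroid are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-Means: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the initial centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 3: assign each point to its nearest centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 4: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 5: stop when the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy data: two well-separated groups of 2-D points (made up for illustration)
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])

# Step 1: choose the number of clusters
centroids, labels = kmeans(X, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 depends on the random initialization, so only the grouping itself is meaningful, not the label numbers.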

Code Implementation:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load the customer data (the CSV must contain 'Age' and 'Income($)' columns)
df = pd.read_csv('/content/[Link]')
df.head()
df.info()

# Feature scaling: standardize each column to zero mean and unit variance
scaler = StandardScaler()
df['Age'] = scaler.fit_transform(df[['Age']])
df['Income($)'] = scaler.fit_transform(df[['Income($)']])
print(df)

# Plot the data points
plt.figure(figsize=(10, 4))
plt.scatter(df['Age'], df['Income($)'], s=100)
plt.xlabel('Age')
plt.ylabel('Income')
plt.title('Customer Data')
plt.show()

# Select the features for clustering
X = df[['Age', 'Income($)']]
km = KMeans(n_clusters=5)
ypred = km.fit_predict(X)  # cluster index (0 to 4) for each row
ypred

df['cluster'] = ypred
print(df)

# Get the centroids of the clusters
centroid = km.cluster_centers_
centroid

# Plot every cluster in its own color (one color per cluster, five in total)
colors = ['green', 'red', 'blue', 'orange', 'black']
for i, color in enumerate(colors):
    members = df[df['cluster'] == i]
    plt.scatter(members['Age'], members['Income($)'], color=color, label=f'cluster{i + 1}', s=150)

# Draw the centroids
plt.scatter(centroid[:, 0], centroid[:, 1], s=200, marker="*", color='purple', label='centroid')
plt.legend()
plt.show()

Output:

Fig no.1

Fig no.2

---------------------------------------------------Page End-----------------------------------------------

Exercise 3: Implementing Linear Regression from Scratch.

Objective: To gain a deeper understanding of linear regression by implementing it from scratch using Python.

Solution:
Theory:

What is Linear Regression?

Linear regression is a supervised learning algorithm used for predicting continuous values. It
assumes a linear relationship between the input feature x and the output y, modeled by the
equation:

$$y = mx + c$$

Where:
• m is the slope (also called weight or coefficient),

• c is the intercept (bias),

• x is the independent variable,

• y is the dependent variable (target).
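
As a complement to the definitions above, the slope m and intercept c can be computed without any library using the closed-form least-squares formulas, m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and c = ȳ − m·x̄. The sketch below is a minimal illustration; the function names `fit_line` and `predict` and the data points are made up for the example.

```python
def fit_line(xs, ys):
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x
    m = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    # Intercept: the fitted line passes through the point of means
    c = y_mean - m * x_mean
    return m, c

def predict(x, m, c):
    """Evaluate the fitted line at x."""
    return m * x + c

# Made-up data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

m, c = fit_line(xs, ys)
print(m, c)                # prints: 2.0 1.0
print(predict(6.0, m, c))  # prints: 13.0
```

For a single input feature, these formulas yield the same line that an ordinary least-squares fit produces.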

Code Implementation:

import numpy as np
import matplotlib.pyplot as mtp
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

data_set = pd.read_csv("Salary_Data.csv")
x = data_set.iloc[:, :-1].values  # input feature: every column except the last
y = data_set.iloc[:, 1].values    # target: second column
print(x)
print(y)

# Split the data set into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)

# Fit the simple linear regression model to the training set
# ("regressor" is just a variable name; any other name works)
regressor = LinearRegression()
regressor.fit(x_train, y_train)

# Predict the test set and training set results
y_pred = regressor.predict(x_test)
print(y_pred)
x_pred = regressor.predict(x_train)  # predicted targets for the training inputs
print(x_pred)

# Visualize the training set results
mtp.scatter(x_train, y_train, color="green")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Training set)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()

# Visualize the test set results (the red line is the same fitted line)
mtp.scatter(x_test, y_test, color="blue")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Test Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()
Output:

Fig no.1

Fig no.2
------------------------------------------------------------

Exercise 4: Binary Classification with Logistic Regression.

Objective: To implement logistic regression for binary classification tasks and understand its
application in real-world scenarios.

Solution:

Theory:

What is Binary Classification?

Binary classification is a supervised learning task where the output variable has only two
possible classes, e.g., yes/no, 0/1, true/false.

What is Logistic Regression?

Logistic Regression is a statistical model used for classification tasks. It estimates the
probability that a given input point belongs to a certain class using the sigmoid (logistic)
function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \quad \text{where } z = w \cdot x + b$$

The output is a probability between 0 and 1, and a threshold (usually 0.5) is used to assign class
labels.
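
The sigmoid function and the 0.5 threshold can be illustrated with a few lines of plain Python. This is a sketch under assumptions: the weights `w`, bias `b`, and the sample input are hypothetical values chosen only to show the computation, not parameters learned from data.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids overflow for large negative z
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_class(x, w, b, threshold=0.5):
    """Return (probability, label) for one sample with feature list x."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    p = sigmoid(z)
    return p, int(p >= threshold)

# Hypothetical weights and bias, chosen only for illustration
w = [0.8, -0.5]
b = 0.1

# z = 0.8*2.0 - 0.5*1.0 + 0.1 = 1.2
p, label = predict_class([2.0, 1.0], w, b)
print(round(p, 3), label)  # prints: 0.769 1
```

Because the probability 0.769 is above the threshold of 0.5, the sample is assigned to class 1.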

Real-world Applications:

• Email spam detection (Spam or Not Spam)

• Disease diagnosis (Positive or Negative)

• Credit risk assessment (Default or Not)

Code Implementation:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder

# Load the Titanic data
data = pd.read_csv("/content/[Link]")
print(data)

# Select features and target (copy to avoid modifying a view of the original frame)
features = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare']
data = data[features + ['Survived']].copy()

# Handle missing values: fill missing ages with the median age
data['Age'] = data['Age'].fillna(data['Age'].median())

# Convert the categorical column 'Sex' to numeric
le = LabelEncoder()
data['Sex'] = le.fit_transform(data['Sex'])  # male = 1, female = 0

# Split features and target
X = data[features]
Y = data['Survived']

# Train-test split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

# Train logistic regression
model = LogisticRegression(max_iter=200)
model.fit(X_train, Y_train)

# Predictions
Y_pred = model.predict(X_test)

# Evaluation
print("Accuracy:", accuracy_score(Y_test, Y_pred))
print("\nClassification Report:\n", classification_report(Y_test, Y_pred))

# Confusion matrix heatmap
cm = confusion_matrix(Y_test, Y_pred)
plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Did Not Survive', 'Survived'],
            yticklabels=['Did Not Survive', 'Survived'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()

Output:

Fig no.1

Exercise 5: Decision Tree Classifier for Multiclass Classification

Objective: To understand the working of decision tree classifiers and their application in
multiclass classification problems.

Solution:
Theory:

What is a Decision Tree Classifier?

A Decision Tree is a supervised machine learning algorithm used for both classification and
regression tasks. It works by splitting the dataset into branches based on feature values,
forming a tree structure. Each node represents a decision based on a feature, and each leaf node
represents a class label.
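
The node and leaf structure described above can be inspected by printing a fitted tree as text. The sketch below is an illustration, not part of the original exercise: it uses scikit-learn's `export_text` on the Iris dataset, and `max_depth=2` and `random_state=0` are assumptions chosen only to keep the printed tree small and repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# max_depth=2 keeps the printed tree small; it is an illustrative choice
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Each "|---" line with a comparison is a decision node on one feature;
# each "class:" line is a leaf holding the predicted class index
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

Reading the output from top to bottom follows a path from the root to a leaf: each indented level is one further split on a feature value.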

What is Multiclass Classification?

Multiclass classification involves classifying inputs into more than two categories, unlike
binary classification. For example:

• Classifying flowers as Setosa, Versicolor, or Virginica

• Digit recognition (0–9)

Use Case:

Classify iris flowers into three species using Decision Tree.

Code Implementation:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the Decision Tree classifier
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n",
      classification_report(y_test, y_pred, target_names=iris.target_names))

Output:

Fig no.1
