Machine Learning Lab Dlihebca6sem

Uploaded by

morrigyroblo86

The Dalai Lama Institute for Higher Education

Bangalore University

Machine Learning Lab Manual

BCA 6th Sem

LIST OF PROGRAMS

1. Install and set up Python and essential libraries like NumPy and pandas.

2. Introduce scikit-learn as a machine learning library.

3. Install and set up scikit-learn and other necessary tools.

4. Write a program to load and explore a dataset from CSV and Excel files using pandas.

5. Write a program to visualize the dataset to gain insights using Matplotlib or Seaborn by plotting scatter plots and bar charts.

6. Write a program to handle missing data, encode categorical variables, and perform feature scaling.

7. Write a program to implement a k-Nearest Neighbours (k-NN) classifier using scikit-learn, train the classifier on the dataset, and evaluate its performance.

8. Write a program to implement a linear regression model for regression tasks and train the model on a dataset with a continuous target variable.

9. Write a program to implement a decision tree classifier using scikit-learn, visualize the decision tree, and understand its splits.

10. Write a program to implement K-Means clustering and visualize the clusters.

1. Install and set up Python and essential libraries like NumPy and pandas.

Install Python: If you haven't already installed Python, download it from the official Python
website. To verify the installation, run the following command in your terminal or command
prompt (on some systems the command is python3):
python --version
Install pip: pip is a package manager for Python that lets you install and manage libraries.
Recent versions of Python come with pip pre-installed. You can verify that pip is installed by
running the following command in your terminal or command prompt:
pip --version
Install NumPy and pandas: Once Python and pip are installed, use pip to install NumPy and
pandas by running the following commands in your terminal or command prompt:
#In terminal
pip install numpy
pip install pandas
This downloads and installs NumPy and pandas along with any dependencies they require.
Verify installation: After installing NumPy and pandas, verify that they were installed
correctly by running the following statements in Python's interactive mode or a Python script:

import numpy
import pandas
print(numpy.__version__)
print(pandas.__version__)

These statements should print the versions of NumPy and pandas that were installed.
Output:
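
Printing version numbers only shows that the packages can be imported. A slightly stronger check is to run a small calculation with each library. The following is a minimal sketch with made-up values; the names values, frame, column_means and total are illustrative and not part of the original manual.

```python
import numpy as np
import pandas as pd

# A small array and a DataFrame built from it (values are illustrative only)
values = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
frame = pd.DataFrame(values, columns=["a", "b"])

column_means = frame.mean()   # pandas computes one mean per column
total = float(values.sum())   # NumPy sums every element of the array

print(column_means)
print("Sum of all values:", total)
```

If both libraries are installed correctly, the script runs without an ImportError and prints the column means followed by the sum.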
2. Introduce scikit-learn as a machine learning library.

Scikit-learn (sklearn) is a widely used open-source library for machine learning in Python. It
provides efficient tools for machine learning and statistical modeling, including
classification, regression, clustering and dimensionality reduction, through a consistent
interface. The library is largely written in Python and is built upon NumPy,
SciPy and Matplotlib.

Installation

If you have already installed NumPy and SciPy, the easiest way to install scikit-learn is with
pip.

Using pip

The following command installs scikit-learn, or upgrades it to the latest version, via pip:

pip install -U scikit-learn

Features

Rather than focusing on loading, manipulating and summarising data, the scikit-learn library
focuses on modeling the data. Some of the most popular groups of models provided by
scikit-learn are as follows −

Supervised Learning algorithms − Most of the popular supervised learning algorithms,
such as Linear Regression, Support Vector Machine (SVM) and Decision Tree, are part of
scikit-learn.

Unsupervised Learning algorithms − It also includes many popular unsupervised learning
algorithms, from clustering, factor analysis and PCA (Principal Component Analysis) to
unsupervised neural networks.

Clustering − These models are used for grouping unlabeled data.

Cross Validation − It is used to estimate the accuracy of supervised models on unseen data.
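
The consistent interface and cross validation described above can be illustrated together. The sketch below is only an example under stated assumptions: it uses the bundled Iris dataset, picks an SVM and a decision tree as the two models, and uses 5 folds; none of these choices come from the manual itself.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Two different algorithms, used through the same estimator interface
models = {
    "support vector machine": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    # 5-fold cross validation: each fold is held out once for scoring
    fold_scores = cross_val_score(model, X, y, cv=5)
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy {scores[name]:.3f}")
```

Because every estimator exposes the same fit and predict methods, swapping one algorithm for another only changes the line that constructs the model.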
3. Install and set up scikit-learn and other necessary tools.

scikit-learn is a Python library for machine learning. Here are the steps to set it up:
Install Python: If you haven't already installed Python, download and install the latest version
of Python 3 from the official Python website.
Install scikit-learn using pip: Open your terminal or command prompt and run the following
command:

pip install -U scikit-learn

The later programs in this manual also use Matplotlib and Seaborn, which can be installed in
the same way:

pip install matplotlib seaborn

To verify your installation, you can use the following commands in the terminal:

# To see which version of scikit-learn is installed and where
python -m pip show scikit-learn

# To see all packages installed in the active virtual environment
python -m pip freeze

Then run the following statements in Python's interactive mode or a Python script:

import sklearn
import numpy
import pandas
import matplotlib

print(sklearn.__version__)
print(numpy.__version__)
print(pandas.__version__)
print(matplotlib.__version__)

These statements should print the versions of scikit-learn and the other libraries that were
installed.
Output:
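
The same check can be done without importing each library, by querying the installed package metadata. This is a minimal sketch using the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_versions and the package list are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as used by pip (note: "scikit-learn", not "sklearn")
required = ["numpy", "pandas", "scikit-learn", "matplotlib"]

def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

report = installed_versions(required)
for name, ver in report.items():
    print(f"{name}: {ver if ver else 'NOT INSTALLED'}")
```

A package reported as NOT INSTALLED can then be installed with pip as described above.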
4. Write a program to load and explore a dataset from CSV and Excel files using pandas.

import pandas as pd

def explore_dataset(file_path):
    # Check if the file is a CSV or Excel file
    if file_path.endswith('.csv'):
        # Load CSV file into a pandas DataFrame
        df = pd.read_csv(file_path)
    elif file_path.endswith('.xlsx'):
        # Load Excel file into a pandas DataFrame (requires the openpyxl package)
        df = pd.read_excel(file_path)
    else:
        print("Unsupported file format. Please provide a CSV or Excel file.")
        return

    # Display basic information about the DataFrame
    print("Dataset information:")
    df.info()  # info() prints its report directly and returns None

    # Display the first few rows of the DataFrame
    print("\nFirst few rows of the dataset:")
    print(df.head())

    # Display summary statistics for numerical columns
    print("\nSummary statistics:")
    print(df.describe())

    # Display unique values for categorical columns
    print("\nUnique values for categorical columns:")
    for column in df.select_dtypes(include='object').columns:
        print(f"{column}: {df[column].unique()}")

# Example usage
# Placeholder file name: change this to the path of your CSV or Excel file
file_path = 'your_dataset.csv'
explore_dataset(file_path)
Output:
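
If no dataset file is at hand, the exploration steps can still be tried on a small table written to a temporary CSV file. The sketch below is self-contained; the column names and values are hypothetical sample data, not taken from the manual.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, written to a temporary CSV file
sample = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dina"],
    "department": ["Sales", "IT", "IT", "Sales"],
    "salary": [30000, 45000, 50000, 32000],
})

with tempfile.TemporaryDirectory() as folder:
    csv_path = os.path.join(folder, "sample.csv")
    sample.to_csv(csv_path, index=False)   # index=False avoids an extra column
    df = pd.read_csv(csv_path)

print(df.shape)                            # (rows, columns)
print(df.dtypes)                           # data type of each column
print(df.isnull().sum())                   # missing values per column
print(df["department"].value_counts())     # frequency of each category
```

Checking shape, dtypes and missing-value counts first usually shows whether the file was parsed as intended before any further analysis.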
5. Write a program to visualize the dataset to gain insights using Matplotlib or
Seaborn by plotting scatter plots and bar charts.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def visualize_dataset(file_path):
    # Load the dataset into a pandas DataFrame
    # (assumes a CSV file; use pd.read_excel for an Excel file)
    df = pd.read_csv(file_path)

    # Plot pairwise scatter plots of the numerical columns
    sns.pairplot(df)
    plt.suptitle("Pairplot of the Dataset", y=1.02)  # title for the whole grid
    plt.show()

    # Plot bar chart for categorical column (assuming the first column is categorical)
    if df.iloc[:, 0].dtype == 'object':
        sns.countplot(x=df.columns[0], data=df)
        plt.title("Bar Chart of Categorical Column")
        plt.xlabel(df.columns[0])
        plt.ylabel("Count")
        plt.show()
    else:
        print("No categorical column found to plot bar chart.")

# Example usage
# Placeholder file name: change this to the path of your CSV file
file_path = 'your_dataset.csv'
visualize_dataset(file_path)
Output:
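
The same two chart types can also be drawn with Matplotlib alone. The following is a minimal sketch on hypothetical data (the hours, score and grade columns are invented for illustration). It saves the figure to a temporary file with the non-interactive Agg backend so that it runs without a display; in an interactive session, plt.show() could be used instead.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file, not a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: two numeric columns and one categorical column
data = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [35, 45, 50, 62, 70, 85],
    "grade": ["C", "C", "B", "B", "A", "A"],
})

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: relationship between two numeric columns
ax_scatter.scatter(data["hours"], data["score"])
ax_scatter.set_xlabel("hours")
ax_scatter.set_ylabel("score")
ax_scatter.set_title("Scatter plot")

# Bar chart: number of rows in each category
grade_counts = data["grade"].value_counts().sort_index()
ax_bar.bar(list(grade_counts.index), list(grade_counts.values))
ax_bar.set_xlabel("grade")
ax_bar.set_ylabel("count")
ax_bar.set_title("Bar chart")

fig.tight_layout()
output_path = os.path.join(tempfile.gettempdir(), "ml_lab_plots.png")
fig.savefig(output_path)
print("Figure saved to", output_path)
```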
6. Write a program to handle missing data, encode categorical variables, and
perform feature scaling.

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Load Iris dataset
iris = load_iris()
iris_df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
iris_df['target'] = iris.target

def preprocess_dataset(df):
    feature_cols = df.columns[:-1]  # every column except 'target'

    # Handle missing data (the Iris dataset has no missing values, so we
    # simulate some in the first column)
    df.iloc[::10, 0] = float('NaN')
    imputer = SimpleImputer(strategy='mean')
    df[feature_cols] = imputer.fit_transform(df[feature_cols])

    # Encode categorical variables (if applicable)
    # The Iris features are all numeric, so this step is skipped here;
    # OneHotEncoder would be used for a categorical column

    # Perform feature scaling (zero mean, unit variance per feature)
    scaler = StandardScaler()
    df[feature_cols] = scaler.fit_transform(df[feature_cols])
    return df

# Preprocess Iris dataset
preprocessed_df = preprocess_dataset(iris_df)

# Display preprocessed dataset
print("Preprocessed dataset:")
print(preprocessed_df.head())
Output:
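
The program above skips the encoding step because Iris has no categorical columns. The sketch below shows one way all three steps could be combined on data that does have one. The age and city columns are hypothetical, and the use of ColumnTransformer with Pipeline is an illustrative choice, not the manual's method.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical data with one missing value in each column
raw = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 45.0],
    "city": ["Delhi", "Mysuru", np.nan, "Delhi"],
})

# Numeric column: fill gaps with the mean, then standardize
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Categorical column: fill gaps with the most frequent value, then one-hot encode
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [("num", numeric_steps, ["age"]), ("cat", categorical_steps, ["city"])],
    sparse_threshold=0,  # always return a dense NumPy array
)

features = preprocess.fit_transform(raw)
print(features)  # columns: scaled age, city=Delhi, city=Mysuru
```

Each category becomes its own 0/1 column, so the model does not read an artificial order into the city names.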
7. Write a program to implement a k-Nearest Neighbours (k-NN) classifier using
scikit-learn, train the classifier on the dataset, and evaluate its performance.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Initialize the k-NN classifier
k = 3  # Number of neighbors
knn_classifier = KNeighborsClassifier(n_neighbors=k)

# Train the classifier
knn_classifier.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = knn_classifier.predict(X_test)

# Evaluate the classifier's performance
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Display classification report
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))

Output:
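
The program above fixes k = 3. One common way to choose k is to compare several values with cross validation. The following is a minimal sketch under stated assumptions: the candidate values, the 5 folds and the added StandardScaler step are illustrative choices, not part of the original program.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

candidate_ks = [1, 3, 5, 7, 9, 11]  # odd values reduce ties in the vote
mean_scores = {}
for k in candidate_ks:
    # Scaling matters for k-NN because it relies on distances between points
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    mean_scores[k] = cross_val_score(model, X, y, cv=5).mean()
    print(f"k={k}: mean accuracy {mean_scores[k]:.3f}")

best_k = max(mean_scores, key=mean_scores.get)
print("Best k:", best_k)
```

Small values of k follow the training data closely, while large values smooth the decision boundary, so the best value depends on the dataset.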
8. Write a program to implement a linear regression model for regression tasks
and train the model on a dataset with a continuous target variable.

import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the diabetes dataset
# (load_boston was removed in scikit-learn 1.2, so the bundled diabetes
# dataset, which also has a continuous target, is used instead)
diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target

# Convert the data to a pandas DataFrame for easier manipulation
diabetes_df = pd.DataFrame(data=X, columns=diabetes.feature_names)
diabetes_df['target'] = y

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Initialize Linear Regression model
linear_regression = LinearRegression()

# Train the model
linear_regression.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = linear_regression.predict(X_test)

# Evaluate the model's performance
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared Score:", r2)

Output:
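
To see what the two reported metrics measure, they can be recomputed by hand with NumPy. The sketch below uses synthetic data generated from y = 3x + 2 plus noise; this data and the variable names are assumptions made for illustration. It compares the manual formulas with the scikit-learn functions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic data: y = 3x + 2 plus Gaussian noise (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=100)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

# Mean squared error: average of the squared residuals
mse_manual = np.mean((y - y_pred) ** 2)

# R-squared: 1 - (residual sum of squares / total sum of squares)
r2_manual = 1 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)

print("Slope:", model.coef_[0], "Intercept:", model.intercept_)
print("MSE (manual vs sklearn):", mse_manual, mean_squared_error(y, y_pred))
print("R2  (manual vs sklearn):", r2_manual, r2_score(y, y_pred))
```

The fitted slope and intercept should come out close to 3 and 2, and the manual values should match those returned by scikit-learn.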
9. Write a program to implement a decision tree classifier using scikit-learn,
visualize the decision tree, and understand its splits.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt

# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize Decision Tree classifier (random_state makes the tree reproducible)
decision_tree = DecisionTreeClassifier(random_state=42)

# Train the classifier
decision_tree.fit(X_train, y_train)

# Visualize the decision tree
plt.figure(figsize=(12, 8))
# class_names is passed as a list because some scikit-learn versions reject arrays
plot_tree(decision_tree, feature_names=iris.feature_names,
          class_names=list(iris.target_names), filled=True)
plt.show()
Output:
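
The splits can also be read as text, which is often easier than reading the plot. The following is a minimal sketch using export_text from sklearn.tree. Limiting the tree to max_depth=2 is an assumption made here to keep the printed rules short; it is not part of the original program.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rules short (max_depth=2 is illustrative)
tree = DecisionTreeClassifier(max_depth=2, random_state=42)
tree.fit(iris.data, iris.target)

# Each line is a threshold test on one feature; leaves show the predicted class
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
print("Depth:", tree.get_depth(), "Leaves:", tree.get_n_leaves())
```

Each indented level in the output corresponds to one level of the tree, so a sample's prediction can be traced by following the conditions it satisfies from top to bottom.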
10. Write a program to implement K-Means clustering and visualize the clusters.

import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate sample data
X, y = make_blobs(n_samples=500, centers=4, cluster_std=0.8, random_state=42)

# Create a K-Means clusterer with 4 clusters
# (n_init is set explicitly because its default differs between scikit-learn versions)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)

# Fit the data
kmeans.fit(X)

# Get cluster labels
labels = kmeans.labels_

# Plot the data with cluster labels
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=100,
            c='red', label='Centroids')
plt.title('K-Means Clustering')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()

Output:
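
The program above sets n_clusters=4 because the sample data was generated with four centers. When the number of clusters is not known in advance, one common heuristic is the elbow method: fit K-Means for several values of k and look for the point where the inertia (within-cluster sum of squared distances) stops dropping sharply. The sketch below is an illustration under that assumption, and the range of k values is arbitrary.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as in the program above
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=0.8, random_state=42)

inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    # inertia_: sum of squared distances from each point to its nearest centroid
    inertias[k] = model.inertia_
    print(f"k={k}: inertia {model.inertia_:.1f}")
```

Plotting these values against k (for example with plt.plot) typically shows a bend near the number of clusters that fits the data well.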
