ML Lab Programs (1)

Machine Learning Lab

Course Code: 20PCC32XX
Prerequisites: Computer Programming

Course objectives:
1. Learn to install Anaconda-Python and its useful packages.
2. Study various regression methods with respect to problem solving.
3. Understand clustering algorithms and apply them to appropriate problems.
4. Understand classification algorithms and apply them to appropriate problems.
Course Outcomes:
LIST OF EXPERIMENTS
1. Install Python/Anaconda-Python and useful packages for machine learning;
load a sample dataset, understand it, and visualize the data.
2. Implement simple linear regression.
3. Implement multivariate linear regression.
4. Implement simple logistic regression and multivariate logistic regression.
5. Implement decision trees.
6. Implement any 3 classification algorithms.
7. Implement the random forest algorithm.
8. Implement K-means and KNN algorithms.
9. Implement SVM on any applicable dataset.
10. Implement neural networks.
11. Implement correlation analysis (CA).
12. Implement anomaly detection and recommendation.
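Experiment 1 has no program of its own in this manual. As a minimal sketch, assuming scikit-learn and pandas are already installed (for example through Anaconda), a bundled sample dataset can be loaded and inspected as shown here. The choice of the iris dataset and the variable names are illustrative assumptions, not part of the original lab sheet.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the bundled iris dataset as a pandas DataFrame (assumed sample dataset)
iris = load_iris(as_frame=True)
df = iris.frame  # four feature columns plus a 'target' column

# Understand the data: size, first rows, summary statistics, class balance
print("Shape:", df.shape)
print(df.head())
print(df.describe())
class_counts = df['target'].value_counts().sort_index()
print("Samples per class:")
print(class_counts)
```

A histogram or scatter plot drawn with matplotlib from the same DataFrame would cover the visualization part of the experiment.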
2. Implement Simple Linear Regression
Example: predicting a person's weight based on their height.
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Sample dataset: height (inches) vs weight (pounds)
X = np.array([58, 59, 60, 61, 62, 63, 64, 65, 66, 67]).reshape(-1, 1)  # Heights
y = np.array([115, 117, 120, 123, 126, 129, 132, 136, 140, 144])  # Weights
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the model
model = LinearRegression()
# Train the model
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Evaluate the model (Mean Squared Error)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
# Visualization of the line
plt.scatter(X, y, color='blue', label='Data Points')
plt.plot(X, model.predict(X), color='red', label='Regression Line')
plt.xlabel('Height (inches)')
plt.ylabel('Weight (pounds)')
plt.legend()
plt.show()
# Print the slope (coefficient) and intercept of the line
print(f"Slope (Coefficient): {model.coef_[0]}")
print(f"Intercept: {model.intercept_}")
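The slope and intercept of a simple linear regression come from the ordinary least squares formulas. The sketch below computes them directly with NumPy on the same height/weight data. It fits all ten points rather than the 80% training split, so the values may differ slightly from those printed by the program above; the variable names are illustrative.

```python
import numpy as np

# Same sample dataset as above: height (inches) vs weight (pounds)
x = np.array([58, 59, 60, 61, 62, 63, 64, 65, 66, 67], dtype=float)
y = np.array([115, 117, 120, 123, 126, 129, 132, 136, 140, 144], dtype=float)

# Ordinary least squares for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

print(f"Slope: {slope:.4f}")
print(f"Intercept: {intercept:.4f}")

# Cross-check against NumPy's degree-1 polynomial fit
slope_np, intercept_np = np.polyfit(x, y, 1)
print(f"np.polyfit slope: {slope_np:.4f}, intercept: {intercept_np:.4f}")
```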
3. Implement Multivariate Linear Regression
We will use a dataset with multiple features (independent variables) such as size of the house
(square footage), number of bedrooms, and age of the house to predict the house price.

# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Sample dataset: size (sq ft), bedrooms, age (years) and price (in thousands)
data = {
    'Size': [1500, 1600, 1700, 1875, 2000, 2100, 2300, 2400, 2550, 2700],
    'Bedrooms': [3, 3, 3, 4, 4, 3, 4, 4, 5, 5],
    'Age': [10, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    'Price': [300, 320, 340, 360, 400, 410, 430, 450, 480, 500]
}
# Create a DataFrame
df = pd.DataFrame(data)
# Features (Size, Bedrooms, Age)
X = df[['Size', 'Bedrooms', 'Age']]
# Target (Price)
y = df['Price']
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the model
model = LinearRegression()
# Train the model
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Evaluate the model (Mean Squared Error)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
# Print the coefficients and intercept
print(f"Coefficients (Weights): {model.coef_}")
print(f"Intercept (Bias): {model.intercept_}")
# Example prediction: Predict the price of a house with 2200 sq ft, 4 bedrooms, and 15 years old
# (a DataFrame with the training column names avoids a feature-name warning)
example = pd.DataFrame([[2200, 4, 15]], columns=['Size', 'Bedrooms', 'Age'])
predicted_price = model.predict(example)
print(f"Predicted price for a 2200 sq ft, 4 bedroom, 15 year old house: {predicted_price[0]} thousand dollars")
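With several features, the weights and the bias are the least-squares solution of a linear system. As a hedged sketch, the same house data can be solved with `np.linalg.lstsq` by adding a column of ones for the intercept. It uses all ten rows rather than a train/test split, and names such as `X_design` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same house data: size (sq ft), bedrooms, age (years) -> price (thousands)
X = np.array([
    [1500, 3, 10], [1600, 3, 12], [1700, 3, 13], [1875, 4, 14], [2000, 4, 15],
    [2100, 3, 16], [2300, 4, 17], [2400, 4, 18], [2550, 5, 19], [2700, 5, 20],
], dtype=float)
y = np.array([300, 320, 340, 360, 400, 410, 430, 450, 480, 500], dtype=float)

# Prepend a column of ones so the first weight acts as the intercept
X_design = np.column_stack([np.ones(len(X)), X])

# Solve min ||X_design @ w - y||^2 in the least-squares sense
w, _, rank, _ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, coefs = w[0], w[1:]
print("Intercept:", intercept)
print("Coefficients:", coefs)

# Cross-check with scikit-learn fitted on the same (full) data
sk_model = LinearRegression().fit(X, y)
print("scikit-learn coefficients:", sk_model.coef_)
```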
4. Implement Simple Logistic Regression and Multivariate Logistic Regression
Simple Logistic Regression (One feature):

We will classify whether a person will buy a car based on their age (a single feature).

# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Sample data: Age vs Buy (0 = No, 1 = Yes)
data = {
    'Age': [22, 25, 47, 52, 46, 56, 55, 60, 62, 61],
    'Buy': [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]
}
# Create a DataFrame
df = pd.DataFrame(data)
# Features (Age)
X = df[['Age']]
# Target (Buy)
y = df['Buy']
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the Logistic Regression model
model = LogisticRegression()
# Train the model
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
# Example prediction: Will a 50-year-old buy a car?
example = pd.DataFrame({'Age': [50]})
pred = model.predict(example)
print(f"Prediction for age 50: {'Yes' if pred[0] == 1 else 'No'}")
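Logistic regression turns a linear score into a probability with the sigmoid function. The sketch below recomputes the class-1 probability by hand from the fitted `coef_` and `intercept_` and compares it with `predict_proba`. It fits on all ten rows for brevity, and the helper `sigmoid` is defined here for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same sample data: Age vs Buy (0 = No, 1 = Yes)
ages = np.array([22, 25, 47, 52, 46, 56, 55, 60, 62, 61], dtype=float).reshape(-1, 1)
buy = np.array([0, 0, 1, 1, 1, 1, 1, 1, 1, 0])

model = LogisticRegression().fit(ages, buy)

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Linear score z = w * age + b, then squash it with the sigmoid
z = ages @ model.coef_.T + model.intercept_
manual_prob = sigmoid(z).ravel()

# scikit-learn's probability of class 1 for the same inputs
sklearn_prob = model.predict_proba(ages)[:, 1]

print("Manual probabilities: ", np.round(manual_prob, 3))
print("sklearn probabilities:", np.round(sklearn_prob, 3))
```

A probability above 0.5 corresponds to the prediction "Yes" (class 1).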
Multivariate Logistic Regression (Multiple features):

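Here, we will classify whether a student passes based on their study time and the number of
practice tests taken.
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Sample data: Study Time, Practice Tests, and Pass (0 = Fail, 1 = Pass)
data = {
    'StudyTime': [10, 20, 30, 40, 25, 35, 50, 60, 70, 80],
    'PracticeTests': [1, 3, 2, 4, 3, 5, 6, 5, 7, 8],
    'Pass': [0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
}
# Create a DataFrame
df = pd.DataFrame(data)
# Features (StudyTime, PracticeTests)
X = df[['StudyTime', 'PracticeTests']]
# Target (Pass)
y = df['Pass']
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the Logistic Regression model
model = LogisticRegression()
# Train the model
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
# Example prediction: Will a student who studies 55 hours and takes 6 practice tests pass?
example = pd.DataFrame([[55, 6]], columns=['StudyTime', 'PracticeTests'])
pred = model.predict(example)
print(f"Prediction for 55 hours study and 6 practice tests: {'Pass' if pred[0] == 1 else 'Fail'}")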
5. Implement Decision Trees

Decision Trees classify data points by splitting the dataset into branches based on feature
values. The splits are made using metrics like Gini impurity or entropy.

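# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
from sklearn.metrics import accuracy_score
# Sample data: Features (Hours Studied, Practice Tests), Target (Pass = 1, Fail = 0)
data = {
    'StudyTime': [10, 20, 30, 40, 25, 35, 50, 60, 70, 80],
    'PracticeTests': [1, 3, 2, 4, 3, 5, 6, 5, 7, 8],
    'Pass': [0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)
# Features (StudyTime, PracticeTests)
X = df[['StudyTime', 'PracticeTests']]
# Target (Pass)
y = df['Pass']
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the Decision Tree Classifier
model = DecisionTreeClassifier(random_state=42)
# Train the model
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Decision Tree Accuracy: {accuracy}")
# Visualize the decision tree
tree.plot_tree(model, feature_names=list(X.columns), class_names=['Fail', 'Pass'], filled=True)
plt.show()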
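The Gini impurity and entropy mentioned in the introduction to this experiment can be computed by hand. The sketch below evaluates both on the Pass/Fail labels of the sample data and then scores one candidate split. The threshold of 27.5 hours is chosen by hand for illustration and is not necessarily the split the fitted tree uses.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Study times and Pass/Fail labels from the sample data above
study_time = np.array([10, 20, 30, 40, 25, 35, 50, 60, 70, 80])
passed = np.array([0, 0, 1, 1, 0, 1, 1, 1, 1, 1])

# Impurity before any split (3 fails, 7 passes)
print("Gini (root):", gini(passed))
print("Entropy (root):", entropy(passed))

# Candidate split: StudyTime <= 27.5 (illustrative threshold)
left, right = passed[study_time <= 27.5], passed[study_time > 27.5]
weighted_gini = (len(left) * gini(left) + len(right) * gini(right)) / len(passed)
print("Weighted Gini after split:", weighted_gini)
```

A split is preferred when it lowers the weighted impurity of the child nodes relative to the parent.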
6. Implement Any 3 Classification Algorithms
We will implement Logistic Regression, k-Nearest Neighbors (k-NN), and Support Vector
Machine (SVM) on the same dataset.
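# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
# Same sample data as before: Study Time, Practice Tests, and Pass (0 = Fail, 1 = Pass)
data = {
    'StudyTime': [10, 20, 30, 40, 25, 35, 50, 60, 70, 80],
    'PracticeTests': [1, 3, 2, 4, 3, 5, 6, 5, 7, 8],
    'Pass': [0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)
X = df[['StudyTime', 'PracticeTests']]
y = df['Pass']
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Logistic Regression
logistic_model = LogisticRegression()
logistic_model.fit(X_train, y_train)
logistic_pred = logistic_model.predict(X_test)
logistic_acc = accuracy_score(y_test, logistic_pred)
print(f"Logistic Regression Accuracy: {logistic_acc}")
# k-Nearest Neighbors (k-NN)
knn_model = KNeighborsClassifier(n_neighbors=3)
knn_model.fit(X_train, y_train)
knn_pred = knn_model.predict(X_test)
knn_acc = accuracy_score(y_test, knn_pred)
print(f"k-NN Accuracy: {knn_acc}")
# Support Vector Machine (SVM)
svm_model = SVC()
svm_model.fit(X_train, y_train)
svm_pred = svm_model.predict(X_test)
svm_acc = accuracy_score(y_test, svm_pred)
print(f"SVM Accuracy: {svm_acc}")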
7. Implement Random Forest Algorithm
Random Forest is an ensemble learning method that combines multiple decision trees to
improve accuracy and prevent overfitting.
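from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
# Assumes X_train, X_test, y_train, y_test are already defined by a train/test split,
# for example the student pass/fail split used in the earlier experiments
# Create the Random Forest model
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
# Train the model
rf_model.fit(X_train, y_train)
# Make predictions
y_pred_rf = rf_model.predict(X_test)
# Evaluate the model
rf_accuracy = accuracy_score(y_test, y_pred_rf)
print(f"Random Forest Accuracy: {rf_accuracy}")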
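To see how a forest combines its trees, the sketch below averages the class probabilities of the individual trees stored in `estimators_` and checks that the result matches the forest's own prediction. The iris dataset and the choice of 25 trees are assumptions made only to keep the example small and self-contained.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Iris is used here only as a convenient built-in dataset (an assumption)
X, y = load_iris(return_X_y=True)

rf = RandomForestClassifier(n_estimators=25, random_state=42).fit(X, y)

# Each fitted tree is available in rf.estimators_; collect their class probabilities
tree_probs = np.stack([t.predict_proba(X) for t in rf.estimators_])

# The forest's probability is the mean over the trees; the predicted class is its argmax
mean_probs = tree_probs.mean(axis=0)
manual_pred = rf.classes_[np.argmax(mean_probs, axis=1)]

print("Number of trees:", len(rf.estimators_))
print("Matches rf.predict:", np.array_equal(manual_pred, rf.predict(X)))
```

Because each tree is trained on a different bootstrap sample and considers random feature subsets, averaging their outputs tends to reduce the variance of a single deep tree.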
8. Implement K-means and KNN Algorithms
K-means Clustering:
K-means is an unsupervised algorithm that divides data points into (k) clusters, where each
data point belongs to the cluster with the nearest mean.
# Import necessary libraries
from sklearn.cluster import KMeans
import numpy as np
import matplotlib.pyplot as plt
# Sample dataset: 2D data points
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
# Create the KMeans model (n_init is set explicitly so behaviour is the same across scikit-learn versions)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
# Train the model
kmeans.fit(X)
# Predict the clusters
clusters = kmeans.predict(X)
# Visualize the results
plt.scatter(X[:, 0], X[:, 1], c=clusters, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red',
            marker='X')
plt.title("K-means Clustering")
plt.show()
# Print cluster centers
print("Cluster centers:", kmeans.cluster_centers_)
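The "nearest mean" rule behind K-means can be written out in a few lines of NumPy. This is a minimal sketch of the assign-then-update loop on the same six points. Starting from the first and last points is an arbitrary choice for illustration (scikit-learn uses a smarter initialization), and the sketch does not handle clusters that become empty.

```python
import numpy as np

# Same six 2D points as above
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=float)

# Start from the first and last points (an arbitrary choice for this sketch)
centers = X[[0, -1]].copy()

for _ in range(10):
    # Assignment step: distance from every point to every center, pick the nearest
    distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = distances.argmin(axis=1)

    # Update step: each center becomes the mean of the points assigned to it
    new_centers = np.array([X[labels == k].mean(axis=0) for k in range(len(centers))])

    if np.allclose(new_centers, centers):
        break  # centers no longer move, so the algorithm has converged
    centers = new_centers

print("Labels:", labels)
print("Centers:", centers)
```

The result depends on the starting centers; a poor start can converge to a worse grouping, which is why library implementations try several initializations.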
K-Nearest Neighbors (KNN) Classification:

KNN is a supervised algorithm that classifies data points based on the classes of their
nearest neighbors.
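# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
# Sample dataset: Features (Hours Studied, Practice Tests), Target (Pass = 1, Fail = 0)
data = {
    'StudyTime': [10, 20, 30, 40, 25, 35, 50, 60, 70, 80],
    'PracticeTests': [1, 3, 2, 4, 3, 5, 6, 5, 7, 8],
    'Pass': [0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)
# Features (StudyTime, PracticeTests)
X = df[['StudyTime', 'PracticeTests']]
# Target (Pass)
y = df['Pass']
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the KNN model
knn = KNeighborsClassifier(n_neighbors=3)
# Train the model
knn.fit(X_train, y_train)
# Make predictions
y_pred = knn.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"KNN Accuracy: {accuracy}")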
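The neighbor vote itself is simple enough to write by hand. The sketch below classifies one query student (55 study hours, 6 practice tests) by majority vote among the 3 closest rows of the sample data. The helper `knn_predict` is a hypothetical name introduced for this illustration, and Euclidean distance is assumed.

```python
import numpy as np
from collections import Counter

# Same sample data: (StudyTime, PracticeTests) -> Pass (1) / Fail (0)
X = np.array([[10, 1], [20, 3], [30, 2], [40, 4], [25, 3],
              [35, 5], [50, 6], [60, 5], [70, 7], [80, 8]], dtype=float)
y = np.array([0, 0, 1, 1, 0, 1, 1, 1, 1, 1])

def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbors
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0], nearest

pred, nearest = knn_predict(X, y, np.array([55.0, 6.0]), k=3)
print("Nearest training points:", X[nearest].tolist())
print("Prediction:", "Pass" if pred == 1 else "Fail")
```

Because the two features are on different scales, StudyTime dominates the distance here; scaling the features first is a common refinement.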
9. Implement SVM on Any Applicable Dataset
Support Vector Machine (SVM) is a supervised learning algorithm that finds the hyperplane
that best separates data into classes.
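from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the SVM model
svm_model = SVC()
# Train the model
svm_model.fit(X_train, y_train)
# Make predictions
y_pred = svm_model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"SVM Accuracy on Iris dataset: {accuracy}")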
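The separating hyperplane can be inspected directly when a linear kernel is used. The sketch below is an illustration under two assumptions that differ from the program above: it keeps only two iris classes so that there is a single hyperplane, and it uses `SVC(kernel='linear')` instead of the default RBF kernel, because `coef_` is only available for linear kernels.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Keep only two iris classes (0 and 1) so there is a single separating hyperplane
X, y = load_iris(return_X_y=True)
mask = y < 2
X2, y2 = X[mask], y[mask]

# A linear kernel exposes the hyperplane weights through coef_ and intercept_
linear_svm = SVC(kernel='linear').fit(X2, y2)
w, b = linear_svm.coef_[0], linear_svm.intercept_[0]

# Signed score w . x + b: positive means class 1, negative means class 0
scores = X2 @ w + b
manual_pred = (scores > 0).astype(int)

print("Hyperplane weights:", w)
print("Intercept:", b)
print("Support vectors per class:", linear_svm.n_support_)
```

Only the support vectors, the training points closest to the boundary, determine where the hyperplane lies.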
10. Implement Neural Networks
Neural networks consist of layers of interconnected nodes, and they are used for various
types of classification or regression tasks.
We'll implement a basic Multi-Layer Perceptron (MLP) using the `MLPClassifier` from the
`sklearn` library.
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
# Load digits dataset
digits = load_digits()
X = digits.data
y = digits.target
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the MLP model (Neural Network)
mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=300, random_state=42)
# Train the model
mlp.fit(X_train, y_train)
# Make predictions
y_pred = mlp.predict(X_test)
# Evaluate the model
print("Neural Network Classification Report:")
print(classification_report(y_test, y_pred))
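What "layers of interconnected nodes" means in practice can be shown by repeating the network's forward pass by hand. The sketch below takes the learned weights from `coefs_` and `intercepts_`, applies ReLU in the hidden layer (the `MLPClassifier` default) and softmax at the output, and compares the result with `predict_proba`. The hidden layer size of 32 is an arbitrary choice to keep training fast.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

# A small network so the sketch trains quickly (layer size is an arbitrary choice)
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=42).fit(X, y)

# Weights and biases learned for each layer
W1, W2 = mlp.coefs_
b1, b2 = mlp.intercepts_

# Hidden layer: weighted sum followed by ReLU (MLPClassifier's default activation)
hidden = np.maximum(0, X @ W1 + b1)

# Output layer: weighted sum followed by softmax to get class probabilities
logits = hidden @ W2 + b2
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
manual_probs = exp / exp.sum(axis=1, keepdims=True)

print("Layer shapes:", W1.shape, W2.shape)
print("Max difference from predict_proba:", np.abs(manual_probs - mlp.predict_proba(X)).max())
```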
11. Implement Correlation Analysis (CA)
Correlation analysis is used to study the strength and direction of the linear relationship
between two continuous variables.
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
# Sample dataset: Hours studied, Grades, Hours of sleep
data = {
    'Hours_Studied': [5, 10, 15, 20, 25],
    'Grades': [50, 60, 70, 80, 90],
    'Hours_Sleep': [8, 7, 6, 5, 4]
}
# Create a DataFrame
df = pd.DataFrame(data)
# Calculate the correlation matrix
corr_matrix = df.corr()
# Visualize the correlation matrix
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Analysis')
plt.show()
# Print the correlation values
print("Correlation matrix:")
print(corr_matrix)
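Each entry of the correlation matrix is a Pearson correlation coefficient. As a minimal sketch, the coefficient can be computed from its definition with NumPy on the same sample columns; the helper `pearson_r` is a name introduced here for illustration.

```python
import numpy as np

hours_studied = np.array([5, 10, 15, 20, 25], dtype=float)
grades = np.array([50, 60, 70, 80, 90], dtype=float)
hours_sleep = np.array([8, 7, 6, 5, 4], dtype=float)

def pearson_r(a, b):
    """Pearson correlation: co-variation of a and b divided by the product of their spreads."""
    a_dev, b_dev = a - a.mean(), b - b.mean()
    return np.sum(a_dev * b_dev) / np.sqrt(np.sum(a_dev ** 2) * np.sum(b_dev ** 2))

r_grades = pearson_r(hours_studied, grades)
r_sleep = pearson_r(hours_studied, hours_sleep)

print(f"Hours_Studied vs Grades:      {r_grades:.2f}")
print(f"Hours_Studied vs Hours_Sleep: {r_sleep:.2f}")
```

In this sample data the columns are exact linear functions of each other, so the coefficients sit at the extremes of the range from -1 to +1; real data would normally fall in between.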
12. Implement Anomaly Detection and Recommendation
Anomaly Detection:
Anomaly detection identifies outliers or rare items that differ significantly from the majority
of the data.
import numpy as np
from sklearn.ensemble import IsolationForest
# Sample data: 2D points
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0], [50, 50], [100, 100]])
# Create the IsolationForest model (random_state fixed so the result is reproducible)
iso_forest = IsolationForest(contamination=0.1, random_state=42)
# Fit the model
iso_forest.fit(X)
# Predict anomalies (-1: anomaly, 1: normal)
predictions = iso_forest.predict(X)
# Output the anomaly predictions
print("Anomaly detection predictions:", predictions)
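Besides the -1/1 labels, `IsolationForest` also exposes a continuous score per point through `score_samples`, where lower values mean the point is easier to isolate and therefore more anomalous. The sketch below ranks the same eight points by that score; the fixed `random_state` is an assumption added for reproducibility.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Same 2D sample points as above
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0], [50, 50], [100, 100]])

iso_forest = IsolationForest(contamination=0.1, random_state=42).fit(X)

# One score per point; lower means easier to isolate, i.e. more anomalous
scores = iso_forest.score_samples(X)

# Rank the points from most to least anomalous
order = np.argsort(scores)
for idx in order:
    print(X[idx].tolist(), round(float(scores[idx]), 3))

most_anomalous = X[order[0]]
```

The `contamination` parameter only sets the cut-off on these scores that separates the label -1 from the label 1.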
Recommendation Systems (Using Nearest Neighbors):

Recommendation systems predict user preferences and suggest relevant items based on
past data.
import numpy as np
from sklearn.neighbors import NearestNeighbors
# Sample user-item data (rows: users, columns: items)
data = np.array([[5, 3, 0, 1],
                 [4, 0, 0, 1],
                 [1, 1, 0, 5],
                 [0, 0, 5, 4],
                 [0, 3, 4, 5]])
# Create the model
model = NearestNeighbors(metric='cosine', algorithm='brute')
# Fit the model
model.fit(data)
# Find the nearest neighbors for the first user
# (the user itself is returned first, at distance 0)
distances, indices = model.kneighbors(data[0].reshape(1, -1), n_neighbors=3)
# Output the nearest neighbors for user 1 (row indices are 0-based)
print("Nearest neighbors for user 1:")
print(indices)
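The cosine metric used by the neighbor search can be computed by hand to see why certain users count as "near". The sketch below ranks the other users by cosine similarity to the first user on the same rating matrix. The helper `cosine_similarity` is defined here for illustration and is not the scikit-learn function of the same name.

```python
import numpy as np

# Same user-item ratings as above (rows: users, columns: items)
data = np.array([[5, 3, 0, 1],
                 [4, 0, 0, 1],
                 [1, 1, 0, 5],
                 [0, 0, 5, 4],
                 [0, 3, 4, 5]], dtype=float)

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (1 = same direction)."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

target = 0  # index of the first user
similarities = np.array([cosine_similarity(data[target], row) for row in data])

# Rank the other users by similarity, most similar first (the target itself is skipped)
ranked = [int(i) for i in np.argsort(-similarities) if i != target]
for i in ranked:
    print(f"User {i + 1}: similarity {similarities[i]:.3f}, "
          f"cosine distance {1 - similarities[i]:.3f}")
```

Items that the most similar users rated highly, and that the target user has not rated yet, are the natural candidates to recommend.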
