COMSATS University Islamabad, Wah Campus
Electrical & Computer Engineering Department
Program: BCS
Semester/Section: C
Subject: Artificial Intelligence
Instructor: Engr. Adnan Saleem Mughal
Group:
Sp20-bcs-012
Sp20-bcs-016
Lab Assignment / Report No: 3 CLO [4]
Linear Regression.
Project Progress.
# Importing Necessary Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0, 10.0)
# Reading Data
data = pd.read_csv('headbrain.csv')
print(data.shape)
data.head()
# Collecting X and Y
X = data['Head Size(cm^3)'].values
Y = data['Brain Weight(grams)'].values
# Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)
# Total number of values
m = len(X)
# Using the formula to calculate b1 and b0
numer = 0
denom = 0
for i in range(m):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
b1 = numer / denom
b0 = mean_y - (b1 * mean_x)
# Print coefficients
print(b1, b0)
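As an optional cross-check (not part of the required lab steps), the same coefficients can be computed without the explicit loop using NumPy's vectorised operations; this short sketch assumes X, Y, mean_x, and mean_y are exactly as defined above.

# Optional cross-check: vectorised least-squares coefficients
# b1 is the covariance of X and Y divided by the variance of X
b1_check = np.sum((X - mean_x) * (Y - mean_y)) / np.sum((X - mean_x) ** 2)
b0_check = mean_y - b1_check * mean_x
print(b1_check, b0_check)  # should match b1 and b0 printed above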
# Plotting Values and Regression Line
max_x = np.max(X) + 100
min_x = np.min(X) - 100
# Calculating line values x and y
x = np.linspace(min_x, max_x, 1000)
y = b0 + b1 * x
# Plotting Line
plt.plot(x, y, color='#58b970', label='Regression Line')
# Plotting Scatter Points
plt.scatter(X, Y, c='#ef5423', label='Scatter Plot')
plt.xlabel('Head Size in cm^3')
plt.ylabel('Brain Weight in grams')
plt.legend()
plt.show()
# Calculating Root Mean Squared Error
rmse = 0
for i in range(m):
    y_pred = b0 + b1 * X[i]
    rmse += (Y[i] - y_pred) ** 2
rmse = np.sqrt(rmse / m)
print(rmse)
# Calculating R^2 Score
ss_t = 0
ss_r = 0
for i in range(m):
    y_pred = b0 + b1 * X[i]
    ss_t += (Y[i] - mean_y) ** 2
    ss_r += (Y[i] - y_pred) ** 2
r2 = 1 - (ss_r / ss_t)
print(r2)
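If scikit-learn is available, the slope, intercept, RMSE, and R^2 score can be verified with its LinearRegression class; the reshape is needed because scikit-learn expects a 2-D feature matrix. This is only a sanity check under the assumption that X and Y are the arrays loaded from headbrain.csv above.

# Optional cross-check with scikit-learn (assumes scikit-learn is installed)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

X_2d = X.reshape(-1, 1)                        # 2-D feature matrix for scikit-learn
reg = LinearRegression().fit(X_2d, Y)
Y_pred = reg.predict(X_2d)
print(reg.coef_[0], reg.intercept_)            # should match b1 and b0
print(np.sqrt(mean_squared_error(Y, Y_pred)))  # should match rmse
print(r2_score(Y, Y_pred))                     # should match r2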
#--------------------------------------------------------------------------------
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0, 10.0)
from mpl_toolkits.mplot3d import Axes3D
data = pd.read_csv('student.csv')
print(data.shape)
data.head()
math = data['Math'].values
read = data['Reading'].values
write = data['Writing'].values
# Plotting the scores as a scatter plot
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(math, read, write, color='#ef1234')
plt.show()
m = len(math)
x0 = np.ones(m)
X = np.array([x0, math, read]).T
# Initial Coefficients
B = np.array([0, 0, 0])
Y = np.array(write)
alpha = 0.0001
def cost_function(X, Y, B):
    m = len(Y)
    J = np.sum((X.dot(B) - Y) ** 2) / (2 * m)
    return J
initial_cost = cost_function(X, Y, B)
print(initial_cost)
2470.11
def gradient_descent(X, Y, B, alpha, iterations):
    cost_history = [0] * iterations
    m = len(Y)
    for iteration in range(iterations):
        # Hypothesis Values
        h = X.dot(B)
        # Difference b/w Hypothesis and Actual Y
        loss = h - Y
        # Gradient Calculation
        gradient = X.T.dot(loss) / m
        # Changing Values of B using Gradient
        B = B - alpha * gradient
        # New Cost Value
        cost = cost_function(X, Y, B)
        cost_history[iteration] = cost
    return B, cost_history
# 100000 Iterations
newB, cost_history = gradient_descent(X, Y, B, alpha, 100000)
# New Values of B
print(newB)
# Final Cost of new B
print(cost_history[-1])
[-0.47889172 0.09137252 0.90144884]
# Model Evaluation - RMSE
def rmse(Y, Y_pred):
    rmse = np.sqrt(np.sum((Y - Y_pred) ** 2) / len(Y))
    return rmse
# Model Evaluation - R2 Score
def r2_score(Y, Y_pred):
    mean_y = np.mean(Y)
    ss_tot = np.sum((Y - mean_y) ** 2)
    ss_res = np.sum((Y - Y_pred) ** 2)
    r2 = 1 - (ss_res / ss_tot)
    return r2
Y_pred = X.dot(newB)
print(rmse(Y, Y_pred))
print(r2_score(Y, Y_pred))
4.57714397273
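Because the cost function is convex, the coefficients found by gradient descent can also be compared against the closed-form least-squares solution, which np.linalg.lstsq computes directly. This is only a sanity check and assumes X, Y, and the rmse and r2_score functions above are available.

# Optional cross-check: closed-form least-squares solution of the same problem
B_direct, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
print(B_direct)                      # should be close to newB after enough iterations
print(rmse(Y, X.dot(B_direct)))      # RMSE of the closed-form fit
print(r2_score(Y, X.dot(B_direct)))  # R2 score of the closed-form fit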
Project Title:
Sentiment Analysis of Document Verification (SADV)
Submitted by:
Ayesha Asghar SP20-BCS-012
Yasaal Maryum SP20-BCS-016
Semester Project Specifications
• Objective
Sentiment analysis identifies and extracts subjective information from text using natural
language processing and text mining. The goal of the project is to improve the efficiency
and effectiveness of document verification processes by building a machine learning model
that can accurately classify documents using sentiment analysis.
• Summary
The semester project on sentiment analysis of document verification aims to develop a
machine learning model that can accurately classify documents as authentic or fraudulent
based on the sentiment expressed in the text. The project team will focus on using
sentiment analysis as a key feature to improve the accuracy of the model. The goal of the
project is to improve the efficiency and effectiveness of document verification processes.
At the end of the semester, the success of the project will be evaluated based on the
accuracy of the model and its potential impact on the document verification process.
• Block Diagram
[Block diagram: Data Collection (raw data) → Data Processing (NLTK) → Feature Extraction (scikit-learn) → Sentiment Analysis (create & train model) → Document Verification (use trained model) → Result Visualization (Matplotlib) → ML model tuning (user feedback)]
• Flow Chart
[Flow chart: Input data → Process data → Extract features → Train sentiment analysis model → Use trained model on documents; if the sentiment is positive, continue verification, if negative, notify the authority; visualize the sentiment analysis results to verify document authenticity; collect user feedback and fine-tune the model.]
• Working
Sentiment Analysis of Document Verification involves a combination of machine
learning techniques to analyze the sentiment of a document and verify its authenticity.
The performance of the model can be improved by using more data, better preprocessing
techniques, advanced machine learning algorithms, and fine-tuning the model
parameters. Further details will be added along with project progress.
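The final model has not been fixed yet; purely as an illustration of the intended working, the sketch below shows one possible pipeline using scikit-learn for feature extraction and classification. The file name labelled_documents.csv, its column names, and the choice of TF-IDF features with logistic regression are assumptions for illustration only, not project decisions.

# Illustrative sketch only - dataset name, columns, and model choice are assumptions
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline

# Hypothetical labelled data: one column of document text, one sentiment label
data = pd.read_csv('labelled_documents.csv')       # assumed file
texts, labels = data['text'], data['sentiment']    # assumed columns

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

# TF-IDF features followed by a logistic regression classifier
model = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words='english')),
    ('clf', LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))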
• Simulation
During the implementation of the sentiment analysis project using the tkinter framework, several
aspects were considered. Here are the results and observations from the simulation:
1. Dataset Analysis:
The entire dataset was analyzed to understand the sentiment distribution and the characteristics
of the data. This analysis provided insights into the distribution of positive, negative, and neutral
sentiments in the dataset, allowing for a better understanding of the sentiment landscape.
2. User Sentence Instant Analysis:
The application allowed users to input a sentence for instant sentiment analysis. The user's input
sentence was processed using the trained sentiment analysis model to predict the sentiment associated
with the input. The sentiment analysis algorithm classified the user's sentence into one of the predefined
sentiment categories, such as positive, negative, or neutral.
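The report does not include the GUI code itself; the following minimal tkinter sketch shows how a window could pass a user's sentence to a trained classifier for instant analysis. The variable model stands in for the trained sentiment model described above and is an assumption, not final project code.

# Minimal tkinter sketch - 'model' is assumed to be the trained classifier
import tkinter as tk

def on_analyse():
    sentence = entry.get()
    # Assumed call: the trained pipeline predicts a label for the raw sentence
    sentiment = model.predict([sentence])[0]
    result_label.config(text='Sentiment: ' + str(sentiment))

root = tk.Tk()
root.title('SADV - Instant Sentiment Analysis')
entry = tk.Entry(root, width=60)
entry.pack(padx=10, pady=5)
tk.Button(root, text='Analyse', command=on_analyse).pack(pady=5)
result_label = tk.Label(root, text='Sentiment: -')
result_label.pack(pady=5)
root.mainloop()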
• Problems Encountered:
While developing the sentiment analysis project, the following issues were encountered:
1. Dataset Limitations:
The accuracy and effectiveness of the sentiment analysis model heavily rely on the
quality and diversity of the dataset used for training. Insufficient or biased training data may
lead to suboptimal performance and biased sentiment predictions.
• Troubleshooting:
To address the dataset limitations, efforts were made to collect a diverse and representative
dataset. Data preprocessing techniques, such as cleaning and normalization, were applied to improve the
quality of the training data. Additionally, techniques like data augmentation and cross-validation were
utilized to enhance the model's performance and mitigate biases.
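As one example of the cross-validation mentioned above, scikit-learn's cross_val_score can estimate how well a model generalises across folds of the training data. The names model, texts, and labels refer to the illustrative objects sketched in the Working section and are assumptions rather than final project code.

# Illustrative 5-fold cross-validation of the sketched pipeline (assumption)
from sklearn.model_selection import cross_val_score

scores = cross_val_score(model, texts, labels, cv=5, scoring='accuracy')
print(scores.mean(), scores.std())  # average accuracy and its spread across folds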