PARUL INSTITUTE OF ENGINEERING & TECHNOLOGY
FACULTY OF ENGINEERING & TECHNOLOGY
MACHINE LEARNING LABORATORY (303105353)
ERP: 2203031241460
PRACTICAL-4
AIM:-
Implement the naïve Bayesian classifier for a sample training data set stored as a .CSV
file. Compute the accuracy of the classifier, considering a few test datasets.
Code:-
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
msg = pd.read_csv('document.csv', names=['message', 'label'])
print('Total instance of dataset:', msg.shape[0])
msg['labelnum'] = msg['label'].map({'pos': 1, 'neg': 0})
X = msg['message']
y = msg['labelnum']
Total instance of dataset: 18
#Split the data into training and testing sets
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y)
# Vectorize the text data
vectorizer = CountVectorizer()
Xtrain_dm = vectorizer.fit_transform(Xtrain)
Xtest_dm = vectorizer.transform(Xtest)
Xtest_dm
<5x48 sparse matrix of type '<class 'numpy.int64'>'
with 12 stored elements in Compressed Sparse Row format>
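The 5x48 shape above means 5 test messages encoded over a 48-word vocabulary learned from the training messages only. The following is a minimal sketch of that behaviour, assuming three hypothetical sentences that are not taken from document.csv: words first seen at transform time get no column and are ignored.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sentences for illustration (not the contents of document.csv)
train_docs = ["good movie", "bad movie"]
test_docs = ["good unseen film"]

vectorizer = CountVectorizer()
train_dm = vectorizer.fit_transform(train_docs)  # learns the vocabulary
test_dm = vectorizer.transform(test_docs)        # reuses it, learns nothing new

# Vocabulary words ordered by their column index
vocab = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
print(vocab)
print(train_dm.toarray())
print(test_dm.toarray())  # 'unseen' and 'film' have no column
```

Because the test matrix reuses the training columns, a message made up only of unknown words is encoded as an all-zero row, and the classifier then decides from the class priors alone.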
ML/6AI8/AI/PIET/FET/PU Page No:1
# Train the Naive Bayes classifier
clf=MultinomialNB()
clf.fit(Xtrain_dm, ytrain)
# Evaluate the model's performance
pred = clf.predict(Xtest_dm)
print('Accuracy:', accuracy_score(ytest, pred))
print('Confusion Matrix', confusion_matrix(ytest, pred))
Accuracy: 0.6
Confusion Matrix [[1 0]
[2 2]]
ytest
0     1
4     1
12    1
1     1
17    0
Name: labelnum, dtype: int64
pred
array([0, 1, 0, 1, 0])
#Predict sentiment for user input
user_input = input("Enter a message: ")
user_input_dm = vectorizer.transform([user_input])
user_pred = clf.predict(user_input_dm)
sentiment = 'positive' if user_pred[0] == 1 else 'negative'
print("Sentiment:", sentiment)
Enter a message: That is a bad locality to stay
Sentiment: negative
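A predicted label alone does not show how sure the model is. The sketch below, assuming a hypothetical four-message corpus in place of document.csv and a hypothetical helper named predict_with_confidence, uses predict_proba to report the probability of the predicted class alongside the sentiment.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus (not the contents of document.csv)
messages = ["I love this place", "This is an amazing view",
            "I do not like this locality", "That is a bad place to stay"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

vectorizer = CountVectorizer()
clf = MultinomialNB()
clf.fit(vectorizer.fit_transform(messages), labels)

def predict_with_confidence(text):
    """Return (sentiment, probability of that sentiment) for one message."""
    proba = clf.predict_proba(vectorizer.transform([text]))[0]
    idx = proba.argmax()  # position of the most probable class
    sentiment = 'positive' if clf.classes_[idx] == 1 else 'negative'
    return sentiment, float(proba[idx])

sentiment, confidence = predict_with_confidence("That is a bad locality to stay")
print("Sentiment:", sentiment, "confidence:", round(confidence, 3))
```

On a corpus this small the probabilities are driven mostly by Laplace smoothing, so they are better read as a relative ranking of the two classes than as calibrated confidence.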
Conclusion:-
A Naive Bayes classifier (MultinomialNB over CountVectorizer word counts) was implemented on a
sample training dataset of 18 labeled messages stored as a .CSV file. On a random test split of
5 messages, the confusion matrix [[1 0] [2 2]] shows 3 correct predictions, an accuracy of 0.6:
the one negative message was classified correctly, while two of the four positive messages were
misclassified as negative. The precision and recall for the positive class that follow from the
same matrix are 1.0 and 0.5. The model also classified the user-entered message "That is a bad
locality to stay" as negative, which matches its apparent sentiment. With only 5 test messages,
these figures give a rough indication of performance on this dataset rather than a reliable estimate.
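As a check on the reported metrics, the sketch below hard-codes the five true labels and five predictions from the run reported in this practical (as plain lists, an assumption made so the example runs without document.csv) and computes accuracy, precision, and recall with the scikit-learn functions imported at the top of the code.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# True labels and predictions copied from the reported run (1 = pos, 0 = neg)
ytest = [1, 1, 1, 1, 0]
pred = [0, 1, 0, 1, 0]

cm = confusion_matrix(ytest, pred)
tn, fp, fn, tp = cm.ravel()  # binary case: row = true class, column = predicted

accuracy = accuracy_score(ytest, pred)    # (tp + tn) / total
precision = precision_score(ytest, pred)  # tp / (tp + fp)
recall = recall_score(ytest, pred)        # tp / (tp + fn)

print('Confusion Matrix', cm)
print('Accuracy:', accuracy)
print('Precision:', precision)
print('Recall:', recall)
```

Precision and recall here refer to the positive class (label 1), which is the default for these functions on binary labels.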