S J C INSTITUTE OF TECHNOLOGY
DEPARTMENT OF INFORMATION SCIENCE AND ENGINEERING
An Internship Presentation
On
“Sentiment Analysis Using
Machine Learning Methods”
PRESENTED BY
RAM SREEKAR B 1SJ21IS086
RAVI TEJA SKANDA K 1SJ21IS087
SHASHANK M 1SJ21IS94
VINAY K 1SJ21IS117
UNDER THE GUIDANCE OF
Prof. Naveen Kumar VG, Assistant Professor, Dept. of ISE, SJCIT
Prof. Satheesh Chandra Reddy S, Professor & HOD, Dept. of ISE, SJCIT
CONTENTS
I. ABSTRACT
II. INTRODUCTION
III. LITERATURE SURVEY
IV. PROBLEM IDENTIFICATION & DEFINITION
V. SIGNIFICANCE & RELEVANCE OF WORK
VI. OBJECTIVES AND METHODOLOGY
VII. PLAN OF EXECUTION
VIII. SYSTEM REQUIREMENT SPECIFICATION
IX. ANALYSIS
X. DESIGN
XI. IMPLEMENTATION
XII. TESTING
XIII. RESULTS AND DISCUSSION
XIV. CONCLUSION AND FUTURE ENHANCEMENTS
XV. BIBLIOGRAPHY
Dept. of ISE, SJCIT 2
ABSTRACT
• In recent years, Twitter has emerged as a crucial platform for expressing public opinion and sentiment
in real time. Understanding the sentiments expressed in tweets has become increasingly important for
applications such as brand monitoring, political analysis, and customer feedback assessment. This
project applies sentiment analysis techniques to Twitter data to uncover insights and patterns in
public sentiment. We propose to use machine learning algorithms and natural language processing
techniques to classify tweets as expressing positive or negative sentiment.
INTRODUCTION
Sentiment analysis, a subfield of natural language processing, focuses on extracting subjective
information from text to determine attitudes, emotions, and opinions. Machine learning techniques for
sentiment analysis aim to analyze and classify text data into positive, negative, or neutral
sentiments. The widespread availability of social media platforms, customer reviews, and online
forums has made sentiment analysis indispensable across various domains. From collecting public
opinion on products and services to tracking social trends and political sentiments, the ability to
automatically analyze text sentiment offers valuable insights for decision-making. The methods used
range from algorithms like Naive Bayes and Support Vector Machines to advanced deep learning
architectures like Recurrent Neural Networks (RNNs) and Transformer models. Sentiment analysis also
relies on preprocessing techniques such as tokenization, stemming, and feature extraction, which are
essential for preparing text data for machine learning algorithms.
LITERATURE SURVEY
• Study of Twitter Sentiment Analysis using Machine Learning Algorithms on Python, by Bhumika Gupta (PhD), Monika Negi, Kanika Vishwakarma, Goldi, in
International Journal of Computer Applications: Twitter is a platform widely used by people to express their opinions and display sentiments on different occasions.
Sentiment analysis is an approach to analyze data and retrieve the sentiment that it embodies.
• Twitter Sentiment Analysis Using Supervised Machine Learning, by Nikhil Yadav, Omkar Kudale, Srishti Gupta, Aditi Rao, Ajitkumar Shitole, in [Link] |
Springer: Sentiment analysis aims to extract opinions, attitudes, and emotions from social media sites such as Twitter. It has become a popular research area.
• Twitter Sentiment Analysis Using Machine Learning Techniques, by Bac Le and Huy Nguyen, in Advanced Computational Methods for Knowledge Engineering:
Twitter is a popular microblogging service in which users post status messages, called “tweets”, with no more than 140 characters.
• Twitter Sentiment Analysis using Machine Learning and Knowledge-based Approach, by Riya Suchdev, Pallavi Kotkar, Pallavi Kotkar, Sridhar Swamy, in
International Journal of Computer Applications: Sentiment analysis is mainly concerned with identifying and classifying opinions or emotions that are expressed within a
text. Sharing opinions and expressing emotions through social networking websites has become very common.
• Literature Survey on Sentiment Analysis of Twitter Data using Machine Learning Approaches, by Ankit Pradeep Patel, Ankit Vithalbhai Patel, Sanjaykumar
Ghanshyambhai Butani, Prashant B. Sawant, in IJIRST – International Journal for Innovative Research in Science & Technology: Sentiment analysis deals with
identifying and classifying opinions or sentiments present in source text. Social media generates a huge amount of sentiment-rich data in the form of tweets,
status updates, reviews, and blog posts.
PROBLEM STATEMENT
In this work, we used a data set of tweets to train classifiers. We built a model that classifies
tweets collected through the Twitter APIs into a positive class or a negative class. The model runs
in three steps: one classifier separates objective tweets from subjective tweets, a second classifier
labels the subjective tweets as positive or negative, and finally the system summarizes the tweets in
a virtual graph. In our experiments, the model proved to be highly accurate. Related work on
tweet sentiment analysis is rather limited, but the initial results are promising.
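The three steps can be sketched in a few lines of Python. This is only an illustration of the control flow, not the project's trained model: the keyword sets `SUBJECTIVE_CUES` and `POSITIVE_CUES` and the function names are hypothetical stand-ins for the two classifiers, and the text bar chart stands in for the graph.

```python
from collections import Counter

# Hypothetical keyword sets standing in for the two trained classifiers.
SUBJECTIVE_CUES = {"love", "hate", "great", "awful", "good", "bad"}
POSITIVE_CUES = {"love", "great", "good"}


def is_subjective(tweet: str) -> bool:
    """Step 1: separate subjective tweets from objective ones."""
    return any(word in SUBJECTIVE_CUES for word in tweet.lower().split())


def polarity(tweet: str) -> str:
    """Step 2: label a subjective tweet as positive or negative."""
    words = tweet.lower().split()
    return "positive" if any(w in POSITIVE_CUES for w in words) else "negative"


def summarize(tweets: list[str]) -> Counter:
    """Step 3: count the labels so they can be drawn as a graph."""
    counts = Counter()
    for tweet in tweets:
        if is_subjective(tweet):
            counts[polarity(tweet)] += 1
        else:
            counts["objective"] += 1
    return counts


sample = [
    "I love this phone",
    "The launch is at 9 am",
    "What an awful update",
]
summary = summarize(sample)

# Text bar chart as a stand-in for the summary graph.
for label, count in sorted(summary.items()):
    print(f"{label:<10} {'#' * count} ({count})")
```

In a real system each keyword check would be replaced by a classifier trained on labeled tweets.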
SIGNIFICANCE & RELEVANCE OF WORK
Sentiment analysis on Twitter using machine learning methods holds significant relevance and importance due to several factors:
• Current Trends and Public Opinion: Twitter provides real-time updates on various topics, making it a valuable source for
gauging public opinion on current events, products, services, and political views.
• Crisis Management: Companies and organizations can quickly identify and respond to emerging issues, managing crises more
effectively.
• Customer Feedback: Businesses can analyze tweets to understand customer satisfaction and identify areas for improvement.
• Brand Monitoring: Companies can monitor their brand presence and reputation, gaining insights into how their brand is
perceived by the public.
• Election Predictions: Sentiment analysis can be used to predict election outcomes by analyzing the public's
sentiment toward candidates and political issues.
• Social Movements: Researchers can study public sentiment on social issues and movements, understanding the
level of support or opposition.
• Product Launches: Companies can assess the success of product launches by analyzing the sentiment of
tweets related to the new products.
• Competitor Analysis: Businesses can keep track of competitors’ performance and public perception.
OBJECTIVES & METHODOLOGY
OBJECTIVES:
• Twitter is a popular micro-blogging service in which users post status messages, called “tweets”, with no more than 140 characters. Millions of status updates appear on the
platform every day. In most cases, users write messages well below the character limit. Twitter represents one of the largest and most
dynamic datasets of user-generated content: approximately 200 million users post 400 million tweets per day.
• Tweets can express opinions on different topics, which can help to direct marketing campaigns so as to share consumers’ opinions concerning brands and products, outbreaks
of bullying, events that generate insecurity, polarity prediction in political and sports discussions, and acceptance or rejection of politicians, all in an electronic word-of-mouth
way. In such application domains, one deals with large text corpora and most often “formal language”. At least two specific issues should be addressed in any type of
computer-based tweet analysis; the first is that the frequency of misspellings and slang in tweets is much higher than in other domains.
METHODOLOGY:
• Natural Language Processing (NLP): A field of artificial intelligence that focuses on the interaction between
computers and human languages. It applies computational techniques to analyse and synthesize natural language and speech.
The goal of NLP is to enable computers to understand, interpret, and respond to human languages in a way that is both meaningful and useful.
• Sentiment Analysis: The interpretation and classification of emotions (positive, negative, and neutral) within text data using text analysis
techniques. Sentiment analysis allows organizations to identify public sentiment towards certain words or topics.
PLAN OF EXECUTION
System Requirement Specification
Software requirements:
• Operating System : Windows, Linux, iOS.
• Development Environment : Kaggle or Anaconda Navigator (Jupyter Notebook), VS Code, PyCharm.
• Backend Language : Python, with Pandas and NumPy.
• Front End : HTML, CSS and JavaScript.
Hardware requirements:
• Processor : Intel Core i5 or higher.
• Hard Disk : 1TB or more.
• RAM : 8 GB (minimum), 64 GB (recommended).
• Internet : 10 Mbps or above.
ANALYSIS
• Existing systems for sentiment analysis on Twitter using machine learning methods include a variety of commercial tools, open-source
libraries, and social media analytics platforms. Tools like IBM Watson Natural Language Understanding, Google Cloud Natural Language API,
and Microsoft Azure Text Analytics offer robust sentiment analysis capabilities, providing pre-trained models that can efficiently classify
sentiments from tweets. VADER, a lexicon and rule-based tool, is particularly well-suited for analysing social media text, handling informal
language and emoticons effectively. NLP libraries such as NLTK and spaCy offer comprehensive tools for
developing custom sentiment analysis models, while the Transformers library by Hugging Face provides access to state-of-the-art
models like BERT and GPT for advanced text classification tasks. Social media analytics platforms like Hootsuite Insights,
Sprout Social, Brandwatch, and Talkwalker deliver integrated solutions for sentiment analysis, trend tracking, and social media
monitoring. These existing systems cater to various needs, from real-time sentiment analysis to detailed trend analysis, and offer
varying levels of customization and scalability. However, commercial tools may come with high costs, and using pre-built
solutions can limit the flexibility and specificity required for certain applications.
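To make the lexicon and rule-based idea concrete, the sketch below scores text against a tiny word list and applies one rule: a negation word flips the sign of the token that follows it. It is written in the spirit of such tools and is not VADER itself; the `LEXICON` scores, the `NEGATIONS` set, and the function names are assumptions made for illustration.

```python
# Hypothetical miniature lexicon; real tools ship thousands of scored entries.
LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "terrible": -2.0, "hate": -2.0,
    ":)": 1.0, ":(": -1.0,  # emoticons are scored like words
}
NEGATIONS = {"not", "no", "never"}


def score(text: str) -> float:
    """Sum lexicon scores, flipping the sign of the token right after a negation."""
    total = 0.0
    negate = False
    for token in text.lower().split():
        if token in NEGATIONS:
            negate = True
            continue
        value = LEXICON.get(token, 0.0)
        total += -value if negate else value
        negate = False  # the negation only affects the next token
    return total


def label(text: str) -> str:
    """Map the numeric score to positive, negative, or neutral."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


for example in ["great phone :)", "not good", "the meeting is at noon"]:
    print(example, "->", label(example))
```

A lexicon approach needs no training data, which is its main appeal; its weakness is that it only knows the words and rules written into it.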
DESIGN
Designing a sentiment analysis system for Twitter using machine learning involves a multi-step
approach. Initially, data is gathered from Twitter using the Twitter API, employing libraries like
tweepy to collect tweets relevant to specific keywords or hashtags. The collected tweets are then
preprocessed to clean and normalize the text, which includes removing URLs, hashtags, mentions, and
special characters, converting text to lowercase, and performing tokenization. The cleaned text is
vectorized using methods such as Bag-of-Words, TF-IDF, or advanced word embeddings like Word2Vec
or BERT to transform textual data into a numerical format suitable for model input.
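A minimal sketch of the cleaning and vectorization steps, using only the Python standard library. The regular expressions, the function names `clean_tweet` and `tf_idf`, and the sample tweets are illustrative assumptions; the weighting shown is the plain unsmoothed form (count times log of N over document frequency), whereas libraries often use smoothed variants.

```python
import math
import re
from collections import Counter


def clean_tweet(text: str) -> list[str]:
    """Remove URLs, mentions, hashtags, and special characters; lowercase; tokenize."""
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)       # mentions and hashtags
    text = re.sub(r"[^a-zA-Z\s]", " ", text)   # special characters and digits
    return text.lower().split()


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Weight each term count by log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)  # Bag-of-Words counts for this tweet
        vectors.append(
            {term: count * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return vectors


tweets = [
    "Loving the new phone!!! https://example.com #tech @store",
    "The new update is bad :(",
]
tokens = [clean_tweet(t) for t in tweets]
vectors = tf_idf(tokens)
print(tokens[0])
print(vectors[0])
```

Words that occur in every tweet receive a weight of zero, so the vector emphasizes the terms that distinguish one tweet from another.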
For the machine learning model, traditional approaches like Logistic Regression, Naive Bayes,
or Support Vector Machines can be employed for their simplicity and interpretability.
Alternatively, more sophisticated models such as Recurrent Neural Networks (RNNs), Long
Short-Term Memory Networks (LSTMs), or Transformer-based models like BERT can be used
to capture complex patterns and contextual nuances in the data. The dataset is split into
training and testing sets, with cross-validation used to fine-tune hyperparameters and ensure
the model's robustness. Performance is evaluated using metrics such as accuracy, precision,
recall, and F1-score to gauge effectiveness.
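As an illustration of one of the traditional approaches, here is a small multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written from scratch. The class name, the toy training tweets, and the choice to skip unseen words at prediction time are assumptions for this sketch; a real project would more likely use a library implementation trained on a large labeled dataset.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokens:
                if word not in self.vocab:
                    continue  # skip words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (already tokenized); purely illustrative.
train_docs = [
    ["love", "this", "phone"],
    ["great", "service"],
    ["hate", "the", "delay"],
    ["terrible", "support"],
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["love", "the", "service"]))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.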
IMPLEMENTATION
TESTING
• Unit Testing: Check for data correctness.
• Integration Testing: Verify data flow through the system.
• Model Testing: Confirm model training functionality.
• Performance Testing: Measure prediction accuracy.
• Cross-Validation: Evaluate model consistency across data splits.
• Hyperparameter Tuning: Optimize model settings.
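The performance and cross-validation checks above can be sketched as follows. The helper names `k_fold_indices` and `binary_metrics` and the sample labels are hypothetical; the folds here are contiguous, so real data should be shuffled before splitting.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative labels only; not results from the project.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
metrics = binary_metrics(y_true, y_pred)
folds = list(k_fold_indices(n_samples=5, k=2))
print(metrics)
print(folds)
```

Averaging the metrics over all folds gives a steadier estimate than a single train/test split, which is what the consistency check is after.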
RESULTS AND DISCUSSION
Snapshot 1: Distribution of Data
Snapshot 2: Outputs
RESULTS (Contd.)
Snapshot 3: Positive
Snapshot 4: Negative
CONCLUSION AND FUTURE ENHANCEMENTS
• Sentiment analysis on Twitter using machine learning methods offers significant insights into public
opinion, allowing businesses, researchers, and policymakers to understand and respond to social
trends effectively.
• Overall, machine learning-driven sentiment analysis provides a powerful tool for deciphering
complex emotional tones in vast amounts of textual data from social media.
Future Enhancements
• Contextual Understanding
• Multilingual Support
• Emotion Detection
• Real-Time Adaptation
• Integration with Other Data Sources
• Explainability and Interpretability
• Scalability
BIBLIOGRAPHY
1. Bhumika Gupta (PhD), Monika Negi, Kanika Vishwakarma, Study of Twitter Sentiment Analysis using Machine Learning
Algorithms on Python, International Journal of Computer Applications (0975 – 8887), Volume 165 – No. 9, May 2017.
2. Bac Le and Huy Nguyen, Twitter Sentiment Analysis Using Machine Learning Techniques, Advanced Computational Methods
for Knowledge Engineering, Springer International Publishing Switzerland, 2015.
3. Riya Suchdev, Pallavi Kotkar, Pallavi Kotkar, Sridhar Swamy, Twitter Sentiment Analysis using Machine Learning and
Knowledge-based Approach, International Journal of Computer Applications (0975 – 8887), Volume 103 – No. 4, October 2014.
4. Ankit Pradeep Patel, Ankit Vithalbhai Patel, Sanjaykumar Ghanshyambhai Butani, Prashant B. Sawant, Literature Survey on
Sentiment Analysis of Twitter Data using Machine Learning Approaches, IJIRST – International Journal for Innovative
Research in Science & Technology, Volume 3, Issue 10, March 2017, ISSN (online): 2349-6010.
THANK YOU