
Proceedings of the Fourth International Conference on Sentiment Analysis and Deep Learning (ICSADL-2025)

IEEE Xplore Part Number: CFP25UU5-ART; ISBN: 979-8-3315-2392-3

A Review on Advances in Sentiment Analysis: A Deep Learning Approach using Transformer based Models

Tejas Vijayrao Kale
Faculty of Engineering and Technology, Datta Meghe Institute of Higher Education and Research
Wardha, Maharashtra, India
[email protected]

Mr. Sameer Mendhe
Faculty of Engineering and Technology, Datta Meghe Institute of Higher Education and Research
Wardha, Maharashtra, India
[email protected]

DOI: 10.1109/ICSADL65848.2025.10933230
Abstract—A key element of natural language processing is sentiment analysis, which comprises recognizing and understanding opinions and emotions in text. Traditional sentiment categorization methods such as machine learning and lexicon-based approaches were made more accurate by deep learning techniques. Transformer-based models such as BERT, GPT, RoBERTa, and DistilBERT, which capture long-range relationships and complicated contextual linkages in text through self-attention, have made important contributions to the field. These models perform better than previous methods for multilingual sentiment identification, aspect-based sentiment analysis, and text categorization. However, problems such as the requirement for huge datasets and expensive processing persist. Recent advancements, such as lightweight transformer models, multimodal frameworks, and combined efforts of transformer variants like RoBERTa and DistilBERT to improve interpretability, address these problems. This study explores the use of the RoBERTa-large model for a sentiment analysis task using the IMDb dataset, containing 50,000 movie reviews classified as positive and negative. The RoBERTa-large model achieved an accuracy of 96.05% on the testing data, which shows the growing importance of transformer models in NLP applications.

Keywords—Sentiment Analysis, Deep Learning, Transformer Model, Natural Language Processing
I. INTRODUCTION

A. Sentiment Analysis

Sentiment analysis deals with sentiments, opinions, and subjective text. It provides a plethora of data regarding public sentiment by analyzing reviews and tweets, and is a tried-and-true technique for predicting a variety of significant events, including general elections and movie box-office receipts. Public reviews, which can be found on many websites, including Yelp and Amazon, are used to assess a specific entity, such as a person or place. Opinions can be categorized as neutral, favourable, or negative [1]. Automatically identifying the expressive direction of user evaluations is the goal of sentiment analysis. Sentiment analysis is in high demand as a result of the growing requirement to assess and categorize the hidden information obtained from social media as unstructured data.

Neural networks are used to enable this feature learning. Word2vec (alongside alternatives such as GloVe, often used through libraries like Gensim) is a popular word embedding system that integrates the skip-gram and continuous bag-of-words (CBOW) models. Word embedding is a language modeling and feature learning technique that assigns a vector of real values to each word so that words with similar meanings have similar representations. The basis of both models is the probability that words will appear adjacent to one another. With skip-gram, one begins with a word and predicts the words that are likely to appear around it. Continuous bag-of-words reverses this: it predicts a word that will probably appear based on particular context terms [3].
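To make the skip-gram/CBOW distinction concrete, the following is a minimal sketch using the Gensim library; the toy corpus and hyperparameters are illustrative assumptions, not taken from this paper.

```python
# Toy comparison of skip-gram vs. CBOW training with Gensim's Word2Vec.
# Corpus and hyperparameters are illustrative assumptions only.
from gensim.models import Word2Vec

corpus = [
    ["the", "movie", "was", "great"],
    ["the", "film", "was", "terrible"],
    ["a", "great", "film"],
]

# sg=1 selects skip-gram (predict context words from a centre word);
# sg=0 selects CBOW (predict the centre word from its context).
skipgram = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)
cbow = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)

# Words used in similar contexts end up with nearby vectors.
print(skipgram.wv.most_similar("great", topn=2))
```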
Traditional Approaches for Sentiment Analysis:

a) Machine learning based methods

This type of approach is used at the sentence and feature levels. Machine learning models that categorize sentiment based on textual features include Support Vector Machines (SVM), Logistic Regression, and Naive Bayes. Among the features used are uni-grams, bi-grams, n-grams, bags of words, and part-of-speech (POS) tags; a brief code sketch of this pipeline follows below.

b) Corpus-based or Lexicon-based methods

These techniques, which are related to sentiment classification approaches, involve classifiers such as k-Nearest Neighbours (KNN), Hidden Markov Models (HMM), Single Dimensional Classification (SDC), Conditional Random Fields (CRF), and Sequential Minimal Optimization (SMO) [11].
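As promised above, here is a minimal sketch of a feature-plus-classifier pipeline of the kind described in a), assuming scikit-learn and a toy labeled corpus (neither is part of the original study):

```python
# Bag-of-words / n-gram features feeding a Logistic Regression classifier,
# as in the classical ML approach above. Data is a toy assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "great acting and a moving story",
    "dull plot and terrible pacing",
    "loved every minute of it",
    "a waste of two hours",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # uni-gram and bi-gram features
    LogisticRegression(),
)
clf.fit(texts, labels)
print(clf.predict(["a great film"]))  # expected: [1]
```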
B. Deep learning – Neural Networks

Neural networks, influenced by the human brain, are composed of innumerable neurons that combine to build a powerful network. Deep learning networks can be trained in both supervised and unsupervised settings [2].


Fig.1. Classification approaches of sentiment polarity [11].

a) Convolutional Neural Network (CNN)

This type of feed-forward neural network was initially used in computer vision, natural language processing, and recommender systems. In this deep neural network architecture, a fully connected classification layer receives data from the convolutional and pooling (subsampling) layers. To extract features, convolution layers filter their inputs; the outputs of several filters can then be combined. CNNs can be made more resistant to distortion and noise by lowering feature resolution [12]. Categorization tasks are carried out by the fully connected layers.

b) Recurrent Neural Network (RNN)

An RNN is a type of neural network in which feedback loops are created inside the network as a result of the connections between neurons forming a directed cycle. Processing sequential data using internal memory collected via these directed cycles is the primary use case for RNNs. RNNs differ from traditional neural networks in that they can recall and apply previously computed information to the next element in the input sequence. The long short-term memory (LSTM) variant of the RNN allows the hidden layer to be given activation functions [9].
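As an illustration of the recurrent architecture just described, here is a minimal PyTorch sketch of an LSTM sentiment classifier; all layer sizes are toy assumptions rather than values from any model reviewed here.

```python
# Illustrative PyTorch sketch of an LSTM sentiment classifier; sizes are
# toy assumptions, not values from any model reviewed in this paper.
import torch
import torch.nn as nn

class LSTMSentiment(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)   # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(x)  # final hidden state carries the sequence memory
        return self.fc(h_n[-1])     # classify from the last hidden state

model = LSTMSentiment()
batch = torch.randint(0, 10_000, (4, 20))  # 4 sequences of 20 token ids
print(model(batch).shape)  # torch.Size([4, 2])
```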

II. LITERATURE REVIEW

Table 1. Comparative study of models used by researchers

| Researcher and Year | Model Used | Objective | Dataset | Result |
|---|---|---|---|---|
| Wu, Yichao (2024) [1] | Bidirectional Encoder Representations from Transformers (BERT) | Contrasted DistilBERT's performance with that of more conventional word-vector models, such as GloVe, Word2Vec, and FastText | Stanford Sentiment Treebank, a commonly used benchmark dataset | The BERT model received a score of 92.7 when all parameters were used |
| Kokab, Sayyida Tabinda, Sohail Asghar, and Shehneela Naz (2022) [4] | GloVe, Word2vec, and Bidirectional Encoder Representations from Transformers (BERT) | Sentiment analysis model to manage noisy data | IMDb, a sizable movie-review dataset from Kaggle, plus US airline and self-driving-car datasets | The precision and recall of the BERT-based CBRNN model were 0.98 and 0.98, respectively |
| Tan, Kian Long (2022) [5] | RoBERTa-LSTM | Determines the text's sentiment by combining the BERT technique with long short-term memory | Sentiment140, Twitter US airline sentiment dataset, and IMDb | The accuracy of RoBERTa-LSTM on IMDb is 93% |
| Chintalapudi, Nalini, Gopi Battineni, and Francesco Amenta (2021) [6] | BERT | Analyze tweets by Indian netizens during the COVID-19 lockdown to stop the spread of false news on Twitter | Dataset taken from GitHub consisting of 3,090 tweets | Fine-tuned BERT model shows an accuracy of 89% |
| Onan, Aytuğ (2021) [7] | Word2vec, fastText, and GloVe deep-learning-based word embedding schemes | An effective sentiment classification system that performs well on MOOC reviews | Dataset with 66,000 MOOC reviews | The GloVe word-embedding scheme achieved 95.80% classification accuracy |


III. TRANSFORMER MODELS

Transformers have transformed natural language processing by removing the sequential nature of recurrent neural networks. Their architecture is based on a self-attention mechanism that enables the model to comprehend a complete sequence at once while successfully capturing global relationships. Transformers rely on input embeddings, multi-head self-attention, feedforward neural networks, positional encodings, and stacked encoder-decoder layers to function. Tokens are converted into dense vector representations using input embeddings, which are augmented with positional encodings to preserve word order. The model can rate the relevance of each word in a sequence in relation to the others thanks to the self-attention mechanism, which computes attention scores based on query, key, and value vectors. Transformers stabilize training and enhance gradient flow through the use of residual connections and layer normalization [4].
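The scaled dot-product attention described above can be sketched in a few lines of NumPy; the dimensions below are toy values chosen for illustration, not those of any model discussed here.

```python
# Minimal scaled dot-product self-attention in NumPy (illustrative only).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise relevance of tokens (query vs. key)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V  # each output is a weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8  # a 4-token toy sequence
X = rng.standard_normal((seq_len, d_model))  # embeddings + positional encodings
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv)
print(out.shape)  # (4, 8): one context-aware representation per token
```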
Applications of sentiment analysis, including multilingual sentiment analysis, aspect-based sentiment analysis, and text categorization, have found great success with transformers. Lightweight models, such as DistilBERT, are best suited for real-time use cases, while advanced models, such as Longformer and BigBird, solve scalability issues by using sparse attention methods on big datasets. Despite their benefits, transformers face limitations such as high processing costs, reliance on large datasets, and interpretability concerns [5]. Emerging trends stress efficient transformers, multimodal models, explainability, and few-shot learning, all of which point to future improvements in sentiment analysis.
A. Transformer-Based Variants

1. BERT (Bidirectional Encoder Representations from Transformers)

BERT is a bidirectional transformer that uses masked language modeling to understand context on both sides of a word in a sentence. This bidirectionality enables BERT to capture complex word associations, making it excellent for sentiment categorization and aspect-based sentiment analysis. Training on task-specific data and adding a classification layer make BERT well suited for sentiment analysis [1].

2. GPT (Generative Pretrained Transformer)

GPT is an autoregressive model designed primarily for text generation and conversational AI. Unlike BERT, GPT processes text unidirectionally, with the goal of producing cohesive and contextually appropriate language. GPT models, notably GPT-3, have shown promise in sentiment analysis tasks by leveraging their generative capabilities to predict sentiment labels.

3. RoBERTa (Robustly Optimized BERT)

RoBERTa improves on BERT by optimizing training tactics, such as using larger datasets, eliminating the next-sentence prediction task, and implementing dynamic masking. These changes improve RoBERTa's accuracy and robustness for sentiment analysis applications [5].

4. DistilBERT

DistilBERT is a lighter variant of BERT that aims to reduce computational overhead while keeping the majority of BERT's capabilities. It accomplishes this by knowledge distillation, in which a smaller model learns to copy a larger one.

5. XLNet

XLNet combines the advantages of autoregressive and autoencoding models. It uses permutation-based training to acquire bidirectional context while retaining the advantages of autoregressive modelling.

6. T5 (Text-to-Text Transfer Transformer)

All NLP problems are reframed as text-to-text challenges by the Text-to-Text Transfer Transformer (T5), with both input and output being text. The input for sentiment analysis might be a statement, with the sentiment label as the output.

7. Longformer

Longformer overcomes the scalability issue of ordinary transformers by incorporating sparse attention methods, which allow it to efficiently handle long texts. This makes it ideal for document-level sentiment analysis.

8. BigBird

BigBird expands Longformer's capabilities by introducing block sparse attention, which improves its scalability. It is very effective for evaluating huge datasets and conducting multimodal sentiment analyses.

IV. IMPORTANCE OF TRANSFORMER MODELS IN NLP

Transformer models have become an essential component of natural language processing (NLP), changing tasks like sentiment analysis by surpassing the drawbacks of traditional and older deep learning techniques. Their self-attention mechanism allows them to scan entire text sequences simultaneously, outperforming recurrent neural networks (RNNs) in capturing long-range correlations and complex contextual linkages. Aspect-based sentiment analysis and more accurate sentiment categorization are made possible by transformers' superior performance over conventional techniques in comprehending intricate linguistic patterns. BERT and GPT models have set the bar for performance, while more efficient versions like RoBERTa and smaller ones like DistilBERT have made it possible to use them in real time and on devices with limited resources. Transformers have proven helpful for many global applications because they enable multilingual sentiment analysis. NLP keeps growing thanks to their ability to scale and adapt, even though they face challenges such as high processing costs and the need for huge amounts of data. Their integration with other modalities, and ongoing efforts to make them faster and easier to interpret, show how important they are for modern sentiment analysis and other NLP tasks [10].
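To illustrate how a lightweight variant supports the real-time use cases mentioned above, here is a minimal inference sketch using the Hugging Face Transformers pipeline API; the specific checkpoint is our illustrative choice, not one used in this paper.

```python
# Real-time sentiment inference with a distilled model via the Hugging Face
# pipeline API. The checkpoint choice is an illustrative assumption.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # DistilBERT fine-tuned on SST-2
)
print(classifier("The plot was predictable, but the performances were superb."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```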


Table 2. Transformer models and word-embedding schemes used for sentiment analysis

| Model | Word Embedding Scheme | Key Features | Applications | Limitations |
|---|---|---|---|---|
| BERT | WordPiece | Bidirectional context, masked language modeling | Sentiment classification, question answering | Computationally expensive |
| GPT | Byte Pair Encoding (BPE) | Autoregressive, unidirectional context | Text generation, conversational agents | Limited bidirectional understanding |
| RoBERTa | WordPiece | Optimized BERT with larger datasets and tuning | Sentiment analysis, text classification | High resource requirements |
| DistilBERT | WordPiece | Lightweight BERT, reduced parameters | Real-time sentiment analysis | Slightly reduced accuracy compared to BERT |
| T5 | SentencePiece | Text-to-text framework, versatile | Text summarization, aspect-based sentiment | Requires extensive pretraining |
| XLNet | SentencePiece | Combines autoregressive and autoencoding approaches | Long-range dependency tasks | High computational cost |
| Longformer | Custom embeddings | Sparse attention mechanism for long documents | Document-level sentiment analysis | Limited support for shorter texts |
| BigBird | Custom embeddings | Scalable sparse attention for large datasets | Multimodal sentiment analysis | Complex implementation |
V. RESULTS

The RoBERTa-large model was applied to the sentiment analysis task using the IMDb dataset, sourced from the Hugging Face dataset library, which contains 50,000 movie reviews classified as positive or negative. For faster experimentation, subsets of the dataset were used: 5,000 samples for training and 2,000 samples for testing. The RoBERTa-large variant consists of 24 transformer layers, 16 attention heads per layer, and 355 million parameters. A learning rate of 2 × 10⁻⁵ was used for fine-tuning, with a batch size of 8 for training and 16 for evaluation, over 10 epochs.

The RoBERTa-large model's accuracy on the test dataset was observed to be 96.05%.

Fig. 2. Training and Validation Loss
Fig. 3. Evaluation Accuracy over Epochs
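The paper does not publish its training script, so the following is only a sketch of a comparable fine-tuning setup with the Hugging Face transformers and datasets libraries, using the hyperparameters reported above; the sequence length, shuffle seed, and metric wiring are our assumptions.

```python
# Sketch of a RoBERTa-large fine-tuning setup matching the reported
# hyperparameters; max_length and seed are assumptions, not from the paper.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

imdb = load_dataset("imdb")  # 50,000 movie reviews labeled positive/negative
train = imdb["train"].shuffle(seed=42).select(range(5000))  # 5,000 training samples
test = imdb["test"].shuffle(seed=42).select(range(2000))    # 2,000 test samples

tok = AutoTokenizer.from_pretrained("roberta-large")  # 24 layers, 16 heads, 355M params

def encode(batch):
    # max_length=256 is an assumption; the paper does not state a sequence length.
    return tok(batch["text"], truncation=True, padding="max_length", max_length=256)

train = train.map(encode, batched=True, remove_columns=["text"])
test = test.map(encode, batched=True, remove_columns=["text"])

model = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)

args = TrainingArguments(
    output_dir="roberta-large-imdb",
    learning_rate=2e-5,             # fine-tuning learning rate from the paper
    per_device_train_batch_size=8,  # training batch size 8
    per_device_eval_batch_size=16,  # evaluation batch size 16
    num_train_epochs=10,            # 10 epochs
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(model=model, args=args, train_dataset=train,
                  eval_dataset=test, compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())  # the paper reports 96.05% test accuracy
```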
VI. DISCUSSION AND FUTURE WORK

The RoBERTa model is built using deep neural networks with multiple layers of self-attention and feedforward networks. The model learns hierarchical representations, similar to deep learning models like CNNs and RNNs. The sentiment analysis model using RoBERTa-large achieved an evaluation accuracy of 96.05%, outperforming traditional machine learning approaches. The results are consistent with recent studies showing that transformer-based models outperform traditional methods in sentiment analysis; moreover, the approach in this paper achieved higher accuracy than previous RoBERTa implementations due to hyperparameter tuning. The superior performance of RoBERTa can be attributed to its optimized training methodology, which allows for better contextual understanding of sentiment-related phrases. These findings suggest that transformer-based sentiment analysis models can be effectively deployed in customer-service chatbots to improve response accuracy. One limitation of this study is the reliance on the IMDb dataset, which primarily contains movie reviews; the model's generalizability to other domains, such as financial or political sentiment analysis, needs further evaluation. Future research could explore domain-specific sentiment analysis using fine-tuned transformer models or investigate the integration of multimodal sentiment analysis incorporating text and images.


VII. CONCLUSION

The use of transformer-based models in the field of sentiment analysis shows how they can overcome the drawbacks of earlier and more conventional deep learning techniques. The strength of these models in contemporary NLP lies in their ability to handle long-term dependencies and contextual subtleties, and to work across different languages and domains by adapting to their needs. Lightweight, real-time versions of models such as BERT and GPT, for example DistilBERT, have optimized their performance. The RoBERTa-large model demonstrated exceptional performance on the IMDb dataset for the sentiment analysis task, achieving 96.05% testing accuracy. This result highlights the model's ability to capture nuances in text due to its advanced pre-training strategies. Future improvements include exploring hyperparameter optimization techniques such as grid search and extending the analysis to multiclass sentiment classification or multilingual datasets. Transformer-based techniques will continue to be crucial for resolving these problems and for advancing sentiment analysis and other NLP applications.

REFERENCES

[1] Wu, Yichao, et al. "Research on the Application of Deep Learning-based BERT Model in Sentiment Analysis." arXiv preprint arXiv:2403.08217 (2024).
[2] Hendrawan, Rahmat, Karimah Mutisari Hana, and Angelina Prima Kurniati. "A Comparison of Machine Learning, Deep Learning, and Transformer Approaches for Amazon Product Reviews Sentiment Analysis." 2024 12th International Conference on Information and Communication Technology (ICoICT). IEEE, 2024.
[3] Saad, Tayef Billah, et al. "A Novel Transformer Based Deep Learning Approach of Sentiment Analysis for Movie Reviews." 2024 6th International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT). IEEE, 2024.
[4] Kokab, Sayyida Tabinda, Sohail Asghar, and Shehneela Naz. "Transformer-based deep learning models for the sentiment analysis of social media data." Array 14 (2022): 100157.
[5] Tan, Kian Long, et al. "RoBERTa-LSTM: a hybrid model for sentiment analysis with transformer and recurrent neural network." IEEE Access 10 (2022): 21517-21525.
[6] Chintalapudi, Nalini, Gopi Battineni, and Francesco Amenta. "Sentimental analysis of COVID-19 tweets using deep learning models." Infectious Disease Reports 13.2 (2021): 329-339.
[7] Onan, Aytuğ. "Sentiment analysis on massive open online course evaluations: a text mining and deep learning approach." Computer Applications in Engineering Education 29.3 (2021): 572-589.
[8] Kastrati, Zenun, et al. "Sentiment analysis of students' feedback with NLP and deep learning: A systematic mapping study." Applied Sciences 11.9 (2021): 3986.
[9] Durairaj, Ashok Kumar, and Anandan Chinnalagu. "Transformer based contextual model for sentiment analysis of customer reviews: A fine-tuned BERT." International Journal of Advanced Computer Science and Applications 12.11 (2021).
[10] Acheampong, Francisca Adoma, Henry Nunoo-Mensah, and Wenyu Chen. "Transformer models for text-based emotion detection: a review of BERT-based approaches." Artificial Intelligence Review 54.8 (2021): 5789-5829.
[11] Dang, Nhan Cach, María N. Moreno-García, and Fernando De la Prieta. "Sentiment analysis based on deep learning: A comparative study." Electronics 9.3 (2020): 483.
[12] Yadav, Ashima, and Dinesh Kumar Vishwakarma. "Sentiment analysis using deep learning architectures: a review." Artificial Intelligence Review 53.6 (2020): 4335-4385.
[13] Mishev, Kostadin, et al. "Evaluation of sentiment analysis in finance: from lexicons to transformers." IEEE Access 8 (2020): 131662-131682.
[14] Delbrouck, Jean-Benoit, et al. "A transformer-based joint-encoding for emotion recognition and sentiment analysis." arXiv preprint arXiv:2006.15955 (2020).
[15] Nguyen, Hien D., et al. "Language-oriented Sentiment Analysis based on the Grammar Structure and Improved Self-attention Network." ENASE. 2020.
[16] Jiang, Ming, et al. "Transformer based memory network for sentiment analysis of web comments." IEEE Access 7 (2019): 179942-179953.
