Natural Language Processing Based Online Fake News Detection
Abstract— Online social media plays an important role during real-world events such as natural calamities, elections and social movements. As social media usage has increased, fake news has grown with it. Social media is often used to spread misinformation by modifying true news or creating fake news outright. The creation and distribution of fake news poses major threats in several respects, including from a national security point of view. Hence, fake news identification becomes an essential goal for enhancing the trustworthiness of information shared on online social networks. Over time, many researchers have used different methods, algorithms, tools and techniques to identify fake news content on online social networks. The aim of this paper is to review and examine these methodologies, tools and browser extensions, and to analyze the quality of their output. In addition, this paper discusses the general approach to fake news detection as well as a taxonomy of feature extraction, which plays an important role in achieving maximum accuracy with the help of different Machine Learning and Natural Language Processing algorithms.

Keywords—Fake News Detection, Natural Language Processing, Online Social Network, Machine Learning, Sentiment Analysis.

I. INTRODUCTION

In today's era fake news is a growing challenge. Fake news may be described as news that consists of intentional lies or hoaxes that are spread via conventional news media or online social media. Fake news is written and distributed in order to mislead a person or an organization financially or politically. The propagation and distribution of such false news presents significant risks from many different perspectives, including from a national security standpoint. The authenticity of social media articles is often not verified, so it is necessary to build a system that automatically detects fake news in order to reduce the adverse effects of such news. Within the field of fake news identification, a variety of strategies have been suggested to prevent users from becoming victims of misleading social media content, which spreads like wildfire. Key observations and challenges in this area include the following:

• Several networks use clickbait techniques and phishing to raise advertising profits.
• Fake information can adversely impact both national security and the safety of individuals, and its detection has become a common problem at the cyber-security application layer.
• Some algorithms parse the commentary section of each article to validate fake material; in effect, the audience tells the algorithm whether a particular article is credible or not. However, one cannot completely rely on the audience for this insight, as it may produce invariably biased results.
• A dataset of humanly annotated true and false statements cross-checked against data from Wikipedia articles, frequently used by researchers in the area of machine learning, could also lead to biased data.
• Many of the existing detectors depend on detecting the source of a text as a reference to decide if it is fake or real. There are, however, two problems: the same source, whether a human reporter or a machine text generator (neural fake news), might produce articles that are both truthful and incorrect or misleading, and determining the source of the text does not help to differentiate between the two.
• Satire, propaganda, hoax, conspiracy, clickbait and misleading or out-of-context information are the different types of fake news, as shown in figure 1.
Kuai Xu et al. [1] used a tailored dataset made up of cover news stories from the “New York Times, Washington Post, NBC News, USA Today and Wall Street Journal” as major news outlets. “Tf-idf and Latent Dirichlet Allocation (LDA)” topic modelling is used for the analysis of fake news similarities and dissimilarities, and Jaccard similarity has been used to differentiate, identify and forecast fake and real news. LDA, however, is not an effective approach for identifying or differentiating false and true news in the real world. Instead of comparing the few important words of every new article, Kuai Xu et al. also mention exploring the word2vec algorithm, whose vector embeddings make each word comparable and typically capture the similarities and differences between the material in fake and real news articles.

Fake news detection can also be handled using a cluster-based approach: Chaowei Zhang et al. (2019) [2] proposed a framework in which “K-means and Affinity Propagation (AP) algorithms” are used for the clustering process and fake-predicate detection is performed through verb comparison. This method filters out news that cannot be grouped into a cluster or whose verbs have a poor degree of resemblance to the verbs in their cluster. TF-IDF is used to derive the feature weights. A tailored dataset made up of various news articles from website sources such as advocate, naturalnews, politico and greenvillegazette gave 91% overall average accuracy. However, the authors mention that news categories like satire are beyond the scope of this study; future work could rely on models that are trained on viewpoints and perceptions, on advanced methods such as deep learning to build a real-time preprocessing module, and on a broad source of fake information from Twitter, Reddit, Facebook, etc.

Online news media can be targeted along with website verification; however, validating the headline of a URL is not enough for fake news detection. Sahil Gaonkar et al. (2019) proposed an approach with different classifiers, where the various parameters extracted from the URL [3], including the author, source, post and headline, are passed through various categorizers. Validating site behaviour and other related parameters can improve the overall result.

The Bidirectional Encoder Representations from Transformers (BERT) model developed by Google has been used by Ye-Chan Ahn et al. (2019) as a pre-training model over a corpus of 1.5 million pieces of news data and text passages from a Korean Wikipedia dataset. A fact corpus extracts nouns and verbs with a high occurrence frequency in the news at the pre-processing layer. The input sentence is tokenized by the Relevant Sentence Extractor, wherein the Word Piece Model (WPM), which can be applied to all languages, is used as the tokenizer, and fine tuning and prediction are performed using BERT. The AUROC curve yields a result of 83.8%. The authors suggest using a new corpus and Wikipedia data to improve flexibility, since this model is specific to the Korean dataset, and a corpus in the colloquial form frequently used in SNS should also be learned in detail [4].

Evaluating n-gram properties and word embeddings can be an effective way to detect fake news. Hrishikesh Telang et al. (2019) used GloVe vectors with neural networks, with the Leaky ReLU activation function used to provide high accuracy for input classification. An open source dataset was used: 10,000 articles were randomly selected from the dataset and marked ‘0’, and a further group of 10,000 articles was chosen from a free Kaggle fake news dataset and marked ‘1’. TF-IDF and Global Vectors (GloVe) were used for feature extraction. “Logistic Regression (LR), Random Forest (RF), Recurrent Neural Network (RNN), Long Short Term Memory (LSTM) and Gated Recurrent Units (GRUs)” were applied, and GloVe vectors with a neural network proved the most efficient way to classify fake news articles. The observed accuracy is 66% for the LSTM, 66.4% for the GRU, 71.0% for the RNN, 70% for the RF and 66.9% for LR [5].

Detecting tampered images along with the tweet is a challenging task. Shivam B. Parikh et al. (2019) proposed a framework that uses the Computer Vision service from Microsoft Azure to identify text in images and extract recognizable words in a readable format. A customized dataset has been used, wherein tweets are collected using the Twitter public API. The method was able to achieve up to 83.33% accuracy with an increased number of features [6]. This approach is, however, unable to verify tweets that are no longer available because the tweet or the tweeting account has been removed.

A fact-checking approach could also rely on sentiment analysis for better results [7]. Bhavika Bhutani et al. (2019) proposed a solution wherein sentiment analysis is used. A customized dataset was prepared using different data sets from emergent sources like Kaggle and PolitiFact. In this approach, a combination of Naïve Bayes and tf-idf is used to predict the sentiment of the post. The proposed approach was evaluated on the LIAR dataset, the George McIntire dataset and a merged dataset, with a Naïve Bayes classifier and a Random Forest classifier.

Jiawei Zhang et al. (2019) discussed the importance [8] of textual information about the content and the credibility of news sources, as it has a strong correlation with the creators, the subject and the inference result for fake news. Jiawei Zhang et al. proposed a “Hybrid Feature Learning Unit (HFLU)” for learning the features of news articles and creators. The tweets posted on the Twitter page of PolitiFact and the news published on its website are used as datasets. In the proposed model, explicit and latent feature extraction is performed and fed as input to the Gated Diffusive Unit (GDU). Each of the creators, subjects and news articles uses a GDU, HFLU and input unit, together forming a Deep Diffusive Network model. The accuracy score obtained by the model is 63%, which is 14.5% higher than Hybrid CNN, LIC and TRIFN.

Georgios Gravanis et al. (2019) proposed a feature-set and algorithm based benchmarking pipeline approach. In this study, Naïve Bayes, SVM, Decision Tree and KNN classifiers with AdaBoost and Bagging ensemble methods are used. The datasets used here are “Kaggle EXT (Kaggle with Reuters), McIntire and Kaidmml”, so that the data remain unbiased [9]. This study uses a content-based feature approach, adding source and news author metadata as well as knowledge-sharing features of social media, and suggests utilizing larger datasets and deep learning approaches.
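To make the kind of benchmarking setup described by Gravanis et al. [9] more concrete, the sketch below feeds TF-IDF content features to several base classifiers and to AdaBoost and Bagging ensembles using scikit-learn. It is only an illustrative outline, not the authors' code: the toy corpus, labels and hyper-parameters are invented placeholders, and a real experiment would load the Kaggle-EXT, McIntire or Kaidmml datasets instead.

```python
# Illustrative sketch (not the authors' code): TF-IDF content features
# evaluated with four base classifiers and two ensemble methods on the
# same train/test split, in the spirit of the benchmarking study in [9].
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.metrics import accuracy_score

# Placeholder corpus; label 1 marks fake-style text, 0 marks genuine-style text.
texts = [
    "miracle cure discovered, doctors hate it",
    "senate passes annual budget bill",
    "celebrity secretly replaced by clone, insiders say",
    "central bank keeps interest rates unchanged",
    "aliens confirmed to control the weather",
    "local council approves new school building",
    "one weird trick erases all debt overnight",
    "researchers publish peer-reviewed study on vaccines",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

features = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=42)

models = {
    "NaiveBayes": MultinomialNB(),
    "SVM": LinearSVC(),
    "DecisionTree": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "AdaBoost": AdaBoostClassifier(n_estimators=50),
    "Bagging": BaggingClassifier(n_estimators=50),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```

In the study itself the ensemble methods are wrapped around each base learner in turn; the dictionary above keeps a single default configuration only to keep the sketch short.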
Taxonomy of spammer detection is discussed based on the ability to detect fake user identification, content and URL-based spam. Faiza Masood et al. (2019) performed a study of spam detection strategies on Twitter and a comparison of the different features used for spam detection [10], and also discussed the goals, datasets and experimental results of anomaly detection based on URLs and various machine learning algorithms. Terry Traylor et al. (2019) proposed an approach with a “custom attribution feature extraction classifier” which considers the length of the quote, the span of the attribution, simple quote attribution and the absolute distance of the attribution span [11]. A customized corpus of 218 documents collected from 40 different online sources was used, and the final overall accuracy achieved was 69%. Attribution feature extraction can be combined with other features to identify fake news in the future.

A linguistic analysis model on tweets has been used by Amitabha Dey et al. (2018) with a customized dataset of “Fake Tweets on Hillary Clinton” from Fresh News, TruthFeed and Christian Times Newspaper sources; credible tweets from Reuters, The Economist and CNN were also used in the process. Sentiment analysis has been done to expose hidden bias towards the subject, and the k-nearest neighbor algorithm was applied with 66% accuracy; however, BoW models would lead to improved accuracy [12].

Zhixuan Zhou et al. (2019) discussed adversarial attacks such as fact distortion, the exchange of subject and object, and the creation of uncertainty [13]. McIntire's fake-real-news dataset, containing 6,335 articles, is used in this research. The approach checks several aspects such as title, content and domain; however, its main focus is on linguistic characteristics. The limitation of this method is its vulnerability and bias towards certain topics under a crafted attack. For news propagation, a crowd-sourced information graph can be powerful and prompt; however, attackers with special intentions have equal access to creating and editing it.

An AI-based system with an autoencoder and a Recurrent Neural Network can be used to learn user behaviour and detect fake news on Sina Weibo with 80.2% accuracy. Weiling Chen et al. (2018) used a tailored dataset made up of a number of tweets from a fake-news-centric dataset and a genuine-news-centric dataset. The approach proposed several tweet-based and comment-based features that describe user sentiment and spread patterns using case studies [14]. An AI-driven framework with AE and RNN integrates the proposed features into users' usual behaviour and then conducts fake news detection on Sina Weibo.

BiLSTM along with unified key sentence information matches sentences between key sentences and the word vectors of questions. Namwon Kim et al. (2018) proposed a model that consists of decision, matching representation, context representation, word representation and key sentence layers on a Korean article dataset, with an overall average accuracy of 69%. The existing model extracts a set of key sentences; however, extracting key sentences independently and using a deep learning approach can improve the result [15].

Furthermore, Stefan Helmstetter et al. (2018) used a model with various groups of features, including topic, sentiment and text features at tweet and user level, that can be used to boost accuracy. Learning algorithms such as “Naive Bayes, Decision Trees, Support Vector Machines (SVM), and Neural Networks” were used as basic classifiers along with two ensemble methods, i.e., Random Forest and XG-Boost. Using these approaches, an F1 score of 0.77 was achieved. The dataset includes around 400,000 examples; among these, 27.6% are from fake news sources while 24.4% are from trustworthy sources, and during collection the tweets were categorized according to whether or not they come from trustworthy sources [16].

Zaitul Iradah et al. (2018) proposed a mechanism with a hybrid, context and content based approach. In the content-based approach the following features are considered: visual-based and knowledge-based methods, style-based deception detection, text representation and linguistic cues, while stance-based and propagation-tree-kernel-based methods are used in the social-context-based approach [22]. 87% accuracy is achieved by the style-based method and 83.6% by the visual-based method [17]. In future, visual features and network features can be covered more widely to achieve greater accuracy.

A modified CNN has been used by Youngkyung Seo et al. (2018); the first dataset used is MNIST and the second is a refined media dataset collected from 16 media outlets. Model modification, batch size control and data augmentation are a few of the important points considered in the model, which achieves 75.6% accuracy [18].

Saranya Krishnan et al. (2018) proposed a generalized framework to predict tweet credibility [19]. Datasets containing Hurricane Sandy tweets, only miscellaneous tweets, and both combined are used. User features and content features are extracted, and the J48 and SVM algorithms are applied, achieving 80.68% for Hurricane Sandy and 81.25% for the Boston Marathon using the J48 tree. The crowd-sourcing data collected and saved in DynamoDB can be used to feed the algorithm design and the re-training of the classifier to further boost the framework.

Statistical models can also be used to classify social network user behaviour, leveraging anomaly detection techniques to recognize unexpected behavioural changes. Manuel Egele et al. (2017) used 1.4 billion messages from Twitter and 106 million messages from Facebook as a dataset [20]. COMPA builds a social network user activity profile that identifies user vulnerabilities based on past account messages; each time a new message is produced, the message is compared with the behavioural profile. An attacker aware of COMPA can prevent it from identifying compromised accounts by posting messages which correspond to the behavioural profiles of the accounts concerned.

IV. TECHNIQUES AND CHALLENGES

Several approaches have been used for the identification of fake news. The general approach is to gather data and perform various text preprocessing operations such as text cleaning, tokenization, and handling of stop-words and punctuation in order to remove the noise in the data, then to extract features and use them with different supervised and unsupervised models to achieve better accuracy.

A corpus may be formed by combining tweets from genuine-news-centric and fake-news-centric datasets. However, dealing with spam in the corpus and retrieving the relevant tweets is an essential task, and additional pre-processing is needed in this situation.
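As a concrete illustration of the preprocessing steps just described (cleaning, tokenization, stop-word and punctuation handling), the following minimal sketch uses only the Python standard library plus scikit-learn's built-in English stop-word list; it is a generic example rather than code from any of the surveyed papers, and the sample tweet is invented.

```python
# Generic preprocessing sketch: cleaning, tokenization and stop-word
# removal before feature extraction (not taken from the surveyed papers).
import re
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

def preprocess(text: str) -> list[str]:
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)        # drop punctuation and digits
    tokens = text.split()                        # simple whitespace tokenizer
    return [t for t in tokens if t not in ENGLISH_STOP_WORDS]

# Invented example tweet, shown only to illustrate the output format.
print(preprocess("BREAKING!!! Scientists SHOCKED by this trick: http://t.co/xyz"))
# -> ['breaking', 'scientists', 'shocked', 'trick']
```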
The performance of a text classification model is highly dependent on the words in a corpus and the features created from those words. To analyze and model the data further, the text has to be converted into features; common techniques include the bag of words (BoW) model, tf-idf and the GloVe model. In order to achieve better accuracy, basic classifier models such as Naïve Bayes, Logistic Regression, Decision Tree, Support Vector Machine and Multilayer Perceptron, or ensemble methods such as Random Forest and XG-Boost, are used. Deep learning based techniques are also increasingly popular. AI has proven to be a multi-faceted tool for detecting false information because a significant amount of data can be analyzed rapidly. Table I provides a comparative summary of different Machine Learning and Natural Language Processing algorithms, the datasets used by researchers, and the methodologies and challenges along with the reported accuracy, while Figure 2 shows the general approach used in the fake news detection process.

Fig. 2. General Approach for Fake News Detection: Data Preprocessing → Feature Extraction (Tf-idf, Bag of Words (BoW), GloVe) → Classification Algorithm, using either a basic classifier (Naïve Bayes, Logistic Regression, Decision Tree, Support Vector Machine) or an ensemble method (Random Forest, XG-Boost).

TABLE I. TECHNIQUES AND CHALLENGES

Technique | Accuracy | Dataset | Methodology & Challenges
(tf-idf) and Latent Dirichlet Allocation (LDA) modelling | - | Cover news stories from major news outlets | Explore the word2vec algorithm.
CNT, AHGNA and proposed FEND (K-means and AP) | 91.9% | Fake and valid articles from website sources | Models trained on opinions and perspectives should be focused on.
Naïve Bayes, Logistic Regression, SVM, Multilayer Perceptron | NB 74%, LR 77.2%, SVM 82% | Facebook news posts and Twitter post dataset by BibSonomy.org | Validating site behaviour and other related parameters can improve the result.
BERT (Bidirectional Encoder Representations from Transformers) | 83.8% | Corpus of 1.5 million news data, specific to the Korean Wikipedia dataset | As this model is specific to the Korean dataset, a new corpus and Wikipedia data could increase versatility.
LSTM, GRU, RNN, RF, LR with TF-IDF and GloVe feature extraction | LSTM 66%, GRU 66.4%, RNN 71.0%, RF 70%, LR 66.9% | Open source Signal Media News Dataset | Deep learning techniques along with the Leaky ReLU activation feature and GloVe vectors used to provide high accuracy.
Computer Vision service from Microsoft Azure to detect text in an OCR image | 83% | Tweets collected using the Twitter public API | The proposed system gives false positive results when tweets or tweeting accounts are deleted.
Random Forest, Naive Bayes | M- 81.6% | LIAR dataset, Kaggle dataset, PolitiFact | Different neural network algorithms can be used to enhance the accuracy.
Learning using the network structure of news articles, creators and news subjects | 63% | Tweets posted by PolitiFact | A model incorporating knowledge of the network structure into model learning.
KNN, Decision Tree, Naive Bayes, SVM with AdaBoost and Bagging | KNN 0.921, KNN 0.944, Naive Bayes 0.949, SVM 0.950, Naive Bayes 0.881, Decision Tree 0.858 | Kaidmml, McIntire, Kaggle-EXT | Uses an approach focused on content, adding the news source metadata and news author info.
Review of Twitter Spam Detection Techniques | - | Tailored dataset | Detection of spam and anomaly detection based on URL and use of …
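To ground the feature-extraction step of the general approach in Fig. 2, the short sketch below contrasts two of the options named there, raw bag-of-words counts and tf-idf weights, on an invented two-sentence corpus; it is illustrative only and is not drawn from any specific paper in Table I.

```python
# Illustrative comparison of two feature-extraction options from Fig. 2:
# bag-of-words counts versus tf-idf weights on a tiny invented corpus.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "government announces new tax policy",
    "shocking secret tax policy they hide from you",
]

bow = CountVectorizer()
print(bow.fit_transform(corpus).toarray())              # integer term counts
print(bow.get_feature_names_out())                      # vocabulary order

tfidf = TfidfVectorizer()
print(tfidf.fit_transform(corpus).toarray().round(2))   # idf-weighted, L2-normalised
```

GloVe, the third option in Fig. 2, would instead map each token to a pretrained dense vector (typically averaged per document), which is why it pairs naturally with the neural models listed in Table I.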
Along with the different machine learning and NLP algorithms, some browser extensions and tools provide the option to filter spam and to notify the user about untrustworthy sources of news. Table II discusses such extensions and tools with their on-click action and methodology.

TABLE II. BROWSER EXTENSIONS AND TOOLS

Name of Tool | Extension / Tool | Action | Methodology
Stop-the-bullshit (STB) | Tool | Blocks spam sites | Software that filters spam
B.S. Detector | Extension for Google Chrome | Notifies about untrustworthy sources of news | The extension's warnings are powered by a list of domains that are well-known sources of fake data.
Fake News Alert | Extension for Google Chrome | Shows a warning when the user visits a website known for fake news | It relies on a list of counterfeit and misleading news sources.
Faker Fact | Firefox extension | Uses a machine learning algorithm | Shares the characteristics of valid journalism.

Researching the identification of fake news and finding a general consensus about it is one of the most common issues in social networks, which makes this analysis useful. The research in this survey covers work carried out from 2017 onwards. It has been observed that various standard machine learning algorithms perform well and give better results compared to other techniques. While the contents of each research work differ, the dissemination of fake news is investigated differently in each, and each is properly evaluated in its own area. Deep learning based techniques are more popular, and AI has proven to be a multi-faceted tool for detecting false information because a significant amount of data can be analyzed rapidly. For example, the dataset in [16] includes 401,414 cases, of which 110,787 (27.6 percent) are from fake news outlets while 290,627 (24.4 percent) are from reliable sources like Reuters news, news from local countries or blogs, and the authors of [4] examined a corpus of 1.5 million posts.

VIII. CONCLUSION

The purpose of this research was to review, summarize, contrast and evaluate the current research on counterfeit news, which includes quantitative and qualitative analysis of counterfeit news and of identification and intervention strategies. As we reviewed, fake news detection is the machine learning solution to deal with the problem of detecting false news, rumors and misinformation.
In particular, the composite classification system comprises neural networks and classical classification algorithms, which focus mainly on lexical analysis of the objects as the main characteristic for prediction and on the use of external background information.

REFERENCES

[1] Kuai Xu, Feng Wang, Haiyan Wang and Bo Yang, “Detecting Fake News Over Online Social Media via Domain Reputations and Content Understanding,” Tsinghua Science and Technology, vol. 25, no. 1, Feb. 2020.
[2] Chaowei Zhang, Ashish Gupta, Christian Kauten, Amit V. Deokar and Xiao Qin, “Detecting fake news for reducing misinformation risks using analytics approaches,” European Journal of Operational Research, Elsevier, 2019.
[3] Sahil Gaonkar, Sachin Itagi, Rhethiqe Chalippatt, Avinash Gaonkar, Shailendra Aswale and Pratiksha Shetgaonkar, “Detection Of Online Fake News: A Survey,” in Proceedings of the IEEE International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), 2019.
[4] Ye-Chan Ahn and Chang-Sung Jeong, “Natural Language Contents Evaluation System for Detecting Fake News using Deep Learning,” in Proceedings of the IEEE 16th International Joint Conference on Computer Science and Software Engineering (JCSSE), 2019.
[5] Hrishikesh Telang, Shreya More, Yatri Modi and Lakshmi Kurup, “An empirical analysis of Classification Models for Detection of Fake News Articles,” in Proceedings of the IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT), 2019.
[6] Shivam B. Parikh, Saurin R. Khedia and Pradeep K. Atrey, “A Framework to Detect Fake Tweet Images on Social Media,” in Proceedings of the IEEE Fifth International Conference on Multimedia Big Data (BigMM), 2019.
[7] Bhavika Bhutani, Neha Rastogi, Priyanshu Sehgal and Archana Purwar, “Fake News Detection Using Sentiment Analysis,” in Proceedings of the IEEE Twelfth International Conference on Contemporary Computing (IC3), 2019.
[8] Jiawei Zhang, Bowen Dong and Philip S. Yu, “FAKEDETECTOR: Effective Fake News Detection with Deep Diffusive Neural Network,” arXiv:1805.08751v2 [cs.SI], 2019.
[9] Georgios Gravanis, Athena Vakali, Konstantinos Diamantaras and Panagiotis Karadais, “Behind the cues: A benchmarking study for fake news detection,” Expert Systems with Applications, Elsevier, 2019.
[10] Faiza Masood, Ghana Ammad, Ahmad Almogren, Assad Abbas, Hasan Ali Khattak, Ikram Ud Din, Mohsen Guizani and Mansour Zuair, “Spammer Detection and Fake User Identification on Social Networks,” IEEE Access, vol. 7, 2019.
[11] Terry Traylor, Jeremy Straub, Gurmeet and Nicholas Snell, “Classifying Fake News Articles Using Natural Language Processing to Identify In-Article Attribution as a Supervised Learning Estimator,” in Proceedings of the IEEE 13th International Conference on Semantic Computing (ICSC), 2019.
[12] Amitabha Dey, Rafsan Zani Rafi, Shahriar Hasan Parash, Sauvik Kundu Arko and Amitabha Chakrabarty, “Fake News Pattern Recognition using Linguistic Analysis,” in Proceedings of the IEEE Joint 7th International Conference on Informatics, Electronics & Vision (ICIEV) and 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), 2018.
[13] Zhixuan Zhou, Huankang Guan, Meghana Moorthy Bhat and Justin Hsu, “Fake News Detection via NLP is Vulnerable to Adversarial Attacks,” in Proceedings of the 11th International Conference on Agents and Artificial Intelligence (ICAART), 2019.
[14] Weiling Chen, Chenyan Yang, Gibson Cheng, Yan Zhang, Chai Kiat Yeo, Chiew Tong Lau and Bu Sung Lee, “Exploiting Behavioural Differences to Detect Fake News,” in Proceedings of the 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), 2018.
[15] Namwon Kim, Deokjin Seo and Chang-Sung Jeong, “FAMOUS: Fake News Detection Model based on Unified Key Sentence Information,” in Proceedings of the IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), 2018.
[16] Stefan Helmstetter and Heiko Paulheim, “Weakly Supervised Learning for Fake News Detection on Twitter,” in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2018.
[17] Zaitul Iradah Mahid, Selvakumar Manickam and Shankar Karuppayah, “Fake News on Social Media: Brief Review on Detection Techniques,” in Proceedings of the IEEE Fourth International Conference on Advances in Computing, Communication & Automation (ICACCA), 2018.
[18] Youngkyung Seo, Deokjin Seo and Chang-Sung Jeong, “FaNDeR: Fake News Detection Model Using Media Reliability,” in Proceedings of the TENCON IEEE Region 10 Conference, 2018.
[19] Saranya Krishnan and Min Chen, “Identifying Tweets with Fake News,” in Proceedings of the IEEE International Conference on Information Reuse and Integration for Data Science, 2018.
[20] Manuel Egele, Gianluca Stringhini, Christopher Kruegel and Giovanni Vigna, “Towards Detecting Compromised Accounts on Social Networks,” IEEE Transactions on Dependable and Secure Computing, vol. 14, no. 4, 2017.
[21] T. Mikolov, K. Chen, G. S. Corrado and J. Dean, “Efficient estimation of word representations in vector space,” in Proceedings of the International Conference on Learning Representations, Scottsdale, AZ, USA, 2013.
[22] J. Ma, W. Gao and K. Wong, “Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, vol. 1, 2017.
[23] Veronica Perez-Rosas, Bennett Kleinberg, Alexandra Lefevre and Rada Mihalcea, “Automatic Detection of Fake News,” in Proceedings of the 27th International Conference on Computational Linguistics, August 2018.
[24] James Thorne, Andreas Vlachos, Christos Christodoulopoulos and Arpit Mittal, “FEVER: a large-scale dataset for Fact Extraction and VERification,” in Proceedings of NAACL-HLT 2018, pp. 809–819, June 2018.