FAKE NEWS DETECTION

I
INTRODUCTION
In the digital age, the spread of misinformation or fake news has become a serious problem,
affecting public opinion, politics, and social harmony. The rapid distribution of news on social
media platforms increases the difficulty of identifying authentic information. This project aims to
build a machine learning-based system to classify news as real or fake using natural language
processing (NLP) techniques.
Modules
1. Data Collection Module
Loads two datasets: one containing fake news and the other containing real news.
Adds labels: 0 for fake and 1 for real.
Combines and shuffles the data to ensure unbiased model training.
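A minimal sketch of this module, assuming the Kaggle files Fake.csv and True.csv each contain a "text" column:

import pandas as pd

# Load the two datasets (file names taken from the Kaggle "Fake and real news" dataset)
fake_df = pd.read_csv("Fake.csv")
true_df = pd.read_csv("True.csv")

# Add labels: 0 for fake news, 1 for real news
fake_df["label"] = 0
true_df["label"] = 1

# Combine and shuffle so the classes are not presented to the model in blocks
data = pd.concat([fake_df, true_df], ignore_index=True)
data = data.sample(frac=1, random_state=42).reset_index(drop=True)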
2. Data Preprocessing Module
Cleans the news text by removing special characters, URLs, numbers, and stopwords.
Uses tokenization and lemmatization with NLTK to prepare text for modeling.
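A possible implementation of the cleaning step with NLTK; the exact regular expressions are assumptions, not values fixed by the project:

import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# One-time downloads of the required NLTK resources
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def clean_text(text):
    text = text.lower()
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)          # remove numbers and special characters
    tokens = word_tokenize(text)                   # tokenization
    tokens = [lemmatizer.lemmatize(t) for t in tokens if t not in stop_words]
    return " ".join(tokens)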
3. Feature Extraction Module
Converts text into numeric vectors using TF-IDF Vectorizer.
Captures word importance using n-grams (unigram, bigram, trigram).
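A sketch of the vectorization step, continuing from the data frame and clean_text function sketched above; the maximum feature count is an assumed value:

from sklearn.feature_extraction.text import TfidfVectorizer

# Unigrams, bigrams and trigrams; cap the vocabulary to keep the matrix manageable
vectorizer = TfidfVectorizer(ngram_range=(1, 3), max_features=50000)

X = vectorizer.fit_transform(data["text"].apply(clean_text))  # sparse TF-IDF matrix
y = data["label"]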
4. Model Training Module
Trains three machine learning models:
o Multinomial Naive Bayes
o Logistic Regression
o Random Forest Classifier
Evaluates models based on accuracy and classification reports.
Saves trained models for future use.
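A condensed sketch of training, evaluation, and saving, continuing from the features above; the file names under models/ are assumptions:

import os
import joblib
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

os.makedirs("models", exist_ok=True)
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    print(name, accuracy_score(y_test, predictions))
    print(classification_report(y_test, predictions))
    joblib.dump(model, f"models/{name}.joblib")  # save each trained model

joblib.dump(vectorizer, "models/tfidf_vectorizer.joblib")  # save the fitted vectorizer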
5. Prediction & API Module
Provides a Flask API endpoint (/predict) that accepts news text and returns predictions.
Preprocesses user input, transforms it using the trained vectorizer, and uses the Random
Forest model to make predictions.
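A minimal sketch of the /predict endpoint, assuming the artefacts saved above and a clean_text helper imported from a preprocessing module (the module name is hypothetical):

import joblib
from flask import Flask, jsonify, request

from preprocessing import clean_text  # hypothetical module holding the cleaning function

app = Flask(__name__)
vectorizer = joblib.load("models/tfidf_vectorizer.joblib")
model = joblib.load("models/random_forest.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    text = request.get_json(force=True).get("text", "")
    features = vectorizer.transform([clean_text(text)])
    label = int(model.predict(features)[0])                   # 0 = fake, 1 = real
    confidence = float(model.predict_proba(features).max())   # probability of the predicted class
    return jsonify({"prediction": "Real" if label == 1 else "Fake",
                    "confidence": round(confidence * 100, 2)})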
6. Frontend Interface Module
An HTML-based user interface that allows users to input news content.
Displays results clearly, showing whether the news is Fake or Real, along with
confidence scores.
7. Health Check and Utility Module
Includes an API route (/health) to verify server status.
Handles static file routing for loading the HTML frontend.
Its purpose is to monitor the application status, confirm that the API is active and responsive, and allow deployment tools or uptime monitors to verify server health. A minimal sketch of these routes is given after the use-case list below.
Use Case:
When the system is deployed (locally or online), this endpoint can be pinged periodically by
services like:
Load balancers (to check if the app should receive traffic)
CI/CD pipelines (to validate deployments)
Cloud platforms (like Heroku, AWS) for auto-scaling and diagnostics
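A minimal sketch of the health-check and static-file routes (route paths other than /health are assumptions):

from flask import Flask, jsonify, send_from_directory

app = Flask(__name__, static_folder="static")

@app.route("/health")
def health():
    # Lightweight status check for load balancers and uptime monitors
    return jsonify({"status": "ok"}), 200

@app.route("/")
def index():
    # Serve the HTML frontend from the static folder
    return send_from_directory(app.static_folder, "index.html")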
II
SYSTEM STUDY
Problem Statement
In today’s digital world, the rise of social media and online platforms has made it easy for
misinformation and fake news to spread rapidly. Traditional manual fact-checking methods are
slow and not scalable. This creates a need for an automated system that can detect fake news
accurately and in real time.
Limitations of Existing System
The existing systems for detecting fake news are largely dependent on manual efforts by human
moderators, journalists, or third-party fact-checkers. This manual verification process is slow,
inconsistent, and impractical for handling the vast amount of content being produced and shared
online every minute. Furthermore, human-based systems are prone to bias and subjectivity,
which can influence the accuracy and neutrality of the verification. These systems also lack real-
time responsiveness, allowing misinformation to spread rapidly before it is identified. Most
existing solutions are also language-specific or content-limited, making them ineffective in
multilingual or unstructured environments. Additionally, maintaining these systems requires
considerable human resources and operational costs, making them unsustainable for large-scale
deployment.
Proposed System with Objectives
To overcome the limitations of the traditional systems, a machine learning-based fake news
detection system is proposed. This system is designed to automatically classify news articles as
either fake or real, using Natural Language Processing (NLP) and supervised machine learning
algorithms. The primary objective is to eliminate human dependency and provide a fast,
accurate, and scalable solution to detect fake news. The system cleans and preprocesses the input
text using NLP techniques and converts it into numerical format using the TF-IDF vectorizer.
Several machine learning models such as Naive Bayes, Logistic Regression, and Random Forest
are trained and evaluated. The most effective model is integrated into a user-friendly interface
using Flask and HTML. The system allows users to input news text and instantly receive
predictions, making it a practical tool for real-time use. The proposed system aims to increase
automation, improve detection accuracy, and offer a lightweight solution that can be expanded in
the future.
Feasibility Study
A feasibility study is conducted to evaluate whether the proposed system is viable and worth
implementing. It helps assess the strengths and weaknesses of the system in terms of technology,
cost, user experience, and operational value. The key types of feasibility considered for this
project are:
1. Technical Feasibility
This type of feasibility assesses whether the technology needed for the project is available,
efficient, and suitable. The proposed system is developed using Python and popular open-source
libraries such as Scikit-learn, Pandas, and NLTK, which are reliable, well-documented, and
widely supported. The use of Flask as the web framework ensures a lightweight and scalable
backend. Since all tools are compatible with standard hardware, the system is technically feasible
and easy to implement even on low-cost machines.
2. Economic Feasibility
Economic feasibility refers to the cost-effectiveness of the system. This project is economically
viable as it uses completely free and open-source software. There are no licensing fees, and the
development and deployment do not require high-end infrastructure. The only investment is the
time and effort required for model training and testing, making the system affordable for
educational institutions, individuals, and small organizations.
3. Operational Feasibility
Operational feasibility focuses on how well the system will function once it is deployed. The
proposed system is user-friendly, requiring no specialized training to operate. Users can input a
news article and instantly get a prediction on whether it is fake or real. The system integrates a
simple HTML frontend with a Flask backend, making it accessible via a web browser. Its real-
time prediction capability and ease of use make it highly practical for both technical and non-
technical users.
4. Schedule Feasibility
Schedule feasibility evaluates whether the system can be developed within the available time
frame. This project was carefully planned and executed in phases: data collection, preprocessing,
model training, evaluation, and deployment. Each stage was manageable and completed within a
reasonable period, proving that the project is schedule-feasible for academic deadlines or short-
term implementations.
Considering all aspects—technical, economic, operational, and schedule—the proposed Fake
News Detection System is highly feasible. It provides a practical, affordable, and scalable
solution to a real-world problem and is ready for use in educational, research, or even production
environments.
III
SYSTEM ANALYSIS
System Requirements
Hardware Requirements:
Minimum 4GB RAM
Any modern processor
Storage: ~500MB (for dataset and model files)
Software Requirements:
Python 3.x
Flask
Scikit-learn, Pandas, NLTK
Web Browser for frontend
System Workflow Overview
1. User inputs news content via HTML form.
2. Flask backend receives and preprocesses the text.
3. Text is vectorized using TF-IDF.
4. Random Forest model predicts whether the text is fake or real.
5. The result is shown to the user along with confidence percentages.
Technologies Used
This project integrates a combination of machine learning, natural language processing, web
development, and data analysis tools. The following technologies were used across different
stages of the project:
1. Programming Language: Python 3.x
Python is the backbone of this project. It's known for its:
Ease of use and readability
Rich ecosystem of libraries for machine learning and data science
Support for rapid development
Python enabled efficient handling of text data, model training, preprocessing, and web
integration with minimal code complexity.
2. Machine Learning Libraries
Scikit-learn
A powerful machine learning library used for:
Model training (Naive Bayes, Logistic Regression, Random Forest)
Model evaluation (accuracy, classification report)
Feature extraction (TF-IDF vectorization)
Scikit-learn simplifies building, training, and evaluating ML pipelines with just a few lines of
code.
Pandas
Used for:
Reading CSV files (Fake.csv and True.csv)
Merging datasets and labeling them
Organizing and transforming data into training format
Pandas is essential for managing and analyzing large volumes of structured data in tabular form.
NumPy
Supports efficient numerical and array operations, especially in:
TF-IDF feature matrices
Conversions during preprocessing and model input/output
Though used indirectly, it's foundational for many ML operations under the hood.
3. Natural Language Processing (NLP) with NLTK
NLTK (Natural Language Toolkit) is a specialized Python library for processing human
language. It's vital in preparing raw text data for machine learning.
Main tasks it performs:
Tokenization: Splits sentences into words
Stopword removal: Removes common words like “is”, “the”, “and” that carry little
meaning
Lemmatization: Converts words to their base form (e.g., "running" → "run")
Without NLP preprocessing, the machine learning model would treat each word form as
unrelated, reducing prediction accuracy.
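A small illustration of these three steps (assuming the NLTK corpora are already downloaded):

from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

text = "The reporters were running several stories about the election"
tokens = word_tokenize(text.lower())                                  # tokenization
tokens = [t for t in tokens if t not in stopwords.words("english")]   # stopword removal
lemmatizer = WordNetLemmatizer()
print([lemmatizer.lemmatize(t, pos="v") for t in tokens])             # "running" becomes "run"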
4. Feature Extraction with TF-IDF (via Scikit-learn)
TF-IDF (Term Frequency-Inverse Document Frequency) is used to convert text into numerical
format, which is required for ML algorithms.
Benefits:
Highlights important words (frequent in a document but rare across others)
Reduces the influence of words that are common across all documents
Supports n-grams (unigram, bigram, trigram) to capture context
The result is a sparse matrix where each row is a news article and each column is a weighted
word feature.
5. Model Persistence with Joblib
After training, models are saved as .joblib files. This ensures:
Models can be reused instantly for predictions
No need to retrain models every time
Faster performance in deployment
Joblib also handles large NumPy arrays efficiently.
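A short sketch of persistence with Joblib, continuing from a trained model and fitted vectorizer (file names are assumptions):

import joblib

# After training
joblib.dump(model, "models/random_forest.joblib")
joblib.dump(vectorizer, "models/tfidf_vectorizer.joblib")

# At application start-up
model = joblib.load("models/random_forest.joblib")
vectorizer = joblib.load("models/tfidf_vectorizer.joblib")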
6. Web Development with Flask
Flask is a micro web framework in Python used to:
Create an API endpoint (/predict) to receive user input
Process the input and return predictions
Serve the HTML frontend using static file routing
Benefits of Flask:
Lightweight and fast
Easy to integrate with ML models
Requires minimal setup
7. Frontend Interface using HTML/CSS
A simple and user-friendly HTML form allows users to:
Input news content
Click a “Check News” button
View whether the news is Fake or Real
The interface communicates with Flask using HTTP POST requests.
Although basic, this frontend makes the project interactive and accessible to non-technical
users.
8. Visualization (Optional): Matplotlib and Seaborn
For evaluation and documentation, optional libraries such as Matplotlib (for plotting graphs) and
Seaborn (for statistical plots) can be used to:
Display accuracy and performance metrics
Plot a confusion matrix
Visualize the word frequencies or distribution
These are useful during model evaluation and report preparation.
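For example, a confusion matrix for the held-out test set could be plotted as follows (a sketch that reuses the trained model and test split from the training module):

import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, model.predict(X_test))
sns.heatmap(cm, annot=True, fmt="d",
            xticklabels=["Fake", "Real"], yticklabels=["Fake", "Real"])
plt.xlabel("Predicted label")
plt.ylabel("True label")
plt.title("Confusion matrix")
plt.show()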
9. Development Environment
VS Code
Used for coding Python scripts, Flask backend, and HTML frontend.
Offers extensions like Python formatter, debugger, and Git integration.
Jupyter Notebook
Useful during exploratory data analysis and model experimentation.
Allows for visualization, code, and notes in a single document.
IV
SYSTEM DESIGN
The system design defines the architecture, components, and flow of data within the Fake News
Detection system. It focuses on how the system works internally and how different modules
interact to achieve the goal of predicting whether a news article is fake or real.
System Design provides the necessary understanding and detailed procedures required to
implement the system recommended during the system study phase. The main focus is on
translating the user and performance requirements into precise design specifications that can be
followed by the development team. This phase serves as a bridge, converting user-oriented
documentation, such as the system proposal, into technical documents tailored for programmers,
database administrators, and other technical staff. The design process is divided into two key
phases: Logical Design and Physical Design. Logical design involves mapping out the system’s
workflows, inputs, outputs, and data flows, often using tools like Data Flow Diagrams (DFDs) to
visually represent how information moves through the system. Once the logical framework is
complete, the physical design phase takes over to define the actual software, hardware, database
schemas, and user interfaces needed to build the system. This phase ensures that the system’s
structure and functionality are clearly specified and ready for development and implementation.
Data Flow Diagram (DFD)
The DFD shows how data moves between processes, users, and the database.
For this project, the flow is:
Input from user → Flask backend → Preprocessing → Prediction → Output to user
Fig 4.1: Data Flow Diagram
Flowchart
The flowchart provides a logical sequence of the steps followed in the system.
It helps visualize the decision-making process in prediction.
Fig 4.2: Flowchart
System Architecture
The architecture of the Fake News Detection system is based on a modular and layered design
that includes data processing, machine learning, and user interaction components. This
architecture enables real-time prediction of whether a given news article is fake or real using
trained machine learning models.
Fig 4.3: System Architecture
1. Presentation Layer (Frontend)
Technology: HTML/CSS
Purpose: To provide an interface for users to enter news content.
Functionality:
o Accepts user input through a text box.
o Displays prediction results (Fake or Real).
o Sends user input to the backend using an HTTP POST request.
2. Application Layer (Flask Backend)
Technology: Flask (Python Web Framework)
Purpose: Acts as the middle layer that connects the frontend with the machine learning
model.
Functionality:
o Accepts incoming news text from the user.
o Passes the input to the preprocessing and ML pipeline.
o Sends back the prediction result and confidence score to the frontend.
3. Processing Layer (Preprocessing & Prediction Logic)
Technology: Python, NLTK, Scikit-learn
Purpose: Cleans, vectorizes, and processes the input text.
Functionality:
o Text Cleaning: Removes punctuation, stopwords, and irrelevant content.
o Tokenization & Lemmatization: Prepares the text using NLP techniques.
o TF-IDF Vectorization: Converts text to numerical features.
o Model Prediction: Applies the trained ML model (Random Forest) to make a
prediction.
4. Data Layer (Trained Models & Dataset)
Components:
o Trained models: RandomForest, NaiveBayes, LogisticRegression
o TF-IDF Vectorizer
o Dataset (CSV files for Fake and Real news)
Storage:
o Stored in a directory called models/
o Loaded into memory at runtime using joblib
End-to-End Flow
1. User enters news text in the web interface.
2. Flask server receives the request via the /predict API.
3. Text is preprocessed using NLP tools (NLTK).
4. TF-IDF vectorizer converts the cleaned text into numerical format.
5. Trained ML model classifies the input as fake or real.
6. Prediction result and confidence scores are sent back to the user.
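This flow can be exercised against the running Flask server; the request and response field names below follow the /predict sketch given earlier and are assumptions rather than values fixed by the project:

import requests

response = requests.post(
    "http://localhost:5000/predict",
    json={"text": "Breaking: scientists confirm the moon is made of cheese."},
)
print(response.json())  # e.g. {"prediction": "Fake", "confidence": 87.5}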
Advantages of This Architecture
Modularity: Each component (preprocessing, model, frontend) is independent.
Reusability: The same trained model can be reused across different interfaces (web app,
mobile app).
Scalability: Can be expanded by integrating APIs, deeper models, or multilingual
support.
Simplicity: Easy to debug, maintain, and deploy.
V
SYSTEM TESTING
TESTING
Once source code has been generated, software must be tested to uncover (and correct) as many errors
as possible before delivery to the customer. The goal is to design a series of test cases that have a
high likelihood of finding errors, and software testing techniques provide systematic guidance for
designing tests that
(1) exercise the internal logic of software components, and
(2) exercise the input and output domains of the program to uncover errors in program function,
behavior, and performance.
During the early stages of testing, a software engineer performs all tests. However, as the testing
process progresses, testing specialists may become involved.
Testing is important because reviews and other SQA activities can and do uncover errors, but they
are not sufficient. Every time the program is executed, the customer tests it. Therefore, the program
must be executed before it reaches the customer with the specific intent of finding and removing as
many errors as possible. To find the highest possible number of errors, tests must be conducted
systematically and test cases must be designed using disciplined techniques.
Testing Principles
Before applying methods to design effective test cases, a software engineer must understand the
basic principles that guide software testing. Davis suggests a set of testing principles:
All tests should be traceable to customer requirements. The objective of software
testing is to uncover errors, and it follows that the most severe defects (from the customer's
point of view) are those that cause the program to fail to meet its requirements.
Tests should be planned long before testing begins. Test planning can begin as soon as
the design model has been solidified. Therefore, all tests can be planned and designed before
any code has been generated.
The Pareto principle applies to software testing. Stated simply, the Pareto principle
implies that 80% of all errors uncovered during testing will likely be traceable to 20% of
all program components. The problem, of course, is to isolate these suspect components
and to thoroughly test them.
Testing should begin “in the small” and progress toward testing “in the large”. The first
tests planned and executed generally focus on individual components. As testing
progresses, focus shifts in an attempt to find errors in integrated clusters of components
and ultimately in the entire system.
Exhaustive testing is not possible. The number of path permutations for even a
moderately sized program is exceptionally large. For this reason, it is impossible to
execute every combination of paths during testing. It is possible, however, to adequately
cover program logic and to ensure that all conditions in the component level design have
been exercised.
To be most effective, testing should be conducted by an independent third party. By most
effective, we mean testing that has the highest probability of finding errors (the primary
objective of testing); the software engineer who created the system is not the best person
to conduct all tests for the software.
TESTING STEPS
Software is tested from two different perspectives:
1. Internal program logic is exercised using “white box” test case design techniques.
2. Software requirements are exercised using “black box” test case design techniques.
In both cases, the intent is to find the maximum number of errors with the minimum
amount of effort and time.
Black Box Testing
Black box testing is the technique of testing without any knowledge of the interior workings
of the application. The tester is oblivious to the system architecture and does not have
access to the source code. Typically, when performing a black box test, the tester interacts
with the system's user interface by providing inputs and examining outputs, without knowing
how and where the inputs are processed.
White Box Testing
White box testing is the detailed investigation of the internal logic and structure of the code.
It is also called glass box testing or open box testing. In order to perform white box testing
on an application, the tester needs to possess knowledge of the internal workings of the code,
looking inside the source code to find out which unit or chunk of code is behaving
inappropriately.
STAGES IN THE TESTING PROCESS
UNIT TESTING
Unit testing focuses verification effort on the smallest unit of software design: the
software component or module. Using the component-level design description as a guide,
important control paths are tested to uncover errors within the boundary of the module.
The relative complexity of tests and of uncovered errors is limited by the constrained scope
established for unit testing. The unit test is white-box oriented, and the step can be
conducted in parallel for multiple components.
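In this project, unit testing could target the text-cleaning function, for example with pytest (a sketch that assumes clean_text lives in a preprocessing module):

from preprocessing import clean_text  # hypothetical module under test

def test_clean_text_removes_urls_and_stopwords():
    cleaned = clean_text("Visit http://example.com for the latest NEWS!!!")
    assert "http" not in cleaned          # URLs removed
    assert "the" not in cleaned.split()   # stopwords removed
    assert cleaned == cleaned.lower()     # text lower-cased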
Limitations of Unit Testing
Testing cannot catch each and every bug in an application. It is impossible to evaluate
every execution path in every software application, and the same is the case with unit testing.
There is a limit to the number of scenarios and test data the developer can use to verify
the source code. After the developer has exhausted these options, there is no choice but to
stop unit testing and merge the code segment with other units.
MODULE TESTING
A module is a collection of dependent components such as an object class, an abstract data
type, or some looser collection of procedures and functions. A module encapsulates
related components, so it can be tested without the other system modules.
INTEGRATION TESTING
Integration testing is a systematic technique for constructing the program structure while
at the same time conducting tests to uncover errors associated with interfacing. The objective
is to take unit-tested components and build a program structure that has been dictated by
design.
VALIDATION TESTING
Software validation is achieved through a series of black-box tests that demonstrate
conformity with requirements. A test plan outlines the classes of tests to be conducted
and a test procedure defines specific test cases that will be used to demonstrate
conformity with requirements. Both the plan and procedure are designed to ensure that all
functional requirements are satisfied, all behavioral characteristics are achieved, all
performance requirements are attained, documentation is correct, and usability and other
requirements are met.
SYSTEM TESTING
System testing is actually a series of different tests whose primary purpose is to fully
exercise the computer-based system. Although each test has a different purpose, all work
to verify that system elements have been properly integrated and perform allocated
functions.
VI
DESIGN SNAPSHOTS

SOURCE CODE
[Snapshot: source code listing]

UI DESIGN
[Snapshot: user interface]

UI DESIGN
[Snapshot: user interface]
VII
CONCLUSION
The project titled “Fake News Detection Using Machine Learning” has successfully
demonstrated the use of artificial intelligence and natural language processing to solve one of the
most pressing challenges of the digital era—the spread of misinformation. With the explosion
of content on social media and online platforms, fake news can influence public opinion, disrupt
social harmony, and mislead users. This system aims to tackle that problem by offering an
automated, reliable, and scalable solution.
By leveraging Natural Language Processing (NLP) techniques such as tokenization, stopword
removal, and lemmatization, the system preprocesses raw text data to prepare it for model
training. Using TF-IDF vectorization, the textual information is converted into numerical
features that machine learning models can interpret. Several models were trained and evaluated,
and the most accurate one—Random Forest—was selected for deployment. The model was
then integrated into a web-based interface using Flask and HTML, enabling real-time
interaction and prediction.
The system's modular design allows for flexibility and future enhancements. Its simplicity and
speed make it accessible to both technical and non-technical users. Moreover, it provides a
strong foundation for further research and development in fake news detection, such as
incorporating deep learning techniques (e.g., LSTM, BERT), supporting multilingual inputs,
or deploying the system as a browser extension or mobile application.
In conclusion, this project not only enhances the understanding of how machine learning can be
applied to text classification problems but also offers a meaningful, practical tool to counter
digital misinformation. It contributes to a safer and more informed digital environment and opens
up new avenues for innovation in AI-powered content validation.
VIII
BIBLIOGRAPHY
Scikit-learn Documentation – https://scikit-learn.org/
NLTK (Natural Language Toolkit) – https://www.nltk.org/
Pandas Library Documentation – https://pandas.pydata.org/
Flask Framework Documentation – https://flask.palletsprojects.com/
Joblib for model serialization – https://joblib.readthedocs.io/
"Fake and real news dataset" – Kaggle (https://www.kaggle.com/clmentbisaillon/fake-and-real-news-dataset)
Research paper: "Fake News Detection on Social Media: A Data Mining Perspective",
ACM SIGKDD Explorations.
Python Official Documentation – https://docs.python.org/
YouTube Tutorials on Fake News Detection and Flask Deployment
Stack Overflow – For code debugging and implementation guidance