Prediction of Cyber Attacks Using Data Science Technique
TABLE OF CONTENTS
01
02 EXISTING SYSTEM 12
2.1 DRAWBACKS
03 INTRODUCTION 13
3.1 DATA SCIENCE
3.2 ARTIFICIAL INTELLIGENCE
04 MACHINE LEARNING 19
05 PREPARING DATASET 21
06 PROPOSED SYSTEM 21
6.1 ADVANTAGES
07 LITERATURE SURVEY 22
08 SYSTEM STUDY 30
8.1 OBJECTIVES
8.2 PROJECT GOAL
8.3 SCOPE OF THE PROJECT
09 FEASIBILITY STUDY 37
10 LIST OF MODULES 39
11 PROJECT REQUIREMENTS 39
11.1 FUNCTIONAL REQUIREMENTS
11.2 NON-FUNCTIONAL REQUIREMENTS
12 ENVIRONMENT REQUIREMENT 40
13 SOFTWARE DESCRIPTION 41
13.1 ANACONDA NAVIGATOR
13.2 JUPYTER NOTEBOOK
14 PYTHON 51
15 SYSTEM ARCHITECTURE 63
16 WORKFLOW DIAGRAM 64
17 USE CASE DIAGRAM 65
18 CLASS DIAGRAM 66
19 ACTIVITY DIAGRAM 67
20 SEQUENCE DIAGRAM 68
21 ER – DIAGRAM 69
22 MODULE DESCRIPTION 70
22.1 MODULE DIAGRAM
22.2 MODULE GIVEN INPUT EXPECTED OUTPUT
23 DEPLOYMENT (GUI) 94
24 CODING 95
25 CONCLUSION 141
LIST OF FIGURES
LIST OF SYMBOLS
1. Class: Represents a collection of similar entities grouped together. The symbol is labeled with the class name, its attributes, and its operations, each prefixed with a visibility marker: + public, - private, # protected.
2. Association: Associations represent static relationships between classes. Roles represent the way the two classes see each other. The symbol is labeled Class A, NAME, Class B.
3. Actor: It aggregates several classes into a single class.
5. Relation (uses): Used for additional process communication.
6. Relation (extends): The extends relationship is used when one use case is similar to another use case but does a bit more.
7. Communication: Communication between various use cases.
9. Initial State: Initial state of the object.
13. Use case: Interaction between the system and the external environment.
14. Component: Represents physical modules, each of which is a collection of components.
15. Node: Represents physical modules, each of which is a collection of components.
16. Data Process/State: A circle in a DFD represents a state or process that has been triggered by some event or action.
17. External entity: Represents external entities such as keyboards, sensors, etc.
... that occurs between processes.
19. Object Lifeline: Represents the vertical dimension along which the object communicates.
CHAPTER 1
1.1 INTRODUCTION
The advantage of being able to detect anomalies is the ability to recognize
new (or unexpected) attacks, which brings many benefits. Techniques of this
kind build on technology pipelines used in various industries. We give a
general overview of the analysis of traffic data and of information that can be
used for targeted attacks. A comparative study of machine learning algorithms
was carried out to determine which algorithm is the most accurate in
predicting the type of cyber attack. We classify four types of attack: denial
of service (DoS), remote to local (R2L), user to root (U2R), and probe. The
results show that the effectiveness of the proposed machine learning technique
can be compared on the basis of accuracy, entropy, precision, recall, F1 score,
sensitivity, and specificity.

... for the detection of road accidents, using three kinds of road-related
sentence: positive, negative, and neutral. The output of this classification is
the polarity (positive, negative, or neutral) of each road-related sentence,
depending on whether or not it concerns traffic. The bag-of-words (BoW)
representation is then used to convert each sentence into a one-hot encoding
that is fed to bi-directional LSTM networks (Bi-LSTM). After training, a
multi-layer neural network uses softmax to classify sentences according to
location, traffic event, and polarity type. The proposed method compares
various machine learning methods and advanced training techniques in terms of
accuracy, F-score, and other criteria.
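The two building blocks named above, a bag-of-words encoding and a softmax output, can be illustrated with a minimal sketch. It is not the implementation of the system described: the Bi-LSTM itself is left out, the encoding uses word counts rather than a strict one-hot code, and the sentences, function names, and class scores are hypothetical values chosen only for illustration.

```python
import math
from collections import Counter


def build_vocabulary(sentences):
    """Map each distinct lowercase token to a column index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(sentence, vocab):
    """Return a count vector over the vocabulary; unknown tokens are ignored."""
    counts = Counter(sentence.lower().split())
    # Dicts preserve insertion order, so columns follow the vocabulary indices.
    return [counts.get(token, 0) for token in vocab]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sentences, not taken from any dataset.
sentences = ["accident on main road", "road is clear"]
vocab = build_vocabulary(sentences)
vector = bag_of_words("accident on road road", vocab)

# Hypothetical raw scores for three polarity classes.
probs = softmax([2.0, 1.0, 0.1])
```

In a full pipeline the encoded sentences would feed the recurrent layers, and the softmax would sit on the final layer to produce one probability per class.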
Disadvantages:
1. The performance is poor, and the approach becomes complicated when applied to other networks.
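The evaluation measures named in the introduction (accuracy, precision, recall, F1 score, sensitivity, specificity) can be computed per attack class from true and predicted labels. The sketch below is a minimal illustration in plain Python, assuming a one-vs-rest treatment of each class. The function names and the sample labels are hypothetical and are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels):
    """Compute one-vs-rest precision, recall, specificity, and F1 per label."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        tn = len(pairs) - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # also called sensitivity
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        }
    return metrics


# Hypothetical labels for six connections, for illustration only.
labels = ["DoS", "R2L", "U2R", "Probe"]
y_true = ["DoS", "DoS", "R2L", "U2R", "Probe", "Probe"]
y_pred = ["DoS", "R2L", "R2L", "U2R", "Probe", "DoS"]

acc = accuracy(y_true, y_pred)
results = per_class_metrics(y_true, y_pred, labels)
```

Reporting the measures per class matters for attack data, because rare classes such as U2R can be misclassified often while overall accuracy stays high.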