Data Poison Detection in DML Systems
On
Submitted to
Submitted by
[Link] 20UC1A0507
[Link] 20UC1A0512
[Link] Pranith 20UC1A0516
[Link] 20UC1A0526
BONAFIDE CERTIFICATE
This is to certify that this Project Report titled “DATA POISON DETECTION SCHEMES FOR
DISTRIBUTED MACHINE LEARNING” is a bonafide work carried out by [Link]
(20UC1A0507), [Link] (20UC1A0512), [Link] (20UC1A0516) and [Link]
(20UC1A0526) under our supervision in partial fulfilment of the Award of Bachelor of
Technology in Computer Science and Engineering from Talla Padmavathi College of
Engineering, Warangal, affiliated to Jawaharlal Nehru Technological University, Hyderabad,
Telangana.
This project work does not constitute, in part or in full, any other work that has been earlier
submitted to this university or any other institution for the award of any degree/diploma.
DECLARATION
We, [Link] (20UC1A0507), [Link] (20UC1A0512), [Link] (20UC1A0516) and
[Link] (20UC1A0526), final year students of Talla Padmavathi College of Engineering,
Warangal, affiliated to Jawaharlal Nehru Technological University, Hyderabad, Telangana,
solemnly declare that this Project titled “DATA POISON DETECTION SCHEMES FOR
DISTRIBUTED MACHINE LEARNING” is a bonafide work carried out by us under the
supervision of Mrs. J. Shilpa, Assistant Professor, HOD of the CSE Department, Department of
Computer Science and Engineering, for the Award of Bachelor of Technology in Computer
Science and Engineering.
We also declare that this project work does not constitute, in part or in full, any other work that
has been earlier submitted to this university or any other institution for the award of any
degree/diploma.
Signature name Roll No
1 [Link] 20UC1A0507
2 [Link] 20UC1A0512
4 [Link] 20UC1A0526
ACKNOWLEDGEMENTS
We are grateful to our Chairman, Mr. Talla Mallesham, for providing us an ambient learning
experience at our institution.
We are greatly thankful to our Director, Dr. Talla Vamshi, and Directrix, Mrs. Chaitanya Talla
Vamshi, for their encouragement and valuable academic support in all aspects.
We are thankful to our Principal, Dr. R. Velu, for his patronage towards our project and standing
as a support in the need of the hour.
We would like to acknowledge and express our sincere thanks to our guide [Link],
Assistant Professor, HOD of the CSE Department, Department of Computer Science and
Engineering, for introducing the present topic and for the inspiring guidance, constructive
criticism and valuable suggestions throughout our project work, which have helped us in
bringing out this proficient project.
We also thank all the faculty members of our institution for their kind and sustained support
throughout our programme of study.
We thank our parents for the confidence they have in us to become capable and useful
technology graduates who serve society at large.
[Link] 20UC1A0507
[Link] 20UC1A0512
[Link] 20UC1A0526
[Link] Pranith 20UC1A0516
ABSTRACT
Distributed Machine Learning (DML) can realize training on massive datasets when no single
node can work out accurate results within an acceptable time. However, this inevitably exposes
more potential targets to attackers compared with a non-distributed environment. In this
project, we classify DML into basic-DML and semi-DML. In basic-DML, the center server
dispatches learning tasks to distributed machines and aggregates their learning results, while in
semi-DML, the center server further devotes resources to dataset learning in addition to its
duties in basic-DML. We first put forward a novel data poison detection scheme for basic-DML,
which utilizes a cross-learning mechanism to find the poisoned data. We prove that the
proposed cross-learning mechanism generates training loops, based on which a
mathematical model is established to find the optimal number of training loops. Then, for semi-
DML, we present an improved data poison detection scheme that provides better learning protection
with the aid of the central resource. To efficiently utilize the system resources, an optimal
resource allocation approach is developed. Simulation results show that the proposed scheme can
significantly improve the accuracy of the final model, by up to 20% for support vector machine
and 60% for logistic regression, in the basic-DML scenario. Moreover, in the semi-DML scenario,
the improved data poison detection scheme with optimal resource allocation can decrease the
wasted resources by 20-100%.
INDEX
CONTENTS
1. Chapter-1
1.1 Introduction
2. Chapter-2
Literature Survey
3. Chapter-3
3.1 System Analysis
3.1.1 Existing System
3.1.2 Proposed System
3.1.3 Algorithm
3.2 System Requirements
3.3 System Feasibility
4. Chapter-4
4.1 System Design
4.1.1 System Architecture
4.1.2 UML Diagrams
4.1.3 Data Flow Diagram
5. Chapter-5
5.1 Software Environment
5.1.1 Python Technology
5.1.2 Python Library
5.1.3 Machine Learning Overview
6. Chapter-6
Implementation and Analysis
6.1 System Implementation
6.2 Modules
6.3 Sample Code
6.4 Output Screenshots
7. Chapter-7
7.1 Testing
7.2 Types of Testing
8. CONCLUSION
9. FUTURE SCOPE
BIBLIOGRAPHY
CHAPTER 1
1.1 INTRODUCTION
Distributed Machine Learning (DML) has been widely used in distributed systems, where no
single node can derive an intelligent decision from a massive dataset within an acceptable time. In a
typical DML system, a central server has a tremendous amount of data at its disposal. It divides
the dataset into different parts and disseminates them to distributed workers, who perform the
training tasks and return their results to the center. Finally, the center integrates these results and
outputs the eventual model. Unfortunately, as the number of distributed workers increases, it
is hard to guarantee the security of each worker. This lack of security increases the danger
that attackers poison the dataset and manipulate the training result. A poisoning attack is a typical
way to tamper with the training data in machine learning. Especially in scenarios where newly
generated datasets must be periodically sent to the distributed workers to update the decision
model, the attacker has more chances to poison the datasets, leading to a more severe threat in
DML. Such vulnerability in machine learning has attracted much attention from researchers.
Dalvi et al. initially demonstrated that attackers could manipulate the data to defeat the data miner
if they have complete information. Then Lowd claimed that the perfect-information assumption
is unrealistic, and proved that attackers can construct attacks with only part of the information.
Afterwards, a series of works were conducted focusing on the non-distributed machine learning
context. Recently, there have been a couple of efforts devoted to preventing data from being
manipulated in DML. For example, Zhang and Esposito et al. used game theory to design
secure algorithms for the distributed support vector machine (DSVM) and collaborative deep
learning, respectively. However, these schemes are designed for specific DML algorithms and
cannot be used in general DML situations. Since adversarial attacks can mislead various
machine learning algorithms, a widely applicable DML protection mechanism urgently needs to be
studied. In this project, we classify DML into basic distributed machine learning (basic-DML)
and semi distributed machine learning (semi-DML), depending on whether the center shares
resources in the dataset training tasks. Then, we present data poison detection schemes for basic-
DML and semi-DML respectively. The experimental results validate the effectiveness of our proposed
schemes. We summarize the main contributions of this project as follows:
• We put forward a data poison detection scheme for basic-DML, based on a so-called cross-
learning data assignment mechanism. We prove that the cross-learning mechanism
consequently generates training loops, and provide a mathematical model to find the optimal
number of training loops that yields the highest security.
• We present a practical method to identify abnormal training results, which can be used to find
the poisoned datasets at a reasonable cost.
• For semi-DML, we propose an improved data poison detection scheme, which can provide
better learning protection. To efficiently utilize the system resources, an optimal resource
allocation scheme is developed.
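The contributions above can be illustrated with a toy sketch. The following is a minimal, hypothetical illustration (not the report's actual implementation): each sub-dataset is trained by two different workers under a cross-learning assignment, and a sub-dataset is flagged as potentially poisoned when its reported accuracies fall far below the median of all results. The `simulated_accuracy` helper and the 0.15 threshold are assumptions made for the example.

```python
# Hypothetical sketch of cross-learning poison detection (illustration only).
# Each sub-dataset is trained independently by two workers; a sub-dataset
# whose reported accuracies fall far below the group median is flagged.

def simulated_accuracy(subset_id, poisoned_ids):
    # Stand-in for a real training run: poisoned data hurts accuracy.
    return 0.55 if subset_id in poisoned_ids else 0.90

def detect_poisoned(num_subsets, poisoned_ids, threshold=0.15):
    # Cross-learning assignment: subset i goes to workers i and (i + 1) % n,
    # which chains the workers together into a training loop.
    reports = {}
    for i in range(num_subsets):
        workers = (i, (i + 1) % num_subsets)
        reports[i] = [simulated_accuracy(i, poisoned_ids) for _ in workers]

    # Compare each subset's mean reported accuracy against the median.
    means = {i: sum(v) / len(v) for i, v in reports.items()}
    ordered = sorted(means.values())
    median = ordered[len(ordered) // 2]
    return {i for i, m in means.items() if median - m > threshold}

flagged = detect_poisoned(num_subsets=8, poisoned_ids={3})
print(flagged)  # the poisoned sub-dataset stands out from the rest
```

Because every sub-dataset is learned more than once, a single compromised worker cannot silently corrupt the final model: its abnormal result disagrees with the redundant copies.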
CHAPTER 2
LITERATURE SURVEY
1. “Collaborative task offloading in vehicular edge multi-access networks”
Mobile Edge Computing (MEC) has emerged as a promising paradigm to realize user
requirements with low-latency applications. The deep integration of multi-access technologies
and MEC can significantly enhance the access capacity between heterogeneous devices and
MEC platforms. However, the traditional MEC network architecture cannot be directly applied
to the Internet of Vehicles (IoV) due to high-speed mobility and inherent characteristics.
Furthermore, given a large number of resource-rich vehicles on the road, it is a new opportunity
to execute task offloading and data processing onto smart vehicles. To facilitate good merging of
the MEC technology in IoV, this article first introduces a vehicular edge multi-access network
that treats vehicles as edge computation resources to construct the cooperative and distributed
computing architecture. For immersive applications, co-located vehicles have the inherent
properties of collecting considerable identical and similar computation tasks. We propose a
collaborative task offloading and output transmission mechanism to guarantee low latency as
well as application-level performance. Finally, we take 3D reconstruction as an exemplary
scenario to provide insights on the design of the network framework. Numerical results
demonstrate that the proposed scheme is able to reduce the perception reaction time while
ensuring the application-level driving experiences.
The Internet of Things (IoT) platform has played a significant role in improving road transport
safety and efficiency by ubiquitously connecting intelligent vehicles through wireless
communications. Such an IoT paradigm, however, brings considerable strain on limited
spectrum resources due to the need for continuous communication and monitoring. Cognitive
radio (CR) is a potential approach to alleviate the spectrum scarcity problem through
opportunistic exploitation of the underutilized spectrum. However, the highly dynamic topology and
time-varying spectrum states in CR-based vehicular networks introduce quite a few challenges
to be addressed. Moreover, a variety of vehicular communication modes, such as vehicle-to-
infrastructure and vehicle-to-vehicle, as well as data QoS requirements, pose critical issues for
efficient transmission scheduling. Based on this motivation, in this paper, we adopt a deep Q-
learning approach for designing an optimal data transmission scheduling scheme in cognitive
vehicular networks to minimize transmission costs while also fully utilizing various
communication modes and resources. Furthermore, we investigate the characteristic modes and
spectrum resources chosen by vehicles in different network states, and propose an efficient
learning algorithm for obtaining the optimal scheduling strategies. Numerical results are
presented to illustrate the performance of the proposed scheduling schemes.
3. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems
MXNet is a multi-language machine learning (ML) library designed to ease the development of ML
algorithms, especially for deep neural networks. Embedded in the host language, it blends
declarative symbolic expression with imperative tensor computation, and offers auto-differentiation
to derive gradients. MXNet is computation- and memory-efficient and runs on various
heterogeneous systems, ranging from mobile devices to distributed GPU clusters. This paper
describes both the API design and the system implementation of MXNet, and explains how
embedding of both symbolic expression and tensor operation is handled in a unified fashion. Our
preliminary experiments reveal promising results on large-scale deep neural network applications
using multiple GPU machines.
4. Machine learning on big data: Opportunities and challenges
Machine learning (ML) is continuously unleashing its power in a wide range of applications. It
has been pushed to the forefront in recent years partly owing to the advent of big data. ML
algorithms have never been better promised while challenged by big data. Big data enables ML
algorithms to uncover more fine- grained patterns and make more timely and accurate
predictions than ever before; on the other hand, it presents major challenges to ML such as model
scalability and distributed computing. In this paper, we introduce a framework of ML on big data
(MLBID) to guide the discussion of its opportunities and challenges. The framework is centered
on ML which follows the phases of preprocessing, learning, and evaluation. In addition, the
framework is also comprised of four other components, namely big data, user, domain, and
system. The phase of ML and the components of MLBID provide directions for the identification
of associated opportunities and challenges, and open up future work in many unexplored or
underexplored research areas.
5. J. Chen et al. described DeepPoison, an innovative adversarial network with one generator and
two discriminators to solve this problem. In particular, the generator automatically extracts hidden
features of the target class and embeds them into benign training samples. One discriminator
controls the poisoning perturbation rate, while the other discriminator acts as the target model to
demonstrate the poisoning effects. The novelty of DeepPoison is that the generated poisoned
training samples cannot be distinguished from benign ones by defensive methods or human visual
inspection, and even benign test samples can achieve the attack.
C. Li et al. described that Machine Learning (ML) is widely used to detect malware on various
platforms, including Android. Detection models must be retrained on newly collected data (e.g.,
monthly) to keep up with the evolution of malware. However, this can also open the door to
poisoning attacks, especially backdoor attacks, which disrupt the learning process and create
evasion tunnels for manipulated malware samples. No previous research has examined this
critical issue for Android malware detectors.
J. Chen et al. described that advanced attackers may launch data poisoning attacks and
interfere with the learning process by inserting malicious samples into the training
database. Existing defences against poisoning attacks are primarily target-specific, designed
for one particular type of attack; due to their explicit assumptions, they do not work for other
types. However, some general defence strategies have been developed.
CHAPTER 3
3.1 SYSTEM ANALYSIS
3.1.3 ALGORITHMS
Random Forest
Random Forest is a popular ensemble learning algorithm used for both classification and
regression tasks. It operates by constructing a multitude of decision trees during training and
outputs the class that is the mode of the classes (classification) or mean prediction (regression) of
the individual trees. Here's a step-by-step overview of how the Random Forest algorithm works:
1. Randomly select a subset of the training data with replacement (bootstrap sampling). This
means that some samples may be repeated in the subset, while others may be left out.
2. For each subset of data, build a decision tree. However, when constructing each tree, at each
split only a random subset of features is considered.
3. Repeat the above to grow many trees. For prediction, aggregate the outputs of all trees: the
majority vote for classification, or the mean for regression.
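The steps above map directly onto scikit-learn's RandomForestClassifier, where bootstrap sampling and per-split feature subsampling are controlled by the `bootstrap` and `max_features` parameters. A minimal sketch, assuming scikit-learn is available and using a synthetic dataset in place of this project's real data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic classification data standing in for a real training set.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# 100 trees, each grown on a bootstrap sample, considering sqrt(n_features)
# candidate features at every split; prediction is the majority vote.
clf = RandomForestClassifier(
    n_estimators=100, bootstrap=True, max_features="sqrt", random_state=0
)
clf.fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 2))  # held-out accuracy
```

The randomness in both the samples and the features decorrelates the trees, which is what makes the ensemble's vote more robust than any single decision tree.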
The Isolation Forest algorithm is an unsupervised machine learning algorithm used for anomaly detection.
It was introduced to efficiently detect outliers or anomalies in a dataset. The key idea behind
Isolation Forest is that anomalies are rare and lie far from the normal instances, so they are
easier to isolate. The algorithm isolates anomalies by constructing isolation trees. Here's a
high-level overview of how the Isolation Forest algorithm works:
Isolation Trees:
The algorithm builds isolation trees, which are binary trees, recursively partitioning the data into
subsets.
Path Length:
Anomalies are expected to be isolated in shorter paths in the tree. Therefore, the average path
length from the root to an anomaly in the tree is shorter than the average path length for a normal
instance.
Scoring:
Anomaly scores are assigned based on the average path length. Shorter path lengths indicate
higher anomaly scores.
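This behavior can be sketched with scikit-learn's IsolationForest (assuming scikit-learn is available); the data below is synthetic, with two obvious outliers injected. Points isolated in short paths are labeled -1 by `fit_predict`:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
# 200 normal points clustered near the origin, plus 2 obvious anomalies.
normal = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
anomalies = np.array([[6.0, 6.0], [-7.0, 5.0]])
X = np.vstack([normal, anomalies])

# Each isolation tree recursively partitions the data with random splits;
# anomalies end up isolated in shorter paths and are labeled -1.
iso = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
labels = iso.fit_predict(X)
print(labels[-2:])  # labels for the two injected anomalies
```

The `contamination` parameter sets the expected fraction of outliers, which determines the score threshold used to assign the -1 labels.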
3.2 SOFTWARE REQUIREMENTS SPECIFICATION
Hardware Requirements:
Software Requirements:
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a
very general plan for the project and some cost estimates. During system analysis, the feasibility
study of the proposed system is carried out, to ensure that the proposed system is not a burden
to the company. For feasibility analysis, some understanding of the major requirements for the
system is essential. Three key considerations involved in the feasibility analysis are:
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can pour into the research and development
of the system is limited. The expenditures must be justified. Thus, the developed system is well
within the budget, and this was achieved because most of the technologies used are freely
available. Only the customized products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of
the system. Any system developed must not place a high demand on the available technical
resources, as this would lead to high demands being placed on the client. The developed system
must have modest requirements, as only minimal or null changes are required for implementing
this system.
SOCIAL FEASIBILITY
This aspect of the study is to check the level of acceptance of the system by the user. This includes
the process of training the user to use the system efficiently. The user must not feel threatened by
the system, but must instead accept it as a necessity. The level of acceptance by the users solely
depends on the methods employed to educate the users about the system and to make them
familiar with it. Their level of confidence must be raised so that they are also able to offer
constructive criticism, which is welcomed, as they are the final users of the system.
CHAPTER 4
4.1 SYSTEM DESIGN
4.1.1 SYSTEM ARCHITECTURE
We classify DML into basic-DML and semi-DML, as shown in the figure above. Both
scenarios have a center, which contains a database, a computing server, and a parameter
server. However, the center provides different functions in the two scenarios. In the basic-DML
scenario, the center has no spare computing resources for sub-dataset training, and sends all
the sub-datasets to the distributed workers. Therefore, in basic-DML, the center only
integrates the training results from the distributed workers via the parameter server. In the semi-DML
scenario, the center has some spare resources in the computing server for sub-dataset learning.
Consequently, it keeps some sub-datasets and learns from them by itself. That is to say, in
semi-DML, the center learns from some sub-datasets as well as integrating the results from
both the center and the distributed workers.
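The cross-learning assignment underlying both scenarios can be sketched as follows. This is a hypothetical illustration (the report's actual assignment rule may differ): each worker's sub-dataset is re-trained by another worker, and following that "also trains" relation from worker to worker reveals the closed training loops whose number the mathematical model optimizes.

```python
# Hypothetical cross-learning assignment: worker i's sub-dataset is also
# trained by worker (i + stride) % n, so the assignment chains the workers
# into closed training loops.

def training_loops(num_workers, stride=1):
    succ = {i: (i + stride) % num_workers for i in range(num_workers)}
    loops, seen = [], set()
    for start in succ:
        if start in seen:
            continue
        loop, node = [], start
        while node not in seen:          # walk until the cycle closes
            seen.add(node)
            loop.append(node)
            node = succ[node]
        loops.append(loop)
    return loops

print(len(training_loops(6, stride=1)))  # stride 1: one loop of all workers
print(len(training_loops(6, stride=2)))  # stride 2: workers split into 2 loops
```

Changing the stride changes how many loops the same set of workers forms, which is exactly the quantity the scheme tunes to trade off detection capability against redundant training cost.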
4.1.2 UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling
language in the field of object-oriented software engineering. The standard is managed, and was
created by, the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form, UML comprises two major components: a meta-model
and a notation.
In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing,
constructing and documenting the artifacts of a software system, as well as for business modeling
and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful in the
modeling of large and complex systems. The UML is a very important part of developing
object-oriented software and the software development process. The UML uses mostly
graphical notations to express the design of software projects.
4.1.A USE CASE DIAGRAM
A Use Case Diagram in the Unified Modeling Language (UML) is a type of Behavioral diagram
defined by and created from a Use-case analysis. Its purpose is to present a graphical overview
of the functionality provided by a system in terms of actors, their goals (represented as use
cases), and any dependencies between those use cases. The main purpose of a use case diagram
is to show what system functions are performed for which actor. Roles of the actors in the system
can be depicted.
4.1.B CLASS DIAGRAM
The Class Diagram is used to refine the use case diagram and define a detailed design of the
system. The class diagram classifies the actors defined in the use case diagram into a set of
interrelated classes. The relationship or association between the classes can be either an "is-a" or
"has-a" relationship. Each class in the class diagram may be capable of providing certain
functionalities. These functionalities provided by the class are termed "methods" of the class.
Apart from this, each class may have certain "attributes" that uniquely identify the class.
4.1.C SEQUENCE DIAGRAM
A Sequence Diagram represents the interaction between different objects in the system. The
important aspect of a sequence diagram is that it is time-ordered. This means that the exact
sequence of the interactions between the objects is represented step by step. Different objects in
the sequence diagram interact with each other by passing "messages".
4.1.D ACTIVITY DIAGRAM
The process flows in the system are captured in the activity diagram. Similar to a state diagram,
an activity diagram also consists of activities, actions, transitions, initial and final states, and
guard conditions.
4.1.E STATECHART DIAGRAM
A state diagram, as the name suggests, represents the different states that objects in the system
undergo during their life cycle. Objects in the system change states in response to events. In
addition to this, a state diagram also captures the transition of the object’s state from an initial
state to a final state in response to events affecting the system.
4.1.3 DATA FLOW DIAGRAM:
1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent
a system in terms of the input data to the system, the various processing carried out on this data, and the
output data generated by the system.
2. The Data Flow Diagram (DFD) is one of the most important modeling tools. It is used to model the
system components: the system process, the data used by the process, the external entities that interact
with the system, and the information flows in the system.
3. The DFD shows how information moves through the system and how it is modified by a series of
transformations. It is a graphical technique that depicts information flow and the transformations
applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction, and may be partitioned into
levels that represent increasing information flow and functional detail.
Fig 4.1.3: FLOW DIAGRAM
CHAPTER 5
5.1 SOFTWARE ENVIRONMENT
5.1.1 PYTHON TECHNOLOGY
Python is currently the most widely used multi-purpose, high-level programming language.
Programmers have to type relatively less, and the indentation requirement of the language makes
code readable all the time.
The Python language is used by almost all tech-giant companies like Google, Amazon,
Facebook, Instagram, Dropbox, Uber, etc.
The biggest strength of Python is its huge collection of standard libraries, which can be used for
the following:
• Machine Learning
• Test frameworks
• Multimedia
5.1.2 PYTHON LIBRARY
1. Extensive Libraries
Python ships with an extensive standard library containing code for various purposes like regular
expressions, documentation generation, unit testing, web browsers, threading, databases, CGI,
email, image manipulation, and more. So, we don't have to write the complete code for those
tasks manually.
2. Extensible
As we have seen earlier, Python can be extended to other languages. You can write some of your
code in languages like C++ or C. This comes in handy, especially in projects.
3. Embeddable
Complementary to extensibility, Python is embeddable as well. You can put your Python code in
the source code of a different language, like C++. This lets us add scripting capabilities to our
code in the other language.
4. Improved Productivity
The language’s simplicity and extensive libraries render programmers more productive than
languages like Java and C++ do: you need to write less code to get more done.
5. IoT Opportunities
Since Python forms the basis of new platforms like the Raspberry Pi, its future looks bright for the
Internet of Things. This is a way to connect the language with the real world.
6. Simple and Easy
When working with Java, you may have to create a class to print ‘Hello World’. But in Python,
just a print statement will do. It is also quite easy to learn, understand, and code. This is why,
when people pick up Python, they have a hard time adjusting to other, more verbose languages
like Java.
7. Readable
Because it is not such a verbose language, reading Python is much like reading English. This is
the reason why it is so easy to learn, understand, and code. It also does not need curly braces to
define blocks, and indentation is mandatory. This further aids the readability of the code.
8. Object-Oriented
This language supports both the procedural and object-oriented programming paradigms. While
functions help us with code reusability, classes and objects let us model the real world. A class
allows the encapsulation of data and functions into one.
9. Free and Open-Source
Like we said earlier, Python is freely available. Not only can you download Python for free,
but you can also download its source code, make changes to it, and even distribute it. It
comes with an extensive collection of libraries to help you with your tasks.
10. Portable
When you code your project in a language like C++, you may need to make some changes to it if
you want to run it on another platform. But it isn’t the same with Python. Here, you need to code
only once, and you can run it anywhere. This is called Write Once Run Anywhere (WORA).
However, you need to be careful enough not to include any system-dependent features.
11. Interpreted
Lastly, we will say that it is an interpreted language. Since statements are executed one by one,
debugging is easier than in compiled languages.
Advantages of Python Over Other Languages
[Link] Coding
Almost all tasks done in Python require less coding than the same task done in other
languages. Python also has awesome standard library support, so you don’t have to search for
any third-party libraries to get your job done. This is the reason many people suggest
learning Python to beginners.
2. Affordable
Python is free, therefore individuals, small companies, or big organizations can leverage the freely
available resources to build applications. Python is popular and widely used, so it gives you better
community support. The 2019 GitHub annual survey showed that Python has overtaken Java
in the most popular programming language category.
Python code can run on any machine whether it is Linux, Mac or Windows. Programmers need
to learn different languages for different jobs but with Python, you can professionally build web
apps, perform data analysis and machine learning, automate things, do web scraping and also
build games and powerful visualizations. It is an all-rounder programming language.
Disadvantages of Python
So far, we’ve seen why Python is a great choice for your project. But if you choose it, you should
be aware of its consequences as well. Let’s now see the downsides of choosing Python over
another language.
[Link] Limitations
We have seen that Python code is executed line by line. But since Python is interpreted, this often
results in slow execution. This, however, isn’t a problem unless speed is a focal point for the
project. In other words, unless high speed is a requirement, the benefits offered by Python are
enough to distract us from its speed limitations.
2. Weak in Mobile Computing and Browsers
While it serves as an excellent server-side language, Python is much more rarely seen on the
client side. Besides that, it is rarely ever used to implement smartphone-based applications. One such
application is called Carbonnelle. The reason it is not so famous despite the existence of Brython
is that it isn’t that secure.
3. Design Restrictions
As you know, Python is dynamically-typed. This means that you don’t need to declare the type of
variable while writing the code. It uses duck-typing. But wait, what’s that? Well, it just means
that if it looks like a duck, it must be a duck. While this is easy on the programmers during
coding, it can raise run-time errors.
4. Underdeveloped Database Access Layers
Compared to more widely used technologies like JDBC (Java DataBase Connectivity) and
ODBC (Open DataBase Connectivity), Python’s database access layers are a bit underdeveloped.
Consequently, it is less often applied in huge enterprises.
5. Simple
No, we’re not kidding. Python’s simplicity can indeed be a problem. Take my example. I don’t
do Java, I’m more of a Python person. To me, its syntax is so simple that the verbosity of Java
code seems unnecessary.
This was all about the Advantages and Disadvantages of Python Programming Language.
History of Python:-
What do the alphabet and the programming language Python have in common? Right, both start
with ABC. If we are talking about ABC in the Python context, it's clear that the programming
language ABC is meant. ABC is a general-purpose programming language and programming
environment, which had been developed in Amsterdam, the Netherlands, at the CWI (Centrum
Wiskunde & Informatica). The greatest achievement of ABC was to influence the design of
Python. Python was conceptualized in the late 1980s. Guido van Rossum worked that time in a
project at the CWI, called Amoeba, a distributed operating system. In an interview with Bill
Venners, Guido van Rossum said: "In the early 1980s, I worked as an implementer on a team
building a language called ABC at Centrum voor Wiskunde en Informatica (CWI). I don't know
how well people know ABC's influence on Python. I try to mention ABC's influence because I'm
indebted to everything I learned during that project and to the people who worked on it." Later on
in the same interview, Guido van Rossum continued: "I remembered all my experience and some
of my frustration with ABC. I decided to try to design a simple scripting language that possessed
some of ABC's better properties, but without its problems. So, I started typing. I created a simple
virtual machine, a simple parser, and a simple runtime. I made my own version of the various
ABC parts that I liked. I created a basic syntax, used indentation for statement grouping instead
of curly braces or begin-end blocks, and developed a small number of powerful data types: a
hash table (or dictionary, as we call it), a list, strings, and numbers."
5.1.3 MACHINE LEARNING OVERVIEW: -
Before we take a look at the details of various machine learning methods, let's start by looking at
what machine learning is, and what it isn't. Machine learning is often categorized as a subfield of
artificial intelligence, but I find that categorization can often be misleading at first brush. The
study of machine learning certainly arose from research in this context, but in the data science
application of machine learning methods, it's more helpful to think of machine learning as a
means of building models of data.
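As a minimal illustration of "building a model of data", the following hypothetical sketch fits a model to a few synthetic points and applies it to new input (the data and model choice are purely illustrative):

```python
from sklearn.linear_model import LinearRegression
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0]])  # feature: one column
y = np.array([2.0, 4.0, 6.0, 8.0])          # target follows y = 2 * x

model = LinearRegression()
model.fit(X, y)                 # learn the relationship from the data
pred = model.predict([[5.0]])   # apply the learned model to unseen input
print(round(float(pred[0]), 2))
```

The model is simply a summary of the data that can generalize to inputs it has not seen.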
TensorFlow
TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library, and is also used for machine
learning applications such as neural networks. It is used for both research and production at
Google. TensorFlow was developed by the Google Brain team for internal Google use. It was
released under the Apache 2.0 open-source license on November 9, 2015.
Numpy
It is the fundamental package for scientific computing with Python. It contains various features including these important ones: a powerful N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, and useful linear algebra, Fourier transform, and random number capabilities.
Besides its obvious scientific uses, Numpy can also be used as an efficient multi-dimensional
container of generic data. Arbitrary data-types can be defined using Numpy which allows Numpy
to seamlessly and speedily integrate with a wide variety of databases.
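For instance, a structured dtype lets Numpy act as such a generic container; the record layout below is a small illustrative sketch (field names are invented for the example):

```python
import numpy as np

# Define an arbitrary record layout: a 4-byte float and a 4-byte integer.
record = np.dtype([('price', np.float32), ('qty', np.int32)])

# A typed container holding records of that layout.
rows = np.array([(9.99, 3), (4.50, 10)], dtype=record)

print(rows['qty'].sum())  # a field can be accessed like a column
```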
Pandas
Pandas is an open-source Python library providing high-performance, easy-to-use data structures, such as the DataFrame, and data analysis tools. It is widely used for loading, cleaning, and manipulating tabular data.
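A small, illustrative example of the Pandas DataFrame structure (the column names and numbers are hypothetical):

```python
import pandas as pd

# Build a small table and compute a per-column statistic.
df = pd.DataFrame({"algorithm": ["SVM", "Basic-DML"],
                   "accuracy": [82.0, 91.5]})
print(df["accuracy"].mean())
```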
Matplotlib
Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety
of hardcopy formats and interactive environments across platforms. Matplotlib can be used in
Python scripts, the Python and IPython shells, the Jupyter Notebook, web application servers,
and four graphical user interface toolkits. Matplotlib tries to make easy things easy and hard
things possible. You can generate plots, histograms, power spectra, bar charts, error charts,
scatter plots, etc., with just a few lines of code. For examples, see the sample plots and thumbnail
gallery. For simple plotting the pyplot module provides a MATLAB-like interface, particularly
when combined with IPython. For the power user, you have full control of line styles, font
properties, axes properties, etc., via an object-oriented interface or via a set of functions familiar
to MATLAB users.
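A few lines suffice for a basic plot via the pyplot interface. The sketch below uses the Agg backend so the figure is written to a file without needing a display (the filename is illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # render to a file; no GUI toolkit required
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4], [1, 4, 9, 16], marker="o")  # a simple line plot
plt.xlabel("x")
plt.ylabel("x squared")
plt.savefig("plot.png")  # write the figure in a hardcopy format
```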
Scikit-learn
Scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent
interface in Python. It is licensed under a permissive simplified BSD license and is distributed with many Linux distributions, encouraging academic and commercial use.
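The consistency of that interface means different estimators are trained and queried the same way; the toy data below is illustrative:

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 1, 1, 1]

# Every estimator follows the same fit/predict contract.
for Model in (DecisionTreeClassifier, KNeighborsClassifier):
    clf = Model().fit(X, y)                # identical training call
    print(Model.__name__, clf.predict([[1.5]]))
```

Swapping one algorithm for another therefore requires changing only the estimator class, not the surrounding code.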
Python features a dynamic type system and automatic memory management. It supports multiple
programming paradigms, including object-oriented, imperative, functional and procedural, and
has a large and comprehensive standard library.
Python is Interpreted − Python is processed at runtime by the interpreter. You do not need to
compile your program before executing it. This is similar to PERL and PHP.
Python is Interactive − you can actually sit at a Python prompt and interact with the interpreter
directly to write your programs.
Python also acknowledges that speed of development is important. Readable and terse code is part of this, and so is access to powerful constructs that avoid tedious repetition of code. Maintainability also ties into this. Code size may be an all but useless metric, but it does say something about how much code you have to scan, read, and/or understand to troubleshoot problems or tweak behaviors.
CHAPTER 6
6.1 MODULES
Worker 1: This is a worker node which accepts its part of the divided dataset from the center server, builds the existing SVM model and the Basic-DML model, calculates the accuracy of both algorithms, and sends the results back to the center server.
Worker 2: This is another worker node which accepts the other half of the dataset, runs the existing SVM and Basic-DML models, and sends their accuracy back to the center server.
Center Server: This is the center server, which uploads the dataset to the application, divides it into two equal parts, distributes one part to each worker, and then collects the results. This server also runs Semi-DML and calculates its accuracy.
6.2 SAMPLE CODE
Worker 1:
import socket
import json
import os
import pandas as pd
import numpy as np
from threading import Thread
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import IsolationForest

def runExistingSVM(dataset):
    dataset = dataset.values
    X = dataset[:, 0:dataset.shape[1]-1]
    Y = dataset[:, dataset.shape[1]-1]
    print("Dataset received and contain total records without poison detection : " + str(len(X)))
    indices = np.arange(X.shape[0])
    np.random.shuffle(indices)
    X = X[indices]
    Y = Y[indices]
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
    cls = svm.SVC()
    cls.fit(X_train, y_train)
    for i in range(0, 20):  # simulate poisoning by corrupting the first 20 test labels
        y_test[i] = 10
    prediction_data = cls.predict(X_test)
    svm_acc = accuracy_score(y_test, prediction_data) * 100
    return svm_acc
def runDMLwithPoisonDataDetection(dataset):
    dataset = dataset.values
    X = dataset[:, 0:dataset.shape[1]-1]
    Y = dataset[:, dataset.shape[1]-1]
    indices = np.arange(X.shape[0])
    np.random.shuffle(indices)
    X = X[indices]
    Y = Y[indices]
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
    iso = IsolationForest(contamination=0.1)  # treat ~10% of records as outliers
    yhat = iso.fit_predict(X_train)
    mask = yhat != -1                         # keep only records not flagged as poisoned
    X_train, y_train = X_train[mask], y_train[mask]
    cls = svm.SVC()
    cls.fit(X_train, y_train)
    prediction_data = cls.predict(X_test)
    svm_acc = accuracy_score(y_test, prediction_data) * 100
    return svm_acc
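The outlier-filtering idea used above can be demonstrated in isolation: poisoned points placed far from the clean cluster are flagged by IsolationForest (the synthetic data and contamination value are illustrative):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
clean = rng.normal(0, 1, size=(95, 2))    # clean records near the origin
poison = rng.uniform(8, 10, size=(5, 2))  # injected poisoned records, far away
X = np.vstack([clean, poison])

iso = IsolationForest(contamination=0.05, random_state=0)
yhat = iso.fit_predict(X)                 # -1 marks suspected outliers
mask = yhat != -1
print(mask[:95].sum(), mask[95:].sum())   # records kept: clean vs poisoned
```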
def startApplicationServer():
    class ClientThread(Thread):
        def __init__(self, ip, port):
            Thread.__init__(self)
            self.ip = ip
            self.port = port
        def run(self):
            data = conn.recv(10000)  # conn is the accepted client socket
            data = json.loads(data.decode())
            request_type = str(data.get("type"))
            if request_type == 'basicDML':
                data = str(data.get("dataset"))
                f = open("dataset.csv", "w")
                f.write(data)
                f.close()
                dataset = pd.read_csv("dataset.csv")
                existing_svm_accuracy = runExistingSVM(dataset)
                basic_dml_accuracy = runDMLwithPoisonDataDetection(dataset)
                message = json.dumps({"svm": existing_svm_accuracy, "dml": basic_dml_accuracy})
                conn.send(message.encode())
Worker 2:
import socket
import json
import os
import pandas as pd
import numpy as np
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def runExistingSVM(dataset):
    dataset = dataset.values
    X = dataset[:, 0:dataset.shape[1]-1]
    Y = dataset[:, dataset.shape[1]-1]
    print("Dataset received and contain total records without poison detection : " + str(len(X)))
    indices = np.arange(X.shape[0])
    np.random.shuffle(indices)
    X = X[indices]
    Y = Y[indices]
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
    cls = svm.SVC()
    cls.fit(X_train, y_train)
    prediction_data = cls.…
Center server:
import tkinter
from tkinter import filedialog
import numpy as np
import os
import re
import pandas as pd
import socket
import json

main = tkinter.Tk()
main.geometry("1300x1200")

global filename
global svm_acc, basic_acc, semi_acc
global part1, part2
global first, second

def upload():
    global filename
    filename = filedialog.…
6.3 OUTPUT SCREENSHOTS
Figure 6.3.3: Upload dataset
Figure 6.3.5: Divide dataset
Figure 6.3.7: Run Semi-DML
Figure 6.3.9: Accuracy comparison graph; the x-axis contains the algorithm name and the y-axis represents accuracy
Figure 6.3.10: Basic-DML and Semi-DML accuracy is better than the existing SVM accuracy
CHAPTER 7
7.1 TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests. Each
test type addresses a specific testing requirement.
UNIT TESTING
Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly, and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application. It is done after the completion of an individual unit and before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform
basic tests at component level and test a specific business process, application, and/or system
configuration. Unit tests ensure that each unique path of a business process performs accurately
to the documented specifications and contains clearly defined inputs and expected results.
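As a hypothetical illustration, a unit test for an accuracy-computation helper in this project might look like this (the helper name is invented for the example):

```python
import unittest
from sklearn.metrics import accuracy_score

def compute_accuracy(y_true, y_pred):
    # Helper under test: accuracy as a percentage (hypothetical project utility).
    return accuracy_score(y_true, y_pred) * 100

class TestComputeAccuracy(unittest.TestCase):
    def test_valid_output(self):
        # Known inputs must produce the documented expected result.
        self.assertEqual(compute_accuracy([1, 0, 1, 1], [1, 0, 0, 1]), 75.0)

    def test_perfect_prediction(self):
        self.assertEqual(compute_accuracy([1, 1], [1, 1]), 100.0)

# Run with: python -m unittest <module name>
```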
INTEGRATION TESTING
Integration tests are designed to test integrated software components to determine if they actually
run as one program. Testing is event driven and is more concerned with the basic outcome of
screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and
consistent. Integration testing is specifically aimed at exposing the problems that arise from the
combination of components.
FUNCTIONAL TESTING
Functional tests provide systematic demonstrations that functions tested are available as
specified by the business and technical requirements, system documentation, and user manuals.
Functional testing is centered on the following items:
Valid Input: identified classes of valid input must be accepted.
Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
SYSTEM TESTING
System testing ensures that the entire integrated software system meets requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions
and flows, emphasizing pre-driven process links and integration points.
White Box Testing is testing in which the software tester has knowledge of the inner workings, structure and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black box level.
Black Box Testing is testing the software without any knowledge of the inner workings, structure
or language of the module being tested. Black box tests, as most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.
UNIT TESTING
Unit testing is usually conducted as part of a combined code and unit test phase of the software
lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct
phases.
The task of the integration test is to check that components or software applications (e.g. components in a software system or, one step up, software applications at the company level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation
by the end user. It also ensures that the system meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
8. CONCLUSION
In this project, we discussed the Data Poison Detection Schemes in both Basic-DML and Semi-
DML scenarios. The data poison detection scheme in the basic-DML scenario utilizes a threshold
of parameters to find out the poisoned sub-datasets. Moreover, we established a mathematical
model to analyze the probability of finding threats with different numbers of training loops.
Furthermore, we presented an improved data poison detection scheme and the optimal resource
allocation in the Semi-DML scenario. Simulation results show that in the Basic-DML scenario, the proposed scheme can increase the model accuracy by up to 20% to 30%. In the Semi-DML scenario, the improved data poison detection scheme with optimal resource allocation can decrease resource wastage by 50% to 60% compared to the other two schemes without optimal resource allocation.
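The threshold idea discussed above can be sketched as follows: compare each sub-dataset's trained parameters against a robust reference (the median across workers) and flag those whose deviation exceeds a threshold (the numbers and threshold value below are illustrative, not the project's actual parameters):

```python
import numpy as np

# Parameter vectors reported after training on each sub-dataset (synthetic).
params = np.array([
    [0.9, 1.1],   # worker 0
    [1.0, 1.0],   # worker 1
    [1.1, 0.9],   # worker 2
    [5.0, -4.0],  # worker 3: trained on a poisoned sub-dataset
])

center = np.median(params, axis=0)             # robust reference point
dev = np.linalg.norm(params - center, axis=1)  # per-worker deviation
threshold = 3.0                                # illustrative cut-off
flagged = np.where(dev > threshold)[0]         # flags the poisoned worker
print(flagged)
```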
9. FUTURE ENHANCEMENT
The data poisoning detection strategy may be expanded to a more dynamic pattern in the future to accommodate evolving application environments and attack levels. Furthermore, further research is required on the trade-off between resource cost and security, since training multiple sub-datasets will raise the system's resource consumption.
BIBLIOGRAPHY
[1] G. Qiao, S. Leng, K. Zhang, and Y. He, “Collaborative task offloading in vehicular edge
multiaccess networks,” IEEE Communications Magazine, vol. 56, no. 8, pp. 48–54, 2018.
[2] K. Zhang, S. Leng, X. Peng, L. Pan, S. Maharjan, and Y. Zhang, “Artificial intelligence
inspired transmission scheduling in cognitive vehicular communications and networks,” IEEE
Internet of Things Journal, vol. 6, no. 2, pp. 1987–1997, 2019.
[4] T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang,
“Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems,”
CoRR, vol. abs/1512.01274, 2015.
[6] S. Yu, M. Liu, W. Dou, X. Liu, and S. Zhou, “Networking for big data: A survey,” IEEE
Communications Surveys & Tutorials, vol. 19, no. 1, pp. 531–549, 2017.