Anomaly Detection

This document discusses anomaly detection in machine learning and network security. It defines anomalies and outliers, and explains how to detect outliers using z-scores. Common causes of anomalies are given as different data classes, natural variation, and data errors. Challenges in anomaly detection are discussed, including obtaining accurate labels for supervised learning methods and dealing with different data types.

Uploaded by

Amita Soni

MACHINE LEARNING AND NETWORK SECURITY

UNIT 2: Anomaly Detection

Anomaly vs Outliers
A standard normal table (also called the unit normal table or z-score table) is a mathematical table of the values
of Φ, the cumulative distribution function of the standard normal distribution. The z-score, also known as
the standard score, indicates how many standard deviations a value lies from the mean.

$$Z = \frac{X - \mu}{\sigma}$$
Reference Link: https://www.machinelearningplus.com/machine-learning/how-to-detect-outliers-with-z-score/
Standard Normal Distribution: mean = 0, standard deviation = 1
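
The z-score formula above can be turned into a simple outlier check. The following is a minimal sketch, not a definitive method: the function names, the sample weights, and the cutoff of 3 standard deviations are assumptions chosen for illustration, and the population standard deviation is used for σ.

```python
import statistics

def z_scores(values):
    """Return Z = (X - mu) / sigma for every value in the list."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in values]

def z_score_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical weights: ten similar readings and one extreme reading.
weights = [150, 152, 149, 151, 148, 150, 153, 147, 151, 149, 400]
print(z_score_outliers(weights))
```

In small samples a single extreme value inflates both the mean and the standard deviation, so its z-score cannot grow very large. A lower threshold or a larger sample may be needed in practice.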
Causes of Anomalies

1. Data from different classes

An object may be different because it belongs to a different class. Credit card theft, intrusion detection, disease
outcomes, and abnormal test results are good examples of anomalies that occur and can be identified using class labels.
Example: measuring the weights of oranges when a few grapefruit are mixed in.

2. Natural variation

In a normal (Gaussian) distribution, the probability of a data object decreases rapidly as its distance from the mean
increases. Objects far from the mean are therefore rare and are considered anomalies; they are also called outliers.
Example: unusually tall people.

3. Data measurement and Collection Errors

These errors occur when erroneous data is collected or when the measurement itself deviates from the true value.
Example: a recorded weight of 200 pounds for a 2-year-old.
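
To make the "natural variation" cause concrete, the sketch below computes how quickly the probability of a value drops as it moves away from the mean of a standard normal distribution. It uses the identity P(|Z| > k) = erfc(k / √2). The function name is an assumption introduced for illustration.

```python
import math

def two_sided_tail_probability(k):
    """P(|Z| > k) for a standard normal Z, using the complementary error function."""
    return math.erfc(k / math.sqrt(2))

# Probability of landing more than k standard deviations from the mean.
for k in (1, 2, 3, 4):
    print(f"P(|Z| > {k}) = {two_sided_tail_probability(k):.6f}")
```

Roughly 32% of values lie more than one standard deviation from the mean, but only about 0.27% lie more than three away, which is why such values are commonly treated as outliers.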
[Figure: line A is blue, line B is green, and line C is red.]
We could use a clustering algorithm to assign each data point to a cluster.
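
As one possible illustration of assigning cluster membership, the sketch below runs a tiny one-dimensional k-means on hypothetical fruit weights (in grams) and treats the smallest cluster as the candidate anomalies. The source does not name a specific clustering algorithm, so k-means, the function name `kmeans_1d`, the even initialization between the minimum and maximum, and the data are all assumptions.

```python
from collections import Counter

def kmeans_1d(values, k=2, max_iterations=100):
    """Tiny 1-D k-means. Returns (centroids, labels)."""
    if k < 2 or k > len(values):
        raise ValueError("k must be between 2 and the number of values")
    lo, hi = min(values), max(values)
    # Assumption: spread the initial centroids evenly between min and max.
    centroids = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each value joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: abs(x - centroids[j])) for x in values]
        if new_labels == labels:
            break  # memberships stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels

# Hypothetical weights: six oranges and two grapefruit mixed in.
weights = [130, 140, 135, 150, 145, 138, 420, 450]
centroids, labels = kmeans_1d(weights, k=2)

# Treat the least populated cluster as the candidate anomalies.
sizes = Counter(labels)
rare_label = min(sizes, key=sizes.get)
suspects = [x for x, lab in zip(weights, labels) if lab == rare_label]
print(labels, suspects)
```

Flagging the smallest cluster only makes sense when anomalies are rare relative to normal data. If the two groups were similar in size, cluster membership alone would not identify which one is anomalous.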
Other Challenges in Anomaly Detection
Machine learning methods can be classified in many different ways. Quite frequently, we differentiate between
supervised and unsupervised learning. In supervised learning, the learning program needs labeled examples
given by a “teacher”, whereas in unsupervised learning, the program learns patterns directly from the data,
without any human intervention or guidance. A supervised method typically builds a predictive model for the
normal and anomaly classes, then compares any unseen data instance against the model to identify which class it
belongs to. An unsupervised method instead relies on certain assumptions: (i) normal instances are far more
frequent than anomalous instances, and (ii) anomalous instances are statistically different from normal
instances. If these assumptions do not hold, such methods suffer from high false alarm rates.
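
The false alarm rate mentioned above can be measured once true labels are available. A minimal sketch follows, assuming the common definition FP / (FP + TN), where a false positive is a normal instance flagged as anomalous. The function name, the label encoding (1 = anomaly, 0 = normal), and the sample labels are illustrative assumptions.

```python
def false_alarm_rate(y_true, y_pred):
    """Fraction of normal instances (label 0) that were flagged as anomalies (label 1)."""
    false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    true_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    normals = false_positives + true_negatives
    if normals == 0:
        raise ValueError("no normal instances; false alarm rate is undefined")
    return false_positives / normals

# Hypothetical ground truth and detector output for ten instances.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
print(false_alarm_rate(y_true, y_pred))
```

Here two of the eight normal instances are flagged, giving a false alarm rate of 0.25. Because normal instances usually far outnumber anomalies, even a small rate can produce many false alerts.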

For supervised learning, an important issue is to obtain accurate and representative labels, especially for the
anomaly classes.
Various Types of Data

The attributes used to describe real-life objects can be of different types. The following are the commonly used
types of attribute variables.