DGIM Algorithm

The DGIM algorithm efficiently counts the number of 1's in a data stream using O(log²N) bits and provides an estimate with a maximum error of 50%. It organizes incoming bits into buckets based on specific rules, allowing for dynamic updates as new bits arrive. While the algorithm is advantageous for its space efficiency and ease of updates, it may incur significant errors if all 1's are located in the unknown region of the data stream.

Uploaded by

deepa

COUNTING THE NUMBER OF 1’s IN THE DATA STREAM

DGIM algorithm (Datar-Gionis-Indyk-Motwani Algorithm)

Designed to estimate the number of 1’s in the most recent N bits of a data stream. The algorithm uses O(log²N) bits to represent a window of N bits, and it allows us to estimate the number of 1’s in the window with an error of no more than 50%.

So the estimate is never off by more than 50% of the true count.

In the DGIM algorithm, each bit that arrives has a timestamp for the position at which it arrives: the first bit has timestamp 1, the second bit has timestamp 2, and so on. Timestamps only need to be distinguished within a window of size N, so they can be stored modulo N (the window size is usually taken as a multiple of 2). The window is divided into buckets consisting of 1’s and 0’s.

RULES FOR FORMING THE BUCKETS:

1. The right end of a bucket must always be a 1. (If a candidate bucket would end in a 0 on the right, that 0 is left out.)
E.g. 1001011 → a bucket of size 4, containing four 1’s, with a 1 at its right end.

2. Every bucket must contain at least one 1; otherwise no bucket can be formed.

3. The size of every bucket (the number of 1’s it contains) must be a power of 2.

4. Bucket sizes cannot decrease as we move to the left (they stay the same or grow toward the older bits).
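The rules above can be checked mechanically. The following is a minimal sketch, assuming each bucket is stored as a (timestamp, size) pair, where the timestamp is the position of the bucket's rightmost 1 and the size is the number of 1's it holds. The representation and the function names are illustrative choices, not something the notes prescribe.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (rules 2 and 3: at least one 1, power of 2)."""
    return n >= 1 and (n & (n - 1)) == 0


def buckets_are_valid(buckets):
    """Check the bucket rules on a list of (timestamp, size) pairs.

    Assumption: the list is ordered newest (rightmost) bucket first.
    Rule 1 is implicit, because the stored timestamp is the position
    of the 1 at the bucket's right end.
    """
    prev_size = 0
    prev_ts = None
    for ts, size in buckets:
        if not is_power_of_two(size):
            return False              # violates rule 2 or rule 3
        if size < prev_size:
            return False              # rule 4: sizes must not shrink to the left
        if prev_ts is not None and ts >= prev_ts:
            return False              # older buckets must have smaller timestamps
        prev_size, prev_ts = size, ts
    return True
```

For example, `buckets_are_valid([(103, 1), (102, 1), (100, 2), (95, 4)])` holds, while a list containing a bucket of size 3 is rejected.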

Let us take an example to understand the algorithm.

Estimating the number of 1’s and counting the buckets in the given data stream.

This picture shows how we can form the buckets based on the number of ones by following the rules.

In the given data stream, let us assume each new bit arrives from the right.

When the new bit = 0:
After the new bit (0) arrives with timestamp 101, there is no change in the buckets.

But if the new bit that arrives is 1, we need to make changes:
· Create a new bucket with the current timestamp and size 1.

· If there was previously only one bucket of size 1, then nothing more needs to be done. However, if there are now three buckets of size 1 (the buckets with timestamps 100, 102 and 103 in the second step of the picture), that is one too many. We fix the problem by combining the leftmost (earliest) two buckets of size 1 (purple box).

To combine any two adjacent buckets of the same size, replace them by one bucket of twice the size. The timestamp of the new bucket is the timestamp of the rightmost (more recent) of the two buckets.

Now, combining two buckets of size 1 may sometimes create a third bucket of size 2. If so, we combine the leftmost two buckets of size 2 into a bucket of size 4. This process may ripple through the bucket sizes.
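The update and merge steps described above can be sketched as follows. This is an illustrative sketch, not a reference implementation: it assumes buckets are kept in a Python list of [timestamp, size] entries ordered newest first, and the name `add_bit` is hypothetical.

```python
def add_bit(buckets, timestamp, bit):
    """Update the bucket list in place for one arriving bit.

    Assumption: buckets is a list of [timestamp, size], newest first.
    """
    if bit == 0:
        return buckets                    # a 0 never changes the buckets

    # A 1 always starts a new bucket of size 1 at the current timestamp.
    buckets.insert(0, [timestamp, 1])

    # While three buckets share a size, merge the two oldest of them.
    i = 0
    while (i + 2 < len(buckets)
           and buckets[i][1] == buckets[i + 1][1] == buckets[i + 2][1]):
        # The merged bucket keeps the timestamp of the more recent one
        # (buckets[i + 1]) and has twice the size.
        buckets[i + 1][1] *= 2
        del buckets[i + 2]
        i += 1                            # the merge may ripple to the next size
    return buckets


buckets = []
for t in range(1, 8):                     # seven 1's at timestamps 1..7
    add_bit(buckets, t, 1)
print(buckets)                            # [[7, 1], [6, 2], [4, 4]]
```

On the seventh 1, the merge ripples twice: three buckets of size 1 produce a third bucket of size 2, which in turn is merged into a bucket of size 4.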

How long do we keep a bucket?

A bucket is kept as long as (current timestamp − timestamp of the leftmost bucket in the window) < N (N = 24 here). E.g. 103 − 87 = 16 < 24, so the bucket is kept; once the difference is greater than or equal to N, the bucket has left the window and is dropped.
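One reading of the check above is as a rule for discarding the oldest bucket once it falls outside the window. A minimal sketch under that reading, assuming buckets are (timestamp, size) pairs ordered newest first; `drop_expired` is a hypothetical helper name.

```python
def drop_expired(buckets, current_timestamp, window_size):
    """Remove buckets whose timestamp has left the window.

    Assumption: buckets is a list of (timestamp, size), newest first,
    so the leftmost (oldest) bucket is the last list element.
    """
    while buckets and current_timestamp - buckets[-1][0] >= window_size:
        buckets.pop()
    return buckets


# The numbers from the text: 103 - 87 = 16 < 24, so the bucket stays.
print(drop_expired([(103, 1), (87, 8)], 103, 24))   # [(103, 1), (87, 8)]
# Eight bits later, 111 - 87 = 24 is no longer < 24, so it is dropped.
print(drop_expired([(103, 1), (87, 8)], 111, 24))   # [(103, 1)]
```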

Finally, the answer to the query.

How many 1’s are there in the last 20 bits?

Adding up the sizes of the buckets that fall within the last 20 bits, we estimate that there are 11 ones.
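A query can be sketched in a few lines. Note that this sketch follows the usual textbook form of the DGIM estimate, which counts only half of the oldest bucket in range because that bucket may extend past the queried bits; that is a slightly different rule from simply adding up the bucket sizes. The bucket values below are made up for illustration and are not the ones from the picture.

```python
def estimate_ones(buckets, current_timestamp, k):
    """Estimate the number of 1's among the last k bits.

    Assumption: buckets is a list of (timestamp, size), newest first.
    The oldest bucket in range contributes only half its size.
    """
    in_range = [size for ts, size in buckets if current_timestamp - ts < k]
    if not in_range:
        return 0
    oldest = in_range[-1]
    return sum(in_range) - oldest / 2


# Hypothetical bucket list at timestamp 103.
buckets = [(103, 1), (102, 1), (100, 2), (95, 4), (87, 8)]
print(estimate_ones(buckets, 103, 20))   # 12.0  (1 + 1 + 2 + 4 + 8/2)
print(estimate_ones(buckets, 103, 10))   # 6.0   (1 + 1 + 2 + 4/2)
```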

Advantages

• Stores only O(log²N) bits: O(log N) counts of log₂N bits each

• Easy update as more bits enter

• Error in count no greater than the number of 1’s in the unknown area.
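To make the space claim concrete, here is a rough back-of-the-envelope calculation. It assumes at most two buckets of each size remain after merging, timestamps stored modulo N, and each bucket size stored as its exponent; these are assumptions of the sketch, and the function name is hypothetical.

```python
import math


def dgim_space_bits(window_size):
    """Rough upper bound on the bits needed for a window of the given size."""
    log_n = math.ceil(math.log2(window_size))
    # Assumption: at most two buckets each of sizes 1, 2, 4, ..., up to N.
    max_buckets = 2 * (log_n + 1)
    # Assumption: timestamps are stored modulo N.
    timestamp_bits = log_n
    # Assumption: a size 2^j is stored as its exponent j, with 0 <= j <= log_n.
    size_bits = max(1, math.ceil(math.log2(log_n + 1)))
    return max_buckets * (timestamp_bits + size_bits)


print(dgim_space_bits(1024))   # 308 bits, versus 1024 bits for the raw window
```

The gap widens quickly as N grows, since the bound grows with log²N while the raw window grows with N.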

Drawbacks

• As long as the 1s are fairly evenly distributed, the error due to the unknown region is small – no
more than 50%

• But it could be that all the 1s are in the unknown area (indicated by “?” in the below figure) at the
end. In that case, the error is unbounded.
