This repository will serve as reference code for the paper "Bit-array-based alternatives to HyperLogLog" by Svante Janson, Jérémie Lumbroso and Bob Sedgewick.
The paper itself contains reference implementations in Java, but this repository provides a fully runnable version. We rely on the randomhash
Java package (for which an exact Python equivalent exists), which implements affine combinations of a CRC32 hash, with parameters generated by a Mersenne Twister. This family of hash functions has been shown to have good distribution and independence properties, and in practice the functions behave as though they are sufficiently decorrelated, even though they are not independent by construction.
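
To make the idea of "affine combinations of a CRC32 hash" concrete, here is a minimal sketch in Java. It is not the randomhash API: the class name AffineCrc32Family, its methods, and the choice of modulus 2^31 - 1 are hypothetical and chosen for illustration only. It also uses java.util.Random as a stand-in for the Mersenne Twister, which is not part of the Java standard library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.CRC32;

/**
 * Illustrative sketch (hypothetical names, not the randomhash package):
 * a family of hash functions h_i(x) = (a_i * crc32(x) + b_i) mod p,
 * all derived from a single CRC32 value of the key.
 */
final class AffineCrc32Family {
    // Assumption: the Mersenne prime 2^31 - 1 is used as the modulus.
    private static final long PRIME = 2147483647L;

    private final long[] a; // multipliers, each in [1, PRIME - 1]
    private final long[] b; // offsets, each in [0, PRIME - 1]

    AffineCrc32Family(int count, long seed) {
        // Assumption: java.util.Random replaces the Mersenne Twister here.
        Random rng = new Random(seed);
        a = new long[count];
        b = new long[count];
        for (int i = 0; i < count; i++) {
            a[i] = 1 + rng.nextInt((int) (PRIME - 1));
            b[i] = rng.nextInt((int) PRIME);
        }
    }

    /** Number of hash functions in the family. */
    int size() {
        return a.length;
    }

    /** Value of the i-th hash function on the given key, in [0, PRIME - 1]. */
    long hash(int i, String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        // Reduce first so that a[i] * base stays below 2^62 and cannot overflow.
        long base = crc.getValue() % PRIME;
        return (a[i] * base + b[i]) % PRIME;
    }
}
```

For example, new AffineCrc32Family(4, 42L).hash(0, "hello") gives the value of the first of four functions on the key "hello". Because every function is an affine image of the same CRC32 value, the functions are correlated by construction, which is the caveat noted above.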
We currently have a reference implementation, HyperBit64.java,
which is deprecated; we will update this repository with our final code soon.
These implementations were based on an earlier version of HyperBitBit, and may not be current:
- Heinz N. Gies: (Rust) https://github.com/axiomhq/hypertwobits/