DEDUPE-TODO
- Mixed buffers of dedupe-able and compressible data.
A major use case in performance benchmarking of storage subsystems.
- Shifted dedupe-able data.
Allow dedupe buffer generation to shift the contents by a random number
of sectors (filling the gaps with incompressible data). Some storage
subsystems have modernized their deduplication detection algorithms to
look for shifted data as well. For example, some databases prepend a
timestamp to written blocks, which leaves the underlying data
dedupe-able at a different alignment. FIO should be able to simulate
such a workload.
- Generation of similar (but not identical) data.
A rising trend in enterprise storage systems.
Generating "similar" data means producing random incompressible buffers
that differ from each other by a few (a configurable number of) bits.
The storage subsystem usually identifies similar buffers using
locality-sensitive hashing or other methods.
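
The three items above could be prototyped roughly as follows. This is only a
sketch in Python to pin down the intended buffer layouts; it is not FIO code
and does not use any FIO interface. The 512-byte SECTOR size and the names
rand_bytes, mixed_buffer, shifted_buffer and similar_buffer are all
assumptions made for illustration, as is the choice to make a sector
compressible by zero-filling its tail.

```python
import random

SECTOR = 512  # assumed sector size in bytes (hypothetical, for illustration)


def rand_bytes(rng, n):
    """Return n pseudo-random (effectively incompressible) bytes."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def mixed_buffer(n_sectors, dedupe_every, compress_ratio, shared, rng):
    """Mix dedupe-able and compressible sectors in one buffer.

    Every `dedupe_every`-th sector is a copy of `shared` (dedupe-able).
    Every other sector starts with random bytes and ends with a run of
    zeros covering `compress_ratio` of the sector (compressible).
    """
    rand_len = int(SECTOR * (1 - compress_ratio))
    out = bytearray()
    for i in range(n_sectors):
        if i % dedupe_every == 0:
            out += shared
        else:
            out += rand_bytes(rng, rand_len) + bytes(SECTOR - rand_len)
    return bytes(out)


def shifted_buffer(base, max_shift, rng):
    """Shift `base` forward by a random number of sectors (1..max_shift).

    The gap at the front is filled with random bytes and the tail of
    `base` is dropped, so the result has the same length as `base`.
    Returns the new buffer and the shift that was applied.
    """
    shift = rng.randint(1, max_shift)
    gap = shift * SECTOR
    return rand_bytes(rng, gap) + base[:len(base) - gap], shift


def similar_buffer(base, n_bits, rng):
    """Return a copy of `base` with exactly `n_bits` distinct bits flipped."""
    out = bytearray(base)
    for pos in rng.sample(range(len(base) * 8), n_bits):
        out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(1234)  # fixed seed so runs are repeatable
    shared = rand_bytes(rng, SECTOR)
    base = rand_bytes(rng, 8 * SECTOR)
    mixed = mixed_buffer(8, 2, 0.5, shared, rng)
    shifted, shift = shifted_buffer(base, 4, rng)
    similar = similar_buffer(base, 5, rng)
    print(len(mixed), len(shifted), shift, len(similar))
```

Sampling distinct bit positions in similar_buffer keeps the Hamming distance
to the base buffer equal to the requested bit count, which makes the
"configurable number of bits" knob exact rather than approximate.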