4 releases (2 breaking)
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.7.0 (new) | Feb 28, 2026 |
| 0.6.0 | Feb 26, 2026 |
| 0.5.5 | Feb 25, 2026 |
| 0.5.4 | Feb 25, 2026 |
#357 in Compression
Used in 4 crates
625KB · 5K SLoC
Hexz
Deduplicated checkpoint storage for ML model weights. Pack safetensors and GGUF files into seekable .hxz archives that chunk at tensor boundaries and store fine-tunes as XOR deltas against their base.
Documentation · PyPI · crates.io · Releases
Install
pip install hexz # Python library
cargo install hexz-cli # CLI tool
CLI
# Pack a safetensors or GGUF model
hexz store base-model.safetensors base-model.hxz
# Store a fine-tune — delta against the base (XOR compression)
hexz store finetuned.safetensors finetuned.hxz --base base-model.hxz
# Export back to safetensors
hexz extract finetuned.hxz finetuned-out.safetensors
# Extract a single tensor
hexz extract finetuned.hxz --tensor lm_head.weight
# Inspect: tensor list, shapes, storage stats
hexz inspect finetuned.hxz
# Compare two archives — which tensors changed
hexz diff base-model.hxz finetuned.hxz
# List archives in a directory with chain info and unique bytes on disk
hexz ls ./checkpoints/
# Sign and verify
hexz keygen
hexz sign --key private.key finetuned.hxz
hexz verify --key public.key finetuned.hxz
Run hexz --help or hexz COMMAND --help for full usage.
Python API
Import from safetensors (no PyTorch required)
import hexz.checkpoint as ckpt
# Convert a safetensors file — chunks at tensor boundaries
ckpt.convert("base-model.safetensors", "base-model.hxz")
# Store a fine-tune as XOR delta — only diffs stored
ckpt.convert("finetuned.safetensors", "finetuned.hxz", base="base-model.hxz")
# Export back to safetensors
ckpt.extract("finetuned.hxz", "finetuned-out.safetensors")
# Extract a single tensor to raw bytes
ckpt.extract("finetuned.hxz", tensor="lm_head.weight")
Save and load PyTorch state dicts
import hexz.checkpoint as ckpt
# Save — stores XOR delta against parent
ckpt.save(model.state_dict(), "finetuned.hxz", parent="base-model.hxz")
# Load all tensors
state = ckpt.load("finetuned.hxz", device="cuda")
# Load only what you need — reads only those blocks from disk or S3
state = ckpt.load("finetuned.hxz", keys=["lm_head.weight", "embed_tokens.weight"])
# Inspect names and shapes without loading data
manifest = ckpt.manifest("finetuned.hxz")
Random access over S3
import hexz
# Only the requested blocks are downloaded
with hexz.open("s3://my-bucket/finetuned.hxz") as r:
    data = r.read(length, offset=tensor_offset)
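To make "only the requested blocks are downloaded" concrete, here is a minimal sketch of how a byte-range read can be mapped onto stored blocks and then onto HTTP `Range` headers, which S3 accepts on object reads. The block index layout and the helper names (`blocks_for_read`, `range_header`) are assumptions made for illustration; they are not Hexz's index format or API.

```python
from bisect import bisect_right


def blocks_for_read(block_starts: list, offset: int, length: int) -> range:
    """Return the indices of blocks overlapping [offset, offset + length).

    block_starts is a sorted list of block start offsets beginning at 0
    (a hypothetical index, not the real .hxz layout).
    """
    if length <= 0:
        return range(0)
    first = bisect_right(block_starts, offset) - 1
    last = bisect_right(block_starts, offset + length - 1) - 1
    return range(first, last + 1)


def range_header(block_starts: list, total_size: int, index: int) -> str:
    """Build an HTTP Range header value for one block (end is inclusive)."""
    start = block_starts[index]
    end = block_starts[index + 1] if index + 1 < len(block_starts) else total_size
    return f"bytes={start}-{end - 1}"


# Four variable-size blocks covering a 600-byte object.
block_starts = [0, 100, 250, 400]
total_size = 600

needed = blocks_for_read(block_starts, offset=120, length=200)
headers = [range_header(block_starts, total_size, i) for i in needed]
print(list(needed), headers)
```

A read that touches two of the four blocks produces two range requests; the other blocks are never fetched.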
How delta storage works
When you pass --base base.hxz (CLI), base= to ckpt.convert(), or parent= to ckpt.save() (Python):
- The safetensors header tells Hexz exactly where each tensor lives — no CDC rolling-hash scan needed
- For each tensor present in both files: delta = base_tensor XOR fine_tensor
- Fine-tuning perturbs weights across all layers without inserting or deleting bytes, so delta is dense but low-magnitude; zstd handles this well
- Tensors with no parent match (new adapter layers) are stored as-is
- Tensors byte-identical to the parent cost zero extra bytes via BLAKE3 block dedup
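The first bullet relies on the safetensors layout: an 8-byte little-endian header length, a JSON header giving each tensor's dtype, shape, and `data_offsets`, then the raw tensor bytes. The sketch below builds a tiny file in that layout and recovers each tensor's absolute byte range from the header alone. `build_safetensors` and `tensor_byte_ranges` are hypothetical helpers written for this example, not part of the Hexz API.

```python
from __future__ import annotations

import json
import struct


def build_safetensors(tensors: dict) -> bytes:
    """Serialize {name: (dtype, shape, raw_bytes)} in the safetensors layout."""
    header, payload, cursor = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            # Offsets are relative to the start of the data section.
            "data_offsets": [cursor, cursor + len(raw)],
        }
        payload += raw
        cursor += len(raw)
    header_json = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # 8-byte little-endian header length, then the JSON header, then tensor data.
    return struct.pack("<Q", len(header_json)) + header_json + payload


def tensor_byte_ranges(blob: bytes) -> dict:
    """Return absolute (start, end) byte ranges for every tensor in the file."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8 : 8 + header_len])
    data_start = 8 + header_len
    return {
        name: (data_start + meta["data_offsets"][0], data_start + meta["data_offsets"][1])
        for name, meta in header.items()
        if name != "__metadata__"  # optional free-form metadata entry
    }


blob = build_safetensors({
    "embed_tokens.weight": ("F16", [2, 2], bytes(8)),
    "lm_head.weight": ("F16", [2, 1], b"\x01\x02\x03\x04"),
})
ranges = tensor_byte_ranges(blob)
start, end = ranges["lm_head.weight"]
print(ranges)
```

Because the boundaries come straight from the header, a packer can cut chunks exactly at tensor edges without scanning the data.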
On load, Hexz reads the base tensor, decompresses the XOR delta, and XORs the two to reconstruct the fine-tuned tensor. This is transparent to ckpt.load().
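The store and load steps can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not Hexz's implementation: `zlib` stands in for zstd (which is not in the standard library on most Python versions), the "weights" are synthetic float32 values with a small random perturbation, and `xor_bytes` is a hypothetical helper. The printed sizes describe this toy data only and say nothing about real fine-tunes.

```python
import random
import struct
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("tensors must have identical byte length")
    n = len(a)
    return (int.from_bytes(a, "little") ^ int.from_bytes(b, "little")).to_bytes(n, "little")


rng = random.Random(0)
base_vals = [rng.uniform(-1.0, 1.0) for _ in range(4096)]
# Synthetic "fine-tune": every weight moves slightly, nothing is inserted or deleted.
fine_vals = [v + rng.gauss(0.0, 1e-4) for v in base_vals]

base = struct.pack(f"<{len(base_vals)}f", *base_vals)
fine = struct.pack(f"<{len(fine_vals)}f", *fine_vals)

# Store: XOR against the base, then compress the delta (zlib as a zstd stand-in).
delta = xor_bytes(base, fine)
stored = zlib.compress(delta, 9)

# Load: decompress the delta and XOR with the base again.
restored = xor_bytes(base, zlib.decompress(stored))
assert restored == fine

print("fine-tune compressed directly:", len(zlib.compress(fine, 9)), "bytes")
print("XOR delta compressed:         ", len(stored), "bytes")
```

XOR is its own inverse, so reconstruction needs only the base tensor and the stored delta; the round trip is lossless at the byte level.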
Note: XOR delta compression ratios on real model fine-tunes have not yet been benchmarked. The theoretical basis (ZipLLM, Hachiuma et al.) predicts significant savings; empirical numbers will be added once Phase 3 is complete. See ROADMAP.md.
Storage comparison
50 fine-tunes of a 7B model (~14 GB each), stored against the same base:
| Approach | Storage |
|---|---|
| Raw file copies | ~700 GB |
| git-lfs | ~700 GB — tracks blobs, does not deduplicate content |
| DVC + S3 | ~700 GB — pointer tracking, not a content store |
| Hexz (XOR delta) | [UNTESTED — benchmark in progress] |
The CDC block dedup benchmark (validated) shows 92.4% deduplication on shifted data vs 0% for fixed-size blocks. See COMPETITIVE_COMPARISON.md for full benchmark details.
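The gap between content-defined and fixed-size chunking on shifted data can be reproduced in miniature. The sketch below is a toy, not the benchmark cited above and not Hexz's chunker: it uses a simple gear-style rolling hash with arbitrarily chosen parameters, random input with a single byte prepended, and BLAKE2b from the standard library as a stand-in for BLAKE3. All function names are hypothetical.

```python
import hashlib
import random

rng = random.Random(0)
GEAR = [rng.getrandbits(64) for _ in range(256)]  # per-byte random values
MASK64 = (1 << 64) - 1


def fixed_chunks(data: bytes, size: int = 1024) -> list:
    """Cut at fixed offsets, regardless of content."""
    return [data[i : i + size] for i in range(0, len(data), size)]


def cdc_chunks(data: bytes, avg_bits: int = 10, min_size: int = 256, max_size: int = 8192) -> list:
    """Cut where the rolling hash's low bits are zero (content-defined)."""
    cut_mask = (1 << avg_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if (size >= min_size and (h & cut_mask) == 0) or size >= max_size:
            chunks.append(data[start : i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def dedup_ratio(old: list, new: list) -> float:
    """Fraction of the new data's bytes that sit in chunks already stored."""
    seen = {hashlib.blake2b(c, digest_size=32).digest() for c in old}
    reused = sum(len(c) for c in new if hashlib.blake2b(c, digest_size=32).digest() in seen)
    return reused / sum(len(c) for c in new)


original = bytes(rng.getrandbits(8) for _ in range(128 * 1024))
shifted = b"\x00" + original  # one inserted byte shifts everything after it

fixed_ratio = dedup_ratio(fixed_chunks(original), fixed_chunks(shifted))
cdc_ratio = dedup_ratio(cdc_chunks(original), cdc_chunks(shifted))
print(f"fixed-size blocks: {fixed_ratio:.1%} reused")
print(f"content-defined:   {cdc_ratio:.1%} reused")
```

With fixed-size blocks every boundary moves by one byte, so no block hash matches. With content-defined cuts the boundaries follow the data, so chunks after the insertion point hash the same as before.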
License
Licensed under Apache 2.0 or MIT, at your option.
lib.rs:
Storage backend implementations for Hexz snapshots.
This crate provides concrete implementations of hexz_core::store::StorageBackend
for local files, HTTP/HTTPS, and S3-compatible object storage. It also exposes a
ParentLoader helper for opening thin-snapshot parent chains via FileBackend.
Dependencies
~20–41MB
~543K SLoC