4 releases (2 breaking)

Uses new Rust 2024

new 0.7.0	Feb 28, 2026
0.6.0	Feb 26, 2026
0.5.5	Feb 25, 2026
0.5.4	Feb 25, 2026

Apache-2.0

625KB
5K SLoC

Hexz

Deduplicated checkpoint storage for ML model weights. Pack safetensors and GGUF files into seekable .hxz archives that chunk at tensor boundaries and store fine-tunes as XOR deltas against their base.

Install

pip install hexz           # Python library
cargo install hexz-cli     # CLI tool

CLI

# Pack a safetensors or GGUF model
hexz store base-model.safetensors base-model.hxz

# Store a fine-tune — delta against the base (XOR compression)
hexz store finetuned.safetensors finetuned.hxz --base base-model.hxz

# Export back to safetensors
hexz extract finetuned.hxz finetuned-out.safetensors

# Extract a single tensor
hexz extract finetuned.hxz --tensor lm_head.weight

# Inspect: tensor list, shapes, storage stats
hexz inspect finetuned.hxz

# Compare two archives — which tensors changed
hexz diff base-model.hxz finetuned.hxz

# List archives in a directory with chain info and unique bytes on disk
hexz ls ./checkpoints/

# Sign and verify
hexz keygen
hexz sign   --key private.key finetuned.hxz
hexz verify --key public.key  finetuned.hxz

Run hexz --help or hexz COMMAND --help for full usage.

Python API

Import from safetensors (no PyTorch required)

import hexz.checkpoint as ckpt

# Convert a safetensors file — chunks at tensor boundaries
ckpt.convert("base-model.safetensors", "base-model.hxz")

# Store a fine-tune as XOR delta — only diffs stored
ckpt.convert("finetuned.safetensors", "finetuned.hxz", base="base-model.hxz")

# Export back to safetensors
ckpt.extract("finetuned.hxz", "finetuned-out.safetensors")

# Extract a single tensor to raw bytes
ckpt.extract("finetuned.hxz", tensor="lm_head.weight")

Save and load PyTorch state dicts

import hexz.checkpoint as ckpt

# Save — stores XOR delta against parent
ckpt.save(model.state_dict(), "finetuned.hxz", parent="base-model.hxz")

# Load all tensors
state = ckpt.load("finetuned.hxz", device="cuda")

# Load only what you need — reads only those blocks from disk or S3
state = ckpt.load("finetuned.hxz", keys=["lm_head.weight", "embed_tokens.weight"])

# Inspect names and shapes without loading data
manifest = ckpt.manifest("finetuned.hxz")

Random access over S3

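import hexz

# Placeholder byte range for illustration; substitute the tensor's real offset and size
tensor_offset = 0
length = 4096

# Only the requested blocks are downloaded
with hexz.open("s3://my-bucket/finetuned.hxz") as r:
    data = r.read(length, offset=tensor_offset)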
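To make the "only the requested blocks" idea concrete, here is a minimal sketch of how a byte range could be mapped to the blocks that cover it. The index layout (a sorted list of block start offsets) and the name `blocks_for_range` are assumptions made for illustration; they are not taken from the Hexz archive format or API.

```python
from bisect import bisect_right


def blocks_for_range(block_starts, total_len, offset, length):
    """Return indices of the blocks overlapping [offset, offset + length).

    block_starts is a hypothetical index: sorted start offsets, beginning at 0.
    """
    if length <= 0 or offset < 0 or offset + length > total_len:
        raise ValueError("requested range is empty or out of bounds")
    # Block containing the first requested byte.
    first = bisect_right(block_starts, offset) - 1
    # Block containing the last requested byte.
    last = bisect_right(block_starts, offset + length - 1) - 1
    return list(range(first, last + 1))


# Four variable-size blocks covering a 500-byte archive.
starts = [0, 100, 250, 400]
print(blocks_for_range(starts, 500, offset=120, length=200))  # [1, 2]
```

A reader built this way would fetch only blocks 1 and 2 (for example with ranged GET requests) and leave the rest of the object untouched.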

How delta storage works

When you pass --base base.hxz (CLI) or base= (Python):

The safetensors header tells Hexz exactly where each tensor lives, so no content-defined chunking (CDC) rolling-hash scan is needed
For each tensor present in both files, Hexz computes delta = base_tensor XOR fine_tensor
Fine-tuning perturbs weights across all layers without inserting or deleting bytes, so the delta is dense but low-magnitude, which zstd is expected to compress well
Tensors with no match in the parent (for example, new adapter layers) are stored as-is
Tensors byte-identical to the parent cost zero extra bytes thanks to BLAKE3 block dedup
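The first step above relies on the public safetensors layout: an 8-byte little-endian header length, a JSON header, then the raw tensor bytes, with each entry's `data_offsets` relative to the start of the data section. The sketch below builds a tiny blob in that layout and reads the tensor byte ranges back. The helper names are hypothetical and this is not Hexz's own parser.

```python
import json
import struct


def build_safetensors(tensors: dict[str, bytes]) -> bytes:
    """Build a minimal safetensors-style blob, storing each tensor as 1-D U8."""
    header, payload, cursor = {}, b"", 0
    for name, raw in tensors.items():
        header[name] = {
            "dtype": "U8",
            "shape": [len(raw)],
            "data_offsets": [cursor, cursor + len(raw)],
        }
        payload += raw
        cursor += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def tensor_ranges(blob: bytes) -> dict[str, tuple[int, int]]:
    """Return absolute (start, end) byte ranges for every tensor in the blob."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len])
    data_start = 8 + header_len  # offsets in the header are relative to this
    return {
        name: (data_start + info["data_offsets"][0],
               data_start + info["data_offsets"][1])
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry
    }


blob = build_safetensors({"a.weight": b"\x01\x02", "b.weight": b"\x03\x04\x05"})
for name, (start, end) in tensor_ranges(blob).items():
    print(name, start, end, blob[start:end])
```

Because the header already lists every boundary, a packer can cut chunks at exactly these offsets instead of scanning the payload for them.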

On load, Hexz reads the base tensor, decompresses the XOR delta, and XORs the two to reconstruct the fine-tuned tensor. This happens transparently inside ckpt.load().
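The store and load steps can be sketched as a round trip over raw bytes. This is a toy illustration under stated assumptions: `zlib` from the standard library stands in for zstd, the function names are hypothetical, and the input is synthetic rather than real model weights, so it says nothing about compression ratios on actual fine-tunes.

```python
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("XOR delta requires tensors of identical byte length")
    n = len(a)
    return (int.from_bytes(a, "little") ^ int.from_bytes(b, "little")).to_bytes(n, "little")


def encode_delta(base: bytes, fine: bytes) -> bytes:
    """Store side: XOR against the base, then compress the delta."""
    return zlib.compress(xor_bytes(base, fine))


def decode_delta(base: bytes, blob: bytes) -> bytes:
    """Load side: decompress the delta, then XOR with the base again."""
    return xor_bytes(base, zlib.decompress(blob))


# Synthetic "tensor": flip the low bit of every 8th byte of the base.
base = bytes(range(256)) * 16
fine = bytearray(base)
for i in range(0, len(fine), 8):
    fine[i] ^= 0x01
fine = bytes(fine)

blob = encode_delta(base, fine)
assert decode_delta(base, blob) == fine  # lossless round trip
print(len(fine), len(blob))
```

XOR is its own inverse, which is why the same operation serves both directions and reconstruction needs only the base tensor plus the stored delta.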

Note: XOR delta compression ratios on real model fine-tunes have not yet been benchmarked. The theoretical basis (ZipLLM, Hachiuma et al.) predicts significant savings; empirical numbers will be added once Phase 3 is complete. See ROADMAP.md.

Storage comparison

50 fine-tunes of a 7B model (~14 GB each), stored against the same base:

Approach	Storage
Raw file copies	~700 GB
git-lfs	~700 GB — tracks blobs, does not deduplicate content
DVC + S3	~700 GB — pointer tracking, not a content store
Hexz (XOR delta)	[UNTESTED — benchmark in progress]
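The arithmetic behind the table can be written out as follows. The first function reproduces the ~700 GB rows; the second is only a model for reasoning about a delta store, where `delta_ratio` is a free parameter that has not been measured, so no Hexz figure is implied. Both function names are hypothetical.

```python
def raw_copies_gb(n_finetunes: int, size_gb: float) -> float:
    """Storage when every fine-tune is kept as a full copy."""
    return n_finetunes * size_gb


def delta_store_gb(n_finetunes: int, size_gb: float,
                   base_gb: float, delta_ratio: float) -> float:
    """Base stored once, plus each fine-tune at delta_ratio of its full size.

    delta_ratio is an unmeasured assumption: 1.0 means no savings at all.
    """
    if not 0.0 <= delta_ratio <= 1.0:
        raise ValueError("delta_ratio must be between 0 and 1")
    return base_gb + n_finetunes * size_gb * delta_ratio


print(raw_copies_gb(50, 14))  # 700, matching the table rows above
```

Plugging in a measured ratio once benchmarks exist would give the missing row; until then the model only shows how total storage scales with that ratio.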

The CDC block dedup benchmark (validated) shows 92.4% deduplication on shifted data vs 0% for fixed-size blocks. See COMPETITIVE_COMPARISON.md for full benchmark details.
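To see why shifted data defeats fixed-size blocks but not content-defined chunking, here is a small self-contained sketch. It is not the Hexz chunker and does not reproduce the benchmark figure. It assumes a toy boundary rule (CRC32 of a 16-byte window, recomputed at each position rather than rolled) and uses SHA-256 from the standard library as a stand-in for BLAKE3.

```python
import hashlib
import random
import zlib

WINDOW = 16
MASK = 0x3F  # cut when the low 6 bits of the window hash are zero (~64-byte average)


def fixed_chunks(data: bytes, size: int = 64) -> list[bytes]:
    """Cut at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def cdc_chunks(data: bytes) -> list[bytes]:
    """Cut wherever the hash of the trailing window matches the mask."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def dedup_fraction(old: bytes, new: bytes, chunker) -> float:
    """Fraction of bytes in `new` whose chunk already exists in `old`."""
    seen = {hashlib.sha256(c).digest() for c in chunker(old)}
    shared = sum(len(c) for c in chunker(new) if hashlib.sha256(c).digest() in seen)
    return shared / len(new)


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20_000))
shifted = b"\x00" + original  # one inserted byte shifts everything after it

print("fixed-size:", dedup_fraction(original, shifted, fixed_chunks))
print("CDC:       ", dedup_fraction(original, shifted, cdc_chunks))
```

With fixed-size blocks, the single inserted byte changes the contents of every block after it. With content-defined boundaries, the cut points move along with the data, so only the chunk containing the insertion differs.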

License

Licensed under Apache 2.0 or MIT, at your option.

`lib.rs`:

Storage backend implementations for Hexz snapshots.

This crate provides concrete implementations of hexz_core::store::StorageBackend for local files, HTTP/HTTPS, and S3-compatible object storage. It also exposes a ParentLoader helper for opening thin-snapshot parent chains via FileBackend.
