6 releases
| Version | Released |
|---|---|
| 0.2.3 | Aug 18, 2025 |
| 0.2.2 | Aug 11, 2025 |
| 0.1.1 | Aug 10, 2025 |
A faster #[derive(Hash)] for Rust
TL;DR: #[derive(Hash)] hashes your struct fields and slice elements one by one, which is slow. This crate hashes the entire struct at once, which is much faster.
Limitations: The struct must be safe to view as a slice of bytes. This is enforced by requiring derived traits from either bytemuck or zerocopy, at your option.
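To make the field-by-field behaviour of #[derive(Hash)] concrete, here is a minimal sketch that uses only std. CountingHasher, Rgba and count_writes are hypothetical names made up for this illustration and are not part of this crate. The hasher computes nothing; it only records how often it is called.

```rust
use std::hash::{Hash, Hasher};

/// Illustration-only hasher: records how it is called, computes no real hash.
#[derive(Default)]
struct CountingHasher {
    calls: usize,
    bytes: usize,
}

impl Hasher for CountingHasher {
    fn write(&mut self, bytes: &[u8]) {
        // The default write_u8, write_u32, etc. all forward to this method.
        self.calls += 1;
        self.bytes += bytes.len();
    }

    fn finish(&self) -> u64 {
        0
    }
}

/// Hypothetical example type with four one-byte fields.
#[derive(Hash)]
struct Rgba {
    r: u8,
    g: u8,
    b: u8,
    a: u8,
}

/// Returns (number of write calls, total bytes written) for one value.
fn count_writes<T: Hash + ?Sized>(value: &T) -> (usize, usize) {
    let mut hasher = CountingHasher::default();
    value.hash(&mut hasher);
    (hasher.calls, hasher.bytes)
}

fn main() {
    let pixel = Rgba { r: 1, g: 2, b: 3, a: 4 };
    // The derived impl hashes each field separately: 4 writes of 1 byte each.
    let (calls, bytes) = count_writes(&pixel);
    println!("struct: {calls} writes, {bytes} bytes");

    // A slice of structs is hashed element by element, field by field,
    // while a slice of u8 is handed to the hasher in bulk.
    let pixels = vec![Rgba { r: 0, g: 0, b: 0, a: 0 }, Rgba { r: 9, g: 9, b: 9, a: 9 }];
    let raw = vec![0u8, 0, 0, 0, 9, 9, 9, 9];
    println!("slice of structs: {} writes", count_writes(pixels.as_slice()).0);
    println!("slice of u8:      {} writes", count_writes(raw.as_slice()).0);
}
```

The four data bytes of Rgba could in principle reach the hasher in a single call; the derived implementation spends four calls on them instead.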
Tell me more
This crate is inspired by the excellent blog post by @purplesyringa (who is not affiliated with this crate). Check it out for an in-depth exploration of the issues with #[derive(Hash)] and the Hash trait in general.
We achieve better performance than #[derive(Hash)] by:
- Hashing the entire struct at once (as opposed to each field individually)
- Dispatching to a sequence of primitive writes such as hasher.write_u64, which is determined at compile time and padded where necessary (as opposed to using the slow variable-length codepath in the hashers)
- Replicating the optimization std performs for u8 and other primitive types in slices, so that e.g. &[MyType(u8)] can be hashed as fast as &[u8]. This applies to structs with multiple fields as well.
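The first two points above can be sketched by hand. This is not the code the crate generates: the crate views the struct's bytes directly, guarded by the bytemuck or zerocopy traits, whereas this sketch packs the fields manually to stay free of unsafe. PointDerived, PointPacked, CountingHasher and write_calls are hypothetical names used only for this illustration.

```rust
use std::hash::{Hash, Hasher};

/// Illustration-only hasher that counts write calls.
#[derive(Default)]
struct CountingHasher {
    calls: usize,
}

impl Hasher for CountingHasher {
    fn write(&mut self, _bytes: &[u8]) {
        self.calls += 1;
    }

    fn finish(&self) -> u64 {
        0
    }
}

/// Baseline: the derived impl hashes x, then y.
#[derive(Hash)]
struct PointDerived {
    x: u32,
    y: u32,
}

/// Hand-written stand-in for "hash the whole struct at once".
struct PointPacked {
    x: u32,
    y: u32,
}

impl Hash for PointPacked {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Two u32 fields occupy exactly 8 bytes with no padding,
        // so a single fixed-size write covers the entire struct.
        let packed = (u64::from(self.x) << 32) | u64::from(self.y);
        state.write_u64(packed);
    }
}

/// Number of times the hasher was invoked for one value.
fn write_calls<T: Hash>(value: &T) -> usize {
    let mut hasher = CountingHasher::default();
    value.hash(&mut hasher);
    hasher.calls
}

fn main() {
    let derived = write_calls(&PointDerived { x: 1, y: 2 });
    let packed = write_calls(&PointPacked { x: 1, y: 2 });
    println!("derived: {derived} calls, packed: {packed} calls");
}
```

Any hand-written Hash impl like this must stay consistent with the type's Eq: values that compare equal have to produce the same writes. Here that holds because the packed value is determined entirely by the two fields.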
Alternatives
The bytemuck and zerocopy crates provide their own implementations of this idea as #[derive(ByteHash)]. They employ fewer micro-optimizations, but their performance when used in a HashSet is mostly identical to that of this crate, and the difference can go either way depending on the chosen hash function.
Therefore, it is recommended to try #[derive(ByteHash)] first to avoid additional dependencies, and only switch to this crate if it improves your project's benchmarks.
Usage
To use the crate with zerocopy (recommended), see the docs on derive_hash_fast_zerocopy!
To use the crate with bytemuck (which puts more restrictions on your type), see the docs on derive_hash_fast_bytemuck!
Benchmarks
Clone the repository and run cargo bench.
I've published the raw results from a run here, but nothing beats benchmarks on your hardware and on your version of the Rust compiler.
FAQ
Is this a hash function?
No. It's a more efficient way to feed data to your chosen hash function. If you care about performance, you should use a fast hash function in conjunction with this crate, since std::hash::DefaultHasher is DoS-resistant but slow.
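As a rough sketch of what "your chosen hash function" means in practice, the example below plugs a non-default hasher into a HashSet through std's BuildHasherDefault. The Fnv1a type is a toy FNV-1a implementation written out here only to keep the example self-contained; it is an assumption of this sketch, not a recommendation and not part of this crate. In a real project you would more likely pull a fast hasher from an existing crate. Key, FastBuild and build_set are likewise hypothetical names.

```rust
use std::collections::HashSet;
use std::hash::{BuildHasherDefault, Hasher};

/// Toy 64-bit FNV-1a hasher, a stand-in for whichever fast hasher you pick.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis.
        Fnv1a(0xcbf2_9ce4_8422_2325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            // FNV 64-bit prime.
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Builds a fresh Fnv1a for every value that gets hashed.
type FastBuild = BuildHasherDefault<Fnv1a>;

/// Hypothetical key type. Any Hash impl feeds the same hasher,
/// whether it is derived by std or generated some other way.
#[derive(Hash, PartialEq, Eq)]
struct Key {
    id: u32,
    kind: u32,
}

fn build_set() -> HashSet<Key, FastBuild> {
    let mut set = HashSet::default();
    set.insert(Key { id: 1, kind: 7 });
    set.insert(Key { id: 2, kind: 7 });
    set.insert(Key { id: 1, kind: 7 }); // duplicate, not stored twice
    set
}

fn main() {
    let set = build_set();
    println!("{} distinct keys", set.len());
}
```

The hash function (the Hasher) and the way data is fed to it (the Hash impl on Key) are independent choices. This crate targets the second one.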
Is this ALWAYS faster?
Almost. In my benchmarks this approach is faster than #[derive(Hash)] across the board, with one exception. If you are hashing a very short slice (64 bits or less) and you're using a hash function with a fast fixed-size path and a slow variable-sized path (pretty much only rustc_hash::FxHasher), this approach may be slower. This crate is still dramatically faster for structs and longer slices even with rustc_hash::FxHasher. Whether the crate helps or hinders in that case depends on how many short slices there are in the data you're hashing.
Does this work in #![no_std]?
Yes. Or it should, anyway. Please open an issue if it doesn't.
Why not improve the Rust compiler?
Right now the pass that expands the #[derive(Hash)] macro runs before the compiler knows the properties of the type that this optimization requires. Doing this in the compiler would therefore require significant architectural changes.
Hopefully that will happen sooner or later, but for now there's this crate.