Crate for storing many immutable strings in one buffer with minimal resource usage.
Designed to minimize:
- memory overhead (single byte buffer + compact offsets)
- CPU overhead (cheap index-based lookups)
- binary size overhead (no panics on insertion, avoiding backtrace overhead)
See full crate docs for more details. Below is just a sneak peek.
For a companion blog post with additional design insights and real-world context, see *Sometimes I Need to Store a Lot of Strings Efficiently, So I Built lite-strtab*.
```toml
[dependencies]
lite-strtab = "0.1.0"
```

```rust
use lite_strtab::StringTableBuilder;

let mut builder = StringTableBuilder::new();
let id = builder.try_push("hello").unwrap();
let table = builder.build();
assert_eq!(table.get(id), Some("hello"));
```
Measured on Linux with glibc malloc using `malloc_usable_size` (capturing allocator block sizes, including alignment overhead).
How to read these tables:
- Total = Heap allocations + Distributed fields + One-time metadata
- Distributed fields = string references distributed across fields/structs (e.g. `String`, `Box<str>`, `StringId<u16>`); in these results, lite-strtab uses `StringId<u16>`
This keeps the comparison close to real application layouts while still keeping the baseline rows readable.
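As a worked example of the formula, take the lite-strtab row of the YakuzaKiwami table below. The one-time metadata figure is not stated in the table; it falls out of the other three numbers:

```rust
fn main() {
    // Total = Heap allocations + Distributed fields + One-time metadata
    let total = 266_068u64;       // lite-strtab total for YakuzaKiwami
    let heap = 256_736u64;        // heap allocations (byte buffer + offsets)
    let distributed = 9_300u64;   // 4,650 strings * 2-byte StringId<u16>
    let one_time = total - heap - distributed;
    assert_eq!(one_time, 32);     // derived, not listed in the table
    println!("one-time metadata: {one_time} B");
}
```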
YakuzaKiwami (4,650 game file paths, 238,109 bytes):
| Representation | Total | Heap allocations | Distributed fields | vs lite-strtab |
|---|---|---|---|---|
| lite-strtab | 266068 (259.83 KiB) | 256736 (250.72 KiB) | 9300 (9.08 KiB) | 1.00x |
| lite-strtab (null-padded) | 270708 (264.36 KiB) | 261376 (255.25 KiB) | 9300 (9.08 KiB) | 1.02x |
| `Vec<String>` | 384240 (375.23 KiB) | 272640 (266.25 KiB) | 111600 (108.98 KiB) | 1.44x |
| `Box<[Box<str>]>` | 346928 (338.80 KiB) | 272528 (266.14 KiB) | 74400 (72.66 KiB) | 1.30x |
EnvKeys (109 environment variable names, 1,795 bytes):
| Representation | Total | Heap allocations | Distributed fields | vs lite-strtab |
|---|---|---|---|---|
| lite-strtab | 2490 (2.43 KiB) | 2240 (2.19 KiB) | 218 B | 1.00x |
| lite-strtab (null-padded) | 2602 (2.54 KiB) | 2352 (2.30 KiB) | 218 B | 1.04x |
| `Vec<String>` | 5440 (5.31 KiB) | 2824 (2.76 KiB) | 2616 (2.55 KiB) | 2.18x |
| `Box<[Box<str>]>` | 4472 (4.37 KiB) | 2728 (2.66 KiB) | 1744 (1.70 KiB) | 1.80x |
ApiUrls (90 API endpoint URLs, 3,970 bytes):
| Representation | Total | Heap allocations | Distributed fields | vs lite-strtab |
|---|---|---|---|---|
| lite-strtab | 4564 (4.46 KiB) | 4352 (4.25 KiB) | 180 B | 1.00x |
| lite-strtab (null-padded) | 4660 (4.55 KiB) | 4448 (4.34 KiB) | 180 B | 1.02x |
| `Vec<String>` | 6896 (6.73 KiB) | 4736 (4.62 KiB) | 2160 (2.11 KiB) | 1.51x |
| `Box<[Box<str>]>` | 6112 (5.97 KiB) | 4672 (4.56 KiB) | 1440 (1.41 KiB) | 1.34x |
For complete comparisons across all datasets and full breakdown details, see full crate docs.
Summary: lite-strtab uses 20-55% less memory than the standard representations. Generally, the smaller the strings, the larger the savings.
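The "Distributed fields" columns follow directly from per-string field sizes on a 64-bit target. A quick std-only check (no lite-strtab dependency; `StringId<u16>` is assumed to be 2 bytes, as implied above):

```rust
use std::mem::size_of;

fn main() {
    let n = 4_650u64; // YakuzaKiwami string count
    // Vec<String>: each element is a 24-byte String (ptr, cap, len).
    assert_eq!(size_of::<String>(), 24);
    assert_eq!(n * size_of::<String>() as u64, 111_600);
    // Box<[Box<str>]>: each element is a 16-byte fat pointer (ptr, len).
    assert_eq!(size_of::<Box<str>>(), 16);
    assert_eq!(n * size_of::<Box<str>>() as u64, 74_400);
    // lite-strtab: a 2-byte StringId<u16> per string.
    assert_eq!(n * size_of::<u16>() as u64, 9_300);
    println!("per-string field sizes match the table");
}
```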
In this benchmark we sequentially read all of the 4,650 strings (238,109 bytes)
and hash them with AHash to simulate a realistic workload that involves reading the strings.
| Access | Representation | avg time (µs) | avg thrpt (GiB/s) |
|---|---|---|---|
| `get` | `Vec<String>` | 13.561 | 16.352 |
| `get` | `Box<[Box<str>]>` | 13.002 | 17.056 |
| `get` | lite-strtab | 13.368 | 16.589 |
| `get` | lite-strtab (null-padded) | 13.714 | 16.171 |
| `get_unchecked` | `Vec<String>` | 13.448 | 16.490 |
| `get_unchecked` | `Box<[Box<str>]>` | 12.812 | 17.308 |
| `get_unchecked` | lite-strtab | 13.207 | 16.790 |
| `get_unchecked` | lite-strtab (null-padded) | 13.828 | 16.037 |
Reproduce with `cargo bench --bench my_benchmark`. Linux glibc. cargo 1.95.0-nightly (fe2f314ae 2026-01-30).
In summary, actual read performance on real data is within the margin of error.
The overhead of looking up a string by ID is negligible; any difference you see is mostly run-to-run variation.
I've also experimented with data alignment, aligning to `usize` boundaries to avoid reads across word boundaries, but saw no notable difference in practice.
There may be a difference for random access patterns; only sequential access is benchmarked here.
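The workload shape can be sketched with std only; synthetic strings and `DefaultHasher` stand in here for the real dataset and AHash:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sequentially read every string and hash it, mirroring the benchmark's
// read-heavy workload (DefaultHasher stands in for AHash).
fn hash_all(strings: &[String]) -> u64 {
    let mut acc = 0u64;
    for s in strings {
        let mut h = DefaultHasher::new();
        s.hash(&mut h);
        acc = acc.wrapping_add(h.finish());
    }
    acc
}

fn main() {
    // Synthetic stand-in for the 4,650 YakuzaKiwami file paths.
    let strings: Vec<String> = (0..1_000)
        .map(|i| format!("path/to/file_{i}.dat"))
        .collect();
    let a = hash_all(&strings);
    assert_eq!(a, hash_all(&strings)); // deterministic within a process
    println!("checksum: {a:x}");
}
```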
Instruction count to get &str, x86_64, release mode:
| Method | Instructions | Access Pattern |
|---|---|---|
| `lite-strtab::get` | ~12 | bounds check → load 2 offsets → compute range → add base |
| `lite-strtab::get_unchecked` | ~7 | load 2 offsets → compute range → add base |
| `Vec<String>::get` | ~8 | bounds check → load ptr from heap → deref for (ptr, len) |
| `Vec<String>::get_unchecked` | ~5 | load ptr from heap → deref for (ptr, len) |
| `Box<[Box<str>]>::get` | ~7 | bounds check → load ptr → deref for (ptr, len) |
| `Box<[Box<str>]>::get_unchecked` | ~4 | load ptr → deref for (ptr, len) |
The overhead of processing the data largely dominates, so the difference here is negligible.
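The "bounds check → load 2 offsets → compute range → add base" access pattern can be sketched as a safe two-offset lookup (a minimal illustration assuming a concatenated buffer plus n+1 offsets, not the crate's actual code):

```rust
// One shared buffer holds all strings back to back; string i spans
// offsets[i]..offsets[i + 1]. `get` bounds-checks both offset loads.
fn get<'a>(buffer: &'a str, offsets: &[u32], id: usize) -> Option<&'a str> {
    let start = *offsets.get(id)? as usize;     // load offset 1 (checked)
    let end = *offsets.get(id + 1)? as usize;   // load offset 2 (checked)
    buffer.get(start..end)                      // compute range + add base
}

fn main() {
    let buffer = "helloworld!";
    let offsets = [0u32, 5, 10, 11]; // n + 1 offsets for n = 3 strings
    assert_eq!(get(buffer, &offsets, 0), Some("hello"));
    assert_eq!(get(buffer, &offsets, 1), Some("world"));
    assert_eq!(get(buffer, &offsets, 2), Some("!"));
    assert_eq!(get(buffer, &offsets, 3), None); // out of bounds
}
```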
MIT