19 releases (6 breaking)
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.7.0 (new) | Mar 25, 2026 |
| 0.6.2 | Mar 17, 2026 |
| 0.5.3 | Mar 1, 2026 |
| 0.5.2 | Feb 15, 2026 |
| 0.1.7 | Aug 28, 2025 |
bufjson
A low-level, low-allocation, low-copy JSON tokenizer and parser geared toward efficient stream processing at scale.
Get started
Add bufjson to your Cargo.toml or run $ cargo add bufjson.
Here's a simple example that parses a JSON text for syntax validity and prints it with the insignificant whitespace stripped out.
    use bufjson::{lexical::{Token, fixed::FixedAnalyzer}, syntax::Parser};

    fn strip_whitespace(json_text: &str) {
        let mut parser = Parser::new(FixedAnalyzer::new(json_text.as_bytes()));
        loop {
            // Fetch the next token, skipping insignificant whitespace.
            match parser.next_non_white() {
                Token::Eof => break,
                Token::Err => panic!("{}", parser.err()),
                // Print the token text exactly as it appears in the input.
                _ => print!("{}", parser.content().literal()),
            };
        }
    }

    fn main() {
        // Prints `{"foo":"bar","baz":[null,123]}`
        strip_whitespace(r#"{ "foo": "bar", "baz": [null, 123] }"#);
    }
Architecture
The bufjson crate provides a stream-oriented JSON tokenizer through the lexical::Analyzer trait,
with these implementations:
- FixedAnalyzer tokenizes fixed-size buffers;
- ReadAnalyzer tokenizes sync input streams implementing io::Read; and
- AsyncAnalyzer tokenizes async streams that yield byte buffers (COMING SOON-ISH).
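The one-trait, several-implementations layering can be illustrated with a toy, std-only sketch. Every name in it (`TokenSource`, `SliceSource`, `join_tokens`) is hypothetical and invented for illustration; none of it is bufjson's API, and the tokenizer is deliberately naive and performs no validation.

```rust
// Hypothetical names throughout: this is NOT bufjson's API.

/// A minimal token-source abstraction, standing in for a lexical analyzer trait.
trait TokenSource {
    /// Returns the bytes of the next non-whitespace token, or `None` at end of input.
    fn next_token(&mut self) -> Option<&[u8]>;
}

/// A toy source that tokenizes a fixed in-memory buffer.
struct SliceSource<'a> {
    buf: &'a [u8],
    pos: usize,
}

fn is_ws(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r')
}

fn is_delim(b: u8) -> bool {
    matches!(b, b'{' | b'}' | b'[' | b']' | b',' | b':' | b'"')
}

impl<'a> TokenSource for SliceSource<'a> {
    fn next_token(&mut self) -> Option<&[u8]> {
        // Skip insignificant whitespace between tokens.
        while self.pos < self.buf.len() && is_ws(self.buf[self.pos]) {
            self.pos += 1;
        }
        let start = self.pos;
        let first = *self.buf.get(start)?;
        self.pos += 1;
        if first == b'"' {
            // Consume through the closing quote, stepping over backslash escapes.
            while self.pos < self.buf.len() {
                let b = self.buf[self.pos];
                self.pos += 1;
                if b == b'\\' {
                    self.pos += 1;
                } else if b == b'"' {
                    break;
                }
            }
            self.pos = self.pos.min(self.buf.len());
        } else if !is_delim(first) {
            // A literal or number: runs until whitespace or punctuation.
            while self.pos < self.buf.len()
                && !is_ws(self.buf[self.pos])
                && !is_delim(self.buf[self.pos])
            {
                self.pos += 1;
            }
        }
        Some(&self.buf[start..self.pos])
    }
}

/// A consumer written against the trait, so it works with any token source.
fn join_tokens<S: TokenSource>(mut src: S) -> String {
    let mut out = String::new();
    while let Some(tok) = src.next_token() {
        out.push_str(&String::from_utf8_lossy(tok));
    }
    out
}
```

Because `join_tokens` depends only on the trait, a second source backed by a reader could be swapped in without touching the consumer, which is the property the layering is meant to provide.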
The remainder of the library builds on the lexical analyzer trait.
- The syntax module provides concrete stream-oriented parser types that can wrap any lexical analyzer.
- The pointer module enables fast stream-oriented evaluation of JSON Pointers.
Refer to the API reference docs for more detail.
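For readers unfamiliar with JSON Pointers, the sketch below shows the RFC 6901 syntax that pointer evaluation works on: a pointer is a sequence of `/`-prefixed reference tokens in which `~1` stands for `/` and `~0` stands for `~`. The function name `pointer_tokens` is a hypothetical helper written for this illustration and is not part of bufjson's pointer module.

```rust
/// Splits an RFC 6901 JSON Pointer into its unescaped reference tokens.
/// Returns `None` if the pointer is non-empty and does not start with `/`.
/// Sketch limitation: invalid escapes such as `~2` are passed through unchanged.
fn pointer_tokens(pointer: &str) -> Option<Vec<String>> {
    if pointer.is_empty() {
        // The empty pointer refers to the whole document.
        return Some(Vec::new());
    }
    let rest = pointer.strip_prefix('/')?;
    Some(
        rest.split('/')
            // RFC 6901 order: decode `~1` first, then `~0`, so `~01` yields `~1`.
            .map(|token| token.replace("~1", "/").replace("~0", "~"))
            .collect(),
    )
}
```

Against the document `{"foo":"bar","baz":[null,123]}`, the pointer `/baz/1` splits into the tokens `baz` and `1`, which select the value `123`.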
When to use
Choose bufjson when you need to:
- Control and limit allocations or copying.
- Process JSON text larger than available memory.
- Extract specific values without parsing an entire JSON text.
- Edit a stream of JSON tokens (add/remove/change values in the stream).
- Access token content exactly as it appears in the JSON text (e.g. without unescaping strings).
- Protect against malicious or degenerate inputs.
- Implement custom parsing with precise behavior control.
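Processing JSON text larger than available memory comes down to keeping a bounded buffer plus a small amount of state between reads. The std-only sketch below illustrates that idea by stripping insignificant whitespace from a stream. The function name `strip_whitespace_stream` and the 8 KiB buffer size are illustrative assumptions, this is not how bufjson is implemented, and the sketch performs no syntax validation.

```rust
use std::io::{self, Read, Write};

/// Copies JSON text from `input` to `output`, dropping whitespace outside strings.
/// Memory use is one fixed-size buffer regardless of input length.
fn strip_whitespace_stream<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut buf = [0u8; 8192];
    // State carried across reads, so strings may span chunk boundaries.
    let mut in_string = false;
    let mut escaped = false;
    loop {
        let n = match input.read(&mut buf) {
            Ok(0) => return output.flush(),
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Compact the kept bytes to the front of the buffer in place.
        let mut kept = 0;
        for i in 0..n {
            let b = buf[i];
            let keep = if in_string {
                if escaped {
                    escaped = false;
                } else if b == b'\\' {
                    escaped = true;
                } else if b == b'"' {
                    in_string = false;
                }
                true
            } else {
                if b == b'"' {
                    in_string = true;
                }
                !matches!(b, b' ' | b'\t' | b'\n' | b'\r')
            };
            if keep {
                buf[kept] = b;
                kept += 1;
            }
        }
        output.write_all(&buf[..kept])?;
    }
}
```

Any `Read` and `Write` pair works here, for example a file and standard output, or a byte slice and a `Vec<u8>` in tests.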
Other libraries are more suitable for:
- Deserializing JSON text straight into in-memory data structures (use serde_json or simd-json).
- Serializing in-memory data structures to JSON (use serde_json).
- Writing JSON text in a stream-oriented manner (use serde_json or json-writer).
Performance
- Zero-copy string processing where possible.
- Minimal allocations, which are explicit and optional wherever possible.
- Streaming design handles arbitrarily long JSON text without loading it all into memory.
- Suitable for high-throughput applications.
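One common way to get zero-copy behavior "where possible" in Rust is to return `std::borrow::Cow`, borrowing the input when no transformation is needed and allocating only when it is. The sketch below applies that pattern to unescaping a JSON string body. The function `unescape` is a hypothetical illustration, not bufjson's API, and it handles only the single-character escapes; `\uXXXX` and invalid escapes yield `None`.

```rust
use std::borrow::Cow;

/// Unescapes the body of a JSON string (the text between the quotes).
/// Borrows the input when it contains no escapes and allocates only otherwise.
/// Sketch limitation: `\uXXXX` escapes and invalid escapes yield `None`.
fn unescape(body: &str) -> Option<Cow<'_, str>> {
    if !body.contains('\\') {
        // Fast path: nothing to rewrite, so hand back the original slice.
        return Some(Cow::Borrowed(body));
    }
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A trailing backslash has no escape character, so `?` returns `None`.
        out.push(match chars.next()? {
            '"' => '"',
            '\\' => '\\',
            '/' => '/',
            'b' => '\u{0008}',
            'f' => '\u{000C}',
            'n' => '\n',
            'r' => '\r',
            't' => '\t',
            _ => return None,
        });
    }
    Some(Cow::Owned(out))
}
```

The caller sees a single string-like type either way, while the allocation stays visible in the `Cow::Owned` variant.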
Benchmarks
The table below shows JSON text throughput benchmark results.[1]

| Component | .content() fetched | Throughput |
|---|---|---|
| FixedAnalyzer | Never | 1 GiB/s |
| FixedAnalyzer | Always | 1 GiB/s |
| ReadAnalyzer [2] | Never | 880 MiB/s |
| ReadAnalyzer [2] | Always | 690 MiB/s |
| Parser + FixedAnalyzer | Never | 890 MiB/s |
| Parser + FixedAnalyzer | Always | 850 MiB/s |

[2] ReadAnalyzer is fed with an in-memory std::io::Read implementation (&[u8]) to eliminate the confounding effect of I/O.
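To relate the units in the table, throughput is bytes processed divided by elapsed time, with 1 MiB = 1,048,576 bytes and 1 GiB/s = 1024 MiB/s. The sketch below shows that arithmetic with a simple wall-clock timer. Both function names are hypothetical, and this is not the harness that produced the figures above.

```rust
use std::time::{Duration, Instant};

/// Converts a byte count and an elapsed time into MiB/s.
fn mib_per_sec(bytes: u64, elapsed: Duration) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / elapsed.as_secs_f64()
}

/// Times one pass of `work` over `input` and reports its throughput in MiB/s.
/// A single wall-clock sample is noisy; real benchmarks repeat and aggregate runs.
fn measure_mib_per_sec<F: FnOnce(&[u8])>(input: &[u8], work: F) -> f64 {
    let start = Instant::now();
    work(input);
    mib_per_sec(input.len() as u64, start.elapsed())
}
```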
License
Licensed under either of Apache License, Version 2.0 or MIT license at your option.

Any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.