#json #zero-copy #performance #low-allocation

bufjson

No frills, low-alloc, low-copy JSON lexer/parser for fast stream-oriented parsing

19 releases (6 breaking)

Uses new Rust 2024

new 0.7.0 Mar 25, 2026
0.6.2 Mar 17, 2026
0.5.3 Mar 1, 2026
0.5.2 Feb 15, 2026
0.1.7 Aug 28, 2025

#270 in Web programming

MIT/Apache

1MB
15K SLoC

bufjson is a low-level, low-allocation, low-copy JSON tokenizer and parser geared toward efficient stream processing at scale.


Get started

Add bufjson to your Cargo.toml or run $ cargo add bufjson.

Here's a simple example that parses a JSON text for syntax validity and prints it with the insignificant whitespace stripped out.

use bufjson::{lexical::{Token, fixed::FixedAnalyzer}, syntax::Parser};

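fn strip_whitespace(json_text: &str) {
    // Wrap a fixed-buffer lexical analyzer in a syntax-checking parser.
    let mut parser = Parser::new(FixedAnalyzer::new(json_text.as_bytes()));
    loop {
        // Advance to the next token that is not insignificant whitespace.
        match parser.next_non_white() {
            // End of input: the whole text was syntactically valid.
            Token::Eof => break,
            // Lexical or syntax error: report it and abort.
            Token::Err => panic!("{}", parser.err()),
            // Any other token: print its text as it appears in the input.
            _ => print!("{}", parser.content().literal()),
        };
    }
}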

fn main() {
    // Prints `{"foo":"bar","baz":[null,123]}`
    strip_whitespace(r#"{ "foo": "bar", "baz": [null, 123] }"#);
}

Architecture

The bufjson crate provides a stream-oriented JSON tokenizer through the lexical::Analyzer trait, with these implementations:

  • FixedAnalyzer tokenizes fixed-size buffers;
  • ReadAnalyzer tokenizes sync input streams implementing io::Read; and
  • AsyncAnalyzer tokenizes async streams that yield byte buffers (COMING SOON-ISH).

The remainder of the library builds on the lexical analyzer trait.

  • The syntax module provides concrete stream-oriented parser types that can wrap any lexical analyzer.
  • The pointer module enables fast stream-oriented evaluation of JSON Pointers.
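
For readers unfamiliar with JSON Pointer syntax (RFC 6901), the following is a minimal standalone sketch of how a pointer string breaks down into reference tokens, including the `~0` and `~1` escapes. It uses only the standard library; `decode_pointer` and `decode_token` are hypothetical helper names chosen for illustration and are not part of the bufjson API, whose pointer module may work quite differently.

```rust
/// Splits an RFC 6901 JSON Pointer into decoded reference tokens.
/// Returns `None` if a non-empty pointer does not start with `/`
/// or if it contains an invalid `~` escape.
/// (Hypothetical helper for illustration; not bufjson's API.)
fn decode_pointer(pointer: &str) -> Option<Vec<String>> {
    if pointer.is_empty() {
        // The empty pointer refers to the whole document.
        return Some(Vec::new());
    }
    // Every reference token is introduced by a `/`.
    let rest = pointer.strip_prefix('/')?;
    rest.split('/').map(decode_token).collect()
}

/// Decodes one reference token: `~1` becomes `/` and `~0` becomes `~`.
fn decode_token(raw: &str) -> Option<String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c == '~' {
            // A `~` must be followed by `0` or `1`.
            match chars.next()? {
                '0' => out.push('~'),
                '1' => out.push('/'),
                _ => return None,
            }
        } else {
            out.push(c);
        }
    }
    Some(out)
}

fn main() {
    // Points at the second element of the "baz" array in
    // `{"foo":"bar","baz":[null,123]}`.
    let tokens = decode_pointer("/baz/1").expect("valid pointer");
    println!("{:?}", tokens); // prints ["baz", "1"]
}
```

Note that `1` stays a plain string at this stage. Whether a token names an object member or an array index is decided only when the pointer is evaluated against an actual JSON value.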

Refer to the API reference docs for more detail.

When to use

Choose bufjson when you need to:

  • Control and limit allocations or copying.
  • Process JSON text larger than available memory.
  • Extract specific values without parsing an entire JSON text.
  • Edit a stream of JSON tokens (add/remove/change values in the stream).
  • Access token content exactly as it appears in the JSON text (e.g. without unescaping strings).
  • Protect against malicious or degenerate inputs.
  • Implement custom parsing with precise behavior control.
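
To make two of the points above concrete (bounded memory on large inputs and protection against degenerate inputs), here is a rough standalone sketch of a nesting-depth guard that reads from any `std::io::Read` in fixed-size chunks. It is written against the standard library only and does not use bufjson; `max_nesting_depth` and its `limit` parameter are hypothetical names. It also does not validate JSON syntax, which is exactly the kind of work a real tokenizer and parser would do for you.

```rust
use std::io::{self, Read};

/// Scans JSON text from `reader` in 4 KiB chunks and returns the deepest
/// array/object nesting seen, or an error as soon as `limit` is exceeded.
/// Memory use is constant regardless of input size.
/// (Hypothetical helper for illustration; it does NOT validate JSON.)
fn max_nesting_depth<R: Read>(mut reader: R, limit: usize) -> io::Result<usize> {
    let mut buf = [0u8; 4096];
    let (mut depth, mut max_depth) = (0usize, 0usize);
    // String state is kept outside the read loop so it survives chunk boundaries.
    let (mut in_string, mut escaped) = (false, false);
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok(max_depth);
        }
        for &b in &buf[..n] {
            if in_string {
                // Brackets inside strings must not affect the depth.
                if escaped {
                    escaped = false;
                } else if b == b'\\' {
                    escaped = true;
                } else if b == b'"' {
                    in_string = false;
                }
                continue;
            }
            match b {
                b'"' => in_string = true,
                b'{' | b'[' => {
                    depth += 1;
                    if depth > limit {
                        return Err(io::Error::new(
                            io::ErrorKind::InvalidData,
                            "nesting too deep",
                        ));
                    }
                    max_depth = max_depth.max(depth);
                }
                b'}' | b']' => depth = depth.saturating_sub(1),
                _ => {}
            }
        }
    }
}

fn main() {
    let doc = r#"{ "foo": "bar", "baz": [null, 123] }"#;
    // `&[u8]` implements `Read`, so it stands in for a file or socket here.
    match max_nesting_depth(doc.as_bytes(), 64) {
        Ok(depth) => println!("max depth: {}", depth), // prints max depth: 2
        Err(e) => eprintln!("rejected: {}", e),
    }
}
```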

Other libraries are more suitable for:

  • Deserializing JSON text straight into in-memory data structures (use serde_json or simd-json).
  • Serializing in-memory data structures to JSON (use serde_json).
  • Writing JSON text in a stream-oriented manner (use serde_json or json-writer).

Performance

  • Zero-copy string processing where possible.
  • Minimal allocations, which are explicit and optional wherever possible.
  • Streaming design handles arbitrarily long JSON text without loading it all into memory.
  • Suitable for high-throughput applications.
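
As a general illustration of what "zero-copy where possible" with optional allocation can look like in Rust, the sketch below resolves escapes in the body of a JSON string and allocates only when an escape is actually present. This is a common standard-library pattern built on `Cow`, not a description of bufjson's internals; `unescape` is a hypothetical helper, and for brevity it rejects `\uXXXX` sequences instead of decoding them.

```rust
use std::borrow::Cow;

/// Resolves simple escapes in the body of a JSON string (without the quotes).
/// Borrows the input unchanged when it contains no backslash, so the common
/// case costs no allocation and no copy.
/// Returns `None` for escapes this sketch does not support, including `\u`.
/// (Hypothetical helper for illustration; not bufjson's API.)
fn unescape(raw: &str) -> Option<Cow<'_, str>> {
    if !raw.contains('\\') {
        // Fast path: hand back a view of the original bytes.
        return Some(Cow::Borrowed(raw));
    }
    // Slow path: an escape is present, so build an owned copy.
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next()? {
            '"' => out.push('"'),
            '\\' => out.push('\\'),
            '/' => out.push('/'),
            'b' => out.push('\u{0008}'),
            'f' => out.push('\u{000C}'),
            'n' => out.push('\n'),
            'r' => out.push('\r'),
            't' => out.push('\t'),
            _ => return None, // `\uXXXX` and invalid escapes are out of scope here
        }
    }
    Some(Cow::Owned(out))
}

fn main() {
    for raw in ["bar", r"line\nbreak"] {
        match unescape(raw) {
            Some(Cow::Borrowed(s)) => println!("borrowed: {:?}", s),
            Some(Cow::Owned(s)) => println!("allocated: {:?}", s),
            None => println!("unsupported escape in {:?}", raw),
        }
    }
}
```

The caller decides whether to pay for the owned copy at all: code that only needs the text as it appears in the input can skip the `unescape` call entirely.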

Benchmarks

The table below shows JSON text throughput benchmark results [1].

Component                 .content() fetched   Throughput
FixedAnalyzer             Never                1 GiB/s
FixedAnalyzer             Always               1 GiB/s
ReadAnalyzer [2]          Never                880 MiB/s
ReadAnalyzer [2]          Always               690 MiB/s
Parser + FixedAnalyzer    Never                890 MiB/s
Parser + FixedAnalyzer    Always               850 MiB/s

[1] Running on Ubuntu 22 with an Intel Core i7 1.8 GHz with four physical cores.
[2] ReadAnalyzer is fed with an in-memory std::io::Read implementation (&[u8]) to eliminate the confounding effect of I/O.
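
For readers who want to relate these figures to their own measurements, here is a small standalone sketch of how a throughput number in MiB/s can be derived from a byte count and an elapsed time. It is not the harness used to produce the results above: `mib_per_sec` and `count_structural` are hypothetical names, and the measured workload is a trivial stand-in byte scan, not bufjson. Like the ReadAnalyzer benchmark, it works on in-memory input so that I/O does not distort the result.

```rust
use std::time::{Duration, Instant};

/// Converts a byte count and elapsed time into MiB/s (1 MiB = 1024 * 1024 bytes).
fn mib_per_sec(bytes: usize, elapsed: Duration) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / elapsed.as_secs_f64()
}

/// Stand-in workload (NOT bufjson): counts bytes that look like JSON
/// structural characters, without tracking whether they are inside strings.
fn count_structural(input: &[u8]) -> usize {
    input
        .iter()
        .filter(|b| matches!(**b, b'{' | b'}' | b'[' | b']' | b':' | b','))
        .count()
}

fn main() {
    // Build 16 MiB of in-memory input by repeating a small document.
    let doc = br#"{ "foo": "bar", "baz": [null, 123] }"#;
    let input: Vec<u8> = doc.iter().copied().cycle().take(16 * 1024 * 1024).collect();

    let start = Instant::now();
    let n = count_structural(&input);
    let elapsed = start.elapsed();

    println!(
        "{} structural bytes, {:.0} MiB/s",
        n,
        mib_per_sec(input.len(), elapsed)
    );
}
```

A single timed pass like this is noisy. Build in release mode and average several runs before comparing numbers.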

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Dependencies

~1MB
~28K SLoC