no-std rust-tango

A lock-free, high-performance IPC channel inspired by Firedancer's Tango

3 releases

Uses new Rust 2024

new 0.1.2 Jan 29, 2026
0.1.1 Jan 29, 2026
0.1.0 Jan 29, 2026

#339 in Concurrency

MIT/Apache

61KB
1K SLoC

tango

A lock-free, high-performance SPSC (single-producer single-consumer) channel inspired by Firedancer's Tango IPC subsystem.

Performance

Benchmarked on Apple M3 Pro with 1000 samples per benchmark:

SPSC Throughput (10K messages, 64-byte payload)

Channel               Throughput      Relative
tango                 26.6 M msg/s    1.0x
ringbuf               10.7 M msg/s    2.5x slower
std unbounded         9.1 M msg/s     2.9x slower
std bounded           9.0 M msg/s     3.0x slower
crossbeam bounded     7.3 M msg/s     3.6x slower
crossbeam unbounded   6.8 M msg/s     3.9x slower

Ping-Pong Latency (100 round trips, 8-byte payload)

Channel     Latency   Relative
tango       90 µs     1.0x
ringbuf     100 µs    1.1x slower
crossbeam   105 µs    1.2x slower
std         380 µs    4.2x slower

Large Payload Throughput (2K messages, 1024-byte payload)

Channel     Throughput   Relative
tango       12.4 GiB/s   1.0x
ringbuf     9.4 GiB/s    1.3x slower
crossbeam   6.9 GiB/s    1.8x slower

Run benchmarks yourself:

RUST_MIN_STACK=67108864 cargo bench

Features

  • Zero-copy reads - Access message payloads directly without allocation
  • Lock-free - No mutexes, just atomic operations with careful memory ordering
  • Backpressure - Optional credit-based flow control prevents overruns
  • Overrun detection - The consumer detects when it has been lapped by the producer
  • Metrics - Built-in observability for throughput, lag, and errors
  • no_std support - Works in embedded/kernel environments
  • Thoroughly tested - Unit tests, Miri, Loom, proptest, and fuzz testing

Quick Start

use rust_tango::{Consumer, DCache, Fseq, MCache, Producer};

// Create the channel components
let mcache = MCache::<64>::new();      // 64-slot metadata ring buffer
let dcache = DCache::<64, 256>::new(); // 64 chunks of 256 bytes each
let fseq = Fseq::new(1);               // Sequence counter starting at 1

let producer = Producer::new(&mcache, &dcache, &fseq);
let mut consumer = Consumer::new(&mcache, &dcache, 1);

// Publish a message
producer.publish(b"hello", 0, 0, 0).unwrap();

// Consume it (zero-copy)
if let Ok(Some(fragment)) = consumer.poll() {
    assert_eq!(fragment.payload.as_slice(), b"hello");
}

With Flow Control

Use Fctl to prevent the producer from overwriting unconsumed messages:

use rust_tango::{Consumer, DCache, Fctl, Fseq, MCache, Producer};

let mcache = MCache::<64>::new();
let dcache = DCache::<64, 256>::new();
let fseq = Fseq::new(1);
let fctl = Fctl::new(64); // 64 credits = buffer capacity

let producer = Producer::with_flow_control(&mcache, &dcache, &fseq, &fctl);
let mut consumer = Consumer::with_flow_control(&mcache, &dcache, &fctl, 1);

// Producer returns NoCredits error when buffer is full
// Consumer automatically releases credits after consuming
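
The credit mechanism described above can be sketched with a single atomic counter. This is a minimal illustration under stated assumptions, not the crate's Fctl implementation: the `Credits` type and its `try_acquire`, `release`, and `available` methods are hypothetical names introduced here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical credit counter (illustrative only, not the crate's `Fctl`).
pub struct Credits {
    available: AtomicUsize,
}

impl Credits {
    /// Start with one credit per free slot in the ring.
    pub fn new(initial: usize) -> Self {
        Credits { available: AtomicUsize::new(initial) }
    }

    /// Producer side: take one credit, or return false if none are left.
    /// `checked_sub` returns `None` at zero, which makes `fetch_update` fail
    /// without modifying the counter.
    pub fn try_acquire(&self) -> bool {
        self.available
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .is_ok()
    }

    /// Consumer side: hand one credit back after a message has been consumed.
    pub fn release(&self) {
        self.available.fetch_add(1, Ordering::Release);
    }

    /// Current number of credits (a snapshot; may be stale under concurrency).
    pub fn available(&self) -> usize {
        self.available.load(Ordering::Acquire)
    }
}
```

In this sketch a failed `try_acquire` plays the role of the NoCredits error: the producer backs off instead of overwriting a slot the consumer has not yet read.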

With Metrics

Track throughput, lag, and errors:

use rust_tango::{Consumer, DCache, Fseq, MCache, Metrics, Producer};

let mcache = MCache::<64>::new();
let dcache = DCache::<64, 256>::new();
let fseq = Fseq::new(1);
let metrics = Metrics::new();

let producer = Producer::new(&mcache, &dcache, &fseq).with_metrics(&metrics);
let mut consumer = Consumer::new(&mcache, &dcache, 1).with_metrics(&metrics);

// ... publish and consume ...

let snapshot = metrics.snapshot();
println!("Published: {}", snapshot.published);
println!("Consumed: {}", snapshot.consumed);
println!("Lag: {} messages", snapshot.lag());
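
To make the lag figure concrete, here is a rough sketch of how such counters could be kept and snapshotted. It assumes lag means published minus consumed; the `Counters` and `Snapshot` types and their methods are hypothetical and are not taken from the crate's Metrics source.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters (illustrative only, not the crate's `Metrics`).
#[derive(Default)]
pub struct Counters {
    published: AtomicU64,
    consumed: AtomicU64,
}

/// Plain-value copy of the counters taken at one point in time.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Snapshot {
    pub published: u64,
    pub consumed: u64,
}

impl Snapshot {
    /// Messages published but not yet consumed.
    /// The two counters are not read as one atomic unit, so `saturating_sub`
    /// guards against a momentarily inconsistent pair.
    pub fn lag(&self) -> u64 {
        self.published.saturating_sub(self.consumed)
    }
}

impl Counters {
    pub fn record_publish(&self) {
        self.published.fetch_add(1, Ordering::Relaxed);
    }

    pub fn record_consume(&self) {
        self.consumed.fetch_add(1, Ordering::Relaxed);
    }

    pub fn snapshot(&self) -> Snapshot {
        Snapshot {
            consumed: self.consumed.load(Ordering::Relaxed),
            published: self.published.load(Ordering::Relaxed),
        }
    }
}
```

Relaxed ordering is used here because the counters only feed observability and do not synchronize access to any other data.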

Using the Builder

For ergonomic channel creation:

use rust_tango::{ChannelBuilder, Producer, Consumer};

let (mcache, dcache, fseq, fctl, metrics) = ChannelBuilder::<64, 64, 256>::new()
    .with_flow_control()
    .with_metrics()
    .build();

let producer = Producer::with_flow_control(&mcache, &dcache, &fseq, fctl.as_ref().unwrap())
    .with_metrics(metrics.as_ref().unwrap());
let mut consumer = Consumer::with_flow_control(&mcache, &dcache, fctl.as_ref().unwrap(), 1)
    .with_metrics(metrics.as_ref().unwrap());

Installation

Add to your Cargo.toml:

[dependencies]
rust-tango = "0.1"

Feature Flags

Feature   Default   Description
std       Yes       Enable standard library support (Vec, Error trait)
loom      No        Enable Loom testing for concurrency verification

For no_std environments:

[dependencies]
rust-tango = { version = "0.1", default-features = false }

Architecture

Tango uses a split metadata/data architecture for cache efficiency:

  • MCache: Ring buffer of 32-byte metadata entries with sequence-based validation
  • DCache: Fixed-size chunk storage for payloads (cache-line aligned)
  • Fseq: Atomic sequence counter shared between producer threads
  • Fctl: Credit counter for backpressure between producer and consumer
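
As a rough illustration of the split layout, the sketch below shows one way a 32-byte metadata entry and the sequence-to-slot mapping could look. The field names, the power-of-two depth requirement, and the helper functions are assumptions made for this example; the crate's actual layout is not shown in this README.

```rust
use std::mem::size_of;

/// Hypothetical metadata entry; field names are illustrative only.
#[repr(C)]
#[derive(Debug, Clone, Copy, Default)]
pub struct MetaEntry {
    pub seq: u64,   // sequence number used for validation
    pub sig: u64,   // application-defined signature
    pub chunk: u32, // index of the payload chunk in the data region
    pub size: u32,  // payload length in bytes
    pub ctl: u64,   // control bits
}

// Compile-time check that the sketch matches the 32-byte figure above.
const _: () = assert!(size_of::<MetaEntry>() == 32);

/// Map a sequence number onto a ring slot.
/// Assumes DEPTH is a power of two so the modulo reduces to a bit mask.
pub fn slot_index<const DEPTH: usize>(seq: u64) -> usize {
    assert!(DEPTH.is_power_of_two());
    (seq as usize) & (DEPTH - 1)
}

/// Byte offset of a chunk inside a flat data region of fixed-size chunks.
pub fn chunk_offset<const CHUNK_SIZE: usize>(chunk: u32) -> usize {
    chunk as usize * CHUNK_SIZE
}
```

With a depth of 64, sequence numbers 1 and 65 map to the same slot, which is why each entry carries its own sequence number for validation.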

The lock-free protocol uses Release/Acquire ordering:

  1. Producer writes payload to DCache
  2. Producer writes metadata to MCache slot
  3. Producer stores sequence number with Release ordering
  4. Consumer loads sequence with Acquire, reads metadata, re-checks sequence

This double-read validation detects overwrites without locks.
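
The steps above can be sketched on a single slot using only std atomics. This is a simplified illustration, not the crate's code: the `Slot` and `Poll` types are hypothetical, the metadata is reduced to one `u64`, and the sketch adds a "write in progress" marker before the metadata write. That marker is an assumption borrowed from common seqlock designs and is not mentioned in the list above.

```rust
use std::sync::atomic::{fence, AtomicU64, Ordering};

/// Marker meaning "the producer is currently rewriting this slot" (assumption).
const BUSY: u64 = u64::MAX;

/// Hypothetical single metadata slot; real entries hold more than one word.
pub struct Slot {
    seq: AtomicU64,
    meta: AtomicU64,
}

#[derive(Debug, PartialEq)]
pub enum Poll {
    Ready(u64), // metadata for the expected sequence number
    NotYet,     // the expected message has not been published yet
    Overrun,    // the slot was overwritten by a later message
}

impl Slot {
    pub fn new() -> Self {
        Slot { seq: AtomicU64::new(0), meta: AtomicU64::new(0) }
    }

    /// Producer: invalidate, write the metadata, then publish the sequence.
    pub fn publish(&self, seq: u64, meta: u64) {
        self.seq.store(BUSY, Ordering::Relaxed);
        fence(Ordering::Release); // keep the metadata write after the marker
        self.meta.store(meta, Ordering::Relaxed);
        self.seq.store(seq, Ordering::Release);
    }

    /// Consumer: load the sequence, read the metadata, re-check the sequence.
    pub fn poll(&self, expected: u64) -> Poll {
        let before = self.seq.load(Ordering::Acquire);
        if before == BUSY || before < expected {
            return Poll::NotYet;
        }
        if before > expected {
            return Poll::Overrun;
        }
        let meta = self.meta.load(Ordering::Relaxed);
        fence(Ordering::Acquire); // keep the metadata read before the re-check
        if self.seq.load(Ordering::Relaxed) != before {
            return Poll::Overrun; // slot changed while it was being read
        }
        Poll::Ready(meta)
    }
}
```

The metadata is stored in an atomic so that a racing read is not undefined behavior. The second sequence load is what lets the consumer discard a value that may have been overwritten mid-read.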

Safety & Testing

Tango is extensively tested:

  • Unit tests: Core functionality
  • Miri: Undefined behavior detection
  • Loom: Exhaustive thread interleaving exploration
  • Proptest: Property-based testing for invariants
  • Fuzz testing: Edge case discovery with cargo-fuzz

All unsafe blocks are documented with // SAFETY: comments explaining the invariants.

When to Use Tango

Best for:

  • Single-producer single-consumer scenarios
  • Latency-sensitive applications (trading, gaming, audio)
  • High-throughput message passing
  • When you can dedicate a core to busy-polling

Consider alternatives if:

  • You need multiple producers or consumers (use crossbeam)
  • You can't busy-poll (use async channels)
  • Messages are very large (consider shared memory + handles)
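
For readers unfamiliar with busy-polling, a bounded spin loop might look like the sketch below. The `spin_for_seq` helper and its spin budget are hypothetical and are not part of the crate; they only illustrate the pattern of repeatedly checking an atomic instead of blocking.

```rust
use std::hint::spin_loop;
use std::sync::atomic::{AtomicU64, Ordering};

/// Spin until `seq` reaches `expected` or the spin budget runs out.
/// Returns true if the expected sequence number was observed.
pub fn spin_for_seq(seq: &AtomicU64, expected: u64, max_spins: u32) -> bool {
    for _ in 0..max_spins {
        if seq.load(Ordering::Acquire) >= expected {
            return true;
        }
        // Hint to the CPU that this is a spin-wait loop.
        spin_loop();
    }
    false
}
```

A caller that cannot dedicate a core could fall back to yielding or sleeping once the budget is exhausted, at the cost of higher latency.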

Minimum Supported Rust Version

Rust 1.85 or later (edition 2024).

License

Licensed under either of the MIT license or the Apache license, at your option.
