Crate copyforward

Fast copy-forward compression for message threads.

Detects repeated substrings across messages and replaces them with references to earlier occurrences, reducing storage and bandwidth.

Copy-forward compression is particularly effective for:

  • Chat logs and message threads with quoted replies
  • Document version histories with incremental changes
  • Any sequence of texts with repeated phrases or patterns
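
To make the idea concrete, here is a deliberately naive sketch of copy-forward compression. It is not this crate's implementation or API: the names `Seg`, `compress`, and `render` are hypothetical, the search is brute force rather than hash-based, and positions are counted in `char`s purely to keep the example simple.

```rust
// Conceptual sketch only. `Seg`, `compress`, and `render` are hypothetical
// names for illustration and are not part of the copyforward crate.

/// One piece of a compressed message.
#[derive(Debug, Clone, PartialEq)]
enum Seg {
    /// Text stored verbatim.
    Literal(String),
    /// Characters `start..start + len` of the earlier message `msg`.
    Ref { msg: usize, start: usize, len: usize },
}

/// Greedy brute-force compressor: at each position, look for the longest
/// match in any earlier message and emit a reference if it spans at least
/// `min_len` characters; otherwise keep the character as literal text.
fn compress(messages: &[&str], min_len: usize) -> Vec<Vec<Seg>> {
    let chars: Vec<Vec<char>> = messages.iter().map(|m| m.chars().collect()).collect();
    let mut out = Vec::with_capacity(chars.len());
    for (i, cur) in chars.iter().enumerate() {
        let mut segs = Vec::new();
        let mut literal = String::new();
        let mut pos = 0;
        while pos < cur.len() {
            // Longest match so far as (message index, start, length).
            let mut best = (0, 0, 0);
            for (j, prev) in chars[..i].iter().enumerate() {
                for start in 0..prev.len() {
                    let len = prev[start..]
                        .iter()
                        .zip(&cur[pos..])
                        .take_while(|(a, b)| a == b)
                        .count();
                    if len > best.2 {
                        best = (j, start, len);
                    }
                }
            }
            if best.2 > 0 && best.2 >= min_len {
                // Flush pending literal text before the reference.
                if !literal.is_empty() {
                    segs.push(Seg::Literal(std::mem::take(&mut literal)));
                }
                segs.push(Seg::Ref { msg: best.0, start: best.1, len: best.2 });
                pos += best.2;
            } else {
                literal.push(cur[pos]);
                pos += 1;
            }
        }
        if !literal.is_empty() {
            segs.push(Seg::Literal(literal));
        }
        out.push(segs);
    }
    out
}

/// Rebuild the original messages by resolving each reference against the
/// messages that were already rendered.
fn render(compressed: &[Vec<Seg>]) -> Vec<String> {
    let mut done: Vec<Vec<char>> = Vec::new();
    for segs in compressed {
        let mut cur: Vec<char> = Vec::new();
        for seg in segs {
            match seg {
                Seg::Literal(text) => cur.extend(text.chars()),
                Seg::Ref { msg, start, len } => {
                    cur.extend_from_slice(&done[*msg][*start..*start + *len])
                }
            }
        }
        done.push(cur);
    }
    done.iter().map(|c| c.iter().collect::<String>()).collect()
}
```

With the two messages from the Quick Start below and `min_len = 4`, the second message becomes a single reference to all 11 characters of the first, followed by the literal `", how are you?"`. A real implementation avoids the brute-force inner loops, which is where the algorithm choice described later comes in.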

§Quick Start

§Rust

use copyforward::{exact, Config, CopyForward};

let messages = &["Hello world", "Hello world, how are you?"];
let compressed = exact(messages, Config::default());

// Render back to original text
let original = compressed.render_with(|_, _, _, text| text.to_string());
assert_eq!(original, messages);

§Python

import copyforward

messages = ["Hello world", "Hello world, how are you?"]
# Exact mode (default) - perfect compression
cf = copyforward.CopyForward.from_texts(messages)
print(cf.compression_ratio())

§Algorithm Selection

Choose between two optimized algorithms:

  • exact(): Perfect compression using binary search extension (O(n log m) time)

    • Best for: <1MB total text, when perfect compression is needed
    • Finds optimal substring matches, never misses opportunities
  • approximate(): Fast compression with capped extension (~2x faster than exact())

    • Best for: >1MB text, when speed matters more than perfect compression
    • May split long references into multiple shorter ones
    • Still achieves excellent compression ratios (typically 50-90% size reduction)
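
One way to apply the size guidance above is a small helper that sums the input size and picks a mode. This is a sketch under assumptions: `Mode`, `ONE_MB`, and `pick_mode` are hypothetical names that are not part of this crate, "1MB" is read as 10^6 bytes, and sending the exact boundary to the approximate mode is an arbitrary choice.

```rust
// Hypothetical helper for illustration; not part of the copyforward crate.

/// Which algorithm to call for a given input.
#[derive(Debug, PartialEq)]
enum Mode {
    Exact,
    Approximate,
}

/// Assumption: the "1MB" threshold is interpreted as 10^6 bytes.
const ONE_MB: usize = 1_000_000;

/// Pick a mode from the total UTF-8 byte length of all messages.
fn pick_mode(messages: &[&str]) -> Mode {
    let total: usize = messages.iter().map(|m| m.len()).sum();
    if total < ONE_MB {
        Mode::Exact
    } else {
        Mode::Approximate
    }
}
```

A caller would then match on the result and invoke `exact()` or `approximate()` accordingly. The threshold is only a starting point; if perfect compression matters more than speed, `exact()` may still be the better choice for larger inputs.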

Re-exports§

pub use crate::core::Config;
pub use crate::core::CopyForward;
pub use crate::core::CopyForwardTokens;
pub use crate::core::Segment;
pub use crate::core::TokenSegment;

Modules§

core
fixture
hashing
Shared hashing utilities for polynomial rolling hashes used by hashed algorithms.
tokenization
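
For readers unfamiliar with polynomial rolling hashes, the following is a generic sketch of the technique, not the code in the hashing module. `BASE`, `MODULUS`, and `rolling_hashes` are assumed names and values chosen for illustration.

```rust
// Generic illustration of a polynomial rolling hash. BASE and MODULUS are
// assumed values, not the constants used by the copyforward crate.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Hash every `window`-byte slice of `data`, updating in O(1) per step
/// instead of rehashing each slice from scratch.
fn rolling_hashes(data: &[u8], window: usize) -> Vec<u64> {
    if window == 0 || data.len() < window {
        return Vec::new();
    }
    // BASE^(window - 1) mod MODULUS: the weight of the byte leaving the window.
    let mut top = 1u64;
    for _ in 1..window {
        top = top * BASE % MODULUS;
    }
    // Hash of the first window, computed directly.
    let mut hash = 0u64;
    for &b in &data[..window] {
        hash = (hash * BASE + b as u64) % MODULUS;
    }
    let mut out = vec![hash];
    for i in window..data.len() {
        // Remove the outgoing byte, shift, then add the incoming byte.
        let outgoing = data[i - window] as u64 * top % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS;
        hash = (hash * BASE + data[i] as u64) % MODULUS;
        out.push(hash);
    }
    out
}
```

Equal windows always produce equal hashes, so matching hash values identify candidate repeated substrings cheaply. Distinct windows can collide, so candidates still need a direct comparison before being treated as a match.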

Structs§

Approximate
Text-mode wrapper for approximate algorithm routing through the token core.
Exact
Text-mode wrapper for exact algorithm routing through the token core.

Traits§

MessageLike
Trait for types that can be used as message inputs, supporting both regular strings and None values.
TokenLike
Trait for types that can be used as token inputs, supporting both regular tokens and None values.

Functions§

approximate
Create an approximate copy-forward compressor.
approximate_tokens
Create an approximate token-mode compressor over u32 token sequences.
exact
Create an exact copy-forward compressor.
exact_tokens
Create an exact token-mode compressor over u32 token sequences.

Type Aliases§

ApproximateTokens
Approximate copy-forward compression with capped extension.
ExactTokens
Exact copy-forward compression for token sequences (u32 IDs).