1 unstable release

0.1.0	Jan 18, 2026

MIT license

cc-graph-kernel

Deterministic Context Slicing for Conversation DAGs

The Graph Kernel transforms a 107K+ turn conversation DAG into deterministic, replayable context slices. It answers one question:

Given a target turn, which other turns are allowed to influence meaning?

Overview
Core Contract
Architecture
Installation
Quick Start
Types Reference
SlicePolicy v1
Priority Scoring
Context Slicer Algorithm
Graph Stores
SliceExport & Fingerprinting
Determinism Guarantees
Integration with CognitiveTwin Bridge
Testing
API Reference

Overview

The Graph Kernel is the "context construction engine" for the semantic kernel stack. It sits between raw conversation data and semantic analysis, providing:

Bounded context windows around any anchor turn
Phase-aware prioritization (Synthesis > Planning > Consolidation > Debugging > Exploration)
Deterministic fingerprints for reproducibility and provenance
Salience-weighted expansion to capture the most important turns first

Why This Matters

Without deterministic context slicing:

"Same input" can produce different semantic analyses
Provenance becomes meaningless
Replay and debugging are impossible
Evidence accumulation can't be trusted

With the Graph Kernel:

Every slice has a unique, content-derived slice_id
Same anchor + same policy + same graph = identical slice
Downstream artifacts can reference their source context
The entire pipeline becomes auditable

Core Contract

The Graph Kernel guarantees exactly three things:

Deterministic Selection: Same inputs → same slice, byte-for-byte
Budget Compliance: Slices respect max_nodes and max_radius limits
Canonical Ordering: Turns sorted by TurnId, edges by (parent, child)

These guarantees are enforced by:

BTreeMap/BTreeSet for all internal collections (no HashMap)
Canonical JSON serialization for fingerprinting
Explicit ordering implementations on all types
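As a rough illustration of why ordered collections matter here, the sketch below shows that a BTreeMap yields the same key order regardless of insertion order, and that sorting edge tuples gives the (parent, child) ordering. It uses plain u32 ids instead of TurnId, and the function names are hypothetical, not part of the crate.

```rust
use std::collections::BTreeMap;

/// Returns turn ids in ascending order, whatever the insertion order was.
/// (u32 ids stand in for TurnId in this sketch.)
fn canonical_turn_order(turns: &[(u32, &str)]) -> Vec<u32> {
    let by_id: BTreeMap<u32, &str> = turns.iter().copied().collect();
    by_id.keys().copied().collect()
}

/// Sorts edges by (parent, child); tuples compare lexicographically.
fn canonical_edge_order(mut edges: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    edges.sort_unstable();
    edges
}

fn main() {
    let turns = [(3, "c"), (1, "a"), (2, "b")];
    println!("{:?}", canonical_turn_order(&turns));
    println!("{:?}", canonical_edge_order(vec![(2, 4), (1, 3), (1, 2)]));
}
```

A HashMap would give no such guarantee: its iteration order can differ between runs, which would leak into any serialized output.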

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        ContextSlicer                            │
├─────────────────────────────────────────────────────────────────┤
│  Input: TurnId (anchor)                                         │
│  Output: SliceExport { turns, edges, slice_id }                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐   │
│  │  GraphStore   │───▶│  SlicePolicyV1│───▶│ BFS Expansion │   │
│  │ (Postgres or  │    │ (phase weights│    │ (priority     │   │
│  │  InMemory)    │    │  + budgets)   │    │  queue)       │   │
│  └───────────────┘    └───────────────┘    └───────────────┘   │
│                                                                 │
│                              │                                  │
│                              ▼                                  │
│                    ┌───────────────────┐                        │
│                    │   SliceExport     │                        │
│                    │ ┌───────────────┐ │                        │
│                    │ │ slice_id      │ │ ◀── xxHash64           │
│                    │ │ turns[]       │ │     (canonical)        │
│                    │ │ edges[]       │ │                        │
│                    │ │ policy_hash   │ │                        │
│                    │ └───────────────┘ │                        │
│                    └───────────────────┘                        │
└─────────────────────────────────────────────────────────────────┘

Installation

Add to your Cargo.toml:

[dependencies]
# Pick exactly one of the lines below (TOML does not allow duplicate keys).

# Default: in-memory store only
cc-graph-kernel = { path = "../cc-graph-kernel" }

# For PostgreSQL support (optional)
# cc-graph-kernel = { path = "../cc-graph-kernel", features = ["postgres"] }

# For REST service (optional; also pulls in "postgres")
# cc-graph-kernel = { path = "../cc-graph-kernel", features = ["service"] }

Feature Flags

Feature	Description	Dependencies
`default`	In-memory store only	None
`postgres`	PostgreSQL graph store	`sqlx`, `tokio`
`service`	REST API service	`axum`, `tower`, `tower-http`, `postgres`

REST Service

The Graph Kernel can be deployed as a standalone REST service:

# Run locally
DATABASE_URL=... cargo run --bin graph_kernel_service --features service

# Or with Docker
docker build -f Dockerfile.service -t graph-kernel .
docker run -p 8001:8001 -e DATABASE_URL=... graph-kernel

Endpoints:

POST /api/slice - Construct a slice around an anchor
POST /api/slice/batch - Batch slice construction
GET /api/policies - List registered policies
POST /api/policies - Register a new policy
GET /health - Service health check

See docs/SERVICE.md for full API documentation.

Quick Start

In-Memory (Testing)

use cc_graph_kernel::{
    ContextSlicer, SlicePolicyV1, InMemoryGraphStore,
    TurnId, TurnSnapshot, Edge, EdgeType, Role, Phase,
};
use uuid::Uuid;

// Build a graph
let mut store = InMemoryGraphStore::new();

store.add_turn(TurnSnapshot::new(
    TurnId::new(Uuid::from_u128(1)),
    "session_1".to_string(),
    Role::User,
    Phase::Synthesis,
    0.9,  // salience
    0,    // depth
    0,    // sibling_order
    0.5,  // homogeneity
    0.5,  // temporal
    1.0,  // complexity
    1704067200,  // created_at (Unix timestamp)
));

// Add more turns and edges...

// Create slicer
let policy = SlicePolicyV1::default();
let slicer = ContextSlicer::new(store, policy);

// Generate slice
let anchor_id = TurnId::new(Uuid::from_u128(1));
let slice = slicer.slice(anchor_id).unwrap();

println!("Slice ID: {}", slice.slice_id);
println!("Turns: {}", slice.num_turns());
println!("Edges: {}", slice.num_edges());

PostgreSQL (Production)

use cc_graph_kernel::{
    ContextSlicer, SlicePolicyV1, PostgresGraphStore, PostgresConfig,
    TurnId,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to Orbit database
    let config = PostgresConfig {
        database_url: std::env::var("DATABASE_URL")?,
        max_connections: 5,
        connect_timeout_secs: 30,
    };
    
    let store = PostgresGraphStore::new(config).await?;
    let policy = SlicePolicyV1::default();
    let slicer = ContextSlicer::new(store, policy);
    
    // Slice around a specific turn
    let anchor_id = TurnId::from_str("550e8400-e29b-41d4-a716-446655440000")?;
    let slice = slicer.slice(anchor_id)?;
    
    println!("Context slice: {} turns, {} edges", 
             slice.num_turns(), slice.num_edges());
    println!("Fingerprint: {}", slice.slice_id);
    
    Ok(())
}

Types Reference

TurnId

Unique identifier for a turn in the conversation DAG.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct TurnId(Uuid);

impl TurnId {
    pub fn new(uuid: Uuid) -> Self;
    pub fn from_str(s: &str) -> Result<Self, uuid::Error>;
    pub fn as_uuid(&self) -> Uuid;
}

Role

Role of the turn author.

pub enum Role {
    User,       // User message
    Assistant,  // AI response
    System,     // System message
    Tool,       // Tool/function call result
}

Phase

Trajectory phase of the turn. Variants are listed from lowest to highest default weight for context selection.

pub enum Phase {
    Exploration,   // weight: 0.3 (lowest)
    Debugging,     // weight: 0.5
    Consolidation, // weight: 0.6
    Planning,      // weight: 0.9
    Synthesis,     // weight: 1.0 (highest)
}

TurnSnapshot

Minimal snapshot of a turn for slicing.

pub struct TurnSnapshot {
    pub id: TurnId,
    pub session_id: String,
    pub role: Role,
    pub phase: Phase,
    pub salience: f32,                  // [0, 1]
    pub trajectory_depth: u32,
    pub trajectory_sibling_order: u32,
    pub trajectory_homogeneity: f32,    // [0, 1]
    pub trajectory_temporal: f32,       // [0, 1]
    pub trajectory_complexity: f32,
    pub created_at: i64,                // Unix timestamp
}

Edge

Directed edge in the conversation DAG.

pub struct Edge {
    pub parent: TurnId,
    pub child: TurnId,
    pub edge_type: EdgeType,
}

pub enum EdgeType {
    Reply,      // Direct reply/continuation
    Branch,     // Fork in conversation
    Reference,  // Reference to earlier turn
    Default,    // Unspecified
}

SlicePolicy v1

The policy controls how context slices are expanded around an anchor turn.

Configuration

pub struct SlicePolicyV1 {
    pub version: String,              // "slice_policy_v1"
    pub max_nodes: usize,             // Budget: max turns in slice (default: 256)
    pub max_radius: u32,              // Budget: max hops from anchor (default: 10)
    pub phase_weights: PhaseWeights,  // Priority weights per phase
    pub salience_weight: f32,         // How much salience contributes (default: 0.3)
    pub distance_decay: f32,          // Priority decay per hop (default: 0.9)
    pub include_siblings: bool,       // Whether to include siblings (default: true)
    pub max_siblings_per_node: usize, // Limit per parent (default: 5)
}

pub struct PhaseWeights {
    pub synthesis: f32,      // 1.0 (highest priority)
    pub planning: f32,       // 0.9
    pub consolidation: f32,  // 0.6
    pub debugging: f32,      // 0.5
    pub exploration: f32,    // 0.3 (lowest priority)
}

Policy Hash

Each policy configuration produces a deterministic params_hash:

let policy = SlicePolicyV1::default();
let hash = policy.params_hash(); // e.g., "a1b2c3d4e5f6789a"

Different policy parameters → different hash → different slice_id.

Priority Scoring

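A turn's priority determines the order in which candidates leave the frontier: higher-priority turns are selected first, so they are more likely to fit within the max_nodes budget.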

Formula

priority = (phase_weight + salience × salience_weight) × distance_decay^distance

Components

Component	Description	Range
`phase_weight`	Weight from PhaseWeights based on turn's phase	0.3 - 1.0
`salience`	Turn's salience score from database	0.0 - 1.0
`salience_weight`	How much salience contributes to priority	0.0 - 1.0
`distance_decay`	Multiplicative decay per hop from anchor	0.0 - 1.0
`distance`	Number of hops from anchor turn	0, 1, 2, ...

Example

For a Synthesis turn with salience 0.8 at distance 2:

phase_weight = 1.0
salience_weight = 0.3 (default)
distance_decay = 0.9 (default)

priority = (1.0 + 0.8 × 0.3) × 0.9² = 1.24 × 0.81 = 1.004

For an Exploration turn with salience 0.3 at distance 2:

priority = (0.3 + 0.3 × 0.3) × 0.9² = 0.39 × 0.81 = 0.316

The Synthesis turn's priority is about 3.2× higher (1.004 / 0.316), so it is popped from the frontier before the Exploration turn.
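The arithmetic above can be checked with a small standalone function. This is a minimal sketch of the formula as written in this section; the function name and flat parameter list are assumptions for illustration, not the crate's scoring API.

```rust
/// priority = (phase_weight + salience * salience_weight) * distance_decay^distance
/// (hypothetical helper mirroring the formula in this section)
fn priority(
    phase_weight: f32,
    salience: f32,
    salience_weight: f32,
    distance_decay: f32,
    distance: u32,
) -> f32 {
    (phase_weight + salience * salience_weight) * distance_decay.powi(distance as i32)
}

fn main() {
    // Synthesis turn, salience 0.8, two hops from the anchor.
    let synthesis = priority(1.0, 0.8, 0.3, 0.9, 2);
    // Exploration turn, salience 0.3, two hops from the anchor.
    let exploration = priority(0.3, 0.3, 0.3, 0.9, 2);

    println!("synthesis   = {synthesis:.3}");
    println!("exploration = {exploration:.3}");
    println!("ratio       = {:.2}", synthesis / exploration);
}
```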

Context Slicer Algorithm

The slicer uses a priority-queue BFS to expand around the anchor:

Algorithm

1. Initialize:
   - selected = []
   - visited = {anchor_id}
   - frontier = PriorityQueue(anchor at distance 0)

2. While frontier not empty AND |selected| < max_nodes:
   a. Pop highest-priority candidate
   b. If distance > max_radius: skip
   c. Add to selected
   d. For each neighbor (parents, children, siblings):
      - If not visited:
        - Mark visited
        - Compute priority score
        - Add to frontier with distance + 1

3. Collect edges between selected turns

4. Sort turns by TurnId, edges by (parent, child)

5. Compute slice_id fingerprint

6. Return SliceExport

Key Properties

Priority-first: High-priority turns (Synthesis, high salience) selected first
Distance-aware: Closer turns preferred via decay
Budget-bounded: Never exceeds max_nodes or max_radius
Deterministic: Same priority → ordered by TurnId for tie-breaking
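The steps above can be sketched with std's BinaryHeap. This is a toy version under stated assumptions: u32 ids stand in for TurnId, neighbors are a single merged adjacency list, and base_score holds the precomputed (phase_weight + salience × salience_weight) term. None of these names come from the crate.

```rust
use std::cmp::Ordering;
use std::collections::{BTreeMap, BTreeSet, BinaryHeap};

/// Frontier entry (hypothetical type for this sketch).
struct Candidate {
    priority: f32,
    id: u32,
    distance: u32,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap is a max-heap: higher priority pops first.
        // On equal priority, the smaller id pops first (deterministic tie-break).
        self.priority
            .total_cmp(&other.priority)
            .then_with(|| other.id.cmp(&self.id))
    }
}
impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Candidate {}

/// Expands around `anchor` and returns the selected ids in ascending order.
fn expand(
    neighbors: &BTreeMap<u32, Vec<u32>>,
    base_score: &BTreeMap<u32, f32>,
    anchor: u32,
    max_nodes: usize,
    max_radius: u32,
    distance_decay: f32,
) -> Vec<u32> {
    let score = |id: u32| base_score.get(&id).copied().unwrap_or(0.0);
    let mut selected = BTreeSet::new();
    let mut visited = BTreeSet::from([anchor]);
    let mut frontier = BinaryHeap::new();
    frontier.push(Candidate { priority: score(anchor), id: anchor, distance: 0 });

    while let Some(c) = frontier.pop() {
        if selected.len() >= max_nodes {
            break; // node budget exhausted
        }
        if c.distance > max_radius {
            continue; // outside the radius budget
        }
        selected.insert(c.id);
        for &n in neighbors.get(&c.id).into_iter().flatten() {
            if visited.insert(n) {
                let d = c.distance + 1;
                let p = score(n) * distance_decay.powi(d as i32);
                frontier.push(Candidate { priority: p, id: n, distance: d });
            }
        }
    }
    selected.into_iter().collect()
}

/// Four-node toy graph: 1 is the anchor, 4 is two hops away via 2.
fn toy_graph() -> (BTreeMap<u32, Vec<u32>>, BTreeMap<u32, f32>) {
    let neighbors = BTreeMap::from([
        (1, vec![2, 3]),
        (2, vec![1, 4]),
        (3, vec![1]),
        (4, vec![2]),
    ]);
    let base_score = BTreeMap::from([(1, 1.0), (2, 0.3), (3, 1.24), (4, 1.24)]);
    (neighbors, base_score)
}

fn main() {
    let (neighbors, base_score) = toy_graph();
    println!("{:?}", expand(&neighbors, &base_score, 1, 3, 10, 0.9));
}
```

With a budget of two nodes, the high-scoring turn 3 is chosen over the low-scoring turn 2 even though both are one hop from the anchor. f32::total_cmp gives the heap a total order over floats, which keeps the pop order reproducible.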

Graph Stores

GraphStore Trait

pub trait GraphStore {
    type Error: std::error::Error;
    
    fn get_turn(&self, id: &TurnId) -> Result<Option<TurnSnapshot>, Self::Error>;
    fn get_turns(&self, ids: &[TurnId]) -> Result<Vec<TurnSnapshot>, Self::Error>;
    fn get_parents(&self, id: &TurnId) -> Result<Vec<TurnId>, Self::Error>;
    fn get_children(&self, id: &TurnId) -> Result<Vec<TurnId>, Self::Error>;
    fn get_siblings(&self, id: &TurnId, limit: usize) -> Result<Vec<TurnId>, Self::Error>;
    fn get_edges(&self, turn_ids: &[TurnId]) -> Result<Vec<Edge>, Self::Error>;
}

InMemoryGraphStore

For testing and small datasets:

let mut store = InMemoryGraphStore::new();
store.add_turn(turn);
store.add_edge(edge);

// Uses BTreeMap internally for deterministic iteration
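To illustrate how ordered maps make neighbor lookups reproducible, here is a toy store with the same shape of queries as the trait above. It is a sketch only: it uses u32 ids, and it assumes that siblings are the other children of a turn's parents, which this document does not spell out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy adjacency store (hypothetical; not the crate's InMemoryGraphStore).
#[derive(Default)]
struct ToyStore {
    children: BTreeMap<u32, BTreeSet<u32>>,
    parents: BTreeMap<u32, BTreeSet<u32>>,
}

impl ToyStore {
    fn add_edge(&mut self, parent: u32, child: u32) {
        self.children.entry(parent).or_default().insert(child);
        self.parents.entry(child).or_default().insert(parent);
    }

    /// Children in ascending id order.
    fn get_children(&self, id: u32) -> Vec<u32> {
        self.children.get(&id).into_iter().flatten().copied().collect()
    }

    /// Parents in ascending id order.
    fn get_parents(&self, id: u32) -> Vec<u32> {
        self.parents.get(&id).into_iter().flatten().copied().collect()
    }

    /// Assumed definition: other children of this turn's parents,
    /// in ascending id order, capped at `limit`.
    fn get_siblings(&self, id: u32, limit: usize) -> Vec<u32> {
        let mut siblings = BTreeSet::new();
        for parent in self.get_parents(id) {
            for child in self.get_children(parent) {
                if child != id {
                    siblings.insert(child);
                }
            }
        }
        siblings.into_iter().take(limit).collect()
    }
}

fn main() {
    let mut store = ToyStore::default();
    // Insertion order is deliberately scrambled.
    store.add_edge(1, 4);
    store.add_edge(1, 2);
    store.add_edge(1, 3);
    println!("children of 1: {:?}", store.get_children(1));
    println!("siblings of 3: {:?}", store.get_siblings(3, 5));
}
```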

PostgresGraphStore

For production with Orbit database:

// Requires "postgres" feature
let store = PostgresGraphStore::from_env().await?;

// Queries:
// - memory_turns table for turn data
// - memory_turn_edges table for parent/child relationships

Expected Schema

-- memory_turns
CREATE TABLE memory_turns (
    id UUID PRIMARY KEY,
    session_id TEXT,
    role TEXT,
    trajectory_phase TEXT,
    salience_score REAL,
    trajectory_depth INTEGER,
    trajectory_sibling_order INTEGER,
    trajectory_homogeneity REAL,
    trajectory_temporal REAL,
    trajectory_complexity REAL,
    created_at TIMESTAMP
);

-- memory_turn_edges
CREATE TABLE memory_turn_edges (
    parent_turn_id UUID REFERENCES memory_turns(id),
    child_turn_id UUID REFERENCES memory_turns(id),
    edge_type TEXT,
    PRIMARY KEY (parent_turn_id, child_turn_id)
);

SliceExport & Fingerprinting

SliceExport

The output of context slicing:

pub struct SliceExport {
    pub anchor_turn_id: TurnId,
    pub turns: Vec<TurnSnapshot>,    // Sorted by TurnId
    pub edges: Vec<Edge>,            // Sorted by (parent, child)
    pub policy_id: String,           // "slice_policy_v1"
    pub policy_params_hash: String,  // Hash of policy config
    pub schema_version: String,      // "1.0.0"
    pub slice_id: SliceFingerprint,  // Content-derived hash
}

SliceFingerprint

The fingerprint is computed from:

let canonical = (
    anchor,           // The anchor TurnId
    turn_ids,         // Vec<TurnId> of all turns
    edges,            // Vec<Edge> of all edges
    policy_id,        // "slice_policy_v1"
    policy_params_hash, // Hash of policy config
    schema_version,   // "1.0.0"
);

let slice_id = xxHash64(canonical_json_bytes(canonical));

Why This Design?

Content-derived: Changes to slice content → different fingerprint
Policy-aware: Different policy params → different fingerprint (even if turns are same)
Version-aware: Schema changes → different fingerprint
Deterministic: Same inputs → same fingerprint, always
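A std-only sketch of the idea follows. It sorts the inputs, builds a JSON-like string with keys in alphabetical order, and hashes the bytes. FNV-1a is used here purely as a stand-in so the example needs no dependencies; the crate itself uses xxHash64, so the values produced below will not match real slice_id values. The function names, u32 ids, and reduced field set are assumptions for illustration.

```rust
/// FNV-1a 64-bit: a dependency-free stand-in for xxHash64 in this sketch.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical fingerprint over (anchor, turns, edges, policy hash).
fn fingerprint(
    anchor: u32,
    turn_ids: &[u32],
    edges: &[(u32, u32)],
    policy_params_hash: &str,
) -> String {
    // Canonicalize: sort turns by id and edges by (parent, child).
    let mut turns = turn_ids.to_vec();
    turns.sort_unstable();
    let mut edges = edges.to_vec();
    edges.sort_unstable();

    // Keys appear in alphabetical order so the byte layout is fixed.
    let canonical = format!(
        "{{\"anchor\":{},\"edges\":{:?},\"policy_params_hash\":{:?},\"turns\":{:?}}}",
        anchor, edges, policy_params_hash, turns
    );
    format!("{:016x}", fnv1a64(canonical.as_bytes()))
}

fn main() {
    let a = fingerprint(1, &[3, 1, 2], &[(1, 3), (1, 2)], "a1b2");
    let b = fingerprint(1, &[1, 2, 3], &[(1, 2), (1, 3)], "a1b2");
    println!("{a} {b} equal={}", a == b);
}
```

Because the policy hash is part of the hashed bytes, two slices with identical turns and edges still get different fingerprints when the policy parameters differ.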

Determinism Guarantees

What Is Guaranteed

Property	Guarantee
Turn ordering	Always sorted by TurnId
Edge ordering	Always sorted by (parent, child, edge_type)
Slice ID	Same inputs → identical slice_id on every run
Policy hash	Same policy config → identical params_hash
Serialization	Canonical JSON (no HashMap, sorted keys)

What Is NOT Guaranteed

Property	Reason
Slice content stability	Changes to graph → changes to slice
Cross-version stability	Schema version changes may change fingerprints
Float precision	Standard IEEE 754 (use quantization if needed)
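One way to apply the quantization hint in the table above is to convert scores to integer steps before they are compared or hashed. The sketch below is an assumption about what that could look like; the quantize function and the step count of 1000 are illustrative and not part of the crate.

```rust
/// Maps a score in [0, 1] to an integer number of steps (hypothetical helper).
/// Out-of-range inputs are clamped first.
fn quantize(value: f32, steps: u32) -> u32 {
    (value.clamp(0.0, 1.0) * steps as f32).round() as u32
}

fn main() {
    // Two values that differ only by float noise land on the same step.
    let a = quantize(0.300_000_01, 1000);
    let b = quantize(0.299_999_98, 1000);
    println!("{a} {b} equal={}", a == b);
}
```

Integers compare and serialize exactly, so values that differ only in the last few bits of an f32 no longer produce different output.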

Golden Tests

The crate includes 13 golden tests verifying determinism:

#[test]
fn test_same_anchor_same_slice_id_100_runs() {
    // Generate 100 slices from same anchor
    // All must have identical slice_id
}

#[test]
fn test_policy_param_change_changes_slice_id() {
    // Different policy params → different fingerprint
}

#[test]
fn test_edge_ordering_determinism() {
    // Edges always in same order
}

Integration with CognitiveTwin Bridge

The Graph Kernel provides the "context" for semantic analysis:

┌─────────────────────────────────────────────────────────────────┐
│                     CognitiveTwin Bridge                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐    ┌──────────────┐    ┌─────────────────┐    │
│  │  Atomizer   │───▶│   Proposer   │───▶│   Evidence      │    │
│  │             │    │              │    │   Runner        │    │
│  └─────────────┘    └──────────────┘    └─────────────────┘    │
│         │                  │                    │               │
│         │                  │                    │               │
│         ▼                  ▼                    ▼               │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    slice_id provenance                   │   │
│  │  "This atom was extracted from slice a1b2c3d4..."       │   │
│  │  "This proposal was made in context of slice x9y8z7..." │   │
│  │  "This observation occurred under slice e5f6g7h8..."    │   │
│  └─────────────────────────────────────────────────────────┘   │
│                              ▲                                  │
│                              │                                  │
│                    ┌─────────────────┐                          │
│                    │  Graph Kernel   │                          │
│                    │  SliceExport    │                          │
│                    └─────────────────┘                          │
└─────────────────────────────────────────────────────────────────┘

Usage Pattern

// In the Atomizer
let slice = graph_kernel.slice(anchor_turn_id)?;

let atom = TrajectoryAtom {
    id: AtomId::compute(...),
    provenance: AtomProvenance {
        slice_id: slice.slice_id.to_string(),
        anchor_turn_id: slice.anchor_turn_id,
        // ...
    },
    // ...
};

// In the EvidenceRunner
let observation = TraceObserved {
    slice_id: slice.slice_id.to_string(),
    trace_stats: TraceStats { ... },
    // ...
};

Testing

Run All Tests

cd cc-graph-kernel
cargo test

Run Specific Test Category

# Unit tests only
cargo test --lib

# Golden tests only
cargo test --test golden

# With postgres feature
cargo test --features postgres

Test Coverage

Category	Tests	Description
Types	6	TurnId ordering, Phase weights, Role parsing
Policy	3	Phase weights, params hash determinism
Scoring	4	Priority calculation, candidate ordering
Store	4	Add/get turns, edges, siblings
Slicer	4	Anchor inclusion, budget limits, determinism
Golden	13	100-run determinism, edge ordering, fingerprinting

API Reference

Main Entry Points

// Create a slicer
pub fn ContextSlicer::new(store: S, policy: SlicePolicyV1) -> Self;

// Generate a slice
pub fn ContextSlicer::slice(&self, anchor_id: TurnId) -> Result<SliceExport, SlicerError>;

// Access policy
pub fn ContextSlicer::policy(&self) -> &SlicePolicyV1;

// Access store
pub fn ContextSlicer::store(&self) -> &S;

SliceExport Methods

pub fn SliceExport::num_turns(&self) -> usize;
pub fn SliceExport::num_edges(&self) -> usize;
pub fn SliceExport::contains_turn(&self, id: &TurnId) -> bool;
pub fn SliceExport::anchor_turn(&self) -> Option<&TurnSnapshot>;

Policy Configuration

// Default policy
let policy = SlicePolicyV1::default();

// Custom policy
let policy = SlicePolicyV1::new(
    128,                      // max_nodes
    5,                        // max_radius
    PhaseWeights::default(),  // phase_weights
    0.4,                      // salience_weight
    0.85,                     // distance_decay
    true,                     // include_siblings
    3,                        // max_siblings_per_node
);

// Get policy hash
let hash = policy.params_hash();

Canonical Hashing

use cc_graph_kernel::{canonical_hash, canonical_hash_hex};

let hash_u64 = canonical_hash(&my_struct);
let hash_hex = canonical_hash_hex(&my_struct);

Constants

/// Schema version for all graph kernel types.
pub const GRAPH_KERNEL_SCHEMA_VERSION: &str = "1.0.0";

/// Default policy version identifier.
pub const DEFAULT_POLICY_VERSION: &str = "slice_policy_v1";

Error Handling

pub enum SlicerError {
    /// Anchor turn not found in graph.
    AnchorNotFound(TurnId),
    
    /// Error from the underlying store.
    StoreError(String),
}

License

MIT

Changelog

v0.2.0 (2026-01-01)

NEW: REST Service Feature
- graph_kernel_service binary for standalone deployment
- Axum-based REST API with /api/slice, /api/policies endpoints
- PolicyRegistry for immutable policy management
- Docker support via Dockerfile.service
- RAG++ integration via SliceClient
- ServiceState for shared store and policy registry
- PolicyRef for stable policy references

v0.1.0 (2026-01-01)

Initial release
TurnId, TurnSnapshot, Edge types with canonical ordering
SlicePolicyV1 with phase weights and budgets
Priority-queue BFS expansion algorithm
InMemoryGraphStore for testing
PostgresGraphStore for production (optional feature)
SliceExport with deterministic fingerprinting
37 tests (24 unit + 13 golden)
